Extending OOXML

by jlundstocholm 30. April 2009 20:26

This article has two topics - one about extending OOXML using the built-in extension mechanisms and one about extending OOXML itself.

Using built-in mechanisms

As I have written about earlier, OOXML has a (fun) part containing mechanisms for extending OOXML with vendor/domain-specific extensions. That part is "Part 3 - Markup Compatibility and Extensibility". The part describes different techniques for extending OOXML - the most interesting are probably the sections on "Markup Compatibility Attributes and Elements", which describe ways to extend OOXML while staying compatible with e.g. earlier or current versions of the specification.
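
To make the idea concrete, here is a minimal sketch of what a consumer might do with the Ignorable attribute from Part 3: markup in a namespace that is declared ignorable, and that the consumer does not understand, is simply dropped. The vendor namespace, the `v:sparkle` element and the `v:mood` attribute are hypothetical, and the sketch is a simplification: it only looks at Ignorable on the root element and does not handle ProcessContent, MustUnderstand or AlternateContent.

```python
import io
import xml.etree.ElementTree as ET

MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
VENDOR_NS = "http://example.com/vendor-extension"  # hypothetical vendor namespace

# A document using a hypothetical vendor extension, marked as ignorable.
SAMPLE = f"""<w:document xmlns:w="{W_NS}" xmlns:mc="{MC_NS}" xmlns:v="{VENDOR_NS}" mc:Ignorable="v">
  <w:body>
    <w:p v:mood="happy"><w:r><w:t>Hello</w:t></w:r></w:p>
    <v:sparkle level="3"/>
  </w:body>
</w:document>"""


def namespace_of(name):
    """Return the namespace URI of an ElementTree '{uri}local' name."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else ""


def strip_ignorable(xml_text, understood):
    """Drop elements and attributes in ignorable namespaces we do not understand.

    Simplification: only mc:Ignorable on the root element is honoured.
    """
    data = xml_text.encode("utf-8")

    # ElementTree discards prefixes, so collect the prefix -> URI map first.
    prefixes = {}
    for _event, (prefix, uri) in ET.iterparse(io.BytesIO(data), events=("start-ns",)):
        prefixes[prefix] = uri

    root = ET.fromstring(data)
    declared = root.get(f"{{{MC_NS}}}Ignorable", "").split()
    ignorable = {prefixes[p] for p in declared if p in prefixes} - set(understood)

    for parent in list(root.iter()):
        for child in list(parent):
            if namespace_of(child.tag) in ignorable:
                parent.remove(child)
        for attr in list(parent.attrib):
            if namespace_of(attr) in ignorable:
                del parent.attrib[attr]

    # The compatibility attribute itself is not part of the processed output.
    root.attrib.pop(f"{{{MC_NS}}}Ignorable", None)
    return root


# A consumer that only knows WordprocessingML sees the document without the extension.
root = strip_ignorable(SAMPLE, understood={W_NS})
print(ET.tostring(root, encoding="unicode"))
```

A consumer that does understand the vendor namespace would pass it in `understood` and keep the extension markup, which is the whole point: the same file works for both.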

So if you were a vendor wanting to add something to the spec - but couldn't wait for the slow ISO pace or simply needed the competitive edge of not revealing anything about future software releases to your competitors ... what could you do?

The first thing you should do is to decide if you want your new stuff to eventually make it into the spec. If you don't want that - you're done already.

Assuming you want it in the spec, here are a couple of hints on how you might approach it:

  1. Document your extensions thoroughly
  2. Present these extensions to SC34/WG4 with a justification of how and why you want them in the spec
  3. Work with us to polish the nitty-gritty details that you overlooked
  4. Make sure there are no legal or technical barriers preventing your competitors from implementing these new features
  5. Wait for the stuff to eventually be included in IS29500

The real target of this is - if you haven't already guessed it - Microsoft. So to be even more specific, here's a little list of things for Microsoft to do - in case they want to extend IS29500:

You will probably have some additions to IS29500 in your implementation of Office 14. Assuming that you will at some point want these to be added to IS29500, this is what you should do:

  1. Document your extensions thoroughly. Remember, the quality of the documentation will be under the same scrutiny as the text of DIS29500 so please do it right the first time.
  2. Add the documentation of your extensions to your "Implementer's notes" on the DII-website.
  3. Present these extensions to SC34/WG4 with a justification of how and why you want them in the spec.
  4. Work with us to polish the nitty-gritty details that you overlooked.
  5. Include the extensions and their documentation in your OSP (Open Specification Promise).
  6. Wait for the stuff to eventually be included in IS29500.

Remember, the minute the first public beta of Office 14 hits the web, the documentation of the extensions as well as their inclusion in the OSP should be finished. Not a month later, not a week later - on day one!

Extending IS29500 itself

There has been a lot of talk lately about how IS29500 will be extended in the future. Specifically, how - and where - will new additions be included? IS29500 comprises two schema sets - a strict set and a transitional set. Currently the strict set is created from the transitional set, so strict is in fact a proper subset of the transitional set.

However - there is no guarantee that this will always be so.
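
The subset relationship, and what happens to it if new features go into one schema set only, can be illustrated with plain sets. This is only a toy model: the feature names below are hypothetical placeholders, not actual schema content.

```python
# Toy model: each schema set is just a set of (hypothetical) feature names.
transitional = {"paragraphs", "tables", "legacy_vml", "compat_settings"}

# Today: strict is derived from transitional by removing the legacy features.
strict = transitional - {"legacy_vml", "compat_settings"}
print(strict < transitional)  # proper-subset check

# If a new feature were added to strict only, the relationship would no longer hold.
extended_strict = strict | {"new_feature"}
print(extended_strict <= transitional)  # subset check after the extension
```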

My gut feeling is that transitional should be preserved as the "reflection" of the existing Microsoft Office documents (until March 2008) - in other words, in line with the scope of IS29500. I think that any new stuff should be added to the strict schema set only. The term "transitional" clearly implies this. As I recall the feeling in Geneva at the BRM, the idea behind the transitional set was that eventually it would no longer be needed and hence would be removed from the standard - at some point in the future. If we continue to add new features to the transitional set, we will never get to the point where we can honor that intent.

...  now at the moment, we haven't decided anything yet ... so right now anything goes.

But what are your thoughts?

Comments

4/30/2009 11:41:22 PM #

Jirka Kosek

The current IS29500 is organized so that Transitional adds some additional features to Strict. So having new features only in Strict would mean quite a drastic reorganization of the spec. So I don't think that this is the way to go.

Moreover, I don't think that the current Strict/Transitional division is very useful after all. If new features should be added only to Strict, then Strict should be made really strict first, for example by removing serial dates from it and generally cleaning it up a little. But I don't think that we have the resources and mandate for doing it.

I think that we should keep Strict and Transitional in sync for now, and maybe one day we might drop Transitional. Meanwhile we can collect somewhere the things which should be designed differently in Strict OOXML (and also in ODF and other formats) and in 20 years come up with a new "perfect" office document format.

Jirka Kosek |

5/1/2009 8:04:02 PM #

Alex Brown

Jesper hi

This is a complex topic. Here's where I'm at …

It seems to me that the scope statement of 29500 is clear: the standard faces two ways, and one of those ways sees it capable of “faithfully representing the preexisting corpus […] produced by […] Microsoft Office”. To comply with this scope statement the standard needs to do this, and the transitional spec (Part 4) was created expressly for this purpose at the BRM.

I believe WG 4 should therefore be working to ensure that 29500 Transitional does in fact reflect that document corpus, and that it should be an honest “warts and all” guide to those documents that users can rely on, as demanded by the Nations at the BRM.

This means a handful of BRM mistakes need to be reverted.

So, my principle #1 is that the transitional (only) standard MUST align with existing (Office 2008 and earlier) MS documents, for that is its very raison d’être.

A corollary of this is that the transitional schemas are NOT to be used for extensions to the standard. Indeed this was envisaged in the original Canadian BRM text which stated:

“The intent of this Annex is to enable a transitional period during which existing binary documents being migrated to DIS 29500 can make use of legacy features to preserve their fidelity, while noting that new documents should not use them. […] The intent is to enable the future DIS 29500 maintenance group to choose, at a later date, to remove this set of features from a revised version of DIS 29500.” (my emphasis)

So my principle #2 is that the transitional standard is NOT extended (except to make it align better with the “existing corpus” referred to in the BRM text). On the contrary, SC 34 should seek to stabilise this Part as soon as possible (that would be in 2011, I think).

From this, it is clear that extensions must only be made for the Strict standard (Part 1). Granted, this will tend to force implementers away from using the transitional part for their future products – but that, it seems to me, was the clear intent anyway … the transitional features were only standardised for the purpose of legacy preservation.

Another consequence of this is that the strict schemas, when extended, will no longer be a proper subset of the transitional schemas. But that relationship, while interesting, seems to me to have little practical value, and I don’t think it’s a legitimate driver for deciding how extensions for 29500 are done.

Alex Brown United Kingdom |

5/3/2009 5:41:34 AM #

hAl

You are messing with PJ?
You should be aware that Groklaw has a nasty habit of putting accounts or IP addresses in a sandbox, so that you think your reactions are visible to other people but in reality they are only visible to yourself and not to other Groklaw users.

Happened to me twice about 2 years ago.

hAl |

5/3/2009 5:59:43 AM #

hAl

The sandboxing came, of course, after she had first removed my regular account on Groklaw.
She really hates people who oppose the Groklaw opinion and its rather sheepish followers (not surprising if you are used to banning opposing opinions, of course).

On Groklaw the world is simple.
Microsoft is evil, IBM is godlike and all those who disagree are shills and in the end no longer allowed to comment.

hAl |
