a 'mooh' point

clearly an IBM drone

Only fools rush in

In the past weeks the OOXML-battleground has been covered with buzz, rumours, and innuendo about the missing IS 29500. Rob kicked it off by revealing the little secret that he was actually in possession of a preliminary edition of the not-yet published final text.

Today I was told that ITTF was frantically (my interpretation) rushing to get the text done and have it published.

That scares me a bit.

When ITTF/IEC publishes IS29500 - that will be the IS29500 we will use in the foreseeable future. Yes, there will likely be an errata-sheet, but that is the edition we're gonna use. Rushing to finish it could introduce errors in the text (yes, I am aware of the irony).

So dear ITTF, please take your time to finish putting the spec together - there is no rush. ODF was approved on May 8th 2006 and was not published until November 30th the same year. That is more than 6 months from approval to publication. Granted, there are differences between the ODF-case and the OOXML-case. ODF was not changed during the approval-phase of ISO whereas substantial changes were made to OOXML. Nevertheless, I would much rather wait another month to have it done right.

At least we are all equal here - it is to the advantage of no single party to have publication delayed ... so shouldn't we wait until it is ready?

No reason anymore to mandate anything but ODF?

Yesterday the news broke about Microsoft adding support for ODF in Microsoft Office 2007 SP2. Within minutes the news spread like fire on the hills of Malibu, California and blog-entries started to pop up everywhere - even Brian Jones has apparently returned from Winter hibernation and has made his first blog post in almost 6 weeks. Welcome back to the party, Brian.


The Danish IT-news sphere was not slow on the keyboard either: ComputerWorld Denmark posted an article yesterday evening and the competition on version2.dk followed up on the news this morning. I myself got the information from Luc Bollen in his comment on the article I wrote on document translation (and why it sucks). I was sitting under a maple-tree (or some other wooden artifact) having a beer with a friend after a fabulous sushi-dinner and could do absolutely nothing about it.


Well, the reactions to Microsoft's move have actually been surprisingly positive. Even the ADD-bunch at noooxml.org said "If this is an honest attempt to play nice, it is a very welcome move" and even IBM has been quite positive - prompting Bob Sutor to turn the axe on Apple saying: "Hey, Apple, what about you? Let’s see you do this in iWork!". Simply starting to beat on someone else reminds me of the John Wayne quote "A day without blood is like a day without sunshine".

But what is missing from the reactions?

OSP coverage of ODF

One of the side-effects of Microsoft joining OASIS ODF TC is that ODF will likely be included in the list of specifications covered by Microsoft's Open Specification Promise (OSP). The list of specifications has not yet been updated, but I would expect it to be updated soon - or at least when they officially join the ODF TC. When you think about all the fuss around IPR this Spring, it is quite surprising that no one has picked up on this. It rams a huge stick through the FUD about the OSP not being applicable for GPL-licensed software. Now the OSP covers ODF as well and thereby the native document format of OpenOffice.org [LGPL 3.0 license] and (I think) OpenOffice Novell Edition.

But why OOXML, then?

A lot of people are now spinning this move as pulling the rug from under OOXML and arguing that ODF should be mandated everywhere - but nothing could be further from the truth. The reason why we approved OOXML still stands and the incompatible feature-sets of OOXML and ODF did not suddenly become compatible. There is still stuff in OOXML that cannot be persisted in ODF and vice versa. The backwards compatibility to the content in the existing corpus of binary documents is still a core value of OOXML and this incompatibility of ODF has not disappeared. You will still lose information and functionality when you choose to persist an OOXML-file in ODF ... just as you would when persisting it to old WordPerfect formats. Insisting that having ODF-support in Microsoft Office (12 SP2) makes the need for OOXML go away is a moot point - since I am sure no one would argue to replace OOXML with TXT - simply because TXT is a supported format in Microsoft Office.

Microsoft steps up to the task at hand

Some quite extraordinary news emerged from the Redmond, WA, headquarters of Microsoft today. In summary, they announced that

  1. Microsoft will join OASIS ODF TC
  2. Microsoft will include ODF in their list of specifications covered by the Open Specification Promise (OSP)
  3. Microsoft will include full, native support for ODF 1.1 in Microsoft Office 14 and in Microsoft Office 12 SP2 - scheduled for Q2 2009. Microsoft Office 12 SP2 will have built-in support for the three most widely used ISO-standards for document formats, namely OOXML, ODF and PDF.

My initial reaction when I heard it was "Wow ... that's amazing". I am sure a lot of people will react with "It's too little, too late", but let me use a couple of bytes to describe why I think it is a good move by Microsoft.

Microsoft joins OASIS ODF TC

Well, Microsoft has been widely criticised for not joining OASIS a few years ago. I think it is a bogus claim, but nevertheless, it has been on the minds of quite a lot of people. Novell has had a seat in both ECMA TC45 and OASIS ODF TC for some time now, and it is my firm belief that both consortia have benefited from this. The move by Microsoft to join OASIS ODF TC will likely have a similar effect. One of the most frequent requests in the standardisation of OOXML was to increase the feature-overlap of ODF and OOXML. This is quite difficult to accomplish (effectively) without knowing what the features of the other document format are (going to be). With Microsoft participating on both committees (and IBM will hopefully consider joining ECMA TC45), harmonization (or "enlargement of the feature-overlap") will likely occur at a quicker pace.

This also means that the worries some of us have had about Microsoft's future involvement in standardisation work around document formats have been toned down a bit. Microsoft is now actively participating in this work in ECMA, in ISO and also in OASIS. I think this is really good news. Not good news for Microsoft - but good news for those of us that are working with document formats every day.

Microsoft will cover ODF with OSP

One of the most difficult, non-technical, discussions during the standardisation of OOXML was about the legal aspects. There were discussions about different wordings in Sun's CNS, IBM's ISP and Microsoft's OSP (Jesus Christ, guys, pick ONE single acronym, already!) and the possible impact on implementers of ODF and OOXML. One of the aspects of the discussion that never really surfaced was that if IBM has software patents covering ODF - some of them quite possibly cover parts of OOXML as well. But the ISP of IBM does not mention OOXML - it only mentions ODF. This leaves me as a developer in quite a legal pickle, because by implementing OOXML I am covered by the OSP - but I am not covered by IBM's ISP (and vice versa). To me as a developer, Microsoft's coverage of ODF in their OSP is a good move, because it should remove all legal worries I might have around stepping into SW-patent covered territory.

ODF support in Microsoft Office

Microsoft will finally deliver on requests for native ODF-support in Microsoft Office. Microsoft will support ODF 1.1 in Microsoft Office 12 SP2 and also have built-in support for PDF and XPS (these are currently only available as a separate download).

Denmark is one of the countries where both ODF and OOXML have been approved for usage in the public sector. This is currently bringing quite a bit of complexity to the daily work of information workers since there are not many (if any) applications offering high fidelity, native support for both formats. They hence rely on translators like ODF-Converter or similar XSLT-based translators. It's a bad, but currently necessary, choice. The usage of translators for document conversion has been widely criticised, amongst others by Rob and me, and the built-in support for ODF in Microsoft Office is a great step in the right direction.

As with everything Microsoft does, we need a healthy amount of scepticism as to what extent they will deliver on their promises. However, I truly believe that the moves by Microsoft here are good news - regardless of the scepticism. An old proverb says "don't count your chickens before they hatch" - and this applies perfectly here. We will have to wait and see what will eventually happen - but so far ... it looks good.

Beer, beer, beer ... bed, bed, bed ... (translated from Danish)

In the Danish part of the ODF/OOXML-blogsphere we have been talking for a while about how it could be fun to meet up for a beer somewhere in Copenhagen. Most of us only know each other from our respective names and gravatars, and it is at least my experience that you are a little slower on the trigger once you have put a face to the name.

I therefore invite you to a round of beer (or similar, it could also be something without alcohol)

Thursday, June 12th 2008 at 17:00

I will figure out a place to be before long - but if you have a suggestion for a place, let me know. I am leaning towards ScrollBar at ITU in Ørestaden myself.

Sign up in the comment thread below.

Official complaint on OOXML-procedures in Denmark

I just wanted to let you in on a bit of information here from sunny Copenhagen.

Denmark has joined Norway in the strange sense that the Danish NSB (Dansk Standard) has received an official complaint regarding the Danish vote on March 29th 2008. I am sure the news will spread to the rest of the blog-sphere soon, so be the first to get the information here (my translation)

The Municipality of Aarhus, which was a member of the OOXML-committee in Dansk Standard, has now complained about the "Yes"-vote in the ISO-approval [of OOXML]. The reason: No one knows the real content of the specification. [...] "It is strange to vote 'Yes' to a standard that could still be filled with flaws and defects. In principle, it might say that the Moon is made of Swiss cheese, and they voted 'Yes' to that", explains Jens Kjellerup to Computerworld.

Check it out here (in Danish): http://www.computerworld.dk/art/45835?a=fp_2&i=514

Document translation sucks (When Rob is right, he's right)

It is very seldom I read one of Rob's posts and think "That is just so true" - but yesterday was one of those occasions. I was reading through his latest post about loading different documents in a couple of applications and I couldn't help but smile when I got to the part where Rob made some observations about possible reasons for the poor load times of ODF-files using Microsoft Office 2003:

What is a file filter? It is like 1/2 of a translator. Instead of translating from one disk format to another disk format, it simply loads the disk format and maps it into an application-specific memory model that the application logic can operate directly on. This is far more efficient than translation. This is the untold truth that the layperson does not know. But this is how everyone does it. That is how we support formats in SmartSuite. That is how OpenOffice does it. And that is how MS Office does it for the file formats they care about. In fact, that is the way that Novell is now doing it now, since they discovered that the Microsoft approach is doomed to performance hell.

I have been trying to pitch my idea of "document format channels" for some time now. The basic idea is not to do translations between formats but to support the feature sets of both formats in the major applications.

I remember when I participated in the interop-work for the Danish Government in Fall 2007 and we tried to say something clever about the disappointing results we saw of translation. We heard the rumours of Novell skipping the XSLT-translation of ODF to OOXML (and vice versa) and instead extending the internal object model of Novell's edition of OpenOffice.org. This was where the idea was born.

The idea was to round-trip documents in the format they were born and not to attempt translation (also, how the hell do you translate e.g. a digital signature between an ODF-file and an OOXML-file?).  What triggered the "vision" was that 1) the formats are not fully compatible and 2) translation sucks. In every interop-session I have attended and in every piece of interop-work I have participated in, there has been one, crystal clear conclusion:

When you translate, you lose information.
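
To make that conclusion a bit more concrete, here is a toy sketch in Python. The two feature sets and their names (`legacy_layout_flags`, `named_styles`) are made up for illustration and do not describe what ODF or OOXML actually contain; the only point is that whatever the target format cannot express is gone after a round trip.

```python
# Hypothetical feature sets; the names are illustrative only and are not
# taken from either specification.
FORMAT_A_FEATURES = {"text", "tables", "legacy_layout_flags"}
FORMAT_B_FEATURES = {"text", "tables", "named_styles"}


def translate(document, target_features):
    """Keep only what the target format can express; return (result, lost)."""
    kept = {key: value for key, value in document.items() if key in target_features}
    lost = sorted(set(document) - set(kept))
    return kept, lost


# A document "born" in format A, translated to B and back again.
original = {"text": "Hello", "tables": 2, "legacy_layout_flags": ["x"]}
as_b, lost_going = translate(original, FORMAT_B_FEATURES)
back_to_a, lost_returning = translate(as_b, FORMAT_A_FEATURES)

print("lost on the way to B:", lost_going)
print("round trip is lossless:", back_to_a == original)
```

A "channel" in the sense described above would simply never call `translate` on a document that was born in the other format.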

Essentially, translation is a poor-man's document consumption, because if you lose information when translating - why would you do it? As Rob so correctly points out - when Microsoft chooses to use translators to enable "support" for ODF in their Microsoft Office suites, it's really another way of saying: "We don't really care about ODF". The same thing naturally goes for OpenOffice.org (and spin-offs). When they insist on implementing just import filters for OOXML and use translators to do so - they are saying exactly the same: "We don't really care about OOXML". In both cases what they are communicating to their users is really

We don't care that you lose information - you'll just have to settle for half of the correct solution

It's the same message I hear when some of my colleagues come to me and say: "Jesper, I finished the piece of code you wanted me to do". Sometimes I am blessed with conversations like:

Colleague: I finished the code piece
Jesper: Cool - does it work all right?
Colleague: Eh well, it compiles just fine ...

Is that good enough?

(and with this friendly post, I can only hope "someone" will accept the LinkedIn-invitation I sent in February just before the BRM in Geneva ... or maybe I should try Diigo instead?)


Do you license your blog-content?

A few weeks back I attended an IT-architecture conference in Aarhus, Denmark and one of the sessions I participated in was about licensing your software with OSS-licensing. It was originally about software licensing, but at the end of the session, the speaker asked the audience:

How many of you are bloggers?

A few of us raised our hands. Then he asked:

How many of you have thought about how you license your blog entries?

Well, I for one didn't have a clue. Then the other day I noticed a small image at the bottom of the posts of Rick Jelliffe saying "Some rights reserved". It linked to Creative Commons and that kinda got the ball rollin'. I read about the different license-models and I have come to the conclusion that the license most applicable to me and the content I put online is the "Attribution"-model. This is the least restrictive of the Creative Commons licenses and it says in abstract:

This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of licenses offered, in terms of what others can do with your works licensed under Attribution.

One reason I chose this was that I hereby grant everyone the right to use my work commercially. You see, say I made an argument in a post that Rob Weir liked so much that he wanted to quote me on his blog. Even though I am not a lawyer, I fear that he might not publish it if it was under a "non-commercial"-license (IBM being a commercial company and all). So to be sure that most of you will be able to use the work I publish here, I chose the "Attribution"-license for my entries.

What about the rest of you - have you thought of this? 

Challenge (Part II)

A tongue-in-cheek challenge for Mr. Rob Weir.

[code=xml]<?xml version="1.0" encoding="UTF-8"?>
      <!-- Fragment only: the namespace declarations for table: and office: are omitted -->
      <table:table table:name="Sheet1" table:protected="true" table:protection-key="8A45FB0C33667F9E33ECA007FCE4F6684DC5F242">
        <table:table-column />
        <table:table-row>
          <table:table-cell office:value-type="float" office:value="10" />
        </table:table-row>
        <table:table-row>
          <table:table-cell office:value-type="string">
              Dear Rob Weir. Please prove by this example that ODF is an "interoperable"
              document format and tell me how a consuming application should determine if the
              user should be allowed to modify the document. I do not think that it is.
              In fact I think that your statements that ODF is a document format that
              provides interoperability are brash, irresponsible and indefensible
              pieces of bombast that you should retract.
          </table:table-cell>
        </table:table-row>
      </table:table>
[/code]

(and yes, one of the reasons for this post is to show off the cool syntax highlighter of this blog engine)


And could you guys please stop the bickering and let's move on to something a bit more interesting? 

What is conformance, really?

The OOXML/ODF-blogsphere has been in a frenzy the last couple of weeks after a couple of posts made by yours truly and Alex Brown that were picked up by Rob Weir. I don't want to get into the technical details here - you should catch up on the conversations taking place in the comment sections of their respective blogs.

But I do want to talk a bit about conformance - because conformance should be much more than schema-validation. To get a clear perspective, we need to look in the two specifications for how conformance is described.

ODF 1.0 (IS 26300):

Conformance is described in section 1.5 

Documents that conform to the OpenDocument specification MAY contain elements and attributes not specified within the OpenDocument schema. Such elements and attributes must not be part of a namespace that is defined within this specification and are called foreign elements and attributes.

So this means that the only requirement for a document to have an "ODF-conformant" sticker slapped on it is to be able to validate against the ODF schema. If the document contains elements or attributes not defined in ODF 1.0, they should be marked with their own namespaces. This is actually all there is to say about conformance of individual documents in ODF 1.0.

The section further describes conformance requirements for consuming and producing applications:

Conforming applications either MUST read documents that are valid against the OpenDocument schema if all foreign elements and attributes are removed before validation takes place, or MUST write documents that are valid against the OpenDocument schema if all foreign elements and attributes are removed before validation takes place.

So this section describes requirements to how foreign elements are handled when writing and reading ODF documents.
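
As a rough illustration of the "remove foreign elements and attributes before validation" step, here is a minimal Python sketch. It only covers the stripping, not the schema validation that would follow, and it ignores details a real implementation would have to deal with. The function names `namespace_of` and `strip_foreign` are my own, the namespace list is a small subset picked for the example, and `http://example.com/foreign` is a made-up foreign namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: a small subset of the namespaces defined by ODF, enough for this
# example. The full list is given in the specification.
ODF_NAMESPACES = {
    "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def namespace_of(qname):
    """Return the namespace URI of an ElementTree '{uri}local' name, or ''."""
    if qname.startswith("{"):
        return qname[1:].split("}", 1)[0]
    return ""


def strip_foreign(element):
    """Remove foreign attributes and child elements in place, recursively."""
    for name in list(element.attrib):
        ns = namespace_of(name)
        # Attributes without a namespace are left alone in this sketch.
        if ns and ns not in ODF_NAMESPACES:
            del element.attrib[name]
    for child in list(element):
        if namespace_of(child.tag) not in ODF_NAMESPACES:
            element.remove(child)  # drops the foreign element and its subtree
        else:
            strip_foreign(child)
    return element


sample = ET.fromstring(
    '<office:body xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    'xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" '
    'xmlns:x="http://example.com/foreign" x:flag="1">'
    "<text:p>kept</text:p><x:note>dropped</x:note></office:body>"
)
strip_foreign(sample)
print([child.tag for child in sample])
```

What is left after `strip_foreign` is what you would hand to a schema validator.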

OOXML 1.0 (IS 29500):

The conformance clauses for OOXML were (drastically) changed at the BRM. Conformance in OOXML is described with more details and most specifically it contains conformance clauses for the OOXML-package itself, the so-called "OPC-package".

As with ODF, an OOXML 1.0 document is conformant if it adheres to the schema described in the standard.

More specifically it says in Part 1 section 2.4

Document conformance is purely syntactic; it involves only Items 1 and 2 in §2.3 above.

  • A conforming document shall conform to the schema (Item 1 above) and any additional syntax constraints (Item 2).

Now, this is already more difficult to "put down on paper" than the ODF-equivalent, because "Item 1" and "Item 2" are described in Part 1 section 2.3 as

  1. Schemas and an associated validation procedure for validating document syntax against those schemas. (The validation procedure includes un-zipping, locating files, processing the extensibility elements and attributes, and XML Schema validation.)
  2. Additional syntax constraints in written form, wherever these constraints cannot feasibly be expressed in the schema language.
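
To give a feeling for the first steps of that validation procedure (un-zipping and locating files), here is a small Python sketch using only the standard library. It is not a validator: it merely checks that each XML part in a package is well-formed, and it does not process extensibility elements, run XML Schema validation (Item 1) or check the written constraints (Item 2). The helper names `check_package` and `build_toy_package` and the contents of the toy package are made up for the example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_package(data):
    """Unzip a package held in bytes, locate its XML parts and parse each one.

    Returns a dict mapping part name to True (well-formed) or False.
    Schema validation and the written syntax constraints are NOT checked.
    """
    results = {}
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        for name in package.namelist():
            if not name.endswith((".xml", ".rels")):
                continue  # skip binary parts such as images
            try:
                ET.fromstring(package.read(name))
                results[name] = True
            except ET.ParseError:
                results[name] = False
    return results


def build_toy_package():
    """Build a tiny in-memory zip with one good and one broken XML part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<document><body></document>")
    return buffer.getvalue()


report = check_package(build_toy_package())
for part, ok in sorted(report.items()):
    print(part, "well-formed" if ok else "NOT well-formed")
```

A document that fails already at this stage can obviously not be conformant, but passing it says nothing yet about Items 1 and 2.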

As a side-note, Item 2 above was the exact reason Stéphane Rodriguez's example with the broken Calculation Chain was actually a non-conforming OOXML-document, but that's a completely different story.

Moreover OOXML describes a few "conformance classes", specifically "Wordprocessing", "Spreadsheet" and "Presentation"-classes. The intent here is to be able to claim conformance to parts of the OOXML-spec.

And just as ODF contained requirements for applications, so does OOXML. But it takes conformance a bit wider. Since there is an "Item 1" and "Item 2" above, there is also an "Item 3". This was modified at the BRM and now says:

3. Descriptions of element semantics. The semantics of an element refers to its intended interpretation by a human being. 

In section 2.5 of Part 1 it now says:

Application conformance incorporates both syntax and semantics; it involves items 1, 2 and 3 in §2.3 above.

So a conforming application also has to abide by the semantics of the specification of elements and attributes. In layman's terms this could be described as "A conforming application has to treat content faithfully with respect to the specification of it". So it basically tells applications not to make their own interpretation of the elements they encounter as they traverse the XML-tree.

Now, I know that this is just a crude introduction to conformance of ODF- and OOXML-documents, but I think it is important to get the ball rolling and to give everyone a feeling of the complexity of the concept.

Thoughts, anyone?