a 'mooh' point

clearly an IBM drone

Beer - revisited

The beer-drinking was a huge success. There were only four of us, but we had a great afternoon and early evening. It was a lot of fun to meet offline and have a decent conversation on ODF and OOXML ... much more fruitful than blogging.

Clockwise, from bottom left: Henrik Stig Jørgensen, Eskild Nielsen, Jesper Lund Stocholm and Christian Nobel.

I am looking forward to doing this again, guys!


Get the ball rollin'

The last couple of weeks have been almost as quiet as the couple of weeks after the BRM in Geneva. Most bloggers have cut down on their posts regarding ODF and OOXML (including yours truly) and have apparently gone back to their day-jobs to do a bit of summer cleaning before the summer vacations. There have, however, been some interesting developments in some corners of the ODF/OOXML blogosphere.

First,

ODF TC has launched the preliminary work to form another ODF TC to focus on interoperability between ODF-enabled applications. I am one of the "lurkers" on the mailing list, and I encourage everyone to either eavesdrop on the conversations taking place or actively participate in the debate. As noted before, conformance and interoperability are so much more than schema validation, and the initiative is highly valuable. The topics being discussed are conformance, interoperability, IPR, AcidTests and much, much more.

Second,

Microsoft has joined ODF TC in OASIS. Judging by Paul's message to the mailing list of ODF OIIC, it seems that Microsoft has already been admitted into ODF TC. Microsoft is now listed as one of the "Sponsor Level" members along with Adobe, Google Inc, IBM, Intel Corporation, Microsoft Corporation, Novell, and Sun Microsystems. (PS: Why is there an asterisk after the names of Google and Novell?) As far as I get the legal mumbo-jumbo, this would mean that Microsoft will include ODF in its OSP. According to the content of the list as I write this, it has not yet been included. This also reminds me that Microsoft said that they would modify the OSP for OOXML and specify that it also covers any future versions of OOXML that Microsoft participates in producing. We have yet to see the modified text of the OSP, so Microsoft, please "encourage" your legal team to finish up the wording.

and third,

(and by far, most importantly) 

Today is the day of the off-line beer drinking of the participants in the Danish debates around ODF/OOXML. We decided that it would be fun to have an off-line meeting between those of us who know each other by (blog)name only. It takes place at 17.00 this afternoon at BrewPub in Copenhagen. If you have a chance, feel free to come by and share a beer with the rest of us.


No reason anymore to mandate anything but ODF?

Yesterday the news broke about Microsoft adding support for ODF in Microsoft Office 2007 SP2. Within minutes the news spread like fire on the hills of Malibu, California and blog-entries started to pop up everywhere - even Brian Jones has apparently returned from Winter hibernation and has made his first blog post in almost 6 weeks. Welcome back to the party, Brian.


The Danish IT news sphere was not slow on the keyboard either: ComputerWorld Denmark posted an article yesterday evening and the competition on version2.dk followed up on the news this morning. I myself got the information from Luc Bollen in his comment on the article I wrote on document translation (and why it sucks). I was sitting under a maple-tree (or some other wooden artifact) having a beer with a friend after a fabulous sushi-dinner and could do absolutely nothing about it.

Dammit!

Well, the reactions to Microsoft's move have actually been surprisingly positive. Even the ADD-bunch at noooxml.org said "If this is an honest attempt to play nice, it is a very welcome move" and even IBM has been quite positive - prompting Bob Sutor to turn the axe on Apple saying: "Hey, Apple, what about you? Let’s see you do this in iWork!". Simply starting to beat on someone else reminds me of the John Wayne quote "A day without blood is like a day without sunshine".

But what is missing from the reactions?

OSP coverage of ODF

One of the side-effects of Microsoft joining OASIS ODF TC is that ODF will likely be included in the list of specifications covered by Microsoft's Open Specification Promise (OSP). The list of specifications has not yet been updated, but I would expect it to be updated soon - or at least when they officially join the ODF TC. When you think about all the fuss around IPR this spring, it is quite surprising that no one has picked up on this. It rams a huge stick through the FUD about the OSP not being applicable to GPL-licensed software. Once the OSP covers ODF, it also covers the native document format of OpenOffice.org [LGPL 3.0 license] and (I think) OpenOffice Novell Edition.

But why OOXML, then?

A lot of people are now spinning this move as pulling the rug out from under OOXML and arguing that ODF should be mandated everywhere - but nothing could be further from the truth. The reason why we approved OOXML still stands, and the incompatible feature-sets of OOXML and ODF did not suddenly become compatible. There is still stuff in OOXML that cannot be persisted in ODF and vice versa. The backwards compatibility with the content in the existing corpus of binary documents is still a core value of OOXML, and ODF's incompatibility with that corpus has not disappeared. You will still lose information and functionality when you choose to persist an OOXML-file in ODF ... just as you would when persisting it to old WordPerfect formats. Insisting that having ODF-support in Microsoft Office (12 SP2) makes the need for OOXML go away is a moot point - I am sure no one would argue for replacing OOXML with TXT simply because TXT is a supported format in Microsoft Office.

Microsoft steps up to the task at hand

Some quite extraordinary news emerged from the Redmond, WA, headquarters of Microsoft today. In summary, they announced that

  1. Microsoft will join OASIS ODF TC
  2. Microsoft will include ODF in their list of specifications covered by the Open Specification Promise (OSP)
  3. Microsoft will include full, native support for ODF 1.1 in Microsoft Office 14 and in Microsoft Office 12 SP2 - scheduled for Q2 2009. Microsoft Office 12 SP2 will have built-in support for the three most widely used ISO-standards for document formats, i.e. OOXML, ODF and PDF.


My initial reaction when I heard it was "Wow ... that's amazing". I am sure a lot of people will react with "It's too little, too late", but let me use a couple of bytes to describe why I think it is a good move by Microsoft.

Microsoft joins OASIS ODF TC

Well, Microsoft has been widely criticised for not joining OASIS a few years ago. I think it is a bogus claim, but nevertheless, it has been on the minds of quite a lot of people. Novell has had a seat in both ECMA TC45 and OASIS ODF TC for some time now, and it is my firm belief that both consortia have benefited from this. The move by Microsoft to join OASIS ODF TC will likely have a similar effect. One of the most frequent requests in the standardisation of OOXML was to increase the feature-overlap of ODF and OOXML. This is quite difficult to accomplish (effectively) without knowing what the features of the other document format are (going to be). With Microsoft participating on both committees (and IBM will hopefully consider joining ECMA TC45), harmonization (or "enlargement of the feature-overlap") will likely occur at a quicker pace.

This also means that the worries some of us have had about Microsoft's future involvement in standardisation work around document formats have been toned down a bit. Microsoft is now actively participating in this work in ECMA, in ISO and also in OASIS. I think this is really good news. Not good news for Microsoft - but good news for those of us who work with document formats every day.

Microsoft will cover ODF with OSP

One of the most difficult non-technical discussions during the standardisation of OOXML concerned the legal aspects. There were discussions about the different wordings in Sun's CNS, IBM's ISP and Microsoft's OSP (Jesus Christ, guys, pick ONE single acronym, already!) and the possible impact on implementers of ODF and OOXML. One of the aspects of the discussion that never really surfaced was that if IBM has software patents covering ODF, some of them quite possibly cover parts of OOXML as well. But the ISP of IBM does not mention OOXML - it only mentions ODF. This leaves me as a developer in quite a legal pickle, because by implementing OOXML I am covered by the OSP - but I am not covered by IBM's ISP (and vice versa). To me as a developer, Microsoft's coverage of ODF in their OSP is a good move, because it should remove all legal worries I might have around stepping into SW-patent covered territory.

ODF support in Microsoft Office

Microsoft will finally deliver on requests for native ODF support in Microsoft Office. Microsoft will support ODF 1.1 in Microsoft Office 12 SP2 and also have built-in support for PDF and XPS (these are currently only available as a separate download).

Denmark is one of the countries where both ODF and OOXML have been approved for usage in the public sector. This currently brings quite a bit of complexity to the daily work of information workers, since there are not many (if any) applications offering high fidelity, native support for both formats. They hence rely on translators like ODF-Converter or similar XSLT-based translators. It's a bad, but currently necessary, choice. The usage of translators for document conversion has been widely criticised, amongst others by Rob and me, and the built-in support for ODF in Microsoft Office is a great step in the right direction.

As with everything Microsoft does, we need a healthy amount of scepticism as to what extent they will deliver on their promises. However, I truly believe that the moves by Microsoft here are good news - regardless of the scepticism. An old proverb says "don't count your chickens before they hatch" - and this applies perfectly here. We will have to wait and see what will eventually happen - but so far ... it looks good.

Beer, beer, beer ... bed, bed, bed ... (translated from Danish)

In the Danish part of the ODF/OOXML blogosphere we have talked for a while about how it could be fun to meet for a beer somewhere in Copenhagen. Most of us only know each other from our respective names and gravatars, and it is at least my experience that you are a bit slower on the trigger once you have put a face to the name.

I therefore invite you to a round of beer (or similar, it could also be something without alcohol)

Thursday, June 12th 2008 at 17:00

I will find a place for us before long - but if you have a suggestion for a place, let me know. I am leaning towards ScrollBar at ITU in Ørestaden myself.

Sign up in the comment thread below.

Official complaint on OOXML-procedures in Denmark

I just wanted to let you in on a bit of information here from sunny Copenhagen.

Denmark has joined Norway in the strange sense that the Danish NSB (Dansk Standard) has received an official complaint regarding the Danish vote on March 29th 2008. I am sure the news will spread to the rest of the blog-sphere soon, so be the first to get the information here (my translation)

The Municipality of Aarhus, which was a member of the OOXML-committee in Dansk Standard, has now complained about the "Yes"-vote in the ISO-approval [of OOXML]. The reason: No one knows the real content of the specification. [...] "It is strange to vote 'Yes' to a standard that could still be filled with flaws and defects. In principle, it might say that the Moon is made of Swiss cheese, and they voted 'Yes' to that", explains Jens Kjellerup to Computerworld.

Check it out here (in Danish): http://www.computerworld.dk/art/45835?a=fp_2&i=514

(D)IS 29500 ISO process F.A.Q.

Due to the still overwhelming interest in the now completed ISO DIS 29500 process, ISO has created a small F.A.Q. to answer some of the more frequently asked questions.

My excerpts from the F.A.Q. are listed here:

Q: How could a 6.000-page document be fast-tracked?

Because the information technology (IT) sector is fast-moving, the joint technical committee ISO/IEC JTC 1, Information technology, introduced the "fast track" process for the adoption as ISO/IEC standards of documents originating from the IT sector on which substantial development has already taken place.

(...)

The number of pages of a document is not a criterion cited in the JTC 1 Directives for refusal. It should be noted that it is not unusual for IT standards to run to several hundred, or even several thousand pages.

ISO/IEC 29500 has spent a total of 15 months being processed within the ISO/IEC system, from its submission in December 2006 to the deadline of 29 March 2008 approving it.

Q:  Why would ISO and IEC allow two standards for the same subject?

(...)

In this particular case, some claim that the Open Document Format (ODF), which is also an ISO/IEC standard (ISO/IEC 26300) and ISO/IEC 29500 are competing solutions to the same problem, while others claim that ISO/IEC 29500 provides additional functionalities, particularly with regard to legacy documents.

The ability to have both as International Standards was something that needed to be decided by the market place. ISO and IEC and their national members provided the JTC 1 infrastructure that facilitated such a decision by the market players.

Q: What about hidden patent issues?

(...)

Microsoft, the holder of patents involved in the implementation of ISO/IEC 29500, has made such a declaration to ISO and IEC. If, after publication of the standard, it is determined that licenses to all required patents are not so available, one option would be to withdraw the International Standard.

Q: What about contradictions with other ISO and IEC Standards?

(...)

A number of such claimed contradictions were identified during the one-month JTC 1 fast-track review period, prior to its release for voting and comment. The submitter, Ecma International, responded to these comments at the end of the review period.

Some of these comments were reflected in national body comments on the fast-track Draft International Standard (DIS). These comments, e.g. the non-alignment with ISO 8601, Data elements and interchange formats – Information interchange – Representation of dates and times, were dealt with in the ballot resolution meeting (BRM).

It is possible that others may still remain, but these can be taken care of during the maintenance of the standard.  In all cases, the final decision on whether there are contradictions and how to resolve them rests with the national members of ISO and IEC.

Q: Will ISO and IEC review how ISO/IEC 29500 was adopted?

We reviewed the process before it started, all the while during its course and afterwards as well. While the voting on ISO/IEC 29500 has attracted exceptional publicity, it needs to be put in context. ISO and IEC have collections of more than 17 000 and 7 000 successful standards respectively, these being revised and added to every month. This suggests that the standards development process is credible, works well and is delivering the standards needed, and widely implemented, by the market. (...)

Object-embedding in OOXML with Microsoft Office 2007

(updated 2008-04-14, added links to external resources) 

Now that the ISO-vote and approval of OOXML is done with, it is time to continue the coverage of implementing OOXML as well as ODF – this time about OOXML, Microsoft Office 2007 and embedded objects.

As I have previously said, there are always quirks when it comes to implementations of any standard in large applications. I have covered a few of these already regarding mathematical content [0], [1], and it is no different with regard to object embedding. I should say that a source of inspiration for this article was Stéphane Rodrigues’ article about binary Parts of an OOXML-file (OPC-package).

Now, embedding objects in an OOXML-file is pretty straight-forward: Simply add the object somewhere in the package and make a reference to the location and specify what kind of file you are embedding. This is very similar to how it is done in ODF.

(note: the specific schema-fragments defining how to do this were dealt with and changed at the BRM, so I will not include these until the final version of IS 29500 is released. I will update this article according to the revised spec).

As I have noted earlier, interoperability happens at application-level, so it is worth pondering a bit on how the specification is implemented in the major implementations of it. So let’s see how Microsoft Office acts when embedding objects.

What I did was this: 

I used Microsoft Office 2007, created a text-document and embedded an object in it – in this case an OpenOffice.org Calc Spreadsheet. The spreadsheet is also inspired by one of Stéphane Rodrigues’ articles, the infamous “OOXML is defective by design”.

 

The object is inserted and displayed in the document. When activating the object, I can edit it as if it was in OOo Calc itself. Actually it is OOo Calc itself. It is invoked using OLE and as a side-note it shows a cool thing about OLE – or similar other object linking techniques. Microsoft Office 2007 does not know anything about OpenOffice.org, yet it is still able to invoke the application and edit the embedded object.

 

Ok – now let’s look at the OOXML-file created. In the file document.xml the following fragment is located:


The <v:shape>-element is part of the nasty VML-dependency that luckily was dealt with at the BRM. This will be replaced by DrawingML in the final IS 29500. The <o:OLEObject>-element specifies the type of the embedded object (“opendocument.CalcDocument.1”) and the location of it (“rId5”). There is really nothing platform dependent here in the OOXML-markup.

What is more interesting, though, is looking at the Calc-object after it is embedded. By navigating through the relationship-model of the OPC-package, the embedded object is located.
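Navigating the relationship-model can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names are mine, the demo package is built in memory rather than taken from the document above, and the part name "embeddings/oleObject1.bin" and the relationship type are illustrative values, not read from my test file. External targets (TargetMode="External") are ignored for brevity.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship parts.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def rels_part_for(source_part: str) -> str:
    """Return the name of the relationships part that belongs to source_part."""
    folder, name = posixpath.split(source_part)
    return posixpath.join(folder, "_rels", name + ".rels")


def resolve_relationship(package: zipfile.ZipFile, source_part: str, rel_id: str) -> str:
    """Map a relationship id (e.g. "rId5") to the name of the part it points at."""
    root = ET.fromstring(package.read(rels_part_for(source_part)))
    for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
        if rel.get("Id") == rel_id:
            # Relative targets are resolved against the folder of the source part.
            base = posixpath.dirname(source_part)
            target = posixpath.join(base, rel.get("Target"))
            return posixpath.normpath(target).lstrip("/")
    raise KeyError(rel_id)


def build_demo_package() -> bytes:
    """Build a tiny in-memory package with one relationship (illustrative only)."""
    rels_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<Relationships xmlns="{RELS_NS}">'
        '<Relationship Id="rId5" '
        'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" '
        'Target="embeddings/oleObject1.bin"/>'
        "</Relationships>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<placeholder/>")
        package.writestr("word/_rels/document.xml.rels", rels_xml)
        package.writestr("word/embeddings/oleObject1.bin", b"payload")
    return buffer.getvalue()


if __name__ == "__main__":
    with zipfile.ZipFile(io.BytesIO(build_demo_package())) as demo:
        print(resolve_relationship(demo, "word/document.xml", "rId5"))
```

The point is simply that the markup only carries an id; the actual location of the embedded object lives in the relationships part next to the document part.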

 

One might think that this file was simply the Calc-file renamed, but sadly this is not so. This file is actually the Calc-file wrapped in an OLE2 Compound file (“CF”). The CF-file is basically a stream wrapper which allows a number of streams to be persisted in a file as well as information about these streams. Using one of the many CF-viewers you can get the data of the wrapped file itself as well as the persisted information of it, here “com.sun.star.comp.Calc.SpreadsheetDocument _   Embedded Object _   opendocument.CalcDocument.1”.
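If you do not have a CF-viewer at hand, a quick way to tell a wrapped object from a plain one is to look at the first bytes of the embedded part. The sketch below is my own illustration (the function name and labels are made up): OLE2 compound files start with the signature D0 CF 11 E0 A1 B1 1A E1, while an ODF file is a ZIP package, which normally starts with "PK" followed by 03 04.

```python
import io
import zipfile

# Header signature of an OLE2 compound file.
CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature at the start of a non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def classify_embedded_part(data: bytes) -> str:
    """Guess the container type of an embedded part from its leading bytes."""
    if data.startswith(CFB_MAGIC):
        return "ole2-compound-file"
    if data.startswith(ZIP_MAGIC):
        return "zip-package"
    return "unknown"


def make_demo_zip() -> bytes:
    """Create a minimal ZIP archive in memory, standing in for an ODF file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", "<placeholder/>")
    return buffer.getvalue()


if __name__ == "__main__":
    print(classify_embedded_part(CFB_MAGIC + b"\x00" * 504))
    print(classify_embedded_part(make_demo_zip()))
```

A check like this only tells you which container you are looking at; getting the Calc-file out of the wrapper still requires a real compound file reader.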

 

 

Technically this is really not a big deal – there are well-known ways to manipulate these files on all platforms and most programming languages and extracting the required data should really be a no-brainer. OpenOffice.org is licensed under LGPL, so you can use the source-code from this to figure out how to do it on the platforms supported by OpenOffice.org. It is also pretty evident why Microsoft Office 2007 works this way. Microsoft Office 2007 is the latest incarnation of the Microsoft Office Suite – a suite that has depended on this file format since at least 1999 … and of course on OLE itself as well. So if you want to implement a document consumer, this is simply something to be aware of when consuming OOXML-files.

From the perspective of a developer, however, this is really annoying. I would definitely opt for Microsoft Office 2007 embedding the objects simply as the objects they are – and not wrapping them in a CF-wrapper. This is how it is done in OpenOffice.org. Granted, this suite does other weir(d) things like renaming the files and not being entirely clear how to embed all object types, but the objects are embedded as they are (unless they are OpenDocument objects). This is a benefit to me as a developer when examining OOXML-files, because I can simply extract the object in question from the document package and verify the file.
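To make "extract the object and verify the file" concrete, here is a minimal sketch of the check I have in mind. Everything in it is hypothetical: the helper names, the part name "word/embeddings/sheet.ods" and the fake payloads are made up for the illustration and do not come from a real document.

```python
import io
import zipfile


def build_package(parts: dict) -> bytes:
    """Build an in-memory ZIP package from a {part name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, data in parts.items():
            package.writestr(name, data)
    return buffer.getvalue()


def embedded_part_is_unmodified(package_bytes: bytes, part_name: str, original: bytes) -> bool:
    """Return True when the embedded part is byte-for-byte the original object."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.read(part_name) == original


if __name__ == "__main__":
    original = b"pretend this is an .ods file"
    # Object stored as-is: the check passes.
    as_is = build_package({"word/embeddings/sheet.ods": original})
    print(embedded_part_is_unmodified(as_is, "word/embeddings/sheet.ods", original))
    # Object stored inside some wrapper: the check fails.
    wrapped = build_package({"word/embeddings/sheet.ods": b"WRAPPER" + original})
    print(embedded_part_is_unmodified(wrapped, "word/embeddings/sheet.ods", original))
```

With objects embedded as they are, this comparison is all it takes; with a CF-wrapper in between, the object has to be unwrapped before anything can be compared.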

So this might be the first new post-vote change-modification to IS 29500:

 

When embedding objects an application shall not modify or wrap the embedded object in any way before embedding it in the package. When a document consumer encounters an embedded object, this shall not be converted to another object type without knowledge-based confirmation by the user.

 

This (or similar wording in standard-lingo) would prevent Microsoft Office from wrapping objects in CF-wrappers, but it would also prevent applications like OpenOffice.org on SUSE from converting embedded Excel-objects to Calc-spreadsheets. FYI, this kills interop too.

A final request: Microsoft, please, as you must already be implementing the changes from the BRM for Office 2007, would you be so kind to make this change to the application as well? It should really be a no-brainer, and if there should be any requirements in your code for the CF-files, feel free to load the objects, wrap them in an in-memory CF-file and take it from there.


Three monkeys - one was Håkon Lie

(corrected quote of Håkon Lie) 

After the demonstration in Oslo yesterday (damn I wish I had been there) the CTO of Opera Software, Håkon Wium Lie, was interviewed by Norwegian newspaper VG. The interview is in Norwegian but let me translate a bit for you:

Håkon Lie: What might happen if Microsoft gets this [OOXML ISO-approval] [OOXML added to the list of approved mandatory document formats in Norway, JLS addition ] through is that Norwegian authorities may be forced to use it, and this means that if you receive an email with an attachment and you don't have a program to read this attachment - it could be a message from a teacher of your child that attends a Norwegian school - when you cannot open this attachment, you will have to buy software from Microsoft. So this is really a "Microsoft-tax" that may be the consequence if Microsoft wins here. We are against this.

Dear Håkon, I love the software you guys make - I use it every day on my cell-phone ... but are you out of your mind? I would expect those kinds of arguments from the typical Tux-f**kers (or in reverse, from the usual Microsoft fan-boys whose coding-skills revolve around point-and-click in Visual Studio Web Developer). I would not expect this from the CTO of the third-largest browser-producer in the world, and your argument here makes it all so much clearer to me why Standard Norge discarded your arguments.

I am sure Gene Amdahl would be proud of you.


 

OOXML is now IS 29500

Long awaited, the votes on the ISO/IEC DIS 29500 have been counted and verified. The unofficial result began circulating between the national bodies yesterday but the result was not made public until today, Wednesday April 2nd 2008.

The results are pretty clear: OOXML has now been approved as an ISO/IEC international standard.

Result of voting

P-Members voting: 24 in favour out of 32 = 75 % (requirement >= 66.66%)


(P-Members having abstained are not counted in this vote.)


Member bodies voting: 10 negative votes out of 71 = 14 % (requirement <= 25%)


Approved
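For those who want to check the arithmetic, the two criteria can be written down in a few lines of Python. This is only a sketch: the function name is mine, and I am assuming that "66.66%" in the result above means exactly two-thirds and "25%" exactly one quarter. Abstentions are expected to be left out of the counts passed in.

```python
from fractions import Fraction


def dis_ballot_approved(p_in_favour: int, p_votes_cast: int,
                        negative_votes: int, total_votes_cast: int) -> bool:
    """Apply both approval criteria to a DIS ballot (abstentions excluded)."""
    # Criterion 1: at least two-thirds of the P-member votes cast are in favour.
    p_members_ok = Fraction(p_in_favour, p_votes_cast) >= Fraction(2, 3)
    # Criterion 2: no more than one quarter of all votes cast are negative.
    negatives_ok = Fraction(negative_votes, total_votes_cast) <= Fraction(1, 4)
    return p_members_ok and negatives_ok


if __name__ == "__main__":
    # Figures from the result above: 24 of 32 P-members, 10 negatives of 71 votes.
    print(dis_ballot_approved(24, 32, 10, 71))
```

Using fractions instead of rounded percentages avoids any doubt about borderline cases, e.g. 21 of 32 P-members in favour would round to 66 % but falls short of two-thirds.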

I think this pretty much summarizes it.

It's been a good one - thanks to all I have worked with throughout this process the last year or so - it's been great getting to know you. Also thanks to everyone contributing their valuable input to this process.