Are document formats silver bullets?

by jlundstocholm 6. August 2008 09:04

A new study from the University of Illinois College of Law has made its way to cyberspace. The title is "Lost in Translation: Interoperability Issues for Open Standards - ODF and OOXML as Examples", and it was written by Rajiv Shah and Jay P. Kesan. The study takes a rather novel approach compared to the debates that have been raging over the last year or so: Is the choice of a(ny) document format a silver bullet for interoperability?

The answer in the paper is a clear "No". When discussing the various international interop studies, the authors note:

While it is widely acknowledged that there are problems with interoperability across different formats, e.g., going from ODF to OOXML, there is an assumption here that all implementations produce the same ODF or OOXML.

Their conclusion is that this is not the case. What they did was to create a number of test documents using the reference implementation for each format, OpenOffice.org for ODF and Microsoft Office 2007 for OOXML. They then opened these documents in other applications supporting these formats.

The results are rather interesting:

Results for ODF

Implementation                  Raw score   Raw score percentage   Weighted percent
OpenOffice                      151         100%                   100%
StarOffice                      149         99%                    97%
Sun plug-in for Word            142         94%                    96%
CleverAge/MS plug-in for Word   139         92%                    94%
WordPerfect                     122         81%                    86%
KOffice                         121         80%                    79%
Google Docs                     117         77%                    76%
TextEdit                        55          36%                    47%
AbiWord                         48          32%                    55%

Results for OOXML

Implementation                  Raw score   Raw score percentage   Weighted percent
Office 2007                     148         100%                   100%
Office 2003                     148         100%                   100%
Office 2008 (Mac)               147         99%                    99%
OpenOffice                      141         95%                    96%
Pages                           142         96%                    95%
WordPerfect                     114         77%                    84%
ThinkFree Office                101         68%                    83%
TextEdit                        52          35%                    43%
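A side note on the numbers: the "raw score percentage" column appears to be each raw score divided by the score of the reference implementation, rounded to a whole percent. That is my reading of the tables, not something quoted from the paper. The small Python sketch below recomputes the column under that assumption. The function name `raw_percentages` is made up for illustration. The weighted percentages are not reproduced, since they depend on weights from the paper that are not shown here.

```python
# Raw scores copied from the two tables above.
odf_raw = {
    "OpenOffice": 151, "StarOffice": 149, "Sun plug-in for Word": 142,
    "CleverAge/MS plug-in for Word": 139, "WordPerfect": 122,
    "KOffice": 121, "Google Docs": 117, "TextEdit": 55, "AbiWord": 48,
}
ooxml_raw = {
    "Office 2007": 148, "Office 2003": 148, "Office 2008 (Mac)": 147,
    "OpenOffice": 141, "Pages": 142, "WordPerfect": 114,
    "ThinkFree Office": 101, "TextEdit": 52,
}


def raw_percentages(scores, reference):
    """Express each raw score as a whole-number percentage of the reference score.

    Assumption: the reference implementation's score is the maximum attainable.
    """
    top = scores[reference]
    return {name: round(100 * score / top) for name, score in scores.items()}


odf_pct = raw_percentages(odf_raw, "OpenOffice")
ooxml_pct = raw_percentages(ooxml_raw, "Office 2007")

for label, table in (("ODF", odf_pct), ("OOXML", ooxml_pct)):
    print(label)
    for name, pct in table.items():
        print(f"  {name}: {pct}%")
```

Under this assumption the recomputed values agree with the raw score percentages listed in both tables.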

They further conclude that

The final implication stems from the surprisingly good results for OOXML implementations. Critics of OOXML have argued that it was too complex and difficult to implement. While OOXML is a long and complex standard, it is possible to offer good compatibility. In fact, our results suggest that implementations of OOXML work as well as implementations of ODF. At the level of basic word-processing that we examined, neither standard had a dominant advantage over the other in terms of compatibility scores. While ODF has had a head start that has lead to more implementations, there appears no reason why OOXML cannot catch up. After all, several developers have provided independent implementations of OOXML.

... which should be interesting for those mandating usage of (an open) document format.

If nothing else this study highlights a couple of very interesting points:

  1. You don't get good interoperability simply by choosing an open document format
  2. Interoperability still has a long way to go and there is still a lot of work to be done. 

Comments

8/6/2008 7:31:34 PM #

Ian Easson

Hi Jesper,

The results are quite interesting, but they shouldn't be surprising to someone like you who has a deep understanding of what interoperability really means (and doesn't mean).  But, since few people actually understand the subject, it will be a big surprise to most people.

I have one nit-pick with the study: they missed the whole idea of the impact on interoperability of conformance subsets for OOXML and the very loose conformance requirements for ODF.

I was also wondering how you managed to quickly pick up on this study, until I noticed you were acknowledged at the end!

Ian Easson, Canada

8/7/2008 12:16:31 PM #

jlundstocholm

Hi Ian,

Thanks for your comment.

Indeed, the report (and its conclusions) did not come as a surprise to me, but that was not due to having provided feedback to Rajiv and Jay. I am sure that anyone who has worked intensively with interoperability with respect to document formats quickly becomes aware that interoperability is not even close to 100% when using the same format across different applications ... regardless of whether we are using OOXML, ODF or any third document format.

There are other interesting aspects of the paper - I just picked out a couple that were the easiest to write about without going into technical detail. The conclusions of the tables I included are pretty much on par with my own experience: using any implementation aside from "the big ones" like Microsoft Office and OOo will limit interoperability because the quality of the applications is really not that good. I am also "glad" (or sad, really) to see the low scores for Google Docs. I have been trying to get the message out for quite some time that ODF support in Google Docs is really not that good, so I hope that this paper will help in getting people to stop referring to Google Docs as "one of the many good independent implementations of ODF".

Also keep in mind that Rajiv and Jay only tested basic features and only in text documents. I hope someone will use the work they did to test "the other half" of the problem, i.e. actually saving documents before round-tripping them.

jlundstocholm, Denmark

8/8/2008 4:43:15 PM #

Ian Easson

Hi Jesper, I have a follow-up question.

It seems to me it is going to take a lot of effort -- years perhaps -- to create a proper and comprehensive suite of platform-independent tools to test conformance of applications with respect to inputting and outputting OOXML or ODF files.  Do you know if:
- The ad-hoc committees of SC34 (AHG1, etc.) are looking at this issue as part of the maintenance task?
- Any of the big vendors with stakes in document formats (e.g., Microsoft, IBM, Sun) are willing to fund the development of such a suite of conformance testing tools?

Ian Easson, Canada

8/9/2008 1:02:03 PM #

jlundstocholm

Hi Ian,

I cannot speak for AHG2 since I do not participate in it. I haven't heard of any meetings being held for it, and I have not seen any meeting reports made public the way we did after the AHG1 meeting in London. Maybe Murata can help out with some details here?

As for who should be the task leader, I don't really know. AHG1 is formally discharged (as I understand it), since we completed our task when we made the suggestions to create WG4 in SC34. As for the companies taking charge, I would be very surprised to see IBM be the one, and I suppose they are already engaged in the ODF-OIIC work in OASIS. Microsoft would be a clear candidate (they certainly have the resources for it). Maybe they could use the same model they did with the ODF Converter and the Binary/OOXML mapping, both on SourceForge.net.

jlundstocholm, Denmark

8/10/2008 3:38:00 AM #

Murata

Conformance testing of ODF or OOXML is outside the scope of OOXML AHGs 1 and 2 of SC34. It is in the scope of SC34, though.

Can SC34 do something about conformance testing of OOXML?  I believe that such activities can be started in SC34 if the appeal process is terminated and OOXML is published.

As for ODF, the ODF Implementation, Interoperability and Conformance (IIC) TC of OASIS (lists.oasis-open.org/.../msg00010.html) looks interesting.


Murata, Japan

8/16/2008 6:56:24 PM #

Andre

I wonder if a law school is the right authority to provide this kind of study. You see a certain lack of qualification, e.g. underdocumentation of the methods and tests used. Also on terminological grounds I have strong doubts about the professionalism of the research. They seem to mix up interoperability and compatibility. They don't disclose which precise versions they used for their tests. And furthermore they imply OOXML was an open standard despite the lack of clarity in terms of patent licensing conditions.

The German Foreign Office did present interesting data on compatibility of different ODF solutions.

"However, our research shows that ODF as written by OpenOffice.org will not be read 100% correctly in other implementations, such as Microsoft Office or Wordperfect."

Let's guess why. Anyway, an entertaining subject.

Andre, Belgium

8/18/2008 9:37:28 AM #

jlundstocholm

Andre,

I think you are missing the point. The paper does not discuss whether one standard is more open than another standard nor does it discuss what happened during the ISO process. The paper looks at interoperability and does that by looking at the document formats. The reason the paper is interesting is that it takes a pragmatic approach and asks: "What level of interoperability can you expect if you stick to one document format?".

jlundstocholm, Denmark

License

The content of this blog is licensed under the Creative Commons "Attribution license"

This basically means that you can do just about anything with the content I provide, but not in any way that suggests that I endorse you or your use of the content on my blog.