Are document formats silver-bullets?

by jlundstocholm 6. August 2008 18:04

A new study from the University of Illinois College of Law has made its way to cyberspace. The title is "Lost in Translation: Interoperability Issues for Open Standards - ODF and OOXML as Examples", and it was written by Rajiv Shah and Jay P. Kesan. The study takes a rather novel approach compared to the debates that have been raging over the last year or so: Is the choice of a(ny) document format a silver bullet for interoperability?

The answer in the paper is a clear "No". When discussing the various international interop studies, they note:

While it is widely acknowledged that there are problems with interoperability across different formats, e.g., going from ODF to OOXML, there is an assumption here that all implementations produce the same ODF or OOXML.

Their conclusion is that this is not the case. What they did was to create a number of test documents using the reference implementation for each format, OpenOffice.org for ODF and Microsoft Office 2007 for OOXML. They then opened these documents in other applications supporting these formats.

The results are rather interesting:

Results for ODF

Implementation                 Raw score  Raw score percentage  Weighted percent
OpenOffice                     151        100%                  100%
StarOffice                     149        99%                   97%
Sun plug-in for Word           142        94%                   96%
CleverAge/MS plug-in for Word  139        92%                   94%
WordPerfect                    122        81%                   86%
KOffice                        121        80%                   79%
Google Docs                    117        77%                   76%
TextEdit                       55         36%                   47%
AbiWord                        48         32%                   55%

Results for OOXML

Implementation     Raw score  Raw score percentage  Weighted percent
Office 2007        148        100%                  100%
Office 2003        148        100%                  100%
Office 2008 (Mac)  147        99%                   99%
OpenOffice         141        95%                   96%
Pages              142        96%                   95%
WordPerfect        114        77%                   84%
ThinkFree Office   101        68%                   83%
TextEdit           52         35%                   43%
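
As a rough sanity check on the tables above: the "Raw score percentage" figures are consistent with dividing each raw score by the score of the reference implementation (OpenOffice for ODF, Office 2007 for OOXML) and rounding to a whole percent. That formula is an assumption inferred from the numbers, not something quoted from the paper, and the function and variable names below are only illustrative. The weighted percentages depend on weights that are not given in this post, so they are not reproduced here.

```python
# Raw scores copied from the two result tables in this post.
ODF_SCORES = {
    "OpenOffice": 151, "StarOffice": 149, "Sun plug-in for Word": 142,
    "CleverAge/MS plug-in for Word": 139, "WordPerfect": 122,
    "KOffice": 121, "Google Docs": 117, "TextEdit": 55, "AbiWord": 48,
}
OOXML_SCORES = {
    "Office 2007": 148, "Office 2003": 148, "Office 2008 (Mac)": 147,
    "OpenOffice": 141, "Pages": 142, "WordPerfect": 114,
    "ThinkFree Office": 101, "TextEdit": 52,
}


def raw_percentage(score: int, reference_score: int) -> int:
    """Assumed formula: score relative to the reference implementation,
    rounded to the nearest whole percent."""
    return round(100 * score / reference_score)


def percentages(scores: dict, reference: str) -> dict:
    """Compute the raw score percentage for every implementation."""
    reference_score = scores[reference]
    return {name: raw_percentage(score, reference_score)
            for name, score in scores.items()}


if __name__ == "__main__":
    for name, pct in percentages(ODF_SCORES, "OpenOffice").items():
        print(f"ODF   {name}: {pct}%")
    for name, pct in percentages(OOXML_SCORES, "Office 2007").items():
        print(f"OOXML {name}: {pct}%")
```

Under that assumption, every raw percentage in both tables comes out the same as listed, which suggests the flattened tables were transcribed consistently.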

They further conclude that

The final implication stems from the surprisingly good results for OOXML implementations. Critics of OOXML have argued that it was too complex and difficult to implement. While OOXML is a long and complex standard, it is possible to offer good compatibility. In fact, our results suggest that implementations of OOXML work as well as implementations of ODF. At the level of basic word-processing that we examined, neither standard had a dominant advantage over the other in terms of compatibility scores. While ODF has had a head start that has lead to more implementations, there appears no reason why OOXML cannot catch up. After all, several developers have provided independent implementations of OOXML.

... which should be interesting for those mandating usage of (an open) document format.

If nothing else this study highlights a couple of very interesting points:

  1. You don't get good interoperability simply by choosing an open document format
  2. Interoperability still has a long way to go and there is still a lot of work to be done. 
:)

Comments

8/7/2008 4:31:34 AM #

Ian Easson

Hi Jesper,

The results are quite interesting, but they shouldn't be surprising to someone like you who has a deep understanding of what interoperability really means (and doesn't mean).  But, since few people actually understand the subject, it will be a big surprise to most people.

I have one nitpick with the study: they missed the whole idea of the impact on interoperability of conformance subsets for OOXML and the very loose conformance requirements for ODF.

I was also wondering how you managed to quickly pick up on this study, until I noticed you were acknowledged at the end!

Ian Easson Canada |

8/7/2008 9:16:31 PM #

jlundstocholm

Hi Ian,

Thanks for your comment.

Indeed, the report (and its conclusions) did not come as a surprise to me, but that was not due to having provided feedback to Rajiv and Jay. I am sure that anyone who has worked intensively with interoperability of document formats quickly becomes aware that interoperability is not even close to 100% when using the same format across different applications ... regardless of whether we are using OOXML, ODF or any third document format.

There are other interesting aspects of the paper - I just picked out a couple that were the easiest to write about without going into technical detail. The conclusions of the tables I included are pretty much on par with my own experience: using any implementation other than "the big ones" like Microsoft Office and OOo will limit interoperability because the quality of the applications is really not that good. I am also "glad" (or sad, really) to see the low scores for Google Docs. I have been trying to get the message out for quite some time that ODF support in Google Docs is really not that good, so I hope that this paper will help in getting people to stop referring to Google Docs as "one of the many good independent implementations of ODF".

Also keep in mind that Rajiv and Jay only tested basic features and only in text documents. I hope someone will use the work they did to test "the other half" of the problem, i.e. actually saving documents before round-tripping them.

jlundstocholm Denmark |

8/9/2008 1:43:15 AM #

Ian Easson

Hi Jesper, I have a follow-up question.

It seems to me it is going to take a lot of effort -- years perhaps -- to create a proper and comprehensive suite of platform-independent tools to test conformance of applications with respect to inputting and outputting OOXML or ODF files.  Do you know if:
- The ad-hoc committees of SC34 (AHG1, etc) are looking at this issue as part of the maintenance task?
- Any of the big vendors with stakes in document formats (e.g., Microsoft, IBM, Sun) are willing to fund the development of such a suite of conformance testing tools?

Ian Easson Canada |

8/9/2008 10:02:03 PM #

jlundstocholm

Hi Ian,

I cannot speak for AHG2 since I do not participate in it. I haven't heard of any meetings being held for it, and I have not seen any meeting reports being made public as we saw after the AHG1 meeting in London. Maybe Murata can help out with some details here?

As for who should be the task leader, I don't really know. AHG1 is formally discharged (as I understand it) since we delivered our task when we made the suggestion to create WG4 in SC34. As for the companies taking charge, I would be very surprised to see IBM be the one, and I suppose they are already busy with the ODF-OIIC work in OASIS. Microsoft would be a clear candidate (they certainly have the resources for it). Maybe they could use the same model they did with the ODF Converter and the Binary/OOXML mapping - both on SourceForge.net.

jlundstocholm Denmark |

8/10/2008 2:30:03 AM #

trackback

Trackback from Doug Mahugh

Links for 08-09-2008

Doug Mahugh |

8/10/2008 12:38:00 PM #

Murata

Conformance testing of ODF or OOXML is outside the scope of the OOXML AHG1 and AHG2 of SC34. It is in the scope of SC34, though.

Can SC34 do something about conformance testing of OOXML?  I believe that such activities can be started in SC34 if the appeal process is terminated and OOXML is published.

As for ODF, the ODF Implementation, Interoperability and Conformance (IIC) TC of OASIS (lists.oasis-open.org/.../msg00010.html) looks interesting.


Murata Japan |

8/17/2008 3:56:24 AM #

Andre

I wonder if a law school is the right authority to provide this kind of study. You see a certain lack of qualification, e.g. underdocumentation of the methods and tests used. Also on terminological grounds I have strong doubts about the professionalism of the research. They seem to mix up interoperability and compatibility. They don't disclose which precise versions they used for their tests. Furthermore, they imply that OOXML is an open standard despite the lack of clarity about patent licensing conditions.

The German Foreign Office presented interesting data on the compatibility of different ODF solutions.

"However, our research shows that ODF as written by OpenOffice.org will not be read 100% correctly in other implementations, such as Microsoft Office or Wordperfect."

Let's guess why. Anyway, an entertaining subject.

Andre Belgium |

8/18/2008 6:37:28 PM #

jlundstocholm

Andre,

I think you are missing the point. The paper does not discuss whether one standard is more open than another standard nor does it discuss what happened during the ISO process. The paper looks at interoperability and does that by looking at the document formats. The reason the paper is interesting is that it takes a pragmatic approach and asks: "What level of interoperability can you expect if you stick to one document format?".

:)

jlundstocholm Denmark |
