a 'mooh' point

clearly an IBM drone

The complexity of SpreadsheetML - oh the sheer joy of it!

Having a bit of time on my hands while attending the SC34/WG4-meeting in Okinawa, I thought I'd write up a blog post I have wanted to write in quite some time.

The reason for me doing this was a requirement I am often presented with by CIBER's customers - export my data to Excel. The data they want us to export are traditionally grouped into three categories:

  • Text (strings)
  • Numbers
  • Dates

Creating cells with numbers and text is really a no-brainer in OOXML. It is a bit more complicated when it comes to dates, because dates in e.g. ISO 8601-format are not as such supported as "built-in cell data types" in SpreadsheetML. Instead, dates are presented by styling content in number-cells. This means that to be able to display a date in SpreadsheetML, you need to know "a bit" about styling in spreadsheets.

Now, as some of you remember, representation of dates in spreadsheets using OOXML is done in "serial form", meaning that dates are stored as numbers. These numbers are also known as "Julian days" - not to be mistaken with the "Julian Calendar". In other words, a date is represented as the number of days since some starting point in time.

So if I wanted to store the date "December 20th 2009" in OOXML, I would have to convert it to a "julian representation" - in this case "40167". This is really just a minor annoyance - the conversion is trivial and a no-brainer. However - the fun has not started yet.
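
For what it is worth, the conversion can be sketched in a few lines of Python. This is only a sketch: the names `to_serial` and `from_serial` are made up for illustration, and it assumes the default 1900 date system with whole days only (no time-of-day fraction).

```python
from datetime import date, timedelta

# Assumption: the default 1900 date system. Using 1899-12-30 as day zero
# matches the serials Excel writes for any date from 1900-03-01 onwards,
# because Excel counts a non-existent 1900-02-29.
EPOCH = date(1899, 12, 30)


def to_serial(d: date) -> int:
    """Return the whole-day serial number stored in the <v> element for d."""
    return (d - EPOCH).days


def from_serial(serial: int) -> date:
    """Inverse of to_serial, handy for checking a stored value by hand."""
    return EPOCH + timedelta(days=serial)


print(to_serial(date(2009, 12, 20)))  # 40167
```

Going the other way, `from_serial(40167)` gives back December 20th 2009, which is a quick sanity check on whatever ends up in the `<v>` element.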

If you look at the markup required, it would have to be like this:

[code:xml]<sheetData>
  <row r="1">
    <c r="A1">
      <v>40167</v>
    </c>
  </row>
</sheetData>[/code]

So this will give me a cell with a serial representation of 2009-12-20. However, if I open this in an OOXML-compliant application, it will display "40167". As I mentioned above, it turns out that displaying the serial representation as a "proper date" requires styling of the cell content.

The key is an attribute on the <c>-element I omitted in the example above.

[code:xml]<sheetData>
  <row r="1">
    <c r="A1" s="0">
      <v>40167</v>
    </c>
  </row>
</sheetData>[/code]

The "s"-attribute specifies the style for the given cell. The specification says this for this particular attribute:

The index of this cell's style. Style records are stored in the Styles Part.

Ok - cool so the good thing here is, that we now know what the attribute is used for. The bad thing is that we don't know anything about "how".

Styles for SpreadsheetML are described in section 3.8. The complete section is about 110 pages and it describes at length each element name and attribute but again it more answers "what" than "how".

(I just talked to another delegate about if a standard should describe both the hows and the whats, and it seems that the jury is still out on that one, so these are simply my personal observations of using the specification to solve a concrete problem).

So in figuring out how to do this, a good starting point would be to look at the list of valid child elements. These are defined as

[code:xml]<complexType name="CT_Stylesheet">
  <sequence>
    <element name="numFmts" type="CT_NumFmts" minOccurs="0" maxOccurs="1"/>
    <element name="fonts" type="CT_Fonts" minOccurs="0" maxOccurs="1"/>
    <element name="fills" type="CT_Fills" minOccurs="0" maxOccurs="1"/>
    <element name="borders" type="CT_Borders" minOccurs="0" maxOccurs="1"/>
    <element name="cellStyleXfs" type="CT_CellStyleXfs" minOccurs="0" maxOccurs="1"/>
    <element name="cellXfs" type="CT_CellXfs" minOccurs="0" maxOccurs="1"/>
    <element name="cellStyles" type="CT_CellStyles" minOccurs="0" maxOccurs="1"/>
    <element name="dxfs" type="CT_Dxfs" minOccurs="0" maxOccurs="1"/>
    <element name="tableStyles" type="CT_TableStyles" minOccurs="0" maxOccurs="1"/>
    <element name="colors" type="CT_Colors" minOccurs="0" maxOccurs="1"/>
    <element name="extLst" type="CT_ExtensionList" minOccurs="0" maxOccurs="1"/>
  </sequence>
</complexType>[/code]

The elements that should (ahem) draw attention to themselves are "cellStyles", "cellStyleXfs" and "cellXfs". So, if you want to apply formatting directly to a cell, look at e.g. the element <cellXfs> defined in section 3.8.10. It says (in abstract)

This element contains the master formatting records (xf) which define the formatting applied to cells in this workbook. These records are the starting point for determining the formatting for a cell. Cells in the Sheet Part reference the xf records by zero-based index.

The <cellXfs>-element has a child element called <xf>. The element is defined as

[code:xml]<complexType name="CT_Xf">
  <sequence>
    <element name="alignment" type="CT_CellAlignment" minOccurs="0" maxOccurs="1"/>
    <element name="protection" type="CT_CellProtection" minOccurs="0" maxOccurs="1"/>
    <element name="extLst" type="CT_ExtensionList" minOccurs="0" maxOccurs="1"/>
  </sequence>
  <attribute name="numFmtId" type="ST_NumFmtId" use="optional"/>
  <attribute name="fontId" type="ST_FontId" use="optional"/>
  <attribute name="fillId" type="ST_FillId" use="optional"/>
  <attribute name="borderId" type="ST_BorderId" use="optional"/>
  <attribute name="xfId" type="ST_CellStyleXfId" use="optional"/>
  <attribute name="quotePrefix" type="xsd:boolean" use="optional" default="false"/>
  <attribute name="pivotButton" type="xsd:boolean" use="optional" default="false"/>
  <attribute name="applyNumberFormat" type="xsd:boolean" use="optional"/>
  <attribute name="applyFont" type="xsd:boolean" use="optional"/>
  <attribute name="applyFill" type="xsd:boolean" use="optional"/>
  <attribute name="applyBorder" type="xsd:boolean" use="optional"/>
  <attribute name="applyAlignment" type="xsd:boolean" use="optional"/>
  <attribute name="applyProtection" type="xsd:boolean" use="optional"/>
</complexType>[/code]

The attribute you want here is "numFmtId". The attribute is described as "Id of the number format (numFmt) record used for this cell format".

(are we getting there soon?)

Anywho, going to the reference of numFmt will lead you to paragraph 3.8.30 numFmt (Number Format) and it will tell you, that some of the values of the attribute are implied. That's really just another way of saying "reserved values". 

ID   formatCode
0    General
1    0
2    0.00
3    #,##0
4    #,##0.00
9    0%
10   0.00%
11   0.00E+00
12   # ?/?
13   # ??/??
14   mm-dd-yy
15   d-mmm-yy
16   d-mmm
17   mmm-yy
18   h:mm AM/PM
19   h:mm:ss AM/PM
20   h:mm
21   h:mm:ss
22   m/d/yy h:mm
37   #,##0 ;(#,##0)
38   #,##0 ;[Red](#,##0)
39   #,##0.00 ;(#,##0.00)
40   #,##0.00 ;[Red](#,##0.00)
45   mm:ss
46   [h]:mm:ss
47   mmss.0
48   ##0.0E+0
49   @


It looks like id 15 could be the one we are looking for. So I'm gonna add this number format to the xf-element's numFmtId-attribute and create this xml-fragment:

[code:xml]<cellXfs count="2">
  <xf numFmtId="15" (...)  />
</cellXfs>[/code]

Behold - it actually works. When I load this in Microsoft Office 2007, it will display this:



So what have I learned here (apart from the astounding complexity of this relatively trivial task)? Well, to display a date using SpreadsheetML, you need to know a bit about SpreadsheetML styles. You will also need to do a fair amount of digging in the specification as well as in existing OOXML-files, since I could not find this information anywhere. Luckily for you, the content of this blog is licensed under Creative Commons attribution license, so feel free to use it however you should wish to do so.

To sum it all up, you will need the following items to display a date in a cell in SpreadsheetML:

1. The cell fragment

[code:xml]<sheetData>
  <row r="1">
    <c r="A1" s="0">
      <v>40167</v>
    </c>
  </row>
</sheetData>[/code]

Notice that the cell is styled using the attribute "s" with a value of "0".

2. The style part

[code:xml]<styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <cellXfs count="1">
    <xf numFmtId="15" (...) />
  </cellXfs>
</styleSheet>[/code]

Notice that index "0" of the <cellXfs>-collection has a numFmtId-attribute with the value "15" resulting in displaying the date correctly.
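
The chain from cell to format code (s, then the xf at that index, then numFmtId, then formatCode) can be followed in code as well. What follows is a rough Python sketch using the standard library's ElementTree. The function name `format_code_for` and the shortened `BUILTIN_FORMATS` table are made up for illustration, and the style part is a stripped-down version of the fragment above without the elided attributes. Looking at formats declared in the package before falling back to the implied ones is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# A handful of the implied format codes from the table in 3.8.30.
BUILTIN_FORMATS = {
    0: "General",
    14: "mm-dd-yy",
    15: "d-mmm-yy",
    16: "d-mmm",
    17: "mmm-yy",
    22: "m/d/yy h:mm",
}


def format_code_for(style_index: int, styles_xml: str) -> str:
    """Follow s -> cellXfs/xf[s] -> numFmtId -> formatCode."""
    root = ET.fromstring(styles_xml)
    xfs = root.findall("m:cellXfs/m:xf", NS)
    num_fmt_id = int(xfs[style_index].get("numFmtId", "0"))
    # Assumption of this sketch: check formats declared in the package first,
    # then fall back to the implied ones.
    for num_fmt in root.findall("m:numFmts/m:numFmt", NS):
        if int(num_fmt.get("numFmtId")) == num_fmt_id:
            return num_fmt.get("formatCode")
    return BUILTIN_FORMATS[num_fmt_id]


STYLES = """<styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <cellXfs count="1">
    <xf numFmtId="15"/>
  </cellXfs>
</styleSheet>"""

print(format_code_for(0, STYLES))  # d-mmm-yy
```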

I have created a small test file based on the walk-through above and it is available here: test_dates.xlsx (2.25 kb).

And in other news:

So, you might ask, how is this done using other document formats? Well, it turns out to be drastically less complex.

ODF

[code:xml]<table:table-row>
  <table:table-cell office:value-type="date" office:date-value="2009-12-20">
    <text:p>20-12-09</text:p>
  </table:table-cell>
</table:table-row>[/code]

OOXML IS29500

[code:xml]<sheetData>
  <row r="1">
    <c r="C4" t="d">
      <v>1976-11-22T08:30Z</v>
    </c>
  </row>
</sheetData>[/code] 

Both examples above should require no additional formatting.

You might also ask, if this could have been done in any other way in OOXML? Well, as far as I read the specification, there is no way around the style-part-trouble. But you could create your own number formatting if you should wish so. I would actually prefer this angle, since it would be a step away from pre-determined (implied) values in styles and keep the package content self-contained.
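
A custom number format along those lines could be generated like this. Again, this is only a sketch in Python: `build_styles` is a made-up name, the format code "yyyy-mm-dd" is just an example, and using 164 as the id of the first custom format is a convention seen in files written by Excel rather than something established in this post. Like the fragments above, it leaves out everything else a real style part carries.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Assumption: 164 as the first id for a custom format is a convention,
# not a value taken from the walk-through above.
CUSTOM_ID = 164


def build_styles(format_code: str) -> str:
    """Build a bare-bones style part whose xf at index 0 uses a custom format."""
    style_sheet = ET.Element("styleSheet", {"xmlns": NS})
    # numFmts comes before cellXfs in the CT_Stylesheet sequence.
    num_fmts = ET.SubElement(style_sheet, "numFmts", {"count": "1"})
    ET.SubElement(
        num_fmts,
        "numFmt",
        {"numFmtId": str(CUSTOM_ID), "formatCode": format_code},
    )
    cell_xfs = ET.SubElement(style_sheet, "cellXfs", {"count": "1"})
    ET.SubElement(
        cell_xfs,
        "xf",
        {"numFmtId": str(CUSTOM_ID), "applyNumberFormat": "1"},
    )
    return ET.tostring(style_sheet, encoding="unicode")


print(build_styles("yyyy-mm-dd"))
```

A cell with s="0" would then pick up the format code stored in the package itself instead of one of the implied values.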

You know, this could actually be the basis for a nice new defect report for WG4: "Remove all implied values in the specification and move them to the transitional Part 4".

Is there an end of it?

I know this was quite a lengthy post - but is it of any value at all - and would you like more of these investigative posts in the future?

Smile

JTC1/SC34 WG4 appointed Danish expert

On Friday, October 24th the Danish mirror-committee to JTC1/SC34 had its bi-monthly meeting. On the agenda was, amongst other things, assignment of participants to the newly created working groups in JTC1/SC34, WG4 and WG5.

For those of you not familiar with the establishment of these two groups, WG4 will deal with maintenance and development of OOXML. WG5 will work to "Develop principles of, and guidelines for, interoperability among documents represented using heterogeneous ISO/IEC document file formats." So the latter WG is not really about translating between document formats such as ODF and OOXML. No, it is about creating some guidelines that all (future or present) document formats could use as inspiration when designing the formats to be "interoperable".

I think the prospects of this could be really, really good and I hope as many stakeholders as possible choose to join the work. It would be great to have some kind of guidelines for interoperability comparable to the Accessibility-guidelines from W3C (those that were added to OOXML during the BRM in Geneva).

We did not get any confirmed pledges to participate from the members of the Danish committee, but I was very pleased to hear that both ORACLE Denmark as well as the Technical University of Denmark would investigate if they could join the working group.

More interesting to me was assignment of participants for Working Group 4 to develop and maintain OOXML. Not surprisingly (since most of the participants of the committee are much more "anti-OOXML" than "pro-ODF"), this point of the agenda received far less attention. We have in CIBER Denmark discussed for quite some time if we should join the working group, and we have reached the conclusion that we would. We do this for the following reasons:

  1. We believe that we would be able to deliver some technical skills that would be valuable to the work around OOXML
  2. We believe that it is important that development and maintenance of OOXML is not done exclusively by ECMA under the "ISO brand" and
  3. We believe that it is important to create a Danish "foot-print" on the development of the document format

So when the committee was asked if anyone would join, CIBER stepped up to the plate. I am happy to say that both the potential commitment of ORACLE Denmark and Technical University of Denmark and the confirmed commitment from CIBER received unanimous support from the other committee members.

So now what?

Well, the first draft of the agenda for the meeting in Okinawa has been posted on the SC34-website. At present the agenda is this:

Draft agenda

  1. Opening - 2009-01-28 10:00
  2. Roll call of Delegates
  3. Adoption of the Agenda
  4. Defect Reports
  5. Any other business
  6. Closing

I think we will also talk about what to actually do in the foreseeable future both with respect to handling of defect reports and future maintenance. One of the things I will not accept (and I hope nor will the other appointed experts) is that the working group will primarily focus its time on defect handling - all while ECMA works on new stuff for OOXML and eventually dumps this on our table. So we will need to establish some sort of agreement around this.

Also we will need to talk about future places to meet. The next meeting will likely be held in Prague, and I would like to somehow make sure that future meetings are held in cities near major airport hubs around the world. It will take me about 24 hours to travel from Copenhagen to Okinawa, and that travel period would be cut in two, if the meeting was held in e.g. Tokyo or Kyoto. This is not a criticism of the Japanese decision to have the meeting in Okinawa, but I believe we would indirectly encourage more participation if the required travelling was not so extensive.

Oh ... and did anyone notice that I was only mentioned in the "Small news"-section of Alex Brown's recent post "More Standards news"? This really helps keeping both feet solidly on the ground and not thinking too much of myself.

Wink

Microsoft, it's time to deliver

Just before OOXML was approved in JTC1/SC34, a lot of us spent a lot of time discussing the differences between Sun's CNS, IBM's ISP and Microsoft's OSP. Specifically, a thread on Oliver Bell's blog dealt with this topic. The post was called "The OSP will apply to future versions of DIS29500". Oliver said

For developers wanting to use the ISO/IEC DIS29500 specification this has raised some questions around exactly what level of support Microsoft will pledge to future versions of the OpenXML specification as it continues to evolve through the ISO process.

This is an important issue, and to date I don’t think we have been clear enough around our intent in this area. This has come up in internal discussions several times recently and today a decision was taken to make a public statement to continue to make the intellectual property that developers or users may need available to future versions.

The statement will appear on http://microsoft.com shortly

This was in late March 2008. I just checked the OSP-page and this change has still not been applied to the OSP. The text still says:

Q: Does this OSP apply to all versions of the standard, including future revisions?

A: The Open Specification Promise applies to all existing versions of the specification(s) designated on the public list posted at http://www.microsoft.com/interop/osp/, unless otherwise noted with respect to a particular specification (see, for example, specific notes related to web services specifications).

Then in late May Microsoft announced their support of ODF in Microsoft Office 12 and that they would join ODF TC. I myself wrote a bit on it on my blog, and I made the following list of things Microsoft wanted to do:

  1. Microsoft will join OASIS ODF TC
  2. Microsoft will include ODF in their list of specifications covered by the Open Specification Promise (OSP)
  3. Microsoft will include full, native support for ODF 1.1 in Microsoft Office 14 and in Microsoft Office 12 SP2 - scheduled for Q2 2009. Microsoft Office 12 SP2 will have built-in support for the three most widely used ISO-standards for document formats, e.g. OOXML, ODF and PDF.

Well, I clearly misunderstood something with regards to OSP covering ODF, because that has not happened (yet). I was under the impression that it was a requirement when joining OASIS, but maybe Rob is right in saying that the OASIS IPR-policy participants in OASIS-work are required to sign actually trumps the ISP for IBM and perhaps also the OSP from Microsoft. On a funny note, I was actually quoted in their press release praising their modifications to specifications covered by their OSP ... but maybe they changed their mind.

Still, I think it would be a good move by Microsoft to include ODF in their OSP. As I wrote at that time

One of the aspects of the discussion that never really surfaced was that if IBM has software patents covering ODF - some of them quite possibly cover parts of OOXML as well. But the ISP of IBM does not mention OOXML - it only mentions ODF. This leaves me as a developer in quite a legal pickle, because by implementing OOXML I am covered by the OSP - but I am not covered by IBM's ISP (and vice versa). To me as a developer, Microsoft's coverage of ODF in their OSP is a good move, because it should remove all legal worries I might have around stepping into SW-patent covered territory.

This is still true, dear Microsoft.

Is it all bad, then? Well no - Microsoft recently won praise from none other than Groklaw for expanding the FAQ on their OSP - now specifically making it clear that the OSP covers GPL-licensed implementations. Groklaw seemed so confused by the "good news" they had to ask: "Are pigs flying, or what?"

Smile

So Microsoft - what are you going to do?

ISO says: continue with ISO/IEC 29500

This just in ...

The two ISO and IEC technical boards have given the go-ahead to publish ISO/IEC DIS 29500, Information technology – Office Open XML formats, as an ISO/IEC International Standard after appeals by four national standards bodies against the approval of the document failed to garner sufficient support.

Source: http://www.iso.org/iso/pressrelease.htm?refid=Ref1151

As I think you can imagine, I think this is really good news ... more information to come.

Smile

Are document formats silver-bullets?

A new study from the University of Illinois College of Law has made its way to cyberspace. The title is "Lost in Translation: Interoperability Issues for Open Standards - ODF and OOXML as Examples" and is done by Rajiv Shah and Jay P. Kesan. The study takes a rather novel approach compared to the debates that have been raging through the last year or so: Is the choice of a(ny) document format a silver bullet for interoperability?

The answer in the paper is a clear "No". When discussing the various interop-studies internationally, they note

While it is widely acknowledged that there are problems with interoperability across different formats, e.g., going from ODF to OOXML, there is an assumption here that all implementations produce the same ODF or OOXML.

Their conclusion is that this is not the case. What they did was to create a number of test documents using the reference implementation for each format, OpenOffice.org for ODF and Microsoft Office 2007 for OOXML. They then opened these documents in other applications supporting these formats.

The results are rather interesting:

Results for ODF

Implementation                  Raw score   Raw score percentage   Weighted percent
OpenOffice                      151         100%                   100%
StarOffice                      149         99%                    97%
Sun plug-in for Word            142         94%                    96%
CleverAge/MS plug-in for Word   139         92%                    94%
WordPerfect                     122         81%                    86%
KOffice                         121         80%                    79%
Google Docs                     117         77%                    76%
TextEdit                        55          36%                    47%
AbiWord                         48          32%                    55%

Results for OOXML

Implementation      Raw score   Raw score percentage   Weighted percent
Office 2007         148         100%                   100%
Office 2003         148         100%                   100%
Office 2008 (Mac)   147         99%                    99%
OpenOffice          141         95%                    96%
Pages               142         96%                    95%
WordPerfect         114         77%                    84%
ThinkFree Office    101         68%                    83%
TextEdit            52          35%                    43%

They further conclude that

The final implication stems from the surprisingly good results for OOXML implementations. Critics of OOXML have argued that it was too complex and difficult to implement. While OOXML is a long and complex standard, it is possible to offer good compatibility. In fact, our results suggest that implementations of OOXML work as well as implementations of ODF. At the level of basic word-processing that we examined, neither standard had a dominant advantage over the other in terms of compatibility scores. While ODF has had a head start that has lead to more implementations, there appears no reason why OOXML cannot catch up. After all, several developers have provided independent implementations of OOXML.

... which should be interesting for those mandating usage of (an open) document format.

If nothing else this study highlights a couple of very interesting points:

  1. You don't get good interoperability simply by choosing an open document format
  2. Interoperability still has a long way to go and there is still a lot of work to be done. 
Smile

Generated by Microsoft Office 2007

As we are testing the various ODF test files I have brought to the Microsoft DII workshop here in Redmond, I stumbled over the following XML-fragment:

[code:xml]<office:document-meta
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  office:version="1.1">
  <office:meta>
    <dc:generator>MicrosoftOffice/12.0 MicrosoftWord</dc:generator>
    ...
  </office:meta>
</office:document-meta>[/code]

*giggles*

A year ago - who would have thought this?

Smile

EEE - the SC34-way

In my recent post about the outcome of the AHG1-meeting in London, IBM's Rob Weir pointed out, that

What everyone is missing is the fact that Microsoft is not obligated to participate in SC34/WG4 maintenance, or to do maintenance exclusively in SC34/WG4. Ecma is fully capable of submitting any future version of OOXML under Fast Track rules directly to JTC1 (not SC34) for another 6-month ballot.

Well, as I have said repeatedly before, when Rob is right, he is right, and this is no exception. As a JTC1 Class A Liaison (there are actually only three of those, the others being ITU and the European Union, if I remember correctly), ECMA can do pretty much what they want. So if we wanted to ensure maintenance of OOXML would take place completely in SC34, we couldn't rely on JTC1 directives for help. We had to do something else.

One way would be to strong-arm ECMA into signing a binding, legal letter in which they committed to exclusive maintenance of OOXML in SC34. If you ask me, I don't think that is a good idea. I think most of us will agree that this process has seen too many lawyers already.

Another way would be to indirectly make sure that ECMA does all their activities within the SC34-sphere. Like all organisations, people with the right skills are a constrained resource and it is in no way different for ECMA. As Rob pointed out, there is only limited time available for everyone, and we all need to prioritize our resources. So even though we did not discuss this particular angle with respect to ensuring maintenance in SC34, this was in effect what was the result.

SC34 needs to appoint resources to two areas when setting up WG4:

  1. The editor of the project
  2. Who should run the secretariat for the working group


ECMA volunteered to manage both areas, and we discussed quite a bit about that being a good idea or not. Come to think of it, I think it is.

ECMA (here Rex Jaeschke) has been the editor throughout the process - first in ECMA itself and next in ISO. It is clear that he has the right skill-set to follow the process and he has a clear interest in a fast-paced process. ECMA also volunteered to run the secretariat for the WG. As I wrote in my previous post, the work-load of the WG will be quite big, and a secretariat is really needed to keep track with everyone, to coordinate meetings and to create meeting reports, agendas etc.

So what we essentially did was a "Triple-E" on ECMA. We have embraced ECMA and their OOXML-resources and we have made sure that, given the amount of resources they are to put into SC34/WG4, they are not likely to have their own work run in parallel in ECMA. Now, at some point WG4 will be presented with suggestions for additions to the specification. These could be from ECMA, but they could be from just about any member of SC34 - including countries opposing OOXML or competing companies represented in their favorite country. This is really the "ISO-model" at work.

So, you might say, this is just a load of good intentions and wishes for the best possible outcome ... and this is perfectly correct. Sadly though, these were our only options given the JTC1 directives. You might also say that it is highly likely that ECMA will be the only entity that will show up with additions to the specification. I have absolutely no idea of the probability of this happening, but I think it would be very sad. We need other stakeholders and participants in the work with OOXML than ECMA and the national bodies' standards people and I think it would be unfortunate if the only stakeholder in WG4 with real, hands-on experience with creation of Office applications was ECMA. As Rick pointed out, the major participants in ODF TC all develop applications based on the same code base and rumour even has it that development of additions to ODF is largely driven by Sun's development of OpenOffice.org in a "we need this element for our implementation - please put it in the specification"-kind-of-way. I have no idea if it is only a theoretical issue or if it is a concrete problem, but I would imagine that a document format to be used by a variety of implementations would benefit from different implementations being present at the development table. This is indeed also true for OOXML. We need the competitors of Microsoft at the table as well.

(ECMA TC-45 already consists of major office suite developers (Apple, Microsoft, Novell etc), so you already have the diversity of vendors present, but I have collectively referred to them as "ECMA". I would still, however, prefer to have participants present not associated with ECMA)

Side-note:

Luckily, you can console yourselves with the fact that even in the event that everything mentioned above falls apart and ECMA literally goes their own way, the net effect will "just" be that maintenance of OOXML will be as with ODF, where OASIS ODF TC works on maintenance completely separate from JTC1/SC34.

I missed you, Rob

So today was the last day of the two-day meeting of Ad Hoc Group 1 (AHG1), the committee created in Oslo. It has been a couple of interesting days and also rather productive. We managed to cover quite a lot of ground and I am quite pleased with the outcome of the meeting. Note that we have not made any decisions at the AHG1-meeting. All we did was to suggest a possible structure for future work on OOXML in a new Working Group under SC34, Working Group 4 (WG4). SC34 will have to decide themselves (or, ourselves) on what to do in the event that the appeals on OOXML-approval are overthrown.

There were a total of 18 people attending the meeting – nine of these were from either ECMA or Microsoft. The rest were representatives from British Standards Institute (BSI) and Dansk Standard (DS) and a couple of “neutral” people, including myself, Francis Cave (UK), a guy from NL-net Foundation in the Netherlands, Keld Simonsen (NO), a guy from IBM (HUN), the convener Dr. (*giggles*) Alex Brown (UK) and Murata Makoto (JPN). The meeting took place in the pleasant surroundings of British Library near St. Pancras Station in London.

The meeting report is available from the SC34 website.

I think the content of most of the discussions of the meeting can be summed up in these three words: “Openness”, “transparency” and “participation” – with the latter perhaps being the most important.

Participation

Murata Makoto is currently the convener of SC34 WG1 – the WG that currently holds responsibility of the 29500-project in SC34. He will therefore be acting convener of WG4 as well until SC34 formally appoints someone to the position. We have suggested to SC34 that Murata Makoto be appointed convener of WG4, but as with all positions in JTC1, it is at the discretion of the NBs and I encourage each NB to think hard on whether they have someone who could fill the position (I should note that I personally think that Murata is an excellent choice as convener). Since the work-load of WG4 quite possibly will be rather large, we have also suggested that WG4 should have a secretariat. It is at the discretion of the convener to appoint who should run the secretariat – it could be an NB, but in theory just about anyone. ECMA has offered to run the secretariat for WG4.

We have suggested to SC34 that an editor of OOXML should be appointed to oversee and coordinate the overall process of work on OOXML. ECMA has offered to continue to fill out this position, but it will ultimately be up to SC34 to decide. We talked quite a lot about how to structure editing of the 29500-project. In ISO-terms, maintenance is defined as “revision, withdrawal, periodic review, correction of defects, amendment, and stabilization”. We discussed having multiple editors (possibly one for each of the IS 29500-parts (four in total)), we discussed multiple editors sharing responsibility of editing the whole text as well, but we ended with a suggestion to have a single editor for the project and assign an “editorial team” to him/her. We agreed that this was the most flexible way to structure this task. The editorial team should consist of SMEs (Subject Matter Experts) of either the NBs or ECMA or any other expert invited/nominated by an NB. This will allow WG4 to adapt the resource-level in terms of body-count to the specific work-load being thrown upon it.

Now, if you ask me, it is crucial that the NBs step up to the task and participate in the work on maintenance and future development of IS 29500 – should the appeals have a favorable outcome. If we don’t step up, SC34 – and WG4 - will in effect simply be a new place for ECMA to meet and work on OOXML. But it is a daunting task – at least compared to the normal work load of some WGs under the JTC1 umbrella. We talked a lot about the possible magnitude of work, but since none of us were able to predict the future, we have suggested the following as initial meeting schedule and work-load:

  • Weekly teleconferences
  • Quarterly face-to-face meetings
  • Overall communication via, possibly, email

The first face-to-face meeting should take place in early 2009.

If the work-load is not as big as we fear (or hope), the activity-level will quite possibly be adjusted really quickly to a more infrequent level.

But this work-load is rather big - at least if you ask me. Even if you don’t count the quarterly meetings in, we are talking about weekly teleconferences of maybe 2 hours each – and with preparations for them of at least a couple of hours for each, initially. This amounts to between a half and a whole work-day each week. Note that none of this is funded by ISO. This effectively means that you don’t want to join the WG4 (or the editorial team) unless you really mean it. I don’t think there is any way to sugar-coat this – participating in standards work, be that in OASIS, ECMA or ISO is serious business and it takes up a lot of time. That being said, we need the NB-participation in this – otherwise the whole ISO-process regarding OOXML becomes, well, ‘moot’.

We actively encourage the NBs of SC34 to participate in the WG4 [and on a personal note; this should be regardless of position on OOXML itself]. The 29500-project drew an enormous amount of attention throughout the last months, and especially the feedback from those opposing OOXML has been extremely valuable. It would be sad, if all those good resources chose not to participate.

Openness

Openness here refers to the ability to participate in the process. This was sadly one of those areas where not much was changed – mostly due to JTC1 directives. The members of SC34 (and the subordinate working groups) are national bodies, so if you want to participate in maintenance of OOXML, you need to join a national body. In some countries there is a fee, in some not. In Denmark, as an example, it is free for NGOs and private persons to participate and there are discounts for SMBs if they should choose to join. For “regular” companies such as CIBER or IBM, the annual cost is about €3000.

We talked about setting up an informal channel for feedback from the community, but we ended with a decision about a closed NB-website for submission of comments to WG4 – essentially an electronic edition of the “Defect report” form from ISO.

I am thinking about creating an open, informal channel for feedback and comments on IS 29500. It should allow everyone to submit comments to the site about the spec and allow “comments on comments” to facilitate discussions on the feedback. The channel (a website, really) will be REST-enabled and allow any NB to use the contents of the website in their own work on IS 29500. The idea is to create a single point of feedback to enable not only the community to provide feedback but also assist the NBs with their technical work on IS 29500. The website is not a direct channel to SC34 but an informal place to discuss issues with the text. If anyone would like to contribute to this work with either ideas, funding (website costs etc) or technical expertise, please let me know.

Transparency

And what about transparency? Well, this will follow the rules of JTC1, which means that meeting reports and attendance-lists of face-to-face meetings will be posted on the SC34 website and be accessible for everyone. We will have to find a specific form for doing this, but I will do whatever I can to have the meeting reports be as much as possible like the meeting minutes from OASIS ODF TC. This means that not only will any decisions be recorded – details about the discussions around them should also be available. We will also likely be posting intermediate drafts of the specification for everyone to see – in exactly the same way OASIS ODF TC and ECMA TC45 have done until now. This will allow everyone to follow not only the work going on in WG4 – but also what the result of the work in the actual specification will be.

Participation – a final note

I missed a lot of people in London – the people and organizations opposing OOXML. I had expected a stronger representation from some of the big companies that have criticized DIS 29500 and I had also expected more of the opposing countries to attend. In effect, by not participating in the meeting, they contributed to the alleged “Microsoft stuffing”. I think it calls for a bit of after-thought on their part. They might not have had their cake (to eat) in Geneva, but not participating in the work is the sure road to an ECMA/Microsoft dominated WG in SC34. I will not begin to speculate (much) on possible reasons for not attending – maybe it’s just much easier to sit on one’s hands and claim “Microsoft stuffing” than actually attending the meeting. Just note, that it’s hardly “Microsoft-stuffing” when no one but Microsoft participates.

The happiness of solitude

Oh the loneliness ... oh the solitude.

They say that parting is such sweet sorrow, but I beg to differ. Things have really, really cooled down in the otherwise warm and cozy OOXML/ODF-blogosphere. Rob and Arnoud seem to have gone back to their day-jobs and Brian has somehow completely disappeared from the face of the Earth. Doug is mostly writing about what other people are writing about and Groklaw has gone back to their original angle - the SCO-Shenanigans. The only active blogger at the moment seems to be Rick, but even here, the normally so loyal Rick-bashers in the comment-threads seem to have gone AWOL.

Nothing seems to happen here in Denmark either. The Danish NSB met about a week ago, and we decided to make public the working documents that formed the foundation of the arguments and decisions of the last year. We formed a small technical sub-committee that did the technical work, first on the responses to the Danish public hearing in Spring 2007 and later on the responses from ISO to the 168 Danish comments on DIS 29500. The group consisted of CIBER Denmark, Ementor, IBM, Microsoft, ORACLE and the County of Aarhus. The technical group was an advisory group to the Danish SC34 mirror-committee. The working documents were made to allow us to keep up momentum and to document the progress we made. In short, for each meeting we made a list of the ISO editor responses that we could accept and those that we could not accept - and they were sent back to the ISO editor for further processing. The documents are in Danish, but they still give a good idea (regardless of native tongue) of what we did in the technical group and how we dealt with each issue. The documents are available at the Danish NSB website (last 7 documents at the bottom of the page in the section "Arbejdsgruppe-notater").

I have also more or less gotten back to my day-job as an Engineer with CIBER. I am currently investigating how to generate documents (ODF and OOXML) using .Net and it is actually kind of fun. With that in mind I was interviewed for a video-cast by Microsoft for a small discussion about ODF and OOXML (they conveniently cut out the part where I said that I prefer the markup of ODF over the markup of OOXML but still prefer the tools for OOXML over the tools for ODF (for generating documents on e.g. a webserver or ERP-system), but what can you do?). One of the points I made in the interview was that the tools are really important. If there are no good tools to create documents, it will slow down the adoption-rate of the particular file type. Regardless of SW-political view, the .Net-platform is rather large on a world-wide basis and the install-base of .Net-technology makes it a platform that should not be ignored (by size alone, if nothing more). And this puzzles me. If you look at the developer-hub of OOXML, you will find libraries, scripts and tools for just about any operating system and programming language available. But if you want to generate an ODF-file using .Net technology - what do you do? Well, you will probably find that the only (OSS) library available is AODL, a project under the ODF Toolkit umbrella. Unfortunately, the project is not a priority of OpenOffice.org. I wrote an email to the lead of the project (Dieter Loeschky from Sun) and he suggested that I join the project as a contributor. I have thought a bit about it, and I just might do so. I find it really important for the adoption of ODF that there are tools available for it, so if no one else will, I just might do it myself. I wonder if that will help everyone realize that I am a true ODF supporter.

And finally - the SC34 Ad Hoc Group 1 will convene in London at the end of July. We will meet and talk about what to do with both ODF and OOXML in the future. I am really looking forward to the meeting. The initial mail list reveals that there will be delegates from all over the world:

 
Country  #
Austria  2
Canada  1
Chile  1
Czech Republic  1
Germany  2
Denmark  3
Finland  3
India  4
Japan  3
Republic of Korea  2
Malaysia  2
Norway  3
New Zealand  6
United Kingdom  3
United States  2


I hope we will have a couple of productive days in London. As Alex Brown wrote after the Oslo plenary in April 2008, transparency of the process is a key point, and any input from you, dear reader, on how this could be achieved would be appreciated.

And finally-finally, I seem to have been struck by a bad case of "YABS" - Yet Another Blog Syndrome. Within the next few weeks I will begin blogging on the best IT-website in Denmark, Version2.

Generation of ODF-files on the .Net-platform

Some time ago I wrote a couple of articles about how to generate ODF-files as well as OOXML-files using .Net technology (both articles are in Danish). For generation of OOXML-files I used the - at that time - new .Net 3.0 System.IO.Packaging assembly and for generation of ODF-files I used AODL - a part of ODF Toolkit.

I thought it was time to refresh my skills - and share them with you guys - since the OOXML/ODF-debate has cooled down to a more relaxing level.

A few weeks back Microsoft released the first production-code edition of their OpenXml SDK - version 1.0. I will dig into this a bit later.

I thought I'd kick this series off with a couple of articles about ODF-file generation on the .Net platform, but I was unpleasantly surprised to realize that it might not be as easy as it sounded. First, I was told that AODL was a dead project. Indeed, the latest addition of code was in April 2007 and it seems that nothing has happened since. It looks as if the resources of ODF Toolkit are focused on ODFDOM - currently a Java-project. The problem is - AODL seems to be the only .Net-project available. I have stumbled across the ODF .Net project by IndependentSoft, but they sell an ODF library as Closed Source Software ... for (brace yourself) €999 a pop! Seriously - selling CSS-libraries is just sooo 2006 ...

And then I come to you, dear reader ... what the hell do I do? Do you know of other .Net libraries that allow me to create and manipulate ODF-files?

:-)