Is "interoperability" a transitive characteristic?

Way back when I was a math-major at university, we were taught about "operations on sets". A set could simply be "the natural numbers", which can be defined as all integers greater than or equal to 0. An operation on this set could be addition of numbers, multiplication of numbers and so forth. An operation can have a number of characteristics, e.g. "commutative", "associative" or "transitive". An "associative" operator means that you can group the operands any way you want, and a "commutative" operator means that you can change the order of the operands. Confused? Well, it's not that complex when you think about it. The mathematical operator "addition" is an "associative" operator since (1+2) + 3 = 6 and 1 + (2+3) = 6. The operator "divide" is not associative since (1/2) / 3 = 1/6 whereas 1 / (2/3) = 3/2. Addition is also commutative since you can change the order of the numbers being added together. This is evident since 1+2+3 = 6 and 3+2+1 = 6. Similarly, "subtraction" is not a commutative operator since 1-2-3 = -4 whereas 3-2-1 = 0.

The transitive characteristic is a bit different from these, and the "everyday equivalent" would be when we infer something. So think of transitivity as a mathematical formulation of what we do when we infer.

The relation "is greater than" is transitive - as is "is equal to". Basically, a relation (such as "is greater than") being transitive means that if A > B and B > C, then A > C.

The latter popped into my mind the other day when I was pondering over interoperability between implementations of document formats.

Ever since Rob's ingenious article "Update on OpenOffice.org Calc ODF interoperability", I haven't been able to get it out of my head.

 


Extending OOXML

This article will have two topics - one about extending OOXML using the built-in extension mechanisms and one about extending OOXML itself.

Using built-in mechanisms

As I have written about earlier, OOXML has a (fun) part containing mechanisms for extending OOXML with vendor/domain-specific extensions. That part is "Part 3 - Markup Compatibility and Extensibility". The part describes different techniques for extending OOXML - most interesting is probably the sections about "Markup Compatibility Attributes and Elements", which describe ways to extend OOXML while remaining compatible with e.g. earlier/current versions of the specification.
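Just to give you a feel for what this looks like in a document, here is a minimal sketch of a vendor extension marked as ignorable. Note that the "myext" namespace and the <myext:sparkline>-element are made up for the occasion - only the mc-namespace and the mc:Ignorable-attribute come from Part 3:

[code:xml]<w:document
  xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
  xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
  xmlns:myext="http://www.example.com/myext/2009"
  mc:Ignorable="myext">
  <w:body>
    <w:p>
      <!-- hypothetical vendor-specific markup; because the "myext" prefix is
           listed in mc:Ignorable, a consumer that does not understand the
           namespace may simply skip the element instead of failing -->
      <myext:sparkline data="1 3 2"/>
      <w:r>
        <w:t>Hello world</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>[/code]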

So if you were a vendor wanting to add something to the spec - but couldn't wait for the slow ISO pace or simply needed the competitive edge of not revealing anything about future software releases to your competitors ... what could you do?

The first thing you should do is to decide if you want your new stuff to eventually make it into the spec. If you don't want that - you're done already.

Assuming you want it into the spec, here are a couple of hints to how you might approach it:

  1. Document your extensions thoroughly
  2. Present these extensions to SC34/WG4 with justification as to how and why you want them in the spec
  3. Work with us to polish the nitty-gritty details that you overlooked
  4. Make sure there are no legal or technical barriers for your competitors to implementing these new features
  5. Wait for the stuff to eventually be included in IS29500

So the real target of this is - if you haven't already guessed it - Microsoft. So to be even more specific, here's a little list of things to do for Microsoft - in case they want to extend IS29500:

You will probably have some additions to IS29500 in your implementation of Office 14. Assuming that you will at some point like these to be added to IS29500, this is what you should do:

  1. Document your extensions thoroughly. Remember, the quality of the documentation will be under the same scrutiny as the text of DIS29500 so please do it right the first time.
  2. Add the documentation of your extensions to your "Implementer's notes" on the DII-website. 
  3. Present these extensions to SC34/WG4 with justification as to how and why you want them in the spec.
  4. Work with us to polish the nitty-gritty details that you overlooked.
  5. Include the extensions and the documentation for it in your OSP.
  6. Wait for the stuff to eventually be included in IS29500.

Remember, the minute the first public beta of Office 14 hits the web, the documentation of the extensions as well as inclusion in OSP should be finished. Not a month later, not a week later - on day one!

Extending IS29500 itself

There has been a lot of talk lately to how IS29500 will be extended in the future. Specifically, how - and where - will new additions be included? IS29500 is comprised of two schema sets - a strict set and a transitional set. Currently the strict set is created from the transitional set, so strict is in fact a proper subset of the transitional set.

However - there is no guarantee that this will always be so.

My gut feeling is that transitional should be preserved as the "reflection" of the existing Microsoft Office documents (until March 2008) - in other words, in line with the scope of IS29500. I think that any new stuff should be added to the strict schema set only. The term "transitional" clearly implies this. As I recall the feeling in Geneva at the BRM, the idea behind the transitional set was that eventually it would no longer be needed and hence removed from the standard - at some point in the future. If we continue to add new features to the transitional set, we will never get to the point where we can honor the sentiment of this particular issue.

... at the moment we haven't decided anything yet ... so right now anything goes.

But what are your thoughts?

Markup compatibility and extensibility (MCE)

Part 3 of ISO/IEC 29500 is the fun part and if you haven’t read it yet, you really should do so – especially if you are thinking about implementing an IS29500-document consumer. Part 3 basically consists of two distinct areas – one that deals with compatibility and one that deals with extensibility. The first area is the target of this post.

For any consumer and producer of a markup format not cast in stone, it is important to be able to ensure compatibility both forwards and backwards as the format changes over time. This is where the “compatibility-thingy” comes into play.

The compatibility-features of OOXML enable markup producers to target different versions of applications supporting different versions of the specification or different features altogether. The tools to do this are called “Alternate Content Elements” (ACE) and “Compatibility-rule attributes”, and Part 3 is supposedly an exact remake of how compatibility and extensibility are handled in the binary Microsoft Office files.

The latter tool enables markup producers to “force” consuming and editing applications to preserve specific content – even if it is not known to them – as well as to tell consumers which parts of the document can safely be ignored. It can even instruct markup consumers to fail if they do not understand some parts of the markup. If this sounds kind-of “SOAP-ish” to you, the attribute name “MustUnderstand” to enable just this should sound even more familiar to you.
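Just to illustrate the idea - this is a made-up sketch where the "v2" namespace and its content are hypothetical; only the mc:MustUnderstand-attribute itself comes from Part 3:

[code:xml]<w:document
  xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
  xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
  xmlns:v2="http://www.example.com/wordprocessingml/v2"
  mc:MustUnderstand="v2">
  <w:body>
    <!-- a consumer that does not understand the hypothetical "v2" namespace
         is required to report an error instead of silently dropping content -->
    <v2:importantContent/>
  </w:body>
</w:document>[/code]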

The first tool can be thought of as sort of “a switch statement for markup”. It allows a markup producer to serve alternate versions of markup to target alternate feature-sets of different applications. The diverging markup would be listed as different “alternate content blocks” or “ACBs”, and it is essentially an intelligent way for a markup producer to tell a consumer that “if you don’t understand this bit, use this instead”.

An interesting use case would be to use ACE to improve interoperability when making text documents with mathematical content. It has long been an open secret that interoperability with OOXML was improving day by day – but not with mathematical content. Mathematical content in OOXML (or “OMML”) has for some reason not been a top priority with implementers of OOXML, so interoperability has been really, really bad.

Now, wouldn’t it be cool if there was some way for markup producers to serve MathML as well as OMML to consuming applications? Let’s face it – most of the competition to Microsoft Office 2007 is from applications supporting ODF, and they all (to a varying degree) support MathML. So a “safe assumption” would be that “if I create an OOXML text document with OMML and send it to a different application, it probably understands MathML much better than OMML”. Wouldn’t it be cool, if you could actually do this?

Well, to the rescue comes ACE.

ACE enables exactly this use case. ACE is based on qualified elements and attributes, so as long as you can distinguish between the qualified names of the content you are dealing with, ACE is your friend.

So let’s see how this would work out.

Take a look at this simple equation: a = b/c

In Office MathML (OMML) this is represented as:

[code:xml]<m:oMath>
  <m:r>
    <m:t>a=</m:t>
  </m:r>
  <m:f>
    <m:num>
      <m:r>
        <m:t>b</m:t>
      </m:r>
    </m:num>
    <m:den>
      <m:r>
        <m:t>c</m:t>
      </m:r>
    </m:den>
  </m:f>
</m:oMath>
[/code]

In MathML the formula is represented as:

[code:xml]<math:math>
  <math:mrow>
    <math:mi>a</math:mi>
    <math:mo>=</math:mo>
    <math:mfrac>
      <math:mi>b</math:mi>
      <math:mi>c</math:mi>
    </math:mfrac>
  </math:mrow>
</math:math>[/code]

(both examples have been slightly trimmed)

So how would one specify both these ways of writing mathematical content? Well, it could look like this:

[code:xml]<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document
  xmlns:omml="http://schemas.openxmlformats.org/officeDocument/2006/math"
  xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
  xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006">
  <w:body>
    <w:p>
      <omml:oMathPara>
        <mc:AlternateContent xmlns:mathml="http://www.w3.org/1998/Math/MathML">
          <mc:Choice Requires="mathml">
            <mathml:math>
              <mathml:mrow>
                <mathml:mi>a</mathml:mi>
                <mathml:mo>=</mathml:mo>
                <mathml:mfrac>
                  <mathml:mi>b</mathml:mi>
                  <mathml:mi>c</mathml:mi>
                </mathml:mfrac>
              </mathml:mrow>
            </mathml:math>
          </mc:Choice>
          <mc:Choice Requires="omml">
            <omml:oMath>
              <omml:r>
                <omml:t>a=</omml:t>
              </omml:r>
              <omml:f>
                <omml:num>
                  <omml:r>
                    <omml:t>b</omml:t>
                  </omml:r>
                </omml:num>
                <omml:den>
                  <omml:r>
                    <omml:t>c</omml:t>
                  </omml:r>
                </omml:den>
              </omml:f>
            </omml:oMath>
          </mc:Choice>
          <mc:Fallback>
            <!-- do whatever -->
          </mc:Fallback>
        </mc:AlternateContent>
      </omml:oMathPara>
    </w:p>
  </w:body>
</w:document>[/code]

So you simply add the compatibility-namespace to the file and add the "AlternateContent"-element. This element contains a list of "Choice"-elements and possibly a "Fallback"-element. The choices are evaluated in the order they appear, and the first choice whose requirements the consuming application understands is the one used.

And the benefit? Well, you can now have your cake and eat it too. If the consuming application supports it, it will display the equation based on the mathml-fragment – otherwise it will use OMML.

This is immensely interesting and applies to all sorts of places and use cases – heck, you can even use it to take advantage of some of the new stuff in the strict schemas of IS29500 while keeping intelligent compatibility with existing applications only supporting ECMA-376 1st Ed. Imagine the ECMA-376-way of doing dates in spreadsheets, and now add the possibility of using some of the new functionality added at the BRM - without the risk of breaking applications or losing data.

… that is if we change the namespace of the strict schemas, of course.
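To make the spreadsheet-date case a bit more concrete, the markup could look something like the sketch below. This is pure speculation on my part: the "s2" prefix and its namespace are stand-ins for a renamed strict SpreadsheetML namespace that does not exist today - which is exactly why the namespace change matters:

[code:xml]<mc:AlternateContent xmlns:s2="http://www.example.com/spreadsheetml/strict">
  <mc:Choice Requires="s2">
    <!-- an application knowing the hypothetical new strict namespace
         gets the ISO 8601 date introduced at the BRM -->
    <s2:c r="A1" s="1" t="d">
      <s2:v>2009-04-09T00:00:00Z</s2:v>
    </s2:c>
  </mc:Choice>
  <mc:Fallback>
    <!-- everybody else gets the good old ECMA-376 serial number -->
    <c r="A1" s="1">
      <v>39912</v>
    </c>
  </mc:Fallback>
</mc:AlternateContent>[/code]

The fallback keeps ECMA-376 1st Ed. applications perfectly happy, while a strict-aware application gets the richer representation.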


IBM: Thumbs up for OOXML!

Today news broke that ANSI (the US national standardisation guys) recently voted on the subject of approving OOXML as an "American National Standard".

The text of the ballot was:

Approval to Adopt the International Standards listed below as American National Standards:

  • ISO/IEC 29500-1:2008 (...) Part 1: Fundamentals and Markup Language Reference
  • ISO/IEC 29500-2:2008 (...) Part 2: Open Packaging Conventions
  • ISO/IEC 29500-3:2008 (...) Part 3: Markup Compatibility and Extensibility
  • ISO/IEC 29500-4:2008 (...) Part 4: Transitional Migration Features

A total of 18 organisations/entities were on the ballot and the result was

  • Approve: 12
  • Disapprove: 0
  • Abstain: 2
  • Not yet voted: 4

The details are here:

Date         Organization                        Vote
03/16/2009   Adobe Systems                       Not yet voted
04/13/2009   Apple Inc                           Yes
04/15/2009   Department of Homeland Security     Yes
03/16/2009   DMTF                                Not yet voted
04/09/2009   Electronic Industries Alliance      Yes
03/16/2009   EMC                                 Yes
03/16/2009   Farance, Incorporated               Not yet voted
03/16/2009   Google                              Not yet voted
04/15/2009   GS1 US                              Abstain (with comments)
04/13/2009   Hewlett Packard Co                  Yes
03/24/2009   IBM Corp                            Yes
04/15/2009   IEEE                                Abstain (with comments)
04/08/2009   Intel                               Yes
03/18/2009   Lexmark International               Yes
03/17/2009   Microsoft                           Yes
03/16/2009   NIST                                Yes
03/19/2009   Oracle                              Yes
03/16/2009   US Department of Defense            Yes

TOTAL: Yes 12, No 0, Abstain 2, Not yet voted 4

An interesting vote here is naturally the vote of "International Business Machines Corp", otherwise known as IBM. It seems they now support OOXML - good for them.

I think it is an extremely positive move from IBM and I salute them for finally getting their act together and supporting OOXML. I also hope IBM will follow in the footsteps of Microsoft in terms of TC-participation and join us in SC34/WG4 to contribute to the work we do. I think it is positive for the industry that Microsoft finally joined the OASIS ODF TC last summer, and I hope IBM will do the same with SC34/WG4 - we need other vendors besides Microsoft at the table. I also hope this means that IBM will speed up support for OOXML in either Lotus Symphony or OpenOffice.org. The support for OOXML in other applications than Microsoft Office 2007 is ridiculously low.

Thank you, IBM - you really made my day.


PS: I apologize for the colors of the table above

 

Lo(o)sing data the silent way - all the rest of it

Ok - this post is going to be soooo different than what I had envisioned. I had prepared documents for "object embedding" and "document protection" but when I started testing them, I soon realized that only Microsoft Office 2007 implemented these features - at least amongst the applications I had access to. These were:

  • Microsoft Office 2007 SP2
  • OpenOffice.org 3.0.1 (Windows)
  • OpenOffice.org 3.0.1 (Mac OS X)
  • NeoOffice (Mac)
  • iWorks 09 (Mac)

The reason?

  • OOo3 doesn't fully support object embedding
  • OOo3 doesn't support document protection
  • iWorks doesn't support object embedding at all
  • iWorks doesn't support document protection

So I'll just give you one example of what will happen when strict documents come into play - when applied to document protection.

Document protection is the feature that allows an application to have a user enter a password, and unless another user knows this password, he or she cannot open the document in, say, "write-mode". There is no real security to it, though; it is simply a hashed password that gets stored in the document.

This data is stored in the "settings.xml"-file in the document, and this was rather drastically changed during the ISO-process.

If you use Microsoft Office 2007 to protect your document, it will result in an XML-fragment like this:

[code:xml]<w:documentProtection
  w:edit="readOnly"
  w:enforcement="1"
  w:cryptProviderType="rsaFull"
  w:cryptAlgorithmClass="hash"
  w:cryptAlgorithmType="typeAny"
  w:cryptAlgorithmSid="4"
  w:cryptSpinCount="100000"
  w:hash="XbDzpXCrrK+zmGGBk++64G99GG4="
  w:salt="aX4wmQT0Kx6oAqUmX6RwGQ=="/>[/code]

You will have to look into the specification to figure out what it says, but basically it tells you that it created the hash using the weak algorithm specified in ECMA-376.

But as I said, this was changed during the BRM. Quite a few of the attributes are now gone for the strict schemas, and my take on a transformation of the above to the new, strict edition is this:

[code:xml]<w:documentProtection
  w:edit="readOnly"
  w:enforcement="1"
  w:algorithmName="typeAny"
  w:spinCount="100000"
  w:hashValue="XbDzpXCrrK+zmGGBk++64G99GG4="
  w:saltValue="aX4wmQT0Kx6oAqUmX6RwGQ=="/>[/code] 

The only thing I am a bit unsure about is the value for the attribute "algorithmName", but I guess it would be "typeAny". The result? Microsoft Office 2007 detects that the document has been protected, but it cannot remove the protection again - presumably due to the new attributes added to the schemas. I thought about creating new values using e.g. SHA-256 as specified in the spec, but the chances that Microsoft Office 2007 would detect this in unknown attribute values are next to nothing, so I didn't bother doing this. Feel free to play around with it yourself.

The Chase

We need a namespace change for the strict schemas - and I am thinking about ALL of the strict schemas, including OPC. If we don't do it this way, my estimate is that we will lose all kinds of data - and the existing applications will not (as they behave currently) inform their users of it. Making existing applications break is a tough call, but I value data/information integrity more than vendors needing to update a bit of their code.

And as for the conformance attribute? Well, the suggestion as it currently stands is to enlarge the range of allowed values of this attribute, and I think that makes sense. The values could be one of

  • strict
  • transitional
  • ecma-376

or something similar. Then when we make a new revision at some point in the future, we can add version numbers to them at that time. Changing the namespaces will also make it possible to use MCE to take advantage of new features of IS29500 while maintaining compatibility with existing applications supporting only ECMA-376 1ed. (more about this later)
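For the conformance attribute I mentioned above, I am thinking of something along these lines - again just a sketch, since today the attribute only allows "strict" and "transitional", and "ecma-376" is merely my suggestion:

[code:xml]<w:document
  xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
  w:conformance="strict">
  <!-- "strict" and "transitional" are the only values allowed today;
       "ecma-376" (or something similar) would be a new, suggested value -->
  <w:body>
    <w:p/>
  </w:body>
</w:document>[/code]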

And what should the new namespaces be named?

Well, they are currently like "http://schemas.openxmlformats.org/wordprocessingml/2006/main" . So an obvious choice would be "http://schemas.openxmlformats.org/wordprocessingml/JLUNDSTOCHOLM/main"


... or maybe simply "http://schemas.openxmlformats.org/wordprocessingml/main" would be better? Of course that would make it easy for developers to make mistakes, so maybe "http://schemas.openxmlformats.org/wordprocessingml/iso/main" would be even better?

Losing data the silent way - ISO8601-dates

In Prague we spent quite some time discussing how to deal with the fact that applications supporting ECMA-376 1st Ed. do not necessarily support ISO/IEC 29500:2008 strict as well. Our talks revolved primarily around how major implementations dealt with the modified functionality of the elements <c> and <v> in SpreadsheetML now that ISO-dates are allowed as content of the <v>-element. But “dates in spreadsheets” is not the only place where changes occurred. Changes were also made to other areas, including

  • Object embedding
  • Comments in spreadsheets
  • Hash-functions for document protection

This will be the first post in a series of posts revolving around how IS29500 differs from ECMA-376 and how existing applications behave when encountering a document with new content. What I will do here is to create some sample documents and load them in the applications I have access to that support OOXML the best. In my case these are Microsoft Office 2007 SP2, OpenOffice.org 3.0.1 and NeoOffice for Mac and Apple iWorks. If you want to contribute and you have access to other applications, please let me know the result and I'll update the article with your findings. If you have access to Microsoft Office 2007 SP1, I'd really like to know. When the series is done I'll post a bit about MCE and how it might help overcome some of the problems I have highlighted (if we get to change the namespace for the strict edition of the IS29500 schemas).

I should also note that as the series progresses, the examples I make will increase in complexity. A consequence of this will be that my examples will be more of a “magic-8-ball-type prediction” than “simple examples of IS29500-strict documents”. Since there is not a single application out there supporting IS29500-strict, the examples will be my “qualified guesses” to how applications might interpret IS29500-strict when they implement it.

ISO-8601 dates in SpreadsheetML

Let me first touch upon the problem with dates in SpreadsheetML since this was the problem we talked about the most. Gareth Horton from the UK national body hand-crafted a spreadsheet document with these new dates. I have modified his example a bit to better illustrate the point. Files are found at the bottom of this post.

In the original submission to ISO, dates were persisted in SpreadsheetML as “Julian numbers” (a serial representation) and subsequently formatted as dates using number format styles.

[code=xml]<sheetData>
  <row r="1">
    <c r="A1" s="1">
      <v>39904</v>
    </c>
  </row>
  <row r="2">
    <c r="A2" s="1">
      <v>39905</v>
    </c>
  </row>
(…)
  <row r="10">
    <c r="A10" s="1">
      <v>39913</v>
    </c>
  </row>
</sheetData>[/code]

So the above would create a column with 10 rows displaying the dates from April 1st to April 10th 2009 (the serial number 39904 corresponds to April 1st 2009).

Let’s change one of the cells to contain a date persisted in ISO-8601 format.

[code=xml]<row r="9">
  <c r="A9" s="1" t="d">
    <v>2009-04-09T01:02:03.04Z</v>
  </c>
</row>[/code]

So the cell contains an ISO-8601 date and it is formatted using the same number format as the other cells. I have added a bit of additional data to the spreadsheet to illustrate the problem with using formulas on these values.

Result

The interesting thing to investigate is what happens when this cell is loaded in a popular OOXML-supporting application. Note here that the existing corpus of implementations supporting OOXML supports the initial edition of OOXML, ECMA-376 1st Ed., so they would have no way to look into the specification and see what to do with a cell containing an ISO 8601 date value.

Microsoft Excel 2007 SP2

As you can see Excel 2007 screws up the content of the cell. And on top of that, should you try to manipulate the content of the cells with formulas, they are also basically useless. The trouble? Well, you are not notified that Excel 2007 does not know how to handle the content of the cell, so chances are that you’ll never find out – until you find yourself in a position where there are real consequences to the faulty data and kittens are killed.

OpenOffice 3.0.1 Calc

The result here is almost the same. Data is lost and the user is not notified.

NeoOffice for Mac

Again we see the same result. This is not so strange, since the latest version of NeoOffice shares the same code base as OOo 3.0.1, so behaviour should be the same.

iWorks 09 Numbers

Wow, so for iWorks on the Mac, the user is actually warned that something went wrong. Only trouble is - it does not warn you that the content of the cell is not valid - it informs you that the system cannot find the font "Calibri".

Conclusion

It is pretty hard to conclude anything but "this sucks!". None of the applications warn the user that they have lost data - and they all do exactly that - lose data.

Original file: Book1.xlsx (8.82 kb)

Modified file: book2.xlsx (8.22 kb)

The actual work we did in Prague

I thought I’d try to outline a bit what we actually did and what constituted our work in Prague.

The agenda framing our work throughout these three days was this:

  1. Opening 2009-03-24 09:00
  2. Roll call of delegates
  3. Adoption of the agenda
  4. Schedule for publication of reprints or Technical Corrigenda
  5. Defect reports
  6. Future meeting (face-to-face and teleconferences)
  7. Any other business
  8. Extension proposals from member bodies and liaisons
  9. Conformance testing
  10. Closing

The vast majority of our work was on item number 5 on the agenda, and each and every single minute was spent discussing the defect reports – including in lavatories, on our way to work, on our way back from work, during lunch, dinner, breaks and drinks … in short – we discussed DRs 24/7. This was as it was supposed to be – this was really the reason for all of us being in Prague.

The initial list of DRs we discussed was this (just to give you an idea of what we talked about):

08-0001 — DML, FRAMEWORK: REMOVAL OF ST_PERCENTAGEDECIMAL FROM THE STRICT SCHEMA
08-0002 — PRIMER: FORMAT OF ST_POSITIVEPERCENTAGE VALUES IN STRICT MODE EXAMPLES
08-0003 — DML, MAIN: FORMAT OF ST_POSITIVEPERCENTAGE VALUES IN STRICT MODE EXAMPLES
08-0004 — DML, DIAGRAMS: TYPE FOR PRSET ATTRIBUTES
08-0005 — PML, ANIMATION: DESCRIPTION OF HSL ATTRIBUTES LIGHTNESS AND SATURATION
08-0006 — PML, ANIMATION: DESCRIPTION OF RGB ATTRIBUTES BLUE, GREEN AND RED
08-0007 — DML, MAIN: FORMAT OF ST_TEXTBULLETSIZEPERCENT PERCENTAGE
08-0008 — DML, MAIN: FORMAT OF BUSZPCT PERCENTAGE VALUES IN STRICT MODE EXAMPLE
08-0009 — WML, FIELDS: INCONSISTENCY BETWEEN FILESIZE BEHAVIOUR AND EXAMPLE
08-0010 — WML: USE OF TRANSITIONAL ATTRIBUTE IN TBLLOOK STRICT MODE EXAMPLES
08-0011 — WML: USE OF TRANSITIONAL ATTRIBUTE IN CNFSTYLE STRICT MODE EXAMPLE
08-0012 — SCHEMAS: SUPPOSEDLY INCORRECT SCHEMA NAMESPACE NAMES

I think it’d be fair to say that we have come a long way since the time we were discussing if it was possible to use XSLT to simulate bit-switching or if an OOXML-file was “proper XML”.

For each of the DRs we covered we discussed if the DR was a technical defect or an editorial defect, what the possible implications of the DR would be to existing documents and existing implementations and if the DR belonged in a corrigendum (COR) or if it was an amendment (AMD). It was quite tedious work, but we managed to cover quite a lot of ground in the three days.

Corrigendum or amendment?

One of the first things to accept when working in ISO is that there are quite a number of rules to comply with. As it turns out, it is not our prerogative to decide if a DR goes into “the COR bucket” or if it goes into “the AMD bucket” – there are rules for this. The ISO directives, section 2.10.2, state that

A technical corrigendum is issued to correct [...] a technical error or ambiguity in an International Standard, a Technical Specification, a Publicly Available Specification or a Technical Report, inadvertently introduced either in drafting or in printing and which could lead to incorrect or unsafe application of the publication

If the above is not the case, the modification should be handled as an amendment.

Still, there are quite a lot of DRs that fall into the grey area of this definition. So to facilitate our work we made some guiding principles, and these principles were discussed at the SC34 plenary in Prague:

[…] in the interest of resolving minor omissions in a timely fashion, WG4 plans to apply the following criteria for deciding that the unintentional omission or restriction of a feature may be resolved by Corrigendum rather than by Amendment. All of the following criteria should be met for the defect to be resolved by Corrigendum:

  1. WG 4 agrees that the defect is an unintentional drafting error.
  2. WG 4 agrees that the defect can be resolved without the theoretical possibility of breaking existing conformant implementations of the standard.
  3. WG 4 agrees that the defect can be resolved without introducing any significant new feature.

Unless all the above criteria are met, the defect should be resolved by Amendment.

Of course we will still have to do an assessment for each and every DR we look at, but it is our view that these principles will help us quite a bit along the way and give us a more expeditious workflow. Notice also the wording “WG4 agrees”. Only a very small number of DRs fall clearly into either the COR- or the AMD-bucket, so it is not possible to regard these principles as a mere algorithm with a deterministic result. The principles require WG4 to agree to the categorization of DRs, so we'll actually have to sit down and talk everything through.

On the first day (or was it the second?) we also touched briefly upon the subject of modifying decisions made at the BRM. The delegates at the BRM were just normal people, and due to the short timeframe of the meeting, errors likely occurred. At some point or another, someone will discover we made a mistake and put a DR on our table. At this point we will have to figure out if we think the decisions made at the BRM are now cast in stone or if they should be treated by the same criteria as the other DRs we receive. As I said, we just touched upon the subject and didn't reach any conclusions on this. If you have any thoughts regarding this, please let me (and us) know. My personal opinion on this subject is that we in WG4, at this point in time, should be extremely careful when thinking about reversing decisions made at the BRM.

And finally, I thought I'd give you some pointers about what is in the pipeline of blog entries (I don't have as sophisticated a system as some people), so I'll just give you a small list of topics at the top of my mind these days:

  • Markup Compatibility and Extensibility
  • Conformance class whatnots
  • Namespace changes and the considerations about doing it or not
  • Why should we care about XPS?
  • Why I like the ISO model
  • Maintenance of IS26300 in ISO

 

Maintenance of IS26300 in SC34

The streets of Prague are buzzing with rumours coming out of the work in the working groups of SC34 and in SC34 itself, which is currently holding its plenary meeting in Prague.

It seems that SC34 has done the only clever thing to do - to create an Ad Hoc Group (AHG) with responsibility for maintaining IS26300. I applaud the decision to do so, and it has in my view been a long time coming.

The details and scope of the group are yet to be seen, but I am glad that SC34 has chosen to create it. There is only one entity responsible for maintaining ISO standards, and that is ISO. Maintenance of IS26300 has so far fallen between two chairs, where WG1 was initially responsible for it but has been preoccupied with other tasks. Also, I think the maintenance agreement for IS26300 has mentally blocked any work from being done.

The upside of this is that there is now a group in SC34 responsible for receiving defect reports submitted by NBs. One group is responsible for preparing reports to OASIS and for getting the responses back into the ISO system.

This is a clear improvement and it is a sign and a statement that we believe that IS26300 is too important to not have a group responsible for its maintenance in ISO.


WG4 meetings in Prague

Wow – this has been a tough week. I arrived at the hotel here in Prague (I am currently waiting in Prague Airport for my flight back to Copenhagen) at around 21:00. I met Doug in Copenhagen and flew with him to Prague and in the airport we ran into Kimmo. After 15 minutes in my hotel room I went down to the bar to get a “welcome to Prague”-beer. After another 15 minutes I crawled back to my room completely devastated due to a flu I hadn’t been able to get rid of. 5 seconds later Florian called and ordered me to get my ass down in the basement wine-bar where he was having drinks with Doug and Megan. I went back to my room when the bar closed at around half past midnight, did some last-minute updates/tweets and almost cried myself to sleep because of near-death-like fatigue.

… and the meetings hadn’t actually started yet.

The next morning the meetings started with a joint session between WG4 and WG5 at the Czech Standardisation Institute. A total of 31 delegates attended this initial meeting. Apart from the SC34 officers (SC34 chair, SC34 secretariat, WG4 convener), there were delegates from Canada, China, Czech Republic, Denmark, ECMA, Finland, France, Germany, Korea, Norway, South Africa, UK and USA. We had quite a lot of work on our table for these three days, and we immediately got to work after the initial pleasantries. A rough list of categories to be dealt with was “Defect reports”, “Rules of engagement” (or “Prime directives”), “Future work”, “Roadmap for future editions/corrections” and “Planning of future meetings and tele-conferences”.

If you’ve been following my twitter-feed (and the ones of Alex, Doug and Inigo) you’ll already have a notion of the insanely interesting things we talked about. But for those not following me (and you should!!!) we talked about sexy things like whether “named ranges” in spreadsheets were defined on the workbook-level or the worksheet-level, whether a reference to Unicode 5 implied dependencies on XML 1.1, whether xml:space applied to whitespace-only nodes or just to trailing and leading whitespace in element content, whether font-substitution algorithms in OOXML had a bias for Panose-fonts, whether “Panose” really meant “Panose1”, and the subtle differences between the Panose-edition of Hewlett-Packard and the one of Microsoft (as far as I understood it, anyway).

Can you imagine all the fun we had?

And you know what? We didn’t stop talking about it during lunch, dinner or breaks. As Doug noted in one of his tweets, the only difference between sessions and breaks was that during sessions, only one person talked at any given time.

Well, apart from all this fun, we made an enormous amount of progress. A total of about 169 defect reports have been submitted to us to this point, and we processed almost all of them. We didn’t close all of them, but we managed to process the most important ones and prepare ourselves for our first teleconference in mid-April. We laid down some ground principles upon which we will make decisions in the future and we talked about a set of “Prime directives” to form a mental basis for our work (think: The Three Laws of Robotics).

In short – it was a good week. I’ll post a series of blog posts in the next weeks outlining the results we achieved (and did not achieve), including both the extremely boring ones as well as the more controversial ones. So watch this space …

PS: I almost forgot. Microsoft sponsored a dinner/buffet for the participating experts on Wednesday. But what was even cooler was that they had lined up a bunch of Ferraris and Lamborghinis for us outside the restaurant, and we could just take a pick to choose a car to take home. Mine was red! Is that wicked or what?

To the nitwits from <no>ooxml.org: Take it home, boys!

OASIS to JTC1: Bye, bye ...

Ever since the hoopla about the OOXML approval there has been quite some discontent in the ISO community regarding how the ODF TC has fulfilled its obligations after the approval of IS26300. A few meetings have taken place to "mend the harsh feelings" and now some preliminary results have been sent to the NBs for consideration. For those with ISO privileges the documents [1], [2] can be found in the SC34 document repository.

There has been a lot of debate as to where maintenance of ODF should take place, be it in OASIS via ODF TC or via some construction as with OOXML, where the originating TC is included (assimilated) into SC34 and maintenance and development takes place there. I really don't care where these activities take place. I just want the best qualified people to do it.

Now, the documents deal with a definition of principles and a more specific definition of "who takes care of what?"-items. When reading through the documents, I couldn't help getting the feeling that what OASIS was essentially telling JTC1 was "It's my way or the highway".

JTC1 and OASIS have come to the following agreement around maintenance: 

  • OASIS ODF TC takes care of maintenance and development of ODF. 
  • National body participation in this work is encouraged to take place in ODF TC by either direct membership, via the "Comment mail list" or via TC Liaison (I didn't know JTC1/SC34 had one of those in ODF TC)
  • OASIS will submit each approved edition of ODF to JTC1/SC34 for approval to make sure that the approved standards are equivalent.

I completely agree on items 1) and 3) above, but item 2)? In the paper there is not a single sentence on how the procedures in JTC1 fit into all this. Why is there no wording regarding voting procedures in SC34? If the ODF TC comes up with something new and "substantially different", it should be submitted using the "PAS submitter status" of OASIS (similar to the Fast track procedure ECMA used with OOXML). But a PAS submission requires voting in SC34, and if the vote fails (or substantial concern is raised), a BRM is scheduled. If the comments are fixed, the result of the BRM will be an "errata-sheet" and a new vote takes place.

Suppose the post-BRM vote approves the submitted ODF edition

  • what will OASIS do with the errata-sheet?
  • what are the principles for getting them back into the OASIS-approved edition of ODF?
  • what is the time frame?

Is the truth really that OASIS doesn’t want JTC1/SC34 to do anything to ODF but rubber-stamp it when it comes our way?

When the original ODF 1.0 was submitted to JTC1, a maintenance plan was agreed upon. It had two small but really important words in it: "as is". The maintenance agreement said (AFAIR) that JTC1/SC34 was expected to approve future editions of ODF "as is". In other words, what OASIS managed to get JTC1 to agree to was essentially: "Don't look at it, don't open it, don't flip through it, just - don't touch it. Get a hold of the ISO-approval stamp, stamp it and send it back to us".

The only possible conclusion is that OASIS does not want any direct ISO-involvement in development of ODF.

That is fine - the ODF TC should do what they find best. But I am wondering if that also means that OASIS will not send future editions of ODF to JTC1 for approval? Surely, OASIS can't live with the reputation of having their standards simply rubber-stamped by ISO?

You may also ask why it is not good enough for JTC1-members to contribute to ODF through ISO. Well, OASIS is a vendor-consortium and the interests of the vendors seem to be somewhat different from the interests of the national bodies. If you look at the contributions of Murata Makoto and Alex Brown through the ODF Comment list, it is clear that their interest in the quality of schemas, constructs and the specification itself was not prioritized in the TC at all. To me a mix of vendor interests and national bodies is the best way to ensure high quality in any specification, but the proposed agreement between JTC1 and OASIS seems to cut out the national bodies acting as "national bodies".

I think it is a good idea to ISO-approve ODF in the future. But JTC1 needs to send a clear signal to OASIS saying that it is fine that they want the "Seal of ISO" and we welcome them. But in order to have the cake, OASIS must eat it too. The ISO package must come with two items: 1) the ISO quality stamp and 2) national body involvement. You cannot just have the stamp! It should be emphasized that it is the prerogative of the national bodies to process the standards that come their way, and that cutting them off and having them do nothing but rubber-stamping the specification is completely unacceptable.

The proposed maintenance agreement will be discussed at the JTC1/SC34 plenary in Prague on Friday, and I hope all national bodies have understood the ramifications of approving it. I suggest the plenary responds to JTC1/OASIS by saying: "Thank you for your suggestion for a maintenance plan for ODF, but come back again when we as national bodies have a solidly founded role in the maintenance of the specification".