Moving towards OOXML(S) (update)

by jlundstocholm 31. May 2010 18:01

Some time ago I wrote about some of the enhancements of Microsoft Office in terms of how far they have made it in implementing the content of the conformance profile "Strict" or "<S>". As you might recall, I made a run-through of a list of feature areas and marked each with either a green, yellow or red traffic light. There were no red traffic lights, but some areas had a yellow marking. These were

  • "ink"
  • "legacy diagrams"
  • "groups"
  • "form controls"
  • "activeX objects".

These features previously used VML as the containing frame, but Microsoft Office 2010 was now supposedly using DrawingML for them. The reason they were yellow and not red was that I simply did not know how to test these things - either because of poor Microsoft Office skills or lack of proper hardware ("ink" is used on tablet PCs and I don't have one of those at hand).

Stockholm plug-fest

When WG4 met in Stockholm a couple of months ago, I got a chance to take a look at the documents I couldn't create myself. The cool thing about participating in these meetings is that there is an abundance of different hardware and software on the laptops of the delegates, so after one of the sessions a few of us had our own little "Microsoft Office OOXML <S> interop plug-fest" and I finally had a chance to get my hands on those files.

I could have simply updated the previous article with the new information, but a couple of interesting things emerged that made me write up a new piece.

First, the results:

File type | Feature | Comment | Result
DOCX | Ink Drawings | Previously used VML, now uses DrawingML | Success, green traffic light
XLSX | Ink Drawings | Previously used VML, now uses DrawingML | Success, green traffic light
PPTX | Ink Drawings | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | Legacy Diagrams | Previously used VML, now uses DrawingML | Success, green traffic light
XLSX | Legacy Diagrams | Previously used VML, now uses DrawingML | Success, green traffic light
PPTX | Legacy Diagrams | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | Drawing Shapes | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | Textboxes | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | WordArt | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | Groups | Previously used VML, now uses DrawingML | Success, green traffic light
XLSX | Form Controls | Previously used VML, now uses DrawingML - except on "chart sheets" | Success, green traffic light
XLSX | ActiveX Objects | Previously used VML, now uses DrawingML | Success, green traffic light
PPTX | ActiveX Objects | Previously used VML, now uses DrawingML | Success, green traffic light
XLSX | OLE Objects | Previously used VML, now uses DrawingML | Success, green traffic light
DOCX | ST_OnOff | Uses the new ISO-approved simple type without the values "on" and "off" | Success, green traffic light
XLSX | ST_OnOff | Uses the new ISO-approved simple type without the values "on" and "off" | Success, green traffic light
PPTX | ST_OnOff | Uses the new ISO-approved simple type without the values "on" and "off" | Success, green traffic light
XLSX | ISO-dates | Can persist dates in ISO-8601 format and avoids the "evil" serial dates. | Failure, red traffic light

The trained eye will notice that all the yellow lights have been replaced by green lights. In other words, the list above shows that even though Microsoft Office 2010 does not write <S>, the developers in Redmond have made significant progress.

There are a couple of interesting points about the technicalities of the files I looked at.

Predicting the future is difficult

The files containing "legacy diagrams" stand out, because of the way Microsoft Office 2010 breaks compatibility with e.g. Microsoft Office 2003 and earlier versions. The thing is - when loading a PPT-file with a legacy diagram from e.g. Microsoft Office 2003 the diagram will be in VML-format. When it is loaded in Microsoft Office 2010, modified and saved again - it won't save the diagram in VML. It just won't. The diagram will be saved all right - but now using DrawingML instead of VML. So this is essentially a case where interoperability with this "legacy" application is hurt since Microsoft Office 2003 has no idea what to do with the DrawingML it loads.

MCE to the rescue

For all the other files, MCE once again steps up.

If we look at the file containing the ink notations, (some of) the markup will look like this:


<mc:AlternateContent>
  <mc:Choice Requires="wpi">
    <w:drawing>
      <wp:anchor>
        <a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
          <a:graphicData uri="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" />
        </a:graphic>
      </wp:anchor>
    </w:drawing>
  </mc:Choice>
  <mc:Fallback>
    <w:pict>
      <v:shapetype>
        <v:stroke joinstyle="miter" />
        <v:formulas>
          <v:f eqn="if lineDrawn pixelLineWidth 0" />
        </v:formulas>
      </v:shapetype>
      <v:shape id="Ink 15">
        <v:imagedata r:id="rId6" o:title="" />
      </v:shape>
    </w:pict>
  </mc:Fallback>
</mc:AlternateContent>

(I have modified and trimmed the real XML for easier reading - especially the VML was really, really ugly)

So with this approach you can actually have the best of both worlds - the new and the old - without losing information. That is, if your application understands MCE, of course. This again shows what a great tool the alternate content blocks (ACB) of MCE are for this task. They allow you to innovate while still ensuring some sort of compatibility with earlier programs that do not know the new technology.
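
To illustrate how a consuming application might choose between the two branches, here is a minimal sketch in Python. It is not how Microsoft Office does it. The function name select_branch, the prefix table and the sample document are my own assumptions for illustration, and I assume the "wpi" prefix is bound to the namespace shown in the graphicData uri above. The rule it follows is simply: take the first Choice whose required namespaces are all understood, otherwise take the Fallback.

```python
import xml.etree.ElementTree as ET

MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Assumed binding for the "wpi" prefix (taken from the graphicData uri above).
WPI_NS = "http://schemas.microsoft.com/office/word/2010/wordprocessingInk"

# Heavily trimmed, hypothetical stand-in for the real markup.
SAMPLE = f"""
<mc:AlternateContent xmlns:mc="{MC_NS}" xmlns:w="{W_NS}" xmlns:wpi="{WPI_NS}">
  <mc:Choice Requires="wpi"><w:drawing/></mc:Choice>
  <mc:Fallback><w:pict/></mc:Fallback>
</mc:AlternateContent>
"""


def select_branch(alternate_content, prefix_to_uri, understood_uris):
    """Return the first mc:Choice whose required namespaces are all
    understood; otherwise return mc:Fallback (or None if there is none)."""
    for child in alternate_content:
        if child.tag == f"{{{MC_NS}}}Choice":
            # Requires holds a whitespace-separated list of namespace prefixes.
            required = child.get("Requires", "").split()
            if required and all(
                prefix_to_uri.get(prefix) in understood_uris for prefix in required
            ):
                return child
        elif child.tag == f"{{{MC_NS}}}Fallback":
            return child
    return None


root = ET.fromstring(SAMPLE.strip())
# ElementTree drops prefix bindings after parsing, so they are passed in by hand.
prefixes = {"wpi": WPI_NS}

old_app = select_branch(root, prefixes, understood_uris=set())
new_app = select_branch(root, prefixes, understood_uris={WPI_NS})

print(old_app[0].tag)  # the VML w:pict branch
print(new_app[0].tag)  # the DrawingML w:drawing branch
```

An application that has never heard of the ink namespace ends up with the VML in the fallback, while a newer one gets the DrawingML. A real MCE processor has more to deal with (mc:Ignorable, mc:MustUnderstand and so on), so treat this as a sketch of the selection step only.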

And what about them dates?

The even more trained eye will have noticed that the green traffic light for ISO-dates in SpreadsheetML has been downgraded to a flashing red traffic sign.

The first test I did with Microsoft Office 2010 CTP1 had a small check-box in the "backstage" area that would allow dates in spreadsheets to be persisted in ISO-8601 format. With RTM of Microsoft Office 2010 this check-box is gone so we are now back to using serial dates again.
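
For those who have not had the pleasure: a serial date is just a day count, whereas ISO-8601 writes the calendar date out. The sketch below shows the difference in Python. It assumes the 1900 date system with an effective epoch of 1899-12-30, which only holds for serial values of 61 and above because of the well-known fictitious leap day in 1900. The function names are my own.

```python
from datetime import date, timedelta

# Effective epoch of the 1900 date system for serial values >= 61.
EPOCH_1900 = date(1899, 12, 30)


def serial_to_iso(serial: int) -> str:
    """Convert a whole-day serial date to an ISO-8601 calendar date."""
    if serial < 61:
        # Below 61 the fictitious 1900-02-29 shifts the mapping by one day.
        raise ValueError("this sketch only handles serial values >= 61")
    return (EPOCH_1900 + timedelta(days=serial)).isoformat()


def iso_to_serial(text: str) -> int:
    """Convert an ISO-8601 calendar date (YYYY-MM-DD) to a serial date."""
    return (date.fromisoformat(text) - EPOCH_1900).days


print(serial_to_iso(40329))         # 2010-05-31
print(iso_to_serial("2010-05-31"))  # 40329
```

The number 40329 tells a human reader nothing, and its meaning depends on which date system the workbook uses, which is why the ISO-8601 form looked so attractive on paper.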

It would be easy to bash Microsoft for removing this check-box, and I am sure that many will. But the truth is that the removal of this feature is due to activities in WG4, where we maintain OOXML.

As some of you may recall, the introduction of ISO-dates in OOXML was done in Geneva at the BRM in those hectic days we spent there. The trouble with introducing the ISO-dates in OOXML was that it looked really, really good on paper - but it sucked in real life.

The reason is that we "forgot" to change the namespace name for ISO OOXML-files, so documents conforming to ISO OOXML<T> share their namespace name with documents conforming to ECMA-376. This has enormous consequences for spreadsheets and for those applications designed to support ECMA-376 but not necessarily ISO OOXML. At the second F2F of WG4 in Prague in 2009, we had a demonstration of how bad it was - not a single application would interpret these new dates correctly and - what was perhaps even worse - they did not display any warnings to the user.

We have been discussing this a lot in WG4, and in the end we decided to start the work to remove usage of ISO-dates from Part 4. This correction of a BRM decision was not easy to agree on (AFAIR it has not been finally approved as of yet) and the removal of the "Save-as ISO-dates" feature in Microsoft Office 2010 is probably in anticipation of this pending removal of ISO-dates from <T>. I think it is important to note that this removal was not due to "pleasing Microsoft". In fact - they had already implemented support for this in CTP1. We are removing ISO-dates from Part 4 due to problems with everybody else.

I always like to give credit where credit is due, and I think this is one of those cases. Microsoft has clearly worked with - and listened to - the standardisation community and has chosen to remove a feature they had already implemented.

So what's next?

Well, Doug Mahugh recently wrote about the approach of Microsoft when dealing with OOXML<S>. Amongst other things he wrote that Microsoft Office 2010 will have read-support for OOXML<S> but that "a small number of optional features" will still be lost (that's just new-speak meaning "we haven't implemented support for all of Part 1"). I asked him what that list consisted of, and Doug said they'd provide answers when they have them. I hope that list will come soon.

As I have mentioned earlier, CIBER Denmark A/S (the company I work for) is not in the "productivity-suite-business" - but we develop solutions that work with these suites be that Microsoft Office, OpenOffice.org, iWork or others. Having read-support for OOXML<S> in Microsoft Office 2010 helps us a great deal, because we can now start trimming our code to target OOXML<S> instead of OOXML<T>. We think that adds great value to us and our customers. But we need a definitive list of the areas where we can expect Microsoft Office 2010 to ignore our markup. If we can't have that we are forced to go the safe-route and keep producing OOXML<T>-files and we'd hate to do that. But without a list from Microsoft, we feel that our hands are tied behind our backs.

So please, Microsoft - give us the list ASAP. Otherwise the uncertainty of what Microsoft Office will ignore is too great a risk for us to start producing strict files, and your read-support for OOXML<S> is more or less useless to us.

Comments

6/11/2010 4:18:07 AM #

Alex Brown

@Jesper

Your inclusion, in your post title, of an "S" in pointy brackets has caused your entire site to appear (for me) in a struck-through typeface - this also spills over onto the blogroll on my own blog .

- help!

Alex Brown United Kingdom |

6/11/2010 6:20:25 AM #

jlundstocholm

Hi Alex,

well, this has caused all sorts of weird problems for me too, so I have changed the title to use normal parentheses.

Is this better?

jlundstocholm Denmark |

6/11/2010 6:51:00 AM #

Alex Brown

yup !

looks like a problem with blogengine ...

Alex Brown United Kingdom |

6/11/2010 6:58:58 AM #

jlundstocholm

Hi Alex,

Yes - I believe it is - or at least the theme "standard" I am using.

Which browser do you use? I use FF and it looks fine to me.

jlundstocholm Denmark |
