Microsoft Office 2010 Beta, ODF and leap-year-bug

by jlundstocholm 5. February 2010 23:45

Some time ago I did some tests of Excel in Microsoft Office 2010 (CTP). The test was around OOXML - but test of ODF-support was missing.

One of the things ODF is missing but is in OOXML is the leap-year-bug ... although most of propably don't miss it all that much. The leap-year-bug is the good ol' Lotus 1-2-3 bug that treated 1900 as a leap year. As a consequence of that, calculations based on dates in the range from January 1st 1900 and February 28th 1900 with dates after this period will be off with one day.

Since Microsoft Office supports (a subset of) ODF, I thought it'd be fun to look at how Excel 2010 handles the leap-year-bug.

The first thing to do is to show how the leap-year-bug is handled by Excel:

So adding a day to February 28th 1900 will result in the non-existing date February 29th 1900, and if you subtract the dates February 27th 1900 and March 2nd 1900 (you'd expect the a value of 3) you actually get a value of 4.

So what will happen if you save this spreadsheet in ODF-format and open it again in Excel? You might expect that - since it was round-tripped through a format not supporting the leap-year-bug, the calculations would now be correct.

... but you'd be wrong. The result is excatly the same:

As I was, you might be wondering how the hell that was possible. But a simple inspection of the markup generated by Microsoft Excel 2010 reveals the answer:


<table:table-cell
  office:value-type="date"
  office:date-value="1900-02-29T00:00:00"
  table:formula="msoxl:=A2+1"
  >
  <text:p>29-02-1900</text:p>
  </table:table-cell>

A quick-and-dirty conclusion to this would be that Microsoft Excel 2010 violates not only ODF but also xsd:datetime, since February 29th is not a valid xsd:datetime. However, an inspection of ODF reveals that this is not the case. Microsoft Office claims conformance to ODF 1.1. and ODF 1.1 states the following about the value-space of the attribute "office:date-value" (Section 16.1 , p 702) :

A dateOrDateTime value is essentially an [xmlschema-2] date and time value with an optional time component. In other words, it may contain either a date, or a date and time value.

So strictly (*giggle*) speaking, Microsoft Office 2010 does not violate ODF 1.1 .

However - specifying an invalid date in an attribute that might contain xsd:dates is not very smart, dear Microsoft. Those of us wanting to use standard libraries to process the content of an ODF-document will likely get unpredictable results when trying to parse this invalid date. Heck, even .Net's DateTime.Parse()-method throws an exception when trying to parse this value.

Also, ODF TC has tightened up the prose in ODF 1.2 and it is now:

A dateOrDateTime value is either an [xmlschema-2] date value or an [xmlschema-2] dateTime value.

So Microsoft Office 2010 might not violate it now - but it will when ODF 1.2 comes out.

Extending ODF

Microsoft could always opt for extending ODF using the extension mechanism (to add elements and attributes using a foreign namespace). So Microsoft could chose to add their own attribute to the <office:spreadsheet>-element saying something like


<office:spreadsheet mso:EnableLeapYear="true"/>

The problem with this approach is that is comes into conflict with the new conformance clauses of ODF where a clear distinction between "normal" documents and "extended" documents is made. Procurement-wise it is a big no-no only to support extended documents (look what happened in Denmark!) and Microsoft risks that some government somewhere decides not to use Microsoft Office due to lack of conformance to the "normal" conformance clause of ODF 1.2.

Thus, Microsoft needs to find another solution ...

Configuration to the rescue!

Luckily for Microsoft (and we all know how picky they are wrt "preserving functionality" etc), there is a fully compliant way out of this while still preserving the leap-year-bug in spreadsheets - regardless of persistance format.

As you probably know that so-called config-item-sets are a gold-mine of endless possibilities. Originally (until ODF 1.1) the purpose of these elements and attributes were to store application specific settings, like (and this is a quote from ODF 1.1) "document settings, for example a default printer or view settings, for example zoom level". In ODF 1.2, all bets are off and there are no restrictions to the usage of the elements. The config-item-set elements were never meant to be an extension mechanism (by ODF TC co-chair from Sun/ORACLE - go figure), but OpenOffice.org uses them extensively - in fact, when creating a "blank" text-document, spreadsheet or presentation in OpenOffice.org, a total of 228 (76 for text documents, 66 for spreadsheets and 86 for presentations) settings (of which non are described in ODF) are defined in the the settings.xml-file of the packages. Somehow ODF TC has not found it necessary to include usage of config-item-sets in the "extended conformance clause", so a document can claim 100% conformance to ODF 1.2 "normal documents" while throwing dozens of settings into config-item-set elements. So the solution to claim conformance to ODF while enabling the leap-year-bug is simply


<config:config-item-set  config:name="mso:spreadsheet-settings">
  <config:config-item
    config:name="EnableLeapYearBug"
    config:type="boolean"
  >
    true
  </config:config-item>
</config:config-item-set>

This should be combined with this markup for the specific cell


<table:table-cell
  office:value-type="float"
  office:-value="60"
  table:formula="msoxl:=A2+1"
  >
  <!--<text:p>29-02-1900</text:p>-->
  </table:table-cell>

(and you don't really need the bit I have commented out).

I don't know about you, but I find this just darn right fantastic!

Smile

Comments

2/6/2010 3:06:22 AM #

hAl

You might want to try this example as well with a decimal date which ODF also supports

hAl Netherlands |

2/6/2010 5:41:49 AM #

jlundstocholm

Hi hAl,


You might want to try this example as well with a decimal date which ODF also supports


Well, I think you got it slightly wrong. For formulas in spreadsheets in ODF the persisting-format for dates is "ISO-dates". But most of the formulas in OpenFormula working on dates take "serial dates" as arguments and return-values. In other words - there is an implicit runtime-conversion of "ISO-dates" to "serial dates" in OpenFormula.

This is exactly the same way it is done in OOXML<S>. In OOXML<T> dates are also persisted in "serial format" so no conversion is necessary.

jlundstocholm Denmark |

2/6/2010 11:26:58 AM #

tzs

Here are a couple ways they could store the information they need in the date-value, without violating the schema.

1. Represent 1900-02-29 as 1900-02-28T23:59:60. Second values of 60+ are allowed to represent leap seconds, according to this: http://www.w3.org/TR/xmlschema-2/. It says that if they are encountered on days that did not have leap seconds, the extra second should be added to the following minute.

2. Seconds can have a fractional part, to an arbitrary precision. To represent a time on 1900-02-29, do the following. Take the representation of the same time on the 28th. Then, if the seconds are not fractional, add a '.'. Finally, append some long magic string of digits. Say, 47189234709129856123098741124.

For example, the time 1900-02-29T12:23:41 would be represented as 1900-02-28T12:23:41.47189234709129856123098741124. The time 1900-02-29T12:23:41.56 would be represented as 1900-02-28T12:23:41.5647189234709129856123098741124.

Excel, when reading dates, could check for dates on the 28th whose fraction seconds end in the magic string 47189234709129856123098741124, and could recover the correct buggy date. Programs that aren't supporting the Lotus leap year bug would see these as legitimate dates on the 28th.

tzs United States |

2/6/2010 11:12:48 PM #

jlundstocholm

Hi tzs,

For example, the time 1900-02-29T12:23:41 would be represented as 1900-02-28T12:23:41.47189234709129856123098741124. The time 1900-02-29T12:23:41.56 would be represented as 1900-02-28T12:23:41.5647189234709129856123098741124.

I have no doubt that this would work - but it seems lile "a hack on a hack"-way to represent 1900-02-29 in ODF. I am sure my suggestion has flaws in it as well, but it mimicks the way Excel handles dates today.

Smile

jlundstocholm Denmark |

2/6/2010 1:44:15 PM #

Rob Weir

ODF 1.1 says a bit more about office:date-value than you quote there.  If you drill down into the schema definitions you see that office:date-value may be a dateTime or time schema type.  And then if you drill down into the W3C's schema datatypes spec you see that they define dateTime in terms of the ISO-8601 Gregorian calendar.  Further they specifically call out (in 3.2.7) that putting a 29th on February on a non-leap year is not allowed:

"The value of each numeric-valued property (other than timeOnTimeline) is limited to the maximum value within the interval determined by the next-higher property. For example, the day value can never be 32, and cannot even be 29 for month 02 and year 2002 (February 2002)."

Ironically, we've been discussing the leap year calculation settings and Excel compatibility issues on the ODF TC this week, for ODF 1.2.  I dare say we're coming up with a far better solution than the silly one you are describing.

Rob Weir United States |

2/6/2010 11:30:45 PM #

jlundstocholm

Hi there Rob,

The value of each numeric-valued property (other than timeOnTimeline) is limited to the maximum value within the interval determined by the next-higher property. For example, the day value can never be 32, and cannot even be 29 for month 02 and year 2002 (February 2002).

So what you are saying is that there is a conflict in ODF 1.1 between the (lax) prose of the text and the schema definition?

Ironically, we've been discussing the leap year calculation settings and Excel compatibility issues on the ODF TC this week, for ODF 1.2.  I dare say we're coming up with a far better solution than the silly one you are describing.

Oh, I'd be dissapointed if you couldn't - I just thought I'd show the enormous possibilities of usage of config-item-sets without envoking the extended conformance clause.

As to leap-year-bug - I am curious to hear a bit on why you would want to add direct support for this in ODF? Just last week the Danish national body (SC34 mirror) submitted a defect report to WG4 requesting to have leap-year-support removed from strict documents.

Can you elaborate on your reasoning for this?

jlundstocholm Denmark |

2/7/2010 12:25:03 AM #

Rob Weir

Jesper, there is no conflict.  Conformance in ODF requires that you follow the schema as well other requirements stated in the spec. I'll grant you that the text of the specification can be vague at times.  But you must grant me that the schema is not vague in this regard.

And I never said that I wanted to add the leap year bug in ODF.  I said that the TC was discussing the topic.  The TC is free to discuss topics, even where I disagree.  This might be strange to you, working on OOXML.  You may not have had experience working with on a standard where there were multiple implementations and members were free to discuss alternatives, without requirement to be a slave to any one vendor or implementation.  But that is what happens on a real standard.

Rob Weir United States |

2/7/2010 12:48:05 AM #

jlundstocholm

Hi Rob,

Conformance in ODF requires that you follow the schema as well other requirements stated in the spec

Ok - I was actually going to check which was the "top dog" in terms or conformance in ODF - "the prose" or "the schema", but I was distracted by a baby Wink. If you are telling me that the schemas take precedense, it sounds perfectly legitimate to me.

But you must grant me that the schema is not vague in this regard.

Yes - perfectly true. The schema says this:

[code:xml]<define name="dateOrDateTime">
  <choice>
    <data type="date"/>
    <data type="dateTime"/>
  </choice>
</define>[/code]

... not much to discuss here.

And I never said that I wanted to add the leap year bug in ODF.  I said that the TC was discussing the topic.  The TC is free to discuss topics, even where I disagree.

Ok - this is a "Danish/English" misunderstanding. When I was saying "you" I was referring to "The ODF TC" and not you specifically. In Danish, we have two distinct words for your usage of "you".

But can you tell us a bit about the reasoning from other members of ODF TC to "discuss the topic"? I mean, if everyone says "no", there's not much to discuss, is there?

You may not have had experience working with on a standard where there were multiple implementations and members were free to discuss alternatives

Actually, for most of the meetings (face-to-face and TC calls) Novell, DataWatch and Microsoft has participated. So I do know a bit of what it feels like.

without requirement to be a slave to any one vendor or implementation.  But that is what happens on a real standard

You are a funny man, Rob Smile

jlundstocholm Denmark |

2/7/2010 11:59:52 AM #

Rob Weir

OK.  We have that distinction only in regional dialects in the U.S., in particular parts of the U.S. South, where "You all" or "Y'all" is the informal 2nd person plural pronoun.

In any case, I won't speak for the TC on this matter, since we have not come to an agreement yet.  If you want to participate in that discussion, you will need to join the ODF TC.

But my work here is done.  Just stopping by to point out another one of your misreadings of the ODF standard.  

Until next time.  

Regards,

-Rob

Rob Weir United States |

2/7/2010 10:38:00 PM #

jlundstocholm

Hi Rob,

"misreadings"? I thought we'd just agreed that there is a conflict between the prose and the schema of ODF?

Anyways, the discussion on whether a string is a valid dateOrDateTime is a bit moot since the attribute containing the value (office:date-value) is optional, so the attribute could simply be discarded and the table:formula would be the only indication of what to do (including the non-intrusive config-item-set setting of EnableLeapYearBug, of course).

Since you apparently only stopped by to correct me on the dateOrDateTime-thingy (and I do thank you for that), I assume you agree with me on my reading of the config-item-set/conformance clause sections of ODF.

Smile

PS: The Twitterfeed from JIRA is an excellent tool ... what a good idea.

jlundstocholm Denmark |

2/10/2010 1:17:30 AM #

Carlos

Rob, Jesper

Behave kids !  

Smile)

Carlos United States |

8/25/2010 5:08:23 AM #

Marlo

Very funny little test there with the leap year bug.  That darn leap year will get ya every time.  Especially if your lucky enough to have a birthday fall on the leap year. Haha Smile


(URL deleted

//Jesper)

Marlo United States |

9/5/2010 6:24:18 AM #

Qasim

Damn! My birthday is on a leap year lol! Oh God hehe. Nice little conversation there, especially with the Yall part


--
Ed: URL deleted

/Jesper

Qasim United States |

Comments are closed