a 'mooh' point

clearly an IBM drone

Will you be my friend too?

The other day I noticed that the traffic on my blog had increased dramatically. I couldn’t really understand why (I have not written anything substantial in quite some time) – all I could see was that the majority of the visitors originated from Google. The search results were mainly “Jesper Lund Stocholm”, “Alex Brown” and “OOXML” … which told me just about nothing at all.

But then I noticed that a few of them also included the terms “boycottnovell” and “techrights”. This led me to check out the feed from #boycottboy’s website – and behold – I was actually mentioned in one of his defamatory articles.

The article is this: http://techrights.org/2010/09/11/sc34-is-still-a-farce/ .

The quotes start like this:

Weir is then met by opposition from Brown’s longtime right-winger, the ODF-hostile Jesper Lund Stocholm [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. He is a known Microsoft booster and Weir’s responses to him go like this:

I wonder if I shall refer to myself from now on as “Goose” or similar. Alex is clearly the Maverick here.

Now – and this is where I envy Rob – #boycottboy quotes a conversation I had with Rob (which I clearly won, btw :-) ) – but conveniently leaves out everything I wrote. It seems to me that Rob and #boycottboy live in some sort of symbiosis – each benefiting from the other. #Boycottboy clearly regards every little thing Rob writes to him as a “badge of honor” – regardless of the content itself. And Rob very much benefits from #boycottboy’s uncritical copy/paste of his comments/articles … with the infamous “#boycottboy reality distortion field” applied.

I wish I had a friend like that. I wish I had a friend that would blindly recite every syllable coming from my lips – without any sanity-check at all.

I totally get why #boycottboy needs Rob – but why IBM’s chief ODF Architect (elsewhere known as “ODF’s one-man-army”) needs someone like #boycottboy is beyond me.

And you know what the silver lining is here? I actually benefit greatly (and not only in a monetary sense) from being posted on #boycottboy’s site. The increased traffic from techrights.org keeps the fire burning, and it is almost a certainty that someone will write to invite me to speak on document formats at conferences, for potential customers or even for political parties.

So please keep it up, you two – it helps me avoid the wife and kid going to bed hungry at night.

Struck by the Wrath of Roy "Kahn" Schestowitz

As the real work of maintaining OOXML in ISO has begun, I have had some time to ponder over events throughout the last year - starting with the BRM in Geneva in February 2008.

Being in Geneva was really hard work, negotiating all day in a 120-seat plenum and preparing suggestions in the evening in cooperation with delegates from other countries. It was fun, but hard, nevertheless. I remember sitting on my bed in the hotel room trying to sort out everything while trying to keep up with the debates happening outside our meeting room (a de facto radio silence had been initiated voluntarily by the more prominent bloggers around the world, so no information was being released to the people desperate for the slightest amount of information).

One of the tools I used was to keep track of the sites referring to my blog and one evening as I sat eating Swiss chocolate on my bed in the hotel, I noticed a new referral from Google Groups.




Three monkeys - one was Håkon Lie

(corrected quote of Håkon Lie) 

After the demonstration in Oslo yesterday (damn I wish I had been there) the CTO of Opera Software, Håkon Wium Lie was interviewed by Norwegian newspaper VG. The interview is in Norwegian but let me translate a bit for you:

Håkon Lie: What might happen if Microsoft gets this [OOXML ISO-approval] [OOXML added to the list of approved mandatory document formats in Norway, JLS addition ] through is that Norwegian authorities may be forced to use it, and this means that if you receive an email with an attachment and you don't have a program to read this attachment - it could be a message from a teacher of your child that attends a Norwegian school - when you cannot open this attachment, you will have to buy software from Microsoft. So this is really a "Microsoft-tax" that may be the consequence if Microsoft wins here. We are against this.

Dear Håkon, I love the software you guys make - I use it every day on my cell-phone ... but are you out of your mind? I would expect those kinds of arguments from the typical Tux-f**kers (or in reverse, from the usual Microsoft fan-boys whose coding-skills revolve around point-and-click in Visual Studio Web Developer). I would not expect this from the CTO of the third-largest browser-producer in the world, and your argument here makes it all so much clearer to me why Standard Norge discarded your arguments.

I am sure Gene Amdahl would be proud of you.



We shall gather at the riiiiiiiveeeer!

Today it happened ... the world lost its perspective.


Censored by Big Blue-hoo?

Today I posted a comment on Arnaud Lehors' blog - I wanted to share my thoughts on his article about what JTC1's Fast-Track process was designed for. Arnaud moderates his blog, and he has been criticized for moderating it too rigidly and not allowing posts that disagree with him (check out the comment section of my previous article about IBM's trench war and Doug Mahugh's article). Similar accusations have been made against the other two of "the three stooges", Bob Sutor and Rob Weir.

I don't know where Arnaud lives (presumably in the US), so he might have been asleep when I posted my comment, but it took a few hours before my comment appeared on his blog. In the meantime I couldn't help wondering whether I had been moderated to death as well ... or whether it was all a storm in a tea-cup.

So on my way home from work I thought I'd help out a little with straightening out the confusion. I don't moderate my blog (and never will), so I hereby put forward, as a service to you all, the option of using the comment section of this entry as a "Big Blue Comment censorship archive".

So if you are about to post a comment on one of IBM's blogs, feel free to also post it here with a link to the blog post you would expect it to appear in.

I think this would be a win/win situation for us all. It will provide the means to show that IBM really is censoring their blogs ... and if IBM stops moderating so aggressively, they will be able to claim that we were all wrong.

I will cast the first stone.


Committee-stuffing (the anti-OOXML-way)

I just wanted to give everyone a heads up on some information I recently got on our cold (but warm at heart) friends way up in the most Northern part of Europe - the Norwegians.

It seems that Google and IBM have joined the Norwegian NSB (National Standardisation Body) within the last few days. So much for criticizing supporters of OOXML for being late joiners in various countries and claiming abuse of the standardisation process by undue influence.

If I know the FOSS-community right, they will now be tripping over each other's feet for a shot at "first post" being pissed about Google and IBM's actions - demanding that they withdraw completely from the process.

(or maybe not) 

Now if you ask me, it's not that big of a deal that some companies arrive late.

Matthew 20:16 - So the last shall be first, and the first last.

What is a big deal is that people should naturally contribute to the work in the NSBs if they join ... but simply focusing on the admission-date is really stupid. Contributing to the work is about taking part in the debate and discussions in the NSB. It's about doing homework between meetings and knowing what the hell is being talked about. Basically, it's doing almost anything but simply attending the meetings, sipping the free coffee. One could argue, though, that when paying DKK 20,000 for an annual membership, it doesn't really make sense to talk about "free coffee", but I am sure you catch my drift ...

Granted, being late does make it difficult to achieve any influence other than raising your hand when voting ... but having been a member of a committee for several years does not in itself ensure that you have participated. There are members of the Danish committee that I have never heard speak, and there are members of the Danish committee that send a different employee to each new meeting. They may not speak at the meeting - but they have certainly raised their hands when voting.

What is also important to me is that the rules in the specific NSB are not broken. If the Danish NSB decides that members can join the day before a vote (they can) - it's probably because the Danish NSB felt that it was OK to do so. If the Danish NSB decides that a member cannot vote until after a month of membership - it's probably because the Danish NSB felt that it was OK. Different countries have different rules and it is up to each NSB to manage these rules and make sure members obey them.

So what can you do? Well, how about rules that say:

  • You must have attended at least two meetings before you are eligible to vote.
  • You must participate actively in the discussions at the meetings.
  • Every two months point two above is evaluated, and by simple majority it is decided who gets kicked out of the committee.

Is it a bit extreme? Well, maybe ... but it is also a bit extreme to judge solely on the basis of the admission date.

IBM is now fighting from the trenches

After the BRM, being denied "speech" on the blogs of the front-runners among the IBM bloggers seems to be more the rule than the exception. First it happened to me on Rob's blog (where I commented on his patronizing tone towards a Czech delegate at the BRM), and now it has happened to me on Bob Sutor's blog as well. Actually I thought it was just my point of view that was really stupid (so that Bob was essentially doing me a favour in not approving my comment), but today in the newsgroup comp.os.linux.advocacy I heard that Bob had completely disabled comments on this particular post and on a post touting more of the, ahem, "balanced" views of the ODF Alliance. It seems to me that IBM has given up debating the issues at hand and is now using its blogs as mere portals with no user interaction ... at least no interaction with the people opposing their views.

Well, to the amusement of you all - here is what I wrote:

Hi Bob,

I am a bit confused as to why the lawyers of the Software Freedom Law Center have not compared Microsoft's OSP to IBM's Interoperability Specifications Pledge at http://www-03.ibm.com/linux/opensource/isplist.shtml .

They seem to focus on two sentences from the OSP, but similar ones are present in IBM's ISP:

New versions of previously covered specifications will be separately considered for addition to the list.

IBM will evaluate new versions or additional specifications for inclusion based on their consistency with the objectives of this pledge which is to support widespread adoption of open specifications that enable software interoperability for our customers, and may, from time to time, make additional pledges.

The OSP does not apply to any work that you do beyond the scope of the covered specification(s).

IBM irrevocably covenants to you that it will not assert any Necessary Claims against you for your making, using, importing, selling, or offering for sale Covered Implementations [...]. "Covered Implementations" are those specific portions of a product (hardware, software, services or combinations thereof) that implement and comply with a Covered Specification and are included in a fully compliant implementation of that Covered Specification.

By deduction, shouldn't OSS-developers avoid ODF too?

I won't repeat Bob's response to me, since it was in a private email, but Bob, please feel free to comment here.



Blog-roll update

Small note:

The other day Bob mistakenly took me for a Microsoft employee ... due to the contents of my blog-roll.


So now the persons on it are sorted alphabetically and not in the order they were added. I hope this clears up any confusion. 

Sloppy work at Version2

Version2 is an offshoot of Ingeniøren, the weekly trade journal for engineers. Over many years Ingeniøren has built a reputation as a reliable source of facts and as the de facto medium for technical, engineering-related discussions and debates. Ingeniøren is known as a serious, technical journal, and although engineering quickly turns political the moment more than three stakeholders have to share knowledge, Ingeniøren has not had a reputation for being politically coloured one way or the other. Ingeniøren has also (in the past) been known and loved for its in-depth articles on topics of interest to the "outside world". One example was the article on radiation from mobile phones and masts, which drove a thick stake through many of the hysterical comments that the public debate was overflowing with at the time.

Unfortunately IT has been under-represented in Ingeniøren, so I was delighted when I saw the first issues of Version2. To me there were clear signs that Version2 could develop into "the Ingeniøren of the IT industry", with a focus on technology and IT and with balanced coverage of relevant topics. I especially looked forward to seeing Version2's own take on the in-depth articles from Ingeniøren, here with a focus on IT.

I was therefore pleasantly surprised when, a couple of weeks ago, I saw on the front page of Version2 that there was an article about OOXML and ODF. The teaser (or whatever it is called) said: "Everybody talks about them, but few know what they are talking about. Version2 exposes the innards and puts files and folder structures on display". My thought was: "Hurrah - the in-depth articles are finally coming".

The article is split in two: an ODF part and an OOXML part. The OOXML part (which I will look at here) was written by journalist Jakob Møllerhøj. Let me state right away that there is nothing factually wrong in the article. On the other hand, it is an excellent example of how Version2 is coloured by software politics and of how its articles carry an obvious, unmistakable hint of disparagement of OOXML.

Preliminary remark

I would like to stress that the above is not a personal attack on the journalist but a critique of an article he has written. The very same issue of Version2 contains other articles by him that I read with pleasure. I am therefore not accusing him of being a "bad journalist"; I merely criticise that he has thrown himself into writing a technical article without being sufficiently at home in the material to make the article sound.

MS Office is not OOXML

First, the journalist makes the mistake of comparing MS Office and OOXML and treating them as the same thing. Incidentally, Finn Gruwier, who wrote the article's counterpart about ODF, falls into the same hole (as does, naturally, Stéphane Rodriguez), so one could say the journalist is in good company. For Finn it is just ODF and OpenOffice instead. I do not know where the misunderstanding comes from, but it makes no sense to equate either ODF with OpenOffice or OOXML with MS Office. The XML generated by the two programs is of course OOXML and ODF respectively, but neither program produces "minimal" XML; both smother it in clutter here and there.

OOXML has not been understood correctly

Second, the article's walk-through of OOXML shows that the journalist has not quite understood the more basic, esoteric details, and that he unfortunately uses this lack of knowledge (perhaps unconsciously?) to criticise OOXML unjustly. Let me mention, in no particular order, the places where things go off the rails:

  • "Microsoft introduced the OOXML format in the real world with the launch of the Microsoft Office 2007 suite, thereby moving away from the traditional binary file format that has always been predominant in the company's office suites."
    The Excel 2003 file format was based on XML; it is what has now become SpreadsheetML.

  • "An OOXML file is a collection of files in the ZIP package format. That is, the OOXML file type .docx, as the Word 2007 file format is called, can for example be unpacked with an ordinary archiver such as WinZip or, in this case, WinRAR."
    That is not entirely wrong, but not entirely right either. As described in Part 2 of OOXML, the OPC format is "package-agnostic", and it is only the physical persistence of an OPC package that uses ZIP. Incidentally, this is exactly how ODF does it, as Finn Gruwier Larsen also notes in his article about ODF.

  • "The unpacked file Hello World.docx contains a number of folders, including the application-specific folder 'word'."
    The folder "word" is not application-specific but persistence-specific.

  • ".docProps contains properties for the current application. For example, which Word template the OOXML file should use."
    As the name suggests, it is not application-specific things that are stored in the docProps folder but document-specific things. Nor is it a Word-specific template but an OOXML-specific template.
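The points about ZIP and the folder name can be illustrated with a small sketch. It builds a package in memory with Python's standard zipfile module, stores the main part in a folder that is deliberately not called "word", and then locates that part through the package's root relationships rather than by its folder name. The folder name "anyfolder" and the helper names are made up for illustration, and whether any particular word processor accepts exactly this package has not been checked here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical folder name: the point is that it does not have to be "word".
PART_DIR = "anyfolder"

PKG_NS = "http://schemas.openxmlformats.org/package/2006"
REL_TYPE_MAIN = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/officeDocument"
)
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

CONTENT_TYPES = (
    f'<Types xmlns="{PKG_NS}/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/vnd.openxmlformats-'
    'officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)
ROOT_RELS = (
    f'<Relationships xmlns="{PKG_NS}/relationships">'
    f'<Relationship Id="rId1" Type="{REL_TYPE_MAIN}" '
    f'Target="{PART_DIR}/document.xml"/>'
    "</Relationships>"
)
DOCUMENT = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
    "<w:t>Hello World!</w:t></w:r></w:p></w:body></w:document>"
)


def build_package() -> bytes:
    """Write the three parts into an in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", ROOT_RELS)
        zf.writestr(f"{PART_DIR}/document.xml", DOCUMENT)
    return buf.getvalue()


def main_part_name(package: bytes) -> str:
    """Find the main document via the root relationships, not by folder name."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        rels = ET.fromstring(zf.read("_rels/.rels"))
    for rel in rels:
        if rel.get("Type") == REL_TYPE_MAIN:
            return rel.get("Target")
    raise ValueError("no main document relationship found")


def document_text(package: bytes) -> str:
    """Concatenate the text of all w:t elements in the main part."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read(main_part_name(package)))
    return "".join(t.text or "" for t in root.iter(f"{{{W_NS}}}t"))


pkg = build_package()
print(main_part_name(pkg))
print(document_text(pkg))
```

A reader that follows the relationships in this way never needs to know what the folder is called, which is why the name says something about how the package was persisted and nothing about the application.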

Screenshots are misleading

Third, the article shows a number of screenshots that are supposed to support its text. One of these is a picture of the XML file that MS Office produces for the text "Hello World". But the example is exceedingly complex and does not really show what OOXML is. Not only does part of the XML file consist of data that is irrelevant to the document at hand, among other things references to schemas for mathematical content, but it also contains XML that stems from spell-checking the document. The XML file includes a schema with the word "wordml", just as ODF does correspondingly. This word, of all things, is highlighted in blue, which gives the impression that there is a problem with this text. If I count through the article, the word "application-specific" appears 4 times, incidentally wrongly every time, and the screenshot has been manipulated so that it looks as if OOXML in itself is application-specific to Word 2007.

As I mentioned above, the article is not a walk-through of OOXML but a (flawed) walk-through of the XML that Word 2007 produces. Nor is it a classic "Hello World!" example, since it is needlessly complex.

Helping the journalist help himself

To help the journalist a little with his next article, I have created my own "Hello World!" OOXML document [ Minimal OOXML.docx (1.16 KB) ]. It was generated by the application "jlundstocholm", which can be seen in the folder structure. Unlike the article, I have started from OOXML and end by showing the document in MS Office 2003. To make the example comparable with the ODF example, the picture has been removed from the document.

Contents of the OPC package:

(note the persistence-specific folder 'jlundstocholm')

Contents of the XML file document.xml

(I am quite sure that an ODF document can be trimmed down to a similar size)

And the file shown in MS Office 2003


For the sake of clarity I have also posted the contents of the two main files of an ODF document and an OOXML document. The documents were generated by OpenOffice 2.2 DA and MS Word 2003 DA respectively.

OOXML document.xml (1.02 KB)

ODF content.xml (2.57 KB)

Bit-masks in OOXML

When the battle over OOXML and ODF is such an uphill one, it is largely due to uncertainty about what the facts of the discussion actually are. One of the sources of this uncertainty is, unfortunately, Grokdoc.net. I write "unfortunately" because Grokdoc has produced a quite comprehensive review of the OOXML specification (which in itself undermines the argument that it is too big for anyone but Microsoft to get an overview of), but the review is unfortunately also flawed. The flawed parts are not that large; the problem is that it is precisely these erroneous assumptions that are used as the primary arguments against OOXML as a standard.

One of these errors concerns the so-called "bit-masks".

There is really not much wrong with the observation itself that there are bitmasks in the OOXML spec. The problem is that the trouble caused by their presence is exaggerated; indeed, Rick Jelliffe of O'Reilly goes so far as to say: "Some of the technical claims are silly (such as the "bitmask rubbish")".

Grokdoc defines a "bitmask" as:

"A bitmask is a technique to encode multiple values inside a single variable, by assigning a meaning to each individual bits of the variable. For example, the binary 10110001 (decimal 177) would mean Yes/No/Yes/Yes/No/No/No/Yes and contain the answers to 8 different yes/no questions."

Grokdoc then includes a list of the places in the OOXML spec that deal with bitmasks of various kinds.

OOXML contains a number of examples of the use of these bitmasks. One of the examples is from the section on the attribute usb1:

[Example: Consider font information specified as follows:
<w:font w:name="Times New Roman">
<w:sig w:usb2="00000008" … />

The usb0 attribute value of 80000000 specifies that the first 32 bits of the bitfield are
00000000000000000000000000001000, which corresponds to:
Arabic Presentation Forms-B
end example]

Note here that the bitmask is a string representation of a hex-encoded bit value.
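To make the "string representation of a hex value" concrete, here is a minimal sketch in Python. It expands the attribute value 00000008 from the quoted example into the 32-bit pattern the spec text prints, and does the same for the hex form of Grokdoc's decimal 177. The function names are my own and are not taken from the spec.

```python
def hex_to_bits(hex_value: str) -> str:
    """Expand a hex string into binary digits, four bits per hex digit."""
    width = 4 * len(hex_value)
    return format(int(hex_value, 16), f"0{width}b")


def set_bit_positions(hex_value: str) -> list:
    """Positions of the 1-bits, counted from the least significant bit (0)."""
    number = int(hex_value, 16)
    return [i for i in range(4 * len(hex_value)) if (number >> i) & 1]


print(hex_to_bits("00000008"))        # 00000000000000000000000000001000
print(set_bit_positions("00000008"))  # [3]
print(hex_to_bits("B1"))              # 10110001 (decimal 177)
```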

Another example, which I have come across myself earlier (but which is not mentioned on Grokdoc), is from section 2.8.11 ST_Cnf (Conditional Formatting Bitmask):

I have clipped the following explanation from the OOXML spec:

This simple type specifies the format for the set of conditional formatting properties that have been applied to this object.
These properties are expressed using a string serialization of a binary bitmask for each of the following properties (reading from the first character position right):

[Example: Consider a paragraph in the top right corner of a table with a table style applied. This paragraph would need to specify the following WordprocessingML:
<w:p>
  <w:pPr>
    <w:cnfStyle w:val="101000000100" />
    …
  </w:pPr>
  …
</w:p>
This paragraph specifies that it has the conditional properties from the table style for the first column, first row, and the NW corner of the parent table by setting the appropriate bits in the val attribute. end example]
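Reading such a value is a matter of looking at character positions. The sketch below decodes the example value in Python. Note that the list of property names is an assumption on my part: the excerpt above does not reproduce the list, so the order has been chosen to agree with the example (first row, first column and NW corner at positions 1, 3 and 10) and should be checked against the spec before being relied upon.

```python
# ASSUMED property order; only positions 1, 3 and 10 are confirmed by the
# example quoted above, the rest must be checked against the spec.
CNF_FLAGS = [
    "first row", "last row", "first column", "last column",
    "odd vertical band", "even vertical band",
    "odd horizontal band", "even horizontal band",
    "NE cell", "NW cell", "SE cell", "SW cell",
]


def decode_cnf(val: str) -> list:
    """Return the names of the properties whose character is '1'."""
    if len(val) != len(CNF_FLAGS) or set(val) - {"0", "1"}:
        raise ValueError(f"expected {len(CNF_FLAGS)} characters of 0/1: {val!r}")
    return [name for name, ch in zip(CNF_FLAGS, val) if ch == "1"]


print(decode_cnf("101000000100"))
# ['first row', 'first column', 'NW cell']
```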

Grokdoc has the following objections to the use of these bitmasks:

1. Bitmasks cause significant validation problems

Using bitmasks creates a new data model, separate from the XML data model. In particular, the bitmask cannot be described in or validated by XML Schema, Relax NG, Schematron or any standard XML schema language or current validator.

The bitmask value itself is a perfectly ordinary attribute on an element, and the OOXML spec even states that its content must match the regular expression [01]* . To say that this value cannot be validated is a curious complaint against OOXML, since other (complex) values in an XML file cannot be validated for meaning either. Even if I used enums in my XML schema, I would not be able to validate whether the individual "1" or "0", or "true" or "false", actually made sense.
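The distinction between validating the form of a value and validating its meaning can be sketched as follows. The check uses the pattern [01]* mentioned above together with an optional length; the function name and the length parameter are illustrative choices of mine and are not taken from the spec's schema.

```python
import re

# The lexical pattern mentioned in the text: any run of the characters 0 and 1.
FLAG_STRING = re.compile(r"[01]*")


def is_lexically_valid(value: str, length=None) -> bool:
    """Check the form of a flag string; says nothing about whether it makes sense."""
    if FLAG_STRING.fullmatch(value) is None:
        return False
    return length is None or len(value) == length


print(is_lexically_valid("101000000100", 12))  # True
print(is_lexically_valid("10100000010x", 12))  # False: 'x' is not 0 or 1
print(is_lexically_valid("111111111111", 12))  # True
```

The last call is the interesting one: a value claiming every property at once passes the check just as well as the example value does, and a set of twelve true/false attributes would be in exactly the same position.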

2. Bitmasks defeat XSLT manipulation

XSLT is the W3C standard for manipulating and converting XML documents, and is by far the most popular tool for working with XML. XSLT has no tools for bitwise operators, since bitmasks are not part of the XML data model.

It is important to stress that the bitmasks are really bit flags that are unrelated to one another. It is also important to stress that the bitmask is not "a series of bits" but a serialisation of a series of flags. It therefore makes no sense to claim that XSLT cannot be applied to these bit flags because XSLT has no "bitwise operators". Bitwise operators are things like OR, AND, XOR, NAND etc., but it makes no sense at all to talk about those in this context. Had the OOXML spec instead split these 12 formatting flags out into separate attributes/elements with the pattern [true,false], it would have been exactly as easy/hard to manipulate these values as it is now in the bitmask, which I would rather call a "short-hand positioning-definition" than a "bitmask". Referring to these flags as bits confuses more than it helps (one could say that OOXML has shot itself in the foot by calling them "bitmasks").
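A small sketch of that equivalence, in Python rather than XSLT: the flag string is converted to one named true/false value per position and back again, and a single flag is changed with plain string slicing. No bitwise operator is involved anywhere. The names flag1 to flag12 and the helper functions are hypothetical; in XSLT the same lookup would presumably be a call to XPath's substring() function.

```python
def expand(val: str) -> dict:
    """One named true/false entry per character position (1-based)."""
    return {f"flag{i}": ch == "1" for i, ch in enumerate(val, start=1)}


def collapse(flags: dict) -> str:
    """Inverse of expand(): serialise the flags back into a 0/1 string."""
    ordered = sorted(flags, key=lambda name: int(name[len("flag"):]))
    return "".join("1" if flags[name] else "0" for name in ordered)


def set_flag(val: str, position: int, on: bool) -> str:
    """Change the flag at a 1-based position using only string slicing."""
    return val[: position - 1] + ("1" if on else "0") + val[position:]


example = "101000000100"
print(expand(example)["flag10"])             # True
print(collapse(expand(example)) == example)  # True
print(set_flag(example, 2, True))            # 111000000100
```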

3. Bitmasks conflict with the Ecma TC45 charter

The TC45 is the Ecma Technical Committee charged with developing the Ecma 376 specification. The charter of the TC45 includes the specific goal of: "...enabling the implementation of the Office Open XML Formats by a wide set of tools and platforms in order to foster interoperability across office productivity applications and with line-of-business systems"

Since bitmasks cannot be implemented in any of the standard tools for XML data formats, their use is in conflict with the TC45's charter.

Isn't this stretching it juuust a bit? First, the tools for manipulating OOXML are not limited to XML tools such as XSLT. Second, there is nothing in these examples of bit flags that prevents manipulation of them from being implemented in XSLT. One can argue that the bit flags are not particularly "XML-like", but using this as an argument for rejecting the standard is, in my eyes, going too far.

4. Bitmasks are not extensible

The bitmasks specified by Ecma 376 are mostly of fixed length (a fixed number of bits). For example, the bitmasks used in sections 2.4.51, 2.4.52,, and are all of type ST_ShortHexNumber (2.18.86, p. 2591), which is defined as consisting of exactly 4 hexadecimal digits (16 bits, see above regarding conflicting definitions). The bitmasks in section are of type ST_LongHexNumber (2.18.57, p. 2542) which is defined as consisting of exactly 8 hexadecimal digits (32 bits, see above regarding conflicting definitions). The bitmasks in sections, 2.4.7, and 2.4.8 are of type ST_Cnf (2.18.11, p. 2478), which is defined as consisting of exactly 12 binary digits (12 bits). The bitmask in section (p. 5227) consists of exactly "three bits".

Because it is not possible to add new bits to a fixed-length bitmask, extensibility is extremely limited.

I can well understand why Grokdoc wants this option, because it is precisely one of the architectural principles of ODF: the ability to extend the spec depending on the application. Curiously enough, the absence of application-specific extension is precisely one of the basic principles of OOXML. The argument from OOXML is that it is hard to ensure interoperability when every implementation of the format can extend it as it pleases. If it were possible here to extend one of the rows of bit flags with extra flags, how would other applications be able to use them? This is one of a number of examples of how an architectural decision affects its concrete implementation. The discussion about these flags should therefore not really be about their existence but about which architectural principle should underlie the format.