Vitaly's WebLog
Software development, startups, marketing

Microsoft opens its Office binary format to public

February 21, 2008

A couple of years after introducing the Office 2007 Open XML file formats, Microsoft recently published specifications for its doc, xls and ppt binary formats. Almost everyone seems surprised by how complicated these formats are. For example, the Excel 97-2003 file format specification is a 349-page PDF file.

Joel Spolsky, who worked on Microsoft's Excel development team, shed some light on why the Microsoft Office file formats are so complicated. He gives many reasons, but they can be summarized in two main points:

  • These file formats were designed long ago, in the era of slow machines
      • They were designed to be fast on very old computers. For the early versions of Excel for Windows, 1 MB of RAM was a reasonable amount of memory, and an 80386 at 20 MHz had to be able to run Excel comfortably.
  • Microsoft did not care to clean up the formats or to design new ones for a long time
      • A lot of the complexities in these file formats reflect features that are old, complicated, unloved, and rarely used. They’re still in the file format for backwards compatibility, and because it doesn’t cost anything for Microsoft to leave the code around.

Reading that, one question kept popping up in my head: why did it take so long to switch to a better file format? Computers became fast enough that binary formats were no longer necessary more than 10 years ago, the Internet has been around for two decades, and XML became popular in the 1990s, yet Microsoft moved one of its best-selling products to a better format only in 2006. Even if they did not care about interoperability, how much effort did it take to support those formats and to train new people joining the Office team?

