The year is 2012. Open XML revision 2.0 has just made it through standardization by ECMA and of course ISO, despite heavy protests of Big Blue (and the still raging battles between the Autobots and Decepticons, since late 2005). Even though the new standard has been well received by the community, difficulties lie ahead. It seems Open XML object frameworks are breaking down all over the place, complaining about parsing errors and unexpected content.
Enter Markup Compatibility
If an Open XML vNext is created, which in all likelihood will happen, you will need to start differentiating between old and new versions of the XML markup. Luckily Microsoft has thought of this beforehand and has introduced the concept of Markup Compatibility. Since I haven’t heard too many people talking about it, let alone implementing it, I thought it would be a good idea to focus on it a little bit.
So first of all, what is the concept of Markup Compatibility, or MC in short? This feature of Open XML is all about future versions of markup and how consumers like Microsoft Office / Star Office / Open Office should respond to the changes. There is actually a dedicated part of the ECMA specs for Markup Compatibility, which you can read by downloading part five of the specs. When you do, you will find a very important line of spec for all of you who build frameworks for generating and editing Open XML:
Future versions of markup specifications shall specify new namespaces for any markup that is enhanced or modified by the new version.
New XML namespaces you say? Yes, new XML namespaces. Doesn’t this kill all my XPath queries? Yes, it undoubtedly will, unless you have enabled your framework to work with Markup Compatibility from the get-go (this is also the reason why my OpenXMLObjects framework is delayed a bit).
Besides differentiating between old and new versions of markup, MC can also be used to create independent extensions of the markup specification. We could, for instance, use MC to overcome the nested table issue when doing Open XML / ODF conversions.
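To make the XPath worry a bit more tangible, here is a minimal Python sketch using the standard library's ElementTree. The two tiny documents and the query are my own illustration; the namespace URIs are the elided ones from the samples in this post, kept as-is.

```python
import xml.etree.ElementTree as ET

# Elided namespace URIs, kept exactly as they appear in this post's samples.
V1 = "http://.../Circles/v1"
V2 = "http://.../Circles/v2"

# The same element, once in the old namespace and once in a new one.
doc_v1 = ET.fromstring(f'<Circles xmlns="{V1}"><Circle Radius="20"/></Circles>')
doc_v2 = ET.fromstring(f'<Circles xmlns="{V2}"><Circle Radius="20"/></Circles>')

# A query written against the v1 namespace only.
query = "c:Circle"
namespaces = {"c": V1}

hits_v1 = doc_v1.findall(query, namespaces)
hits_v2 = doc_v2.findall(query, namespaces)

print(len(hits_v1), len(hits_v2))
```

The query finds the circle in the first document and nothing in the second, even though the markup looks identical apart from the namespace.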
To ignore or to understand, that is the question
So, markup is allowed to use more than one namespace, for instance to indicate versions of that markup. Because this is a distinct possibility, it would be nice if we could instruct markup consumers a little on how to process the contents. Perhaps you want a namespace which is not understood by the consumer to raise an error, or perhaps you want to provide alternatives to the not-understood content.
The first piece of markup compatibility markup is the Ignorable attribute. You use it to tell a markup consumer that it must not raise an error when it runs into markup it does not understand, as long as that markup resides in a namespace declared ignorable. The mechanism is XML namespace based. A little sample:
<Circles xmlns="http://.../Circles/v1"
xmlns:mc="http://.../markup-compatibility/2006"
xmlns:v2="http://.../Circles/v2"
mc:Ignorable="v2">
<Circle Center="0,0" Radius="20" Color="Blue"
v2:Opacity="0.5" />
</Circles>
Here you find a <Circle> element. The second version of this element has a new Opacity attribute which does not need to be understood because of the Ignorable attribute. Note how the XML namespace prefix is used to indicate ignorable namespaces. You can specify a space delimited list of namespace prefixes.
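On the consumer side, honouring Ignorable could look roughly like the Python sketch below. This is my own simplification, not how any particular Open XML framework does it: the helper names are made up, it only looks at an Ignorable attribute on the root element, and it keeps one flat prefix table instead of tracking namespace scopes. The namespace URIs are the elided ones from the sample.

```python
import io
import xml.etree.ElementTree as ET

# Elided namespace URIs, kept exactly as in the sample above.
MC = "http://.../markup-compatibility/2006"
V1 = "http://.../Circles/v1"

SAMPLE = """<Circles xmlns="http://.../Circles/v1"
  xmlns:mc="http://.../markup-compatibility/2006"
  xmlns:v2="http://.../Circles/v2"
  mc:Ignorable="v2">
  <Circle Center="0,0" Radius="20" Color="Blue" v2:Opacity="0.5" />
</Circles>"""


def parse_with_prefixes(text):
    """Return (root, {prefix: uri}); ElementTree forgets prefixes otherwise."""
    prefixes, root = {}, None
    source = io.BytesIO(text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif root is None:
            root = payload  # the first "start" event is the root element
    return root, prefixes


def namespace_of(name):
    """Namespace part of an ElementTree '{uri}local' name, or None."""
    return name[1:name.index("}")] if name.startswith("{") else None


def strip_namespaces(elem, skip):
    """Drop attributes and child elements that live in a skipped namespace."""
    for name in list(elem.attrib):
        if namespace_of(name) in skip:
            del elem.attrib[name]
    for child in list(elem):
        if namespace_of(child.tag) in skip:
            elem.remove(child)
        else:
            strip_namespaces(child, skip)


def drop_ignorable(root, prefixes, understood):
    """Remove ignorable markup this consumer does not understand."""
    declared = root.attrib.pop(f"{{{MC}}}Ignorable", "").split()
    ignorable = {prefixes[prefix] for prefix in declared}
    strip_namespaces(root, ignorable - understood)
    return root


root, prefixes = parse_with_prefixes(SAMPLE)
drop_ignorable(root, prefixes, understood={V1})
circle = root[0]
print(sorted(circle.attrib))
```

A consumer that only knows v1 ends up with a plain Circle carrying Center, Color and Radius; the Opacity attribute is silently dropped instead of causing a parse error.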
Next is MustUnderstand, which obviously does the opposite.
<Circles xmlns="http://.../Circles/v1"
xmlns:mc="http://.../markup-compatibility/2006"
xmlns:v2="http://.../Circles/v2"
mc:MustUnderstand="v2">
<Circle Center="0,0" Radius="20" Color="Blue"
v2:Opacity="0.5"/>
</Circles>
Here a consumer of this markup has to raise an error when it encounters this markup and does not know about the ‘v2’ namespace.
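A consumer could enforce this with a check along the following lines. Again a hedged Python sketch of my own: the exception class and function name are hypothetical, and the prefix table is written out by hand where a real consumer would collect it from the parser.

```python
import xml.etree.ElementTree as ET

# Elided namespace URIs, kept exactly as in the sample above.
MC = "http://.../markup-compatibility/2006"
V1 = "http://.../Circles/v1"
V2 = "http://.../Circles/v2"

# Hand-written prefix table; a real consumer would get this from its parser.
PREFIXES = {"mc": MC, "v2": V2}

SAMPLE = f"""<Circles xmlns="{V1}" xmlns:mc="{MC}" xmlns:v2="{V2}"
  mc:MustUnderstand="v2">
  <Circle Center="0,0" Radius="20" Color="Blue" v2:Opacity="0.5"/>
</Circles>"""


class MustUnderstandError(Exception):
    """Hypothetical error type for a namespace the consumer must know."""


def check_must_understand(root, prefixes, understood):
    """Raise if any MustUnderstand namespace is unknown to this consumer."""
    for elem in root.iter():
        for prefix in elem.get(f"{{{MC}}}MustUnderstand", "").split():
            uri = prefixes[prefix]
            if uri not in understood:
                raise MustUnderstandError(f"namespace not understood: {uri}")


root = ET.fromstring(SAMPLE)

# A consumer that knows v2 passes the check without complaint.
check_must_understand(root, PREFIXES, understood={V1, V2})

# A v1-only consumer has to give up.
try:
    check_must_understand(root, PREFIXES, understood={V1})
except MustUnderstandError as error:
    print("refusing to process:", error)
```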
The availability of these two attributes raises two interesting questions. What about child elements of <Circle>? We may, or may not, understand those child elements. And what about saving the document back again: should this destroy the not-understood markup? There are a few things you can do to steer this. First of all, for elements in an Ignorable namespace you can specify that the markup consumer should process their contents anyway, even though the element itself is skipped. For this you can use the ProcessContent attribute. For round-tripping purposes you can specify PreserveElements and PreserveAttributes, which ask the consumer to keep the ignored markup when it saves the document.
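As I read the spec, processing the contents means the ignored element acts as a transparent wrapper: it is skipped, but its children are treated as if they sat directly in its parent. The Python sketch below shows that idea under several assumptions of mine: the v2:Group and v2:Shadow elements are invented for the illustration, the prefix table is written by hand, wildcards are not supported, and text between elements is not handled.

```python
import xml.etree.ElementTree as ET

# Elided namespace URIs, kept exactly as in the samples above.
MC = "http://.../markup-compatibility/2006"
V1 = "http://.../Circles/v1"
V2 = "http://.../Circles/v2"
PREFIXES = {"mc": MC, "v2": V2}  # hand-written for this sketch

# v2:Group and v2:Shadow are hypothetical elements, made up for the example.
SAMPLE = f"""<Circles xmlns="{V1}" xmlns:mc="{MC}" xmlns:v2="{V2}"
  mc:Ignorable="v2" mc:ProcessContent="v2:Group">
  <v2:Group><Circle Radius="20"/></v2:Group>
  <v2:Shadow><Circle Radius="5"/></v2:Shadow>
</Circles>"""


def namespace_of(name):
    """Namespace part of an ElementTree '{uri}local' name, or None."""
    return name[1:name.index("}")] if name.startswith("{") else None


def clark(qname, prefixes):
    """Turn 'prefix:local' into ElementTree's '{uri}local' notation."""
    prefix, local = qname.split(":")
    return f"{{{prefixes[prefix]}}}{local}"


def apply_ignorable(parent, skip, unwrap):
    """Drop ignored elements, but lift the children of those listed in unwrap."""
    kept = []
    for child in list(parent):
        apply_ignorable(child, skip, unwrap)
        if namespace_of(child.tag) not in skip:
            kept.append(child)
        elif child.tag in unwrap:
            kept.extend(list(child))  # keep the contents, drop the wrapper
        # otherwise the element and everything inside it is ignored
    parent[:] = kept


root = ET.fromstring(SAMPLE)
skip = {PREFIXES[p] for p in root.attrib.pop(f"{{{MC}}}Ignorable").split()}
unwrap = {clark(q, PREFIXES) for q in root.attrib.pop(f"{{{MC}}}ProcessContent").split()}
apply_ignorable(root, skip, unwrap)
radii = [circle.get("Radius") for circle in root]
print(radii)
```

The circle inside the group survives, while the one inside the shadow element disappears together with its wrapper.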
Providing alternatives for the blind
Now that you can specify a markup consumer should or may understand a certain XML namespace, you will want to provide the non-understanding consumers with some alternatives. This is also available through Markup Compatibility. Take a look at the following sample:
<Circles xmlns="http://.../Circles/v1"
xmlns:mc="http://.../markup-compatibility/2006"
xmlns:v2="http://.../Circles/v2">
<mc:AlternateContent>
<mc:Choice Requires="v2">
<Circle Center="0,0" Radius="20" Color="Blue"
v2:Opacity="0.5" />
</mc:Choice>
<mc:Fallback>
<Circle Center="0,0" Radius="20" Color="Blue"/>
</mc:Fallback>
</mc:AlternateContent>
</Circles>
The idea here is pretty obvious. You can provide alternative pieces of markup for consumers who understand a specific namespace. Just provide an <mc:Choice> element for each namespace you want to cater for, add an <mc:Fallback> for everybody else, and you are done.
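Picking the right branch is not much work for a consumer either. The Python sketch below is my own take, with hypothetical function names and a hand-written prefix table: it takes the first Choice whose required namespaces are all understood and otherwise uses the Fallback. It does not resolve alternates nested inside the chosen branch.

```python
import xml.etree.ElementTree as ET

# Elided namespace URIs, kept exactly as in the sample above.
MC = "http://.../markup-compatibility/2006"
V1 = "http://.../Circles/v1"
V2 = "http://.../Circles/v2"
PREFIXES = {"mc": MC, "v2": V2}  # hand-written for this sketch

SAMPLE = f"""<Circles xmlns="{V1}" xmlns:mc="{MC}" xmlns:v2="{V2}">
  <mc:AlternateContent>
    <mc:Choice Requires="v2">
      <Circle Center="0,0" Radius="20" Color="Blue" v2:Opacity="0.5" />
    </mc:Choice>
    <mc:Fallback>
      <Circle Center="0,0" Radius="20" Color="Blue"/>
    </mc:Fallback>
  </mc:AlternateContent>
</Circles>"""


def pick_branch(alternate, prefixes, understood):
    """Children of the first usable Choice, else those of the Fallback."""
    for choice in alternate.findall(f"{{{MC}}}Choice"):
        required = choice.get("Requires", "").split()
        if required and all(prefixes[p] in understood for p in required):
            return list(choice)
    fallback = alternate.find(f"{{{MC}}}Fallback")
    return list(fallback) if fallback is not None else []


def resolve_alternates(parent, prefixes, understood):
    """Replace every AlternateContent element by the branch that applies."""
    resolved = []
    for child in list(parent):
        if child.tag == f"{{{MC}}}AlternateContent":
            resolved.extend(pick_branch(child, prefixes, understood))
        else:
            resolve_alternates(child, prefixes, understood)
            resolved.append(child)
    parent[:] = resolved


def load(understood):
    """Parse the sample as a consumer that understands the given namespaces."""
    root = ET.fromstring(SAMPLE)
    resolve_alternates(root, PREFIXES, understood)
    return root


old_consumer = load({V1})
new_consumer = load({V1, V2})
print(old_consumer[0].attrib)
print(new_consumer[0].attrib)
```

The v1-only consumer gets the plain circle from the fallback, while the v2-aware consumer gets the circle with the Opacity attribute.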
Final thoughts
Let me finish up this (quite long) blog post with some final thoughts. The granularity of Markup Compatibility is based on namespaces. Although you can understand some, but not all, elements within that namespace, my guess is that you should try to conform and understand all elements in a namespace and not just a part of it.
The second final thought is on naming your namespaces. Here it will probably be wise not to change the entire namespace around, but to include some kind of version indicator. That will ease the workload of the code frameworks, allowing them, for instance, to use regular expressions for recognizing a specific namespace and its version.
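To illustrate, here is a small Python sketch that assumes the version indicator is a trailing /v followed by a number, as in the Circles samples above. That naming pattern is only the convention used in this post, not something the spec prescribes, and the function name is my own.

```python
import re

# Assumed convention: the namespace ends in "/v<number>", as in the samples.
VERSIONED_NAMESPACE = re.compile(r"^(?P<base>.+)/v(?P<version>\d+)$")


def namespace_version(uri):
    """Return (base, version) for a versioned namespace, or None otherwise."""
    match = VERSIONED_NAMESPACE.match(uri)
    if match is None:
        return None
    return match.group("base"), int(match.group("version"))


print(namespace_version("http://.../Circles/v1"))
print(namespace_version("http://.../Circles/v2"))
print(namespace_version("http://.../markup-compatibility/2006"))
```

With a scheme like this a framework can compare the base part to recognize "its" markup and then decide what to do based on the version number.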
Hope it helps