One of the bigger issues I encountered while writing code for the Open XML Objects project is validation of the various XML files. Of course I want and need to use the XML Schema files downloadable from ECMA, but these schemas contain circular imports, which the XmlSchemaCollection class does not handle correctly. The second version of the .NET Framework provides the XmlSchemaSet class, which ought to relieve us of these problems, but it was a bit harder to get going than I expected. I do have a working sample for you, which is why I am writing this blog post. Be aware that the code is a bit strange because of how the XmlSchemaSet resolves imports of referenced schemas, so just bear with me.
First of all you need to create an XmlSchemaSet instance and add the ‘wml.xsd’ schema to the set. Easy enough, a few simple lines of code:
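// Requires: using System.IO; using System.Xml; using System.Xml.Schema;
XmlDocument doc = new XmlDocument();
doc.Load("document.xml");
// Create the set on the document's name table and attach it to the document.
XmlSchemaSet set = new XmlSchemaSet(doc.NameTable);
doc.Schemas = set;
// Validate is a ValidationEventHandler callback that receives schema warnings and errors.
XmlSchema wmlSchema = XmlSchema.Read(new StreamReader("wml.xsd"),
    new ValidationEventHandler(Validate));
// Adding the schema also pulls in every schema it imports.
doc.Schemas.Add(wmlSchema);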
Now we are getting to the strange part. When you add the ‘wml.xsd’ schema to the schema set, all the imported schemas are added to the set as well. Because one of these imports circularly imports the ‘wml.xsd’ schema again, it is present twice in the schema set right after the ‘Add’ call. When you try to validate the document using this schema set, you will receive an error telling you that the ‘txbxContent’ element is defined twice. The error is correct, because the schema really is present twice in the schema set. Fortunately there is a really simple solution to this problem:
doc.Schemas.Remove(wmlSchema);
Now that was easy enough! Remove only removes the schema you specify; all of its imports remain in the set. This means we have removed just the duplicate, not every schema in the set. Next we can validate the XmlDocument against the schema set:
doc.Validate(new ValidationEventHandler(Validate));
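The snippets above pass a Validate callback without showing it. Below is a minimal sketch of what such a handler could look like, not the one from the demo application. The ValidationLog class, its counters and the Check helper are names made up for this illustration, and the tiny in-memory schema stands in for the real ‘wml.xsd’.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper class: collects validation events so the caller can
// inspect them after XmlDocument.Validate returns.
public class ValidationLog
{
    public int Errors;
    public int Warnings;

    // Matches the ValidationEventHandler delegate signature.
    public void Validate(object sender, ValidationEventArgs e)
    {
        if (e.Severity == XmlSeverityType.Error) Errors++;
        else Warnings++;
        Console.WriteLine("{0}: {1}", e.Severity, e.Message);
    }

    // Hypothetical convenience method: validates an XML string against a
    // single in-memory schema and returns the collected events.
    public static ValidationLog Check(string xsd, string xml)
    {
        ValidationLog log = new ValidationLog();
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        XmlSchema schema = XmlSchema.Read(new StringReader(xsd), log.Validate);
        doc.Schemas.Add(schema);
        doc.Validate(log.Validate);
        return log;
    }
}
```

Counting events instead of throwing on the first one lets you see every problem in a single pass. If you would rather stop at the first error, throwing an exception from the handler is the usual alternative.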
Be very aware that these issues stem from limitations of the XmlSchemaSet class, which might be fixed in a later release. For this workaround you also need to know which schemas end up in the set twice. You could of course write code that detects the duplicates, which would give a more generic approach to the circular import problem.
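As a starting point for such a generic approach, here is a rough sketch that lists schemas whose target namespace already occurred earlier in the set. The SchemaSetInspector class and FindRepeatedNamespaces method are hypothetical names, and the target namespace check is only a heuristic I am assuming here: several distinct schemas may legitimately share one namespace, so treat the result as a list of candidates rather than confirmed duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Schema;

// Hypothetical helper, not part of the demo application.
public static class SchemaSetInspector
{
    // Returns every schema whose target namespace was already seen on an
    // earlier schema in the set. Heuristic only: a shared namespace does
    // not prove that two schemas are the same file.
    public static List<XmlSchema> FindRepeatedNamespaces(XmlSchemaSet set)
    {
        Dictionary<string, bool> seen = new Dictionary<string, bool>();
        List<XmlSchema> repeats = new List<XmlSchema>();
        foreach (XmlSchema schema in set.Schemas())
        {
            // Schemas without a target namespace are grouped under "".
            string ns = schema.TargetNamespace ?? string.Empty;
            if (seen.ContainsKey(ns)) repeats.Add(schema);
            else seen.Add(ns, true);
        }
        return repeats;
    }
}
```

Once you have checked that a candidate really is a duplicate, you can pass it to Remove on the schema set, just like the ‘wml.xsd’ schema above.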
Just add a nonsense node to your XML to see validation fail. I have included a demo application for you to test with; get it in the downloads section.
Hope it helps!
Wouter
Posted 07-09-2006 12:34 by Anonymous