Sunday, February 12, 2012

Basic ISO-Schematron functions

Every blogger loves Google Analytics: oh, the voyeuristic joy of knowing what search terms people misspelled for Google to return your blog as a 'top' result is irresistible! My blog isn't popular because it's about VMware; it's popular because I keep misspelling VMware as WMware. Did you know my blog is the most popular WMware blog in the world?! Actually, that's not true. My blog isn't popular at all. But I'm hoping today's post about ISO-Schematron will turn my fortunes around. I'll be grabbing some search terms out of Google Analytics and answering the questions you never asked me, which is not dissimilar to hosting an unpopular talkback radio show.

I seem to get a lot of ISO-Schematron-related hits, so if you're uninterested in this topic, close your browser window now and throw your computer into a fountain. If you're interested in the XML validating magic of ISO-Schematron, feel free to read on or throw your computer into a fountain as well.

What is ISO-Schematron?

ISO-Schematron is an XML schema language. Schema languages are ways of making sure your XML is valid. Take a look at the following XML code:

<dog id="500">
     <name>Chop Chop</name>
     <breed>Silky Terrier</breed>
     <dob>2012-01-05</dob>
</dog>

It's well-formed XML. Without knowing too much about the XML, you can tell it describes a dog named Chop Chop. The dog has the breed Silky Terrier and a date of birth sometime in 2012. This reminds me, I'd better change my password reset questions now. Let's look at the same snippet of XML code, except with some creativity.


<dogPASTA
     <name>Chop Chop</name
     <breed>Silky Terrier
     <dob>2012-01-05</date fo birth y'all>
<dog> PASTAPASTASPTA


This is definitely not well formed! Not every opening tag has a closing tag, there are slashes missing and there's pasta everywhere! You don't use an XML schema language to determine whether this XML is well formed or not: you use an XML syntax checker and common sense. Let's look at yet another snippet of XML.


<dog id="500">
     <name>Chop Chop</name>
     <name>Choppy</name>
     <breed>Silky Terrier</breed>
     <dob>2012-01-05</dob>
</dog>


It's easy to tell that this XML is well formed (there's no pasta lying around, for starters). But is the XML valid? Is a dog allowed to have two names? Whether or not a dog is allowed to have two names is the decision of the owner. I, for one, welcome our multi-named dog overlords! But what if you don't? What if you believe that a dog can have one name only?! We'd be in disagreement! We'd both agree that the XML is well formed, but we would disagree on whether the XML was valid.

What is ISO-Schematron? (no really, answer my question this time).

Say I need to test XML code to see whether it meets some rules:

  • A dog must have an id attribute.
  • A dog can have multiple names, but must have at least one.
  • A dog must have a date of birth.
  • A dog must have a breed.
  • A dog's date of birth cannot be in the future.

With ISO-Schematron, I can write those rules down. Each rule contains one or more assertions, and each assertion is a test that the XML must pass. Writing these assertions gets tricky.
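
For context, rules don't float around on their own: they live inside a pattern, which lives inside a schema. Here's a minimal sketch of a complete schema file, assuming the iso prefix used throughout this post is bound to the ISO Schematron namespace. The title and the single assertion are only illustrations, not part of any real project.

<?xml version="1.0" encoding="UTF-8"?>
<iso:schema xmlns:iso="http://purl.oclc.org/dsdl/schematron">
     <iso:title>Dog rules (illustrative)</iso:title>
     <iso:pattern>
          <!-- A rule fires for every node that matches its context -->
          <iso:rule context="dog">
               <!-- An assertion reports its message when the test is false -->
               <iso:assert test="breed">A dog must have a breed</iso:assert>
          </iso:rule>
     </iso:pattern>
</iso:schema>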

Alright! Let's write assertions!

Let's start by ripping search terms out of analytics like uncreative Law & Order writers rip stories out of headlines and back episodes.

Google search term: count elements schematron children must have
Translation: Señor Paul, what ISO-Schematron assertion could check whether an element has the correct number of child elements?

What you're looking for is the count() function. The following snippet will make sure the dog element has exactly one breed child element.

<iso:rule context="dog">
<iso:assert test="count(breed) = 1">The dog element must have one breed element only!</iso:assert>
</iso:rule>

With a simple and, you can check the number of several types of child elements at once.

<iso:rule context="dog">
<iso:assert test="count(breed) = 1 and count(dob) = 1">The dog element must have 1 breed element and 1 dob element</iso:assert>
</iso:rule>
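
The same function covers the "at least one name" rule from my list. A quick sketch: swap the equals sign for a greater-than-or-equal comparison. I've written it as &gt;= to stay on the safe side of XML escaping inside an attribute value.

<iso:rule context="dog">
<iso:assert test="count(name) &gt;= 1">The dog element must have at least one name element</iso:assert>
</iso:rule>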

Easy, next!

Google search term: check schematron first element
Translation: Señor Paul, I too use ISO-Schematron to verify the validity of my XML files, possibly because I'm a student trying to cheat on my homework! Please assist me by explaining how to check if the first child element within an element is of a certain type.

Try this, buster. This will check if the first child element in a dog element is name.

<iso:rule context="dog">
<iso:assert test="*[1][self::name]">The first element within dog must be name</iso:assert>
</iso:rule>

Google search term: check existance of attribute schematron
Translation: do my homework for me please

First of all, you spelled existence incorrectly. I'm guessing you want to check if an element has an attribute. This snippet checks whether the dog element has an id attribute:

<iso:rule context="dog">
<iso:assert test="@id">The dog element doesn't have an ID attribute!</iso:assert>
</iso:rule>
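
That leaves the last rule on my list: a date of birth can't be in the future. XPath 1.0 has no notion of today's date, so the sketch below assumes your Schematron processor supports the XSLT 2.0 query binding (queryBinding="xslt2"), which not every processor does. It also assumes dob holds a single YYYY-MM-DD value; anything else will likely make the xs:date() conversion raise an error instead of failing the assertion politely.

<iso:schema xmlns:iso="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
     <!-- Declare the xs prefix so it can be used inside test expressions -->
     <iso:ns prefix="xs" uri="http://www.w3.org/2001/XMLSchema"/>
     <iso:pattern>
          <iso:rule context="dog">
               <iso:assert test="count(dob) = 1">The dog element must have exactly one dob element</iso:assert>
               <!-- le is the XPath 2.0 "less than or equal" value comparison -->
               <iso:assert test="xs:date(dob) le current-date()">The dog's date of birth cannot be in the future</iso:assert>
          </iso:rule>
     </iso:pattern>
</iso:schema>

Notice that both assertions sit inside one rule. Within a single pattern, a node is only tested against the first rule whose context matches it, so if you copy several context="dog" rules into the same pattern, merge their assertions into one rule.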


Easy! If you have any other tricky ISO-Schematron questions, put them in the comments and I'll try to help you and then make fun of you.
