Chameleon schemas considered harmful
Despite some warts, XML and the Web have done pretty well. They work and they work well. A large part of that is because both were designed with certain basic principles in mind. This gives them a unifying vision and a clean architecture that solves many problems.
However, when a technology becomes successful it often attracts developers who recognize its success but don’t recognize or understand the underlying reasons for its success. Each one wants to make a change here, an addition there, a deletion somewhere else. Sometimes these suggestions are good and valid. Sometimes they’re not. However, even the suggestions that address real needs and use cases cause problems if they’re made without a deep understanding of the principles of the thing being changed. It’s like modifying a building by knocking down walls, cutting new windows, and erecting an extra bedroom on the roof. If you do this without consulting the original blueprints and understanding the architectural principles that went into the house design, the best you can hope for is an ugly mess. More likely the whole structure will collapse around you, as the changes weaken the foundation the whole edifice rests upon.
Previous examples include cookies, frames, SOAP, YAML, SimpleXML, binary XML, RSS, and many other cases I could mention. However, the latest is coming from a place I really didn’t expect it: the W3C XForms and XHTML working groups. These two are working together to eviscerate XML namespaces, and to make it difficult or impossible to process XHTML2 and XForms with standard XML tools like XSLT and DOM.
The problem is the notion of chameleon schemas. These two working groups have decided to make it possible to change the namespace of an XForm so that in one document it has the namespace http://www.w3.org/2002/xforms, in another it has the namespace http://www.w3.org/2006/xhtml2, and in a third it has the namespace http://www.example.com/I/cant/believe/theyre/seriously/considering/this.
Because modern XML software identifies elements by local name and namespace URI, it will break every time an XForm is embedded in a new namespace. Writing a generic XForms processor will become extremely difficult. I expect developers will begin looking at the local name and ignoring the namespace URI, thus eliminating all benefits namespaces were supposed to have. Alternatively, they’ll just recognize a couple of namespaces (the default XForms namespace and the XHTML 2 namespace) and ignore all other possibilities. But even this will greatly complicate the code that implements the spec.
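To make the breakage concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The two sample documents and the function names are my own illustrations, not taken from any spec; the only assumption is that the same input control appears once in the XForms namespace and once moved into the host language's namespace.

```python
import xml.etree.ElementTree as ET

XFORMS_NS = "http://www.w3.org/2002/xforms"
XHTML2_NS = "http://www.w3.org/2006/xhtml2"

# Illustrative documents (assumptions, not from the spec): the same control,
# once in the XForms namespace and once "chameleoned" into the host namespace.
prefixed = (
    f'<html xmlns="{XHTML2_NS}" xmlns:xf="{XFORMS_NS}">'
    '<xf:input ref="name"/></html>'
)
chameleon = f'<html xmlns="{XHTML2_NS}"><input ref="name"/></html>'


def find_xforms_inputs(xml_text):
    """Namespace-aware lookup: match on namespace URI plus local name."""
    root = ET.fromstring(xml_text)
    return root.findall(f".//{{{XFORMS_NS}}}input")


def find_inputs_by_local_name(xml_text):
    """The workaround: drop the namespace and match on local name only."""
    root = ET.fromstring(xml_text)
    return [el for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "input"]


print(len(find_xforms_inputs(prefixed)))          # control found
print(len(find_xforms_inputs(chameleon)))         # control missed
print(len(find_inputs_by_local_name(chameleon)))  # found, but namespace ignored
```

The namespace-aware lookup finds the control in the first document and misses it in the second. The local-name lookup finds it in both, but it would just as happily match an unrelated input element from any other vocabulary.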
I honestly don’t understand why they want to do this. Perhaps the XHTML working group simply wants one less xmlns:xf attribute in each form-using XHTML 2 document? That hardly seems worth the trouble this will cause, but I haven’t been able to think of any other reason.
I have expressed my opposition to this insanity to the working group. You may wish to do the same. This is not how namespaces are designed to work, and it’s going to cause massive problems for anyone writing any sort of software to process XForms, whether it’s DOM, SAX, XSLT, XPath, or almost anything else. XForms elements should be able to be recognized by their namespace alone. You should not have to care about the host language in which they’re embedded. If we’re going to start changing the namespace for every host language that comes along, we might as well not have namespaces in the first place.
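As a rough sketch of what recognizing XForms elements by namespace alone looks like, the Python below collects every XForms element from a document without knowing anything about the host language. The function name, the two host documents, and the http://www.example.com/host namespace are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

XFORMS_NS = "http://www.w3.org/2002/xforms"


def xforms_elements(xml_text):
    """Return the local names of all XForms elements, in document order."""
    prefix = "{" + XFORMS_NS + "}"
    root = ET.fromstring(xml_text)
    return [el.tag[len(prefix):] for el in root.iter() if el.tag.startswith(prefix)]


# Two hypothetical host languages embedding XForms by reference.
# The prefix differs (xf: vs f:), but the namespace URI is the same.
xhtml_host = (
    '<html xmlns="http://www.w3.org/2006/xhtml2" '
    'xmlns:xf="http://www.w3.org/2002/xforms">'
    '<xf:model><xf:instance/></xf:model><xf:input ref="a"/></html>'
)
other_host = (
    '<doc xmlns="http://www.example.com/host" '
    'xmlns:f="http://www.w3.org/2002/xforms">'
    '<f:model><f:instance/></f:model></doc>'
)

print(xforms_elements(xhtml_host))
print(xforms_elements(other_host))
```

The same function works for both hosts because the namespace URI never changes. Once each host is allowed to rename the namespace, a function like this needs a list of every host namespace it might ever meet.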
October 26th, 2006 at 8:21 AM
Elliotte,
The reason you are finding it so easy to ‘save the world’ from this “insanity” is that you have set up a strawman…and by definition strawmen are pretty easy to defeat. However, in this case you’ve not understood what it is you are railing against.
The reason for the chameleon namespace schemas is to allow languages to incorporate XForms functionality directly. Say, for example, the VoiceXML group wanted to re-use the XForms data model, but they wanted it to be part of their language. They could do this by making the XForms model and instance elements part of VoiceXML, and the chameleon schemas allow them to do this easily.
XHTML 2 is doing this; it still defines an input control in the same way that HTML does, but it says that the functionality or definition of this control comes from XForms. This is good re-use in my book, and doesn’t necessarily require that the control is prefixed with xf:.
Regards,
Mark
Mark Birbeck
CEO, x-port.net
http://skimstone.x-port.net/
October 27th, 2006 at 10:07 AM
Why is it so important that the name not be prefixed by xf:? This is going to cause massive problems for anybody trying to process this stuff with standard tools. What benefit is gained by avoiding xf: that counterbalances this cost? Why might the VoiceXML group want to include XForms elements in their application in their own namespace?
Let’s be clear: chameleon schemas are not required to allow languages to incorporate XForms functionality directly. There is no reason I can see that a language such as VoiceXML or XHTML 2 cannot simply add the XForms schema by reference, while keeping those elements in their own namespace. That is a completely reasonable and plausible option. Why has this been rejected?
If you see a straw man here, please explain to me how what I’ve set up differs from what the XHTML2 and XForms working groups are actually doing. It was not my intent to set up a straw man argument. This is my understanding of what’s actually going on based on reading the specs and communications from various working group members. If the XForms working group is not planning to use different namespaces for XForms in different host languages (and specifically in XHTML 2) then please say so, and explain what they’re actually trying to accomplish with section 2.2.1 of the XForms spec, because if it’s not this I can’t figure out what it is.
October 30th, 2006 at 11:08 AM
I took a look at the spec section and I don’t think the sky is falling. If I understand correctly, the “included” elements from XForms would gain the namespace of the including/master schema. So any processing would be on the master namespace. It just so happens that the definitions of those elements come from the XForms schema. It is a way to clone the desired XForms elements without having to have multiple namespaces in the master schema.
Now, why do it this way rather than just include the namespaced XForms schema, I don’t know.
– Jasen.
October 31st, 2006 at 5:29 AM
Namespaces are broken anyway, in that the usage of namespaces in the wild no longer seems to be remotely related to the use cases put forward in the Namespaces spec. When was the last time you saw two elements with the same local names but different namespaces together? I’m not saying that there isn’t a use case for namespaces. From a developer perspective, namespaces are good for extending formats, but this use case is hampered by validation being too fragile in most cases to allow for easy extension.
Add to this that the number of namespaces is increasing at a faster rate than the number of specifications. This increase is fueled in part by the need of most organizations to use XML Schema, and to use it in an efficient, maintainable manner, which basically requires increasing the number of namespaces.
But I don’t think chameleon schemas are a big problem, if, as I understand it, the namespace of xforms in XHTML will always be the XHTML namespace. That said it is annoyingly stupid. It seems to be predicated on increasing productivity so that poor markup writers won’t have to write xf: when what it will probably do is decrease productivity because of having to port solutions for namespace x to namespace x+ all the time.
The big problem for XSL-T, DOM, any namespace processing application is the existence of xsi:type.
Luckily every application I’ve ever seen just assumes that xsi:type doesn’t exist. Some day someone will use that to cause problems I bet.
November 4th, 2006 at 2:44 PM
Elliotte,
I’m inclined to agree with Mark on this one, though you’re raising some salient points. XForms itself is a mess with regards to namespace – in addition to the xf namespace you have a separate namespace for events and another for schemas, and this is even before you get to a stage where you are dealing with out of band integration. If you assume, as I do, that at least from a browser standpoint the purpose of a namespace is to act as a key for a look-up table of libraries, then so long as the enveloping namespace – in this case, the XHTML 2.0 ns, maps the appropriate sub-bindings into the language and integrates the libraries – then the fact that the xforms namespace isn’t explicitly declared becomes moot.
The one aspect where I think the caveats should be raised is that this should in essence be a line in the sand kind of decision, and not be made to apply retroactively. That is to say, at this stage it would be dangerous and foolhardy to institute a chameleon rule for XHTML1.0, given that this specification is already frozen and forming part of the infrastructure. It seems rational to me to assume that if you have some nsXHTML2 namespace, the provision for chameleoning would have to be explicitly built into the schema, and perhaps would require that all chameleoned namespaces would have to be explicitly declared (I’m still going through the proposal on this, so just making guesses at this stage). This also raises a second tacit assumption that any collisions in namespaces that occurred would have to be fully qualified – for instance, if both XHTML and XForms had a switch statement, you would have to explicitly declare xf:switch as a child.
To go back to your previous analogy, I see such chameleons as being a critical piece in better integration of modular units. As it stands now, an XHTML 1.0 document increasingly looks like a Victorian house that’s been added to over the years by different builders, with siding on one part of the house, stone on a second and wood on a third. Chameleoning would at least make it a little easier to state “Here is a new wing of the house, but it needs to be sided with wood of a certain shade in order to keep it from looking out of place.” As someone who’s had to fight dealing with four or five different namespaces in XForms documents, I can say that I’d certainly welcome the ability to “hide” them.