" /> Bill de hÓra: September 2004 Archives


September 12, 2004

WWW cubed: syndication and scale

The rise of RSS reminds us once again that the web doesn't scale, but it's not time to throw the towel in yet.

The boy who cried RSS

Recently Robert Scoble of Microsoft announced that RSS, the syndication format used for news and weblog feeds, doesn't scale. He said this based on costs incurred by blogs.msdn.com, a hugely popular aggregator of Microsoft technology-oriented weblogs and news feeds. The problem was sufficiently severe that the sponsors of blogs.msdn.com took drastic measures to reduce the size of the content being accessed in order to lower the bandwidth costs, a step that has annoyed some users. Scoble concluded that RSS is "not scalable when 10's of thousands of people start subscribing to thousands of separate RSS feeds and start pulling down those feeds every few minutes (default aggregator behavior is to pull down a feed every hour)".

However, if there are problems with scale, they do not lie at the level of the RSS or (the more recent) Atom formats. The Atom IETF working group has been discussing this recently, and in its discussion both the issue and the potential solutions clearly lie with the protocol used to serve RSS: HTTP. The characteristics of RSS use are only a manifestation of a deeper problem with the Web. We need to look at the Web's infrastructure and design to find the causes.

It doesn't scale

The argument, "it doesn't scale", in it's worst form is an invitation not to think, and can be something of a dangerous and loaded accusation in technical communities, not unlike the way "you are in league with devil" used to be in village communities. It's certainly not an accusation to be throwing around causually. The suggestion is that "Doesn't scale" == "Bad", and that scale is an inherent goodness of some systems and not of others - a kind of zero sum game for technology. The truth is not so simple - some technologies do have better scaling characteristics than others, but most technologies can be made sufficiently scalable with some work. Werner Vogels of Amazon has said repeatedly that the implicit polling style of the Web doesn't scale. But, this doesn't mean he's lazy or unthoughtful - far from it. Vogels, before joining Amazon did massively distributed systems research; we can thus imagine he probably has a different idea of scale to most of us, in the same way Michael Schumacher has a different idea of what a fast car is to most of us.

The truth is most of us don't need the scale that Amazon, Ebay or blogs.msdn.com do. It would be silly and wasteful to buy all that bandwidth and computational horsepower and then watch it idle 99.9% of the time. Yet this is what many people do or are told to do. They buy a lot of expensive hardware, typically to scale (also to have physical redundancy), but that hardware is doing nothing most of the time and represents, in networking terms, "overcapacity", or in financial terms, sunk capital costs (if you think depreciation on a car fleet is bad, server infrastructure will have you crying into your spreadsheet).

There are precedents for this kind of problem. The electrical industry in its early days had difficulty catering for demand spikes rather than the average - power plants had to be designed to provide for maximum demand - but most of the time demand was minimal and the power plants were losing money. Electricity was not something easily or efficiently stockpiled like oil or coal, so battery storage wasn't a viable option. Early workarounds included the invention of the electrical consumer goods industry, so that we would have a reason to consume electricity around the clock and smooth out demand. The real breakthrough lay in the development of national grids that allowed excess to flow to wherever it was needed. That is why today in some places you can get your meter to run backwards if you feed a surplus of electricity into the grid. Today, a grid for computing is a very popular, well-researched and well-funded idea. But it's not clear yet that Grid Computing, as it's known, will allow applications to function untethered from the limitations of bandwidth and computation, if only because application data is more localized and biased than electricity, and as such is less interchangeable - information is not yet a currency. It's also not in everyone's commercial interest to decouple applications to that level from the infrastructure they run on - the evolution of a computing grid can be expected to be fractious.

This is not news

Back to HTTP. Anyone who has worked with HTTP for a while will know it doesn't react well to traffic spikes. On average, the HTTP Web has scaled very well in terms of its reach (it's a global network phenomenon). On the individual level of sites and site owners, it's not proven to scale as well. The problem has at least two names: the Curse of Popularity, and the Slashdot Effect.

Scaling the Web to its current levels has required both significant individual investment in servers and a massive investment in and deployment of an almost invisible system of server caches and storage networks. To avail of this other network you pay handsomely. The result is that what most people think of as the web (web sites, out there) is in fact a logical and abstract architecture. Physically, due to caching networks and any number of tricks to keep things running, it works rather differently.

Even so, the characteristics of news and weblog aggregation have the potential to overwhelm what has been done so far. This is because what has been done so far was done to cater for human use of the web, not machines. Humans are very slow at accessing the web, but have always had the advantage of being able to read semi-structured HTML markup. Machine reading of HTML, commonly known as "scraping", has in the past been the province of specialist tools and search engines such as Google. The advance of RSS and Atom markup has made reading content much easier for machines and as a result has seen a rise in automated applications that can and do download content at far greater frequencies than before (indeed the author of this piece has claimed in the past that the web and organisational intranets would come under increasing pressure due to the order of magnitude increases in traffic resulting from further automation). The impression users of RSS aggregators are left with is of a push medium, or semi-realtime update of news and content delivered direct to their computer. But that's the swan above the water line. Below the water line the aggregator is paddling furiously, frequently connecting and downloading content from dozens or even hundreds of sites, doing many times a day what would take a human days to do. It's as if the number of web users has started to grow exponentially again as it did in the mid-Nineties. However, much of the time this results in the same content being downloaded repeatedly on the offchance that anything has changed; a case of busy work.

Solutions beyond eminently clever and expensive caching techniques have been varied. Web servers based on different programming approaches to the popular servers (Apache, IIS) can scale to huge numbers of users, but these are not widely used, and can end up making matters complicated for application developers. In the syndication case, all a more capable web server means is that you will be hit for even greater bandwidth usage charges. This is because the problem is not so much supporting the number of visitors, but the number of times they are visiting.

In theory it would be much more efficient if the server could tell the aggregator what has changed. The thinking then tends to focus on dropping the Web and using alternative network protocols, such as the much maligned peer to peer (P2P) file sharing systems. It has sometimes been claimed of such systems that their ability to scale increases as more users (peers) join the network. Of these, Bittorrent represents perhaps the most viable candidate for integration into RSS usage - indeed Bittorrent was created to solve the problem of the Curse of Popularity. Another possibility is the use of instant message technologies such as XMPP, as pioneered by the PubSub aggregator service. Yet another is the old NNTP system on which Usenet runs. However the key attraction of HTTP is its ubiquity and vast reach - people love using it and administrators let it past their firewalls, something that can't always be said for IM and P2P protocols.

The most advanced thinking that doesn't involve throwing out the Web is probably Rohit Khare's PhD thesis [pdf], which suggests an "eventing", or push style, extension to the Web model. An early example of this approach, where the server calls back to the connected client instead of the client initiating each time, is available as open source under the name mod_pubsub. One of HTTP's designers, Roy Fielding, is rumoured to be working on a new protocol that could feature support for easing the load on servers.

It's common to hear an argument along the lines that you should expect to pay for popularity on the Web. This is specious and self-serving, for two reasons. First, you only pay for popularity because the Web is architected in a way that massively favours consumers of content over producers. It's a series of design decisions that has things teed up this way for the Web, not anything inherent to the Internet itself. Other protocols such as JXTA and Bittorrent have more even-handed characteristics. Second, it implicitly assumes producers in some way should have to pay for the right to be popular, as if popularity were due a levy, or a tax.

This aside, given the way the Web is today, you will pay for popularity whether you like it or not. There are many arcane-sounding things you can do to stave off the inevitable - gzip compression, delta-encoding, etags, last-modified headers, conditional gets - indeed, blogs.msdn has received some sharp criticism (inside and outside Microsoft) for not doing some of these things. But these do not address the fundamental problem - on the Web the burden of cost is borne by the producer.
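To make the conditional get point concrete, here's a minimal sketch of what a polite aggregator can do on each poll, using Python's standard library (the function and the idea of caching the validators between polls are illustrative - no particular aggregator's API is implied):

    import urllib.request
    import urllib.error

    def fetch_feed(url, etag=None, last_modified=None):
        """Conditionally fetch a feed.

        Returns (content, etag, last_modified), or None if the feed
        is unchanged since those validators were issued.
        """
        req = urllib.request.Request(url)
        if etag:
            req.add_header("If-None-Match", etag)
        if last_modified:
            req.add_header("If-Modified-Since", last_modified)
        try:
            resp = urllib.request.urlopen(req)
        except urllib.error.HTTPError as e:
            if e.code == 304:  # Not Modified - nothing to download
                return None
            raise
        # Remember these validators and send them back on the next poll.
        return resp.read(), resp.headers.get("ETag"), resp.headers.get("Last-Modified")

A 304 response carries no body, so an unchanged feed costs a few hundred bytes of headers instead of the full document - the busy work is still there, but the producer's bandwidth bill shrinks dramatically.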

It's notable that such costs will tend to squeeze out the smaller, poorer voices. This alone should be sufficient to concern anyone interested in a democratic and globally accessible medium. Often these are just the voices one wants to hear. Yet it's been like this since the web began; people who have something to say will stop saying it when it costs too much. The medium seems almost designed to manoeuvre a site into displaying advertising to pay its way (ironically, disliked as ads are by many web-savvy technologists). But the advent of RSS feeds has upped the stakes enough that even the biggest content producers on the planet are concerned about the costs. It shouldn't surprise anyone that if this problem is addressed, it will be because those who can afford to pay, will refuse to.

Responsibility

In all the talk about HTTP scalability, it's easy to forget another 'ility' - responsibility. Sean McGrath in his work on eGovernment systems has highlighted an interesting consequence of the kind of client server architecture that HTTP is predicated on - that the responsibility for accessing, sending and downloading content is borne by the client application and not the server*. If the server is not up, that's bad, but the job of getting the data moving around is squarely the client's. When you switch things around to a push based medium, the responsibility for delivery is now borne in part by the server owner: "The question of responsibility especially in the event of operational issues arising becomes complex. With a pull delivery model on the other hand, organisational boundaries are crisp and clear." This may not matter for consumer applications, but a surprising number of important business systems and services are now based on HTTP data transfers. And many people believe that syndication technology like RSS and Atom will also be used for commercially consequential exchanges in the b2b, or "business to business", arena. Switching from a polling to a pushing mode also confers a switching of responsibilities, and this might in time have far-reaching consequences where cost-efficiency is traded for risks, legal and financial. One day, your online bank might be morally and technically culpable for getting your bank statements to your computer. In that case, expect to sign even more of your rights away in the fine print.


* Disclosure: Sean McGrath is the CTO of the author's current employer, Propylon.

September 11, 2004

Setting up Cruisecontrol...

...took most of the day. It's a good tool, but finicky to set up. Also there isn't much info to be found out there on how to use it with Subversion. Here's what I threw together to port a largish modularized project over to it.

Fair Warning: At the end of the day I have no idea whether this is an idiomatic way to use Cruisecontrol - so - this howto might not be good for you. I also ought to say that I wouldn't have been able to set it up in a reasonable amount of time without having the source code to look at (and possibly, that's not the point of having the source code to look at). Once it's up and running tho' it takes care of itself.

Documentation

The documentation is... lacking. It makes you wish every open source project was like Hibernate or MySQL. Some of the attributes and element names are wrong or in camel case when they shouldn't be; some steps are left out; this is annoying but nothing you can't get past after flailing about for a bit and typing random stuff at the console until it works.

Layout

First, download Cruisecontrol and symlink it to /usr/java/cruisecontrol - then create a workspace under a user account's home folder:

    [propylon@cvsdub2 propylon]$ mkdir /home/propylon/builds
    [propylon@cvsdub2 propylon]$ mkdir /home/propylon/builds/work
    [propylon@cvsdub2 propylon]$ mkdir /home/propylon/builds/logs

work is where Cruisecontrol will find your projects; logs is where it will store log data.

    [propylon@cvsdub2 builds]$ ls -l
    drwxr-xr-x  3 propylon propylon  4096 Sep 10 22:23 work
    drwxr-xr-x  3 propylon propylon  4096 Sep 10 15:00 logs

Build

Building is the easiest part of the process. Here the documentation actually works out. To build a war file for web reporting, add a file called override.properties to cruisecontrol/reporting/jsp pointing the properties at the logs directory you just set up,

    user.log.dir=/home/propylon/builds/logs
    user.build.status.file=currentbuildstatus.txt
    cruise.build.artifacts.dir=/home/propylon/builds/logs

and run the build script by passing a 'war' argument to it. Drop that war into a servlet container, and you're done.

Config file (almost, first a scripting diversion)

The Cruisecontrol config is clunky. But before we get to that, let's point out that Cruisecontrol at heart wants to fire up a JVM per project. Maybe there is some drop-dead simple way around this, but I couldn't figure it out beyond patching the code itself. To allow for multiple projects, we create the following artefacts for each project:

  • run script: this is the file you'll use to boot Cruisecontrol for the project.
  • config: this is the Cruisecontrol config file for the project
  • ant file: this is a buildfile that lives outside the project structure. It does three things:
    1. blows away the last checkout under the 'work' folder
    2. runs a full checkout of the project
    3. calls a top level ant file in the project

The run file looks like this:

    #!/bin/sh
    export ANT_HOME=/usr/java/ant
    export JAVA_HOME=/usr/java/jdk14
    # put autobuild junk into a dummy folder
    export TOMCAT_HOME=/home/propylon/builds/HOME/TOMCAT_HOME
    # if you're reading a blog entry, skip these two
    export PROPELXBI_HOME=/home/propylon/builds/work/iams
    export IAMS_HOME=/home/propylon/builds/work/iams
    export PATH=$PATH:$ANT_HOME/bin
    ccmain=/usr/java/cruisecontrol/main/bin/cruisecontrol.sh
    $ccmain -projectname iams  -configfile cc-iams-config.xml &

and the build file looks like this:

    <?xml version="1.0"?>
    <project name="cc-iams" basedir="." default="build">
      <property file="cc-svn.properties" />
      <path id="project.classpath">
        <pathelement location="${svnjavahl.jar}" />
        <pathelement location="${svnant.jar}" />
        <pathelement location="${svnClientAdapter.jar}" />
      </path>
      <taskdef resource="svntask.properties" classpathref="project.classpath"/>
      <target name="build">
        <delete dir="work/iams"/>
        <svn  username="xxxxx" password="xxxxx">
          <checkout url="http://cvsdub2/svn/iams/trunk" revision="HEAD" destPath="work/iams" />
        </svn>
        <ant antfile="build.xml" target="build" dir="work/iams"/>
      </target>
      <target name="cp" description="print the build classpath">
        <property name="cp" refid="project.classpath" />
        <echo>${cp}</echo>
      </target>
    </project>

All these files go under that builds folder we just made. Here's what things look like so far:

    [propylon@cvsdub2 builds]$ ls -l
    -rwxr-xr-x  1 propylon propylon   393 Sep 10 21:25 cc-iams-build.sh
    -rw-r--r--  1 propylon propylon  1599 Sep 10 22:28 cc-iams-config.xml
    -rw-r--r--  1 propylon propylon   905 Sep 10 12:27 cc-iams.xml
    drwxr-xr-x  3 propylon propylon  4096 Sep 10 22:23 work
    drwxr-xr-x  3 propylon propylon  4096 Sep 10 15:00 logs

You could just call straight into the project's ant file and skip the new build file, but I like the idea of a separation between automated and project build systems (must... use... more... indirection). Incidentally, the project itself has ten or so standalone modularized builds that can be run from the master build referenced above.

Config file (almost, really, some subversion first)

It can be seen from the build file above that the project is in Subversion. That means we need to install the svnant libraries. This is easy to do: just unpack the distribution into /usr/java/svnant. The cc-svn.properties file can be reused across all projects and looks like this:

    svnant.version=0.9.1
    lib.dir=/usr/java/svnant/lib
    svnjavahl.jar=${lib.dir}/svnjavahl.jar
    svnant.jar=${lib.dir}/svnant.jar
    svnClientAdapter.jar=${lib.dir}/svnClientAdapter.jar

(it's lifted directly from the example provided by svnant)

So here's where we're at:

    [propylon@cvsdub2 builds]$ ls -l
    -rwxr-xr-x  1 propylon propylon   393 Sep 10 21:25 cc-iams-build.sh
    -rw-r--r--  1 propylon propylon  1599 Sep 10 22:28 cc-iams-config.xml
    -rw-r--r--  1 propylon propylon   905 Sep 10 12:27 cc-iams.xml
    -rw-r--r--  1 propylon propylon   608 Sep 10 14:28 cc-svn.properties
    drwxr-xr-x  3 propylon propylon  4096 Sep 10 22:23 work
    drwxrwxr-x  5 propylon propylon  4096 Sep 10 18:17 HOME
    drwxr-xr-x  3 propylon propylon  4096 Sep 10 15:00 logs

Ant 1.6 (config file is next, I swear)

Cruisecontrol ships with Ant 1.5.3. I found this out when it couldn't run an ant script with an import task, but for the sake of this howto I'm going to pretend I knew that upfront. The way to get around this without changing its classpath in the startup script is to use the "antscript" attribute on the ant element to point to a script instead, ie:

    <schedule interval="21600">
      <ant  antscript="/home/propylon/builds/ant16.sh"  
          buildfile="cc-iams.xml" target="build" />
    </schedule>

The script in turn points at your own ant distribution:

    #! /bin/sh
    export ANT_HOME=/usr/java/ant
    antmain=${ANT_HOME}/bin/ant
    $antmain "$@"

Let's add that to the builds folder:

    [propylon@cvsdub2 builds]$ ls -l
    -rwxr-xr-x  1 propylon propylon    77 Sep 10 22:18 ant16.sh
    -rwxr-xr-x  1 propylon propylon   393 Sep 10 21:25 cc-iams-build.sh
    -rw-r--r--  1 propylon propylon  1599 Sep 10 22:28 cc-iams-config.xml
    -rw-r--r--  1 propylon propylon   905 Sep 10 12:27 cc-iams.xml
    -rw-r--r--  1 propylon propylon   608 Sep 10 14:28 cc-svn.properties
    drwxr-xr-x  3 propylon propylon  4096 Sep 10 22:23 work
    drwxrwxr-x  5 propylon propylon  4096 Sep 10 18:17 HOME
    drwxr-xr-x  3 propylon propylon  4096 Sep 10 15:00 logs

Config file (at last!)

Here's a basic config file:

    <cruisecontrol>
      <project name="iams" buildafterfailed="false">
        <bootstrappers>
          <currentbuildstatusbootstrapper file="logs/iams/currentbuildstatus.txt"/>
        </bootstrappers>
        <modificationset quietperiod="60">
          <svn
            LocalWorkingCopy="work/iams"
            username="xxxxx"
            password="xxxxx"></svn>
        </modificationset>
        <!-- 6 hours -->
        <schedule interval="21600">
          <ant antscript="/home/propylon/builds/ant16.sh"  
              buildfile="cc-iams.xml" target="build" />
        </schedule>
        <log dir="logs/iams" encoding="UTF-8">
        </log>
        <publishers>
          <currentbuildstatuspublisher file="logs/currentbuildstatus.txt"/>
          <htmlemail
              mailhost="mail.propylon.com"
              returnaddress="noreply-cruisecontrol-at-propylon.com"
              defaultsuffix="-at-propylon.com"
              buildresultsurl="http://cvsdub2:8080/cruisecontrol/buildresults/iams"
              css="/usr/java/cruisecontrol/reporting/jsp/css/cruisecontrol.css"
              xsldir="/usr/java/cruisecontrol/reporting/jsp/xsl"
              logdir="logs/iams"
              subjectprefix="[build-nanny] ">
            <map alias="list" address="S0070-at-724.ie"/>
            <map alias="tommy.lindberg" address="tommy.lindberg-at-propylon.com"/>
            <map alias="bill.dehora" address="bill.dehora-at-propylon.com"/>
            <always address="list"/>
            <failure address="tommy.lindberg" reportWhenFixed="true"/>
            <failure address="bill.dehora" reportWhenFixed="true"/>
          </htmlemail>
        </publishers>
      </project>
    </cruisecontrol>

The first thing to say about this is that the Subversion task's name is 'svn' not 'Subversion' (did I say the documentation was lacking?). Anyway, the above will do roughly the following:

  • Schedule a build for a project called 'iams'
  • Stop trying to build after a failure, unless there is a change in the repository (buildafterfailed="false")
  • Keep track of whether the project is being built (file="logs/iams/currentbuildstatus.txt")
  • Look for project changes via Subversion (LocalWorkingCopy="work/iams")
  • Attempt to run a build every 6 hours (interval="21600")
  • Only run a build if something changed
  • Use a shell script to invoke the buildfile (antscript="/home/propylon/builds/ant16.sh" buildfile="cc-iams.xml" )
  • Log everything (dir="logs/iams")
  • Annoy people with the build results ("htmlemail"); some people will get annoyed at every build ("always"); some only when there's a problem ("failure")

By the looks of things, the configuration has options for a number of other features, but this setup is fine.

Ant extensions

Cruisecontrol wasn't picking up jdepend or tomcat tasks during builds even though these were installed into Ant's lib folder and are referenced as such by the individual build files (totally different classpaths of course, doh). This broke perfectly good builds. The run scripts (in cruisecontrol/main/bin) have a classpath variable called CRUISE_PATH which includes Ant. Dropping the jars in question into Cruisecontrol's lib folder and hacking the paths onto the end of the CRUISE_PATH variable in the script solved that problem (I'll have to remember to put them somewhere else or a Cruisecontrol upgrade will break the build). NB: I did this before discovering I needed to bung an Ant 1.6.x shell script into the process - pointing at another Ant might end up having things work for you without making modifications; if not, fix up the Cruisecontrol scripts.

Do one checkout

Cruisecontrol does not seem to check the project out the first time it's run; you have to do this yourself (again, maybe this is possible to set up). So cd to the work folder and:

    [propylon@cvsdub2 builds]$ cd work/
    [propylon@cvsdub2 work]$ svn co http://cvsdub2/svn/iams/trunk iams

Murray Walker Moment: Go Go Go

Ok, we're done. Start up Cruisecontrol using the cc-iams-build.sh script:

    [propylon@cvsdub2 builds]$ ./cc-iams-build.sh

End

Clearly, there are a number of alternate ways to do this; much of it will come down to how you like to organize specifics, ie, whether you put things like ANT_HOME in a .profile or in a script, or whether you want a separate buildfile for Cruisecontrol use. I've also left out a lot of details, like setting the executable bit on the shell scripts, testing that the cc-iams.xml ant file works, checking that the user has permissions to write into certain folders and can run certain things, and so on. The essential setup described here will also work on Windows, once you fix up the paths and use .bat files instead.

Cruisecontrol has some nice touches: html mail, an indicator near the top mentioning (in red) that "this project doesn't have any tests", a web console, pie charts indicating the proportion of busted builds, build only if something changed, a pause if the project is being checked into, a list of changes made since the last build, a list of deployed artefacts, and test results. It seems once you have one project going, Cruisecontrol just works, and you can cut and paste the config file and various scripts for the next one. But I'm starting to understand why certain Thoughtworks and Atlassian open source bots are hacking out Damagecontrol.

September 05, 2004

Die, default namespaces, die

I'm currently implementing XMPP core (client and server) in Python. It's meant to be a fun/educational thing. But there aren't many compliant XMPP stacks out there, and there seems to be inertia within the Jabber community in getting momentum around a compliant stack. So I want to have a crack at an open and plug-extensible reference implementation. XMPP is a fascinating technology. Within, a rant about design decisions that are taking the fun out of things for the moment.

Slump.

This is from the XMPP Core draft (draft-ietf-xmpp-core-24), section entitled XML Usage within XMPP:

A default namespace declaration is REQUIRED and is used in all XML streams in order to define the allowable first-level children of the root stream element. This namespace declaration MUST be the same for the initial stream and the response stream so that both streams are qualified consistently. The default namespace declaration applies to the stream and all stanzas sent within a stream (unless explicitly qualified by another namespace, or by the prefix of the streams namespace or the dialback namespace). - 11.2.2 Default Namespace

"A default namespace declaration is REQUIRED". Wow. I've never seen that before.

It goes on to say:

An implementation MUST NOT generate namespace prefixes for elements in the default namespace if the default namespace is 'jabber:client' or 'jabber:server'. An implementation SHOULD NOT generate namespace prefixes for elements qualified by content (as opposed to stream) namespaces other than 'jabber:client' and 'jabber:server'. - 11.2.2 Default Namespace

So here's what I think the problem is. Any XML content going into an XMPP stream/stanza that is not namespace qualified will inherit either the jabber:client or jabber:server namespace. That's exactly how default namespaces are expected to work. When the content is lifted out it will have to be pulled out of those namespaces, otherwise the markup will be trashed by merely having passed through the XMPP application layer. Most likely, fixing up such content will require a design-conflicting hack to circumvent the default mechanism - off the shelf namespace aware tools will correctly preserve the namespace binding and leave the embedded markup borked. Granted, what XMPP calls stanzas and markup folks sometimes call fragments are coming in as discrete chunks, and you could silently ignore the default namespaces that came beforehand, but as far as I can tell, you're supposed to treat the entire XML stream as a logical document, so the normal rules apply (and it would be weird to spec the rules and then not apply them). There are layering problems here.
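To see the inheritance happen, here's a small sketch using Python's standard ElementTree parser (the stanza and its embedded <note> markup are made up for illustration):

    import xml.etree.ElementTree as ET

    # A made-up stanza as it might arrive over a jabber:client stream.
    # The embedded <note> markup carries no namespace declaration of its own.
    stanza = """
    <message xmlns="jabber:client" to="romeo@example.net">
      <body>hello</body>
      <note><text>some unqualified content</text></note>
    </message>
    """

    for el in ET.fromstring(stanza).iter():
        print(el.tag)

    # {jabber:client}message
    # {jabber:client}body
    # {jabber:client}note   <- the embedded markup now lives in jabber:client
    # {jabber:client}text

The parser is doing exactly the right thing here; the <note> fragment has simply been captured by the stream's default namespace, and anything that extracts it later gets jabber:client-qualified markup it never asked for.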

I'm surprised to see an XML application envelope insisting on a default namespace on the document element. I don't know anything at all about the design decisions, but having seen some namespaced weirdness in my time, my first reaction is that this is not something you want to be designing into Internet technologies.

Choices.

I'm fairly conflicted on this. Entertaining non-compliance is a real option despite my design goals. Still, it's just a draft and it may be possible to get this language turned around, assuming I have the wherewithal to present a coherent argument to the XMPP WG and not just a tirade on a blog - but my experience is that technical criticism or judgement on the merits of XML Namespaces is not always wanted, no more, say, than criticism of Web Services is wanted. People do believe Namespaces are an important technology. I suspect the reason this hasn't come up as an issue with XMPP yet is that, as I said, compliant XMPP code is thin on the ground and XMPP doesn't seem to be used much yet to carry XML markup, something that will certainly change. And it seems that Jabber code has been fast and loose in the past when it came to XML processing. Consider this outtake, again from XMPP Core:

The element names of the <stream/> element and its <error/> child MUST be qualified by the streams namespace prefix in all instances. An implementation SHOULD generate only the 'stream:' prefix for these elements, and for historical reasons MAY accept only the 'stream:' prefix.

'SHOULD', here, means "do it, unless you have a really good reason not to". I think the only reason you would specify the above is if you were worried about breaking legacy implementations that were doing dodgy things. My reason not to obey this is that it's broken to assign significance to namespace prefixes in this way. In some places it's acceptable to apply significance to a prefix (XPath comes to mind), but not here.

Maybe we'll get lucky and no non-namespaced markup will ever get sent over XMPP ;)

Permathread.

Now, this issue of namespace defaulting has come up before in Atom. There was a thread a while back with the premise 'we need to say something about xmlns="" for content', a thread I mostly instigated. My basic argument was that since you can't guarantee that Atom will be the outer document format, you should advise on being robust with respect to namespace pollution in content. So we talked about it and I got strong pushback on saying anything of the sort in the format spec - 'put it in the guide', 'not core' - that sort of thing. However, pushing Atom around using XMPP/Jabber is a hot area at the moment and both technologies are receiving interest for applications outside their target domains (syndication and instant-messaging). A number of people think XMPP and Atom are a good fit, myself included. But the jabber:client and jabber:server default namespaces by design ensure that XML content embedded within an XMPP stream will be polluted. Well behaved namespace aware tools will carry those two default namespaces right down into the content, and leave them there after the markup has passed out of XMPP. There's nothing in either specification that would indicate a problem. For Atom, I suspect it's something we might have to revisit at some point. As Tim Bray puts it - Broken As Designed.

Who pays?

The question then is: should Atom pick up the tab on this? Should XMPP? This is a great question, because it presents a working group with wiggle room to disavow responsibility for ensuring interoperation and robustness between XML formats used at different layers.

I'm not involved in specifying XMPP, but as a member of the Atom-WG I think Atom should do its utmost to enable its users' content to be robustly carried irrespective of the external packaging format. In terms of specification text, we are talking about a couple of sentences that do not impact on any other aspect of the format - in fact telling users they should shroud their content in xmlns="" before dropping it into an Atom feed doesn't impact on Atom at all.

No wonder then that people feel it's superfluous or unnecessary to specify this. And of course, if Atom were all the markup in the world, there would be no point. You see, within the bounds of any given spec or layer, default namespaces are a non-problem; and this is perhaps why Atom folks are deeply reluctant to say anything. It's only as you try to get the various markups to play nice that you get bitten, and you tend to get bitten with respect to an enclosing XML format. Usually this is an enveloping or packaging technology and usually this is happening far away in time and space from any working groups :) As such it's entirely contextual and, following a certain line of reasoning, the idea of adding such text to the Atom format spec was not accepted.

It is precisely that ability to wiggle out of saying anything that lets you know default namespaces represent an architectural problem.

Ziggurat.

We can explain the fundamental architectural issue using 3 pictures.

The first one is what most technologists have in their heads when they are thinking about a stack or a layered architecture:

[Figure: stack.gif]

The thing to remember here is that layering is an unquestionably good principle to hold; any interactions and couplings between layers are highly controlled or simply do not exist. Many of the problems in computing arise from not having sound first principles to base things on; layering is about as close as you can get to one.

This next one is what a stack or a layered architecture predicated on XML packaging looks like:


[Figure: nest.gif]

The important point to remember here is that most technologists often have the first picture in their head even when they are dealing with this second structure. In itself, that's not a problem; you can still preserve a clean layering.

When it comes to using default namespaces, this third picture shows a line whose direction indicates the scope of a default namespace when applied to the outer layer:

[Figure: nest-ns.gif]

Now there is a problem. It should be clear that default namespaces transgress the layers. In turn this means they violate one of the few engineering principles we have in computing that could be considered a universal.

Default namespaces offer no benefits that justify breaking with the layering principle.

The rabbit's not like us.

The thing to remember is that Namespaces is, in XML terms, ancient. It goes back to 1998 and was controversial at the time. But it was designed before the rise and rise of XML as a protocol and packaging technology, and it's not obvious to me that anyone could have predicted the extent to which XML has pervaded protocol and interchange design since then. Thus the impact of default namespaces on system layering would not have been entirely clear over half a decade ago. However, if there were only prefixed namespaces, this entire problem would evaporate. Everyone could play together up and down the stack, as no enclosing envelope or XML packaging technology - one that you don't know about, or perhaps one that hasn't even been invented yet - could come along and break your content (only versioning is allowed to break your data like this :).

Twenty-eight days... six hours... forty-two minutes... twelve seconds

The default namespace is a bizarre construct. It's like a macro and a lexical scope rule rolled into one, but from an alternate universe. In modern protocol construction, it conspires to produce an architectural prank of the first order. As a result of the way a single XML document tree can operate at different application layers, having a globally scoped namespace snatches defeat from the jaws of victory.

On the other hand, it's 2004 and we've learned a lot since then. It has to be asked of XMPP why it takes this approach. Here's XMPP Core's rationale for using XML Namespaces:

Namespaces are used within all XMPP-compliant XML to create strict boundaries of data ownership. The basic function of namespaces is to separate different vocabularies of XML elements that are structurally mixed together. Ensuring that XMPP-compliant XML is namespace-aware enables any allowable XML to be structurally mixed with any data element within XMPP. Rules for XML namespace names and prefixes are defined in the following subsections. - 11.2 XML Namespace Names and Prefixes

"The basic function of namespaces is to separate different vocabularies of XML elements that are structurally mixed together". Well, jabber:client and jabber:server through their use of default namespaces subvert that rationale and ensure that that ownership can cross boundaries. I'm surprised this passes muster with the draft-hollenbeck recommendations on XML markup for use in IETF protocols tho' I haven't looked at it in a good while - if it does, that document is also in need of attention.

Fix.

The immediate solution is to wrap markup with an xmlns="" declaration before letting it loose on the world. If you don't, you are committing what might be termed a fallacy (in the Peter Deutsch sense) of XML - that the root element in front of you is always the root element. It would help if we could get this bootstrapped into protocol and format specifications via non-invasive text, especially in formats such as Atom, SOAP and XMPP that are taking responsibility for carrying arbitrary content.
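Picking up the earlier ElementTree sketch, here's what that workaround buys you with the same made-up stanza:

    import xml.etree.ElementTree as ET

    # The same made-up stanza, except the embedded markup now shields
    # itself with xmlns="", resetting the default namespace at its root.
    stanza = """
    <message xmlns="jabber:client" to="romeo@example.net">
      <note xmlns=""><text>some unqualified content</text></note>
    </message>
    """

    for el in ET.fromstring(stanza).iter():
        print(el.tag)

    # {jabber:client}message
    # note   <- no namespace; the content survives its trip through XMPP
    # text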

Ultimately however, the xmlns="" declaration is a workaround - the proper solution is to deprecate and then eliminate the default namespace from acceptable XML usage. I would love to see a future edition of the namespaces spec start that ball rolling.