" /> Bill de hÓra: May 2005 Archives


May 29, 2005

Control surfaces

I lost the link to this a while back, but Jon Udell pointed at it recently: Controlling agent societies with Ruby: Rich Kilmer and Dana Moore. Which could be subtitled - Java good for plumbing, Ruby good for taps. If you're into scripting, agents, testing, gaming, simulation, or glimpses into the future, you'll probably enjoy it.

More about Cougaar here, and here. You could sum it up as a distributed agent melting pot - emphasis on working code over architecture and standards. I studied this stuff in college, good to see some of it up and running.

May 28, 2005

No more nails: making good technology choices

There are times when the business needs to get out of the way and let developers get on with things. Warning: hokey analogies between kitchen units and software follow.


Screws and nails

So you're getting your kitchen refitted. New floors, new units, a fresh coat of paint on the walls. You're looking through some swatches for the paint, when you see that the people fitting the shelf doors to the units are using screws.

[image: ooh! shiny!]

Screws!

Now, you don't have any screwdrivers in the house, just a few hammers. That's because you use nails to fix things together. Sometimes you use glue or staples for paper stuff, but for anything else it's nails. That's how it's been in your family going back since forever. Nails, nails, nails. But think about it. You've never fitted a kitchen; your experience is limited to hanging a few pictures. Are you going to tell those folks to use nails and not screws because you're worried you might not have a screwdriver around, or that you might not be able to fix a door when it comes off? That would not be smart, not just because you don't have a screwdriver, but because the person who decided to use screws is almost certainly more competent than you are at fitting kitchens. As there are few things worse than someone expressing a strong opinion about something they know little about, you decide to stick to picking colours from the swatches and let people do their jobs. Maybe you'll pick up a screwdriver at the store and see what all the fuss is about. In the realm of kitchens, screws and nails count as infrastructure, and infrastructure decisions are best left to experts.

And I'm not being snide about picking colours. At the end of the day the colour of your kitchen matters more than how the doors were fastened onto the units - unless they keep falling off. That will drive everyone that uses your kitchen nuts. And if they fall off because they were put on with nails, and you keep hammering those nails back in, there's a fair chance that you'll end up needing to replace all or part of the unit to use screws. Like you should have done in the first place.

Being effective

The situation in enterprise development is somewhat different. IT managers tell developers to use software nails, either because those nails are considered industry standard or because they believed someone else who told them those nails are industry standard. That using screws here and there could be a good idea doesn't matter as much as the perception of standardization and adequate support.

Here's Richard Monson-Haefel on why putting the programming languages Beanshell and Groovy through the Java Community Process matters:

"Standardizing these languages has very little to do with portability and everything to do with industry acceptance. If developers at Suit-and-Tie, Inc. want to use a scripting language in their Java development, they are going to have to convince their project managers that language X, although not standardize or recognized by any industry body, is a safe bet. In practice it just doesn't work. If you don't believe me go try and convince your boss to switch from Java to Jython. It's not going to happen. - Why Standardize BeanShell and Groovy?"

[image: nail product matrix]

Richard is right; what's happening here is that Beanshell and Groovy are looking to join the list of acceptable software nails. We should all be happy about this. What is bothersome is not that it's been said so frankly - that's the reality after all - but the implicit message: inefficiency is to be accepted, this is how things are done and will be done in IT.

We can all shrug, and move on to the next kitchen, and if we're lucky somebody else will be called in to fix that other kitchen a year from now. We might call this playing not to lose, but it's more likely that whatever yardstick is being used to measure is wrong and is leading people astray. One can look at an institutionalized inefficiency as being inevitable or as being an opportunity to save money on operating overheads [1]. When the developers are telling you one thing about technology and the models tell you another, chances are the models are broken. So it is very much about making good informed choices. Developers, mind you, can be too quick to accept a status quo instead of looking to present new technologies in terms of improved costs.

Putting the soft back in software

One immediate problem with this line of thought, which says more effective but non-standard technology loses out, is that it makes Paul Graham look unnecessarily prescient :) But the irony here is that these scripting languages are simpler to work with than Java or C#; that's the whole point of using them! If managers are worried about who will maintain the system as a legacy, they could do worse than consider the use of languages that were once considered toys due to their simplicity, but are now known to be up to the task.

[image: utter madness]

What is interesting are the "Higher Order Programming" (HOP) facilities enabled by languages like Groovy and Jython. HOP allows developers to eliminate inessential boilerplate and 'plumbing', produce highly flexible code, focus on code-generation techniques and in certain cases write self-modifying code. These capabilities are not just pointless computer science; they are associated with higher productivity and malleable systems. They end up supplied to a lesser extent by enterprise middleware and web frameworks or through language extensions and preprocessors. Two things are worth bearing in mind. First, the chances are that if one can deal with things like J2EE caching, clusters and session management and constructing .NET assemblies, that implies more than the requisite ability to grasp HOP. Second, it's clear that all enterprise languages and middleware are evolving towards HOP - this is evident in the rise of lightweight frameworks, vendor support for dynamic scripting languages, and extensions to Java like metadata annotations and aspect orientation. Put it this way - for a competent J2EE developer focused on application-level development, Jython is not going to present much difficulty, and could well save customers and bosses money. The people working on JSF/Shale, the web development platform to succeed Struts, are taking a look at the facilities provided by Ruby On Rails, a web framework which garners at least as much of its power directly from the Ruby language as from the framework's design.
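
Here's a tiny illustration of the sort of thing HOP buys you - a hypothetical sketch in Python/Jython, not from anyone's product. Wrapping a function in another function factors instrumentation out of every method body it would otherwise be pasted into:

  import time

  def timed(fn):
      # Higher-order function: takes a function, returns a wrapped
      # version of it. The timing boilerplate lives here, once.
      def wrapper(*args, **kwargs):
          start = time.time()
          try:
              return fn(*args, **kwargs)
          finally:
              print "%s took %.3fs" % (fn.__name__, time.time() - start)
      return wrapper

  def fetch_orders(customer_id):
      return ["order-1", "order-2"]  # stand-in for a real lookup

  fetch_orders = timed(fetch_orders)  # instrumented without touching its body
  fetch_orders(42)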

Bottom line is that enterprise developers are going to adopt this dynamic stuff one way or another; it's a question of how much productivity can be gained and how soon. The most likely problems one will have with these languages are that the tools will not be as user-friendly, stock patterns and idioms will no longer apply, and things will seem weird and ugly and, well, wrong. But even with better tools, it's difficult to argue that a language like Java or C# is dollar-competitive with something like Ruby or Python for application development. Dynamic languages put the soft back in software!

Network's edge

Let's look at a fine example of technologists not always being able to make the choices they might want to make - Web Services APIs. Doing the rounds at the moment is the Alpine manifesto [pdf], which tells us what was known 3 years ago about Web Services programming. Alpine seeks to build a new Java stack for Web Services and calls out the Java API JAX-RPC in particular as having issues; but it was always likely that JAX-RPC would have issues. It, like a number of WS APIs at the time, shipped in reaction to Web Services hype, which made it hard for it to become a basis for lasting infrastructure. The official successor to JAX-RPC is JAX-WS (renamed from JAX-RPC 2.0).

On the relative merits, JAX-WS is looking better than its predecessors. The problem is that it could have been available 3 or even 5 years ago - it's a non-advance, a screw replacing a nail. The fallout is a series of misguided attempts to re-purpose distributed objects for the Web; that costs money. Some Java infrastructure is now going to get ripped out in favour of something with lasting benefit, largely because short-term thinking about markets drove technology infrastructure decisions. That means some people are going to end up buying two kitchens for Web Services.

Skate to where the enterprise will be

There's been a lot of emphasis on focusing on what businesses and enterprises need out of software, and in particular on the idea that the IT industry is too focused on the T [2]. I think that is good, but sometimes business stakeholders need to get out of the way and let developers get on with things. Language choice for application development is arguably one such area; software APIs living at the network edge are another.

[image: something you skate to]

When people who are good at technology are allowed to get on with things, and not be distracted by short-term angles, there are benefits to be had. One place where that can happen is through Open Source development. Perhaps the key advantage in developing infrastructure as Open Source is that it's difficult to subject it directly to market or bureaucratic agendas. Open Source is often accused of merely cloning existing commercial software, and of driving software value to zero. And yet a lot of interesting innovation, innovation that represents massive economic benefit, seems to be happening in that ecosystem. For example it's not impossible that instead of spending further millions of dollars trying to solve single sign-on on the server and growing webs of trust, it'll be dealt with along the lines of a GreaseMonkey script running on the client. If that happens, some will be quick to point out that it will never work, it will be unsafe, it won't scale, establishments won't buy into it, and so on. None of that will matter, because it will be evident that the heavyset approach offers no further value, and the support systems needed to declare it a nail will coalesce around those who want to use it. It's a recurring pattern.


[1] The popular means to be seen to manage IT costs is to outsource development, but that is a race to the bottom if done without consideration. Most of the benefits realized from outsourcing are in supplying existing inefficiencies at preferred rates, something akin to paying less for what's bad for you.

[2] Currently it seems fads in IT are accelerating from the old 5 year cycle down to about 2-3 years. Nicholas Carr might have something to say about that, but hype acceleration wasn't in the Make IT Not Matter Plan, as I recall it.

May 27, 2005

All successful large systems were successful small systems

Daniel Sabbah on LAMP:

"I believe that in the same way that some of those simple solutions are good enough to start with, eventually, they are going to have to come up against scalability. [...] What we are trying to do is make sure businesses who start there [with LAMP] have a model, to not only start there but evolve into more complex situations in order to actually be able to grow."

Brad Fitzpatrick and Mark Smith on LiveJournal:

"College hobby project, Apr 1999. April 2004, 2.8 million accounts. April 2005, 6.8 million accounts. Thousands of hits/second."

I think we're in good hands. Sabbah's argument is reminiscent of the recent positioning of Web Services with respect to REST/Web systems - take the high end, manage the disruption from below, offer help to those that want it. Yes, there are things to be done around scaling. But it's true to say we know what to do, and that scaling issues are not unique to LAMP - when you get down to it, it's still tiered client-server into a database, and IBM will have assessed this thoroughly; they are not going to put their substantial weight behind a toy architecture. The real question to ask about scalability is "scalability for what?" Unquantified ilities don't amount to a hill of beans in this crazy world.


About the title: It's a quote from Bill Joy as far as I know; I think Grady Booch may have said it first, but differently. If anyone has an attribution, that would be great to hear.

May 26, 2005

Has no one thought of the consequences?

The XMPP server ejabberd has shipped a 0.9.1 release.

XMPP? Push? In Erlang? Off the Web? Are they quite mad?

Maybe not.

May 25, 2005

Move any mountain


[image: dote-alpine.png]

"Alongside interoperability, a core rationale for the use of XML as a wire representation instead of broadly-available binary formats (specifically CORBA and DCOM), was that the extensibility of XML meant that message senders and receivers would be resilient to change. This promised resilience can only be realized if one works at the XML level. Even then, there is the perennial challenge of maintaining consistent semantics.
Assuming that working in the XML-space is considered a good thing, then we do not need a SOAP stack that holds the hands of end users, hiding the details of the XML message, generating Java source from WSDL-wrapped XML schemas. Instead we need a SOAP stack that makes it easy to work with the XML payload, as XML. The result should be a stack which is smaller, lighter weight, and more flexible than a classic stack. With a simpler architecture, implementation effort would be low, as would ongoing support costs. "
- Alpine: a fast, unforgiving SOAP stack for people who know what they are doing

May 22, 2005

Maven2 repository layout document

Maven2 has documented how declared dependencies map into URL space. Steve says other parties are using the layout - that's very cool. I am cheered up.

Maven2 and URL construction

More lessons to be learnt about web architecture. This time it's Maven2. In a Maven2 POM I have this dependency:

    <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
      <version>3.0.10</version>
    </dependency>

Doesn't work. I get this:

  [INFO] Main Error:
    Unable to download the artifact from any repository
    mysql:mysql-connector-java:3.0.10:jar
  from the specified remote repositories:
    http://repo1.maven.org/maven2
  Root error:
    Unable to download the artifact from any repository

The Maven2 client can't find the jar. Looking up the console I see it's trying to pull this URL down:

  [INFO] maven-compiler-plugin: resolved to version 2.0-alpha-2 from local repository
  Downloading: http://repo1.maven.org/maven2/mysql/mysql-connector-java/3.0.10/mysql-connector-java-3.0.10.jar

Some digging around shows that the jar I want is actually called mysql-connector-java-3.0.10-stable-bin.jar up on the server. The jar I need is up there, but the name is borked. The required URL is:

http://www.ibiblio.org/maven2/mysql/mysql-connector-java/3.0.10/mysql-connector-java-3.0.10-stable-bin.jar

It seems whoever built that jar stuffed more metadata into the jar name than my Maven2 client was expecting (i.e. it's a 'stable-bin' jar).
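
For what it's worth, the layout algorithm reduces to string concatenation over the POM coordinates. Here's a rough sketch - my reading of the documented layout, not Maven2's actual code - showing why the extra token in the deployed filename breaks resolution:

  # Approximation of the Maven2 default repository layout.
  def artifact_url(repo, group_id, artifact_id, version, ext="jar"):
      return "%s/%s/%s/%s/%s-%s.%s" % (
          repo, group_id.replace(".", "/"), artifact_id, version,
          artifact_id, version, ext)

  print artifact_url("http://repo1.maven.org/maven2",
                     "mysql", "mysql-connector-java", "3.0.10")
  # -> .../mysql/mysql-connector-java/3.0.10/mysql-connector-java-3.0.10.jar
  # The deployed file is mysql-connector-java-3.0.10-stable-bin.jar, so the
  # constructed URL and the actual resource never meet.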

My answer was to download the URL and rename it in the local repository. Now I'll have to remember not to use the -U flag in case that breaks my build. I read the Maven2 docs and nothing is jumping out at me about how to get around this - changing the version element content gets me a good jar name but breaks the preceding path.

I don't use Maven2 much, and am just tracking its progress as it goes to 1.0 as much as anything, so I don't know if this is an everyday problem [updated: a comment from Brett Porter suggests it tends to be isolated]. The POM idea is ok overall, and keeping jars out of source control is an excellent idea. But URL construction can be a fragile practice unless everyone is on board with the algorithm. Arguably, it defeats the point of trying to establish best practices for Java development and dependency management when you don't follow best practices for the application protocol the proposed dependency management system itself depends on. There are reasons to treat URLs as opaque strings on the client rather than as parameterised functions. Breakage from URL concatenation is one such reason. I wonder whether this problem would come up if Maven2 used the DOAP format; ditto for WebDAV as the protocol.

More generally, the idea that this isn't documented lends weight to the criticisms that plagued Maven around there being too much magic going on under the hood - or not enough. Because on the one hand its URL creation is a leaky abstraction; on the other, it's not magic if it doesn't work. [updated: Steve Loughran pointed out that M2 has published their repository layout].

Finally, it's clear that the repositories are crying out for Atom feeds.

May 21, 2005

Statistically improbable phrases

The Fall - lyrics

May 15, 2005

Trading off

[Updated 2005-05-16: Danny Ayers and Julien Couvreur pointed me towards a better way to mark up the DOAP so that source code repositories can be given an identifier.]

These days we are all complexity experts and simplicity mavens. Once upon a time a popular way to trash somebody's technical design without having to bother to present a cogent argument was to point at it and say it would never scale. Today we can just call designs we don't like complex - that's even better because, while it's unlikely to happen, you can actually have a meaningful conversation about scale. Complexity on the other hand is comparatively vague - indeed there is only one simple definition of simplicity for software.

Finger pointing and glibness aside, figuring out where and when to make a design tradeoff across the complexity/simplicity divide is difficult. Really difficult. And important. Aim too low and your technical designs won't scale (whatever that means) or be useful except for toy scenarios. Aim too high and no-one will be able to get past the actual or perceived complexity of the design - half the intended audience will get up and leave.

How to apply web metadata, especially with XML formats, has seen all kinds of tradeoff issues and arguments. From RDF to RSS to SOAP to WSDL, one common theme is an endless debate about complexity.

In this entry I want to talk a bit about a complexity/simplicity tradeoff centering around extensibility. Everyone loves extensibility, almost as much as simplicity in fact, and you will not hear many bad words said about the idea. But I want to narrow things down to one issue - repository metadata in a format called DOAP.

DOAP is a format based on RDF for describing projects, by Edd Dumbill. It stands for Description Of A Project. It caused something of a shock when it was published, for two reasons. First, it amply demonstrated that we have hardly any useful or interesting metadata about software projects. Which is terribly bad when you think about how much of the stuff we put out every year. Second, and this was far more shocking, it was a readable RDF/XML format. Shocking because up to then everyone knew, just knew, that RDF/XML was insanely complex and comprehensible markup simply could not be produced with it (even I knew it). There are approximately 5 wildly successful RSS formats based on that assumption, and who knows how many other domain-specific formats.

The problem I was having was how to describe a software project or a unit of work. DOAP seemed like a good start, but there were some things I wanted to say that DOAP does not support, mostly around source code repositories. This led me into a mire of decision making and trading off that was quite unexpected. That's the thing about working on software and data formats - you never know when you are about to fall down a rat hole.

Here's an example DOAP fragment that describes some details about a Subversion repository (there's a lot more you can put into one of these files, but we'll stick to the repository metadata for now):

  <doap:Project 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository>
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
      </doap:SVNRepository>
    </doap:repository>
  </doap:Project>

For RDF/XML that format is pretty good (RDF/XML can get complicated). Best of all, you can easily rip through it with a regular XML toolchain - scripting against DOAP markup isn't going to be a problem. The only thing that's a bit weird is those rdf:resource attributes peppered about the elements.
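
To make that concrete, here's the sort of throwaway script I have in mind - a sketch using nothing fancier than Python's standard library DOM, assuming the fragment above is saved as doap.rdf (a hypothetical filename):

  from xml.dom import minidom

  RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  DOAP = "http://usefulinc.com/ns/doap#"

  doc = minidom.parse("doap.rdf")
  # Namespace-aware lookup; no RDF toolkit required.
  for loc in doc.getElementsByTagNameNS(DOAP, "location"):
      print loc.getAttributeNS(RDF, "resource")
  # prints: http://example.org/svn/blah/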

Now, there are other things we might want to say about a Subversion repository than DOAP allows for. For example DOAP doesn't have a notion of a repository alias, and it's easy to alias http: accessible Subversion repositories. So to get there, we can try something like this:

  <doap:Project 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:ext="http://example.org/doap/plusplus/"
    rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository>
        <doap:location rdf:resource="http://example.org/svn/waffle/"
            rdf:type="http://example.org/doap/plusplus/alias" />
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
      </doap:SVNRepository>
    </doap:repository>
  </doap:Project>

Above we're using an rdf:type annotation to qualify a new doap:location as being an alias. The nice thing about rdf:type annotations is that they're entirely optional - if you're not looking for one, it won't break you. A DOAP-aware processor will accept that as a doap:location just fine. Aside from RDF, a lot of extension work and "duck-typing" with XML is done by sprinkling elements with attributes, as it's usually considered the approach least likely to break code. To break against unknown attributes you'd have to write (deeply) neurotic code of the form: "scan all the element's attributes, and if you don't recognize any of them, fall over". Some schemas are neurotic like this, but regular code doesn't tend to be. RDF goes a step further and bakes the idiom in for types, arguably in a way that's less intrusive than the approach taken by XML Schema.
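
The two policies look something like this in code - an illustrative sketch, carrying on with the minidom approach from earlier:

  RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

  def read_location(element):
      # Tolerant: ask only for what you understand. The rdf:type
      # annotation is simply never looked at, so it can't break anything.
      return element.getAttributeNS(RDF, "resource")

  def read_location_neurotic(element):
      # Neurotic: enumerate everything and reject the unknown. This is
      # the code that extension-by-attribute breaks.
      for i in range(element.attributes.length):
          attr = element.attributes.item(i)
          if (attr.namespaceURI, attr.localName) != (RDF, "resource"):
              raise ValueError("unknown attribute: " + attr.name)
      return element.getAttributeNS(RDF, "resource")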

Alternatively, we could use a new element name to make a stronger distinction between doap:location and an alias. Here's an example where we make one up, called ext:alias:

  <doap:Project 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:doap="http://usefulinc.com/ns/doap#"
   xmlns:ext="http://example.org/doap/plusplus/"
   rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository>
        <ext:alias rdf:resource="http://example.org/svn/waffle/"/>
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
      </doap:SVNRepository>
    </doap:repository>
  </doap:Project>

Unless you have domain-specific code that breaks when it sees new elements, or a strict schema, that's going to be fine (an RDF-aware processor won't bat an eyelid at this). This approach is more likely to break regular XML code than attribute sprinkling, but again it comes down to how you do your programming, and in particular to whatever your policy towards extensions happens to be. Chances are fair that a lot of code will just not pick up the ext:alias and will skip over it, but schema-driven checks might choke.

So far, so good. Extending DOAP is looking straightforward. Although the various approaches to managing extensions are interesting in their own right, that's not what I want to focus on here. I want to concentrate on how we are using names and identity to manage extensions. What is allowing us to extend DOAP, in large part, is that the things of interest have names.

The thing about the way DOAP XML is structured is that although the Project has a name, the Subversion repository itself has no explicit identity. It might not be entirely obvious from the XML, so I'll elaborate a bit.

There are properties of a repository in the DOAP document, sure, but the repository itself has no proper name. The project does have a name - it's in the rdf:about attribute on the doap:Project. That means all the property-value metadata is associated with the thing via its name. But the repository is dealt with differently. For example, the Jena toolkit will produce a table of subject-property-value rows something like this (using QNames instead of URLs):

  1  http://example.org/projects/blah/   rdf:type         doap:Project
  2  http://example.org/projects/blah/   doap:name        "blah blah"
  3  http://example.org/projects/blah/   doap:repository  genid:ARP132296
  4  genid:ARP132296                     rdf:type         doap:SVNRepository
  5  genid:ARP132296                     ext:alias        http://example.org/svn/waffle/
  6  genid:ARP132296                     doap:location    http://example.org/svn/blah/
  7  genid:ARP132296                     doap:browse      http://example.org/svn/blah/

Here's what that looks like in Excel:

[image: rdf-excel.GIF]

See that "genid:ARP132296" thing? That's an internal identifier assigned by Jena's RDF/XML parser for the repository. Jena is a Java toolkit for working with RDF. Jena's parser (which is called ARP) has scanned our RDF and figured out that there is a thing in the RDF, which has properties, including a type annotation of doap:SVNRepository but has no explicit name. So it's been given a pseudo identifier - by pseudo I mean it's not globally unique. Contrast that with row 1 where Jena has realized that the name of the project 'thing' is "http://example.org/projects/blah/" just as we said it was a minute ago.

In one sense it's no big deal. Most keyed metadata today is laid out in this pseudo-identified way. There are property-value pairs, and what they are property values of is entirely context-specific - either some specialised code or a person is going to appreciate that context and fill in the blanks. Context is king, and so on.

The problem is that relying on context is one of the things that makes it tricky to share metadata. How do we pass this stuff around so it can be reused and merged with other data - across time or space? Ok, so maybe you're thinking I've lost it there. "Time or space"? In italics? What?

Space might be easiest to make sense of - we could have project metadata scattered across a few different servers on the Web or inside a LAN. Without shared names, merging that data is going to require people to go that extra yard (or mile, it depends) to figure out what goes where and embody that integration context knowledge in code or as rules as best they can. What about time? Well, we might want to update the Subversion repository details later on. Suppose six months later we added Atom/RSS feeds detailing the check-ins to our repository (by the way, if you're not doing this with your repositories, you should try it out, it's great). To do that, I need to make up a new property because DOAP doesn't have a concept of repository commit feeds - let's put it in the same namespace as the repository alias we made up and call it 'commit-feed'. To update the XML we have to go into the document and drop in the new data like so:

  <doap:Project 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:ext="http://example.org/doap/plusplus/"
    rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository>
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
        <ext:alias rdf:resource="http://example.org/svn/waffle/"/>
        <ext:commit-feed rdf:resource="http://example.org/feed/svn/blah.xml"/>
      </doap:SVNRepository>
    </doap:repository>
  </doap:Project>

Now we can see, and RDF tools will discover, that the repository has a commit-feed property value. But here's the thing - with RDF I could have done that without ever touching the original document, if only the repository had a URL identifier. This is worth some explanation.

Remember that we said the project had the identifier "http://example.org/projects/blah/"? Let's give the project a pointer to the developer mailing list:

<rdf:Description 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#" 
    rdf:about="http://example.org/projects/blah/">
  <doap:mailing-list rdf:resource="http://example.org/mailman/listinfo/blah-dev"/>
</rdf:Description>

I just did that without having the original metadata to manipulate. Once I know the name of something, I can start writing down property-value pairs about it. RDF people say this ability is useful because it enables third-party metadata ("anyone can say anything about anything"). But I just think it's cool. I did not need to worry about knowing what all my metadata might be upfront. I did not need to worry about how I'm going to manage extensibility. If I had a webservice exposed that supplied and accepted information about all my projects, there would be no checkout step to get the right XML file, edit it and check it back in as an update. I would just pass the new data to the service and the data would get merged, and this would be no more difficult than a programmer adding a new key-value pair to a hashmap or a manager adding a new row to a spreadsheet. Folks who fret about passing large XML documents around would have to find something else to worry about.
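
With rdflib again, the merge the webservice would do amounts to parsing both documents into one graph - a sketch, with project_doap and mailing_list_fragment as hypothetical strings holding the two documents:

  from rdflib import Graph

  g = Graph()
  g.parse(data=project_doap, format="xml")            # the original DOAP
  g.parse(data=mailing_list_fragment, format="xml")   # the new fragment
  # Both documents name http://example.org/projects/blah/ as the subject,
  # so their statements simply accumulate on the same thing.
  print len(g)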

Actually, let's look at that Excel spreadsheet again. Imagine our imaginary webservice was using Excel to store its data. Here's what the merged data looks like:

[image: rdf-excel1.GIF]

And for those of us that just can't get enough of RDF/XML:

  <doap:Project 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:ext="http://example.org/doap/plusplus/"
    rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository>
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
        <ext:alias rdf:resource="http://example.org/svn/waffle/"/>
        <ext:commit-feed rdf:resource="http://example.org/feed/svn/blah.xml"/>
      </doap:SVNRepository>
    </doap:repository>
    <doap:mailing-list rdf:resource="http://example.org/mailman/listinfo/blah-dev"/>
  </doap:Project>

which is the unified document version of the two XML documents we started out with:

  <doap:Project 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:ext="http://example.org/doap/plusplus/"
    rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository>
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
        <ext:alias rdf:resource="http://example.org/svn/waffle/"/>
        <ext:commit-feed rdf:resource="http://example.org/feed/svn/blah.xml"/>
      </doap:SVNRepository>
    </doap:repository>
  </doap:Project>
<rdf:Description 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#" 
    rdf:about="http://example.org/projects/blah/">
  <doap:mailing-list rdf:resource="http://example.org/mailman/listinfo/blah-dev"/>
</rdf:Description>

So, we can see that RDF data is not especially document or file bound - as data it rides above them. Some people might think this means that syntax doesn't matter, but what it really means is that syntax and physical structures don't have to get in the way of extensibility. Documents and files are no less useful as a result, but you can stop thinking about them as really tiny data silos that you have to manage and keep track of and organize - and particularly as really tiny data silos that don't always travel well.


Ages ago, it seems, I said there was a problem. What problem? Well, I can't do what I did with the project mailing list for the Subversion Atom feed, as things stand. That's because I have no URL identifier for the repository, which means creating a second chunk of metadata to pass in is a non-starter. There was that genid thingy, but it's not safely shareable or usable as metadata (for all I know, the service is using Redland instead of Jena, or Jena has changed its id generation algorithm). Depending on that to be long-lived and have integrity would be like depending on primary keys between databases, only primary keys have a better chance of working out. In the repository case I am back to getting a handle on some kind of document to check out, update and check in. The document gives me enough initial context to find where to drop in the property value for the feed.

What can we do? We could hack the DOAP format to look like this:

  <doap:Project 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:ext="http://example.org/doap/plusplus/"
    rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <rdf:Description 
        rdf:type="http://usefulinc.com/ns/doap#SVNRepository"
        rdf:about="http://example.org/projects/blah/svn/">
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
        <ext:alias rdf:resource="http://example.org/svn/waffle/"/>
        <ext:commit-feed rdf:resource="http://example.org/feed/svn/blah.xml"/>
      </rdf:Description>
    </doap:repository>
  </doap:Project>

What's going on here? Well, remember earlier on I showed two ways to describe a repository's alias, one as an element and the other as an rdf:type attribute? I've done much the same thing above with the SVNRepository element, by moving it into an rdf:type attribute declaration. The SVNRepository element has been replaced by an element called rdf:Description. rdf:Description is a bit like a shim for RDF/XML markup - what matters is the value of its rdf:about attribute, not the element itself. If you are looking at this and thinking about HTML span and div tags, you're on the right track - rdf:Description is to RDF as span is to HTML. Dropping rdf:Description in there allows me to name the repository, at the cost of being slightly abstracted from the DOAP subject matter. At this point, if you believe what Adam Bosworth has to say about abstractions, then half of you just stopped reading and left the room.

So that's a problem. We're getting some expressive power at the cost of introducing an abstraction, which you can argue defeats the design goals of the DOAP format to begin with.

Thankfully, there is another design option we can take, to get some of you back in the room - maybe even half of you. Danny Ayers and Julien Couvreur showed me how to patch the DOAP markup so that we don't need the rdf:Description abstraction. We can add an rdf:about attribute to the doap:SVNRepository like so:

  <doap:Project 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:ext="http://example.org/doap/plusplus/"
    rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository 
         rdf:about="http://example.org/projects/blah/svn/">
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
        <ext:alias rdf:resource="http://example.org/svn/waffle/"/>
        <ext:commit-feed rdf:resource="http://example.org/feed/svn/blah.xml"/>
      </doap:SVNRepository>
    </doap:repository>
  </doap:Project>

That preserves the DOAP XML markup while allowing us to call out the name of the repository as "http://example.org/projects/blah/svn/". Less abstraction, but the same expressive power, and more consistent with the format.

What's interesting about this technique of sprinkling rdf:about attributes on XML is that it's the same approach as when we mark up bits of HTML using class or cite attributes, where the attributes are being used to do some classification. The proper way to do this these days for XHTML and RSS is to use the rel attribute rather than class/cite; for example, rel attributes are used as an extension mechanism in Atom and RSS. Sometimes this is called semantic markup. The idea is the same; the difference is the technologies you're going to leverage.

Anyway, here's the final sheet:

[image: rdf-excel3.GIF]

No more genid things - that's a good result, at an estimated cost of 25% of the intended audience.

Is this a good tradeoff? I think for most cases Edd's initial design + rdf:about beats the rdf:Description idea out easily - the benefits of easy processing and clear markup carry a lot of weight - every popular XML format on the web can be presented in evidence.

For the specific imaginary web service and the not-unjustified desire to get a decent handle on the repositories, using rdf:about is the next step on the ladder - not having identifiers gets more problematic as you add more unnamed things (like when we have 3 repositories for that project). Naming things seems to be a web best practice for metadata, as it is with the REST style of web design. Plus we can always hack the webservice to drop the rdf:Description in favour of emitting doap:SVNRepository elements when producing representations.

An aside: DOAP doesn't require these rdf:about attributes to be there for either Repositories or Projects. Their use is optional, without the cost in interoperability issues that normally comes with optionality. That's handy if you don't have names for these things or don't care too much about naming for the time being. However, I hope the examples here have convinced you of the value of adding one to doap:Project. A good number of projects can probably get away with just using their doap:homepage URLs as the identifier of the project.

Is this a flaw in DOAP? It turns out that's not a tough call to make, and the answer is no. There's no design flaw - DOAP, as Edd laid it out, shows a tradeoff - you get extension heavy lifting from RDF, XML toolkit friendliness, and good levels of document comprehension, with minimal cluttering from RDF. As of the time of writing, 2005, DOAP as it stands is good, for two reasons:

  1. You can add these rdf:about attributes without breaking anything; there's no need to assemble a working group for DOAP2 and months of handwringing about forwards and backwards compatibility.
  2. The DOAP format makes it easy to work with regular XML tools, which can't be said of RDF/XML in general.

The latter is especially important. As I said, we all espouse the idea of extensible formats, but what we can do today with the format matters a heck of a lot. This:

  <doap:Project 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:ext="http://example.org/doap/plusplus/"
    rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository>
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
        <ext:alias rdf:resource="http://example.org/svn/waffle/"/>
        <ext:commit-feed rdf:resource="http://example.org/feed/svn/blah.xml"/>
      </doap:SVNRepository>
    </doap:repository>
    <doap:mailing-list rdf:resource="http://example.org/mailman/listinfo/blah-dev"/>
  </doap:Project>

is going to be a lot easier to send through something like XSLT or a DOM-based script than this pair:

  <doap:Project 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:ext="http://example.org/doap/plusplus/"
    rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository>
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
        <ext:alias rdf:resource="http://example.org/svn/waffle/"/>
        <ext:commit-feed rdf:resource="http://example.org/feed/svn/blah.xml"/>
      </doap:SVNRepository>
    </doap:repository>
  </doap:Project>
<rdf:Description 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#" 
    rdf:about="http://example.org/projects/blah/">
  <doap:mailing-list rdf:resource="http://example.org/mailman/listinfo/blah-dev"/>
</rdf:Description>

You could present a unified document by wrapping those two inside rdf:RDF like this:

<rdf:RDF 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:doap="http://usefulinc.com/ns/doap#"
    xmlns:ext="http://example.org/doap/plusplus/">
  <doap:Project 
      rdf:about="http://example.org/projects/blah/">
    <doap:name>blah blah</doap:name>
    <doap:repository >
      <doap:SVNRepository>
        <doap:location rdf:resource="http://example.org/svn/blah/"/>
        <doap:browse rdf:resource="http://example.org/svn/blah/"/>
        <ext:alias rdf:resource="http://example.org/svn/waffle/"/>
        <ext:commit-feed rdf:resource="http://example.org/feed/svn/blah.xml"/>
      </doap:SVNRepository>
    </doap:repository>
  </doap:Project>
  <rdf:Description 
      rdf:about="http://example.org/projects/blah/">
    <doap:mailing-list rdf:resource="http://example.org/mailman/listinfo/blah-dev"/>
  </rdf:Description>
</rdf:RDF>

but this gets hard to manipulate in a general way (suppose we have lots of other rdf:Description blocks). I, for one, wouldn't want to be writing or maintaining the XSLT to suck HTML tables out of that as the data accretes. This is partly because XML is document based, so the tools assume everything you'll need will be laid out under the root element and will be related structurally as child elements and so on. Which is a perfectly good assumption as I see it. RDF is not like that; the relationships are found in the RDF abstraction, not necessarily in syntactic XML structures. In this case it appears the silo abstraction comes at a price. Unless you're going to use some kind of RDF preprocessor to create a unified document, what makes RDF so flexible can be a pain in the ass to script against. This is one reason why we hear that RDF/XML is not easy to manipulate with XML tools - another reason is that there are a lot of different ways that DOAP could be rendered into RDF/XML. RDF/XML reflects the flexibility of RDF directly by providing too many markup options.

When you see the kinds of numbers spent on getting IT and software to work, it's clear we do have to make things simpler. Approaches like DOAP, with a few minor tweaks, are about as simple as you get if you want to pass on the heavy lifting of things like naming, extensibility and structured metadata to RDF.

In this space - shared metadata - how and where to trade off are questions we'll continue to ask. In the case I talked about here, over time I think we'll move towards rdf:Description, or something like it, but not today. And remember, a lot of this comes down to understanding how people are working today - if the next version of RDF/XML used something called rdf:span instead of rdf:Description, we might be able to hold onto more of that intended audience.

May 14, 2005

Uniform Interface

Wow. Self-cloning robots. Check out the vid. As a trained AI guy, I've had to study the theory, but when you see one, it's beautiful. And creepy. The word I'm looking for is sublime.

Maybe it's a T-alpha.

But as a metaphor for SOA or REST uniform interfaces, it will rock. I don't know how you could look at that movie and feel quite the same way about custom interfaces afterwards as you did before. Wall sockets don't quite capture it.

[via Richard]

May 12, 2005

Billy Newport interview

Proper tech stuff from Billy Newport over on the TheServerSide:

"We can demonstrate that application running on 6 dual processor blades at around 3k tps where each transaction consists of a database transaction with 12 statements and five outgoing messages. This represents the worst case workload of the application using real time matching when every incoming exchange price indication results in a match. More blades give linear scale up. The database needs to be partitioned also both from an availability point of view and for horizontal scalability. We used a quad CPU Intel box running DB2 that used a Network Appliance F940 iSCSI server for disks. The database was running at 98% CPU load at 3.6k transactions per second and these transactions were modifications, not simple cache hits."
"The key to horizontal scalability is eliminating cross talk and contention between servers. Stock trading, airline/hotel reservations, batch applications are examples of such applications. The ideal situation is that partitions do not need to interact with each other at all. If this is the case then we'll get linearly scalability. All applications that use a database will experience some cross talk within the database due to index locks or latches within the database (such as those surrounding the transaction log) until the application is using a partitioned database. Application Architects should strive to stay as close as possible to this ideal as possible for best performance."

More like that, please. The follow-up discussion ain't bad either.

I wish Chris Brumme was still posting.

Disruption

Steve Loughran:

"The rationale for Geronimo over JBoss was that it would be more open, less vulnerable to the whims of its owner vendor. Instead, even though IBM haven't bought ownership of the Geronimo code, they do own the core developers. And every contribution made by third parties in the OSS codebase ends up benefiting the IBM distro. That is the price of the BSD license: you don't need to publish your additions, but everyone else has the same right. Which is precisely why (L)GPL makes so much sense for startups trying to retain control of their software. MIT/BSD/Apache licenses are good for universal adoption, but not retaining control of "strategic" technologies."

The point about careful choice of OSS licence in order to manage strategic versus tactical technologies is a fascinating one - I haven't heard it put that way before.

IBM might have decided that the writing on the wall for J2EE WebSphere is that the revenue model is as likely to be disrupted from the bottom by OSS as anywhere else. Having a stake in the disruption allows them to manage the rate of decay for WebSphere both at the level of the market and with individual customers who may be wondering about which level of sophistication and what 'ilities' they require. Like a lot of moves IBM makes with open source, it's strategically smart. Next thing you know, they'll be buying Interface21.

"Where then, is the moral high ground of the Apache Geronimo stack? I think Jonas has it. It also makes me worry that Apache is, through no action of its own, going to be perceived as a tool of IBM in its ongoing war with Sun, and now, JBoss. But that is, in its own way, a metric of how OSS is transforming the Java and app server economy."

JONAS has its fair share of the technical high ground also. It was clear enough 18 months ago that ObjectWeb were keeping their heads down and working on stability and maturity while the Geronimo and JBoss stacks were either competing on incidental architecture features or looking to do rewrites. For example, JORAM is probably the most solid open source JMS provider available (whereas ActiveMQ is still growing into its potential and JBossMQ is going to be second-systemed by JBossMessaging).

As for the impact on the ASF, or the ASF being used as a weapon: Geronimo went into the Apache incubator with a lot of public bad blood with respect to JBoss. There was that bizarre Elba source interregnum and various accusations around source code origination, sufficient to get the lawyers' pens out. Community being bigger than individuals is a core ASF value - ideally you're not meant to be able to buy out ASF-run projects or end up depending on a single commercial entity to secure a project's committer base. If all the core developers for Geronimo worked for Gluecode and now work for IBM, as Steve suggests, then the ASF maybe needs to broaden the committer base to meet its mission. As I understand the Apache way, that should have been set in place during incubation, and be carried on through the PMC.

May 09, 2005

Some weblogs

Some weblogs I've been reading lately and enjoying a lot; not new perhaps but new to me.

  • Copia: Uche Ogbuji's weblog home. Apart from lashings of Python and XML, the Quotīdiē's are wonderful (and what is it about markup guru types that makes them highly articulate?) Planet XML: please pick this one up (they already have, great).
  • Koranteng's Toli: Koranteng Ofosu-Amaah writes big, long, winding entries that can't possibly be delightful, but are.
  • Imperial Violet: proper tech blog. Anyone that wants to use parser generators to handle network protocols is basically ok by me.
  • hackaday: Hugh in work put me onto hackaday; it's great fun (but I plan to do something more low tech later in the year).
  • Lorcan Dempsey: library science and such like is supposed to be dull - but not here. It's interesting to read about technologies I work with from a strong data management and retrieval perspective rather than the usual code/protocol emphasis.
  • Dracula blogged: from the why didn't I think of that department. Dracula is one of my favourite books, the only classic book I think of as a page turner. Reading it laid out in diary form is great.

May 08, 2005

Plausible deniability

In the IETF and W3C, the specification directive "SHOULD" is a harder directive than it sounds, or than the way it gets used in the commercial sector. It means roughly "unless you have an exceptional case you MUST do this". It is not meant to provide plausible deniability. Unfortunately it can end up getting used that way.

In the current Google Accelerator kerfuffle, some people have pulled out RFC 2616 as a rationale to justify current site designs. It says that GET SHOULD NOT have side effects. Then, interpreting what RFC 2119 suggests about the degrees of freedom in SHOULD NOT, some are concluding that GET can broadly have side effects, including cases that might present insecurities. Thus the GWA is in some way broken and needs to be fixed, because it MUST NOT break existing apps.

An argument that said Google are breaking one side of Postel's law, even if they are working to spec, would be more reasonable. Using a part of the HTTP spec post hoc is thin justification. It's difficult to credit that app developers read the HTTP spec and concluded "it's ok for me to let people logout and delete stuff via GET!". Delta encoding as described in RFC 3229 is maybe one such exceptional case. Delete and logout clickthroughs are not.

The situation does present problems. Many of us will be mailing out directives next week asking people not to use GWA and patching httpd.conf files, but eventually we will have to consider upgrading the server apps, frameworks, and possibly the clients. That's a lot of infrastructure, but shouting Google down is not a sustainable approach - GWA won't be the last technology that does this - it's a glimpse into the future of a much more automated Web. Where this goes when proxies begin to intercept and interpret javascript, or scan applet bytecode, or manage sessions, or worst case when these things are used maliciously, is anyone's guess. What this situation does highlight is that specifications matter.

May 07, 2005

Only the paranoid survive

Regarding Google's Web Accelerator, Rob tells us we were warned. And Rael is summarizing the technology implications. Indeed people have been saying this for years.

It's clever though. On the one hand, some people who think Web architecture matters will consider Google Web Accelerator a kind of Web lint. On the other hand, Google will not be winning hearts and minds everywhere at the moment. It's hard to believe this accelerator was shipped without realizing the consequences for web applications, given how many of them use GET links for side-effect actions, such as deletions and logging out.

As well as a new-found appreciation of Web architecture, including among those who didn't know the Web had an architecture worth being consistent with until today, an outcome of the ensuing discussion might be that browser UAs are broken, that HTML forms are broken, and that as a consequence you should upgrade your apps to be Web architecture happy. In fact you should reconsider your entire client stack, because client limitations are driving a lot of the breakage that comes out of the box in server-side frameworks. Workarounds to block the GWA are not a lasting solution. Which suggests more reliance on XMLHttpRequest and/or HTML forms deprecation and/or classic browser deprecation. A well-behaved universe of applications makes the Web much more efficient for Google to process, simultaneously lowering their enormous operating costs while having developers everywhere reconsider the makeup of client stacks. And we all get to say it's architecturally the Right Thing. If it's not an oversight, it's a really smart play, and an ambitious one.

May 06, 2005

Web architecture: not just for assrons

Cory Doctorow:

"There's some foo you can add to your Web-app designs to prevent this,"

Yes there is - Robert's documenting the best of it.

I know a bunch of people who are probably looking at this accelerator thing from Google, saying I told you so, and it's sure as heck not because of web architecture - Hi Miles, Saif, Hans, Mike, Ben, Mike, Phil, Mark.

QOTD

"Des Moines is one of the only places on Earth that actually gets further away as you drive towards it." - Stephen O'Grady

Products and Solutions

To the point post from Larry Borsato:

"For any of the product companies to solve my problem, they will need to assemble some indeterminate collection of products in some configuration. As a customer, I frankly don't care about that - I just want the solution at what I consider to be a reasonable price. The truth is, I really don't care what 'products' you sell me as long as the price, and ease of maintenance, is acceptable to me." - Products don't solve my problem

Ramped

The Atom format has gotten through last call, as Tim Bray reports. Tim gives a good overview of one niggling issue, that of duplicate entries - should we allow duplicate entries in Atom feeds or not? Think before you answer - it's a tricky one ;)

Currently the WG is working through about twenty issues. The mailing list activity has ramped right up; at the moment it feels more like irc on a bad connection than email. A few other issues are being debated intensely, such as whether summaries are optional, or whether feeds must have links. And everyone is getting their +-1's in early. But the duplicates issue is an important one - the last few hours suggest we're getting somewhere on it tho'. Dave Johnson of Roller fame has been contributing to the duplicates discussion too, which is cool.

May 04, 2005

For some definition of De-spam

I have MT-Blacklist installed on this weblog; it's very good. A few days ago tho, I got an email asking me if I wanted to De-spam a trackback from Mícheál Ó Foghlú. Of course I wanted to de-spam something from Mícheál - he's not a spammer - so I clicked through. That ended up with MT-Blacklist graciously deleting the trackback from the database. The right thing to do here is not to do anything, as "De-spam this trackback" seems to be code for "Treat this trackback as spam".

It does the same for comments, but I've never accidentally trashed a comment. I think this is because comment notifications from MT-Blacklist come with the instruction "Approve this comment", which seems to be code for "Approve this comment". My suggestion for the next version of MT-Blacklist is to replace "De-spam this trackback" with "Treat this trackback as spam".

May 03, 2005

Dispute resolution

If Web Services advocates started saying "transport independence" instead of "protocol independence", that would resolve a lot of needless argument.

Nothing like a good read to inspire.

QOTD

Sean: "...enterprise integration is, when you boil it down, a dispute resolution problem.".

May 02, 2005

Maven2: first impressions good

Rick Hightower once asked:

"After using Maven, how could you go back to Ant?"

Because it was unreliable, and your build tools can't be unreliable. Ant works, and works in a transparent fashion. Maven is hit and miss - if your end requirements are not to produce a few jars, a website and a dist.tar.gz, or if something goes wrong and you have to debug it, that could be difficult. My preference with Java is to enforce dependency management through policy, and standard project layouts and targets through an ant script that generates ant scripts (the nearest open source thingie to that seems to be Jam). Policy management of dependencies is hard work, and can be risky, but in the absence of tools you have to do some amount of it.

Now here's Maven2. It's an alpha, but it's so much better than Maven, they might want to think about renaming the project. It's much faster. It doesn't try to force Jelly on you. It's not wholly intuitive, but nor is it actively obtuse as Maven was. I work on some medium-large projects that I could see moving to Maven2.

It's an alpha, so it has problems. Most of these seem to be design problems rather than bugs. When you do something wrong you typically get an unhelpful stacktrace - that's no good, because it's not telling you whether you're doing something wrong or whether it's busted. For example, the copy of Maven2 I have blows up if it's run in a directory with no project file. It blows up on genapp. It blows up on install. There's no obvious way to find the list of available targets (it doesn't follow, say, unix tool idioms). The package target is, as best I can tell, broken from a usability viewpoint. The Maven2 team are honest enough in admitting that the documentation is lacking, but they need to clean up the error reporting to the end user before releasing a 1.0 (update: Dan Diephouse says the reporting is looking better in subversion). Critically, Maven2 is not just a build system, it's also a networked application - but one that behaves as though the network will just be on hand to provide content. That's a fundamental design problem.

Being able to stop supporting the project generation hack mentioned earlier would be great. That's despite the fact there are a good few projects running on its artefacts now, personally and in Propylon. I mention it here from time to time, so people have asked about its availability. Open sourcing it is an option, tho' it's really only a small stuntwork. Bizarre as it sounds, I don't have time to organize a release, I'm no believer in just chucking projects into open source, and Brian Zimmer will be after me soon for some Jython I promised to do. The 'roadmap' for AntAnt was to integrate Jam and/or Ivy (see Ivy is everything Maven should be) and target the 1.7 Ant snapshots, maybe start describing projects using DOAP. Maven2 might mean doing none of that. The two things to figure out next are how flexible Maven2 is with regard to repository location and management (Brett Porter's subversion hack suggests this might be ok), and how easy it is to have an end result that is not a website and zip file.

Finally Maven2 really, really, really, really, really, really, really, really, really, really, really, really, really, really, really needs to drop the idea of using badly conceived XML scripting languages for extensions. Replacing Jelly with Marmalade might be witty, but it's carrying on a losing approach. Beanshell, Jython, Javascript - pick three.

JNDI+Objects - same old

I'm porting an application to a JORAM provider, off JBossMQ (soon it'll be the turn of ActiveMQ, but at the moment it's JORAM 4.2.0).

That should be easy, right? JNDI+JMS provides all the necessary abstractions one might need, plus this particular application has taken great care to use a Minimal Interoperable Subset (MIS) of JMS which will work across the major providers.

JORAM does not seem to come by default with a JNDI name for a QueueConnectionFactory (and I haven't figured out how to set this up in configuration yet). Now, I can add one through the admin UI and go back and tell the application's config file what it is. That works, and will end up as some kind of bootstrap script which sucks the JNDI names out of the app's config file and creates them in JORAM. Otherwise things start looking suboptimal from a deploy/ops viewpoint.

I had to grep the JORAM source to discover that its provider URLs begin with 'scn://'. The app in question wants you to have the provider URL. You would think that, armed with that knowledge, I could then search the PDF document, find some notes and bookmark them. Nope; no reference to scn:// anywhere.

And JMS, well, that old dear is starting to look its age - messaging models based on whatever the vendors had at the time, but with no obvious interop semantics in sight: Topics, Queues, P2P, PubSub, durable, not-durable, an inheritance model for message types that ensures downcasts, distributed transactions, MDBs - each slightly different. All looking more like noise than features.

I was playing with NetKernel before getting stuck into this JNDI nonsense. Now that was fun. Maybe I should port the whole thing into that instead. I'm still evaluating but so far it's impressive - NetKernel is a seriously enlightened stack. I wonder when the folks at 1060 will provide a repository and start building community around it.

URIs+Protocols whip JNDI+Objects, any day.