" /> Bill de hÓra: March 2004 Archives


March 30, 2004

Don Box: WS-CoatHangers

Who's written more webservices specs than Don Box? I don't know, but he's mad as hell about it :)

Update: Don responded:

I wouldn't characterize my attitude as "mad as hell" but rather as "happy that no new protocols have shipped with me as the author" :)

and pointed out that the article is 13 mos old. Doh! So that's why I thought I'd heard him say this before.

I think we've known about the problem of proliferating WS specs for a while tho'. Personally I'm pretty much done in by it all - see WS-Descent - that was 12 months ago. And 3 new WS-* specs - that was 9 months ago. Wait, there's more! Here are RSS feeds put together from the Apache wiki's WebServicesSpecifications page - Chris Ferris reckons there's a bunch of stuff missing from them.

Another update: Dims updated the wiki list, and it's been picked up by the RSS feeds.

March 28, 2004

IronPython: Jim Hugunin's paper from PyCon

I wanted to pinpoint the fatal flaw in the design of the CLR that made it so bad at implementing dynamic languages. My plan was to write a short pithy article called, "Why .NET is a terrible platform for dynamic languages". Unfortunately, as I carried out my experiments I found the CLR to be a surprisingly good target for dynamic languages, or at least for the highly dynamic specific case of Python. This was unfortunate because it meant that instead of writing a short pithy paper I had to build a full Python implementation for this new platform to see if there would be any hidden traps along the way. - Jim Hugunin

[via Miguel]

March 27, 2004

What would a Jython JSR do?

James asked me:

OK now a question for you :) The Python language is standardized through the PDP and Jython is a port/impl of it. JSR 223 is an API to bind any scripting language to the JVM & web container. So what would a Jython JSR do?

It would generate hype. About dynamic languages on the JVM in general, and about Jython in particular. That's why I said I like the idea of a Jython JSR, but for the wrong reasons! Let's be honest; anyone who thinks hype isn't of the essence in the software industry is selling something ;-)

March 26, 2004

Scripting frees

So why Groovy? Why not Jython or JRuby? Why not one of the dozens of other programming languages that are designed to run on the Java Virtual Machine? It's my opinion, and I believe the opinion of those who support this JSR, that Groovy is the best choice because it was built from the ground up for the Java Platform and uses syntax that is familiar to Java developers, while leveraging some of best features that Python, Ruby and Smalltalk have to offer. Jython and JRuby are excellent examples of how existing languages can be ported to the Java platform, but they are, after all, ports. They use syntax that is not designed with Java developers in mind and they are founded on a completely different set of code libraries. Groovy is designed for Java developers and its foundation is the standard APIs of the Java Platform. - Richard Monson-Haefel

Yes, it's built from the ground up from a Java perspective, but when you use it alongside JRuby or Jython that doesn't make much difference - it's just another language, just another port, and not one that feels like Java. Not the way JavaScript or C# does.

Anyway, at this stage, the Groovy community don't need to justify their approach to anyone - they're just getting on with it, and if they want a JSR I'm for that. Groovy's legacy will be that it was the language that changed how we think about the platform.

I still like the idea of a Jython JSR - but for the wrong reasons - I think Jython is the best language available to the Java developer and damn, it doesn't get enough hype. But yes, Jython is not a hotbed of activity and it needs to get to 2.3, where the dynamic and functional idioms available are truly mature - if you think closures are excellent, wait until you use generators.
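A taste of the generator point, as a small Python sketch (the log-scanning example is invented): generators give you lazy pipelines where nothing runs until a value is pulled, which closures alone don't buy you.

```python
def follow(lines):
    """Yield only the ERROR lines, one at a time, as they're pulled."""
    for line in lines:
        if line.startswith("ERROR"):
            yield line

log = ["INFO start", "ERROR disk full", "INFO retry", "ERROR timeout"]

errors = follow(log)     # nothing executes yet - evaluation is lazy
first = next(errors)     # pulls just enough input to find the first match
```

The function suspends at each yield, so an arbitrarily large input never has to be materialized in full.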

Really, what we need is for Sun (and the J2EE vendors) to get behind the idea of dynamic languages running on the JVM for enterprise work as well as admin and housekeeping work. Maybe there's someone in there that wants to port Self? Or maybe someone wants to port C#? Maybe someone wants to build end-to-end J2EE apps in JRuby or Groovy? Who knows.

update: James said in reply to my comment about using Groovy compared to other languages, "I take your point if you treat them as scripting languages in isolation - though the big difference is how you mix a scripting language and Java code". His point is well made.

March 24, 2004

How to use Ant 1.6 with IDEA 4

  • Stop IDEA
  • Backup ant.jar and optional.jar in $idea_home/lib
  • Copy ant.jar and ant-launcher.jar from $ant_home/lib to $idea_home/lib
  • Copy ant-nodeps.jar from $ant_home/lib to $idea_home/lib as optional.jar
  • Start IDEA and add the jar files in $ant_home/lib to the IDEA ant properties Additional Classpath
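For the script-minded, the copy steps above can be sketched as a small Python function (a sketch, not tested against a real IDEA install; the $ant_home and $idea_home paths are taken from environment variables, so adjust to taste):

```python
import os
import shutil

def install_ant_jars(ant_home, idea_home):
    """Mirror the manual steps: back up IDEA's bundled Ant jars, then
    copy the Ant 1.6 jars from $ant_home/lib into $idea_home/lib."""
    ant_lib = os.path.join(ant_home, "lib")
    idea_lib = os.path.join(idea_home, "lib")

    # Back up the jars IDEA ships with before overwriting them
    for jar in ("ant.jar", "optional.jar"):
        target = os.path.join(idea_lib, jar)
        if os.path.exists(target):
            shutil.copy2(target, target + ".bak")

    # Copy ant.jar and ant-launcher.jar straight across
    for jar in ("ant.jar", "ant-launcher.jar"):
        shutil.copy2(os.path.join(ant_lib, jar), os.path.join(idea_lib, jar))

    # ant-nodeps.jar takes over the role of IDEA's optional.jar
    shutil.copy2(os.path.join(ant_lib, "ant-nodeps.jar"),
                 os.path.join(idea_lib, "optional.jar"))

if __name__ == "__main__" and "ANT_HOME" in os.environ and "IDEA_HOME" in os.environ:
    install_ant_jars(os.environ["ANT_HOME"], os.environ["IDEA_HOME"])
```

Stop IDEA before running it, and add the $ant_home/lib jars to IDEA's Additional Classpath by hand as the last step describes.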

Cool. No more crashing when IDEA sees the import tag. Thanks to Glen for the pointer.

MDA: not baked yet?

Few people would advise you to slap a UML modeling tool on top of XDoclet (assuming some sort of slick integration), and then sink your time into making XDoclet generate all of your code, yet that's precisely what the vendors seem to want, and that time investment's justified because, heck, you've already invested so much money in the tool. At some point, you get to the level where the code is so custom that it's not going to be reused, and MDA tools become a golden hammer. Additionally, you can end up with so many options of how to generate the same code, you lose the simplicity the tool is supposed to provide. This is apparently the vendor's opinion of how it should work. The other option is to use it merely as an expensive version of XDoclet that happens to have UML integration. In this case, it still gets ugly, because they want you to bind the attributes in your UML model to database tables and columns. There's enough debate about enshrining DB mappings in your value objects for code generation purposes, but it seems blatantly messy to do this in your UML tool. So apparently the options are either monkeying with the tool, or using it as an expensive code generator that violates abstraction. Where do I sign up? Of course the whole thing comes from the pipe-dream that we can turn software development into a turn-key, assembly line process, the same dream that's been selling CASE and RUP tools for decades. - Rob Kischuk

IDEA 4 and Ant 1.6

Has anyone managed to get IDEA 4 and Ant 1.6 playing nicely together?

[update: how to do it]

March 23, 2004

Typekey: Grand Central Web

From a performance standpoint, how many times do you get a failure when you ping weblogs.com or blo.gs or even movabletype.org? Blogspot or TypePad users, have any problems accessing your service? Have you all tried to ping two Trackback-enabled TypePad weblog posts with the same entry, and found it has failed? How about you folks that link to Amazon or Google or Sitemeter or blogrolling.com on your pages -- ever notice how slow your page loads? - Shelley Powers

Oh sure, we've noticed. But here's the thing, the web, despite being the most successful distributed system ever created, is still quite centralized. The centralization revolves around two things.

The domain name: the domain name is the bit between http:// and the next /, (dubyadubyadubya-summink). ICANN is a central body that distributes rights to resell domain names, which you can then rent for a limited time. You don't really own your domain name, but you are accorded some rights through renting. [update] Technically the analog is DNS, the domain name system which links IP addresses to domains. DNS is pretty neat tech, but it's managed to a large degree in a proprietary fashion, which centralizes it to a degree. Some day, those handful of root name servers are going to be taken down and... well, you don't want to know.

The server: Classically on the web or any other client-server style network, you connect to a server (another machine) and have the content downloaded to your machine. On the web, that domain name has to resolve to an IP address, which means a single computer. If too many people connect, that computer will be overloaded. But the physical topology is different. Underneath that wafer thin conceptual layer you and I call the web, there is big technical mojo. It consists of machines that cache pages so you don't always connect to the server you thought you did, load balanced clusters of machines acting as though they are one server, and entire networks of geographically astute server farms called content delivery networks (which are also proprietary). The single stupidest, unmojo part? Probably your browser. Those of us that admire the web tend not to bang on too much about these modern wonders which keep things running - we know it's all a workaround. Instead we like to say that HTTP has caching and scale designed in. The design of the web is what matters. Right? Things degrade with HTTPS. An HTTPS session requires that your computer create a stable one-time link to another computer, making clustering more complicated and bypassing caches altogether. Notice that a site like Amazon doesn't put you into an HTTPS session until it has to.
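To make the "caching designed in" point concrete, here's a toy model (all names invented) of the validator handshake HTTP caching is built on: a client that echoes the ETag it last saw gets a bodyless 304 back, so unchanged content never crosses the wire twice.

```python
def respond(resource_etag, resource_body, request_headers):
    """Answer the way a validating origin server would: (status, body)."""
    if request_headers.get("If-None-Match") == resource_etag:
        return 304, b""            # cached copy still good, send no body
    return 200, resource_body      # first fetch, or the content changed
```

A first fetch gets the full 200; a revalidation carrying the same ETag gets the cheap 304 - the trick every cache between you and the origin depends on, and the trick an end-to-end HTTPS session defeats.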

Shelley again:

All of these are dependent on centralized systems, and as we have found in every single instance of centralization and weblogs, they don't scale. Every single instance.

I remember people saying Hotmail would never scale. Then they said Google wouldn't scale (well, I still say that). Making this stuff scale is expensive, but it's doable to a point - yet the web has no really good architectural answer to what most of us call the Slashdot Effect (or in security jargon, distributed denial of service).

I don't like the TypeKey solution either; commercially and technically it's inelegant. But until we come up with an alternative architectural model or make PKI usable, these models will continue to be proposed. For what it's worth I think Movable Type with TypeKey are ultimately targeting corporates (who tend to be able to live with centrality) using blogs behind the firewall or across firewalls with partners.

[streets: fit but you know it]

March 22, 2004

Qualified Names (QNames) don't identify.

From the TAG document:

Further, it must be observed that some things are identified by QNames: element and attribute names, types in W3C XML Schema, etc.

QNames are macros, similar in kind to entity declarations. A QName serves to indicate the part of an XML document that can be targeted by something that understands and can expand XML Namespaces. Technically, macro processing is needed for XML namespaces - since XML can't support URIs as element names, we instead use the pair of Namespace Prefix and Local Name to tunnel the Universal Name through the markup. The Universal Name is James Clark's terminology; in the abstract it's the tuple {NamespaceURI, ElementName}.

It's the Universal Name that identifies, not the QName. Using QNames as identifiers is muddled thinking, not something to be encouraged.
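A sketch of the macro expansion a namespace-aware processor performs - the helper is invented, and the output uses James Clark's {uri}local notation for the Universal Name:

```python
def expand_qname(qname, ns_decls, default_ns=None):
    """Turn 'prefix:local' into '{namespace-uri}local' using the
    in-scope namespace declarations."""
    if ":" in qname:
        prefix, local = qname.split(":", 1)
        if prefix not in ns_decls:
            raise ValueError("undeclared prefix: %r" % prefix)
        return "{%s}%s" % (ns_decls[prefix], local)
    if default_ns:                 # the default namespace applies to elements
        return "{%s}%s" % (default_ns, qname)
    return qname                   # no namespace in play
```

Two documents using different prefixes for the same URI expand to the same Universal Name - which is exactly why it's the expansion, not the QName, that identifies.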

Mark Nottingham is right - QNames in content are evil.

March 21, 2004

EditPage means EditPage

Apache wiki has a page for Web services specifications that includes an RSS feed. [...] Unfortunately, some of the cited specifications are no longer in vogue, like WS-Attachments. WS-Transactions has been superseded by the combination of WS-Atomic Transactions and WS-Business Activity. WS-Inspection isn't receiving much attention these days. Neither is WS-Routing for that matter. There also appear to be two or three glaring omissions; notably SOAP (1.1 and 1.2), WSDL (1.1 and 2.0), UDDI (2.0 and 3.0), and SAML to name a few off the top of my head. - Chris Ferris

By all means, update the page then... that's why it's a WikiPage ;-)


The real problem seems to me that XP is going out of its way to downplay foresight and planning. In a world in which applications frequently, if not invariably, are extended to do more than originally intended, solving for the minimum is frequently not a good choice. - Glenn Martin

Fair enough. The real problem with that line of thought, however, is that it reasons with the benefit of hindsight. There's an example given about spiralling logging requirements. But how would you know that beforehand? You might think on reflection that it was obvious, but if it was so obvious why did you have to let events occur to see that it was obvious? [Yes, it's circular reasoning, but that's the point.] Perhaps the answer is that it wasn't obvious with any amount of foresight, just with hindsight. In the same way, solutions to NP hard problems tend to be only "obvious" in hindsight. Before that they remain NP hard.

The suggested answer to the mentioned logging problem is to log asynchronously (which is a very good one, let's be clear on that). That's fine, but by the time you do design your way out for all functional and non-functional aspects of even a medium sized system, chances are you'll have bled the customer dry thanks to futureproofing and still they won't have what they need today. Assuming of course that your guesses worked out to be right.

It's just not as simple as splitting matters in early design or delayed design. If it were, we'd know what to do by now :)

It's hard to see how you design for the future when you don't know what that future is, or even which future will come to be. Even if you did design the ideal system, it's only ideal insofar as the requirements don't change any further into the future. There are only so many corners you can see around, unless you're building the same thing over and over (perhaps you are). So I don't see that you can rely on foresight, even when you're lucky :) I think you can, to some degree, architect very broad frameworks or platforms that many solutions can run on (I'm thinking of things like .NET, J2EE, the Web, JXTA or even operating systems), but I don't think it's like that when it comes to requirements heavy or customer driven systems.

There's any number of failed over-engineered, over-designed projects. They fail not only because they lacked the needed focus on requirements (what does the system need to do?) but also because they encourage a situation where the project schedule has nothing to do with the amount of work involved - and there's enough of that going around the industry already without adding to it by overegging things.

Designing software is not simply a matter of designing for the future you believe will come to pass. That's just gambling with the customer's money. As such, it's not one iota better than the cowboy code-and-fix approach. What we want is risk management, not gambling. Doing any design above and beyond requirements needs to be assessed against the risk of project and schedule failure. Not doing any significant upfront design also needs to be assessed.

Now, there is one problem I (and others) have with the XP approach, and that's what happens when you get published interfaces wrong the first time. They're in the wild, and you don't own all the callers (assuming you even know them), so you don't know what you're breaking if you change. Martin Fowler calls this distinction published versus public. The answer in many cases seems to be not to rely on APIs for the interface. Thankfully there are options to consider for published interfaces other than APIs, such as internet protocols, REST and XML documents.

By the way, there's a whole school of thought within XP of programming towards patterns (known solutions), that has not been mentioned or critiqued here.

My own experience is that change is cheap when you have well factored code and a solid test suite to rely on. It doesn't really matter then when the change comes, just that you have tests and clean code before it comes. I should qualify cheap here - by cheap I mean affordable, timely with low risk of impact. There are many many systems out there that people are terrified to change. But again, there's nothing mentioned about the value of having taken out insurance through test coverage.
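The "insurance through test coverage" idea in miniature (the discount rule is invented): the tests pin the behaviour, so the function body can be refactored at any time with low risk - that's what makes the change cheap whenever it arrives.

```python
def discount(order_total):
    """Business rule: 10% off orders of 100 or more, else no discount."""
    return order_total * 0.9 if order_total >= 100 else order_total

def test_discount():
    # Rewrite discount() however you like; these still have to pass.
    assert discount(50) == 50
    assert discount(99) == 99
    assert discount(100) == 90.0
    assert discount(200) == 180.0
```

The tests cost a few minutes now; they buy the right to change the code later without fear.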

So my admittedly narrow reading of Glenn's entry is that he has a narrow reading of what XP is offering. He has a good point if you take XP as dogma or perhaps have unwarranted expectations about what a software process gives you.

March 20, 2004

XML Namespaces: they set up us the bomb!

There is a profound architectural misdesign with the way that most tools deal with XML Namespaces. Not with namespaces themselves, though some may disagree, but with their common presentation. Even the XML Infoset fosters this misdesign. The error is that the Infoset and most tools present the namespace name and the local name as two wholly separate information items (parameters). - Ken MacLeod

Although I like the idea of APIs using single names, I disagree with Ken. The problem lies squarely with XML Namespaces. The spec goofed by not providing a syntactic artefact for the full name of the element. As a result Namespaced elements are totally abstract. To tunnel them into XML syntax required extending core XML tools with what is effectively a QName macro processor. The QName, which Ken takes issue with - well, being the only thing specified it's naturally what you would expect to show up in the tools initially. A strange beast to offer up to a syntax obsessed community. Ken mentions that James Clark wrote the "clearest description of XML Namespaces" in 1999. But James Clark did that in large part by providing a decent syntax. So while the single name idea is good, all you can say about the tools folks is that they were working to spec.

Unfortunately any cogency stops there - the rest of this is a rant.

Since then, the XML Namespaces abstraction has been shoehorned into XPath syntax and the SAX and DOM APIs via the QName. Each time the complexity has gone up - but hey it's worth it, because name clashes happen every day in XML. Right? Having decided probably for legibility to prefer QNames, XPath had to invent its own (pretty decent however) model for XML Namespaces to explain what QNames are actually standing in for. Technically (if you care), in the XML Infoset XML that doesn't use XML Namespaces doesn't have a "meaningful" Infoset. Why, and what value is a meaningful Infoset anyway? I just don't know - shrug.

Meanwhile the Atom folks are talking about how to define relationships between Atom elements and children in another namespace - XML Namespaces don't have any such semantics. Sure, XML doesn't either, but XML doesn't imply the kind of partitioned vocabularies XML Namespaces do, so it's hard to see why it should. So... it seems that when you do have something reusable in an XML namespaced vocabulary, people sometimes won't use it. Oops. Ostensibly, this is for the usual reasons you need to be careful about normatively referring to someone else's standard, but I suspect aesthetics and hubris also play a part. Mixed namespace documents are not always pretty. And if I've learned anything in this industry it's that people will expend no little effort to own the definition of a term. Effort that would otherwise have been wasted on useful work. The available option then becomes a one-shot mapping between vocabulary elements. In the Atom case, that becomes a mapping of Atom elements onto Dublin Core (perhaps defined as XSLT). Yes, the fact an XML Namespace is a URI is useful since we can leverage RDF for the child/parent relationships, but in truth it's just as easy to map any number of raw names or a dotted prefix name directly onto URIs (e.g. Dublin Core in HTML). Once we have the indirection of a mapping function, we might as well use it to keep the markup cruft free. Does anyone see the value XML Namespaces are adding here? I don't. Are we sure the spec understood the uses? I'm not.
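A sketch of that one-shot mapping idea: keep prefixes out of the markup and map names onto URIs in a single table, Dublin-Core-in-HTML style. The table entries below are illustrative.

```python
NAME_MAP = {
    "DC.title":   "http://purl.org/dc/elements/1.1/title",
    "DC.creator": "http://purl.org/dc/elements/1.1/creator",
    "author":     "http://purl.org/dc/elements/1.1/creator",  # raw name, same URI
}

def to_uri(name):
    """Resolve a raw or dotted-prefix name; unmapped names pass through."""
    return NAME_MAP.get(name, name)
```

The markup stays cruft free, and the indirection lives in one place instead of being sprinkled through every document as namespace declarations.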

[Note that some SGML folks generalized such mappings years ago as Architectural Forms (AF), but it seems not enough people liked them.]

There is almost a total suspension of critical thinking around XML Namespaces - XML itself gets dragged through the coals from time to time, but not XML Namespaces. I don't think I'll ever understand why the XML community has developed a blind spot around them.

I give up. Or maybe I don't. Glimmer of hope: Noah Mendelsohn recently challenged the belief that XML Namespace prefixes don't really matter, a notion which has always been nonsense to my mind. For me, that was a blast of fresh air.

Finally, ignore people saying using XML namespaces means being a good citizen. That one's a lemon, folks. You can't be a good citizen with XML Namespaces. Not until default namespaces are dropped for being the pernicious, backward incompatible extension to XML that they are. If there was only one idea that had to go from XML Namespaces, it would have to be the default namespace. Indeed, without the default namespace, and with a direct spec and a syntactic representation of a namespace alongside its QName model, XML Namespaces might be decent technology. So, what did Namespaces in XML 1.1 grant us?

IRIs instead of URIs.

Uche's article, which inspired Ken's entry, is very good tho'.

[marvin gaye: inner city blues]

March 19, 2004

Massachusetts starts open source repository

The repository will consist of a MySQL database, Z Object Publishing Environment application server, Apache Web server, OpenLDAP authentication service for storing membership data, and Debian Linux operating system running on an Intel-based rack-mounted server. The University of Rhode Island will serve as the repository's home.
Massachusetts Builds Open-Source Public Trough

[via Kevin Dangoor]

March 18, 2004

Groovy JSR: I don't doubt it

And we hold the world ransom for... One Millllyyon Dollars.

Simply put: Groovy is a very real threat to Java. - Otaku, Cedric Beust

Groovy is not a threat to Java. But Groovy and its ilk are a sea-change in how our industry goes about producing middleware (much of it built around J2EE), right down to the business models. Those of us who have been using Lisp, Ruby, Smalltalk, Python et al to build systems alongside Java already know this. These 'scripting' languages are not just toys for building and deploying code, folks, not anymore. There seem to be fewer and fewer interesting technical arguments pro Java The Language (JTL) [Java the Virtual Machine (JVM) is different entirely]. Worried about performance? Well, compiled Lisp has always been faster than Java, and I imagine commercial Smalltalks are there or thereabouts. And after all, who isn't using JSP? The best argument pro JTL is that it is a market - both for middleware and developers - and offers economies of scale and lowered risk for the customer as a result. But a certain class of languages seem to be far better suited for managed middleware components - by being more expressive, more maintainable, more flexible. Faster, cheaper, better. Add to that an economic imperative for developers in the US and Europe to become 3-5 times more productive in order to compete with developers elsewhere... well, Donald Rumsfeld put it best:

If you don't like change, you'll like being irrelevant even less.

Now, I can imagine Sun rejecting this JSR, but I don't think they will, and I think Cedric is underestimating their vision. Sun will take it on board, or risk missing a big wave in middleware development (here's a clue: it's not Grid or Webservices). I don't think they'll act with the same inertia they did with XML. Opening up the JVM to dynamic languages would keep them years ahead of .NET while letting IBM/Eclipse duke it out on the tools front (did I mention that tools matter less with the right language?). It's not hard to see a Jython, JRuby or Smalltalk JSR following on its heels.

[eels: beautiful day]

What should we call "agile" languages? (warning: troll)

Daniel H Steinberg asks:

I don't really agree with Richard that "agile" is a good name for languages such as Python, Ruby, Perl and Groovy. Perhaps he is right that it is a better name than "scripting" language, but "agile" is already becoming overloaded with meaning. Is there a better name for such languages?

How about "better" languages ;-)

March 17, 2004

The Java programmer: half a Lisp hacker is better than none

And you're right: we were not out to win over the Lisp programmers; we were after the C++ programmers. We managed to drag a lot of them about halfway to Lisp. Aren't you happy? - Guy Steele
[via Ted Leung]

[julian lourau - lonely night]

Stop downloading the web into a database

We really need to stop looking at the world as a bunch of documents that we crawl and try to import into one gigantic database and need to start looking at the world as a distributed information space that is queried dynamically - Jeremy Gray on rdfweb-dev

Absolutely; this is one of the key motivations behind the search project I'm starting. Stop downloading the web into a database.

Web ontologies: problems, benefits

Computers are insanely, hair-tearingly, stupid - they have to be told everything in precise detail. Things you don't normally need to be clear about, ever, have to be written down in exacting detail for a computer. This is done usually in languages which are simultaneously not designed to express logical relations, unforgiving in their exactness, and bereft of anything you or I might call expressiveness. But, if you think being precise in any language is easy, try reading the terms and conditions of your credit card.

The RELATIONSHIP list should make it obvious that explicit linguistic clarity in human relations is a pipe dream. It probably won't though - the madness of the age is to assume that people can spell out, in explicit detail, the messiest aspects of their lives, and that they will eagerly do so, in order to provide better inputs to cool new software. - Clay Shirky

There are a number of problems with declaring relations like isBossOf for use in computers; three come to mind, none of which Shirky addresses.

First off, the chances are that they're modal: that means first order logic can't express them, which means we can't unambiguously specify what they mean (modal logics remain... "controversial"). Anyone who's well-informed on the matter recognizes there is a need for some amount of code to support ontology definitions. In the immediate future you are most likely to see such "scripting" capabilities built into business process specs, which are in many cases ontologies in denial.

Second, they're unconstrained: a network filled with unconstrained relationships is perhaps no better than a network filled with unconstrained method calls. We have been forced to relearn the limitations of unconstrained interfaces at the protocol level with webservices (a lesson which is thankfully drawing to a close). We already know that unconstrained interfaces do not work well across networks, yet web ontologies are supposed to be for internetworking.

Third, they're often temporal: modern web ontologies have no concept of a time varying relationship such as isBossOf; in fact RDF semantics explicitly excludes it, and RDF is the bedrock for any web ontology you're likely to come across. Web architecture also has no concept of time varying identity thanks to the way it assigns ownership of URIs. Cool URIs may not change, but cool URI owners do - frequently. This is compounded by the fact that the web architects would really like you to use the transient (in the ownership sense) URIs because you can type them into a browser to get more data. This transience is less than ideal for semantic web and ontological uses. While it might smack of a badly layered stack, the point is that the architecture doesn't support it and you have to fall back on urging some kind of best practice or an architectural diktat. However, there are good technical reasons to avoid baking in temporal logic. The most important is that it's complex stuff. The nearest analogy that comes to mind is reliable messaging in webservices - not having it in the core makes some people's jobs much more difficult and expensive, but having it in the core makes everyone pay a complexity tax whether they need it or not. So no sense of time in the semantic web core makes some sense. But no matter; the key issue is that the absence of time leaves us with significant data integrity and management issues to contend with - be wary that many facts will go stale and we have no interoperable means of expressing that yet.
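To illustrate what a time varying relationship could look like if the model carried one, a toy Python sketch (names and dates invented; RDF's core has nothing like this): each assertion gets a validity interval instead of being timeless.

```python
from datetime import date

facts = [
    # (subject, relation, object, valid_from, valid_to)
    ("alice", "isBossOf", "bob", date(2002, 1, 1), date(2003, 6, 30)),
    ("carol", "isBossOf", "bob", date(2003, 7, 1), None),   # None: still true
]

def holds(subject, relation, obj, on):
    """Did the relation hold on the given date?"""
    for s, r, o, start, end in facts:
        if (s, r, o) == (subject, relation, obj):
            if start <= on and (end is None or on <= end):
                return True
    return False
```

Without the intervals, both assertions about who is bob's boss would simply contradict each other - which is exactly the staleness problem timeless facts leave you with.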

So that's some of the downside. The upside is that using ontology with computers can't possibly be worse than the in-code truth functions we use today. Legions of programmers are writing down things like isBossOf all the time (myself included). It's their job. Except they don't call them relations, or predicates, they call them methods, and those methods capture what most of us call business logic. [Nor do they call themselves ontologists.] So it's pretty far from logic but good enough for business - until the time comes to change the logic, when the cost of using all that code becomes apparent. It's a long held truism that we'd all be much better off if we could get such logic out of code. But it seems only so much of it can come out of code. We end up with the same design tension that afflicts web development; except this time it is separating code and relation instead of code and presentation. Today we place logic in code as we used to place markup inside code; tomorrow we might place code in logic as we place code in markup today. Either way it requires real discipline to keep the two apart and the system clean. Relational databases were supposed to do this for us and I suppose to some degree they do, but the modern SQL powered RDBMS is some ways away from its relational heritage - certainly, there's not much talk about databases as logical theorem provers anymore.
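The contrast in miniature, as a hedged Python sketch (names invented): the same isBossOf question answered by a hard-coded method and by declared facts plus a generic query. Only the second can change without touching code.

```python
# In-code version: the relation is trapped inside business logic.
def is_boss_of_v1(boss, worker):
    if boss == "alice" and worker in ("bob", "carol"):
        return True
    return False

# Declared version: the relation is just facts, the "logic" is a lookup.
REPORTS_TO = {("bob", "alice"), ("carol", "alice"), ("dave", "carol")}

def is_boss_of_v2(boss, worker):
    return (worker, boss) in REPORTS_TO
```

Adding dave's boss to v1 means editing and redeploying a method; adding it to v2 means adding a row of data.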

Ian Davis responds to Clay Shirky's critique:

Despite all the obvious thought Clay put into his piece, he still managed to overlook the raison d'etre for the relationship vocabulary. Indeed it's the raison d'etre for all vocabularies. Without these vocabularies, incomplete and imperfect as they are, we would be mute in the machine readable web, unable to express ourselves in any meaningful way. You only have to look at the etymology to realise that vocabularies give you a voice. - Ian Davis

Certainly Shirky's argument has less bite when you consider that the authors of the work he's critiquing are simply not thinking in the naive way he imputes. Is Shirky right that this is ultimately a pipe dream? Yes. But almost everyone writing these ontologies down knows this too. Such criticism is similar to criticizing a hacker for hacking because the Halting Problem represents a hard limit on the capability of a computer. But the hacker already knows this and gets something useful done anyway.

This is because Shirky is wrong on one point - it is not the madness of our age. In the history of writing facts and relations about the world down, our age is perhaps the most sane. We have 20th Century mathematics and philosophy to thank. Arguably, more logical shibboleths were killed in the last 100 years than in the entire history of thought beforehand. Today's ontologists are quite sane and are usually painfully aware of the limitations they work with.

For my part, the raison d'etre of declaring such relations is reducing code complexity while increasing flexibility. In other words it's all about the Benjamins. This happens in two ways - we write less logic in the wrong languages and write more code in the right languages. One side effect of working more with declarative logic and ontology and less with systems languages is that it leans you towards alternative programming styles. Unfamiliar syntax aside, when you express business logic in largely unused languages like Prolog, Haskell (or functional Python), then implement the same logic in the hugely popular VB, Java or C#, and hold the two styles side by side, you have to wonder why we stay loyal to the latter languages. We can question the wisdom of an industry which flagrantly uses suboptimal tools for the job. Writing business logic in a systems programming language makes questionable economic sense and is something like driving into the future by shoving an exhaust pipe up a horse's arse - all filth, no benefits. Ontology, especially targeted at middleware business logic, makes a lot of sense in comparison.
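A small illustration of holding the two styles side by side (the shipping rule is invented): the same business rule written imperatively and as a declarative rule table.

```python
# Imperative: the rule is smeared across control flow.
def shipping_imperative(weight):
    if weight <= 1:
        return 5
    elif weight <= 5:
        return 9
    elif weight <= 20:
        return 17
    else:
        return 30

# Declarative: the rule is a data table plus one generic lookup.
TIERS = [(1, 5), (5, 9), (20, 17), (float("inf"), 30)]

def shipping_declarative(weight):
    return next(charge for limit, charge in TIERS if weight <= limit)
```

Same answers, but in the second form the business rule is inspectable data - it could be loaded from a file, diffed, or handed to a non-programmer - while the first form can only be changed by editing code.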

Nonetheless the criticism is valuable because it keeps the non-geeks among us skeptical. We're an industry driven by hype. And while I'm not aware of anyone looking for a perfect language in web ontology, that's not to say other people who decide whether this stuff gets used won't succumb to delusional reports on their expressive and economic power.

If you are interested or invested in the ontology, service-oriented, data integration and social networking spaces and have not read Data and Reality by Bill Kent, you really should. Seriously. It is the best book ever written on data modelling with computers - at the time of writing this it is a quarter of a century old. Pertinent to this entry, it has the clearest explanations I've ever read on why philosophical discussions on ambiguity and meaning we so often disregard as pedantic nitpicking cannot be so disregarded when it comes to computers.

Reading list:

Data and Reality: Bill Kent
Knowledge Representation: John Sowa
Programming in Prolog: Clocksin and Mellish
Philosophy of Artificial Intelligence: ed Margaret Boden

[rem: the lifting]

March 16, 2004

RSS1.0 in drag

So my question is this: why isn't Atom defining a Syndication Element Set as a complement to the Dublin Core Element Set? Why duplicate all that effort when the Dublin Core people have been over the issues again and again for nine years? Ian Davis

To which we can add what Edd Dumbill said.

March 15, 2004

An email and some book recommendations from Bobby Woolf

Earlier this year, I received an email from Bobby Woolf. He was responding to a question I left on the wiki, namely, are there any good distributed computing books (in the context of Java books). Bobby supplied the following recommendations:

  • Core J2EE Patterns

  • Patterns of Enterprise Application Architecture

  • Enterprise Integration Patterns

all of which I own and like - it was a long time ago when I asked, long enough I'd forgotten the question, but I'm chuffed nonetheless to know they're recommendations. Bobby is also a co-author on the last, Enterprise Integration Patterns. I'll be putting a review of that book up shortly.

[calexico: gipsy's curse]

March 09, 2004

RDFX: RDF XML serialization

Mainly design notes and musings for now.

Naturally, a search project which is going to use RDF needs a way to manipulate and interchange the stuff. XML - yes please. RDF/XML - no thanks. So, in the grand software tradition, I'm hacking a (nother) XML serialization of RDF.

Design goals:

  • It's for serializing RDF in XML, not XML in RDF

  • Readable

  • Writable

  • Hackable

  • CutAndPastable

  • Optional support for contexts

  • Optional support for quotation

  • A decision in 10 minutes whether to use it

When it comes to XML, I have never bought reasoning of the form "blahML will be mainly processed by machines, so who cares what it looks like". That's not what markup's about, baby. It always ends up in front of a person who is in front of a text editor which has no schema support, and who is not even a Desperate Perl Hacker, just Desperate. I want people to be able to fire up an editor and start typing RDF triples, or to be able to hack some code together to start producing or consuming the stuff. Just because RDF is formally stuffy doesn't mean the markup can't be simple.

My experience of RDF to date suggests that Uche Ogbuji and Graham Klyne (and others) have been right all along - RDF needs contextual support. I think this has been missed because folks (myself included, once) assumed reification would magically provide such support (and for other things like provenance and quotation, such is the power of magic). But in 2004, RDF reification is a syntactic and semantic roadcrash; it isn't fit for anything. So folks can have quads if they want them (and we'll sort the semantics out in code for now - typed contexts anyone?).
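As a rough sketch of what optional contexts buy you (the URIs here are hypothetical), a quad is just a triple with a fourth term naming the context the statement was asserted in:

```python
# Sketch only: RDF statements as plain Python tuples. URIs are invented.
triple = ("http://example.org/doc",
          "http://purl.org/dc/elements/1.1/creator",
          "Bill")

# A quad carries its context along with the statement, so provenance
# questions can be answered without going anywhere near reification.
quad = triple + ("http://example.org/contexts/blog",)

def in_context(quads, context):
    """Return the triples asserted in the given context."""
    return [q[:3] for q in quads if q[3] == context]
```

The semantics stay in code, as the post says - but the syntax cost of the fourth slot is close to zero.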

The design artefacts:

  • Schemas: RNG and WXS:

  • Samples

  • Spec

  • Transforms: rdfxml2rdfx, rdfx2rdfxml

  • Parsers, serializers (Python, C#, Java)

So far I have the RNG, WXS. There's a sample here. I'm still sorting out whitespace normalization, but am getting happy with the literal, typed literal and bnode support (need to review against the abstract syntax). The XML is in a namespace (bah) - this may change if I decide namespaces hurt the design goals (the same sample, without namespaces).

I would like to think it's simple enough to be used as a powerful alternative to the kind of property/value extension hooks you see sometimes in XML (aka attribute driven markup), though that's not a design goal.

Things that will not be present or supported:

  • XMLBase (and by extension no relative URIs)

  • QName abbreviations in place of URIs

  • Canonicalization of XML literals (I haven't looked at this issue in 2 years, but I would like RDFX to be "dsigabble")

  • Literal as subjects

  • Nested graphs

  • Reification (use quotation and get over it)

On the long finger is RDFC, or a compact form a la RNC (if only to get the W3C to mandate ntriples - if it's good enough for testing it's good enough). But an rdfx2yaml hack might do the job anyway.
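For a flavour of why ntriples is attractive as a testing and interchange format, here's a rough sketch of pulling a statement apart in Python - it handles only URI refs and plain literals, nowhere near the full grammar:

```python
import re

# Minimal sketch: match '<s> <p> <o> .' or '<s> <p> "literal" .'.
# Real N-Triples also has bnodes, typed and language-tagged literals.
STATEMENT = re.compile(r'<([^>]*)>\s+<([^>]*)>\s+(<[^>]*>|"[^"]*")\s*\.')

def parse_statement(line):
    """Return a (subject, predicate, object) tuple from one line."""
    m = STATEMENT.match(line.strip())
    if m is None:
        raise ValueError("not a simple N-Triples statement: %r" % line)
    s, p, o = m.groups()
    return (s, p, o[1:-1])  # strip the <> or "" from the object term
```

A line-per-statement format that a dozen lines of code can make a decent start on is exactly the kind of thing that passes the ten-minute decision test above.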

This is fun.

[mary poppins: let's go fly a kite]

On the futility of popular uprising in 21stC Ireland

s.o.: Hey, you can only rent Intermission thru Xtravision.

me: So? Oh right... I guess they learned something from the Warner Brothers thing last year.

s.o.: We can't get it at ChartBusters. We have to go to Xtravision.

me: We still have an Xtravision account? ...Hang on, let's not rent it. That'll show them.

s.o.: Er... I'll drive round and get it in Dundrum.

me: Ok. Wanna get some beers too?

March 08, 2004

Driving in Dublin

Driving in Dublin is like driving through a building site.

[alex reece: ibiza]

MalignantVavavoomController: A linguistics of MVC

Terence Parr: all your mvc are belong to him. This is either a groundbreaking clarification, or taking the high ground through jargon (opinions may vary). Or both. I liked it.

[lenny kravitz: let love rule]

Google, we have a problem

Google Search: Necromantics

March 06, 2004


I read a piece in the Economist called "Necromantics" on the train to Galway yesterday (it's premium content online, so, no link). Apparently, the Italians have been raising the well-known dead for some time now, to discover more about them (and possibly sample DNA if I recall correctly). Giotto has been raised, as well as a man who appears in Dante's Inferno. A number of the Medicis are coming up soon.

March 03, 2004

MT generating malformed XML

Chris Lawrence asks if a CDATA section in an attribute is legal XML. My understanding is that it's not, since a left angle bracket (<) is not allowed inside attribute values, which rules out the CDATA opening delimiter.

So it looks like MT 2.66 is generating junk markup.
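You can check this in a couple of lines - Python's expat-based parser rejects a CDATA section inside an attribute value, because the "<" starts an invalid token there:

```python
import xml.etree.ElementTree as ET

# A plain attribute value is fine.
ET.fromstring('<item note="plain text"/>')

# A CDATA section in an attribute value is not well-formed XML.
try:
    ET.fromstring('<item note="<![CDATA[some text]]>"/>')
    well_formed = True
except ET.ParseError:
    well_formed = False

print(well_formed)  # -> False
```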

[jamelia: superstar]

Simple Kid

I'm up watching some program on RTE2 compered by Jerry Fish (if you're walking by Whelans day to day, his name is painted on the green bit at the moment). Anyway, Simple Kid. Beck meets Neil Young meets a drum machine. Outstanding, beautiful. The last time I saw anything on telly this good was Asian Dub Foundation - that was years ago.

[simple kid: truck on]