" /> Bill de hÓra: June 2005 Archives

June 26, 2005

An Irish host that does all this?

I currently host this weblog from a UKShells account. I'm more than happy with them, UKSolutions have their act together, the service is great. But, the exchange rate between Sterling and Euro bites. Here's what they give me for around 250 euros pa:

  • SSH to bash shell
  • 250Mb filespace
  • 2 domains hosted currently
  • Multiple domains hosted from that shell account
  • 10GB data transfer per month
  • Apache .htaccess files for the websites
  • Access to raw log files
  • (S)FTP
  • Python, Perl, PHP, CGI
  • 2 x cronjob
  • Mysql account
  • Webmail
  • 10 x POP mail accounts, 1 mailing list, web managed
  • POP3/IMAP mail access, web managed
  • A/C/MX, DNS configuration, web managed
  • Sales team that understands DNS
  • Online trouble ticket
  • Competence

None of those are negotiable. Is there any Irish host that does all this for individuals? It occurs to me I ought to start looking at a US provider as well. What I pay now gives me 300 USD to play with. Dollar to Euro rocks.

Bonus points for Subversion hosting.

Expect better

Data above the level of a single site is immensely valuable to people. If you're in the software business you'll know by now that data is the new platform. Which is why David Berlind's take on Microsoft's RSS is a bit disappointing:

"First, to have Microsoft come out and support RSS and not support the other syndication technology (Atom yes, I asked) doesn't bode well for Atom. Furthermore, members of the Atom community have discussed how Atom is designed to address subscription scenarios that RSS isn't well equipped to address. Well, before today's announcement, RSS was not very well equipped to subscribe to ordered lists (although you could technically fudge it like I do with something like del.icio.us and FireFox's Live Bookmarks). Now, by virtue of an extension, it is. Redmond RSS: Death knell to Atom? Birth of an 'open' era for Microsoft?"

David Berlind is capable of much better - I can only assume he's trying to stir the pot! So, some obligatory pushback.

First, I think most of us who have worked on Atom will see Microsoft's announcement around RSS support as a positive thing. Inside Microsoft, people like Robert Scoble and Dare Obasanjo should be pleased with this outcome. MS surely get the value of putting data into RSS and are committing to it not just for IE7, but all the way down into the Longhorn OS. Sam Ruby, who is the secretary of the Atom WG, is already looking to support the module in the feed validator. If Berlind is looking to incite a flame it won't work; arguing over formats is so 2003.

Second, if like me, you've worked on lists, it's a highly positive thing - they get that lists are the next low-hanging fruit after feeds. Here's a gedanken - on Monday, start writing a list of the data you have that is in list form. I bet, by Friday, you have a long list. Now imagine you can integrate all that data, remix it, aggregate it, search it, synthesize it, tag it. Technically speaking a list module is so simple, so trivial, that my best guess has been that we developer types just haven't bothered to do it. To get an idea of how trivial a list extension sounds, well, I suppose it would be like writing "The Computer" on your computer - absolutely pointless. Socially and commercially however having a unified list format that isn't buried in HTML is going to be insanely valuable. The problem with HTML lists is that you can't get at them, class them, tag them, most importantly, share them. Without the XML vocabulary, the data remains hidden inside all the other HTML gorp. Manipulating HTML lists is like trying to program with a picture of a computer instead of a real computer - there's only so much you can do.

Third, it's the XML, stupid. The list notation can go into any of the many RSS formats as far as I can tell. It's not being baked into RSS2.0, which is to say it's not an extension of RSS2.0 at all, it's just an XML module. At this point I'm tempted to debunk a particular confusion that reigns in the XML syndication world - the difference between modularity and extensibility, but that's for another day. Suffice to say the Simple List Extension, is an XML vocabulary, and not an extension of RSS. Indeed you can start using it right now without any RSS/Atom in sight. Let me repeat - you do not need to use RSS2.0 or Atom to use the Microsoft Simple List Extensions.
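To make the "it's just an XML module" point concrete, here is a minimal sketch in Python. The namespace and the element names (`list`, `item`, `ordered`) are invented for illustration and are not the actual Simple List Extensions vocabulary; the only thing being shown is that a namespaced list module reads the same whether it stands alone or sits inside an RSS 2.0 channel.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names, invented for this sketch.
# They are NOT the real Simple List Extensions vocabulary.
LIST_NS = "http://example.org/ns/list"

# The list module on its own, with no RSS or Atom in sight.
STANDALONE = f"""
<l:list xmlns:l="{LIST_NS}" l:ordered="true">
  <l:item>Atom format</l:item>
  <l:item>Atom Publishing Protocol</l:item>
</l:list>
""".strip()

# The same module dropped into an RSS 2.0 channel.
INSIDE_RSS = f"""
<rss version="2.0" xmlns:l="{LIST_NS}">
  <channel>
    <title>Reading list</title>
    <l:list l:ordered="true">
      <l:item>Atom format</l:item>
      <l:item>Atom Publishing Protocol</l:item>
    </l:list>
  </channel>
</rss>
""".strip()


def read_lists(xml_text):
    """Return every list in the document, whatever the host format is."""
    root = ET.fromstring(xml_text)
    lists = []
    # iter() visits the root element too, so a bare list document matches.
    for node in root.iter(f"{{{LIST_NS}}}list"):
        items = [item.text for item in node.findall(f"{{{LIST_NS}}}item")]
        lists.append(items)
    return lists


print(read_lists(STANDALONE))
print(read_lists(INSIDE_RSS))
```

The reader code only ever looks for names in the module's own namespace, which is why it does not care what the host document is.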

Fourth, the article keeps saying RSS, but is careful not to say which one, while Atom is "the other syndication technology" - this lest you not know it, is nonsense. Tech journalists following the space are surely obliged to know there's about 9 variants, 2 of which are Atomic. Consider that there is greater technical difference between RSS1.0 and RSS2.0 than between RSS2.0 and Atom. Then again, in the bowels of the software, it's possible to support all the formats in a single programming model. Bizarre, the fact that you can do so has been levelled as a criticism of Atom - almost to say, "If it doesn't break my code, what's the point!?". Some of us would consider non-breakage a benefit however (if you are non-technical consider such cries a pathology that come with being a developer - we can't help it).
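As a rough illustration of "a single programming model", the sketch below reads a tiny RSS 2.0 feed and a tiny Atom feed into the same list of dictionaries. The sample feeds and the `entries` function are made up for the example, and it deliberately ignores RSS 1.0, link relations, content encoding and everything else a real feed library has to handle.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Two made-up sample feeds carrying the same single entry.
RSS2_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>Hello</title><link>http://example.org/1</link></item>
</channel></rss>"""

ATOM_FEED = f"""<feed xmlns="{ATOM_NS}"><title>Example</title>
<entry><title>Hello</title><link href="http://example.org/1"/></entry>
</feed>"""


def entries(xml_text):
    """Normalise either format into a list of {'title', 'link'} dicts."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        # RSS 2.0: un-namespaced <item> elements, link is element text.
        return [
            {"title": item.findtext("title"), "link": item.findtext("link")}
            for item in root.iter("item")
        ]
    if root.tag == f"{{{ATOM_NS}}}feed":
        # Atom: namespaced <entry> elements, link is an href attribute.
        result = []
        for entry in root.findall(f"{{{ATOM_NS}}}entry"):
            link = entry.find(f"{{{ATOM_NS}}}link")
            result.append({
                "title": entry.findtext(f"{{{ATOM_NS}}}title"),
                "link": link.get("href") if link is not None else None,
            })
        return result
    raise ValueError(f"unrecognised feed root: {root.tag}")


print(entries(RSS2_FEED))
print(entries(ATOM_FEED))
```

Everything downstream of `entries` sees one shape of data, which is the non-breakage being argued for.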

Where Atom adds value is cleaning up how extensions will work, how content is to be encoded, what the content is, how to deal in particular with XHTML (can be messy), how entries are identified, links, how clients and server tools will interoperate, how feeds can be encrypted and secured - nitpicky things that waste a lot of time, are not much fun, and ultimately frustrate users and tool builders. You can support all the formats within a single program, but it's tricky to support all the formats across a variety of systems. Much of the value in Atom is ultimately social, not technical - Atom is an IETF technology, relatively safe from the personality wars which have plagued syndication technology.

And that's just the Atom format - the Atom Publishing Protocol, aka "how do I move this stuff around" is in no danger of demise. Indeed, some people think the publishing protocol is a key value add of Atom. It's one thing to support a variety of formats, it's quite another to support multiple posting protocols. You would not get to see the raw value and economies of scale of the Web today if every second server had shipped a custom version of HTTP.

This development of standard modules is where future innovation will occur - something the RSS1.0 crowd figured out years ago, and who still have the better technology in that regard, but only if you are prepared to programmatically manipulate it as RDF - otherwise it's just like all the others. For those working in the enterprise and integration space I think you will see Web Services modules come to be used or reinvented to target Atom/RSS structures. If there is a death knell anywhere in this story, it's for the SOAP envelope.

June 22, 2005

Holepunch

The comments I've been getting here recently have been extremely useful and insightful. A lot of them are from the US, and I'm in Ireland, which means there's a lag sometimes before they're made visible. I'm thinking about lifting the moderation trap and/or using captchas to let the good stuff pass straight through. There's a potential for spam (urk), which I hope a captcha can deal with initially.

One irritant about commenting is that I don't always remember where I've commented and I'm not organised enough to be keeping track. So maybe adding comment feeds is a better option - does anyone use them?

Scrapemonkey

Stephen O'Grady has had a script or two broken by a Gmail alteration:

"To be a bit less harsh, while Google probably had good reasons for making the change, it would have been great to see them be proactive and notify people of the change via their blog or some other mechanism."

I'm surprised there isn't more discussion about this. Greasemonkey is a cool idea, but these scripts are so fragile it's not funny; they make side-effected GETs look robust. The page DOM is not part of the API contract, which is the basis of Alex Bosworth's argument. In programmer terms it's like depending on the field names rather than the field values. You build on this stuff, you more or less commit to lockstepping with the server. It feels house of card-like. Someone might respond by saying that, well, I'm sure 404s seemed house of card-like 12 years ago when the Web dropped backlinking, or 8 years ago when XML dropped shorttags *. Fine, except I've not heard the Greasemonkey approach articulated in terms of a tradeoff to get adoption - anyone out there think DOM scraping is an architectural insight?
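Here is a small sketch of that fragility, in Python rather than a real Greasemonkey user script. The markup and the class names `subj` and `s2` are invented; nothing here reflects Gmail's actual pages. The scraper keys off a class name, the server renames the class, and the same data becomes invisible to the script.

```python
from html.parser import HTMLParser


class SubjectScraper(HTMLParser):
    """Collects text inside <span class="subj">; 'subj' is a made-up class."""

    def __init__(self):
        super().__init__()
        self.subjects = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        # The script's whole "contract" with the page is this class name.
        if tag == "span" and dict(attrs).get("class") == "subj":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.subjects.append(data)


def scrape(html):
    scraper = SubjectScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.subjects


BEFORE = '<div><span class="subj">Lunch?</span></div>'
AFTER = '<div><span class="s2">Lunch?</span></div>'  # same data, class renamed

print(scrape(BEFORE))  # the script finds the subject
print(scrape(AFTER))   # same message, but the script now finds nothing
```

Nothing about the data changed between the two pages, only the markup around it, and the script has no way to tell the difference between "no mail" and "the page moved".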


* and while we're at it, here's a strawman. Retrofitting backlinks dominates Web innovation - pagerank, wikis, tags, folksonomies, trackback, pingback, bloglines, del.icio.us, pubsub, technorati - enabling backlinking is what releases value. When people talk about building out social computing infrastructure, backlinking is also the basis for that.

June 18, 2005

Hitting reload is the framework job

I'm building a simple enough web app, to manage some project related data. Without claiming the ability to see around corners, I'm quite sure this app will grow over time because the data it's working against is nebulous and that will push for more and more views. The main decision so far has been to keep data in XML+RDF - for example there is project data in DOAP files and person details are in FOAF and bits of DC are scattered about, RIG chunks, and so on.

I'm telling myself I'm using RDF+XML because I want to be able to pull data in from anywhere. That's true, but to be brutally honest I can't be bothered designing and maintaining yet another relational schema for yet another webapp - doing so is starting to make as much sense as designing my own filesystem or TP monitor. Life's too short, too short to be working on technology that can only possibly make sense when you're dressed in combats and vans listening to Pearljam pretending it's still the nineties... there's a real wish to conduct oneself at a higher level of abstraction before complete dementia sets in. What's the point in designing tables for a webapp when an RDF-backed store will manage the data for you and RDF queries will come back as tabular data anyway? There are RDF triple stores that will handle on the order of 10^6 statements - Leigh Dodds is doing some research on that, up to 10^8 by the looks of things. If I need queries instead of hacking out iterators+filters I'll use versa/itql/rdql. Now, saying I never want to design another relational schema again is not to say I don't want to use a database. Most of these RDF triple stores are in fact using an RDBMS in the background, as the filesystem and indexer, it's just that the relational schema in use is not exposed to the application.
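To show what "RDF queries will come back as tabular data" means in practice, here is a toy in-memory triple store with a pattern-matching query, written in plain Python. It is a sketch only: the `ex:` resources and the project names are invented, the `query` function is not versa, itql or rdql, and a real store would index the triples rather than scan them.

```python
# A toy triple store. "ex:" names and project titles are invented;
# doap:name, doap:maintainer and foaf:name are used as plain strings.
TRIPLES = [
    ("ex:proj1", "doap:name", "Feedtool"),
    ("ex:proj1", "doap:maintainer", "ex:alice"),
    ("ex:proj2", "doap:name", "Listkit"),
    ("ex:proj2", "doap:maintainer", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
]


def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, select):
    """Join the patterns over TRIPLES and return rows of the selected variables."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in TRIPLES:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return [tuple(binding[var] for var in select) for binding in bindings]


rows = query(
    [
        ("?proj", "doap:name", "?project"),
        ("?proj", "doap:maintainer", "?person"),
        ("?person", "foaf:name", "?maintainer"),
    ],
    select=["?project", "?maintainer"],
)
for row in rows:
    print(row)
```

The result is a list of rows with one column per selected variable, which is the same shape a SQL query would hand back, without a table ever having been designed.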

Can't say I'm too fussed about having a nice object model for the domain either. Yes, it's heresy not to have an object model for the domain - out of the corner of my eye, as I write this, I can see that Eric Evans' book is trying to wriggle off the shelf and wallop me upside my head.

Other than not using an RDBMS directly, and not being too fussed about objects, I wanted these capabilities:

  • Login+sessions
  • Easy XML out
  • Easy URL design
  • Easy URL/action mapping
  • Easy Atom subscriptions on views
  • Save query as bookmark
  • Save filters as bookmark
  • Provide adequate opportunity to look at some frameworks

I didn't care much about:

  • Protecting graphic designers from code
  • Protecting coders from user interfaces
  • Planetary class scaling
  • Shopping carts
  • Pet store transactions

More heresy - no doubt this project will be a disastrous conflagration of worst practices. I can't wait.


Once you know what the web app is about, you then have to decide what to write it in. That alone is a research exercise. I looked at RoR and I don't know, maybe a dozen Java and Python frameworks.

RoR: RoR is really nice, but a lot of the value is tied up in hooking into an RDBMS schema via ActiveRecord. I'm not using one of those, I'm using XML+RDF, so that takes me off The Golden Path. And even the RoR guys will tell you that you want to stay on The Golden Path. Maybe I could write an ActiveGraph (Redland RDF has Ruby bindings), but who knows what else would get rewritten by the by - a lot of RoR magic is in mapping the DB - when that's gone a lot of value goes with it. Please yes, I got past the demo and understand that scaffolding is only one part, I just don't understand what the value beyond that is. In RoR, The Golden Path is the system value. [update: Bruce D'Arcus pointed me at Obie Fernandez' musings; seems like there's some pent up demand for RDF on Rails]

Python: The state of the web frameworks in Python is nearly as confusing as Java, no small feat. There's lots of 'em and I have no good sense where the community interest really lies. If you are doing CMS, Plone wins hands down, but I'm not doing a CMS. Zope's a parallel world and I'd have to get around zodb for RDF. Twisted is fine for building servers, but pushes too much back onto an app developer even with Nevow. CherryPy I'm still playing with, it looks nicest so far, and feels closest to a 'done' thing. Greg Wilson has also noticed this excess in the Python world recently, and he thinks the Python community needs to get on message - that would not be my conclusion. Python is also overflowing in templating languages, of which Clearsilver, Tal and Cheetah are notable - tho' I'd like to try out Kid - Ryan Tomayko cracks me up.

Java: I know the Java space better than Python. Struts is out for reasons of verbosity and sanity retention - there's that XML config file format (if you need a graph, use a programming language or RDF). By the same criteria, JSF is also out, never mind it's unproven, insofar as the answer to MVC on the Web is not necessarily even more MVC on the Web. Tapestry looked interesting but it's squarely targeted at HTML output and stopping code monkeys and graphics monkeys squabbling over who gets which bananas. The latter is a non-requirement and the HTML only thing I don't entirely get, even though I gather Tapestry is very focused on non-programmers. I read Howard say somewhere once he doesn't believe in multi-site output - maybe I've been too long in the Atom world but it seems to me publishing as Atom/RSS is becoming a requirement. Spring webflow doesn't seem to offer any value for a two tiered web app where objects are going to be incidental by design - Spring also has that graphlike XML config going on. Webflow by the way, is the one part of Spring which I gather has stated it is not targeted for simplicity - instead it's for complex application flows. The rot starts there.

I've complained about Java web frameworks before, especially this obsession with MVC, but much of the issue is with Java, not the framework designs. There are too many steps involved in deploying a servlet app, or making changes to a running one, or updating half-a-dozen XML files - it's ridiculous. All I want to do is edit and hit reload. Letting me hit reload is the system job. To get there means scripting support.

(Non) Conclusion: RoR's database dependency and Object MVC obsession isn't working for me. Of course if you read the 37Signals weblog (and you should) your conclusion could be that I'm approaching this entirely wrong and I should start with a user interface as the requirements and work back. But you know, slick UIs are like nice shoes - awfully sexy but awfully transient - you put them on and still no-one sees the real you. Whereas data will set you free. Less surreally - I don't care what the UI is made out of as long as it's not arduous and doesn't dictate how this data is going to be structured.

CherryPy is the closest thing to a decision I've settled on in the Python world, but I'm really not sure. Not at all. [update: I am getting a ton of feedback on the Python side of things, enough to have me going back to reassess as I clearly don't have enough of a clue]

The best option for Java seems to be WebWork (one of my favourite web applications is built on it). WebWork is a big improvement over Struts and the Java world is really missing a trick by going straight to JSF. The main problem is setting things up for scripting, to get out from under the compile/deploy/load loop. PyServlet comes with Jython, but that leaves things configured backways, by putting the Jython at the front of the processing chain; whereas it would be better to script the action code to be invoked by a framework. Then the framework is dealing with skinning, sessions, stringtables and the like. I'm thinking about embedding a JythonActionSupport into WebWork so it can call out to .py files implementing the ActionSupport interface. WebWork uses proxy generation to create and fill out your action objects, hopefully there won't be problems integrating that magic with Jython.
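The shape of that idea, with the framework at the front and a script file at the back, can be sketched in plain Python. This is an analogy only, not WebWork or Jython code: `run_action`, the `execute(params)` convention and the file name `hello_action.py` are all invented for the example. The dispatcher re-reads the script on every call, so editing the file and "hitting reload" is all it takes to see a change.

```python
import os
import tempfile


def run_action(path, params):
    """Stand-in for the framework: load the script fresh, call its execute().

    Re-reading the file on every request is slow but gives edit-and-reload
    behaviour with no compile or deploy step.
    """
    with open(path, encoding="utf-8") as handle:
        source = handle.read()
    namespace = {}
    exec(compile(source, path, "exec"), namespace)
    return namespace["execute"](params)


with tempfile.TemporaryDirectory() as tmp:
    action_path = os.path.join(tmp, "hello_action.py")

    # First version of the action script.
    with open(action_path, "w", encoding="utf-8") as handle:
        handle.write("def execute(params):\n"
                     "    return 'Hello, ' + params['name']\n")
    first = run_action(action_path, {"name": "world"})

    # "Edit" the script, then make the same request again.
    with open(action_path, "w", encoding="utf-8") as handle:
        handle.write("def execute(params):\n"
                     "    return 'Goodbye, ' + params['name']\n")
    second = run_action(action_path, {"name": "world"})

print(first)
print(second)
```

Sessions, skinning and the rest would live in `run_action`'s side of the fence; the script only has to know about its parameters.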

Generally, the web frameworks situation is depressing unless you're a researcher. If someone thinks there's a framework I should be looking at, shout. Deparalysis imminent.

June 17, 2005

Heaven - 111

Wow

I Feel Lexish

Steve Loughran:

"So you cannot say that code-in-XML is inherently wrong, as there is clearly a two-way transform at the syntactic level from an XML document into a scheme clause, and since scheme is an elegant and powerful language, so there is an elegant and powerful way to work with XML, within a representation of the document itself."

Whether data is code or not depends on what's looking at it. Without an evaluator, code is just notation, just like XML is. What might be wrong are embedding other code notations in XML, or if you are a protocols extremist, embedded verbs, because the verb is being uttered elsewhere through the protocol, and you know, make your furiously colorful sleeping green mind up already. One reason Lisp Kicks Computer Science Ass is because the notation lends to extensible evaluation of the evaluator - when smug Lisp weenies talk about extensibility, they are talking up different stuff than smug middleware weenies who think plugins and namespaces are big woop *.

Steve might want to look at JSON if he hasn't already. Yes, Lisp parens are an optimal way to inscribe tokens for evaluation, and are deeply beautiful in that Mandelbrot kind of way (oh! fractals! shiny!), but JSON is less likely to freak javascript/css/java people out. JSON has that worse-is-bettery feel to it.
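A small sketch of the notation versus evaluator distinction, using JSON arrays as the paren-free stand-in for an s-expression. The `evaluate` function and its two-operator table are invented for the example. Parsing the text yields nothing but a nested list; it only becomes code when something chooses to evaluate it.

```python
import json
import operator
from functools import reduce

# Just notation: a JSON array that happens to look like (+ 1 (* 2 3)).
NOTATION = '["+", 1, ["*", 2, 3]]'

# The evaluator decides what the symbols mean. Only two operators here.
OPS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Treat a list as (operator, args...), anything else as a literal."""
    if isinstance(expr, list):
        op, *args = expr
        return reduce(OPS[op], [evaluate(arg) for arg in args])
    return expr


data = json.loads(NOTATION)
print(data)            # ['+', 1, ['*', 2, 3]]  (plain data, nothing has run)
print(evaluate(data))  # 7
```

Swap `OPS` for a different table and the same notation means something else, which is the sense in which the confusion is between the evaluator and the notation.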


* I fear that this will result in reanimating the XML v Lisp permathread from xml-dev - like the big one, it's due anytime. That particularly inane permathread's source of energy is the confusion between the evaluator and notation.

June 15, 2005

By the time they get to version 3

Microsoft are doing that Version 3 thing again. This time it's with BizTalk. Matt Milner has written up a detailed overview of the architecture, which will be of interest to messaging techheads.

Better living through television

Darragh: "RTE has moved into the 21st century, and has got RSS feeds."

The feeds are here. Maybe the Irish Times will follow suit.

June 14, 2005

Writing code for others that use it

Greg Wilson has written an op-ed piece for DDJ, Selling Open Source. He's calling out the fact that Python has too many web frameworks, and suggests the community needs to get its act together. He does ask the key question about the Python community's motivations - why should 'they' bother to rationalise the situation? The counter-arguments are supplied:

"The answer two-fold. First, as long as development effort is spread out over so many competing projects, none of them will reach the critical mass needed to build something that can hold its own against PHP, Mason (in Perl), or RubyOnRails, which in turn weakens Python as a whole. Second, the fact that the Python community can't get its act together and speak with a single voice in such a key area sends a disturbing message to outsiders who are used to more established ways of doing business. "

I don't know, the idea that people who write open source software need to get their "act together" for the benefit of those who choose to use it in a business context I find hugely problematic. That open source software is some kind of entity which ought to act in the best interests of those that do not write the code, or is engaged in a war with commercial code offerings, are myths, and in the worst case acts to foster a bogus entitlement culture around open source. It's your choice to use open source software, period. No-one working on OSS is obliged to get their act together to simplify your decision making processes or make something palatable to existing business structures - which is not to say some projects are not focused on that. If you are using OSS code to do some lifting in a commercial IT scenario, then you need to do your due diligence and put some thought into the ramifications of introducing it, just like you would with any bespoke or commercial offering. The need for sound management and clear thinking around IT are not obviated by OSS.

"Think about it: If a new language appeared tomorrow, and had five different, incompatible regular expression libraries, what would you think about its stability, and likely longevity?"

So, there's an actual example of this happening - Java! Java had a few active regular expression libraries for years as the JDK didn't ship with one until JDK1.4. So, no standard regex library in Java for about the first 6 years, give or take, and it didn't seem to hurt the language's longevity. The web frameworks situation in open source Java is probably 'worse' by Greg's criteria - our collective back teeth are floating in Java web frameworks. Struts dominates the Java IT enterprise marketplace but is by no means considered ideal and is effectively being phased out to make way for JSF.

"Bringing any one of them up to "click and run" standards wouldn't be rocket science."

The piece is arguably a selective reading of the situation - a lot of people who read Dr. Dobbs won't necessarily be up to speed with what's actually going on in the Python world and will go away with the wrong idea. Greg could go back and take a second look at what's happening in the Python web world where there is community coordination - Plone, MoinMoin, Zope. I think he'd be impressed. Plone is unsurpassed in the click and run department for a CMS. MoinMoin is instantly useful as a wiki, and has a great plugin model. Zope can be complicated, it certainly shows its evolution in some places, but there's not much you can't do with it as an app server - for example Plone is built on it. A follow-up piece focusing on those systems would provide a more balanced view for DDJ readers.

The real problem isn't that the Python community needs to get its story straight on the web frameworks front; it's that there is a lot of choice which makes demands on people's attention, and that's not just a problem restricted to the Python world, or even the open source world.