" /> Bill de hÓra: June 2003 Archives

« May 2003 | Main | July 2003 »

June 30, 2003

The Highlander Protocol

Burningbird: The Echo Project For Poets

All weblogging tools and perhaps peripheral tools would support a common API. This means, at a glance, we could post trackbacks to all weblog posts regardless of tool. But more, this also means that you could use any weblogging tool front end to post weblog posts to any weblogging back end. This opens the door to a new set of tools, as well as new technologies to work on top of them -- audio/video posts, posting from email, posting from your phone, and so on.

Well if Echo can settle on a data model, the common API can be HTTP, or any app transfer protocol. Really, what protocols do you need to blog other than NNTP, SMTP and HTTP? SMS you can gateway onto HTTP. And I think all you really need for (track|ping)back is a new header - Trackback-URI. Paul Prescod invented something like it last year for reliable messaging.
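The Trackback-URI header is this blog's suggestion, not a standard, but here's a minimal sketch of what it might look like riding on an ordinary HTTP POST, using java.net.http; the header name and URLs are illustrative:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: "Trackback-URI" is the proposed header, not a registered one,
// and the URLs are made up.
public class TrackbackPost {
    public static void main(String[] args) {
        HttpRequest post = HttpRequest
            .newBuilder(URI.create("http://example.org/weblog/entries"))
            .header("Trackback-URI", "http://example.org/weblog/2003/06/some-entry")
            .POST(HttpRequest.BodyPublishers.ofString("<entry>...</entry>"))
            .build();
        // nothing is sent here; the point is that trackback needs no new
        // protocol, just one extra header on a request you'd make anyway
        System.out.println(post.headers().firstValue("Trackback-URI").orElse(""));
    }
}
```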

This is the biggie. This is the grand banana. This is where the rubber hits the road. Here's a scenario for you: you're on a road trip and you're writing to your desktop weblogging front end tool (someone else is driving, we hope). You put the post in send mode and when your computer finds a passing WiFi signal, quick as can be, your entry is posted. Or don't bother with the computer, post with your cellphone. Or find a kiosk at a rest area along the way, and send a post, annotated with your map location.

Cool idea, but what does an API really give you here?

[via Danny]

Jakarta in a Nutshell

Now that'd be a best seller...

XML Namespaces - it continues

Raw Blog: more namespaces

Danny's picked up on the namespaces post:

How do you validate <blah.blah.net.title> style namespaces? Won't it mean (almost) arbitrary element names are possible?

We can ask the same question about QNames - it's no harder than validating <blahblah:title>. Arbitrary element names are already possible; what matters for XML validation is that you recognize the XML - that's not quite the same thing as keeping vocabularies separate.

It's also not entirely clear what might be gained from using an alternate system.

Having used all kinds of styles for name partitioning, I believe the cost of processing XML+Namespaces is higher than using straight XML along with a technology that layers atop XML, not one that changes it. For that reason alone XML Namespaces are an architectural wreck. What's to be gained? Cheaper and simpler processing where partitioning names is important.

Worrying about names is a bit like worrying about typing. Yes, bad things can happen with names or types, but that doesn't mean you must bake protection into the language or its interpreter; there are other approaches.

Personally I'm not exactly in love with XML Namespaces, but they do seem to work very well in the context of RDF/XML, where multiple vocabularies are more the rule than the exception.

But they don't work well, or even properly. QNames are a fit with RDF/XML because RDF uses URIs as identifiers. If you want to transmit RDF in XML, or harder, if you want your XML to be transliterated into RDF, then you need a way of moving between URIs and XML element names. RDF in XML was a use case for namespaces way back when, so you'd have to think it would be useful. But RDF/XML's requirements (roundtripping URIs through XML) remain an edge case for XML in the large (at the moment; I hope that will change). Plus it seems to be the case that QNames just don't cut it syntactically or semantically - Patrick Stickler has spent a lot of time thinking and expounding on this topic, and I don't think anyone's been able to refute him to date. XML Namespaces don't entirely meet RDF's requirements and can't be justified on that basis.

[btw, it's great to see people thinking about this stuff, instead of the usual chant of Namespaces Good]

June 29, 2003

RSS2.0 sans XML Namespaces

The Fishbowl: Namespaces and RSS2.0

Charles Miller picks up on my namespaces post, and gives a good breakdown of the need for namespaces. I'm going to try to get around that by agreeing with him on policy (namespaces), but disagreeing on mechanism (XML Namespaces).

There are two ways to solve this. One is that you have some central authority with whom to register extensions, who ensures that everyone plays nicely with each other. For RSS, that central authority would have to be Dave Winer. This kind of centralised authority didn't sit well with the RSS community, which wanted the ability to go off in different directions without permission.

This would not be good, even though it's worth recognizing that there aren't many non-centralized mechanisms for naming things - GUID generators are a good example. Most XML Namespaces are based on HTTP URLs which ultimately require domain names. Domain names are centralized to a degree, but this time it's ICANN, not Dave Winer. At least you only have to do this once a year.

The other is with namespaces. Namespaces are defined by URI, so their uniqueness is ensured in the same way as Java package names. *

The third unstated option is to make your element name sufficiently unique. To do this, you can simply use fully qualified domain names to wrap your vocabulary or element.

Taking Charles's lastUpdated example, we could deal with that in a few ways. We could agree that two or more lastUpdated tags do in fact have the same meaning, but that their encodings are different. An attribute qualifier would be useful there. That could be an xsd type, a mimetype, a URI, or simply a string. The point is that in this case it's not the element that needs to be uniquely identified, it's the encoding of the element content that needs identification. Another way is to tag the element with a fully qualified domain name.

Here are some options:

  <lastUpdated vocab="net.amadan.rss-ext">

I think it's extremely unlikely that you're going to clash with anyone else here.

So yes, if you have full control over the schema, namespaces are unnecessary. If you have lots of people trying different things without deferring to a central authority, namespaces are necessary to stop them stepping on each others toes.

I disagree with Charles here - XML Namespaces are just one of many options. Though I could have been clearer in the other post. I don't question the need to partition names, but I do question the use of XML Namespaces to achieve that.

[* not strictly true. Some URIs, ie the urns, don't require a domain name]

XML technologies: use these, throw the rest away

  • XML
  • XPath
  • RNG
  • XSLT
  • XQuery *
  • Unicode
  • URLs

* tentative: this will come with a lot of gorp and needless complexity, but when you have gigabytes of markup there aren't so many options.

Because it's there


No-one, as far as I know, has stood on his or her head, at the South Pole, thus becoming the first non-god to bear the weight of the firmament.


June 28, 2003

Running to stand still

Is it getting better

What Andy said. I've never been busier and I'm not about to complain about it. Most people I know in software have never been busier. Make no mistake, it's a tougher market now. The work is there. However it's all fixed price, the deadlines are highly compressed, and the customer changes her mind with impunity. There is plenty of room for the smart and the efficient, who push hard, make things happen, and can ship the good stuff.

One of the interesting things about being too busy in a recession is maintaining quality of work.

We're at the polar opposite of the situation during the bubble. Three years ago quality went out the door because the industry realized it could dictate terms utterly and didn't have to worry about doing good work - just where was the customer going to go to get the good work from anyway? Today quality can be argued away because margins are tight and people can't expend the effort to do good work. The temptation is there now to ship low grade work, because it is all fixed price, because the licence revenues are plummeting or non-existent, because you can't churn your customer base and because you are working damn hard to make margins. How can we possibly produce good work when we're running to stand still?

It's always tempting to sacrifice quality at the economic extremes.

This is a mistake. A death spiral. I think it's never been more important to ship high grade work, because I'm convinced that in this market you have to leverage quality as a competitive weapon for you and your customers. Quality needs to go through the roof. The real job in this market is not educating customers that they can use IT to save money, but that they have the chance, right now, to get the best quality of implementation around the best organisational principles for adaptable, long-lived systems, if they are willing to work with organizations and people, like yours, who care about doing the good work and will not mess them around. This is what we do at Propylon. Thoughtworks, Atlassian, IntelliJ, Orion, Zope, to name just a few, are thriving on quality. I bet Andy Oliver's referrals go to people he trusts to do the right thing by him and the customer. Crosby and Deming are right - quality is free and will pay for itself over and over.

How to produce quality software at speed is not a mystery - look to open source, XP/IXP, and Agile RUP for process styles which get results. All this will pressurize business models or processes based on upfront requirements, change control, solution sets driven by licensing revenue, or deliverables that aren't working software. That's 90% of services and systems integrators and the product companies that supply them, right there.

The best way to make it better is to make it better.

Stupid user admits inability to install java 1.4.2

I got a few comments saying the option is there. My immediate reaction was to flame the fools and throw a tantrum - lucky I didn't.

I found the location option on the fourth attempt, under the pane that lists what's being installed. And fourth time around, there's no reboot.

Clearly, I'm losing it. But thanks to everyone who politely told me it was there!

[Later: 276 hits from javablogs for the other post so far. I feel sick...]

Namespaces: what am I missing?

Extending RSS 2.0 With Namespaces

It's obvious that his definition of <banking> (how a plane moves through the air) is different than your definition of <banking> (what bank you place transactions at). This is an example of "breaking your father's back" - since you didn't put your toys in your toy chest, problems are occurring. How are your readers supposed to know which <banking> is being talked about? Airplanes or financials? That's where namespaces come in.

The proposed solution (sans xmlns declarations):

  <title>A simple example</title>
  <description>A simple example</description>
  <airplanes:banking>Turn the rudder 45 degrees to the left.</airplanes:banking>
  <financial:banking>Bank of Montreal</financial:banking>

The conclusion:

The benefit for a piece of software is immediate: it now knows that one sort of banking is different than another sort of banking. There was no way it could tell this without namespaces. Note that I've used meaningful names above, but you don't have to - you could easily have used <blah:banking> and <flintstones:banking>. To software, the difference would still be relevant, but it merely makes things more confusing for human readers of your RSS feed.

Yes, it's obvious that banking is different from banking. No, I never see standalone XML tags like banking in the wild; they're always part of a vocabulary. If I did see a lone tag in the XML wild, my first reaction would be to ask why is that thing there? - not, to paraphrase Marcus Aurelius, to ask what it is in itself. And certainly not to make assumptions about what computers do and don't know.

Such tags are called children for a reason, they belong with their parents. It's this kind of noddy example that perpetuates the myth that namespaces are somehow necessary for XML in the same way <getStockQuote /> perpetuates the myth that RPC is somehow necessary for SOAP. They're not.

As for the do's and don'ts: forget about human readable prefixes. Prefixes are incidental creatures and can't be guaranteed to roundtrip or be preserved through a series of XML engines - unless you're using XPath, prefixes are a syntactic hack that arose because XML can't embed URIs in element names.

Using namespaces, it seems you can dodge name clashes. I'm saying, when you're ready, you won't have to.

CLR Reliability


Let's forget about managed code for a moment, because we know that the way we virtualize execution makes it very difficult to predict where stack or memory resources might be required. Instead, imagine that you are writing this guaranteed forward or backward progress code in unmanaged code. I've done it, and I find it is very difficult. To do it right, you need strict coding rules. You need static analysis tools to check for conformance to those rules. You need a harness that performs fault injection. You need hours of directed code reviews with your brightest peers. And you need many machine-years of stress runs.

Very informative.

[via Ted]

XML events: not that difficult

Carlos nails it. A practical difficulty with handling events is in storing state and knowing your current context.

There's a nice way around this in SAX. First map the behaviours you're interested in to XPaths. Then as you receive events, dynamically build an XPath that states your current document position. You can use that path to select a behaviour.
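A rough sketch of that trick in Java SAX - the names are mine, not from Carlos's post: keep a stack of element names, and on each startElement rebuild the current path, which you can then match against the XPaths your behaviours are keyed by.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Track the current document position as an XPath-like string while SAX
// events arrive, so a behaviour can be selected by path rather than by
// juggling context flags.
public class PathTracker extends DefaultHandler {
    private final Deque<String> stack = new ArrayDeque<>();
    public final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        stack.addLast(qName);
        seen.add(currentPath()); // look up a behaviour for this path here
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        stack.removeLast();
    }

    public String currentPath() {
        return "/" + String.join("/", stack);
    }

    public static void main(String[] args) throws Exception {
        PathTracker t = new PathTracker();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader("<rss><channel><title/></channel></rss>")), t);
        System.out.println(t.seen); // [/rss, /rss/channel, /rss/channel/title]
    }
}
```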

Foundations for component and service models

[This post was inspired by the article Evolving Java-based APIs , and how misguided its principles would be for the web or components. It's biased toward Java, but the principles are applicable beyond Java]

What's the problem

The core problem of component and service architectures is easy to express: how do I change a published interface without breaking the callers?

Java best practices won't help

There's no nice way to say this - idiomatic Java is not good for component or services architectures. Which might suggest that idiomatic Java has no business either on the Web or in component architectures. Please note the use of the word idiomatic - we have a lot of learning to do in a short time. But there are a few ways to mitigate things - by looking at what other programming languages have done, and in particular, by looking at what are arguably the best existence proofs we have of loosely coupled component architectures - Internet protocols.

Avoid changing or extending the interface methods

Use highly generic method calls that are qualified with metadata - this is how SMTP and HTTP work and is often cited as a reason for their phenomenal success. If you're lucky, the remoting you happen to be working with will be organized this way. You'll notice that this is the polar opposite of models such as the EJB spec, which actively encourage you to pile on the methods and go n^2.
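As a sketch of what "generic methods qualified with metadata" might look like in Java - the names here (Message, Handler) are illustrative, not from any spec or framework:

```java
import java.util.Map;

// Sketch of a uniform interface in the HTTP/SMTP style: one generic verb,
// with behaviour selected by metadata rather than by piling on methods.
public class UniformInterface {
    // a message is just headers (metadata) plus a body
    public record Message(Map<String, String> headers, String body) {}

    public interface Handler {
        Message handle(Message request); // the one published method
    }

    // contrast with an n^2 method pile-up: clients only ever learn handle()
    public static class EchoHandler implements Handler {
        public Message handle(Message request) {
            String action = request.headers().getOrDefault("action", "echo");
            return new Message(Map.of("status", "ok", "action", action), request.body());
        }
    }

    public static void main(String[] args) {
        Handler service = new EchoHandler();
        Message reply = service.handle(new Message(Map.of("action", "post"), "hello"));
        System.out.println(reply.headers().get("action") + ": " + reply.body());
    }
}
```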

The thing to realize is that every time we add a non-standard method to an object call, we force a cost on all possible clients of that object to understand that method. The cost of integration rises as a function of dependent methods in the system, not dependent objects. This only ever makes sense if you own all the endpoints (as is often the case with J2EE projects starting out). It makes no sense if you don't, and over the life of the system it will make less and less sense. Count the number of published method names in all the third party APIs you use in your code to get an idea of how tightly coupled you are. Then count the number of published methods in your own APIs to get an idea of how tightly someone can get coupled to yours.
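You can get a crude version of that count with reflection - this is just an illustration of the metric, not a real coupling tool:

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Count the distinct published (public) method names of an API type -
// the rough coupling surface a client can bind to.
public class MethodSurface {
    public static long publishedMethods(Class<?> api) {
        return Arrays.stream(api.getMethods())
                     .map(Method::getName)
                     .distinct()
                     .count();
    }

    public static void main(String[] args) {
        // a uniform interface exposes one name; a rich API exposes dozens
        System.out.println(publishedMethods(Comparable.class)); // 1
        System.out.println(publishedMethods(java.util.List.class));
    }
}
```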

In systems based around services or components, this cost can quickly get out of hand, compounded by the fact the methods are not usually constrained by a protocol or exchange pattern - the semantics are not uniform. This is why RPC based web services require choreography and why protocol neutrality is an architectural defect, not a feature, of web services.

The issue with Java (or C#, or C++) is that if you're like me, you've been conditioned to think in terms of objects, not methods. So when it comes to determining how coupled the system is, you're liable to look at package or object level dependencies, but the real damage is happening at the method call level. For example, the JDepend tool calculates coupling based on static imports, not method calls.

Control change by using a dictionary interface

If the idea of a once and only once component interface sounds impossible, then there is a working compromise. Design the interface as a dictionary (ie prefer a Map over a Bean). I've seen Python and Lisp code that does this well - both have good support for meta-class hacking, and it's sometimes called data-driven programming in the Lisp world. Java can do this by using method names as map keys and using reflection to invoke the method against its object.

However Java has made this approach difficult. To do it, functions really ought to be first class elements of the language so we can bind them to names and pass them around as arguments. The reflection API goes some way to addressing that as I mentioned, but it's a complicated kludge by comparison to what's available in other languages. By the way the only java API I know that's designed this way is JSR-187, but it's stalled at the moment. The Command pattern or Plugin lifecycle APIs are a step toward the idea of a uniform object interface and anyone who has worked with these patterns will understand their value (imagine programming against Eclipse without the plugin API) - the problem is their uniformity is only local to an implementation. The API that has gotten the furthest in this approach is JavaSpaces, which standardized the programmatic interface to a tuple space (think of it as a poor man's Internet protocol).
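Here's a minimal sketch of the Map-plus-reflection style in Java; the Dispatcher and Greeter names are made up for illustration, and a real version would need error handling for missing keys:

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// A dictionary-style interface: clients dispatch by name through a Map
// rather than binding to method signatures at compile time.
public class Dispatcher {
    private final Map<String, Method> commands = new HashMap<>();
    private final Object target;

    public Dispatcher(Object target, String... names) throws NoSuchMethodException {
        this.target = target;
        for (String name : names) {
            // bind the method name once; callers never see the signature
            commands.put(name, target.getClass().getMethod(name, String.class));
        }
    }

    public Object call(String name, String arg) throws Exception {
        return commands.get(name).invoke(target, arg);
    }

    public static void main(String[] args) throws Exception {
        Dispatcher d = new Dispatcher(new Greeter(), "greet");
        System.out.println(d.call("greet", "world")); // hello world
    }
}

class Greeter {
    public String greet(String who) { return "hello " + who; }
}
```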

Calls should return documents not objects

There isn't much point in having a coarse grain component or services architecture if all you do is use them to gain pointer access to fine grained objects. Again this is a failing with the bean/dto style best practices of J2EE, which have arisen as an optimization technique for calling over RMI-IIOP rather than through analysis of how to control change between tiers. Thankfully web services architects have been back-pedalling away from RPC to doc/lit over the last year.

Avoid binary compatibility

I can't emphasize this strongly enough. Any system that requires binary compatibility is non-scalable in terms of adoption, management and change responsiveness. I submit J2EE's single biggest architectural failing is forcing clients into binary compatibility and thus, lockstepped upgrades. To be fair, .NET is not much better.

Web services are in danger of regressing to the same situation. This week, SOAP 1.2 has gone to recommendation status by making SOAP an application of the XML Infoset, not XML. This is a bug not a feature. Previously SOAP 1.1 was an application of XML. With XML, you could at least read my data with off the shelf parsers - once we agreed on the data. If I start sending you a non-XML stream in the guise of Infoset, you'll need a decoder to make the stream emit Infoset events - this means we need to agree on codecs on top of data agreement. You also need the decoder that works in your system, not mine - since what works in my system doesn't matter a damn to you. Naturally the pair of codecs need to be tested together - but before all you had to do was parse my XML. It only takes a handful of these codecs to derail interoperability.

At this point SOAP is no longer the interoperation hotspot. What will matter are the codecs needed to unpack an Infoset. At best this is interoperation at the level of RMI-IIOP, DCE or DCOM. At worst this is interoperation at the level of device drivers. You'll also note that it's easier to interoperate using codecs that are from a single source - this leverages the codec provider to lock in the users of codecs. In fact, given what we know to date about building distributed systems (not designing them, building them), that's very much the point. The W3C have made a mistake in allowing that change to go through.

Don't confuse an API with a contract

Don't compose contracts from APIs, compose them from protocols delivering data. The key point is that an API is insufficient means to agree a contract, at least in normal usage. Contracts are better modelled as an ordered exchange of documents.

Version the contract

Versioning is something we're not good at in the industry. JAR hell is little or no improvement over DLL hell. This is a topic I'll go into in greater detail another time, since it is so problematic. But in my experience, at the level of components and services, the right approach is to version the protocol and shared data format, and not the API signatures (how you version that is local to you). Seeing @deprecated in a component's published interface is an indication something has gone wrong. Again this is a good argument for only publishing highly generic methods.

Don't build an API for data transfer

This is subtle since there's no clear boundary between transfer and invocation (transfer versus invocation is arguably the point where the REST and SOA styles depart). But if your component's or service's job is essentially to shuttle information rather than perform computation or do work that involves significant state management, you may be better off placing the component on the web and integrating clients via a transfer protocol such as HTTP. Amazon's REST style API is a good example.

As another example, there's probably no reason to build a weblog API via something like XML-RPC or SOAP, given the ubiquity of HTTP and the fact that every blog on the planet is web accessible by default, unless you're angling to lock clients into your service by stealth.

June 27, 2003

G$$gle's new toolbar

Google Toolbar 'BlogThis' Rankles Rivals

Whatever, I use Mozilla :)

J2SE 1.4.2 installer nonsense

Nevermind: go here instead

Just installed 1.4.2 with the install shield. And immediately uninstalled it.

  • I wasn't asked where to install to.

  • I have to restart my computer to complete the install.

Oh dear. Back to 1.4.1 then.

[Later: I should have said - I'm using the offline installer]

June 25, 2003

Only Forward


When there's a syntax, I'll hack this.

June 22, 2003

Complexity is Free

Big Picture of the XML Family of Specifications by Ken Sall

How not to use equals in an if{} block

kasia in a nutshell: The abuse and over-use of toString()

The scary part is how many experienced programmers have no clue about this little gotcha.. go spread the word.

Yikes. And there's the following related anti-idiom:

  public void foo(Object whatever)

Every time I see this, it breaks my heart.
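For what it's worth, here's a small illustration (mine, not kasia's) of why that signature plus toString() comparisons goes wrong: unrelated types that happen to print alike get conflated.

```java
// Sketch of the anti-idiom: accept anything, then compare by its printed
// form. The method and values are made up for illustration.
public class ToStringTrap {
    // the anti-idiom: an Object parameter plus a toString() comparison
    public static boolean isAnswer(Object whatever) {
        return whatever.toString().equals("42");
    }

    public static void main(String[] args) {
        System.out.println(isAnswer("42"));                // true
        System.out.println(isAnswer(Integer.valueOf(42))); // also true: a String and an Integer are conflated
        // a typed signature, e.g. isAnswer(int n), would have made the
        // intent explicit and caught the mismatch at compile time
    }
}
```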

June 21, 2003

Alzheimer Oriented Architecture

Standards: Doomed to Repeat Itself?


Compose yourself

Ken Arnold Critiques Ant

The problem with Ant is that it violates something we learned with Unix. Tasks aren't composable.

Two ways out of this problem.

1: Use piped i/o and interoperate via data not functions. That's what UNIX did and in Propylon that's exactly how we handle XML process/transform composition for our customers (with PropelX).

2: Use a language or protocol that supports functional composition. Lisp or Scheme achieve this by modelling tasks as functions, and HTTP by modelling tasks as state transitions (though you could argue that HTTP is a very weird pipeline). And Python is close (as Ted points out).

Tasks aren't composable in UNIX due to C, and not composable in Ant due to Java. Of course you could design a task language to be composable and use C/Java to write the interpreter, but I think the result would look so much like Lisp, it'd be easier to cut to it and write a Lisp interpreter in C/Java instead.
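To make the contrast concrete, here's a toy sketch of tasks-as-functions using java.util.function; the task names are made up, and a real build would pass richer state than a String:

```java
import java.util.function.Function;

// If build tasks were functions over a working state, composing them would
// be ordinary function composition, much like a Unix pipe.
public class ComposableTasks {
    public static String build(String src) {
        Function<String, String> compile = s -> s + " -> classes";
        Function<String, String> jar     = s -> s + " -> app.jar";
        Function<String, String> deploy  = s -> s + " -> server";
        // composition is the whole story: no special plumbing between tasks
        return compile.andThen(jar).andThen(deploy).apply(src);
    }

    public static void main(String[] args) {
        System.out.println(build("src")); // src -> classes -> app.jar -> server
    }
}
```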

As for Ant: Ken Arnold's observation isn't entirely fair. Ant was not designed to compose. If it's really a problem (and I'm not sure it is) Ant would need to be rebuilt using pipelines (most people are allergic to Lisp, so functional composition is a non-starter). One thing to consider is migrating off Ant syntax by moving onto Python or Jython to script your build (keeping the Ant tasks tho'), using the scripting language to compose calls or pipe data through them. To be honest I think this is an issue for languages like drools and Jelly rather than Ant. Ant is fine as long as you're not trying to program with it.

Charles Miller: penetration testing

The Fishbowl: The Value of Penetration Testing

successful penetration indicates something more than a particular security flaw. It indicates some systemic flaw in network security policies or practices. The network was designed to be proof against a certain class of attacks, and it was found not to be. Why wasn't the installed software up to date against security patches? Why weren't the operators sufficiently educated to spot the social engineering attack? Why didn't anybody notice when the server started behaving out of the ordinary?

Good stuff.

whimper not bang? java.net

The Dog that Didn't Bark

Simon Phipps:

it raises for me the question of why none of the other sources I respect in the blogging community have even mentioned the launch of blogs and wikis on java.net, let alone come in with a critique (positive or negative). In particular, I've not seen anyone in the blogging A-list that I track with NetNewsWire mention or critique java.net, and the greatest omission of all was the lack of any comment on Slashdot (until June 13). What gives?

Good question. There is community hanging out here and here and here and here. It's going to be interesting to see how java.net meshes with those communities and this lot

Kendall Clark reviews ws-arch

XML.com: A Tour of the Web Services Architecture

On the 'XML backplane':

XML, according to section 1.7 of the WSA document, is the "backplane" of a web services architecture and is "much more fundamental" than either SOAP or WSDL. That's true, if for no other reason than both SOAP and WSDL are dependent on XML.

SOAP 1.2 and WSDL are not dependent on XML, they're dependent on the XML Infoset. That's a world of difference. SOAP 1.2 was deliberately designed to be decoupled from XML via the Infoset - this allows the potential use of binary or non-xml formats in the future. I submit that's not a good thing, but anyway there's no need to argue the point about an architectural dependency on XML if there isn't one (there is a practical get-the-job done engineering dependency on XML, but that's different ;).

On the web services 'infographic':

With all due respect to the WS-Arch, that's a stew, the visual equivalent of extreme spaghetti code. While I have reduced the size of this image (so that it will fit the XML.com site better), even at its original size it has several problems. I don't intend to provide a Tufte-like analysis, but it is overcrowded, there's no obvious "path" through the information it's meant to convey, the relationships between the terms are hard to discern and to understand, and it's so visually cluttered and busy that it doesn't even suggest a coherent, balanced, well-ordered architecture.

I've been lucky enough to work with the person who drew this graph, Frank McCabe. Frank is extremely smart, and well known in agent and ontology computing circles - notably he has made large contributions to FIPA's Abstract Architecture. The FIPA AA in my opinion remains the best articulation of a service architecture, but lacks the profile or coolness bestowed by XML+Internet protocols that is attracting the big players eager to roll out a new stack.

Yes, the picture of the graph is a mess, but that's missing the point. The structure of the graph is what counts. There are two things worth noting about the structure:

  • it's acyclic
  • all arcs are labelled

An acyclic, labelled graph. In other words, this graph is tailor-made for description in KIF/RDF/OWL/DAML-S (take your pick). I have no doubt that this graph or its progeny are where the future integration of web services technologies with semantic web ones will occur. Far from being a stew, it's been one of the highpoints of the arch group's work in the last few months. What needs to happen next is for the ws-arch group to stop associating the more useful semantic web work with fringe AI research, thereby denigrating the technologies that provide the very capacity they will need sooner or later - to formally describe both the elements in the web services architecture and their relationships. We already know that WSDL and UDDI aren't up to the job. No doubt Kendall will pick up on this - he's been following the semantic web closely for some time.

On REST 'integration':

Section 1.6.3 ("SOA and REST architectures") attempts a rapprochement between the advocates of REST and the advocates of RPC- or SOAP-centric web services. That is yeoman's work and the WG should be publicly commended by all interested parties for undertaking it.

It seems the arch group should be applauded for acknowledging REST. I wouldn't go so far. The lean of the arch group has been to define web services architecture as being something distinct from web architecture (check the archives). Web services architecture seems destined, by accident or design (I'm really not sure), for behind the firewall, and every now and then SOAP messages will be lobbed over the web to our business partners. Sometimes I wonder why it's called a 'web services' architecture at all. I think the word 'web' has caused too much confusion and strife. Why not just call it 'integration services architecture' or 'middleware services architecture'? The web's actual architecture has little to do with what's being defined here other than offering cheap carriage (not necessarily a bad thing). The best articulations of how the web works are non-normative input to ws-arch, and it's taken no small amount of badgering and advocacy to get REST on the ws-arch radar - Mark Baker in particular has taken a lot of flak for his stance.

In the final analysis ws-arch neither extends nor encompasses the web - that's so much the worse for both architectures. With my engineering hat on, what concerns me is that the two technologies that have driven web services over the last few years, XML and HTTP, are being marginalized in favour of an expanded architecture that mandates infosets and protocol agnosticism, and this is counter to what I see makes for successful, cost-effective solutions.

Search Engine engine resources

ongoing · On Search: Basic Basics

If you're interested in search and retrieval, here are some resources.

Managing Gigabytes: best and most comprehensive of the bunch; should explain to you why XML is a bad choice for storing search details in a centralized database (maybe a good choice for parallel search across the web).

Finding Out About: the natural successor to Salton, start here.

Online book: Information Retrieval: old but covers all the basics.

The authors of Managing Gigabytes built a production quality index and retrieval tool that you can use. Other stuff, that works and is usable:

Lucene: one classy piece of software

Lupy: a port of Lucene to Python, promising but incomplete.

JXTA Search: where the late Gene Kan's distributed search engine ended up.

June 19, 2003

Namespace Routing Language

James Clark does it again: NRL.

June 17, 2003

Vladimir Roubtsov: classloader article

Find a way out of the ClassLoader maze

Good read from Vladimir Roubtsov.

RDF Wide Shut

Sam Ruby: PSS

The RSS world can't stomach RDF for the most part, but seems destined to create an ad-hoc semi-interoperable subset of it via XML+Namespaces, for RSS >= 2.

Highly recommended for those who still think RDF is overkill: RDDL.

June 16, 2003

Just use a Wiki

Sam Ruby starts highlighting ad hominem passages in comments.

So it's true - comments don't scale.


June 14, 2003

Subba Subba Hey! Migrating to Subversion I

So I'm moving some of my personal projects and managed files onto subversion. I'm going to run with it for 6 months. I hope it works out, version control is the most important aspect of programming for me after testing and a good editor.

I think that CVS will go the way of projects like Apache 1.x - supported and widely deployed, but no active development, because the development and innovation focus has moved elsewhere. It's just a matter of time before one of the main sourcecode hosts supports svn repositories along with CVS, and Subversion hits a critical mass.

Getting the server installed (on Suse) was tricky. INSTALL is not quite telling the truth, and there isn't much information on the web on setting up a server, certainly none that worked for me. All told, it took about six attempts over three nights, the problems being mainly with other libraries. I took notes and when there's more time I'll write them up.

By the way, Subversion is not the last word on version control at the low end of the market. There is a chasm between the low end/free tools such as CVS, RCS, Subversion and VSS and the higher end tools like ClearCase and Bitkeeper that reminds me strongly of the state of the market a few years ago for IDEs and bugtrackers. It's exactly the gap that tools like IDEA and Jira have filled: high quality tools with price points that seemed impossible three years ago. I'd find it incredible if a business doesn't enter the VCS market in the next few years and clean up.

Having looked at the code for svn, rcs and cvs at one point or another, I think it's possible to sustain a product business in VCS at €350 per seat, maybe lower. I should be able to serve up 10 developers for the price of a decent laptop, and I should be able to cut a deal at €8K for a 25 head shop. The real business issue is not price, but being able to demonstrate rock solid quality of implementation (VSS remains an anomaly, like storing your life savings under a mattress). The only one that comes close to filling that niche today is Perforce, a fine product, but its price point is enough to push people who don't really understand the potential ROI of using solid version control, but do think in terms of machines, IDEs or desktop suites, down to CVS and VSS - or, in the larger shops, possibly up to Bitkeeper or ClearCase.

[And just for fun, I hope it's based on a grid or p2p architecture which would allow the VCS to scale without trading up to high end servers].

Agile Security

Even if we completely design our security stuff up front, and even if we manage to get it right, we're still not done. Because the app will continue to grow and change in ways we did not expect. Every feature we add must go through some kind of security review, whether or not we thought we covered it in our up front work. Security is [virtually] always an ongoing process, and [virtually] never a one-time task.
-Kevin Smith [on extremeprogramming]

And miles to go before I sleep

Busy busy busy.