Stefan Tilkov paraphrased my responses to Dare's post on the Atom Protocol as:
"Bill de hÓra acknowledges that the third is indeed missing from APP, considers
the second problem a general issue with PUT, and disagrees about the first one;
but he adds two more problems: update resumption and batch/multi-part uploads."
To recap, the issues Dare raised are:
- Mismatch with data models that aren't microcontent
- Lack of support for granular updates to fields of an item
- Poor support for hierarchy
Stefan is a connector across a number of communities, so I'd like to qualify his reduction as follows:
- Atom, as Joe points out, is more than an envelope; it's content. I pointed out that valuable formats - ones with media types, and not just the usual blogging suspects - are properly supported in APP. Lolcats won't be a problem.
- Use PATCH. More on this below.
- I do not think Atom is a good format for hierarchical data, but it's not clear to me that's a problem (certainly it's not a protocol level problem). You probably want to start with a placeless model as APP/Atom does and declare hierarchies and maps out of band. There are all kinds of options for this that will work within the APP constraints.
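As a rough illustration of the out-of-band approach (all names here are invented for illustration, not part of APP), entries stay in a flat, placeless collection keyed by id, while a separate document declares the tree:

```python
# A hedged sketch of a placeless collection plus an out-of-band hierarchy.
# The collection knows nothing about structure; the hierarchy document
# maps child id -> parent id and can be swapped out or versioned separately.

entries = {"a": "root doc", "b": "child of a", "c": "child of a"}
hierarchy = {"b": "a", "c": "a"}   # kept out of band, not in the entries

def children(parent_id):
    """Derive the tree from the out-of-band map when needed."""
    return sorted(k for k, p in hierarchy.items() if p == parent_id)

print(children("a"))  # ['b', 'c']
```

The collection itself stays within the APP constraints; only the hierarchy document is extra.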
Perhaps the title of my post was misleading (that's what you get for being clever). The point wasn't to criticize some
detailed observations, or suggest APP has serious problems, but rather to criticize
the dual conclusions that 1) the APP has failed for some definition of "general
purpose" publishing, and 2) it's necessary to roll your own
publishing protocol for the reasons given. Feedback on the protocol is a good thing, but I couldn't get to those conclusions following the arguments given. It didn't take long
for some people to provide workable options, and I presented some other issues
to chew on (batch updates and resuming uploads).
I mentioned using PATCH as an option for dealing with partial updates. Matthias Ernst questioned the need for a different method:
"I don't see that need. PUT with the If-Match: header is just enough to do the work on the client side using optimistic concurrency control."
Stefan also questioned the need for PATCH:
"I’m not at all sure I like the PATCH approach, too — I’m not really keen on having to tunnel even more verbs through POST because they’re not widely supported"
Update: Stefan explained to me that his concern is adding another method rather than tunneling; a valid concern.
I probably wasn't clear enough on where I was going with this. First of all, PATCH is defined in RFC 2068, section 19.6.1.1 (sort of), and is arguably part of HTTP; it's not a POST tunnel (thanks to Julian for the reference). Second, what Matthias says is true for the case of multiple editors (and APP mentions how to deal with lost updates using If-Match and friends), but that is a different problem from sending deltas - ie, you don't need partial updates to have lost updates.
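Matthias's point can be sketched with a toy in-memory "server" (all class and method names invented for illustration): PUT guarded by If-Match catches lost updates via optimistic concurrency, but says nothing one way or the other about deltas.

```python
# A minimal sketch of lost-update protection with PUT + If-Match.

class Resource:
    """Toy server-side resource with an entity tag per revision."""
    def __init__(self, body):
        self.body = body
        self.rev = 1

    @property
    def etag(self):
        return f'"{self.rev}"'

    def put(self, body, if_match):
        # 412 Precondition Failed if the client's copy is stale
        if if_match != self.etag:
            return 412
        self.body = body
        self.rev += 1
        return 200

res = Resource({"title": "old"})

# Client A fetches an etag; client B then updates behind A's back.
a_etag = res.etag
res.put({"title": "from B"}, if_match=res.etag)   # B wins, rev bumps

# A's conditional PUT now fails instead of silently clobbering B's edit.
status = res.put({"title": "from A"}, if_match=a_etag)
print(status)  # 412
```

A would then re-fetch, re-apply its change, and retry - but note the body it sends is still a full representation, not a delta.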
The design value in using a new method to deal with delta updates is twofold.
First, no matter what the format is, or the optimal algorithm/policy for merging data in that format, the
PATCH method is explicit in its intent - the server is getting a change delta from the client as a function
of the representation sent down to the client. With PUT you have to infer outside the method whether
the server is receiving a delta or a full update. You can deal with this format by format using
PUT, and APP has provisions in place to avoid the problem altogether (the atom-syntax working
group felt that sending partials was overloading PUT). Joe points to the following in section 9.3:
"To avoid unintentional loss of data when editing Member Entries or Media Link Entries, Atom Protocol
clients SHOULD preserve all metadata that has not been intentionally modified, including unknown
foreign markup as defined in Section 6 of [RFC4287]."
But "general purpose" diff/patch is another matter, especially if people want to work at a higher level
than bytes. I see no reason to disallow it in the future; the best way to do that is not to redefine or muddy PUT
now (or later on), but to allow the protocol room to use PATCH.
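To make the difference in intent concrete, here's a minimal sketch (plain Python dicts standing in for representations; the None-deletes-a-field convention is invented, not from any spec) of full replacement versus a merged delta:

```python
# PUT replaces the whole state; PATCH merges a change delta. With PUT
# alone the server must infer, outside the method, which one was meant.

def put(state, representation):
    """Full replacement: fields absent from the representation are gone."""
    return dict(representation)

def patch(state, delta):
    """Merge a delta into current state; None deletes a field
    (a made-up convention for this sketch)."""
    out = dict(state)
    for k, v in delta.items():
        if v is None:
            out.pop(k, None)
        else:
            out[k] = v
    return out

entry = {"title": "Atom", "summary": "draft", "category": "protocols"}

# A PUT carrying only the changed field silently drops everything else:
print(put(entry, {"title": "Atom Protocol"}))
# A PATCH with the same body keeps unmodified fields intact:
print(patch(entry, {"title": "Atom Protocol"}))
```

Same bytes on the wire, two very different outcomes - which is exactly why baking the distinction into the method name helps interop.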
Second, the broader guideline I had in mind was this - whenever you
have two operations that resemble each other superficially but are semantically different and
have different expected outcomes, you should consider separate and explicit definitions to avoid interop issues.
It's not just about finding efficient techniques for approaches that matter to readers and writers, like optimistic concurrency - it's about providing a uniform means of expression in the protocol design.
On Strategy Tax
Broadening things beyond direct issues with the Atom Protocol for a minute, it should be clear that defining your own publishing and data access protocol means building
your own tools and platform infrastructure from top to bottom. The amount of
work to do this, again for some definition of "general purpose" shouldn't be
underestimated. It's much more likely in a high pressure commercial environment
to produce a protocol that is highly limited and works for one platform - yours. That
is, you end up with less capability and yet another silo. This is analogous at
the protocol level to Facebook choosing to create a markup format for users - one
that says more about Facebook's current capabilities than the actual users -
instead of rolling with something like FOAF. Arguably, controlling data portability is largely the point, but the overall costs of doing so shouldn't be underestimated. Going custom will up the overall design and engineering dollars spent 'below the waterline'. Companies, even big ones, are resource bound, so each engineering dollar spent on publishing infrastructure is a dollar not spent on a cool feature a user might care about. You want to be sure it's the right thing to do. For those integrating against such a provider, you probably want to keep custom formats/protocols at the edge and convert them to open models for internal use.
This reluctance to roll out on an
open protocol is a good example of a strategy tax, where creating barriers to data allows companies building social network platforms to
maximize a return on that data and, all-importantly, monetize the graph of social
relations. This balance between open data and platform franchises is a difficult problem for social network providers, who are especially subject to modish swings in interest or perceived coolness. They don't yet seem to have the stable revenue streams that Google has from AdSense, or that eBay and Amazon have from providing marketplaces. It's surely tempting, then, to reduce the fluidity of user data while figuring out how to become an 800lb gorilla. However, web history suggests betting on a user silo will be a short-lived tactical advantage, not a strategic play à la desktop operating systems. Perhaps there are other models of lock-in - people have been pointing out for years that Google has precious little lock-in on the search page and it's trivial to use a different search engine - yet somehow they manage to get by.