Joe Gregorio: At Google we are considering using PATCH. One of the big open questions surrounding that decision is XML patch formats. What have you found for patch formats and associated libraries?
I believe that looking for an XML patch format is looking for a solution at the wrong meta level. Two examples, using AtomPub:
- In Atom, the order of elements in an entry is not significant. AtomPub servers often do not store their data in XML serialized form, or even in DOM form. If you PUT an entry and then send a PATCH based on the original serialization, the server may be unable to apply it, because what it stored need not match the bytes you sent.
- A lot of data in this world is either not in XML, or if it is in XML, is simply there via tunneling. Atom elements are often merely thin wrappers around HTML. HTML has a DOM, and can be flattened into a sequence of SAX-like events, just like XML can be.
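To make the first point concrete, here is a minimal sketch in Python. The two sample entries and the `children` helper are my own illustrations, not anything taken from a real AtomPub server. It compares what a client sent with a reordered copy a server might return.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# What the client PUT (hypothetical sample entry).
sent = (
    f'<entry xmlns="{ATOM}">'
    "<title>Hello</title><id>urn:example:1</id>"
    "</entry>"
)
# What a server might hand back after storing the entry in its own model.
stored = (
    f'<entry xmlns="{ATOM}">'
    "<id>urn:example:1</id><title>Hello</title>"
    "</entry>"
)

def children(xml_text):
    """Return the entry's direct children as an order-independent set.

    Simplification: only tag and text are compared, and duplicates collapse.
    """
    root = ET.fromstring(xml_text)
    return {(child.tag, child.text) for child in root}

same_bytes = sent == stored                      # byte-level comparison
same_entry = children(sent) == children(stored)  # order-insensitive comparison
print(same_bytes, same_entry)
```

The two documents differ as text yet describe the same entry, which is why a patch computed against the first serialization has nothing reliable to anchor to in the second.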
My recommendations on how to proceed:
- Start by severely limiting the set of vocabularies that you are considering. In this sense “XML” is simultaneously too broad and too limiting. Atom/AtomPub is an example of something more along the lines of what I would suggest. Trust that once the problem is solved for one format, it can be readily adapted to other formats (example: KML).
- Untangle the problem into two parts: specify a patch friendly serialization (i.e., one with required newlines) for the format selected, and then use a standard diff format on the results. Watch this space.
- Spend some time up front specifying the behaviors that you want to address. In the case of Atom, adding an entry, deleting an entry, adding a category to an entry, fixing a typo in the content are examples of common scenarios. Feel free to use the Atom wiki for this purpose.
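The two-part split suggested above can be sketched roughly as follows, again in Python. The `patch_friendly` function and its one-child-per-line, sorted layout are assumptions of mine for illustration, not a proposed format. The standard library's `difflib` stands in for "a standard diff format". The sample covers two of the scenarios listed: fixing a typo and adding a category.

```python
import difflib
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

ATOM = "http://www.w3.org/2005/Atom"

def patch_friendly(xml_text):
    """Serialize an entry's direct children one per line, in sorted order.

    Hypothetical format: sorting is safe here only because Atom does not
    treat the order of an entry's children as significant.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for child in root:
        # Strip the namespace so each line starts with the bare element name.
        name = child.tag.split("}", 1)[-1]
        attrs = "".join(
            f" {key}={quoteattr(value)}"
            for key, value in sorted(child.attrib.items())
        )
        # Keep each element on a single line; nested markup is ignored here.
        text = escape(child.text or "").replace("\n", " ")
        lines.append(f"<{name}{attrs}>{text}</{name}>")
    return sorted(lines)

old = f'<entry xmlns="{ATOM}"><title>Helo</title><id>urn:example:1</id></entry>'
new = (
    f'<entry xmlns="{ATOM}"><id>urn:example:1</id>'
    '<title>Hello</title><category term="patch"/></entry>'
)

# An ordinary line-based unified diff over the two serializations.
diff = list(
    difflib.unified_diff(
        patch_friendly(old), patch_friendly(new), "old", "new", lineterm=""
    )
)
print("\n".join(diff))
```

Because both sides are serialized the same way before diffing, the reordering of `id` and `title` in the second document produces no change lines; only the typo fix and the new category show up.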