intertwingly

It’s just data

RFC: Ideal CouchDB DB Dump Format


There’s a discussion going on in CouchDB as to what an ideal dump format for a CouchDB database would look like.  A CouchDB database is a collection of URIs.  While the content associated with any given URI is often JSON, CouchDB also supports the notion of an attachment, which could be pretty much anything.

So... how do you dump a database?  The leading options are a single JSON stream and a multipart MIME stream.

Before I solicit input, I’ll share my leaning, which is towards the latter, multipart MIME.  My reasons are twofold.  First, in order to do JSON right, you would need a streaming JSON parser; using MIME to segment a stream seems easier to me.  Second, we are talking backups here.  A single bit error can be difficult to recover from in a JSON stream, but the effects in a multipart MIME segmented stream would be a lot more localized.
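To make the MIME option a little more concrete, here is a minimal sketch in Python using only the standard library's email package.  It is not CouchDB's format or anything the project has agreed on: the function names `dump_documents` and `load_documents`, the use of a `Content-ID` header to carry the URI, and the choice of `application/json` versus `application/octet-stream` per part are all assumptions made for illustration.

```python
import json
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def dump_documents(docs):
    """Serialize {uri: content} into one multipart/mixed byte string.

    Assumption: bytes values are attachments; anything else is JSON.
    """
    container = MIMEMultipart("mixed")
    for uri, content in docs.items():
        if isinstance(content, bytes):
            # Arbitrary attachment: stored as base64-encoded octets.
            part = MIMEApplication(content, "octet-stream")
        else:
            # JSON document: encoded to UTF-8, then base64 by default.
            part = MIMEApplication(json.dumps(content).encode("utf-8"), "json")
        # Hypothetical convention: carry the URI in Content-ID.
        part.add_header("Content-ID", uri)
        container.attach(part)
    return container.as_bytes()


def load_documents(raw):
    """Parse a dump produced by dump_documents back into a dict."""
    message = message_from_bytes(raw)
    docs = {}
    for part in message.get_payload():
        payload = part.get_payload(decode=True)  # undo the base64 encoding
        if part.get_content_type() == "application/json":
            docs[part["Content-ID"]] = json.loads(payload)
        else:
            docs[part["Content-ID"]] = payload
    return docs


docs = {
    "doc1": {"title": "hello", "tags": ["a", "b"]},
    "doc1/logo.png": b"\x89PNG\x00\x01\x02",
}
restored = load_documents(dump_documents(docs))
```

Each part sits between boundary lines, so a reader that hits a damaged part can, in principle, skip ahead to the next boundary and carry on.  That is the localization argument above; this sketch does not attempt the recovery itself, and it parses the whole dump in memory rather than streaming it.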

But I’m far from an expert on multipart MIME, so I would welcome any input.  To get the ball rolling, try playing with this (source).