Wednesday 7 March 2012

Uploading a directory of images to CouchDB

I wanted a general script to upload a set of images to CouchDB. Couch can't store an image as a document in its own right, because documents are JSON. You can, however, create a JSON document and add images to it as attachments. Here's the catch: every upload to an existing document has to specify the document's current revision id, and that id changes after every image you add.
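To make the catch concrete: CouchDB answers each successful document or attachment PUT with a small JSON object whose "rev" field holds the new revision id. A minimal sketch of pulling that value out with awk follows. The function name extract_rev is made up for illustration, and the sketch assumes the response is one flat JSON object on a single line.

```shell
#!/bin/sh
# Hypothetical helper: print the "rev" value from a CouchDB response like
#   {"ok":true,"id":"list","rev":"1-abc"}
# The line is split on double quotes; a field equal to rev that is
# followed by a lone colon is the key, and the field after that is the value.
extract_rev() {
    printf '%s\n' "$1" | awk -F'"' '{
        for (i = 1; i + 2 <= NF; i++)
            if ($i == "rev" && $(i + 1) ~ /^[ \t]*:[ \t]*$/) {
                print $(i + 2)
                exit
            }
    }'
}
```

Calling extract_rev with the raw response text prints the revid, or nothing if the response has no "rev" field, as is the case for an error response.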

I had two further problems. My directory of images contained sub-directories, and I wanted to access the images directly using a simple URL. So if I had a directory structure like:

    list
        file1.png
        one
            file2.png
            file3.png
        two
            file4.png

I would want the relative URLs to be /list/file1.png, /list/one/file2.png and so on. The first trick in writing the script is to extract the revid from the server's response to each upload. I used awk for that, then used the returned value to upload the next entry in that directory. The second trick is to use %2F, not /, as the directory separator when creating the docids. Couch doesn't allow nesting in the database structure, but you can simulate it by creating documents called:

    list
    list%2Fone
    list%2Ftwo
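As a rough illustration of that naming scheme, the sketch below turns a relative directory path into a docid. The function name docid_for_dir is hypothetical and not taken from the original script.

```shell
#!/bin/sh
# Hypothetical helper: build a docid from a relative directory path
# by replacing every "/" with "%2F".
docid_for_dir() {
    printf '%s\n' "$1" | sed 's|/|%2F|g'
}
```

For example, docid_for_dir list/one prints list%2Fone, while a top-level directory such as list comes out unchanged.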

The first posting to each of those documents doesn't need a revid, but subsequent ones do. That's just a feature of couch. So here's the script. It's a bit sloppy: if couch responds with an error to any upload, it will fall over. Error handling is currently left as an exercise for the reader. I'll put it in later and may update the post then. To use the script, put it into a file called upload.sh and invoke it thus: ./upload.sh images, where "images" is the master image directory.
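What follows is a minimal sketch of how such a script might look, not the original upload.sh. It assumes curl is installed, that CouchDB is reachable at http://localhost:5984, and that a database called images already exists. All of those, along with the function names, are assumptions. It also assumes file names contain no spaces or other characters that need URL encoding. Unlike the script described above, it stops at the first response that carries no revid.

```shell
#!/bin/sh
# Hypothetical sketch. COUCH and DB are assumed values; override via environment.
COUCH="${COUCH:-http://localhost:5984}"
DB="${DB:-images}"

# Guess a Content-Type from the file extension.
content_type() {
    case "$1" in
        *.png) echo image/png ;;
        *.jpg|*.jpeg) echo image/jpeg ;;
        *.gif) echo image/gif ;;
        *) echo application/octet-stream ;;
    esac
}

# Upload every regular file in one directory as an attachment of one document.
upload_dir() {
    dir="$1"
    docid=$(printf '%s' "$dir" | sed 's|/|%2F|g')
    rev=""    # the first upload to a new document needs no revid
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        url="$COUCH/$DB/$docid/$(basename "$f")"
        if [ -n "$rev" ]; then
            url="$url?rev=$rev"
        fi
        resp=$(curl -s -X PUT "$url" \
            -H "Content-Type: $(content_type "$f")" \
            --data-binary "@$f")
        # The new revid comes back in the "rev" field of the response.
        rev=$(printf '%s\n' "$resp" | sed -n 's/.*"rev":"\([^"]*\)".*/\1/p')
        if [ -z "$rev" ]; then
            echo "upload of $f failed: $resp" >&2
            return 1
        fi
    done
}

# Walk the tree; each directory becomes its own document.
upload_tree() {
    find "$1" -type d | while read -r d; do
        upload_dir "$d" || exit 1
    done
}

# Run only when a directory argument is given: ./upload.sh images
if [ "$#" -ge 1 ]; then
    upload_tree "$1"
fi
```

Because the first upload to each document is sent without a revid, running this sketch a second time against the same database would produce a conflict on the first file of every directory and stop there.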
