How do I upload a file with metadata using a REST web service?

Just because you're not wrapping the entire request body in JSON, doesn't meant it's not RESTful to use multipart/form-data to post both the JSON and the file(s) in a single request:

curl -F "metadata=<metadata.json" -F "[email protected]" http://example.com/add-file

on the server side:

class AddFileResource(Resource):
    def render_POST(self, request):
        metadata = json.loads(request.args['metadata'][0])
        file_body = request.args['file'][0]
        ...

to upload multiple files, it's possible to either use separate "form fields" for each:

curl -F "metadata=<metadata.json" -F "[email protected]" -F "[email protected]" http://example.com/add-file

...in which case the server code will have request.args['file1'][0] and request.args['file2'][0]

or reuse the same one for many:

curl -F "metadata=<metadata.json" -F "[email protected]" -F "[email protected]" http://example.com/add-file

...in which case request.args['files'] will simply be a list of length 2.

or pass multiple files through a single field:

curl -F "metadata=<metadata.json" -F "[email protected],some-other-file.tar.gz" http://example.com/add-file

...in which case request.args['files'] will be a string containing all the files, which you'll have to parse yourself — not sure how to do it, but I'm sure it's not difficult, or better just use the previous approaches.

The difference between @ and < is that @ causes the file to get attached as a file upload, whereas < attaches the contents of the file as a text field.

P.S. Just because I'm using curl as a way to generate the POST requests doesn't mean the exact same HTTP requests couldn't be sent from a programming language such as Python or using any sufficiently capable tool.


I agree with Greg that a two phase approach is a reasonable solution, however I would do it the other way around. I would do:

POST http://server/data/media
body:
{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873
}

To create the metadata entry and return a response like:

201 Created
Location: http://server/data/media/21323
{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873,
    "ContentUrl": "http://server/data/media/21323/content"
}

The client can then use this ContentUrl and do a PUT with the file data.

The nice thing about this approach is when your server starts get weighed down with immense volumes of data, the url that you return can just point to some other server with more space/capacity. Or you could implement some kind of round robin approach if bandwidth is an issue.


One way to approach the problem is to make the upload a two phase process. First, you would upload the file itself using a POST, where the server returns some identifier back to the client (an identifier might be the SHA1 of the file contents). Then, a second request associates the metadata with the file data:

{
    "Name": "Test",
    "Latitude": 12.59817,
    "Longitude": 52.12873,
    "ContentID": "7a788f56fa49ae0ba5ebde780efe4d6a89b5db47"
}

Including the file data base64 encoded into the JSON request itself will increase the size of the data transferred by 33%. This may or may not be important depending on the overall size of the file.

Another approach might be to use a POST of the raw file data, but include any metadata in the HTTP request header. However, this falls a bit outside basic REST operations and may be more awkward for some HTTP client libraries.