Tags: java, post, apache-commons-httpclient, ckan

Uploading files to a dataset in CKAN / datahub.io through a Java client


I am testing file uploads to a dataset on CKAN / datahub.io through a Java client of the API.

public String uploadFile()
        throws CKANException {

    String returned_json = this._connection.MultiPartPost("", "");

    System.out.println("r: " + returned_json);
    return returned_json;
}

and

   protected String MultiPartPost(String path, String data)
            throws CKANException {
        URL url = null;

        try {
            url = new URL(this.m_host + ":" + this.m_port + path);
        } catch (MalformedURLException mue) {
            System.err.println(mue);
            return null;
        }

        String body = "";

        HttpClient httpclient = new DefaultHttpClient();
        try {
            String fileName = "D:\\test.jpg";

            FileBody bin = new FileBody(new File(fileName),"image/jpeg");
            StringBody comment = new StringBody("Filename: " + fileName);

            MultipartEntity reqEntity = new MultipartEntity();
            reqEntity.addPart("bin", bin);
            reqEntity.addPart("comment", comment);
            HttpPost postRequest = new HttpPost("http://datahub.io/api/storage/auth/form/2013-01-24T130158/test.jpg");
            postRequest.setEntity(reqEntity);
            postRequest.setHeader("X-CKAN-API-Key", this._apikey);
            HttpResponse response = httpclient.execute(postRequest);
            int statusCode = response.getStatusLine().getStatusCode();
            System.out.println("status code: " + statusCode);

            BufferedReader br = new BufferedReader(
                    new InputStreamReader((response.getEntity().getContent())));

            String line;
            while ((line = br.readLine()) != null) {
                body += line;
            }
            System.out.println("body: " + body);
        } catch (IOException ioe) {
            System.out.println(ioe);
        } finally {
            httpclient.getConnectionManager().shutdown();
        }

        return body;
    }

I get two kinds of response to my POST request:

  • a 413 error ("request entity too large") when the JPEG I try to upload is 2.83 MB. The error disappears when I shrink the file. Is there a limit on upload size?

  • a 500 error ("internal server error"). This is where I am stuck. Could it be because my dataset on datahub.io is not "datastore enabled"? (I see a disabled "Data API" button next to my resource files in the dataset, with a tooltip saying: "Data API is unavailable for this resource as DataStore is disabled".)

=> Is this a possible reason for the 500 error? If so, how could I enable the DataStore from the client side? (Pointers to Python code would be useful!)

Thx!
PS: the dataset I am using for testing purposes: http://datahub.io/dataset/testapi


Solution

  • Only someone with access to the exception log could tell you why the 500 is occurring.

    However, I'd check that your request matches what the Python client written alongside the datastore sends: https://github.com/okfn/ckanclient/blob/master/ckanclient/__init__.py#L546

    You're sending the "bin" image buffer and the "comment" file_key in your multipart request. Note that the file_key must be different for every upload, so add a timestamp or something similar to it. You may also need to set a Content-Type for the binary part.
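    The advice about changing the file_key for every upload can be sketched as follows. This is only a minimal illustration: the class `UniqueFileKey` and its `build` method are hypothetical names, not part of the CKAN client or the Java client in the question, and the timestamp layout simply copies the `2013-01-24T130158/test.jpg` segment visible in the question's POST URL.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper (not part of any CKAN client): builds a storage key
// that differs between uploads by prefixing the file name with a timestamp.
class UniqueFileKey {

    // Same shape as the "2013-01-24T130158" segment in the question's URL.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HHmmss");

    // Returns "<timestamp>/<fileName>" for the given moment.
    static String build(String fileName, LocalDateTime when) {
        return STAMP.format(when) + "/" + fileName;
    }

    public static void main(String[] args) {
        // Each run prints a key based on the current local time.
        System.out.println(build("test.jpg", LocalDateTime.now()));
    }
}
```

    The resulting key could then replace the hard-coded `2013-01-24T130158/test.jpg` at the end of the POST URL. Since the timestamp has one-second resolution, two uploads of the same file name within the same second would still collide; appending a random suffix would avoid that.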