
How to use select_object_content via rusoto / rust?


The following code tries to select some data from a file stored on S3:

  let client = S3Client::new(Region::default());
  let source = ... object providing bucket and key ...;

  let r = SelectObjectContentRequest {
      bucket: source.bucket,
      key: source.key,
      expression: "select id from S3Object[*].id".to_string(),
      expression_type: "SQL".to_string(),
      input_serialization: InputSerialization {
          json: Some(JSONInput { type_: Some("LINES".to_string()) }),
          ..Default::default()
      },
      output_serialization: OutputSerialization {
          json: Some(JSONOutput { record_delimiter: Some("\n".to_string()) }),
          ..Default::default()
      },
      ..Default::default()
  };

It causes the following error:

The specified method is not allowed against this resource. (Method: POST)

The example is a 1:1 port of a working Python/boto3 example, so I'm quite sure it should work. I found this issue, which is a few months old, and its status is not clear to me. How do I get this working with Rust?


Solution

  • Unfortunately, S3 Select still doesn't work on the latest rusoto_s3 (0.40.0). The issue you linked has the full answer. The problem is twofold.

    First, the S3 Select request that rusoto currently sends out has a bogus query string. It should be /ObjectName?select&select-type=2, but rusoto encodes it as /ObjectName?select%26select-type=2. S3 does not recognize that as a select request, which is what produces the error you saw.

    To verify, run your project like so:

    $ RUST_LOG=rusoto,hyper=debug cargo run
    

    You will see logs from rusoto and hyper. Sure enough it emits an incorrect URI. One can even dig into the code responsible:

    let mut params = Params::new();
    params.put("select&select-type", "2");
    request.set_params(params);
    

    It is supposed to be:

    let mut params = Params::new();
    params.put("select-type", "2");
    params.put("select", "");
    request.set_params(params);
    

    Although the fix seems trivial, remember that this is glue code generated from the AWS botocore service manifests, not written by hand, so incorporating the fix is not that straightforward.
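    To make the difference concrete, here is a minimal, standalone sketch of what happens when the whole string "select&select-type" is treated as one parameter key. The percent_encode and build_query helpers are hypothetical and written only for illustration; they are not rusoto's actual encoder, which may differ in details such as how empty values are rendered.

```rust
/// Percent-encode one query component, keeping only RFC 3986 unreserved
/// characters as-is. Hypothetical helper, not rusoto's implementation.
fn percent_encode(input: &str) -> String {
    let mut out = String::new();
    for b in input.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            // Everything else, including '&', is escaped as %XX.
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Join key/value pairs into a query string. A pair with an empty value is
/// emitted as a bare key (an assumption made for this sketch).
fn build_query(params: &[(&str, &str)]) -> String {
    params
        .iter()
        .map(|(k, v)| {
            if v.is_empty() {
                percent_encode(k)
            } else {
                format!("{}={}", percent_encode(k), percent_encode(v))
            }
        })
        .collect::<Vec<_>>()
        .join("&")
}
```

    With these helpers, build_query(&[("select&select-type", "2")]) escapes the '&' inside the key, while build_query(&[("select", ""), ("select-type", "2")]) keeps '&' as the separator between two distinct parameters.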

    Second, and this is the bigger problem: the S3 Select response uses a custom binary format (the AWS event stream encoding), and rusoto simply doesn't have a deserializer for it yet.
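    For a sense of what such a deserializer would involve, below is a rough sketch of parsing a single event stream message. It assumes the framing described in the S3 SelectObjectContent documentation: a 12-byte prelude (total length, headers length, prelude CRC, all big-endian u32), then the headers, the payload, and a trailing 4-byte message CRC. The Message type and parse_message function are hypothetical names, CRC validation is deliberately skipped, and only string-typed headers (value type 7) are handled. It is not part of rusoto.

```rust
/// One decoded event stream message (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct Message {
    headers: Vec<(String, String)>,
    payload: Vec<u8>,
}

/// Read a big-endian u32 at offset `at`, or None if the buffer is too short.
fn read_u32(buf: &[u8], at: usize) -> Option<u32> {
    let b = buf.get(at..at + 4)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Parse one message from the start of `buf`.
/// Returns the message and the number of bytes consumed, or None if the
/// buffer is incomplete or malformed. CRCs are NOT verified in this sketch.
fn parse_message(buf: &[u8]) -> Option<(Message, usize)> {
    let total_len = read_u32(buf, 0)? as usize;
    let headers_len = read_u32(buf, 4)? as usize;
    // Bytes 8..12 hold the prelude CRC; skipped here.
    if total_len < 16 + headers_len || buf.len() < total_len {
        return None;
    }
    let headers_end = 12 + headers_len;
    let mut headers = Vec::new();
    let mut pos = 12;
    while pos < headers_end {
        // Header layout: name length (1 byte), name, value type (1 byte),
        // value length (2 bytes, big-endian), value.
        let name_len = *buf.get(pos)? as usize;
        pos += 1;
        let name = std::str::from_utf8(buf.get(pos..pos + name_len)?).ok()?.to_string();
        pos += name_len;
        if *buf.get(pos)? != 7 {
            return None; // only string-typed header values are handled
        }
        pos += 1;
        let len_bytes = buf.get(pos..pos + 2)?;
        let value_len = u16::from_be_bytes([len_bytes[0], len_bytes[1]]) as usize;
        pos += 2;
        let value = std::str::from_utf8(buf.get(pos..pos + value_len)?).ok()?.to_string();
        pos += value_len;
        headers.push((name, value));
    }
    if pos != headers_end {
        return None; // a header ran past the declared headers section
    }
    // The payload sits between the headers and the trailing 4-byte message CRC.
    let payload = buf[headers_end..total_len - 4].to_vec();
    Some((Message { headers, payload }, total_len))
}
```

    A real client would call something like this in a loop over the response body, check both CRCs, and dispatch on the ":event-type" header (for example "Records" or "End") to collect the selected rows.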