I need some assistance with filtering S3 results using the AWS SDK for PHP v3 and JMESPath. Filtering by a number does not work in the PHP SDK the way the JMESPath documentation and online examples suggest it should.
<?php
// test.php
require 'vendor/autoload.php';

use Aws\S3\S3Client;

// Create S3 client
$s3 = new S3Client([
    'version' => 'latest',
    'region'  => 'us-east-1'
]);
$bucket = 'my-bucket-name';
$prefix = 'path/to/my/objects';

// Call list-objects-v2
$awspaginator = $s3->getPaginator('ListObjectsV2', [
    'Bucket' => $bucket,
    'Prefix' => $prefix
]);

// Apply filter to paginator
$jmes = "reverse(Contents[?Size>`0`].{Key: Key, Date: LastModified, Size: Size}) | [-10:]";
$results = $awspaginator->search($jmes);

// Echo results (print_r needs true as its second argument to return a string)
$i = 0;
foreach ($results as $result) {
    echo "\nResult: " . print_r($result, true);
    $i++;
}
echo "\nCount: " . $i . PHP_EOL;
?>
This outputs Count: 0. But if I replace Size>`0` with StorageClass=='STANDARD', I get the 10 most recent objects as expected.
I've attempted the following Size expressions without any luck.
Size>0
// returns error: unexpected number token
Size>'0'
// succeeds: returns no results
Size>`0`
// succeeds: returns no results
Size!=`0`
// returns results but does not filter out zero size objects
Size!=\"0\"
// returns results but does not filter out zero size objects
Note that the s3api query works just fine, so this seems to be something to do with the PHP SDK search method.
aws s3api list-objects-v2 \
--bucket my-bucket-name \
--prefix path/to/my/objects \
--query "reverse(Contents[?Size>\`0\`].{Key: Key, Date: LastModified, Size: Size}) | [-10:]"
Any help is appreciated!
I'm struggling to find this documented anywhere, but it appears that Size is unmarshalled as a string. I was able to make your example work with [?to_number(Size)>`0`] or indeed with [?Size!='0'].
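To see the difference in isolation, here is a minimal sketch that runs both filters against a hand-written array instead of a live bucket. The $page array is a hypothetical stand-in for one page of ListObjectsV2 output, with Size deliberately stored as a string to mimic the behaviour described above. It assumes the mtdowling/jmespath.php package (a dependency of the SDK) is available through Composer's autoloader.

```php
<?php
require 'vendor/autoload.php';

// Hypothetical stand-in for one page of ListObjectsV2 output.
// Size is a string here on purpose, mimicking what the SDK appears to return.
$page = [
    'Contents' => [
        ['Key' => 'empty.txt', 'Size' => '0'],
        ['Key' => 'data.bin',  'Size' => '1024'],
    ],
];

// Ordering comparison between a string and a number literal.
$naive = JmesPath\search('Contents[?Size>`0`].Key', $page);

// Convert the string to a number first, then compare.
$fixed = JmesPath\search('Contents[?to_number(Size)>`0`].Key', $page);

var_dump($naive, $fixed);
```

If the string theory is right, the first search should come back empty and the second should contain only data.bin, which matches what the original script shows against a real bucket.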
This appears to be a bug, or at least a failure in documentation, as the docs state:
The AWS CLI supports JMESPath. Expressions you write for CLI output are 100 percent compatible with expressions written for the AWS SDK for PHP.
https://docs.aws.amazon.com/sdk-for-php/v3/developer-guide/guide_jmespath.html
The only thing I have been able to find even alluding to this behaviour is only tenuously related:
https://forums.aws.amazon.com/message.jspa?messageID=752541#jive-message-312324
Here the problem is that the DynamoDB API expects to receive numbers as strings. An Amazon representative notes that this behaviour exists a) because the SDK has to support 32-bit environments that can't handle integers over 2 billion, and b) because all AWS SDKs are generated automatically from a language-agnostic set of data files, and they prefer not to make exceptions where they can avoid it. This seems to imply that integers represented as strings may occur broadly across the SDK. That said, I can't find any mention of it elsewhere.
Whether or not it's deliberate, it appears to be because the PHP SDK's Api/Parser/XmlParser doesn't have a mapping for the long type that Size is declared as. It falls back to the default behaviour here of parsing it as a string:
https://github.com/aws/aws-sdk-php/blob/master/src/Api/Parser/XmlParser.php#L23-L31
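The fallback can be illustrated with a small standalone sketch. This is not the SDK's code: parseScalar and its type table are hypothetical, written only to show how a type-dispatching parser with no entry for long would leave the value as a string.

```php
<?php
// Hypothetical, simplified illustration of a type-dispatching XML parser.
// The function name and the set of handled types are assumptions.
function parseScalar(string $type, SimpleXMLElement $node)
{
    switch ($type) {
        case 'integer':
            return (int) (string) $node;
        case 'float':
        case 'double':
            return (float) (string) $node;
        case 'boolean':
            return (string) $node === 'true';
        default:
            // No case for 'long', so the value stays a string.
            return (string) $node;
    }
}

$xml = new SimpleXMLElement('<Contents><Key>data.bin</Key><Size>1024</Size></Contents>');

var_dump(parseScalar('integer', $xml->Size)); // an int
var_dump(parseScalar('long', $xml->Size));    // a string
```

Under this reading, any member declared as long in an XML-based service would reach JMESPath as a string, which is why to_number() is needed before a numeric comparison.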