
Filtering out data returned by AWS Textract function


I have extracted data from a document using an AWS Textract function. The response returned by this function has the following structure:

{
   "AnalyzeDocumentModelVersion": "string",
   "Blocks": [ 
      { 
         "BlockType": "string",
         "ColumnIndex": number,
         "ColumnSpan": number,
         "Confidence": number,
         "EntityTypes": [ "string" ],
         "Geometry": { 
            "BoundingBox": { 
               "Height": number,
               "Left": number,
               "Top": number,
               "Width": number
            },
            "Polygon": [ 
               { 
                  "X": number,
                  "Y": number
               }
            ]
         },
         "Id": "string",
         "Page": number,
         "Relationships": [ 
            { 
               "Ids": [ "string" ],
               "Type": "string"
            }
         ],
         "RowIndex": number,
         "RowSpan": number,
         "SelectionStatus": "string",
         "Text": "string"
      }
   ],
   "DocumentMetadata": { 
      "Pages": number
   },
   "JobStatus": "string",
   "NextToken": "string",
   "StatusMessage": "string",
   "Warnings": [ 
      { 
         "ErrorCode": "string",
         "Pages": [ number ]
      }
   ]
}

I have extracted the Blocks from this data with the following code:

var d = null;
...<Some Code Here>...
d = data.Blocks;
console.log(d);

which prints an array of objects. An example of the extracted data is given below:

[...{ BlockType: 'WORD',
    Confidence: 99.7286376953125,
    Text: '2000.00',
    Geometry: { BoundingBox: [Object], Polygon: [Array] },
    Id: '<ID here>',
    Page: 1 }, ...]

I want to extract only the Text field and print it as the only output. How do I go about this?


Solution

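  • I may be misinterpreting your question, but if you need the value of the Text field from each object in the data array, you can map over the array and pick out that field. The destructuring pattern { Text: text } reads the Text property of each object into a local variable named text. Take a look at the following example: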

    const data = [
      {
        BlockType: "WORD",
        Confidence: 99.7286376953125,
        Text: "2000.00",
        Geometry: { BoundingBox: {}, Polygon: [] },
        Id: "<ID here>",
        Page: 1,
      },
    ];
    
    const output = data.map(({ Text: text }) => text);
    
    console.log(output);

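
One caveat when mapping over a real Blocks array: not every block carries a Text field (PAGE blocks, for example, do not), so a plain map can leave undefined entries in the result. The following is a minimal sketch of filtering by BlockType before mapping. The helper name extractText and the sample data are hypothetical, made up for illustration, and the sketch assumes you only care about LINE or WORD blocks.

```javascript
// Hypothetical helper (not part of the AWS SDK): collect the Text of every
// block of the requested type, skipping blocks that have no Text string.
function extractText(blocks, blockType = "LINE") {
  return blocks
    .filter(
      (block) => block.BlockType === blockType && typeof block.Text === "string"
    )
    .map((block) => block.Text);
}

// Made-up sample shaped like the Blocks array from the question.
const sampleBlocks = [
  { BlockType: "PAGE", Id: "p1", Page: 1 }, // no Text field
  { BlockType: "LINE", Text: "Total 2000.00", Id: "l1", Page: 1 },
  { BlockType: "WORD", Text: "Total", Id: "w1", Page: 1 },
  { BlockType: "WORD", Text: "2000.00", Id: "w2", Page: 1 },
];

const lines = extractText(sampleBlocks); // defaults to LINE blocks
const words = extractText(sampleBlocks, "WORD");

console.log(lines); // [ 'Total 2000.00' ]
console.log(words); // [ 'Total', '2000.00' ]
```

With the code from the question, this would be called as extractText(data.Blocks) or extractText(data.Blocks, "WORD"). Filtering on one block type also avoids printing the same text twice, since each LINE block repeats the text of the WORD blocks it contains.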