c# .net amazon-web-services aws-sdk amazon-textract

How do I pass an image converted to bytes in the request to get a DetectDocumentTextResult / DetectDocumentTextResponse using .NET?


I am not able to pass an image that has been converted to bytes to the request, so I cannot get a DetectDocumentTextResult or DetectDocumentTextResponse.

This is the Java code I have tried to convert to C#:

string document = "input.png";

ByteBuffer imageBytes;
using (Stream inputStream = new FileStream(document, FileMode.Open, FileAccess.Read)) {
    imageBytes = ByteBuffer.wrap(IOUtils.toByteArray(inputStream));
}
AmazonTextract client = AmazonTextractClientBuilder.defaultClient();

DetectDocumentTextRequest request = (new DetectDocumentTextRequest()).withDocument(new Document()
                    .withBytes(imageBytes));

DetectDocumentTextResult result = client.detectDocumentText(request);

/* This is the C# code; I am not able to pass the data to the request */

AmazonTextractClient Atc = new AmazonTextractClient(credentials, config);
Image img = Image.FromFile("D:\\Images\\1.Jpeg");
byte[] ImageBytes = (byte[])(new ImageConverter()).ConvertTo(img, typeof(byte[]));
DetectDocumentTextRequest request = new DetectDocumentTextRequest();
request.Document.Bytes.Read(ImageBytes, 0 , ImageBytes.Length);
DetectDocumentTextResponse res = Atc.DetectDocumentText(request);

Solution

  • Even though the property is named Bytes, its type is MemoryStream, so assign it a stream rather than a byte[]. In the code below, photo is the path to your image file and client is your AmazonTextractClient, instantiated however you prefer.

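    // Namespaces used: System.Drawing, System.IO, Amazon.Textract, Amazon.Textract.Model
    var client = new AmazonTextractClient("[KEY ID]", "[ACCESS KEY]", Amazon.RegionEndpoint.USEast1);

    Document MyDocument;
    using (Image image = Image.FromFile(photo))
    {
        // Left undisposed on purpose so the SDK can still read it when the request is sent.
        MemoryStream m = new MemoryStream();

        // Write the image into the stream in its original format (PNG, JPEG, ...).
        image.Save(m, image.RawFormat);
        MyDocument = new Document()
        {
            Bytes = m
        };
    }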
    

    Then build the DetectDocumentTextRequest and send it:

    var request = new DetectDocumentTextRequest()
    {
        Document = MyDocument
    };
    
    var response = client.DetectDocumentText(request);
    

    The same Document also works with AnalyzeDocumentRequest:

    var DocRequest = new AnalyzeDocumentRequest()
    {
        Document = MyDocument,
        FeatureTypes = new List<string> { FeatureType.FORMS, FeatureType.TABLES }
    };
    
    // Named differently from the earlier response so both snippets can share one scope.
    var analyzeResponse = client.AnalyzeDocument(DocRequest);
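
If you do not need System.Drawing, the file's bytes can go straight into the MemoryStream. The sketch below assumes that approach and uses the async call, since the synchronous DetectDocumentText is only available when targeting .NET Framework. TextractExample, LoadDocument and DetectLinesAsync are hypothetical names chosen for illustration, and the usage comment assumes credentials come from the SDK's default credential chain.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Amazon.Textract;
using Amazon.Textract.Model;

public static class TextractExample
{
    // Hypothetical helper: wraps the file's bytes, unchanged, in the
    // MemoryStream that Document.Bytes expects.
    public static Document LoadDocument(string path)
    {
        return new Document { Bytes = new MemoryStream(File.ReadAllBytes(path)) };
    }

    // Hypothetical helper: sends the image to Textract and returns the
    // text of every block the service classified as a line.
    public static async Task<List<string>> DetectLinesAsync(IAmazonTextract client, string path)
    {
        var request = new DetectDocumentTextRequest { Document = LoadDocument(path) };
        DetectDocumentTextResponse response = await client.DetectDocumentTextAsync(request);

        return response.Blocks
            .Where(b => b.BlockType == BlockType.LINE)
            .Select(b => b.Text)
            .ToList();
    }
}

// Usage (inside an async method; region and path are placeholders):
//   using var client = new AmazonTextractClient(Amazon.RegionEndpoint.USEast1);
//   foreach (string line in await TextractExample.DetectLinesAsync(client, @"D:\Images\1.Jpeg"))
//       System.Console.WriteLine(line);
```

Reading the file directly sends the original bytes as they are on disk, whereas Image.Save re-encodes the image before it is uploaded.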