Tags: vb.net, ocr, visual-studio-2022, .net-4.8

Scanning line by line with Windows.Media.Ocr engine


I'm using the Windows.Media.Ocr engine to scan these two lines,

but the engine reads them like this:

[image]

whereas I expect it to read them like this:

KIBA/USDT 0.00003826 6.31M KIBA 241.68459400 USDT

KIBA/USDT 0.00003470 17.13M KIBA 594.48387000 USDT

The code I'm using is:

' Requires references to:
'   "C:\Program Files (x86)\Windows Kits\10\UnionMetadata\Windows.winmd"
'   "C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETCore\v4.5\System.Runtime.WindowsRuntime.dll"
' and the Windows 10 SDK
Imports Windows.Media.Ocr
Imports System.IO
Imports System.Runtime.InteropServices.WindowsRuntime
Public Class Form1
    Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim softwareBmp As Windows.Graphics.Imaging.SoftwareBitmap
        Using bmp As Bitmap = New Bitmap(PictureBox1.Width, PictureBox1.Height)
            Using g As Graphics = Graphics.FromImage(bmp)
                Dim pt As Point = Me.PointToScreen(New Point(PictureBox1.Left, PictureBox1.Top))
                g.CopyFromScreen(pt.X, pt.Y, 0, 0, bmp.Size, CopyPixelOperation.SourceCopy)
                Using memStream = New Windows.Storage.Streams.InMemoryRandomAccessStream()
                    bmp.Save(memStream.AsStream(), System.Drawing.Imaging.ImageFormat.Bmp)
                    Dim decoder As Windows.Graphics.Imaging.BitmapDecoder = Await Windows.Graphics.Imaging.BitmapDecoder.CreateAsync(memStream)
                    softwareBmp = Await decoder.GetSoftwareBitmapAsync()
                End Using
            End Using
        End Using

        Dim ocrEng = OcrEngine.TryCreateFromLanguage(New Windows.Globalization.Language("en-US"))

        Dim languages As IReadOnlyList(Of Windows.Globalization.Language) = ocrEng.AvailableRecognizerLanguages
        For Each language In languages
            Console.WriteLine(language.LanguageTag)
        Next
        Dim r = ocrEng.RecognizerLanguage
        Dim n = ocrEng.MaxImageDimension

        Dim ocrResult = Await ocrEng.RecognizeAsync(softwareBmp)
        RichTextBox1.Text = ocrResult.Text
        
    End Sub
End Class

What change does this code need in order to scan row by row instead of column by column?

Edit: looking at the output in binary, there is 0D 0A (CR LF) between rows.

The complete image of the rows: [image]

I didn't post it earlier because I only need to scan from 0.000038 etc. to 0.0000% anyway.


Solution

  • I chose to act on the output string instead of tackling the OCR API.

    Fixing the issue within the OCR API would probably be the better solution if it is possible, but I could not get your code's references to resolve on my system.

    Instead, you can add this function to transpose the string:

    Private Function transpose(input As String) As String
        Dim numberOfColumns = 4 ' this must be known and could be a parameter to this function
        ' Protect multi-word cells ("104.1K KIBA") with a pipe so Split keeps them whole
        Dim fixedInput = input.Replace(" KIBA", "|KIBA").Replace(" USDT", "|USDT").Trim()
        Dim splitInput = fixedInput.Split(" "c)
        Dim numberOfRows = splitInput.Length \ numberOfColumns ' integer division
        Dim words As New List(Of String)()
        ' The OCR text is column by column, so cell (row, col) sits at row + numberOfRows * col
        For row = 0 To numberOfRows - 1
            For col = 0 To numberOfColumns - 1
                words.Add(splitInput(row + numberOfRows * col))
            Next
        Next
        Dim sb As New System.Text.StringBuilder()
        For i = 0 To words.Count - 1
            sb.Append(words(i).Replace("|", " ")) ' restore the protected spaces
            If i <> words.Count - 1 Then
                ' Tab between cells, new line after the last cell of each row
                sb.Append(If((i + 1) Mod numberOfColumns = 0, Environment.NewLine, vbTab))
            End If
        Next
        Return sb.ToString()
    End Function
    

    Simply pass your OCR output string through it. Here it is called from your code:

    Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim softwareBmp As Windows.Graphics.Imaging.SoftwareBitmap
        Using bmp As Bitmap = New Bitmap(PictureBox1.Width, PictureBox1.Height)
            Using g As Graphics = Graphics.FromImage(bmp)
                Dim pt As Point = Me.PointToScreen(New Point(PictureBox1.Left, PictureBox1.Top))
                g.CopyFromScreen(pt.X, pt.Y, 0, 0, bmp.Size, CopyPixelOperation.SourceCopy)
                Using memStream = New Windows.Storage.Streams.InMemoryRandomAccessStream()
                    bmp.Save(memStream.AsStream(), System.Drawing.Imaging.ImageFormat.Bmp)
                    Dim decoder As Windows.Graphics.Imaging.BitmapDecoder = Await Windows.Graphics.Imaging.BitmapDecoder.CreateAsync(memStream)
                    softwareBmp = Await decoder.GetSoftwareBitmapAsync()
                End Using
            End Using
        End Using
        Dim ocrEng = OcrEngine.TryCreateFromLanguage(New Windows.Globalization.Language("en-US"))
        Dim languages As IReadOnlyList(Of Windows.Globalization.Language) = ocrEng.AvailableRecognizerLanguages
        For Each language In languages
            Console.WriteLine(language.LanguageTag)
        Next
        Dim r = ocrEng.RecognizerLanguage
        Dim n = ocrEng.MaxImageDimension
        Dim ocrResult = Await ocrEng.RecognizeAsync(softwareBmp)
        RichTextBox1.Text = transpose(ocrResult.Text)
    End Sub
    

    I tested it with this handler:

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim input = "0.00003599 0.00003599 104.1K KIBA 23.22M KIBA 3.74655900 USDT 835.89654200 USDT 0.0000% 0.0000%"
        Dim output = transpose(input)
    End Sub
    

    Input:

    0.00003599 0.00003599 104.1K KIBA 23.22M KIBA 3.74655900 USDT 835.89654200 USDT 0.0000% 0.0000%

    Output:

    0.00003599 104.1K KIBA 3.74655900 USDT 0.0000%
    0.00003599 23.22M KIBA 835.89654200 USDT 0.0000%
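
    To see why the index row + numberOfRows * col undoes the column-by-column order: with 2 rows and 4 columns, the OCR string holds column 0 (rows 0 and 1), then column 1 (rows 0 and 1), and so on. The small sketch below only computes which input index ends up in which output cell; the module and function names are made up for illustration.

    ```vbnet
    Imports System.Collections.Generic

    Public Module TransposeIndexDemo
        ' For each output row, returns the indices into the column-by-column cell array
        Public Function rowMajorOrder(numberOfRows As Integer, numberOfColumns As Integer) As List(Of Integer())
            Dim order As New List(Of Integer())()
            For row = 0 To numberOfRows - 1
                Dim indices(numberOfColumns - 1) As Integer
                For col = 0 To numberOfColumns - 1
                    indices(col) = row + numberOfRows * col
                Next
                order.Add(indices)
            Next
            Return order
        End Function
    End Module
    ```

    For 2 rows and 4 columns this gives 0, 2, 4, 6 for the first row and 1, 3, 5, 7 for the second, which matches the input and output shown above.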

    Note that any cell containing more than one word must be protected before splitting: temporarily replace the space inside it with a pipe | so the cell is not split in two. If you encounter more such cells, add further Replace calls in the same way. If the pipe turns out to be a valid character in your data, use some other character you will never see.

    Dim fixedInput = input.Replace(" KIBA", "|KIBA").Replace(" USDT", "|USDT")
    ...
    sb.Append(words(i).Replace("|", " "))
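
    If the list of units grows, the protect-and-restore step can be pulled into two small helpers so that a new unit only has to be added in one place. This is a sketch, assuming every multi-word cell ends in a known unit; protectUnits, restoreSpaces and the Placeholder constant are hypothetical names, not part of the code above.

    ```vbnet
    Public Module CellProtection
        Private Const Placeholder As Char = "|"c ' assumed never to occur in the OCR text

        ' Replaces the space in front of each known unit with the placeholder,
        ' so "104.1K KIBA" becomes "104.1K|KIBA" and survives Split(" "c) as one cell
        Public Function protectUnits(input As String, ParamArray units As String()) As String
            For Each unit In units
                input = input.Replace(" " & unit, Placeholder & unit)
            Next
            Return input
        End Function

        ' Turns the placeholder back into a space once the cells have been split
        Public Function restoreSpaces(cell As String) As String
            Return cell.Replace(Placeholder, " "c)
        End Function
    End Module
    ```

    With these, the first line would read Dim fixedInput = protectUnits(input, "KIBA", "USDT") and the Append call would use restoreSpaces(words(i)).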
    

    Another solution again transposes the incorrect string, but this time the output is a list of objects that you can work with.

    Make a class to represent your data:

    Public Class KibaClass
        Public Property Price As Decimal
        Public Property VolumeKIBA As Decimal
        Public Property VolumeUSDT As Decimal
        Public Property Percent As Decimal
    End Class
    

    And a different function to parse into this class

    Private Function transposeToClass(input As String) As IEnumerable(Of KibaClass)
        Dim numberOfColumns = 4
        Dim fixedInput = input.Replace(" KIBA", "|KIBA").Replace(" USDT", "|USDT").Trim()
        Dim splitInput = fixedInput.Split(" "c)
        Dim numberOfRows = splitInput.Length \ numberOfColumns ' integer division; 2 for the sample input
        ' The OCR text uses "." as the decimal separator regardless of the machine's culture
        Dim inv = System.Globalization.CultureInfo.InvariantCulture
        Dim kibas As New List(Of KibaClass)()
        For row = 0 To numberOfRows - 1
            ' The input is column by column, so column col of this row sits at row + numberOfRows * col
            kibas.Add(New KibaClass With {
                .Price = Decimal.Parse(splitInput(row), inv),
                .VolumeKIBA = parseVolume(splitInput(row + numberOfRows)),
                .VolumeUSDT = parseVolume(splitInput(row + numberOfRows * 2)),
                .Percent = Decimal.Parse(splitInput(row + numberOfRows * 3).Replace("%", ""), inv) / 100D})
        Next
        Return kibas
    End Function

    ' Parses a cell such as "23.22M|KIBA" or "3.74655900|USDT" into a plain number
    Private Function parseVolume(cell As String) As Decimal
        Dim number = cell.Split("|"c)(0)
        Dim multiplier As Decimal = 1D ' used when the number has no suffix
        Dim lastChar = number(number.Length - 1)
        If Not Char.IsDigit(lastChar) Then
            number = number.Substring(0, number.Length - 1)
            Select Case Char.ToUpperInvariant(lastChar)
                Case "T"c
                    multiplier = 1000000000000D ' trillion
                Case "B"c
                    multiplier = 1000000000D ' billion
                Case "M"c
                    multiplier = 1000000D ' million
                Case "K"c
                    multiplier = 1000D ' thousand
            End Select
        End If
        Return Decimal.Parse(number, System.Globalization.CultureInfo.InvariantCulture) * multiplier
    End Function
    
    Dim output1 = transposeToClass(input)
    

    output1 is an IEnumerable(Of KibaClass). Each element represents one row, with the columns you originally OCR'd available as properly typed numeric properties.
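
    If you would rather fix the order before it becomes a string, the OCR result also carries geometry: OcrResult.Lines holds OcrLine objects, each OcrLine has a Words list, and each OcrWord has Text and BoundingRect. The following is a minimal sketch, not tested against your screenshot, that groups words whose vertical centres are close together into one row and then reads each row from left to right. WordBox, wordsToRows and the tolerance default are assumptions made for illustration; the tolerance in particular has to be tuned to your font size.

    ```vbnet
    Imports System.Collections.Generic
    Imports System.Linq

    ' Hypothetical carrier for one recognised word: its text, left edge and vertical centre
    Public Structure WordBox
        Public Text As String
        Public X As Double
        Public CentreY As Double
    End Structure

    Public Module OcrRowHelper
        ' Groups words into visual rows and returns one text line per row.
        ' tolerance: largest gap between vertical centres (in pixels) that still
        ' counts as the same row; 8.0 is an assumed default, tune it for your image.
        Public Function wordsToRows(boxes As IEnumerable(Of WordBox), Optional tolerance As Double = 8.0) As String
            Dim rows As New List(Of List(Of WordBox))()
            Dim lastCentre = Double.NegativeInfinity
            For Each box In boxes.OrderBy(Function(b) b.CentreY)
                ' Start a new row when this word sits clearly below the previous one
                If box.CentreY - lastCentre > tolerance Then
                    rows.Add(New List(Of WordBox)())
                End If
                rows(rows.Count - 1).Add(box)
                lastCentre = box.CentreY
            Next
            ' Within each row, read the words from left to right
            Dim lines = rows.Select(Function(r) String.Join(" ", r.OrderBy(Function(b) b.X).Select(Function(b) b.Text)))
            Return String.Join(Environment.NewLine, lines)
        End Function
    End Module

    ' Possible use with the result of RecognizeAsync:
    ' Dim boxes = ocrResult.Lines.SelectMany(Function(l) l.Words).Select(Function(w) New WordBox With {
    '     .Text = w.Text, .X = w.BoundingRect.X,
    '     .CentreY = w.BoundingRect.Y + w.BoundingRect.Height / 2})
    ' RichTextBox1.Text = OcrRowHelper.wordsToRows(boxes)
    ```

    Unlike the string transpose, this does not need the number of columns to be known in advance, but it does depend on the rows being far enough apart for the tolerance to separate them.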