Tags: python, opencv, machine-learning, artificial-intelligence, tesseract

How to recognize numbers in an image in Python?


I'm trying to write a bot that plays Tetris on tetrisfriends.com as a way to practice machine learning, but I've become stuck. I need to read the player's score from the game, but Tesseract doesn't recognize the font or the numbers. I don't think I can retrain Tesseract on it either, because the game doesn't use a full font, only digits.

The image that I'm trying to read the numbers from is this: https://i.sstatic.net/7mCsv.jpg

When I use Tesseract, it recognizes other words on the page, just not the numbers, which are the part I need.

Does anyone know a way to do this, whether by retraining Tesseract or by some other method?


Solution

  • I'm not very familiar with Tesseract in particular, but it might not be your best bet here. If the end goal is just to make a bot, you could probably pull the text directly from the app rather than worrying about OCR. If you want to learn more about machine learning and haven't tried them already, the MNIST and CIFAR-10 datasets are fantastic places to start.

    Anyway! The image you're trying to read has very low contrast, and the font is heavily stylised. On the website itself, the characters appear to be coloured yellow:

    before

    If you preprocessed your image so that yellow pixels become black and all others become white, you would have a much cleaner source to work with, e.g.:

    after

    If you want to push forward with Tesseract and the preprocessing isn't enough, you will probably have to retrain it for this font. You will need to prepare a corpus, process it so that it resembles your expected source data, and then use something like qt-box-editor to correct the data. This guide should be able to walk you through the basic steps of retraining.