I am creating PDF documents to train and test an AI model.
I generate the files using LaTeX templates and .NET code.
The model will have to work with scanned PDF documents. Is there any way in LaTeX to randomly give the document the features of a scan (blur, black dots, ...)? I want every image to look a bit different and not "clean".
I searched the internet but was unable to find a way to do that. I tried compressing the document after creation, which sadly has almost no effect. The only option I found for blurring was to duplicate the text with a slight offset, but that does not produce a result that looks scanned either.
Thanks in advance, Paul
One solution is ImageMagick, a PDF and image manipulation program. There is also a NuGet package for including it in .NET projects.
It works as a post-processing step: it takes the document generated by LaTeX and modifies it using displacement and image multiplication.
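As a rough illustration of the post-processing idea, the sketch below builds a randomized ImageMagick command line for one page. It is only a sketch under several assumptions: ImageMagick 7 is installed (the `magick` command) together with Ghostscript for reading PDFs, the helper names `scan_effect_args` and `degrade` and all value ranges are made up for this example, and it uses a slight rotation, blur, and Gaussian noise rather than the displacement and multiplication compositing mentioned above.

```python
import random
import subprocess


def scan_effect_args(src_pdf, out_png, rng):
    """Build an ImageMagick argument list with randomized scan-like defects.

    The ranges below are illustrative guesses, not tuned values.
    """
    angle = rng.uniform(-1.5, 1.5)  # slight page skew, in degrees
    sigma = rng.uniform(0.4, 1.2)   # blur strength (Gaussian sigma)
    noise = rng.uniform(0.2, 0.6)   # scales the amount of added noise
    return [
        "magick",
        "-density", "150",                 # rasterize the PDF at 150 DPI
        f"{src_pdf}[0]",                   # first page only
        "-background", "white",
        "-alpha", "remove",                # flatten transparency onto white
        "-colorspace", "Gray",             # grayscale, like many scanners
        "-rotate", f"{angle:.2f}",         # corners are filled with the background
        "-blur", f"0x{sigma:.2f}",
        "-attenuate", f"{noise:.2f}",
        "+noise", "Gaussian",
        out_png,
    ]


def degrade(src_pdf, out_png, seed=None):
    """Run ImageMagick once; a different seed gives a different looking 'scan'."""
    rng = random.Random(seed)
    subprocess.run(scan_effect_args(src_pdf, out_png, rng), check=True)
```

Calling `degrade("invoice.pdf", "invoice_scan.png", seed=i)` in a loop over `i` should give a slightly different result for each document (the file names are placeholders). From .NET, the same operations should be available through the Magick.NET NuGet package, although the exact method names would need to be checked against its documentation.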