
iOS getting text from pdf


Hello, I'm working on a speed-reading app and I'm looking for tips or suggestions. The app has to support different reading techniques, which requires taking the text from a PDF and reformatting it at different sizes, for example for auto-scrolling without pictures. Does anyone already know how to do this, or have an example for me?


Solution

  • If the PDF's text is weirdly formatted or contained in images, you are out of luck. Otherwise, there are several ObjC libraries available (on GitHub).

    They all wrap the CoreGraphics CGPDF* functions.

    This isn't that easy and can't be answered in a one-liner, but the basic approach is:

    1. Get a CGPDFDocument.
    2. Get each CGPDFPage.
    3. Get the CGPDFDictionary for each page and parse it. It will give you all objects in the PDF page (the text strings themselves sit in the page's content stream, which the dictionary references).
    4. For each string you encounter, call CGPDFStringCopyTextString and append the result to a mutable string that serves as your buffer.
    5. The buffer is the document's text.
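
A minimal sketch of these steps, under some assumptions: instead of walking the page dictionary by hand, it uses CoreGraphics' CGPDFScanner to read each page's content stream and registers callbacks only for the `Tj` and `TJ` text-showing operators. The names `ExtractTextFromPDF`, `AppendPDFString`, `HandleTj` and `HandleTJ` are made up for illustration, and the code assumes ARC is enabled. It ignores fonts, encodings and positioning, so PDFs with custom-encoded fonts will likely still come out garbled.

```objc
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Hypothetical helper: decode one PDF string and append it to the buffer.
static void AppendPDFString(CGPDFStringRef pdfString, NSMutableString *buffer) {
    CFStringRef text = CGPDFStringCopyTextString(pdfString);
    if (text != NULL) {
        [buffer appendString:(__bridge NSString *)text];
        CFRelease(text);
    }
}

// Callback for the "Tj" operator: shows a single string.
static void HandleTj(CGPDFScannerRef scanner, void *info) {
    CGPDFStringRef pdfString = NULL;
    if (CGPDFScannerPopString(scanner, &pdfString)) {
        AppendPDFString(pdfString, (__bridge NSMutableString *)info);
    }
}

// Callback for the "TJ" operator: shows an array that mixes strings
// with kerning numbers; only the strings are collected.
static void HandleTJ(CGPDFScannerRef scanner, void *info) {
    CGPDFArrayRef array = NULL;
    if (!CGPDFScannerPopArray(scanner, &array)) return;
    size_t count = CGPDFArrayGetCount(array);
    for (size_t i = 0; i < count; i++) {
        CGPDFStringRef pdfString = NULL;
        if (CGPDFArrayGetString(array, i, &pdfString)) {
            AppendPDFString(pdfString, (__bridge NSMutableString *)info);
        }
    }
}

// Hypothetical entry point: returns the collected text, or nil if the
// document cannot be opened.
NSString *ExtractTextFromPDF(NSURL *url) {
    // Step 1: get a CGPDFDocument.
    CGPDFDocumentRef document = CGPDFDocumentCreateWithURL((__bridge CFURLRef)url);
    if (document == NULL) return nil;

    NSMutableString *buffer = [NSMutableString string];
    CGPDFOperatorTableRef table = CGPDFOperatorTableCreate();
    CGPDFOperatorTableSetCallback(table, "Tj", HandleTj);
    CGPDFOperatorTableSetCallback(table, "TJ", HandleTJ);

    // Step 2: visit each page (page numbers start at 1).
    size_t pageCount = CGPDFDocumentGetNumberOfPages(document);
    for (size_t i = 1; i <= pageCount; i++) {
        CGPDFPageRef page = CGPDFDocumentGetPage(document, i);
        if (page == NULL) continue;

        // Steps 3 and 4: scan the page's content stream; the callbacks
        // append every string they encounter to the buffer.
        CGPDFContentStreamRef stream = CGPDFContentStreamCreateWithPage(page);
        CGPDFScannerRef scanner = CGPDFScannerCreate(stream, table, (__bridge void *)buffer);
        CGPDFScannerScan(scanner);
        CGPDFScannerRelease(scanner);
        CGPDFContentStreamRelease(stream);

        [buffer appendString:@"\n"]; // separate pages
    }

    CGPDFOperatorTableRelease(table);
    CGPDFDocumentRelease(document);

    // Step 5: the buffer is the document's text.
    return buffer;
}
```

Calling `ExtractTextFromPDF([NSURL fileURLWithPath:path])` with the path of a bundled PDF would give you one string to re-flow at whatever size the reading technique needs. A fuller version would probably also handle the `'` and `"` operators and insert spaces or line breaks based on text positioning.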