javascript reactjs pdf.js

How to extract content from a PDF file using JavaScript


Hello, I am trying different ways to extract content from a PDF file, but nothing works for me. When I try pdf.js it also shows an error, and I don't know why. I am trying this approach:

var loadingTask = window.pdfjsLib.getDocument('dummy.pdf');
loadingTask.promise.then(function(pdf) {
 console.log(pdf)
});

But it shows the error "Cannot read property 'getDocument' of undefined". Please help me extract content from a PDF using pdf.js or anything else that works.


Solution

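  • It should work if you import it correctly. The error means window.pdfjsLib is undefined, typically because the pdf.js script was not loaded before this code ran.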

    <script src="https://cdnjs.cloudflare.com/ajax/libs/pdf.js/2.6.347/pdf.min.js"></script>
    <script>
      // pdf.js is loaded by the tag above, so the pdfjsLib global exists here.
      // Point it at the worker script of the same version.
      pdfjsLib.GlobalWorkerOptions.workerSrc = "https://cdnjs.cloudflare.com/ajax/libs/pdf.js/2.6.347/pdf.worker.min.js";

      var loadingTask = pdfjsLib.getDocument("dummy.pdf");
      loadingTask.promise.then(function(pdf) {
        console.log(pdf);
      });
    </script>

    In React it should be something like this:

    import * as pdfjsLib from "pdfjs-dist";
    pdfjsLib.GlobalWorkerOptions.workerSrc = `https://cdnjs.cloudflare.com/ajax/libs/pdf.js/${pdfjsLib.version}/pdf.worker.min.js`;
    
    const loadingTask = pdfjsLib.getDocument("dummy.pdf");
    loadingTask.promise.then(function(pdf) {
      console.log(pdf);
    });
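
Logging pdf only shows the loaded document object; to get the actual content you still have to read the text of each page. Here is a minimal sketch, assuming the getPage and getTextContent methods of pdf.js and that each text item carries its string in a str field. The names itemsToText and extractText are made up for illustration.

```javascript
// Joins the text fragments of one page into a single string.
// Items without a string `str` field (for example marked-content
// markers) are skipped.
function itemsToText(items) {
  return items
    .filter((item) => typeof item.str === "string")
    .map((item) => item.str)
    .join(" ");
}

// Collects the text of every page of an already loaded PDF document.
// `pdf` is the object that loadingTask.promise resolves to.
// Page numbers in pdf.js start at 1, not 0.
async function extractText(pdf) {
  const pages = [];
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const content = await page.getTextContent();
    pages.push(itemsToText(content.items));
  }
  return pages;
}
```

Chained onto either snippet above, loadingTask.promise.then(extractText) should resolve to an array with one string per page. Scanned PDFs that contain only images have no text layer, so this would likely give empty strings for them.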