c#, .net, pdf, password-protection

How can a password-protected PDF file be opened programmatically?


The Adobe IFilter doesn't provide a mechanism to supply a password, so it cannot be used to open password-protected PDF files.

I was wondering, is there a relatively straightforward way to programmatically retrieve the actual encrypted data inside the PDF file, decrypt it using a standard cryptography API, and then build a new PDF file with the decrypted data?


Solution

  • To open a password-protected PDF yourself, you would need to develop at least a PDF parser, a decryptor, and a generator. I wouldn't recommend doing that, though: it is far from an easy task.

    With the help of a PDF library, everything is much simpler. You might want to try the Docotic.Pdf library for the task (disclaimer: I work for the vendor of the library).

    Here is a sample for your task:

    using System.IO;
    using BitMiracle.Docotic.Pdf;

    // Place this method inside any class.
    // Writes an unprotected copy of the PDF at `input` to `output`.
    public static void UnprotectPdf(string input, string output)
    {
        bool passwordProtected = PdfDocument.IsPasswordProtected(input);
        if (passwordProtected)
        {
            // placeholder: retrieve the real password somehow
            string password = null;

            using (PdfDocument doc = new PdfDocument(input, password))
            {
                // clear both passwords in order
                // to produce an unprotected document
                doc.OwnerPassword = "";
                doc.UserPassword = "";

                doc.Save(output);
            }
        }
        else
        {
            // no decryption is required; overwrite the output if it exists
            File.Copy(input, output, true);
        }
    }
    

    Docotic.Pdf can also extract text (formatted or not) from PDFs. That might be useful for indexing, which I guess is what you are up to, since you mentioned the Adobe IFilter.
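
    If indexing is indeed the goal, the text extraction mentioned above could look roughly like the sketch below. It reuses the `PdfDocument(input, password)` constructor and `IsPasswordProtected` from the sample. The `GetText()` call, the `BitMiracle.Docotic.Pdf` namespace, and the helper name `ExtractTextForIndexing` are assumptions for illustration, so check them against the version of the library you use, as newer releases may expose a different API for passwords.

    ```csharp
    using BitMiracle.Docotic.Pdf; // assumed namespace of the Docotic.Pdf library

    public static class PdfIndexingSketch
    {
        // Hypothetical helper: returns the plain text of a PDF for an indexer.
        // `password` is only used when the file is password-protected.
        public static string ExtractTextForIndexing(string input, string password)
        {
            if (PdfDocument.IsPasswordProtected(input))
            {
                // Same constructor as in the sample above.
                using (PdfDocument doc = new PdfDocument(input, password))
                {
                    // GetText() is assumed to return the text of all pages.
                    return doc.GetText();
                }
            }

            // Unprotected file: no password needed.
            using (PdfDocument doc = new PdfDocument(input))
            {
                return doc.GetText();
            }
        }
    }
    ```

    This way no decrypted copy of the file has to be written to disk; the text goes straight to whatever indexer you use.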