
How can I extract a string between optional quotes?


I'm using a JavaScript regular expression to extract the "filename" from the Content-Disposition HTTP header.

An example of Content-Disposition value is:

attachment; filename="myFile.pdf"

In some cases the server does not enclose the filename in quotes:

attachment; filename=myFile.pdf

Case 1 (OK):

var contentDisposition = "attachment; filename=myFile.pdf" // get Content-Disposition from HTTP Header
const fileNameMatch = contentDisposition.match(/filename="?(.+)"?/);
const fileName = fileNameMatch[1];
console.log(fileName); // Expected: myFile.pdf - Actual: myFile.pdf

Case 2 (KO):

var contentDisposition = "attachment; filename=\"myFile.pdf\"" // get Content-Disposition from HTTP Header
const fileNameMatch = contentDisposition.match(/filename="?(.+)"?/);
const fileName = fileNameMatch[1];
console.log(fileName); // Expected: myFile.pdf - Actual: myFile.pdf"

In Case 2 the expected result is myFile.pdf, while the actual result is myFile.pdf" (the trailing quote is not removed).

How can I fix the regular expression so that Case 2 works?


Solution

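  • A non-greedy +? doesn't work before an optional quote: nothing after it is required, so it would capture only the first character. Try an explicit negated class [^"] instead: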

    let re = /filename="?([^"]+)"?/;

    let contentDisposition = `attachment; filename="myFile.pdf"`;
    console.log(contentDisposition.match(re)[1]); // myFile.pdf

    contentDisposition = `attachment; filename=myFile.pdf`;
    console.log(contentDisposition.match(re)[1]); // myFile.pdf

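    Another (and perhaps better) option is to anchor the pattern to the end of the string, so that the lazy +? has to expand up to the optional closing quote. This assumes filename is the last parameter in the header: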

    const anchoredRe = /filename="?(.+?)"?$/;

    let s = `attachment; filename="myFile.pdf"`;
    console.log(s.match(anchoredRe)[1]); // myFile.pdf

    s = `attachment; filename=myFile.pdf`;
    console.log(s.match(anchoredRe)[1]); // myFile.pdf
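
To make the quantifier behaviour concrete, here is a minimal sketch that compares the greedy and lazy patterns on the quoted header and wraps a negated-class pattern in a small helper. The helper name extractFileName, the extra `;` in the character class, and the sample header with a trailing size parameter are assumptions for illustration, not part of the question. The helper returns null instead of throwing when the header has no filename.

```javascript
// Hypothetical helper (the name is an assumption): returns the filename or null.
function extractFileName(contentDisposition) {
  // [^";]+ stops at a closing quote or at the next parameter separator.
  const match = /filename="?([^";]+)"?/.exec(contentDisposition);
  return match ? match[1].trim() : null;
}

const quoted = 'attachment; filename="myFile.pdf"';

// Greedy .+ runs to the end of the string, so the optional quote matches nothing.
const greedy = quoted.match(/filename="?(.+)"?/)[1];
// Lazy .+? stops after one character, because nothing after it is required.
const lazy = quoted.match(/filename="?(.+?)"?/)[1];

console.log(greedy); // myFile.pdf"
console.log(lazy);   // m
console.log(extractFileName(quoted)); // myFile.pdf
console.log(extractFileName('attachment; filename=myFile.pdf; size=123')); // myFile.pdf
console.log(extractFileName('inline')); // null
```

This sketch truncates a quoted filename that itself contains a semicolon, and it does not handle the encoded filename*= form, so treat it as a starting point rather than a complete header parser.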