java regex double-quotes

Restricting a regular expression to a single line


I have the following CSV file from one of our systems.

""demo"",""kkkk""
""demo " ","fg"
"     " demo"  "
"demo"
"value1","" frg"   ","vaue5"
"val3",""tttyy "      ",""hjhj","ghuy"

The objective is to remove every doubled pair of double quotes so that each value is wrapped in only one set of double quotes, as shown below. The amount of whitespace between the double quotes is not fixed. This has to be handled in a Java program using the replaceAll function.

"demo","kkkk"
"demo","fg"
"demo"
"demo"
"value1","frg","vaue5"
"val3","tttyy","hjhj","ghuy"

I tried this on regex101 with "[ ]*" and it works for the PHP >= 7.3 flavor but not in Java. I also tried [\"][\"]|[^\"]\s+[\"] but still did not get the desired output. Can anyone suggest a regular expression that can be used in a Java program?


Solution

  • Based on the sample data shown, you can use:

    String repl = str.replaceAll("(?:\\h*\"){2}\\h*", "\"");
    


    RegEx Details:

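    • (?:\h*\"){2}: Match two double quotes, each optionally preceded by 0 or more horizontal whitespace characters
    • \h*: Match 0 or more horizontal whitespace characters after the pair. Unlike \s, \h does not match line breaks, so a match never runs from one line into the next
    • Replacement is just a single "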
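
As a minimal sketch of how the pattern above could be applied, assuming the whole file content is held in one string: the class name QuoteCleaner and the method name clean are hypothetical, chosen only for illustration. The last call in main checks that a quote at the end of one line and a quote at the start of the next are not merged.

```java
import java.util.regex.Pattern;

class QuoteCleaner {
    // Two double quotes, each optionally preceded by horizontal whitespace,
    // followed by any horizontal whitespace after the pair.
    // \h never matches a line break, so a match stays within one line.
    private static final Pattern DOUBLED_QUOTES =
            Pattern.compile("(?:\\h*\"){2}\\h*");

    // Hypothetical helper: collapses each matched run into a single quote.
    static String clean(String text) {
        return DOUBLED_QUOTES.matcher(text).replaceAll("\"");
    }

    public static void main(String[] args) {
        // Rows taken from the sample data in the question.
        System.out.println(clean("\"\"demo\"\",\"\"kkkk\"\""));
        System.out.println(clean("\"\"demo \" \",\"fg\""));
        System.out.println(clean("\"value1\",\"\" frg\"   \",\"vaue5\""));

        // Two rows joined by a newline: the line break is left in place.
        System.out.println(clean("\"     \" demo\"  \"\n\"demo\""));
    }
}
```

Compiling the pattern once is a design choice for repeated use; calling str.replaceAll with the same pattern string, as in the answer, gives the same result for a single call.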