Tags: java, regex, string, selenium, string-matching

Regular Expressions match case replace


How do I write a regular expression that matches a character repeated one or more times and followed by some other character?

For example, the input string may be any of the following:

 //window[2]//header[@id=\'top\']/div[1]//a[1]
 //window[2]//header[@id=\\'top\\']/div[1]//a[1]
 //window[2]//header[@id=\\\'top\\\']/div[1]//a[1]
 //window[2]//header[@id=\\\\'top\\\\']/div[1]//a[1]
 //window[2]//header[@id=\\\\\'top\\\\\']/div[1]//a[1]

 (OR)

 //window[2]//header[@id=~~~~~'top~~~~~']/div[1]//a[1]

The expected output, using a regex replaceAll, is:

//window[2]//header[@id='top']/div[1]//a[1]

I have tried these regular expressions:

xpathJSON.replaceAll("/[~{1,}[']]/", "'")
xpathJSON.replaceAll("/^[~+]&&[']$/", "'")

but neither of them changes the string.
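A small check of how Java reads those two patterns may help explain the result. This is only a sketch: the class name `AttemptCheck` and the helper `changes` are made up for illustration. Java has no `/.../` regex literal syntax, so the surrounding slashes are matched as literal characters, and inside `[...]` the text `{1,}` is a set of ordinary characters rather than a quantifier.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original test code.
public class AttemptCheck {
    static final String INPUT = "//window[2]//header[@id=~~~~~'top~~~~~']/div[1]//a[1]";

    // Returns true if replaceAll with the given regex alters INPUT at all.
    static boolean changes(String regex) {
        return !INPUT.replaceAll(regex, "'").equals(INPUT);
    }

    public static void main(String[] args) {
        // The leading and trailing '/' must match literally, so nothing is replaced.
        System.out.println(changes("/[~{1,}[']]/"));   // false
        // '^' can never match after a literal '/' has already been consumed.
        System.out.println(changes("/^[~+]&&[']$/"));  // false

        // The character class matches exactly ONE of: ~ { 1 , } '
        System.out.println(Pattern.matches("/[~{1,}[']]/", "/~/")); // true
        System.out.println(Pattern.matches("/[~{1,}[']]/", "/1/")); // true
    }
}
```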

Test code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is illustrative; the original snippet showed only main().
public class XPathQuoteTest {
    public static void main(String[] args) {
        String xpathJSON = "//window[2]//header[@id=\"top\"]/div[1]//a[1]"; // « //window[2]//header[@id=\\\\\'top\\\\\']/div[1]//a[1]

        for (int i = 0; i < 5; i++) {
            xpathJSON = xpathJSON.replaceAll("\"", "\'");

            // This replacement happens each time the window navigates forward or backward.
            xpathJSON = xpathJSON.replaceAll("\'", "\\\\\'"); // adds one more backslash before each '
            System.out.println("\t « " + xpathJSON);
        }

        System.out.println("xpathJSON \n\t" + xpathJSON);

        // Swap every backslash for '~' so the string is easier to read.
        xpathJSON = xpathJSON.replaceAll("\\\\", "~");
        System.out.println(xpathJSON);
        // http://www.regular-expressions.info/wordboundaries.html
        // https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
        Pattern p = Pattern.compile
                ("[~]");
                //("^[~+]&&[']$"); // ^beginning +followed by $end {\\\\ - ~ = \}
        Matcher matcher = p.matcher(xpathJSON);

        boolean match = false, find = false;
        if (matcher.matches()) match = true; // true only if the WHOLE string matches the pattern
        if (matcher.find())    find  = true; // finds the next subsequence that matches the pattern

        // Count every '~' by restarting the search one position after each hit.
        int from = 0;
        int count = 0;
        while (matcher.find(from)) {
            count++;
            from = matcher.start() + 1;

            // Another approach is to break when the index of \' is reached.
        }
        System.out.println(count);

        System.out.format("\t Match[%s] Find[%s]\n", match, find);

        // Attempted regex: leaves the string unchanged.
        System.out.println("regular expression : " + xpathJSON.replaceAll("/[~{1,}[']]/", "'"));

        // Workaround: strip one '~' before each quote per pass until none remain.
        while (xpathJSON.contains("~'")) {
            xpathJSON = xpathJSON.replaceAll("~'", "'");
        }
        System.out.println("Contains Replace : " + xpathJSON);
    }
}

Solution

  • Here is one way to do it:

    ([\\~]+(['"]))(.*?)\1

    Replace with:

    $2$3$2

    https://regex101.com/r/OFojVx/1/

    In Java:

    .replaceAll("([\\\\~]+(['\"]))(.*?)\\1", "$2$3$2")
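To see the answer's pattern at work, the sketch below applies it to a few of the sample inputs from the question. The class name `UnescapeQuotes` and the method `unescape` are illustrative names, not part of the answer. Group 1 captures the run of backslashes or tildes together with the quote, group 2 captures the quote alone, group 3 lazily captures the quoted content, and `\1` requires the same escaped-quote sequence to close it.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the regex given in the answer.
public class UnescapeQuotes {
    // ([\\~]+(['"]))(.*?)\1 written as a Java string literal.
    private static final Pattern ESCAPED_QUOTED =
            Pattern.compile("([\\\\~]+(['\"]))(.*?)\\1");

    // Rebuilds each match as quote + content + quote, dropping the escape run.
    static String unescape(String xpath) {
        return ESCAPED_QUOTED.matcher(xpath).replaceAll("$2$3$2");
    }

    public static void main(String[] args) {
        String[] inputs = {
            "//window[2]//header[@id=\\'top\\']/div[1]//a[1]",         // one backslash
            "//window[2]//header[@id=\\\\\\'top\\\\\\']/div[1]//a[1]", // three backslashes
            "//window[2]//header[@id=~~~~~'top~~~~~']/div[1]//a[1]"    // five tildes
        };
        for (String input : inputs) {
            // Each line prints: //window[2]//header[@id='top']/div[1]//a[1]
            System.out.println(unescape(input));
        }
    }
}
```

Because of the backreference, this only matches when the opening and closing quotes carry the same escape run, which is the case for every example in the question.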