java regex conditional-statements

Is it possible to replace many "contains" checks with a regex in this case?


I retrieve some content from an Excel file, specifically some IDs. Officially, the delimiter between IDs is a ',' and I need to ignore the line if it contains other delimiters or anything else that isn't correct, such as spaces.

Example :

Nominal case : value = "8000,7000,7500,840000,870"

Wrong case : value = "8000;7000;7500,840000 870"

OR

value = "8000 7000 84000 870"

I thought at first of doing something like this:

        while (rows.hasNext()) {
            TableRow row = rows.next();

            // second parameter of getCellValueAsList is the delimiter
            definitiveMediaToDeleteList = row.getCellValueAsList("A", ",");
            if (definitiveMediaToDeleteList.contains(";") || definitiveMediaToDeleteList.contains("") || definitiveMediaToDeleteList.contains("")) {
                REPORT.warn("Incorrect delimiters row {}", row);
                continue;
            }
        }

But I think this is the wrong way to deal with the problem. Besides, I will never cover all the wrong cases I might encounter in what row.getCellValueAsList("A", ",") returns.

How can I use a regex for this, or how else can I deal with it?

EDIT: Here is some more information on what is allowed and what is not:

I should have IDs, each separated by a ",", with no spaces and no other delimiters like ";" or "/" or anything else. And of course I can have exactly one ID.


Solution

  • You can try out a regex with some input strings like this:

    import java.util.regex.Pattern;
    
    public class so73895507 {
    
        static Pattern pattern = Pattern.compile("^(?:\\d+,)*\\d+$");
    
        public static void main(String[] args) {
            checkString("8000,7000,7500,840000,870");  // nominal many
            checkString("8000");                       // nominal single
            checkString("8000;7000;7500,840000 870");  // wrong 1
            checkString("8000 7000 84000 870");        // wrong 2
            checkString("8000,");                      // wrong 3
        }
    
        static void checkString(String str) {
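            // matches() requires the whole string to match; find() with ^...$ would also accept a trailing newline
            boolean check = pattern.matcher(str).matches();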
            System.out.println(String.format("%-32s -> %s", str, check));
        }
    
    }
    

    Output:

    8000,7000,7500,840000,870        -> true
    8000                             -> true
    8000;7000;7500,840000 870        -> false
    8000 7000 84000 870              -> false
    8000,                            -> false
    

    The discussion between @erik258 and @Carapace raises good points: maybe ^(\d+,)+\d+$ or ^(?:\d+,)+\d+$ is better suited to your use case. However, both of them would reject a single ID in a cell. But we can only guess what your input data may look like...

    Edit: Updated answer to reflect new info (single values should be accepted).
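
To tie this back to the row loop in the question, here is a rough sketch of one way the check could be applied: validate the raw cell text first, and only split it on "," once it has passed. The names `IdListValidator`, `parseIds`, `AT_LEAST_TWO_IDS` and the sample cell list are made up for illustration. The `TableRow` / `getCellValueAsList` API from the question is not used here because its behaviour is not shown, so this assumes you can get at the cell content as a plain `String`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

class IdListValidator {

    // Same pattern as in the answer: digit runs separated by single commas.
    // With matches() the ^ and $ anchors are redundant but harmless.
    static final Pattern ID_LIST = Pattern.compile("^(?:\\d+,)*\\d+$");

    // Stricter variant from the comments: needs at least one comma,
    // so a cell holding a single ID is rejected.
    static final Pattern AT_LEAST_TWO_IDS = Pattern.compile("^(?:\\d+,)+\\d+$");

    // Hypothetical helper: returns the IDs if the raw cell text is well-formed,
    // otherwise an empty Optional so the caller can skip the row.
    static Optional<List<String>> parseIds(String rawCell) {
        if (rawCell == null || !ID_LIST.matcher(rawCell).matches()) {
            return Optional.empty();
        }
        // Safe to split now: the text contains only digits and single commas.
        return Optional.of(Arrays.asList(rawCell.split(",")));
    }

    public static void main(String[] args) {
        // Sample values standing in for the raw cell content of each row (assumption).
        List<String> cells = Arrays.asList(
                "8000,7000,7500,840000,870",
                "8000",
                "8000;7000;7500,840000 870",
                "8000 7000 84000 870",
                "8000,");

        for (String cell : cells) {
            Optional<List<String>> ids = parseIds(cell);
            if (!ids.isPresent()) {
                // Corresponds to the REPORT.warn(...) + continue in the question.
                System.out.println("Incorrect delimiters, skipping: " + cell);
                continue;
            }
            System.out.println("IDs: " + ids.get());
        }
    }
}
```

Validating before splitting is the point of the design: once the cell has already been split on ",", a check like `contains(";")` on the resulting list only finds an element that is exactly ";", so malformed entries such as "8000;7000" slip through. If a lone ID should be treated as an error after all, swapping `ID_LIST` for `AT_LEAST_TWO_IDS` in `parseIds` would do that.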