regex uipath uipath-studio uipath-activity

Address Extraction


I want to extract all the addresses from this string. The regex should be generic.

Input string:

ABC MEDICAL CENTER
PO BOX 134
WILSON, NC 27234
SIVER BANK
4235 EXECUTIVE SQ STE 140
LAY JOLLA GA 22037ABC MEDICAL CENTER
PO BOX 134
WILSON, NC 27234
ABC MEDICAL CENTER
P.O.BOX 1624
MILSON, NC 2084
ABC MEDICAL CENTER
P.O.BOX 1689
MILSON, NC 20834
ABC MEDICAL CENTER
P.O.BOX 1625
MILSON, NG 27812

Solution

  • You can use the String.split() method or the StringTokenizer class to split a comma-separated String in Java.

    import java.util.Arrays;

    public class Main {
        public static void main(String[] args) {
            String CSV = "Google,Apple,Microsoft";
            String[] values = CSV.split(",");
            System.out.println(Arrays.toString(values)); // Output: [Google, Apple, Microsoft]
        }
    }
    

    You can also create an ArrayList from the split values (this needs import java.util.ArrayList), as shown below:

    ArrayList<String> list = new ArrayList<>(Arrays.asList(values));
    

    If your comma-separated String also contains whitespace between values, you can split on the following regular expression, which consumes the whitespace on either side of each comma along with the comma itself.

    String CSV = "Google, Apple, Microsoft";
    String[] values = CSV.split("\\s*,\\s*");
    System.out.println(Arrays.toString(values));
    

    Here \\s* is the regular expression that matches zero or more whitespace characters.
    \s is the metacharacter that matches whitespace, including tabs. Since \ (backslash) must be escaped in a Java string literal, it becomes \\ (double backslash) and \s becomes \\s. The * (star or asterisk) is another special character in regular expressions; it means zero or more times. So \\s* matches any run of whitespace, including none at all.
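
The two approaches mentioned above can be sketched side by side. This is a minimal illustration, not a definitive implementation: the class name CsvSplitSketch and the helper methods splitTrimmed and tokenize are hypothetical names chosen for this example. It assumes you also want to drop whitespace at the very start and end of the input, which the \\s*,\\s* pattern alone leaves in place, so the sketch calls trim() first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class CsvSplitSketch {

    // Splits on commas and drops whitespace (spaces, tabs) around each comma.
    // trim() removes whitespace at the start and end of the whole input,
    // which the split pattern by itself would not touch.
    public static List<String> splitTrimmed(String csv) {
        return Arrays.asList(csv.trim().split("\\s*,\\s*"));
    }

    // StringTokenizer alternative. It treats consecutive delimiters as one,
    // so empty values between two commas are skipped.
    public static List<String> tokenize(String csv) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(csv, ",");
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Mixed spaces and a tab around the commas
        System.out.println(splitTrimmed(" Google ,\tApple,  Microsoft "));
        System.out.println(tokenize("Google,Apple,Microsoft"));
    }
}
```

Both calls in main print [Google, Apple, Microsoft]. One difference worth knowing: for an input such as "a,,b", String.split(",") keeps the empty value in the middle, while StringTokenizer returns only a and b.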