php regex text-parsing

Parse string into its 3 parts: name, age, city


I want to parse a line of text into three distinct parts (name, age, and city).

Example 1:

Muhammad Sholehhudin 24 Old Malang

Expected Result:

Name: Muhammad Sholehhudin
Age: 24 Old
City: Malang

The name and city substrings may contain one or more words.

Example 2:

Muhammad Sholehhudin Fauzi 51 Old Malang Kota

Expected Result:

Name: Muhammad Sholehhudin Fauzi
Age: 51 Old
City: Malang Kota

Solution

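  • You can use preg_split() to split the text on a pattern. With the PREG_SPLIT_DELIM_CAPTURE flag, the text matched by each capture group is returned along with the split pieces.

    The regex has three alternatives separated by '|', which means "or".
    1. ^([A-Za-z]+ ){2,3}
    - '^' anchors the match at the beginning of the string.
    - '[A-Za-z]+ ' = one or more letters followed by a space. ('[A-z]' is not equivalent: it also matches [ \ ] ^ _ and `, which sit between 'Z' and 'a' in ASCII.)
    - '{2,3}' = that group two or three times, so the name must be two or three words.
    2. \d{1,3} Old = one to three digits followed by ' Old'.
    3. (\w+\s?){1,2}$
    - '\w+' = one or more word characters.
    - '\s?' = an optional whitespace character.
    - '{1,2}' = that group once or twice, so the city must be one or two words.
    - '$' anchors the match at the end of the string.

    $input = $_POST['input_name'];
    
    // Three alternatives: name | age | city. The outer parentheses are capture group 1.
    $pattern = '/((^([A-Za-z]+ ){2,3})|(\d{1,3} Old)|(\w+\s?){1,2}$)/';
    
    // DELIM_CAPTURE adds the capture groups of every match to the result, so for
    // inputs shaped like the examples the three parts land at indexes 1, 5, and 10.
    $matches = preg_split($pattern, $input, -1, PREG_SPLIT_DELIM_CAPTURE);
    
    $name = trim($matches[1]); // the name match ends with a space
    $age = $matches[5];
    $city = $matches[10];
    
    echo 'Name: ' . $name;
    echo ' Age: ' . $age;
    echo ' City: ' . $city;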
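
The fixed indexes above depend on the exact shape of the pattern. As a possible alternative, not part of the original answer, here is a minimal sketch that uses preg_match() with named groups. It assumes every line has the layout "<name> <1 to 3 digits> Old <city>" and that the name contains no digits. The function name parsePerson() is hypothetical.

```php
<?php
// Hypothetical helper (not from the original answer): returns the three parts
// as an associative array, or null when the line does not fit the assumed
// "<name> <digits> Old <city>" layout.
function parsePerson(string $line): ?array
{
    // name: shortest run of non-digits before the age
    // age:  1 to 3 digits followed by "Old"
    // city: everything after the age
    $pattern = '/^(?<name>\D+?)\s+(?<age>\d{1,3}\s+Old)\s+(?<city>.+)$/';

    if (preg_match($pattern, trim($line), $m) !== 1) {
        return null;
    }

    return [
        'name' => $m['name'],
        'age'  => $m['age'],
        'city' => $m['city'],
    ];
}

$parts = parsePerson('Muhammad Sholehhudin Fauzi 51 Old Malang Kota');
if ($parts !== null) {
    echo 'Name: ' . $parts['name'] . PHP_EOL;
    echo 'Age: '  . $parts['age']  . PHP_EOL;
    echo 'City: ' . $parts['city'] . PHP_EOL;
}
```

Because the name and city are delimited only by the age in the middle, this sketch does not limit either one to a fixed number of words, and a line without a recognizable age yields null instead of undefined array indexes.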