perl

How can I split a string on the first occurrence of a digit?


I have strings that consist of a name followed by two numbers. I would like to extract the name and each number into its own variable. The problem is that some names contain spaces, so when I split on /\s+/ the name itself is split into pieces.

my (${st_name}, $val1, $val2) = split(/\s+/, $line, 3);

I have tried splitting on /\d+/, but then I do not get the digits. I have also tried to get the index of the first digit, not sure if it is really

my $index = index ($line, \d);

I would appreciate any assistance. Code tried:

use strict;
use warnings;

while (my $line = <DATA>){
my (${st_name}, $val1, $val2) = split(/\s+/, $line, 3);   #doesn't work

my $index = index ($line, \d);
${st_name}=$line(0, $index);
my ($val1, $val2) = $line($index)


__DATA__
Maputsoe 2       1
Butha-Buthe (Butha-Buthe District) 2       1

Solution

  • You can make a regular expression match and capture the pieces you want. Looks like you want some text, then a space, then a number, more space(s), and another number?

    use strict;
    use warnings;
    
    while (my $line = <DATA>) {
        my ($st_name, $val1, $val2) = $line =~ m/^(.+)\s+(\d+)\s+(\d+)/;
        print "$st_name, $val1, $val2\n";
    }
    
    __DATA__
    Maputsoe 2       1
    Butha-Buthe (Butha-Buthe District) 2       1
    

    This prints

    Maputsoe, 2, 1
    Butha-Buthe (Butha-Buthe District), 2, 1
    

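    The regular expression matches one or more (+) characters (.), followed by one or more whitespace characters (\s+), then one or more digits (\d+), then whitespace and digits again. Because .+ is greedy, the name may itself contain spaces: the regex engine backtracks only far enough to leave the two trailing numbers for the remaining groups. In list context the match returns the three captured groups, which are assigned to $st_name, $val1 and $val2.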
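For comparison, here is a minimal sketch of two ways to handle this, assuming every valid line is a name followed by two whitespace-separated integers. The sub names parse_with_regex and parse_with_split are hypothetical helpers introduced only for illustration. The first wraps the capture approach and checks whether the match succeeded. The second is closer to the question title and splits once on the whitespace that precedes the first digit, using a lookahead so the digit itself is not consumed.

```perl
use strict;
use warnings;

# Hypothetical helper: capture name and two numbers with one regex.
# Returns an empty list when the line does not have the expected shape.
sub parse_with_regex {
    my ($line) = @_;
    # A list assignment evaluated in scalar context yields the number of
    # values on the right-hand side, so a failed match (empty list) is false.
    my ($name, $val1, $val2) = $line =~ /^(.+)\s+(\d+)\s+(\d+)\s*$/
        or return;
    return ($name, $val1, $val2);
}

# Hypothetical helper: split once, on the whitespace just before the
# first field that starts with a digit, then split the remainder.
sub parse_with_split {
    my ($line) = @_;
    # (?=\d) is a lookahead: the digit must follow but is not removed.
    # The limit of 2 stops splitting after the first separator.
    my ($name, $rest) = split /\s+(?=\d)/, $line, 2;
    return unless defined $rest;
    # split ' ' skips leading whitespace and drops the trailing newline.
    my ($val1, $val2) = split ' ', $rest;
    return ($name, $val1, $val2);
}

my @lines = (
    "Maputsoe 2       1\n",
    "Butha-Buthe (Butha-Buthe District) 2       1\n",
);

for my $line (@lines) {
    if (my ($name, $val1, $val2) = parse_with_regex($line)) {
        print "$name, $val1, $val2\n";
    }
    else {
        warn "Skipping line that does not match: $line";
    }
}
```

The two helpers agree on the sample data but differ when a name itself contains a number. For a line such as "District 9 2 1", the greedy regex keeps "District 9" as the name, while the split version cuts at the first digit and returns "District" with 9 and 2 as the values. Which behavior is right depends on the data.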