Search code examples
sqlregexperlinformixisql

PERL: Regular Expression for reading Social Security # with Dashes


I am writing a perl script that reads Social Security numbers from a file, looks up info across multiple tables and outputs to a delimited file. I'm writing with PERL 5 and dealing with IBM informix. I have the feeling that it is my REGEX that is the issue. I am getting multiple lines of the following error:

DBD::Informix::st execute failed: SQL: -1213: A character to numeric conversion process failed at ./corylist.pl line 61, line 461. DBD::Informix::st fetchrow_array failed: SQL: -400: Fetch attempted on unopen cursor. at ./corylist.pl line 63, line 461.

Can someone please nudge me in the right direction? Thank you!

while(<IN>) {
    $id = $_;
    chomp $id;
    $id =~ m/^\d{3}-\d{2}-\d{4}$/;
    #print "$id\n";

$STMT = <<EOF;
select  i.ss_no,
    i.fullname,             i.firstname,i.lastname,i.addr_line1,i.addr_line2,i.city,i.st,i.zip,r.res_ctry,r.res_cty,
    i.phone,NVL(aa.phone," ") cell,NVL(a.line1," ") stuemail,NVL(pa.line1," ") peremail

  from  id i,
    prof r,
    outer aa_rec a,
    outer aa_rec aa,
    outer aa_rec pa
where   i.ss_no = $id
and     i.id = r.id
and     i.decsd <> "Y"
and a.id = i.id and a.aa = "EML" and a.end_date is null
and pa.id = i.id and pa.aa = "OEML" and pa.end_date is null
and pa.beg_date = (select max(beg_date) from aa_rec where aa = "OEML" and id=$id and end_date is null)
and aa.id = i.id and aa.aa = "CELL" and aa.end_date is null
group by     ss_no,fullname,firstname,lastname,addr_line1,addr_line2,city,st,zip,res_ctry,res_cty,phone,cell,stuemail,peremail
order   by fullname, ss_no
EOF

$sth = $db1->prepare($STMT);
$sth->execute();

while (($id,$fullname,$fname,$lname,$addr1,$addr2,$city,$st,$zip,$ctry,$cnty,$phone,$cell,$stuemail,$peremail) = $sth->fetchrow_array()) {

    $x = $id." | ". $fullname." | ";
    $x .= $fname." | ".$lname." | ".$addr1." | ".$addr2." | ".$city." | ".$st." | ".$zip." | ".$ctry." | ".$cnty." | ";
    $x .= $phone." | ".$cell." | ".$stuemail." | ".$peremail." | \n";
    print $out_fh $x;
}

Solution

  • Your regex is fine, but it's not doing anything.

    $id =~ m/^\d{3}-\d{2}-\d{4}$/;
    

    That line will be true if $id matches the pattern. It does nothing else.

    A character to numeric conversion process failed

    The error message says that your database wants a number, but gets something it cannot convert what it got. Since you use $id in the query, that must be the dashes. So it's save to assume that your SSN is an integer of some sort.

    The easiest way to get rid of the error message is to just remove anything that's not a number. That will get rid of dashes - and whatever else people thought to input.

    while ( my $id = <IN>) {
        chomp $id;
        $id =~ s/\D//g; # remove any non-digits
    
        # ...
    }
    

    Now you can do the insert. But you should really not insert data by writing the variable directly into your SQL without proper quoting. That's an invitation for SQL injection. Instead, use placeholders.

    my $sql = "SELECT * FROM foo WHERE bar=?";
    

    Now when you execute the prepared statement, you pass along $id.

    my $sth = $dbh->prepare($sql);
    $sth->execute($id);
    

    If you are dealing with a large file it might be a good idea to use fetchrow_arrayref or fetchrow_hashref instead of fetchrow_array because all the copying of variables is pretty expensive. Also look at this presentation to learn more about using DBI in a fast way.

    You might want to look into SSN::Validate for actually validating the social security numbers. If you use that, it seems like you don't need to do the clean-up suggested above.

    You could also look at Text::CSV for a cleaner way of creating your CSV output.