regex differentiating between ISBN-10 and ISBN-13


I have an If-else statement which checks a string to see whether there is an ISBN-10 or ISBN-13 (book ID).

The problem is that the ISBN-10 check runs before the ISBN-13 check. The ISBN-10 pattern matches any run of 10 or more digits, so it can mistake an ISBN-13 for an ISBN-10.

Here is the code:

$str = "ISBN:9780113411436";

if(preg_match("/\d{9}(?:\d|X)/", $str, $matches)){
   echo "ISBN-10 FOUND\n";  
   //isbn returned will be 9780113411
   return 0;
}

else if(preg_match("/\d{12}(?:\d|X)/", $str, $matches)){
   echo "ISBN-13 FOUND\n";
   //isbn returned will be 9780113411436
   return 1;
}

How do I make sure I avoid this problem?


Solution

  • You only need one regex for this; a cheap strlen() check on the captured value then tells you which form was matched. The function below matches ISBN-10 and ISBN-13 values within a string, with or without hyphens, optionally preceded by "ISBN:", "ISBN: " or "ISBN ".

    Finding ISBNs:

    function findIsbn($str)
    {
        // 13-digit form: 978/979 prefix, digits only. 10-character form: may end in X.
        $regex = '/\b(?:ISBN(?:: ?| ))?(97[89]\d{10}|\d{9}[\dx])\b/i';
    
        if (preg_match($regex, str_replace('-', '', $str), $matches)) {
            return (10 === strlen($matches[1]))
                ? 1   // ISBN-10
                : 2;  // ISBN-13
        }
        return false; // No valid ISBN found
    }
    
    var_dump(findIsbn('ISBN:0-306-40615-2'));     // return 1
    var_dump(findIsbn('0-306-40615-2'));          // return 1
    var_dump(findIsbn('ISBN:0306406152'));        // return 1
    var_dump(findIsbn('0306406152'));             // return 1
    var_dump(findIsbn('ISBN:979-1-090-63607-1')); // return 2
    var_dump(findIsbn('979-1-090-63607-1'));      // return 2
    var_dump(findIsbn('ISBN:9791090636071'));     // return 2
    var_dump(findIsbn('9791090636071'));          // return 2
    var_dump(findIsbn('ISBN:97811'));             // return false
    

    This searches the provided string for a possible ISBN-10 value (returns 1) or ISBN-13 value (returns 2). If neither is found, it returns false. Only the format is checked at this stage, not the check digit.
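
    If the calling code also needs the matched number itself (the comments in the question refer to the "isbn returned"), a similar pattern can hand it back. This is a minimal sketch, assuming a hypothetical helper named extractIsbn() that is not part of the original answer. Its pattern only allows X at the end of the 10-character form, and it returns the hyphen-free ISBN, or null when nothing matches:

    ```php
    <?php

    // Hypothetical helper (not from the original answer): returns the matched
    // ISBN without hyphens, or null if the string contains no candidate.
    function extractIsbn(string $str): ?string
    {
        // 13-digit form must start with 978/979 and be all digits;
        // 10-character form may end in X (case-insensitive via the /i flag).
        $regex = '/\b(?:ISBN(?:: ?| ))?(97[89]\d{10}|\d{9}[\dx])\b/i';

        if (preg_match($regex, str_replace('-', '', $str), $matches)) {
            return strtoupper($matches[1]);
        }

        return null;
    }

    $isbn = extractIsbn('ISBN:9780113411436');
    var_dump($isbn);         // string(13) "9780113411436"
    var_dump(strlen($isbn)); // int(13)
    ```

    The length of the returned string distinguishes the two forms, so the caller can still branch on 10 versus 13.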



    Validating ISBNs:

    For strict validation, the Wikipedia article on ISBN has PHP validation functions for ISBN-10 and ISBN-13. Below are those examples, tidied up and adapted to work with a slightly modified version of the function above.

    Change the return statement inside the if block to this:

        return (10 === strlen($matches[1]))
            ? isValidIsbn10($matches[1])  // ISBN-10
            : isValidIsbn13($matches[1]); // ISBN-13
    

    Validate ISBN-10:

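    function isValidIsbn10($isbn)
    {
        // Guard against short input so $isbn[$i] never reads past the end.
        if (10 !== strlen($isbn)) {
            return false;
        }
    
        $check = 0;
    
        for ($i = 0; $i < 10; $i++) {
            if (9 === $i && 'x' === strtolower($isbn[$i])) {
                // "X" stands for 10 and is only valid as the final check character.
                $check += 10;
            } elseif (ctype_digit($isbn[$i])) {
                // Weights run from 10 (first digit) down to 1 (check character).
                $check += (int)$isbn[$i] * (10 - $i);
            } else {
                return false;
            }
        }
    
        return (0 === ($check % 11)) ? 1 : false;
    }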
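
    To see why the weighted sum works, it can help to compute the check character from the first nine digits. The sketch below assumes a hypothetical helper named isbn10CheckCharacter(), which is not in the original answer, and uses the same weights, 10 down to 2, for those nine digits:

    ```php
    <?php

    // Hypothetical helper: derives the ISBN-10 check character from the
    // first nine digits, using the weights 10, 9, ..., 2.
    function isbn10CheckCharacter(string $firstNine): ?string
    {
        if (9 !== strlen($firstNine) || !ctype_digit($firstNine)) {
            return null; // not nine digits, nothing to compute
        }

        $sum = 0;
        for ($i = 0; $i < 9; $i++) {
            $sum += (int)$firstNine[$i] * (10 - $i);
        }

        // The check value is whatever brings the total to a multiple of 11.
        $check = (11 - $sum % 11) % 11;

        return (10 === $check) ? 'X' : (string)$check;
    }

    // 0-306-40615-2: 0*10 + 3*9 + 0*8 + 6*7 + 4*6 + 0*5 + 6*4 + 1*3 + 5*2 = 130
    var_dump(isbn10CheckCharacter('030640615')); // string(1) "2"
    ```

    A check value of 10 cannot be written as a single digit, which is why ISBN-10 uses X in the last position only.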
    

    Validate ISBN-13:

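    function isValidIsbn13($isbn)
    {
        // An ISBN-13 is exactly 13 digits; "X" is never valid here.
        if (13 !== strlen($isbn) || !ctype_digit($isbn)) {
            return false;
        }
    
        $check = 0;
    
        // Digits are weighted 1, 3, 1, 3, ... from left to right.
        for ($i = 0; $i < 13; $i++) {
            $check += (int)$isbn[$i] * ((0 === $i % 2) ? 1 : 3);
        }
    
        return (0 === ($check % 10)) ? 2 : false;
    }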
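
    The same idea applies to ISBN-13: the check digit is whatever brings the 1/3-weighted sum to a multiple of 10. A minimal sketch, assuming a hypothetical helper named isbn13CheckDigit() that is not part of the original answer:

    ```php
    <?php

    // Hypothetical helper: derives the ISBN-13 check digit from the first
    // twelve digits, using alternating weights 1 and 3.
    function isbn13CheckDigit(string $firstTwelve): ?int
    {
        if (12 !== strlen($firstTwelve) || !ctype_digit($firstTwelve)) {
            return null; // not twelve digits, nothing to compute
        }

        $sum = 0;
        for ($i = 0; $i < 12; $i++) {
            $weight = (0 === $i % 2) ? 1 : 3;
            $sum += (int)$firstTwelve[$i] * $weight;
        }

        return (10 - $sum % 10) % 10;
    }

    // 979-1-090-63607-1: the weighted sum of the first twelve digits is 129
    var_dump(isbn13CheckDigit('979109063607')); // int(1)
    ```

    Because the result is always between 0 and 9, an ISBN-13 never needs an X.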
    
