Tags: php, arrays, string, explode

Explode() String Breaking Array Into Too Many Elements


I am scraping an HTML string and then parsing it to get the two URL parameters inside the href. After scraping the element I need into $description, the full string ready for parsing is:

<a target="_blank" href="CoverSheet.aspx?ItemID=18833&amp;MeetingID=773">Description</a><br>

Below I use the explode() function to split the $description string on the = delimiter. I then try to split the result further on the double quote delimiter.

Problem I need to solve: I want to print only the digits of the MeetingID parameter that come before the double quote, i.e. 773.

<?php
echo "Description is: " . htmlentities($description); // prints the string referenced above
$htarray = explode('=', $description); // explode the $description string which includes the link. ... then, find out where the MeetingID is located
echo $htarray[4] .  "<br>"; // this will print the piece that contains the meeting ID: 773">Description</a><br>

$meetingID = $htarray[4];
echo "Meeting ID is " . substr($meetingID,0,3); 
?>

The above echo statement using substr works to print the meeting ID, 773.

However, I want to make this bulletproof: if the MeetingID parameter ever exceeds 999, I would need four characters instead of three. That's why I want to split on the double quote, so that it prints every digit before it.

Below I try to isolate everything before the double quote, but it isn't working correctly yet.

<?php
 $htarray = explode('"', $meetingID); // split the $meetingID string based on the " delimiter
 echo "Meeting ID0 is " . $meetingID[0] ; // this prints just the first number, 7
 echo "Meeting ID1 is " . $meetingID[1] ; // this prints just the second number, 7
 echo "Meeting ID2 is " . $meetingID[2] ; // this prints just the third number, 3

?>

Question: why does $meetingID[0] print a single digit rather than the THREE digits before the " delimiter? If the explode function works properly, shouldn't it split the string referenced above on the double quote into just two elements? The string is

"773">Description</a><br>"

So I can't understand why, after exploding on the double quote delimiter, each echo prints only one digit at a time.


Solution

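  • You're getting single characters because you're indexing the wrong variable. explode() returns its result as a new array, which you stored in $htarray; $meetingID is still the original string, and indexing a string with [n] returns the single character at that offset. Read the result from $htarray instead: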

    $htarray = explode('"', $meetingID);
    
    echo "Meeting ID0 is " . $meetingID[0] ; // this prints just the first number, 7
    echo "Meeting ID1 is " . $meetingID[1] ; // this prints just the second number, 7
    echo "Meeting ID2 is " . $meetingID[2] ; // this prints just the third number, 3
    
    echo "Meeting ID is " . $htarray[0] ; // this prints 773
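
    To see why the variable matters, here is a minimal sketch that runs explode() on the fragment from the question and compares indexing the string with indexing the returned array. The variable name $parts is hypothetical, chosen here only to keep the two apart.

    ```php
    <?php
    // The fragment left after splitting the link on '=' (taken from the question).
    $meetingID = '773">Description</a><br>';

    // explode() returns a new array; $meetingID itself is still a string.
    $parts = explode('"', $meetingID);

    var_dump(is_string($meetingID)); // bool(true)
    echo $meetingID[0], "\n";        // 7   (first character of the string)
    echo count($parts), "\n";        // 2   (text before and after the quote)
    echo $parts[0], "\n";            // 773 (everything before the first quote)
    ```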
    

    There's an easier way to do this though, using regular expressions:

    $description = '<a target="_blank" href="CoverSheet.aspx?ItemID=18833&amp;MeetingID=773">Description</a><br>';
    
    $meetingID = "Not found";
    if (preg_match('/MeetingID=([0-9]+)/', $description, $matches)) {
        $meetingID = $matches[1];
    }
    
    echo "Meeting ID is " . $meetingID;
    // this prints 773 or Not found if $description does not contain a (numeric) MeetingID value
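
    If both parameters from the href are needed, another option is to let PHP's URL helpers do the parsing instead of matching each name by hand. The following is a sketch rather than a drop-in replacement: it assumes the link always has the double-quoted href shown in the question, and the names $href, $query and $params are illustrative.

    ```php
    <?php
    // Sample link from the question.
    $description = '<a target="_blank" href="CoverSheet.aspx?ItemID=18833&amp;MeetingID=773">Description</a><br>';

    $params = [];
    // Capture the href value (assumes a double-quoted attribute).
    if (preg_match('/href="([^"]*)"/', $description, $matches)) {
        // Decode &amp; back to & so the query string can be parsed.
        $href  = html_entity_decode($matches[1]);
        // Keep only the part after the question mark.
        $query = parse_url($href, PHP_URL_QUERY);
        if (is_string($query)) {
            // Fill $params with name => value pairs.
            parse_str($query, $params);
        }
    }

    echo "Item ID is " . ($params['ItemID'] ?? 'Not found') . "\n";       // Item ID is 18833
    echo "Meeting ID is " . ($params['MeetingID'] ?? 'Not found') . "\n"; // Meeting ID is 773
    ```

    Because parse_str() fills an array keyed by parameter name, the length of each value no longer matters, so a MeetingID above 999 needs no change.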