I want this script to read each line of a text file (every line is a URL), fetch that page, and check whether two given words both occur at any of the URLs of that particular website. I also want every URL (line) in the text file to be printed with a serial number. The code below finds the two words, but I am not sure whether they come from the same URL of that site. It also displays the number of times the given words occur instead of the serial number.
<?php
$mysearch = file("phpelist.txt");
for($index = 0; $index <count($mysearch); $index++)
{
    $mysearch[$index] = str_replace("\n", "", $mysearch[$index]);
    $data = file_get_contents("$mysearch[$index]");
    $searchTerm1 = 'about';
    if (stripos($data, $searchTerm1) !== false) {
        echo "$counter$.mysearch[$index]... FOUND WORD $searchTerm1<br>";
        $searchTerm2 = 'us';
        if (stripos($data, $searchTerm2) !== false) {
            echo "... FOUND WORD $searchTerm2<br>";
        }
    }
    else
    {
        echo "<br>";
        echo "$mysearch[$index]...not found<br>";
    }
}
?>
The output of the script is as follows:
'url1'...not found
'url2'...not found
'url3'...not found
'url4'...not found
'url5'...not found $.mysearch[5]... FOUND WORD about ... FOUND WORD us $.mysearch[6]... FOUND WORD about ... FOUND WORD us $.mysearch[7]... FOUND WORD about ... FOUND WORD us
'url6'...not found $.mysearch[9]... FOUND WORD about ... FOUND WORD us $.mysearch[10]... FOUND WORD about ... FOUND WORD us $.mysearch[11]... FOUND WORD about ... FOUND WORD us $.mysearch[12]... FOUND WORD about ... FOUND WORD us $.mysearch[13]... FOUND WORD about ... FOUND WORD us
A simple way to do this is with a function like this:
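<?php
// Returns an array mapping each word found in $str to the position
// of its first case-insensitive match.
function findWordsInString($str, $arr) {
    $found = array();
    foreach ($arr as $cur) {
        // Search once and reuse the result.
        $pos = stripos($str, $cur);
        if ($pos !== false) {
            $found[$cur] = $pos;
        }
    }
    return $found;
}
?>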
Then you can run array_keys on the returned value to get the words that were found. Each word is a key of the array, and the value stored under it is the position of its first match.
Let's take an example:
$str = "Hello, world. How are you?";
$arr = array("Hello", "world", "you", "Hi");
var_dump(findWordsInString($str, $arr));
This gives the following output:
array(3) {
  ["Hello"]=>
  int(0)
  ["world"]=>
  int(7)
  ["you"]=>
  int(22)
}
Only Hello, world, and you are found in this case, at positions 0, 7, and 22. Hi does not occur in the string, so it is left out of the result.
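To tie this back to the question, here is a minimal sketch of how the function could be used to print every URL with a serial number and report whether both words were found on the same page. The helper name reportLine and the exact wording of the messages are my own assumptions, not something from the question; findWordsInString is repeated so the sketch runs on its own, and the word list is assumed to contain no duplicates.

```php
<?php
// Same helper as in the answer, repeated so this sketch is self-contained.
function findWordsInString($str, $arr) {
    $found = array();
    foreach ($arr as $cur) {
        $pos = stripos($str, $cur);
        if ($pos !== false) {
            $found[$cur] = $pos;
        }
    }
    return $found;
}

// Hypothetical helper: builds one numbered report line for one URL.
// $content is the page body, or false if the page could not be fetched.
function reportLine($number, $url, $content, $words) {
    if ($content === false) {
        return "$number. $url ... could not be fetched";
    }
    // array_keys gives just the words that were found on this page.
    $found = array_keys(findWordsInString($content, $words));
    if (count($found) === count($words)) {
        return "$number. $url ... FOUND ALL: " . implode(', ', $found);
    }
    if (count($found) > 0) {
        return "$number. $url ... found only: " . implode(', ', $found);
    }
    return "$number. $url ... not found";
}

$words = array('about', 'us');

// FILE_IGNORE_NEW_LINES strips the trailing newline from each line,
// FILE_SKIP_EMPTY_LINES drops blank lines.
$urls = is_readable('phpelist.txt')
    ? file('phpelist.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
    : array();

foreach ($urls as $index => $url) {
    $url = trim($url);
    // @ suppresses the warning on failure; false is handled in reportLine.
    $content = @file_get_contents($url);
    // $index starts at 0, so add 1 for a human-friendly serial number.
    echo reportLine($index + 1, $url, $content, $words), "<br>\n";
}
```

Because every page is checked separately, a "FOUND ALL" line means both words occur in the content of that one URL. Keep in mind that stripos matches substrings, so a short word such as us will also match inside words like business; if that matters, a word-boundary check with preg_match may be a better fit.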