Tags: php, full-text-search, search-engine

Full-text search in PHP without a database


I have a really small website written in PHP (approx. 5 pages plus blog entries). All pages are PHP files on the server; no database is used. So far I have managed to search inside my blog entries, because these are just plain text files with HTML markup (I strip the tags and then perform the search):

$filenames = array();
$search_string = "";
if (isset($_GET["query"])) {
    $search_string = $_GET["query"];
}
$search_result = "";
if (!$search_string) {
    echo 'No query entered<br />';
} else {
    // collect all .txt blog entries from content/
    if ($handle = opendir('content/')) {
        while (false !== ($file = readdir($handle))) {
            if (strrchr($file, '.') === ".txt") {
                $filenames[] = $file;
            }
        }
        closedir($handle);
    }
    foreach ($filenames as $value) {
        $files = "content/$value";
        $fp = strip_tags(file_get_contents($files));
        // stripos() returns 0 for a match at the very start, so compare against false
        if (stripos($fp, $search_string) !== false) {
            $search_result .= preg_replace('/<[^>]*>[^<]*<[^>]*>/', '', substr($fp, 0, 255)); // append a preview to search results
        }
    }
    // report once, after all files have been checked
    if ($search_result != "") {
        echo $search_result;
    } else {
        echo "No Results<br />";
    }
}

Of course, that works only because the files are plain text. But I also have pages that are real PHP files, and I want to search them too, without searching inside the PHP code itself. I figured out that I would need the rendered output that the browser gets from the web server. I thought about using file_get_contents() with HTTP requests to all my pages (OK, 'just' about 5 pages, but still)...

I've read here on SO that it's considered bad practice to do so and it feels like I'm taking the wrong approach.

Any ideas & suggestions would be highly appreciated.

Edit: An example of a regular page that I want to be able to search in

index.php

<?php ob_start(); require_once("./include/common.php"); ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title><?php echo $lang['WEBSITE_TITLE']; ?></title>
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<meta name="keywords" content="keyword, keyword, keyword" />
<link href="css/main.css" type="text/css" rel="stylesheet" />
</head>
<body>
<div id="page">
<!-- Header Area -->
<?php include("./include/header.php"); ?>
<?php include("./include/banner.php"); ?>
<div id="content">

<?php

    $page = '';
    if(isSet($_GET["page"])){
        $page=$_GET["page"];
    }
    switch($page){
        case 'category_1':
            include("./include/category_1.php");
            break;
        case 'about':
            include("./include/category_2.php");
            break;
        case 'contact':
            include("./include/contact.php");
            break;
        default:
            include("./include/home.php");  
    }
?>
<!-- /content --></div> 

<!-- /page --></div>
<br />
<br /><br /><br />

<!-- Footer Area -->
<?php include("./include/footer.php"); ob_end_flush(); ?>

</body>
</html> 

/include/category_1.php

<?php echo '<h2>'.$lang['NAVI_CAT_1'].'</h2>'; ?>

<div id="entry">
<br/>
<?php echo $lang['CAT_1_TEXT']; ?>
</div>

language file

<?php
$lang = array();
$lang['NAVI_CAT_1'] = 'Category 1';
$lang['CAT_1_TEXT'] = 'Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim.';

?>

Solution

  • Why not include into a buffer and then search the buffer's contents?

    ob_start();
    include ('index.php');
    $contents = ob_get_clean();
    //the $contents now includes whatever the php file outputs
    

    I actually use this method in production code for all kinds of things, but mainly previewing site-generated emails before users send them. The nice thing is, you can use this on all the files, not just the php files.
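
    To connect this back to the search in the question, here is a minimal sketch of how the buffered output could be searched. The names render_to_text() and search_pages(), and the idea of passing the page's query parameters in through $_GET, are assumptions made for illustration; they are not part of the original code.

```php
<?php
// Hypothetical helper: run a PHP page and return the visible text it prints.
// $get stands in for the query string the page would normally receive.
function render_to_text(string $path, array $get = []): string
{
    $savedGet = $_GET;
    $_GET = $get;

    ob_start();
    include $path;               // the page runs in this function's scope
    $html = ob_get_clean();

    $_GET = $savedGet;           // restore the real request parameters

    // Put a space before each tag so adjacent elements do not glue together,
    // then strip markup, decode entities and collapse whitespace.
    $text = strip_tags(str_replace('<', ' <', $html));
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Hypothetical search: $pages maps a label to [file, query parameters].
// Returns label => preview (first 255 bytes) for every page containing $query.
function search_pages(array $pages, string $query): array
{
    $hits = [];
    foreach ($pages as $label => [$path, $get]) {
        $text = render_to_text($path, $get);
        if (stripos($text, $query) !== false) {
            $hits[$label] = substr($text, 0, 255);
        }
    }
    return $hits;
}
```

    With the index.php from the question, a call might look like search_pages(['About' => ['index.php', ['page' => 'about']]], $search_string). One caveat: anything the page loads with require_once runs only once per request, so variables such as $lang would not be recreated inside the function on a second render. Rendering several pages in one request would probably need those files loaded with a plain include or require, or the rendered text cached to a file.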