Tags: java, design-patterns, search-engine, meta-search

Meta Search Engine Architecture


I don't think the original question was clear enough, so here is a more direct version:

What architectures are commonly used to build a meta-search engine, and are there any libraries available for building one?

I'm looking to build an "enterprise" type of search engine where the indexed data could come from proprietary search engines (like Autonomy or a Google Box) or public ones (like Google Web or Yahoo Web).


Solution

  • If you look at Garlic (pdf), you'll notice that its architecture is generic enough to be adapted to a meta-search engine.

    UPDATE:

    A rough architectural sketch looks something like this:

       +---------------------------+
       |                           |
       |    Meta-Search Engine     |         +---------------+
       |                           |         |               |
       |   +-------------------+   |---------| Configuration |
       |   | Query Processor   |   |         |               |
       |   |                   |   |         +---------------+
       |   +-------------------+   |
       +-------------+-------------+
                     |
          +----------+---------------+
       +--+----------+-------------+ |
       |             |             | |
       |     +-------+-------+     | |
       |     |    Wrapper    |     | |
       |     |               |     | |
       |     +-------+-------+     | |
       |             |             | |
       |             |             | |
       |     +-------+--------+    | |
       |     |                |    | |
       |     | Search Engine  |    | |
       |     |                |    +-+
       |     +----------------+    |
       +---------------------------+
    

    The parts depicted are:

    • Meta-Search Engine - the engine itself; it orchestrates the whole process.
    • Query Processor - part of the engine; it resolves capabilities, sends requests to the specific search engines (through the wrappers), and aggregates their results.
    • Wrapper - bridges the meta-search engine API to a specific search engine. Each wrapper works with one specific search engine: it exposes that engine's capabilities to the meta-search engine, accepts search requests, and returns the responses.
    • Search Engine - the external search engines to query; they are exposed to the meta-search engine through the wrappers.
    • Configuration - data that configures the meta-search engine, e.g., which wrappers to use and where to find more wrappers. It can also configure the wrappers themselves.
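
The parts listed above could be sketched in Java roughly as follows. This is only a minimal illustration, not Garlic's actual API: every type and method name here (SearchEngineWrapper, QueryProcessor, Configuration, StubWrapper, and so on) is hypothetical, capabilities are modeled as plain strings, and merging by raw score assumes that scores from different engines are comparable, which in practice usually requires some normalization.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

/** One hit from an underlying engine; the fields are illustrative assumptions. */
final class SearchResult {
    final String source, title, url;
    final double score;
    SearchResult(String source, String title, String url, double score) {
        this.source = source; this.title = title; this.url = url; this.score = score;
    }
}

/** Wrapper: bridges the meta-search API to one specific search engine. */
interface SearchEngineWrapper {
    String name();
    Set<String> capabilities();   // e.g. "keyword", "phrase" (hypothetical labels)
    List<SearchResult> search(String query, int maxResults);
}

/** Configuration: which wrappers are enabled and how many results to return. */
final class Configuration {
    final Set<String> enabledWrappers;
    final int maxResults;
    Configuration(Set<String> enabledWrappers, int maxResults) {
        this.enabledWrappers = enabledWrappers; this.maxResults = maxResults;
    }
}

/** Query processor: resolves capabilities, sends requests, aggregates results. */
final class QueryProcessor {
    private final List<SearchEngineWrapper> wrappers;
    QueryProcessor(List<SearchEngineWrapper> wrappers) {
        this.wrappers = new ArrayList<>(wrappers);
    }
    List<SearchResult> process(String query, String capability, int maxResults) {
        List<SearchResult> merged = new ArrayList<>();
        for (SearchEngineWrapper w : wrappers) {
            // Only ask engines that advertise the required capability.
            if (w.capabilities().contains(capability)) {
                merged.addAll(w.search(query, maxResults));
            }
        }
        // Naive aggregation: highest raw score first (assumes comparable scores).
        merged.sort(Comparator.comparingDouble((SearchResult r) -> r.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(maxResults, merged.size())));
    }
}

/** Meta-search engine: wires configuration, wrappers and the query processor. */
final class MetaSearchEngine {
    private final Configuration config;
    private final QueryProcessor processor;
    MetaSearchEngine(Configuration config, List<SearchEngineWrapper> available) {
        List<SearchEngineWrapper> enabled = new ArrayList<>();
        for (SearchEngineWrapper w : available) {
            if (config.enabledWrappers.contains(w.name())) enabled.add(w);
        }
        this.config = config;
        this.processor = new QueryProcessor(enabled);
    }
    List<SearchResult> search(String query, String capability) {
        return processor.process(query, capability, config.maxResults);
    }
}

/** In-memory stand-in for a real engine; it ignores the query text. */
final class StubWrapper implements SearchEngineWrapper {
    private final String name;
    private final Set<String> capabilities;
    private final List<SearchResult> canned;
    StubWrapper(String name, Set<String> capabilities, List<SearchResult> canned) {
        this.name = name; this.capabilities = capabilities; this.canned = canned;
    }
    public String name() { return name; }
    public Set<String> capabilities() { return capabilities; }
    public List<SearchResult> search(String query, int maxResults) {
        return canned.subList(0, Math.min(maxResults, canned.size()));
    }
}
```

A real wrapper would replace StubWrapper with a class that calls the proprietary or public engine and maps its response into SearchResult objects. Because the engine only depends on the SearchEngineWrapper interface, adding another search engine means adding one wrapper and enabling its name in the configuration.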