javascript node.js full-text-search flexsearch

Flexsearch export and import document index issue


I'm trying to build an index using FlexSearch and Node.js and store it on local disk, as it takes quite a bit of time to build. The export seems to work, but when I try to import the files again into a new Document index I get this error:

TypeError: Cannot read property 'import' of undefined
at Q.t.import (/opt/hermetic/hermetic/server/node_modules/flexsearch/dist/flexsearch.bundle.js:33:330)
at Object.retrieveIndex (/opt/hermetic/hermetic/server/build/search.js:86:25)
at Object.search (/opt/hermetic/hermetic/server/build/search.js:96:32)
at init (/opt/hermetic/hermetic/server/build/server.js:270:27)

I'm running Node.js version 14 and FlexSearch version 0.7.21. Below is the code I am using:

import fs from 'fs';
import Flexsearch from 'flexsearch';

const createIndex = async () => { 
    const { Document } = Flexsearch;
    const index = new Document({
      document: {
        id: 'id',
        tag: 'tag',
        store: true,
        index: [
          'record:a',
          'record:b',
          'tag',
        ],
      },
    });

    index.add({ id: 0, tag: 'category1', record: { a: '1 aaa', b: '0 bbb' } });
    index.add({ id: 1, tag: 'category1', record: { a: '1 aaa', b: '1 bbb' } });
    index.add({ id: 2, tag: 'category2', record: { a: '2 aaa', b: '2 bbb' } });
    index.add({ id: 3, tag: 'category2', record: { a: '2 aaa', b: '3 bbb' } });
    console.log('search', index.search('aaa'));

    await index.export((key, data) => fs.writeFile(`./search_index/${key}`, data, err => !!err && console.log(err)));
    return true;
}

const retrieveIndex = async () => { 
    const { Document } = Flexsearch;
    const index = new Document({
      document: {
        id: 'id',
        tag: 'tag',
        store: true,
        index: [
          'record:a',
          'record:b',
          'tag',
        ],
      },
    });

    const keys = fs
      .readdirSync('./search_index', { withFileTypes: true }, err => !!err && console.log(err))
      .filter(item => !item.isDirectory())
      .map(item => item.name);

    for (let i = 0, key; i < keys.length; i += 1) {
      key = keys[i];
      const data = fs.readFileSync(`./search_index/${key}`, 'utf8');
      index.import(key, data);
    }
    return index;
}

await createIndex();
const index = await retrieveIndex();

console.log('cached search', index.search('aaa'));
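
A note on the `await index.export(...)` line in `createIndex` above: `await` only waits for a promise. If the awaited call returns nothing, and the handler passed to it starts callback-style work such as `fs.writeFile`, execution continues before that work has finished. The sketch below shows the resulting order with hypothetical stand-ins (`fakeExport` and `fakeWriteFile` are illustrative names, not FlexSearch or Node APIs). It assumes the exporter does not return a promise tied to the writes, which may or may not match a given FlexSearch version.

```javascript
// Hypothetical stand-ins: they only mimic the *shape* of a callback-style
// exporter and an asynchronous write. They are not FlexSearch or Node APIs.

// Finishes "writing" on a later turn of the event loop, like fs.writeFile.
function fakeWriteFile(log, name, done) {
  setTimeout(() => {
    log.push(`written ${name}`);
    done(null);
  }, 0);
}

// Calls the handler once per key and returns undefined (no promise).
function fakeExport(handler) {
  handler('reg', '{}');
  handler('content.map', '[]');
}

async function demo() {
  const log = [];

  // Awaiting a non-promise value resumes before any timer callback runs.
  await fakeExport((key) => fakeWriteFile(log, key, () => {}));
  log.push('export awaited');

  // Give the pending "writes" time to finish before returning the log.
  await new Promise((resolve) => setTimeout(resolve, 10));
  return log;
}

demo().then((entries) => console.log(entries));
// [ 'export awaited', 'written reg', 'written content.map' ]
```

In this sketch the code after `await` runs while both "writes" are still pending, so anything that reads the files straight away could see a missing or incomplete set.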

Solution

  • I was trying to find a way to export the index properly too, originally by putting everything into one file. While that worked, I didn't really like the solution.

    That brought me to your SO question. I checked your code and managed to find out why you get that error.

    Basically, the export is a sync operation, while you also (randomly) use async. To avoid the issue, you need to remove all async code and use only the synchronous Node.js fs operations (fs.writeFileSync, fs.readdirSync, fs.readFileSync). For my solution, I also created the Document store only once and then just fill it via retrieveIndex(), rather than creating a new Document() per function.

    I also added a .json extension to guarantee that Node's fs reads the file properly, and for sanity purposes: after all, it's JSON that is stored.

    So thanks for giving me the idea to store each key as a file @Jamie Nicholls 🤝

    import fs from 'fs';
    import { Document } from 'flexsearch'
    
    const searchIndexPath = '/Users/user/Documents/linked/search-index/'
    
    // Create the Document index once; createIndex() and retrieveIndex() share it.
    const index = new Document({
      document: {
        id: 'date',
        index: ['content']
      },
      tokenize: 'forward'
    })
    
    const createIndex = () => { 
      
      index.add({ date: "2021-11-01", content: 'asdf asdf asd asd asd asd' })
      index.add({ date: "2021-11-02", content: 'fobar 334kkk' })
      index.add({ date: "2021-11-04", content: 'fobar 234 sffgfd' })
    
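      // Write each exported key to its own .json file, synchronously.
      // data may be undefined for some keys, so fall back to an empty string.
      index.export(
        (key, data) => fs.writeFileSync(`${searchIndexPath}${key}.json`, data !== undefined ? data : '')
      )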
    }
    
    createIndex()
    
    const retrieveIndex = () => { 
    
      // Recover the keys from the file names by stripping the 5-character '.json' suffix.
      const keys = fs
        .readdirSync(searchIndexPath, { withFileTypes: true })
        .filter(item => !item.isDirectory())
        .map(item => item.name.slice(0, -5))
    
      for (let i = 0, key; i < keys.length; i += 1) {
        key = keys[i]
        const data = fs.readFileSync(`${searchIndexPath}${key}.json`, 'utf8')
        index.import(key, data ?? null)
      }
    }
    
    
    const searchStuff = () => {
      retrieveIndex()  
      console.log('cached search', index.search('fo'))
    }
    
    searchStuff()
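
The file naming used above (one `<key>.json` file per exported key) can be factored into a few small helpers. This is only a sketch: `keyToFile`, `writeKey`, `readKey` and `listKeys` are hypothetical names, and it uses only Node's built-in `fs`, `os` and `path` modules. Compared with `slice(0, -5)`, it skips any file that does not end in `.json` (for example a stray `.DS_Store`), which would otherwise be turned into a bogus key.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const EXT = '.json';

// Hypothetical helper: file path for one exported key.
const keyToFile = (dir, key) => path.join(dir, `${key}${EXT}`);

// Hypothetical helper: write one key, using '' when no data is passed.
function writeKey(dir, key, data) {
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(keyToFile(dir, key), data !== undefined ? data : '');
}

// Hypothetical helper: read one key back as a UTF-8 string.
const readKey = (dir, key) => fs.readFileSync(keyToFile(dir, key), 'utf8');

// Hypothetical helper: keys present in dir, ignoring sub-directories and
// any file that does not end in .json.
function listKeys(dir) {
  return fs
    .readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isFile() && entry.name.endsWith(EXT))
    .map((entry) => path.basename(entry.name, EXT))
    .sort();
}

// Small demonstration in a throwaway temp directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'search-index-'));
writeKey(demoDir, 'reg', '{"a":1}');
writeKey(demoDir, 'content.map', undefined);
fs.writeFileSync(path.join(demoDir, '.DS_Store'), 'not an index file');
console.log(listKeys(demoDir)); // [ 'content.map', 'reg' ]
```

With helpers like these, the export handler in the answer would become `(key, data) => writeKey(dir, key, data)`, and the import loop would call `index.import(key, readKey(dir, key))` for each entry of `listKeys(dir)`.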