mongodb, mongodb-query, monk

MongoDB create index of all text values of a key inside arrays


I am trying to create a MongoDB text index on the values of the following keys: CVE_data_meta.ID, vendor_name and product_name. The vendor_name and product_name values sit inside arrays.

My code is as follows:

col.createIndex({
    'cve.affects.vendor.vendor_data.vendor_name': 'text',
    'cve.affects.vendor.vendor_data.product.product_data.product_name': 'text',
    'cve.CVE_data_meta.ID': 'text'
  }).then(() => {
    db.close();
  });

The issue I am running into is the error 'namespace name generated from index name "vulndbapi.nvd.$cve.affects.vendor.vendor_data.vendor_name_text_cve.affects.vendor.vendor_data.product.product_data.product_name_text_cve.CVE_data_meta.ID_text" is too long (127 byte max)'. Also, if I index only the CVE ID, the search query comes up empty.

A sample document is shown below. The actual dataset is much bigger.

{
    "cve": {
        "data_type": "CVE",
        "data_format": "MITRE",
        "data_version": "4.0",
        "CVE_data_meta": {
            "ID": "CVE-2012-0001",
            "ASSIGNER": "[email protected]"
        },
        "affects": {
            "vendor": {
                "vendor_data": [{
                    "vendor_name": "microsoft",
                    "product": {
                        "product_data": [{
                                "product_name": "windows_7",
                                "version": {
                                    "version_data": [{
                                        "version_value": "-",
                                        "version_affected": "="
                                    }]
                                }
                            },
                            {
                                "product_name": "windows_server_2003",
                                "version": {
                                    "version_data": [{
                                        "version_value": "*",
                                        "version_affected": "="
                                    }]
                                }
                            }
                        ]
                    }
                }]
            }
        }
    }
}

My query code is:

col.find({
  $text: {
    $search: 'CVE-2012-0001'
    // $search: 'firefox'
  }
}).then((docs) => {
  // docs holds the matching documents
  console.log(docs);
  db.close();
});

How can I create an index when the values are inside arrays, so that every item of each array is indexed? The final collection will exceed 50K documents.


Solution

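  • The default name for an index is built by concatenating each indexed field name with its index type (here, text), joined by underscores. With three long dotted paths, the resulting namespace (database name, collection name and index name together) exceeds the 127-byte limit reported in the error. The solution is to provide your own, shorter name for the index: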

    col.createIndex({
        'cve.affects.vendor.vendor_data.vendor_name': 'text',
        'cve.affects.vendor.vendor_data.product.product_data.product_name': 'text',
        'cve.CVE_data_meta.ID': 'text'
      }, {name: 'vendor_product_text_index'})
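
To see why the generated name overflows while the custom one does not, here is a minimal sketch that rebuilds the namespace string from the error message and compares its byte length with the 127-byte limit. The helper names (defaultIndexName, indexNamespace, fitsNamespaceLimit) are hypothetical and are not part of MongoDB or Monk. The naming rule is inferred from the error text in the question, and the database and collection names (vulndbapi, nvd) are taken from that same message.

```javascript
// Sketch only: these helpers are hypothetical, not MongoDB or Monk APIs.
const NAMESPACE_LIMIT_BYTES = 127; // limit quoted in the error message

// Default index name: "<field>_<type>" for every key, joined with "_".
function defaultIndexName(spec) {
  return Object.entries(spec)
    .map(([field, type]) => field + '_' + type)
    .join('_');
}

// Namespace as it appears in the error: <db>.<collection>.$<index name>
function indexNamespace(dbName, collName, indexName) {
  return dbName + '.' + collName + '.$' + indexName;
}

// True when the namespace fits within the byte limit.
function fitsNamespaceLimit(ns) {
  return Buffer.byteLength(ns, 'utf8') <= NAMESPACE_LIMIT_BYTES;
}

const spec = {
  'cve.affects.vendor.vendor_data.vendor_name': 'text',
  'cve.affects.vendor.vendor_data.product.product_data.product_name': 'text',
  'cve.CVE_data_meta.ID': 'text'
};

// Namespace built from the auto-generated name (the one in the error).
const autoNs = indexNamespace('vulndbapi', 'nvd', defaultIndexName(spec));
// Namespace built from the explicit name passed to createIndex.
const namedNs = indexNamespace('vulndbapi', 'nvd', 'vendor_product_text_index');

console.log(autoNs, fitsNamespaceLimit(autoNs));
console.log(namedNs, fitsNamespaceLimit(namedNs));
```

The first namespace matches the string in the error message and fails the check. The second, which uses the explicit name, fits comfortably, which is why passing the name option resolves the error.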