I'm building an index that seems to work, like this:
doc = search.Document(doc_id=str(article.key()), fields=[
    search.TextField(name='title', value=article.title),
    search.TextField(name='text', value=article.text),
    search.TextField(name='city', value=article.city),
    search.TextField(name='region', value=article.region),
    search.NumberField(name='cityID', value=city_entity.key().id()),
    search.NumberField(name='regionID', value=region_entity.key().id()),
    search.NumberField(name='category', value=int(article.category)),
    search.NumberField(name='constant', value=1),
    search.NumberField(name='articleID', value=article.key().id()),
    search.TextField(name='name', value=article.name)
], language='en')
search.Index(name='article').add(doc)
When the app gets a new article, the code above adds it to the index, and that seems to work: the index is built and I can search the entities with the Search API. But I don't want articles older than 60 days in the index, so how can I arrange that? The entity has "added" and "modified" timestamps:
added = db.DateTimeProperty(verbose_name='added', auto_now_add=True)  # readonly
modified = db.DateTimeProperty(verbose_name='modified',
                               auto_now_add=True)
Should I have a cron job every 24 hours that rebuilds the entire index, or a cron job every 24 hours that removes the oldest entities from the index? At the moment I'm not adding the added and modified values to the index, although they could be useful there too, e.g. if I want to search for a certain timestamp. So now that I see the index is working, should I also add the added and modified values to it?
Indexes are built automatically and continuously and you have no control over this process. When an entity is changed (or created/removed) the index gets updated. There is no way to exclude certain entities from this.
If you do not need old documents at all then you should remove them.
But in both cases (serving or removing) you'll need to use multiple equality filters (on title, text, city, etc.) and one inequality filter (on created), so you'll need to configure a compound index.
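For the "remove the old ones" option applied to the Search API index from the question, a daily cron handler could look roughly like the sketch below. It is only a sketch under assumptions: it assumes each document also gets a search.DateField(name='added', value=article.added) field, which the code in the question does not add yet, and the helper names stale_query_string and purge_stale_documents are made up for illustration. Date fields are compared at day granularity, and depending on the SDK version the delete call may still be named remove (just as put used to be add).

```python
import datetime

try:
    # Only available inside the App Engine Python runtime.
    from google.appengine.api import search
except ImportError:
    search = None

MAX_AGE_DAYS = 60  # retention window mentioned in the question


def stale_query_string(now, max_age_days=MAX_AGE_DAYS, field='added'):
    """Build a query string matching documents older than the window.

    Assumes `field` is a DateField on every document (hypothetical here).
    """
    cutoff = (now - datetime.timedelta(days=max_age_days)).date()
    return '%s < %s' % (field, cutoff.isoformat())


def purge_stale_documents(index, now, batch_size=200):
    """Delete stale documents in batches and return how many were deleted.

    `index` is expected to be a search.Index, e.g. search.Index(name='article').
    """
    query = search.Query(
        query_string=stale_query_string(now),
        # ids_only avoids fetching field data we are about to throw away.
        options=search.QueryOptions(limit=batch_size, ids_only=True))
    deleted = 0
    while True:
        doc_ids = [doc.doc_id for doc in index.search(query)]
        if not doc_ids:
            return deleted
        index.delete(doc_ids)
        deleted += len(doc_ids)
```

A cron-triggered request handler would then call something like purge_stale_documents(search.Index(name='article'), datetime.datetime.utcnow()). Deleting incrementally like this avoids rebuilding the whole index every day; whether that is cheaper in practice depends on how many articles expire per day.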