I'm about to go blind staring at this problem, so I have to ask.
I have one clue: rearranging the code changes the error message.
I get this error:
min_analyzer = SpaceSeparatedTokenizer() | LowercaseFilter() | mitt_filter()
TypeError: __call__() takes exactly 2 arguments (1 given)
From this code:
import sqlite3
from whoosh.fields import Schema, TEXT, ID
from whoosh.index import create_in
from whoosh.analysis import SpaceSeparatedTokenizer
from whoosh.analysis import StopFilter
from whoosh.analysis import LowercaseFilter
mitt_filter = StopFilter(stoplist=frozenset(['and', 'is', 'it', 'an', 'as', 'at', 'have', 'in', 'yet', 'if', 'from', 'for', 'when', 'by', 'to', 'you', 'be', 'we', 'that', 'may', 'not', 'with', 'a', 'on', 'your', 'this', 'of', 'us', 'will', 'can', 'the', 'or', 'are', u'og', u'i', u'-', u'\\xa0', u'for', u'av', u'til', u'p\\xe5', u'the', u'and', u'as', u'med', u'er', u'en', u'of', u'to', u'har', u'Vi', u'kontakt', u'som', u'\\xe5', u'v\\xe5re', u'vi', u'in', u'oss', u'a', u'det', u'at', u'is', u'\\u2013', u'/', u'\\xbb', u'kan', u'by', u'skal', 'fra', u'ut', u'with', u'be', u'v\\xe5rt', u'mer', u'du', u'\\xa9', u'us', u'on', u'hopp', u'ogs\\xe5', u'Hopp']), minsize=2, maxsize=None, renumber=False)
min_analyzer = SpaceSeparatedTokenizer() | LowercaseFilter() | mitt_filter()
schema = Schema(Hoveddomene=ID, innhold=TEXT (stored=True, analyzer=min_analyzer(removestops=False, positions=True)), webadresse=ID)
ix = create_in('/Users/Sverdrup/virtualenv-1.6.1/whoosh/whoosh directory/', schema)
If I rearrange the code like so:
import sqlite3
from whoosh.fields import Schema, TEXT, ID
from whoosh.index import create_in
from whoosh.analysis import SpaceSeparatedTokenizer
from whoosh.analysis import StopFilter
from whoosh.analysis import LowercaseFilter
min_analyzer = SpaceSeparatedTokenizer() | LowercaseFilter() | StopFilter(stoplist=frozenset(['and', 'is', 'it', 'an', 'as', 'at', 'have', 'in', 'yet', 'if', 'from', 'for', 'when', 'by', 'to', 'you', 'be', 'we', 'that', 'may', 'not', 'with', 'a', 'on', 'your', 'this', 'of', 'us', 'will', 'can', 'the', 'or', 'are', u'og', u'i', u'-', u'\\xa0', u'for', u'av', u'til', u'p\\xe5', u'the', u'and', u'as', u'med', u'er', u'en', u'of', u'to', u'har', u'Vi', u'kontakt', u'som', u'\\xe5', u'v\\xe5re', u'vi', u'in', u'oss', u'a', u'det', u'at', u'is', u'\\u2013', u'/', u'\\xbb', u'kan', u'by', u'skal', 'fra', u'ut', u'with', u'be', u'v\\xe5rt', u'mer', u'du', u'\\xa9', u'us', u'on', u'hopp', u'ogs\\xe5', u'Hopp']), minsize=2, maxsize=None, renumber=False)
schema = Schema(Hoveddomene=ID, innhold=TEXT (stored=True, analyzer=min_analyzer(removestops=False, positions=True)), webadresse=ID)
ix = create_in('/Users/Sverdrup/virtualenv-1.6.1/whoosh/whoosh directory/', schema)
This clue leads me to believe that the StopFilter declaration is what's wrong, but I can't see what is wrong with it.
Any help would be greatly appreciated!
With the rearranged code, I get the following error:
schema = Schema(Hoveddomene=ID, innhold=TEXT (stored=True, analyzer=min_analyzer(removestops=False, positions=True)), webadresse=ID)
TypeError: __call__() takes at least 2 arguments (1 given)
You probably just want mitt_filter, without the parentheses. Writing mitt_filter() executes __call__ on the already instantiated StopFilter object, which is different from your second sample:
min_analyzer = SpaceSeparatedTokenizer() | LowercaseFilter() | mitt_filter
Your second sample is closer to correct. Its error is saying that you probably shouldn't be passing arguments to min_analyzer when sending it to the Schema constructor. In other words, analyzer=min_analyzer is probably more correct, and the removestops and positions arguments should be supplied elsewhere.
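To see why both errors have the same root cause, here is a minimal sketch that uses only the standard library. The classes SplitOnSpaces, Lowercase, DropStopwords and Pipeline are hypothetical stand-ins written for this illustration, not Whoosh's real classes. They only assume the same pattern: an object is configured once in its constructor, and its __call__ then requires the data to process as a positional argument.

```python
class Stage(object):
    """Hypothetical base class: lets stages be chained with the | operator."""

    def __or__(self, other):
        return Pipeline([self, other])


class SplitOnSpaces(Stage):
    def __call__(self, text):
        return text.split(" ")


class Lowercase(Stage):
    def __call__(self, tokens):
        return [t.lower() for t in tokens]


class DropStopwords(Stage):
    def __init__(self, stoplist):
        # Configuration happens here, once, when the object is created.
        self.stoplist = frozenset(stoplist)

    def __call__(self, tokens):
        # Calling the instance processes tokens, so tokens is required.
        return [t for t in tokens if t not in self.stoplist]


class Pipeline(Stage):
    def __init__(self, stages):
        self.stages = list(stages)

    def __or__(self, other):
        return Pipeline(self.stages + [other])

    def __call__(self, value, **kwargs):
        # Keyword options alone are not enough: value is still required.
        for stage in self.stages:
            value = stage(value)
        return value


def raises_type_error(func):
    """Return True if calling func() raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


my_filter = DropStopwords(["og", "the"])

# Same mistake as mitt_filter(): calling the instance with no tokens.
bad_filter_call = raises_type_error(lambda: my_filter())

# Pass the instance itself into the chain, without parentheses.
pipeline = SplitOnSpaces() | Lowercase() | my_filter

# Same mistake as min_analyzer(removestops=False, positions=True):
# only keyword arguments, no text to analyze.
bad_pipeline_call = raises_type_error(lambda: pipeline(removestops=False))

# The pipeline is only called when there is actual text to process.
tokens = pipeline("The fjord og the mountain")
```

In this sketch both bad calls fail for the same reason: the object is already built, so the parentheses trigger __call__ rather than configuration. The equivalent fix in the question would be to pass mitt_filter and min_analyzer as objects and let the library call them when it has text to analyze.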