Python regex to match whole words (minus contractions and possessives)

I am trying to use regex in Python to capture whole words from text. This is simple enough, but I also want to remove contractions and possessives indicated by apostrophes.

Currently I have (?iu)(?<!')(?!n')[\w]+

Testing on the following text

One tree or many trees? My tree's green. I didn't figure this out yet.

Gives these matches

One tree or many trees My tree green I didn figure this out yet

In this example the negative lookbehind prevents the "s" and "t" after an apostrophe from being matched as whole words. But how do I write the negative lookahead (?!n') so that the matches include "did" instead of "didn"?

(My use case here is a simple Python spell checker: each word gets validated as being spelt correctly or not. I've ended up using the autocorrect module, as pyenchant, aspell-python and others didn't work when installed via pip.)

Solution

I would use this regex:

(?<![\w'])\w+?(?=\b|n't)

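The lazy \w+? matches as few word characters as possible and stops at the first position that is followed by either a word boundary or the text n't. Your (?!n') lookahead is only tested at the position where the match starts, so it cannot stop a greedy [\w]+ from running on through the "n" of "didn't".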

Result:

>>> import re
>>> re.findall(r"(?<![\w'])\w+?(?=\b|n't)", "One tree or many trees? My tree's green. I didn't figure this out yet.")
['One', 'tree', 'or', 'many', 'trees', 'My', 'tree', 'green', 'I', 'did', 'figure', 'this', 'out', 'yet']
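
For the spell-checker use case, it may help to wrap the pattern in a small helper. The following is only a sketch: the names WORD_RE and extract_words are made up for illustration and are not part of any library. It also shows one limitation worth knowing about: the pattern strips n't mechanically, so irregular contractions leave a stem that is not a real word.

```python
import re

# Same pattern as above, compiled once so it can be reused.
# WORD_RE and extract_words are illustrative names, not a library API.
WORD_RE = re.compile(r"(?<![\w'])\w+?(?=\b|n't)")


def extract_words(text):
    """Return whole words, skipping anything after an apostrophe and the n't of contractions."""
    return WORD_RE.findall(text)


print(extract_words("I didn't figure this out yet."))
# ['I', 'did', 'figure', 'this', 'out', 'yet']

# Irregular contractions are split mechanically, so the stem may not be a word.
print(extract_words("I can't and won't"))
# ['I', 'ca', 'and', 'wo']
```

If stems like "ca" and "wo" would trip up the spell checker, one option is to special-case those contractions before running the regex.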

Breakdown:

(?<!         # negative lookbehind: assert the text is not preceded by...
    [\w']    # ... a word character or apostrophe
)
\w+?         # match word characters, as few as necessary, until...
(?=
    \b       # ... a word boundary...
|            # ... or ...
    n't      # ... the text "n't"
)
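
The breakdown above can also be kept in the code itself by compiling it with re.VERBOSE, which ignores whitespace outside character classes and treats "#" as the start of a comment. This is a sketch of that idea; the name WORD_RE_VERBOSE is just for illustration.

```python
import re

# The commented breakdown, compiled as a verbose pattern.
# WORD_RE_VERBOSE is an illustrative name.
WORD_RE_VERBOSE = re.compile(
    r"""
    (?<!         # negative lookbehind: assert the text is not preceded by...
        [\w']    # ... a word character or apostrophe
    )
    \w+?         # match word characters, as few as necessary, until...
    (?=
        \b       # ... a word boundary...
    |            # ... or ...
        n't      # ... the text n't
    )
    """,
    re.VERBOSE,
)

text = "One tree or many trees? My tree's green. I didn't figure this out yet."
print(WORD_RE_VERBOSE.findall(text))
```

This should print the same list as the one-line pattern, since only the layout of the pattern changes, not its meaning.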