According to this post: http://code.activestate.com/lists/python-list/413540/, tokenize.generate_tokens should be used rather than tokenize.tokenize.

This works perfectly fine in Python 2.6, but it no longer works in Python 3:
>>> a = list(tokenize.generate_tokens(io.BytesIO("1\n".encode()).readline))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/tokenize.py", line 439, in _tokenize
    if line[pos] in '#\r\n':           # skip comments or blank lines
TypeError: 'in <string>' requires string as left operand, not int
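As far as I can tell, the failure is because generate_tokens in Python 3 expects readline to return str rather than bytes; at least, feeding it a text stream does work (a minimal sketch of that assumption):

import io
import tokenize

# generate_tokens in Python 3 appears to expect text lines (str),
# so a StringIO-based readline succeeds where BytesIO fails.
a = list(tokenize.generate_tokens(io.StringIO("1\n").readline))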
However, this also works in Python 3 (and returns the desired output):

a = list(tokenize.tokenize(io.BytesIO("1\n".encode()).readline))
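One subtlety I noticed (worth verifying on your Python version): in Python 3, tokenize.tokenize yields an initial ENCODING token that generate_tokens never produces, so the two token streams are not completely identical:

import io
import tokenize

# The first token reports the detected source encoding; skip it if you
# need output matching tokenize.generate_tokens.
first = next(tokenize.tokenize(io.BytesIO("1\n".encode()).readline))
assert first[0] == tokenize.ENCODING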
According to the documentation, it seems like tokenize.tokenize is the new way to use this module: http://docs.python.org/py3k/library/tokenize.html. tokenize.generate_tokens isn't even documented anymore.

But why is there still a generate_tokens function in this module if it's not documented? I haven't found any PEP regarding this.
I'm trying to maintain a code base for Python 2.5-3.2; should I call generate_tokens for Python 2 and tokenize for Python 3? Is there a better way?
generate_tokens really seems to be a strange thing in Python 3: it doesn't work as it did in Python 2. However, tokenize.tokenize behaves like the old Python 2 tokenize.generate_tokens. Therefore I wrote a little workaround:
import sys
import tokenize

# In Python 3, tokenize.tokenize plays the role that
# tokenize.generate_tokens played in Python 2.
if sys.hexversion >= 0x03000000:
    tokenize_func = tokenize.tokenize
else:
    tokenize_func = tokenize.generate_tokens
Now I just use tokenize_func, which works without problems.
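For example, the same call then works unchanged under both versions (a minimal sketch; tokenize_func is the name defined above):

import io

# "1\n".encode() yields bytes in Python 3 and str in Python 2, which is
# exactly what each version's tokenize_func expects from readline.
tokens = list(tokenize_func(io.BytesIO("1\n".encode()).readline))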