Tags: python, nlp, anaconda, nltk, tokenize

Mosestokenizer issue: [WinError 2] The system cannot find the file specified


I can't figure out why this error appears when I run the following code on Windows:

from mosestokenizer import MosesDetokenizer

with MosesDetokenizer('en') as detokenize:
    print(detokenize(["hi", 'my', 'name', 'is', 'artem']))

This is what I get:

stdbuf was not found; communication with perl may hang due to stdio buffering.
Traceback (most recent call last):
  File "C:\Users\ArtemLaptiev\Documents\GitHub\temp\foo.py", line 3, in <module>
    with MosesDetokenizer('en') as detokenize:
  File "C:\ProgramFiles\Anaconda\lib\site-packages\mosestokenizer\detokenizer.py", line 47, in __init__
    super().__init__(argv)
  File "C:\ProgramFiles\Anaconda\lib\site-packages\toolwrapper.py", line 52, in __init__
    self.start()
  File "C:\ProgramFiles\Anaconda\lib\site-packages\toolwrapper.py", line 92, in start
    cwd=self.cwd
  File "C:\ProgramFiles\Anaconda\lib\subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "C:\ProgramFiles\Anaconda\lib\subprocess.py", line 997, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

Thank you for your help!


Solution

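  • Use sacremoses instead of mosestokenizer. The warning and traceback suggest that mosestokenizer launches perl in a subprocess, which Windows cannot find here; sacremoses is a pure-Python port and needs no external executable.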

    pip install -U sacremoses
    

    and

    from sacremoses import MosesDetokenizer

    # In sacremoses the detokenizer is a plain object, not a context manager
    detokenizer = MosesDetokenizer(lang='en')
    print(detokenizer.detokenize(["hi", 'my', 'name', 'is', 'artem']))
    

    For complete details, see the sacremoses project page.
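
If you want to confirm the cause before switching libraries, the sketch below checks whether the external tools named in the warning are reachable on PATH. It assumes, based only on the warning text and the subprocess frames in the traceback, that a missing perl executable is what triggers [WinError 2]; the question does not confirm this. The helper name `missing_tools` is hypothetical, and only the standard library is used.

```python
import shutil


def missing_tools(names=("perl", "stdbuf")):
    """Return the names from `names` that cannot be found on PATH.

    The default names are taken from the warning message in the question;
    treating them as the cause of the error is an assumption.
    """
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("perl and stdbuf are both on PATH")
```

If perl is reported as missing, installing a Perl distribution and adding it to PATH might also let mosestokenizer start, but switching to sacremoses avoids the dependency altogether.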