
Using PhyML within BioPython Phylo


I recently tried to build maximum-likelihood trees from an alignment file, but ran into problems when following the official BioPython documentation:

from Bio.Phylo.Applications import PhymlCommandline
cmd = PhymlCommandline(input='data/random.phy')
out_log, err_log = cmd()

I replaced the path with the path to my own file, which is in relaxed PHYLIP format. Reading that file with AlignIO worked, so the path itself should not be the problem.

Do I need to install PhyML separately and somehow link it to the BioPython installation?

Unfortunately, the documentation says very little about this.

I'm using Jupyter Notebook via Anaconda on Windows.

The error is:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-4-437a0bc7f21e> in <module>()
  4 cmd = PhymlCommandline(input='C/Users/Nicolas/Documents/Uni/ProjektHerlyn/random.phy', cmd='phyml')
  5 #out_log, err_log = cmd()
----> 6 cmd()

~\Anaconda3\lib\site-packages\Bio\Application\__init__.py in __call__(self, stdin, stdout, stderr, cwd, env)
505                                          universal_newlines=True,
506                                          cwd=cwd, env=env,
--> 507                                          shell=use_shell)
508         # Use .communicate as can get deadlocks with .wait(), see Bug 2804
509         stdout_str, stderr_str = child_process.communicate(stdin)

~\Anaconda3\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors)
707                                 c2pread, c2pwrite,
708                                 errread, errwrite,
--> 709                                 restore_signals, start_new_session)
710         except:
711             # Cleanup if the child failed starting.

~\Anaconda3\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
995                                          env,
996                                          os.fspath(cwd) if cwd is not None else None,
--> 997                                          startupinfo)
998             finally:
999                 # Child is launched. Close the parent's copy of those pipe

FileNotFoundError: [WinError 2] Das System kann die angegebene Datei nicht finden

In English: [WinError 2] The system cannot find the file specified


Solution

  • Short answer

    • Download PhyML from here: http://www.atgc-montpellier.fr/phyml/download.php
    • Extract the appropriate executable for your OS to your computer
    • Add cmd='/your/path/PhyML-3.1_win32.exe' to your PhymlCommandline call, e.g.

      cmd = PhymlCommandline(cmd='c:/home/users/nicolas/PhyML-3.1_win32.exe', input='data/random.phy')
      
    • Alternatively, you could add the directory containing the executable to your PATH (see below)
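
    Before building the command it can help to confirm that Python can actually see the executable, since the FileNotFoundError in the question refers to the program being launched rather than to the alignment file. The following is only a minimal sketch: `find_phyml` is a hypothetical helper (not part of BioPython), and the path and candidate names are the examples used in this answer, so adjust them to your own installation.

```python
import os
import shutil


def find_phyml(explicit_path=None, names=("phyml", "PhyML-3.1_win32")):
    """Return a path to a PhyML executable, or None if none is found.

    Hypothetical helper for illustration; the candidate names are assumptions.
    """
    # 1. An explicitly given path wins, provided the file really exists.
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path
    # 2. Otherwise look up each candidate name on PATH.
    for name in names:
        found = shutil.which(name)
        if found:
            return found
    return None


phyml_exe = find_phyml(explicit_path="c:/home/users/nicolas/PhyML-3.1_win32.exe")
if phyml_exe is None:
    print("PhyML executable not found; download it or correct the path.")
else:
    print("Using PhyML at:", phyml_exe)
    # With BioPython installed, the wrapper could then be built like this:
    # from Bio.Phylo.Applications import PhymlCommandline
    # cmd = PhymlCommandline(cmd=phyml_exe, input='data/random.phy')
    # out_log, err_log = cmd()
```

    If the helper returns None, the call to cmd() would most likely fail with the same [WinError 2] as in the question.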

  • Longer answer

    • PhymlCommandline inherits from the AbstractCommandline class, which is just a wrapper around the external executable; it does not ship PhyML itself
    • From the documentation:

    Note that by default we assume the underlying tool is installed on the system $PATH environment variable. This is normal under Linux/Unix, but may need to be done manually under Windows. Alternatively, you can specify the full path to the binary as the first argument (cmd):