python, string, glob

What does `if file.find('freq-') != -1` mean?


I'm a chemistry student and want to write a script to extract some data (like coupling constants and interproton distances) from Gaussian output files.

I found a script which extracts chemical shifts from Gaussian output files. However, I don't understand what if file.find('freq-') !=-1 means in the script.

Here's part of the script (the script does other things as well, so I've only shown the bit relevant to my question):

def read_gaussian_freq_outfiles(list_of_files):
    list_of_freq_outfiles = []
    for file in list_of_files:
        if file.find('freq-') !=-1:
            list_of_freq_outfiles.append([file,int(get_conf_number(file)),open(file,"r").readlines()])

    return list_of_freq_outfiles

def read_gaussian_outputfiles():
    list_of_files = []
    for file in glob.glob('*.out'):
        list_of_files.append(file)
    return list_of_files

I think that in read_gaussian_outputfiles() we create a list and simply add every file with the extension '.out' to it.

read_gaussian_freq_outfiles(list_of_files) probably lists the files which have "freq-" in the file name. But what does file.find('freq-')!=-1 mean?

Does it mean "if whatever we find in the file name is not equal to -1", or something else?

Some additional information: the format of the Gaussian output filename is xxxx-opt_freq-conf-yyyy.out, where xxxx is the name of your molecule and yyyy is a number.


Solution

  • s.find(foo) returns the index at which foo first occurs in s. When it fails to find foo in s, it returns -1. Therefore, when s.find(foo) does not return -1, we know foo occurs somewhere in s.

    read_gaussian_freq_outfiles looks for the substring "freq-" in each of the file names in list_of_files. If it finds this substring in a file name, it appends a three-element list to list_of_freq_outfiles: the file name, a "conf number" (not sure what this is; it comes from get_conf_number, which isn't shown), and the contents of the file as a list of lines.
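To see the return values concretely, here is a minimal sketch using a made-up file name that follows the xxxx-opt_freq-conf-yyyy.out pattern from the question ("ethanol" and "12" are placeholders I picked, not anything from the original script):

```python
# Hypothetical file name in the xxxx-opt_freq-conf-yyyy.out format
name = "ethanol-opt_freq-conf-12.out"

found = name.find("freq-")        # index where "freq-" starts
missing = name.find("nmr-")       # -1, because "nmr-" does not occur
at_start = name.find("ethanol")   # 0, a match at the very start

print(found, missing, at_start)   # prints: 12 -1 0
```

Note that a match at the very start gives 0, which Python treats as false in an if statement. That is why the script compares against -1 instead of writing just if file.find('freq-').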

    I created three files, goodbye.txt, hello.txt, and helloworld.txt to demonstrate usage.

    In this example, I'll print all files that end with .txt, collect them into a list, then print only the files that have the substring "goodbye" in the filename. This should only print goodbye.txt.

    09:53 $ ls
    goodbye.txt    hello.txt      helloworld.txt
    (venv) ✔ ~/Desktop/ex 
    09:53 $ python
    Python 2.7.11 (default, Dec  5 2015, 14:44:47) 
    [GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.1.76)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import glob
    >>> for file in glob.glob('*.txt'):
    ...   print(file)
    ... 
    goodbye.txt
    hello.txt
    helloworld.txt
    >>> list_of_files = [ file for file in glob.glob('*.txt') ]
    >>> print(list_of_files)
    ['goodbye.txt', 'hello.txt', 'helloworld.txt']
    >>> for file in list_of_files:
    ...   if file.find('goodbye') != -1:
    ...     print(file)
    ... 
    goodbye.txt
    

    Indeed, goodbye.txt is the only file printed.
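If all you need is a yes/no answer and not the position, the in operator performs the same substring check and is usually easier to read. The sketch below assumes that only the file names matter; select_freq_files and the two file names are hypothetical, made up for illustration:

```python
def select_freq_files(filenames):
    """Return the names that contain 'freq-' (hypothetical helper)."""
    # '"freq-" in name' is True exactly when name.find("freq-") != -1
    return [name for name in filenames if "freq-" in name]


# Made-up file names for illustration
names = ["mol-opt_freq-conf-1.out", "mol-nmr-conf-1.out"]
freq_names = select_freq_files(names)
print(freq_names)
```

Only the first name contains "freq-", so it is the only one kept.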