I'm a chemistry student and want to write a script to extract some data (such as coupling constants and interproton distances) from Gaussian output files.
I found a script that extracts chemical shifts from Gaussian output files. However, I don't understand what if file.find('freq-') != -1 means in the script.
Here's part of the script (it also does other things, so I've only shown the part relevant to my question):
def read_gaussian_freq_outfiles(list_of_files):
    list_of_freq_outfiles = []
    for file in list_of_files:
        if file.find('freq-') != -1:
            list_of_freq_outfiles.append([file, int(get_conf_number(file)), open(file, "r").readlines()])
    return list_of_freq_outfiles

def read_gaussian_outputfiles():
    list_of_files = []
    for file in glob.glob('*.out'):
        list_of_files.append(file)
    return list_of_files
I think read_gaussian_outputfiles() creates a list and simply adds every file with the extension '.out' to it.
read_gaussian_freq_outfiles(list_of_files) probably collects the files that have "freq-" in the file name. But what does file.find('freq-') != -1 mean?
Does it mean "if whatever we find in the file name is not equal to -1", or something else?
Some additional information: the Gaussian output file names have the format xxxx-opt_freq-conf-yyyy.out, where xxxx is the name of the molecule and yyyy is a number.
When s.find(foo) fails to find foo in s, it returns -1. Therefore, when s.find(foo) does not return -1, we know it didn't fail.
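To make the return values concrete, here is a minimal sketch. The file name is a made-up example that follows the xxxx-opt_freq-conf-yyyy.out pattern from the question; it is not taken from the original script.

```python
# Hypothetical file name following the pattern described in the question.
name = "aspirin-opt_freq-conf-12.out"

# str.find returns the index of the first match, or -1 when there is no match.
found_at = name.find("freq-")   # index where "freq-" starts
missing = name.find("nmr-")     # -1, because "nmr-" is not in the name

# The comparison with -1 turns that index into a yes/no answer.
has_freq_find = name.find("freq-") != -1

# The "in" operator asks the same question more directly.
has_freq_in = "freq-" in name

# A match at the very start returns 0, which is falsy. That is why the
# script compares with -1 instead of writing "if name.find(...):".
at_start = name.find("aspirin")

print(found_at, missing, has_freq_find, has_freq_in, at_start)
```

If you only need to know whether the substring is present, and not where, "freq-" in name is usually the more readable choice.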
read_gaussian_freq_outfiles looks for the term "freq-" in each of the file names in list_of_files. If it finds this phrase in a file name, it appends a list containing the file name, a "conf number" (not sure what this is), and the contents of the file to a list called list_of_freq_outfiles.
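The filtering step can be sketched on its own, without opening any files. Note that get_conf_number is not shown in the question, so the version below is only my guess at what it might do (pull the number yyyy out of xxxx-opt_freq-conf-yyyy.out). The function select_freq_files and the file names are also hypothetical.

```python
import re


def get_conf_number(filename):
    """Hypothetical helper: extract yyyy from 'xxxx-opt_freq-conf-yyyy.out'."""
    match = re.search(r"conf-(\d+)\.out$", filename)
    if match is None:
        raise ValueError("no conformer number in " + filename)
    return int(match.group(1))


def select_freq_files(filenames):
    """Keep only names containing 'freq-', paired with their conformer number."""
    selected = []
    for name in filenames:
        if name.find("freq-") != -1:  # same test as in the original script
            selected.append((name, get_conf_number(name)))
    return selected


# Made-up file names for illustration only.
names = [
    "aspirin-opt_freq-conf-3.out",
    "aspirin-nmr-conf-3.out",
    "aspirin-opt_freq-conf-10.out",
]
result = select_freq_files(names)
print(result)
```

Only the two names containing "freq-" end up in result. The original script additionally stores the lines of each file as a third element.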
I created three files, goodbye.txt, hello.txt, and helloworld.txt, to demonstrate usage.
In this example, I'll print all files that end with .txt, create a list of files, then print all files that have the phrase "goodbye" in the filename. This should only print goodbye.txt.
09:53 $ ls
goodbye.txt hello.txt helloworld.txt
(venv) ✔ ~/Desktop/ex
09:53 $ python
Python 2.7.11 (default, Dec 5 2015, 14:44:47)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.1.76)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import glob
>>> for file in glob.glob('*.txt'):
...     print(file)
...
goodbye.txt
hello.txt
helloworld.txt
>>> list_of_files = [ file for file in glob.glob('*.txt') ]
>>> print(list_of_files)
['goodbye.txt', 'hello.txt', 'helloworld.txt']
>>> for file in list_of_files:
...     if file.find('goodbye') != -1:
...         print(file)
...
goodbye.txt
Indeed, goodbye.txt is the only file printed.
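As a side note, the same selection can also be done by the glob pattern itself. This is an alternative to the script's approach, not what the script does. The sketch below creates a few empty, made-up files in a temporary directory so that it runs anywhere, then compares the two ways of filtering.

```python
import glob
import os
import tempfile

# Create a scratch directory with made-up, empty files for the demonstration.
workdir = tempfile.mkdtemp()
for name in ["mol-opt_freq-conf-1.out", "mol-nmr-conf-1.out", "notes.txt"]:
    open(os.path.join(workdir, name), "w").close()

# Approach used in the script: glob every .out file, then filter with find().
via_find = sorted(
    os.path.basename(path)
    for path in glob.glob(os.path.join(workdir, "*.out"))
    if os.path.basename(path).find("freq-") != -1
)

# Alternative: put "freq-" into the pattern so glob does the filtering.
via_pattern = sorted(
    os.path.basename(path)
    for path in glob.glob(os.path.join(workdir, "*freq-*.out"))
)

print(via_find)
print(via_pattern)
```

Both lists contain only mol-opt_freq-conf-1.out.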