I'm not very familiar with Linux or Python. I'm taking a class that provides example code for an inverted index program in Python, and I would like to know how to run and test it. Here's the code that was provided to me.
This is the code for the mapping file. (inverted_index_map.py)
import sys

for line in sys.stdin:
    #print(line)
    # each input line is expected to look like: key<TAB>value
    key, value = line.split('\t', 1)
    for word in value.strip().split():
        if 3 <= len(word) <= 5:
            # parenthesized print runs under both Python 2 and Python 3
            print('%s\t%s' % (word, key.split(':', 1)[0])) #what are we emitting?
This is the code for the reduce program. (inverted_index_reduce.py)
import sys

key = None
total = ''
for line in sys.stdin:
    k, v = line.split('\t', 1)
    if key == k:
        total += v.strip() #what are we accumulating?
    else:
        if key:
            print('%s\t%s' % (key, total)) #what are we printing?
        key = k
        # strip here too, otherwise the trailing newline ends up inside total
        total = v.strip()
if key:
    print('%s\t%s' % (key, total)) #what are we printing?
It wasn't an executable file, so I tried:
chmod +x inverted_index_map.py
Then I tried to run the program with:
./inverted_index_map.py testfilename.txt
But I'm not sure whether the program is waiting for some kind of input from the keyboard. So my question is: how do I test this code and see the result? I'm really not familiar with Python.
These two programs are written as command-line filters: they read their input from stdin and write their output to stdout. By default, that means they take input from the keyboard and display output on the screen. In most Linux shells, you can change where input comes from and where output goes by using <file.txt to read input from file.txt and >file.txt to write output to file.txt. Additionally, you can make the output of one command become the input of another by using firstcommand | secondcommand.
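To make the stdin/stdout idea concrete, here is a minimal sketch, assuming Python 3. The function name shout and the sample text are made up for illustration. It shows a filter that reads from any input stream and writes to any output stream; a real command-line tool would pass sys.stdin and sys.stdout, which the shell connects to the keyboard, a file, or a pipe.

```python
import io

def shout(infile, outfile):
    """Copy every line from infile to outfile in upper case."""
    for line in infile:
        outfile.write(line.upper())

# A real tool would call shout(sys.stdin, sys.stdout) after "import sys".
# Here in-memory streams stand in for the redirected file and the screen.
fake_input = io.StringIO("hello\nworld\n")
fake_output = io.StringIO()
shout(fake_input, fake_output)
print(fake_output.getvalue(), end="")
```

Because the function only sees "a stream", it does not care whether the lines come from the keyboard, from <file.txt, or from another command's output.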
Another problem is that the scripts you posted don't have a #! (shebang) line, which means that you will need to use python inverted_index_map.py to run your programs.
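If you would rather keep using ./inverted_index_map.py, the script needs a shebang as its very first line. The sketch below shows what such a line could look like and where it goes. It assumes an interpreter named python is on your PATH; the helper name with_shebang is hypothetical and exists only for this illustration.

```python
# Assumption: an interpreter called "python" can be found on PATH.
SHEBANG = "#!/usr/bin/env python"

def with_shebang(source):
    """Return source with a shebang as its first line, unless it already has one."""
    if source.startswith("#!"):
        return source
    return SHEBANG + "\n" + source

script = "import sys\nfor line in sys.stdin:\n    pass\n"
print(with_shebang(script).splitlines()[0])
```

With that first line in place and the chmod +x you already ran, the shell can start the script directly, and input still has to be supplied through stdin, for example with <testfilename.txt.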
If you want to run inverted_index_map.py with input from testfilename.txt and see the output on the screen, you should try running:
python inverted_index_map.py <testfilename.txt
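To get a feel for what that command should print, the mapper's loop can be wrapped in a function and fed a made-up line. This is only a sketch: the function name map_lines and the sample line are assumptions, and the real contents of testfilename.txt are unknown. The one thing the posted mapper does require is a tab on every input line, because line.split('\t', 1) must produce two parts.

```python
def map_lines(lines):
    """Yield 'word<TAB>doc' pairs for words of 3 to 5 characters."""
    for line in lines:
        key, value = line.split('\t', 1)   # fails if the line has no tab
        doc = key.split(':', 1)[0]         # keep only the part before ':'
        for word in value.strip().split():
            if 3 <= len(word) <= 5:
                yield '%s\t%s' % (word, doc)

# Hypothetical input line: "<doc>:<something><TAB><text>"
sample = ["doc1:7\tthe quick brown fox is extraordinary\n"]
for pair in map_lines(sample):
    print(pair)
```

For this sample, "is" and "extraordinary" are filtered out by the length check, and each remaining word is paired with doc1. If a line in your test file has no tab, the unpacking raises ValueError, so that is the first thing to check when the mapper crashes.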
To run inverted_index_map.py followed by inverted_index_reduce.py with input from testfilename.txt and output written to outputfile.txt, you should try running:
python inverted_index_map.py <testfilename.txt | python inverted_index_reduce.py >outputfile.txt
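One thing to watch for when testing this way: the reducer only compares each key with the key on the previous line, so lines for the same word are merged only when they are adjacent. The sketch below mirrors the reducer's loop in a function (the name reduce_lines and the sample lines are made up, and every value is stripped of its trailing newline before being joined) and runs it on unsorted and sorted mapper-style output.

```python
def reduce_lines(lines):
    """Join the values of consecutive lines that share the same key."""
    key = None
    total = ''
    for line in lines:
        k, v = line.split('\t', 1)
        if key == k:
            total += v.strip()
        else:
            if key:
                yield '%s\t%s' % (key, total)
            key = k
            total = v.strip()
    if key:
        yield '%s\t%s' % (key, total)

# Hypothetical mapper output, in the order the mapper happened to emit it.
mapped = ["fox\tdoc1\n", "the\tdoc1\n", "fox\tdoc2\n"]

print(list(reduce_lines(mapped)))          # "fox" shows up twice
print(list(reduce_lines(sorted(mapped))))  # "fox" lines are adjacent and merged
```

On the command line, the equivalent of sorted() is to put the standard sort command between the two scripts: python inverted_index_map.py <testfilename.txt | sort | python inverted_index_reduce.py >outputfile.txt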