Full disclosure: this is homework, but I'm having a lot of difficulty with it. My professor set a rather particular challenge in one portion of the assignment that I can't figure out. I'm trying to read a very large file into a list of lists that represents a movie recommendation matrix. She says this can be done without a for loop and suggests using the readlines() method.
I've been running this code:
movMat = []
with open(u"movie-matrix.txt", 'r', encoding="ISO-8859-1") as f:
    movMat.append(f.readlines())
But when I run diff on the output, it is not equivalent to the original file. Any suggestions for how I should go about this?
Update: upon further analysis, I think my code is correct. I added this so that each line is paired with its index in a tuple.
with open(u"movie-matrix.txt", 'r', encoding="ISO-8859-1") as f:
    movMat = list(enumerate(f.readlines()))
Update 2: Since people seem to want the file I'm reading from, allow me to explain. It is a ranking system from 1 to 5. Fields are separated by ';', and if a person has not ranked a movie, that field is left empty. This is the second line of the file.
"3;;;;3;;;;;;;;3;;;;;;;;;2;;;;;;;;3;;;;;;;;;;;;5;;;;;;;1;;;;;;;;;;;;;;;3;;;;;;;;3;;;;;;;;;;;4;;;;4;;;;;3;;;2;;;;;;;2;;;;;;;;3;;;;;;;;;;;;;;;;;;;;4;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;4;;;;;;;;;;;;;;;3;;;;3;;;4;2;;;;;;3;;;;;;4;;;;3;;;;;3;;;;;;;;;;;;2;;;;;;;;;;;;;;;3;4;;;;;;5;;;;;;;;;;;3;2;;;1;;;;;4;;;4;3;;;;;;;;;;;;4;3;;;;;;;;2;;3;;2;;;;;;;;;;;;;;;4;;;;;1;;2;;;;;;;;;;;;;;;;;;;5;;;;;;;;;;;;;;;;;4;;;;;;;;;;4;4;;;;2;3;;;;;;3;;4;;;;;;4;;;;;3;3;;;;;;1;;4;;;;;;;;;4;;;;;;;;;2;;;;3;;;;;;4;;;;;;;3;;;;;;;;4;;;;;4;;;;;;;;;;;1;;;;;;5;;;;;;;;;;;;4;;;3;;;;;;;;2;;1;;;;;;;;;4;;;;;;;;;;;;;;;3;;;;;;;;;;;5;;;;4;;;;;;;3;;;;;;;;2;;;;;;;;;;3;;;;;5;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;3;;;;;;;;;;;;;;;;;;2;;;3;4;;;;;3;;;;;4;;;;;;;;4;;4;3;;;;;4;;3;;;1;;3;;;;;2;;;;;;;;;;;4;;;;;;;;;;;3;;;;3;;;;;;;;;;;;;;;;;;;3;;;;4;;;;;;3;;;;;;;;;;;;4;;;;;;;;;;;3;;;;;;;;3;;;4;;4;;;;;;3;;;;;;;3;;;;;;;;;3;1;;;;;;;;;;;;;;;;3;;;;;3;5;;4;;;;;;4;;3;4;;;;;;;;3;;;;;;;;;;;3;;;;3;;;;;;;;;;;;;;4;;5;;;;;;;;;;;;;;;;;;4;;;;2;;2;;;;;;;;;;3;;;;;;4;;;3;;;4;;;;3;;;3;;;;;;;;;;;;;;;;;3;;;;;;;;3;;;;;;;;;;4;;;;;;;;;5"
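To make the format concrete, here is a minimal sketch of how one such line could be turned into a row of ratings. The short sample line and the parse_ratings name are made up for illustration, and it assumes the surrounding quotes above are not part of the file itself. It uses map rather than a for loop, in keeping with the assignment's constraint.

```python
# Hypothetical short line in the same ';'-separated format (not real data).
sample_line = "3;;;;3;;5\n"


def parse_ratings(line):
    """Split one line into ratings: an int per ranked field, None per empty field."""
    fields = line.rstrip("\n").split(";")
    # map() applies the conversion to every field without an explicit for loop.
    return list(map(lambda field: int(field) if field else None, fields))


row = parse_ratings(sample_line)
print(row)  # [3, None, None, None, 3, None, 5]
```

Applying the same function to every line, for example with map(parse_ratings, lines), would give a list of lists.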
I can't think of any case where f.readlines() would be better than just using f as an iterable. For example:
with open('movie-matrix.txt', 'r', encoding="ISO-8859-1") as f:
    movMat = list(f)
(There is no reason to use the u'...' notation in Python 3, which you must be using if the built-in open takes encoding=... !-)
Yes, f.readlines() would be equivalent to list(f), but it's more verbose and less obvious, so what's the point?!
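If you want to check that equivalence yourself, a small sketch along these lines should do. The sample content and the temporary file name are stand-ins, not the real movie-matrix.txt. It also shows that append(f.readlines()) wraps the lines in one extra outer list, which is worth knowing when comparing outputs.

```python
import os
import tempfile

# Hypothetical sample content standing in for movie-matrix.txt.
sample = "3;;4\n;5;\n1;;\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample-matrix.txt")
    with open(path, "w", encoding="ISO-8859-1", newline="") as f:
        f.write(sample)

    # Iterating the file object yields one string per line.
    with open(path, "r", encoding="ISO-8859-1") as f:
        via_list = list(f)

    # readlines() returns the same list of lines.
    with open(path, "r", encoding="ISO-8859-1") as f:
        via_readlines = f.readlines()

    # append() adds the whole list as a single element: a list inside a list.
    nested = []
    with open(path, "r", encoding="ISO-8859-1") as f:
        nested.append(f.readlines())

print(via_list == via_readlines)   # True
print(len(via_list), len(nested))  # 3 1
```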
Assuming you have to output this to another file, since you mention "running diff on the output", that would be
with open('other.txt', 'w', encoding="ISO-8859-1") as f:
    f.writelines(movMat)
No other non-for-loop alternatives there :-).
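A quick way to convince yourself that the round trip reproduces the file is to compare bytes instead of running diff. This is only a sketch under assumptions: the file names and sample content are made up, and newline="" is my addition so that line endings pass through untranslated on both read and write.

```python
import os
import tempfile

# Hypothetical sample content standing in for movie-matrix.txt.
sample = "3;;4\n;5;\n1;;\n"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample-matrix.txt")
    dst = os.path.join(tmp, "other.txt")

    with open(src, "w", encoding="ISO-8859-1", newline="") as f:
        f.write(sample)

    # Read every line (each keeps its trailing newline) without a for loop.
    with open(src, "r", encoding="ISO-8859-1", newline="") as f:
        movMat = list(f)

    # writelines() writes the strings back exactly as given, adding nothing.
    with open(dst, "w", encoding="ISO-8859-1", newline="") as f:
        f.writelines(movMat)

    # Byte-for-byte comparison, the same question diff would answer.
    with open(src, "rb") as a, open(dst, "rb") as b:
        identical = a.read() == b.read()

print(identical)  # True
```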