I have a .txt document with this type of text:
[(“Vazhdo”,”verb”),(“të”,”particle”),(“ecësh”,”verb”),(“!”,”excl.”)]
(This represents a sentence and the part-of-speech tag for each word.)
I want to have a list of lists in Python, like this:
[[(“Vazhdo”,”verb”),(“të”,”particle”),(“ecësh”,”verb”),(“!”,”excl.”)]]
But I obtain this:
['[(“Vazhdo”,”verb”),(“të”,”particle”),(“ecësh”,”verb”),(“!”,”excl.”)]\n']
The code I'm using is:
import io
f=io.open("test.txt", mode="r", encoding="utf-8-sig")
f_list = list(f)
How can I avoid getting ['[ .... ]\n'], i.e. a list whose only element is the whole line as a string?
Thank you!
It looks like you can just do:
import json
data = json.load(open('test.txt'))
Edit: that answer was wrong, sorry. [("word","QQ")]
is NOT valid JSON, because JSON has no tuple syntax and does not allow parentheses.
Instead, you should be able to do:
import ast
import io

# read the whole file and evaluate it as a Python literal
with io.open("test.txt", mode="r", encoding="utf-8-sig") as f:
    data = ast.literal_eval(f.read())
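One caveat: ast.literal_eval only accepts straight quotes (" or '). The sample in the question is shown with typographic quotes (“ and ”); if the file really contains those characters, and they are not just an artifact of pasting, parsing will fail with a SyntaxError. A minimal sketch of a workaround, assuming the curly quotes appear only as string delimiters and never inside the words themselves (the helper name parse_tagged_line is made up for illustration):

```python
import ast

# Map typographic double quotes (U+201C, U+201D) to plain ASCII quotes.
# Assumption: curly quotes are used only as delimiters, never inside a word.
CURLY_TO_STRAIGHT = {0x201C: u'"', 0x201D: u'"'}

def parse_tagged_line(line):
    """Normalize quotes, then parse one line as a Python literal."""
    return ast.literal_eval(line.strip().translate(CURLY_TO_STRAIGHT))

# One line written with curly quotes, similar to the sample in the question
raw = u'[(\u201cVazhdo\u201d,\u201dverb\u201d),(\u201c!\u201d,\u201dexcl.\u201d)]\n'
parsed = parse_tagged_line(raw)
print(parsed)
```

If the file already uses straight quotes, the translate step changes nothing and can be dropped.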
Here is my version:
import io,ast,requests
#text file available at
text_url = "https://gist.githubusercontent.com/joranbeasley/a50d940d9ac47e8458f027d3cc88e236/raw/3a65169d30e653e085284de16b1ee715f3596c95/example.txt"
with open("example.txt","wb") as f:
    # download and save the text file
    f.write(requests.get(text_url).content)
data = ast.literal_eval(io.open('example.txt',encoding='utf8').read())
print(data)
print(data[0])
print(data[0][0])
This results in:
[('Vazhdo', 'verb'), ('të', 'particle'), ('ecësh', 'verb'), ('!', 'excl.')]
('Vazhdo', 'verb')
Vazhdo
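Note that the result above is a single list of tuples, because the file holds one sentence. If the goal is a list of lists (one inner list per sentence), one possible approach is to parse each line separately. This is a rough sketch, assuming the file has exactly one sentence per line and uses straight quotes; the function name read_tagged_sentences and the temporary demo file are made up for illustration:

```python
import ast
import io
import os
import tempfile

def read_tagged_sentences(path):
    """Parse every non-empty line of the file as a Python literal."""
    sentences = []
    with io.open(path, mode="r", encoding="utf-8-sig") as f:
        for line in f:
            line = line.strip()  # drop the trailing "\n"
            if line:             # skip blank lines
                sentences.append(ast.literal_eval(line))
    return sentences

# Demo: a throwaway file that repeats the sentence from the question twice
line = u'[("Vazhdo","verb"),("të","particle"),("ecësh","verb"),("!","excl.")]'
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with io.open(path, mode="w", encoding="utf-8") as f:
    f.write(line + u"\n" + line + u"\n")

sentences = read_tagged_sentences(path)
print(len(sentences))   # 2
print(sentences[0][0])  # ('Vazhdo', 'verb')
```

Each element of sentences is then a list of (word, tag) tuples, so sentences[0][0][0] is the first word of the first sentence.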