I have downloaded the data with wget
!wget http://nlp.stanford.edu/data/glove.6B.zip
- ‘glove.6B.zip’ saved [862182613/862182613]
It is saved as zip and I would like to use glove.6B.300d.txt file from the zip file. What I want to achieve is :
embeddings_index = {}
with io.open('glove.6B.300d.txt', encoding='utf8') as f:
for line in f:
values = line.split()
word = values[0]
coefs = np.asarray(values[1:],dtype='float32')
embeddings_index[word] = coefs
Of course I am having this error:
IOErrorTraceback (most recent call last)
<ipython-input-47-d07cafc85c1c> in <module>()
1 embeddings_index = {}
----> 2 with io.open('glove.6B.300d.txt', encoding='utf8') as f:
3 for line in f:
4 values = line.split()
5 word = values[0]
IOError: [Errno 2] No such file or directory: 'glove.6B.300d.txt'
How can I unzip and use that file in my code above on Google colab?
Its simple, checkout this older post from SO.
import zipfile
zip_ref = zipfile.ZipFile(path_to_zip_file, 'r')
zip_ref.extractall(directory_to_extract_to)
zip_ref.close()