python, json, data-analysis

Failing to load data in IPython when following the example from "Python for Data Analysis", ch. 2


I am just starting to learn Python from Wes McKinney's book, "Python for Data Analysis". I installed Python using Enthought Canopy 1.5.2-win-64 (Enthought no longer seems to distribute EPDFree, which the book recommends).

I am stuck on Wes's first example, which prevents me from working through the rest of the chapter. The first example reads the first line of a text file available at https://github.com/pydata/pydata-book/tree/master/ch02. Here is the code:

ipython --pylab
path = 'ch02/usagov_bitly_data2012-03-16-1331923249.txt'
open(path).readline()

I just get a newline as output, '\n', whereas the output in the book is:

'{ "a": "Mozilla\\/5.0 (Windows NT 6.1; WOW64) AppleWebKit\\/535.11
(KHTML, like Gecko) Chrome\\/17.0.963.78 Safari\\/535.11", "c": "US", "nk":1,
"tz": "America\\/New_York", "gr": "MA", "g": "A6qOVH", "h": "wfLQtf", "l":
"orofrog", "al": "en-US,en;q=0.8", "hh": "1.usa.gov", "r":
"http:\\/\\/www.facebook.com\\/l\\/7AQEFzjSi\\/1.usa.gov\\/wfLQtf", "u":
"http:\\/\\/www.ncbi.nlm.nih.gov\\/pubmed\\/22415991", "t":1331923247, "hc":
1331822918, "cy": "Danvers", "ll": [ 42.576698, -70.954903 ] }\n'

Unfortunately, I do not know any JSON yet, but the file provided on Wes McKinney's website does not seem to be exactly the same as the one in the book. I am not sure whether that could be the source of my problem.

I am new to Python, so any help would be greatly appreciated!


Solution

  • What is the actual content of that file on disk? Note that the path you pass to open(path).readline() is relative to whichever directory you were in when you started ipython --pylab. However, you didn't get a "File not found" error, so I assume a file exists in the right place.

    How did you retrieve the file to use it locally? The book isn't specific. Did you go to the GitHub page and download the zip package? Use Git to download the whole repository? Right-click in the browser to save the file? Did you make sure you actually downloaded the raw file and not the HTML page representing the file?

    Edit: the OP confirms that the file they had was saved from the browser with a right-click, so it was actually an HTML page, not the raw JSON file. The problem was fixed by downloading the whole repository as a zip from its front page and working from within that package.
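
    The "what is actually on disk?" check can also be done from Python. What follows is only a rough sketch, not anything from the book: the helper name diagnose_first_line and its four return labels are made up for illustration, and the HTML test is a crude guess based on how saved web pages usually begin. It skips blank lines, then reports whether the first real line looks like HTML or parses as JSON.

```python
import io
import json


def diagnose_first_line(path):
    """Classify the first non-blank line of a file.

    Hypothetical helper: returns "empty", "html", "json" or "unknown".
    """
    # io.open accepts an encoding on both Python 2 and Python 3.
    with io.open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            stripped = line.strip()
            if stripped:
                break
        else:
            # The loop never hit a non-blank line (or the file has no lines).
            return "empty"

    # Crude heuristic (an assumption): saved web pages usually start like this.
    if stripped.lower().startswith(("<!doctype", "<html")):
        return "html"

    try:
        json.loads(stripped)
    except ValueError:  # raised for text that is not valid JSON
        return "unknown"
    return "json"


# Possible use from the IPython session in the question:
#   import os
#   os.getcwd()                  # the directory that relative paths resolve against
#   diagnose_first_line(path)    # path as defined in the question
```

    A result of "html" would point to a saved web page rather than the raw data file, which matches what the OP eventually found.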