python python-3.x excel python-requests xlsx

How to download the first row of an xlsx file from a URL in Python

I have been using the requests library to load a single line from a URL:

import requests

def get_line(url):
    resp = requests.get(url, stream=True)
    for line in resp.iter_lines(decode_unicode=True):
        yield line

line = get_line(url)
print(next(line))

Text files load perfectly. But if I try to load an .xlsx file, the result looks like unprintable symbols:

PK [symbols] [Content_Types].xml [symbols]

Is there a way to load a single row of cells?

Solution

You can't read the raw HTTP response line by line and pick out the Excel data: an .xlsx file is a ZIP archive of XML parts (hence the leading PK in your output), not a text file. To get its contents in a usable form you need a library that understands the format.
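
To see the container structure for yourself, here is a minimal sketch that lists the files packed inside an .xlsx payload using only the standard library. The function name list_xlsx_parts is illustrative, not part of any library.

```python
import io
import zipfile


def list_xlsx_parts(xlsx_bytes: bytes) -> list:
    """Return the names of the files stored inside an .xlsx (ZIP) payload."""
    # ZipFile needs a seekable file object, so wrap the bytes in BytesIO.
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as archive:
        return archive.namelist()
```

Called as list_xlsx_parts(requests.get(url).content), this should typically show entries such as [Content_Types].xml alongside the worksheet XML files, which is why the workbook has no meaningful "first line" in the byte stream.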

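One commonly used library is xlrd. Note that xlrd 2.0 and later read only the legacy .xls format, so for .xlsx files you need a release before 2.0. You can install one with pip:

pip3 install "xlrd<2.0"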

Example:

import requests
import xlrd

example_url = 'http://www.excel-easy.com/examples/excel-files/fibonacci-sequence.xlsx'
r = requests.get(example_url)  # make an HTTP request (downloads the whole file)
r.raise_for_status()  # stop early on an HTTP error instead of parsing an error page

# open the workbook from the downloaded bytes (.xlsx requires xlrd < 2.0)
workbook = xlrd.open_workbook(file_contents=r.content)
worksheet = workbook.sheet_by_index(0)  # get first sheet
first_row = worksheet.row(0)  # you can iterate over rows of a worksheet as well

print(first_row)  # list of cells

xlrd documentation
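
If pinning an older xlrd is not an option (its 2.0+ releases no longer open .xlsx), a different library has to be swapped in. The following is a minimal sketch using openpyxl, which is not the library used in this answer; the function name first_row_values is illustrative, and it assumes openpyxl 2.6 or later is installed (pip3 install openpyxl).

```python
import io

from openpyxl import load_workbook


def first_row_values(xlsx_bytes: bytes) -> list:
    """Return the cell values of the first row of the first worksheet."""
    # read_only=True parses rows lazily instead of loading every cell.
    workbook = load_workbook(io.BytesIO(xlsx_bytes), read_only=True)
    try:
        worksheet = workbook.worksheets[0]
        rows = worksheet.iter_rows(min_row=1, max_row=1, values_only=True)
        # An empty sheet yields no rows, so fall back to an empty tuple.
        return list(next(rows, ()))
    finally:
        workbook.close()
```

It would be called as first_row_values(requests.get(example_url).content). The whole file is still downloaded first; read-only mode only avoids parsing more of the sheet than needed.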

If you need to read your data line by line, switch to a simpler data format, such as .csv or plain text.
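
For comparison, a CSV served over HTTP can be read one record at a time. This is a minimal sketch under the assumption that the URL returns plain CSV text; both function names are illustrative, and the UTF-8 fallback is an assumption for servers that do not declare an encoding.

```python
import csv
from typing import Iterable, List


def first_csv_row(lines: Iterable[str]) -> List[str]:
    """Parse only the first record from an iterable of text lines."""
    return next(csv.reader(lines))


def fetch_first_csv_row(url: str) -> List[str]:
    """Stream a CSV from a URL and return its first row without reading the rest."""
    import requests  # imported here so first_csv_row works without requests

    with requests.get(url, stream=True) as resp:
        resp.raise_for_status()
        # iter_lines yields bytes when no encoding is known, so set a fallback.
        resp.encoding = resp.encoding or "utf-8"
        return first_csv_row(resp.iter_lines(decode_unicode=True))
```

Because the response is streamed, only the beginning of the body needs to arrive before the first row is returned, which is the behaviour the original get_line generator relied on.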