Tags: python, pandas, csv, size, shapes

How to determine the shape of a .tsv file in Python


I have a .tsv file that looks like this: [image: .tsv file structure in MS Excel]

I want to determine its shape through Python. How can I do that?

I wrote this code:

import pandas as pd

df = pd.read_csv("path/to/.tsv")

df.shape

and it outputs:

(13596, 1)

But this clearly conflicts with the shape shown in the image I provided. What am I doing wrong?


Solution

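  • You need to specify how the data is delimited when using pd.read_csv (unless it is comma-separated). With the default comma separator, each tab-separated line is read as a single field, which is why you get only one column.

    df = pd.read_csv("path/to/.tsv", sep="\t")

    This should load the data correctly.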

    See: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

    Edit: looking at your data, you should also specify header=None because you don't have a header row; otherwise the first data row is used as the column names and is not counted in the shape. Ideally, also supply a list of column names using the names parameter of pd.read_csv.
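
To see how each option changes the reported shape, here is a minimal sketch that uses a small made-up tab-separated sample held in memory instead of a file on disk. The sample data and the column names id, label and score are hypothetical and are not taken from the file in the question.

```python
import io

import pandas as pd

# Hypothetical sample: three tab-separated rows, no header row
tsv_text = "1\tfoo\t0.5\n2\tbar\t1.5\n3\tbaz\t2.5\n"

# Default separator (comma): each line becomes one field,
# and the first line is consumed as the header
df_default = pd.read_csv(io.StringIO(tsv_text))
print(df_default.shape)  # (2, 1)

# Tab separator: three columns, but the first row is still taken as the header
df_tab = pd.read_csv(io.StringIO(tsv_text), sep="\t")
print(df_tab.shape)  # (2, 3)

# Tab separator, no header row, explicit (hypothetical) column names
df_named = pd.read_csv(
    io.StringIO(tsv_text),
    sep="\t",
    header=None,
    names=["id", "label", "score"],
)
print(df_named.shape)  # (3, 3)
```

With a real file, the same arguments would be passed along with the file path in place of the io.StringIO object.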