
Read TSV file in pyspark


What is the best way to read a .tsv file with a header in PySpark and store it in a Spark DataFrame?

I have tried the "spark.read.options" and "spark.read.csv" commands, but with no luck.

Thanks.

Regards, Jit


Solution

  • You can read the TSV file directly, without supplying an external schema, as long as a header row is available:

    df = spark.read.csv(path, sep='\t', header=True).select('col1', 'col2')
    

    Because Spark is lazily evaluated and applies column pruning, only the selected columns are actually read. Hope it helps.