Custom delimiter csv reader spark

I would like to read in a file with the following structure with Apache Spark.

628344092\t20070220\t200702\t2007\t2007.1370

The delimiter is \t. How can I implement this while using spark.read.csv()?

The CSV is too large for pandas, which takes ages to read it. Is there a way to do this in Spark that works like

pandas.read_csv(file, sep = '\t')

Thanks a lot!

Solution

Use spark.read.option("delimiter", "\t").csv(file). The option name sep works as well as delimiter.

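If the file contains a literal backslash followed by t rather than the tab character, escape the backslash: spark.read.option("delimiter", "\\t").csv(file). Spark applies its own escape handling to the delimiter string, so check the parsed columns on your Spark version.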
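
For the plain tab case, here is a minimal PySpark sketch of the call above. The helper name read_tsv, the header parameter, and the path "data.tsv" are made up for illustration, and the usage lines assume pyspark is installed and a SparkSession can be created.

```python
def read_tsv(spark, path, header=False):
    """Read a tab-separated file into a Spark DataFrame.

    `spark` is an existing SparkSession. `read_tsv` is a hypothetical
    helper name, not part of the Spark API.
    """
    # In Python source, "\t" is the single tab character (length 1).
    return (
        spark.read
        .option("sep", "\t")        # "delimiter" is accepted as well
        .option("header", header)   # the sample row has no header line
        .csv(path)
    )


# Usage (assumes pyspark is installed; "data.tsv" is a placeholder path):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.appName("tsv-example").getOrCreate()
# df = read_tsv(spark, "data.tsv")
# df.show(5)
```

In PySpark the same thing can be written as spark.read.csv(file, sep="\t"), which is the closest match to pandas.read_csv(file, sep='\t').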
