I have an ORC file on my local machine and I need any reasonable format from it (e.g. CSV, JSON, YAML, ...).
How can I convert ORC to CSV?
You can use the ORC tools for this. To build them, go to the java folder of the Apache ORC source tree and execute Maven: mvn install
This is how I use them; you will likely need to adjust the paths:
java -jar ~/.m2/repository/org/apache/orc/orc-tools/1.5.4/orc-tools-1.5.4-uber.jar data ~/your_file.orc > output.json
The output is JSON Lines, which is easy to convert to CSV. First I had to remove the last two lines from the output, then:
import pandas as pd
df = pd.read_json('output.json', lines=True)
df.to_csv('output.csv')
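
If pandas is not available, the same two steps (drop the trailing lines, then convert) can be sketched with only the Python standard library. This is a minimal sketch, not the method used above: the function name jsonl_to_csv and its skip_trailing parameter are hypothetical, and it assumes every remaining line is a flat JSON object and that exactly the last two lines need to be dropped.

```python
import csv
import json


def jsonl_to_csv(src_path, dst_path, skip_trailing=2):
    """Convert a JSON Lines file to CSV, ignoring the last `skip_trailing` lines.

    Hypothetical helper: assumes each kept line is a flat JSON object.
    Returns the number of records written.
    """
    with open(src_path, encoding="utf-8") as src:
        lines = src.read().splitlines()
    if skip_trailing:
        lines = lines[:-skip_trailing]
    # Parse one JSON object per non-empty line.
    rows = [json.loads(line) for line in lines if line.strip()]

    # Collect column names in first-seen order, since records may differ in keys.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)  # keys missing from a record become empty cells
    return len(rows)
```

Usage would be jsonl_to_csv('output.json', 'output.csv'). Nested structs or lists in the ORC data would be written as their Python string form, so for such files the pandas route is probably the better fit.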