python, apify

Preserve column order in Apify Python actor


I'm trying to store a pandas dataframe in Apify and preserve the column order. The column order is correct in my dataframe, then I use the following commands to store the data:

df_output = df.to_json(orient='records', date_format='iso')
default_dataset_client.push_items(df_output)

The data then gets stored with alphabetically sorted columns, instead of the original order I had in the dataframe.

Interestingly, the JSON preview shows the right order, but if I download the CSV or Excel file, the columns are in alphabetical order.

[Image: JSON file preview with correct column order]

[Image: HTML preview with alphabetical (incorrect) column order]

Any ideas on how to preserve the column order in this case?


Solution

  • Apify orders dataset columns automatically, and unfortunately you can't change this. If you really need a specific column order, you'll have to resort to a workaround.

    The easiest approach I can suggest is to add a prefix to each column name so that the columns stay in place after the alphabetical sort. For example, name your columns 01_Farm name, 02_Crop, 03_Category, and so on.
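
The prefix workaround could be sketched roughly as follows. This is only an illustration: the helper name order_prefix_mapping is hypothetical, and it assumes the export sorts column names as plain strings, so the numbers are zero-padded to make 10_ sort after 09_. It also assumes column names are unique, since duplicates would collapse in the mapping.

```python
def order_prefix_mapping(columns):
    """Map each column name to a zero-padded, order-preserving name.

    Hypothetical helper for illustration; not part of pandas or Apify.
    """
    columns = list(columns)
    # Pad to the width of the largest index so "10_" sorts after "09_".
    width = max(2, len(str(len(columns))))
    return {
        name: f"{i:0{width}d}_{name}"
        for i, name in enumerate(columns, start=1)
    }


mapping = order_prefix_mapping(["Farm name", "Crop", "Category"])
renamed = list(mapping.values())

# The prefixed names keep their original order under an alphabetical sort.
assert sorted(renamed) == renamed

# With the DataFrame from the question, the mapping would be applied
# before serialising, for example:
#   df = df.rename(columns=order_prefix_mapping(df.columns))
#   df_output = df.to_json(orient='records', date_format='iso')
#   default_dataset_client.push_items(df_output)
```

The prefixes will appear in the exported CSV or Excel headers, so whoever consumes the file may want to strip them again after download.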