I have a CSV file whose rows look roughly like this:
field1,bmm/bdd/byyyy,emm/edd/eyyyy,field4....
I am successfully creating a json file like this:
{
"field1": [
{
"begDate": byyyybmmbdd,
"endDate": eyyyyemmedd,
"score": field4,
.....
The Python script I was using works fine, but it gives me a deprecation warning:
import json
from datetime import datetime
import pandas as pd
dateparse = lambda x: datetime.strptime(x, '%m/%d/%Y')
df = pd.read_csv("input.csv", parse_dates=['Start', 'End'], date_parser=dateparse)
df['Start'] = df['Start'].astype(str)
df['End'] = df['End'].astype(str)
df['score'] = df['score'].round(decimals=3)
res = {}
for a1, df_gp in df.groupby('field1'):
    res[a1] = df_gp.drop(columns='field1').to_dict(orient='records')
print(json.dumps(res, indent=4).lower())
FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version.
Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.
I'd like to be able to run the script without the warning, so I modified it accordingly:
dateparse = lambda x: datetime.strptime(x, '%m/%d/%Y')
df = pd.read_csv("input.csv", parse_dates=['Start', 'End'], date_format=dateparse)
I also tried this:
dateparse = lambda x: datetime.strptime(x, '%m/%d/%Y').strftime("%Y%m%d")
df = pd.read_csv("input.csv", parse_dates=['Start', 'End'], date_format=dateparse)
but the json output gives me the wrong date format:
{
"field1": [
{
"begDate": bmm/bdd/byyyy,
"endDate": emm/edd/eyyyy,
"score": 0.0,
....
Are there any suggestions on how to get rid of this warning while still getting the desired output?
The warning goes away once you stop passing a callable at all. date_format is not a drop-in replacement for the deprecated date_parser: it expects a strftime-style format string such as '%m/%d/%Y', not a function, which is why handing it your lambda leaves the dates unparsed. Instead, load the date columns as objects (plain strings), then convert them with pd.to_datetime and dt.strftime to get the format you want. For example:
import json
import pandas as pd
# Read the CSV without date parsing; 'Start' and 'End' load as plain strings (object dtype)
df = pd.read_csv("input.csv")
# Parse the mm/dd/yyyy strings, then format them as "YYYYMMDD"
df['Start'] = pd.to_datetime(df['Start'], format='%m/%d/%Y').dt.strftime('%Y%m%d')
df['End'] = pd.to_datetime(df['End'], format='%m/%d/%Y').dt.strftime('%Y%m%d')
df['score'] = df['score'].round(3)
# Group by field1 and build the nested structure for JSON
res = {}
for a1, df_gp in df.groupby('field1'):
    res[a1] = df_gp.drop(columns='field1').to_dict(orient='records')
print(json.dumps(res, indent=4).lower())
This way, you get rid of the warning and still achieve your desired JSON output.
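If you would rather keep the parsing inside read_csv, the warning's other suggestion should also work: pass the format string itself to date_format. The following is a minimal sketch, assuming pandas 2.0 or later (where date_format was introduced). The sample rows and the helper name csv_to_grouped_json are made up for illustration, and the column names Start, End, score, and field1 are taken from your script.

```python
import io
import json

import pandas as pd

# Hypothetical sample rows standing in for input.csv
csv_text = """field1,Start,End,score
a,01/15/2023,02/20/2023,0.12345
a,03/01/2023,03/31/2023,0.5
b,12/31/2022,01/01/2023,1.0
"""


def csv_to_grouped_json(source):
    """Read the CSV, reformat the dates as YYYYMMDD, and group rows by field1."""
    # date_format takes a format string, not a callable (assumes pandas >= 2.0)
    df = pd.read_csv(source, parse_dates=['Start', 'End'], date_format='%m/%d/%Y')
    # The columns are now datetime64, so .dt.strftime is available
    for col in ['Start', 'End']:
        df[col] = df[col].dt.strftime('%Y%m%d')
    df['score'] = df['score'].round(3)
    return {
        key: group.drop(columns='field1').to_dict(orient='records')
        for key, group in df.groupby('field1')
    }


result = csv_to_grouped_json(io.StringIO(csv_text))
print(json.dumps(result, indent=4))
```

To use it on your real file, pass "input.csv" in place of the StringIO object. Either approach avoids the deprecated argument; this one just does the parsing at read time.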