I have a code:
using (OracleCommand command = new OracleCommand(query, connection))
{
using (OracleDataReader reader = command.ExecuteReader())
{
using (var parser = new ChoParquetWriter(parquetSourcePath)
.Configure(c => c.RowGroupSize = 1000)
.NotifyAfter(1000)
.OnRowsWritten((o, e) => Console.WriteLine($"Rows: {e.RowsWritten} <--- {DateTime.Now}"))
.Setup(s => s.BeforeRecordFieldWrite += (o, e) =>
{
if (e.Source == DBNull.Value)
e.Source = null;
}))
{
if (reader.HasRows)
{
parser.Write(reader);
}
}
}
}
Query is a simple:
string query = $"SELECT x.*, x.ROWID FROM {owner}.{tableName} x";
In query
I ask for rows from Oracle Database table, where one column is a TIMESTAMP type. I can see, that parser is writing first 2000 rows without exception. But after starting writing another batch, it throws exception:
System.ArgumentOutOfRangeException: Year, Month, and Day parameters describe an un-representable DateTime. at Oracle.ManagedDataAccess.Types.DateTimeConv.ToDateTime(Byte[] byteRep, Boolean isNotTimeStampTZ, Int32 offset, Int32 length) at Oracle.ManagedDataAccess.Client.OracleDataReader.GetDateTime(Int32 i) at Oracle.ManagedDataAccess.Client.OracleDataReader.GetValue(Int32 i) at ChoETL.ChoParquetWriter`1.Write(IDataReader dr)
Exception is being thrown at a line:
parser.Write(reader);
I tried to fix it inside .Setup
.Setup(s => s.BeforeRecordFieldWrite += (o, e) =>
{
if (e.Source == DBNull.Value)
e.Source = null;
else if (e.Source is DateTime dt)
{
if (dt < DateTime.MinValue || dt > DateTime.MaxValue)
e.Source = null; // or set to a default value
}
}))
or
.Setup(s => s.BeforeRecordFieldWrite += (o, e) =>
{
if (e.Source == DBNull.Value)
e.Source = null;
else if (e.Source is string str && DateTime.TryParse(str, out DateTime dt))
{
if (dt < DateTime.MinValue || dt > DateTime.MaxValue)
e.Source = null; // or set to a default value
else
e.Source = dt;
}
})
But it doesn't help.
Can you help me with this? I suspect the problem is with some bad date (example 30 February), but how to prevent it from throwing exception?
Is there a fix without changing query
value?
I have changed my query
and added TO_CHAR for problematic columns.
TO_CHAR(BIRTHDATE, 'YYYY-MM-DD HH24:MI:SS') AS BIRTHDATE
Now whole code is working.