Tags: microsoft-dynamics, azure-synapse

Dynamics F&O Export to Data Lake to Synapse Serverless: "Unexpected end-of-input within record at..." errors when querying, I think while the file is updating


I'm getting an intermittent error when querying certain 'tables' in our Synapse Serverless database. The error is always the same, but the table/csv file referenced can change.

OLE DB or ODBC error: [DataSource.Error] Microsoft SQL: Error encountered while parsing data: 'Unexpected end-of-input within record at [byte: 179132185]. '. Underlying data description: file 'Tables/ProfessionalServices/ProjectManagementAndAccounting/Transaction/ProjInvoiceEmpl/PROJINVOICEEMPL_00003.csv'. Statement ID: {1F343EAF-0F1D-4693-A79B-84EFAEFA3175} | Query hash: 0x5DB5B39FB3B72B73 | Distributed request ID: {2D4C2C9F-8912-4548-8D34-7C62732DEF6D}. Total size of data scanned is 3984 megabytes, total size of data moved is 90 megabytes, total size of data written is 0 megabytes..

Our full setup is:

Dynamics F&O ERP ---> F&O's built-in [Export to Data Lake] functionality ---> Synapse Data Lake ---> Synapse Serverless DB (which is constructed of views that hit the above Data Lake files).

I'm 90% certain the issue is the underlying file - in the above error, [Tables/ProfessionalServices/ProjectManagementAndAccounting/Transaction/ProjInvoiceEmpl/PROJINVOICEEMPL_00003.csv] errors because it's being updated by F&O at the same moment I'm querying it.

Reasons I think this:

  1. 15/15 times, when I go to check the underlying CSV file that threw the error, it has just received an update
  2. The issue is intermittent - a query fails once, then may work fine a moment later
  3. The error is reproducible in both SSMS and Power BI
  4. The tables/CSV files having issues are ones I'd intuitively expect to be frequently updated
  5. Errors are most common during business hours, less common in the evening, and nonexistent at night - i.e. error frequency tracks how likely something is to be updating

Edit: Found some documentation, but it appears to say 'pound sand'. I'm thinking there must be something one can do, though - else what's the point of near-real-time data export from F&O if it fails 15% of the time it's called?

https://learn.microsoft.com/en-us/dynamics365/fin-ops-core/dev-itpro/data-entities/finance-data-azure-data-lake

I'm wondering if there's a solution on the Dynamics F&O side, or the Synapse Data Lake side, to avoid this.

What I've tried / considered

  1. We do have this enabled in our Serverless Views: ROWSET_OPTIONS = '{"READ_OPTIONS":["ALLOW_INCONSISTENT_READS"]}'

  2. I think using CETAS (CREATE EXTERNAL TABLE AS SELECT) would avoid this issue, but it would negate the advantages as well.

A CETAS table can't be refreshed in place - it can only be dropped and recreated. So we'd end up dropping and recreating the CETAS every night:

  • So now the data is only as fresh as the nightly refresh, instead of being almost live. This defeats the purpose of F&O's 'Export to Data Lake'
  • It obviously adds more ETL/ELT time, complexity, and points of failure to everything
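For context, here's a sketch of what one of our serverless views looks like with the `ALLOW_INCONSISTENT_READS` option applied. The data source name and column handling are illustrative, not our real definitions; the option relaxes file-consistency checks during the scan, but it evidently can't save a query that reads a row truncated mid-write:

```sql
-- Sketch of a serverless view over the Export to Data Lake CSVs.
-- 'fno_datalake' is an assumed external data source name.
CREATE VIEW dbo.ProjInvoiceEmpl AS
SELECT *
FROM OPENROWSET(
    BULK 'Tables/ProfessionalServices/ProjectManagementAndAccounting/Transaction/ProjInvoiceEmpl/*.csv',
    DATA_SOURCE = 'fno_datalake',
    FORMAT = 'CSV',
    -- Don't fail the scan when the file's consistency snapshot changes mid-query:
    ROWSET_OPTIONS = '{"READ_OPTIONS":["ALLOW_INCONSISTENT_READS"]}'
) AS rows;
```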

Solution

  • Found Microsoft Documentation

    Q: When I read data files by using Synapse serverless, I notice an intermittent error: "Unexpected end-of-input within record at...."

    A: When tables records are frequently updated or deleted in finance and operations apps, you might notice a read/write contention on CSV files, and this error might occur. This issue occurs when the Synapse serverless SQL service queries a CSV file as it's being updated. In many cases, you might be able to fix this issue by retrying the query.

    So this appears to be a "working as expected" situation - the functionality is just poor.
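    Since Microsoft's own guidance is to retry, one pragmatic workaround is to wrap serverless queries in a retry loop that retries only on this specific parse error. A minimal sketch in Python; the marker substring, retry count, and backoff values are assumptions to tune, and `run_query` stands in for however you execute the query (e.g. via pyodbc):

    ```python
    import time

    # Substring of the transient error Synapse serverless raises when a CSV
    # file is read while Export to Data Lake is rewriting it.
    TRANSIENT_MARKER = "Unexpected end-of-input within record"

    def query_with_retry(run_query, retries=4, backoff_seconds=2.0):
        """Run a query callable, retrying only on the known transient parse error.

        run_query: zero-argument callable that executes the query and returns rows.
        Retries with exponential backoff; any other exception re-raises immediately.
        """
        for attempt in range(retries + 1):
            try:
                return run_query()
            except Exception as exc:
                if TRANSIENT_MARKER not in str(exc) or attempt == retries:
                    raise
                time.sleep(backoff_seconds * (2 ** attempt))
    ```

    With pyodbc this would be called as `query_with_retry(lambda: cursor.execute(sql).fetchall())`. This doesn't fix the contention, but it hides the intermittent failures from report consumers at the cost of occasional extra latency.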

    I now wonder how other companies work around this, since, to my thinking, most of the point of Export to Data Lake is that it's near real-time. But if queries on near-real-time data frequently fail, that defeats the purpose.

    That's a different question, though - I'll post it separately.