I have created a managed table in U-SQL and loaded data into it. When I try to read from it, the job shows the status "Preparing" for about 3 hours and is then cancelled by YARN.
I tried the rebuild table command, and the same thing happens.
The table holds audit data: whenever I process a file from the Data Lake, I write its audit details (file name, location, record count) to that table. So far I have processed around 36,000 files. When I try to use the table for the final audit report, the job stays in "Preparing" for 3 hours and is then cancelled by YARN.
Please provide more information:
UPDATE:
Based on your statement that you "processed around 36k files", I assume that you insert the data for each file individually into the table. This is not recommended: it leads to table fragmentation, which in turn causes the preparation phase to run out of time during code generation. Since you already have 36k table fragments, you should drop the table and do a single INSERT from an EXTRACT over the 36k files, specified as a file set using the fast file set preview feature I mention above. That way you can avoid this problem.
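A minimal sketch of what such a single load could look like. The table name `dbo.FileAudit`, the input path `/input/`, and the file columns `Id` and `Payload` are all hypothetical placeholders, since the thread does not show the real schema. The setting that enables the file set preview feature is not reproduced in this thread, so it is only marked with a comment.

```usql
// Enable the file set preview feature referred to above here
// (the exact setting is not shown in this thread).

// Hypothetical audit table; adjust names and types to the real schema.
DROP TABLE IF EXISTS dbo.FileAudit;

CREATE TABLE dbo.FileAudit
(
    FileName string,
    Location string,
    RecordCount long,
    INDEX idx_FileAudit CLUSTERED (FileName ASC)
    DISTRIBUTED BY HASH (FileName)
);

// One EXTRACT over the whole file set. {FileName} is a virtual column
// that is filled from the matching part of each file's path.
// Id and Payload stand in for the real file columns (assumption).
@rows =
    EXTRACT Id int,
            Payload string,
            FileName string
    FROM "/input/{FileName}.csv"
    USING Extractors.Csv();

// One audit row per file.
@audit =
    SELECT FileName,
           COUNT(*) AS RecordCount
    FROM @rows
    GROUP BY FileName;

// A single INSERT for all files instead of one INSERT per file.
INSERT INTO dbo.FileAudit (FileName, Location, RecordCount)
SELECT FileName,
       "/input/" AS Location,
       RecordCount
FROM @audit;
```

The point of the sketch is that the table receives one INSERT covering all files, rather than one INSERT per file, each of which adds a fragment.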
Once you have loaded the data, you need to rebuild the table or partition to avoid later fragmentation.
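As a sketch, rebuilding a whole table is a single statement. The table name `dbo.FileAudit` is a hypothetical placeholder; a partitioned table would use the partition form of the same statement, which is not shown here.

```usql
// Rebuild the (hypothetical) audit table so that the fragments
// created by earlier INSERTs are compacted.
ALTER TABLE dbo.FileAudit REBUILD;
```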
We are working on improving scalability and adding more features around rebuilding fragmented tables, but they will not arrive before the second half of this year at the earliest. So it is important that you avoid such fragmentation.