I cannot copy a 30 GB SQL Server table from this machine to my Azure Data Lake Storage Gen2 account as a 530 MB Parquet file using Azure Data Factory. The compression type is gzip, and the throughput is 11.8 MB/s.
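The copy activity looks roughly like this (a simplified sketch; the dataset names are placeholders, and the activity name is the one that appears in the error below):

{
    "name": "Copy Latest Source Data",
    "type": "Copy",
    "description": "Simplified sketch; dataset names are placeholders",
    "inputs": [ { "referenceName": "SqlServerTableDataset", "type": "DatasetReference" } ],
    "outputs": [ { "referenceName": "ParquetSinkDataset", "type": "DatasetReference" } ],
    "typeProperties": {
        "source": { "type": "SqlServerSource" },
        "sink": { "type": "ParquetSink" }
    }
}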
The error message from the failed ADF copy activity is:
{ "errorCode": "2200", "message": "Failure happened on 'Sink' side. ErrorCode=UserErrorFailedBlobFSOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=BlobFS operation failed for: A task was canceled.. Account: 'datalake'. FileSystem: &aposcontainer-dl'. Path: 'ImportLayer/F61ILBarAcct_Txns.parquet'.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Threading.Tasks.TaskCanceledException,Message=A task was canceled.,Source=mscorlib,'", "failureType": "UserError", "target": "Copy Latest Source Data" }
The same error appears in the log of the self-hosted integration runtime on the client machine:
DEBUG:
TraceComponentId: TransferClientLibrary
TraceMessageId: BlobFSOperationRetry
@logId: Warning
jobId: c063e070-cc12-4cae-895f-f8ada2bfa3ff
activityId: ecfa652d-8471-4297-be2a-4ecc0ebc89c5
eventId: BlobFSOperationRetry
message: 'Type=System.Threading.Tasks.TaskCanceledException,Message=A task was canceled.,Source=mscorlib,StackTrace= at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.Storage.Data.AzureDfsClient.<UpdatePathWithHttpMessagesAsync>d__41.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.Storage.Data.AzureDfsClientExtensions.<UpdatePathAsync>d__24.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.Azure.Storage.Data.AzureDfsClientExtensions.UpdatePath(IAzureDfsClient operations, String action, String filesystem, String path, Nullable`1 position, Nullable`1 retainUncommittedData, String contentLength, String xMsLeaseAction, String xMsLeaseId, String xMsCacheControl, String xMsContentType, String xMsContentDisposition, String xMsContentEncoding, String xMsContentLanguage, String xMsProperties, String ifMatch, String ifNoneMatch, String ifModifiedSince, String ifUnmodifiedSince, Stream requestBody, String xMsClientRequestId, Nullable`1 timeout, String xMsDate)
at Microsoft.Azure.Storage.Data.BlobFSClient.<>c__DisplayClass37_0.<AppendFile>b__1()
at Microsoft.Rest.TransientFaultHandling.RetryPolicy.<>c__DisplayClass16_0.<ExecuteAction>b__0()
at Microsoft.Rest.TransientFaultHandling.RetryPolicy.ExecuteAction[TResult](Func`1 func),'
On the client machine, the CPU is an Intel Xeon E7-2830 @ 2.13 GHz on a 64-bit OS. It has 16.0 GB of RAM and a 40 GB hard drive with 10 GB of free space. I increased the virtual memory maximum to 10 GB so the page file can use that free space. For the Java option, I set -Xmx (the Java maximum heap size) to 26 GB to take advantage of the combined RAM and page file. I can only use this client machine, which has the integration runtime installed on it.
What could be the problem?
I managed to solve it by using the compression type snappy instead of gzip, which uses less processing power.
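The only change needed is the compressionCodec property on the Parquet sink dataset. A simplified sketch (the linked service name is a placeholder; the file system and path are the ones from the error above):

{
    "name": "ParquetSinkDataset",
    "properties": {
        "description": "Simplified sketch; linked service name is a placeholder",
        "type": "Parquet",
        "linkedServiceName": { "referenceName": "DataLakeGen2LinkedService", "type": "LinkedServiceReference" },
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": "container-dl",
                "folderPath": "ImportLayer",
                "fileName": "F61ILBarAcct_Txns.parquet"
            },
            "compressionCodec": "snappy"
        }
    }
}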
In addition, I ran the copies one at a time instead of many at once. It is slower but safer.
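One way to enforce that in the pipeline is to chain the copy activities with dependsOn, so each copy starts only after the previous one succeeds instead of running in parallel (a simplified sketch; activity names are placeholders and the dataset references are omitted):

{
    "name": "SequentialCopiesPipeline",
    "properties": {
        "description": "Simplified sketch; activity names are placeholders, dataset references omitted",
        "activities": [
            {
                "name": "Copy Table 1",
                "type": "Copy",
                "typeProperties": {
                    "source": { "type": "SqlServerSource" },
                    "sink": { "type": "ParquetSink" }
                }
            },
            {
                "name": "Copy Table 2",
                "type": "Copy",
                "dependsOn": [
                    { "activity": "Copy Table 1", "dependencyConditions": [ "Succeeded" ] }
                ],
                "typeProperties": {
                    "source": { "type": "SqlServerSource" },
                    "sink": { "type": "ParquetSink" }
                }
            }
        ]
    }
}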