I am trying to do some source transformation in ADF where I have alot server logs in the following format:
#PartnerName QA Server
#ApplicationName T_GSPClient
#AccountName DoNotModifyDMS
#SDK desktop
#ClientVersion 5.1.1894.3
#InputChannel DesktopMic
#User JohnDoe
#NmsLogin JohnDoe
#SessionId 7ba732d6-3445-4b16-b7e8-345fgd4f5g4
#ClientIP 209.122.69.109
#SRTechnology S2
#SROptions NoTextBefore
#GeneralLogLevel Trace
#ModuleLogLevels
#ServerDateTimeUTC 2023-07-06 15:28:33.105
{"date":"2023-07-06\t15:28:09.653","level":"TRACE","msg":"DMVAServerMessage-Initialize:{"dmhMessage":{"messageHeader":{"messageType":"unsubscribe","sessionId":"00000000-0000-0000-0000-000000000000","messageId":"aeb65e55-a7c0-4e96-a960-dc2a252d6b2c","transactionType":"acknowledgement","clientType":"ttsChannel","version":"1.0","application":"DesktopSDK"},"messageResponse":{"resultCode":"SERVER_ERROR","errorMessage":"The subscription id does not exist, unsubscribe did not remove an entry"}}} ","traceId":"409e0d44-ad50-4f17-84c7-0521e01e11fc","spanId":"e0750e44-ad50-4f17-90a9-3b6940e0294b","resource":{"module":".NET","class":"Nuance.SpeechAnywhere.Internal.DMVA.DMVAServerMessage","function":"Initialize","line":70,"pid":26612,"thread":"[28-27280]"}}
{"date":"2023-07-06\t15:28:09.653","level":"TRACE","msg":"DMVAServerMessage-Initialize:{"dmhMessage":{"messageHeader":{"messageType":"unsubscribe","sessionId":"00000000-0000-0000-0000-000000000000","messageId":"aeb65e55-a7c0-4e96-a960-dc2a252d6b2c","transactionType":"acknowledgement","clientType":"ttsChannel","version":"1.0","application":"DesktopSDK"},"messageResponse":{"resultCode":"SERVER_ERROR","errorMessage":"The subscription id does not exist, unsubscribe did not remove an entry"}}} ","traceId":"409e0d44-ad50-4f17-84c7-0521e01e11fc","spanId":"e0750e44-ad50-4f17-90a9-3b6940e0294b","resource":{"module":".NET","class":"Nuance.SpeechAnywhere.Internal.DMVA.DMVAServerMessage","function":"Initialize","line":70,"pid":26612,"thread":"[28-27280]"}}
{"date":"2023-07-06\t15:28:09.653","level":"TRACE","msg":"DMVAServerMessage-Initialize:{"dmhMessage":{"messageHeader":{"messageType":"unsubscribe","sessionId":"00000000-0000-0000-0000-000000000000","messageId":"aeb65e55-a7c0-4e96-a960-dc2a252d6b2c","transactionType":"acknowledgement","clientType":"ttsChannel","version":"1.0","application":"DesktopSDK"},"messageResponse":{"resultCode":"SERVER_ERROR","errorMessage":"The subscription id does not exist, unsubscribe did not remove an entry"}}} ","traceId":"409e0d44-ad50-4f17-84c7-0521e01e11fc","spanId":"e0750e44-ad50-4f17-90a9-3b6940e0294b","resource":{"module":".NET","class":"Nuance.SpeechAnywhere.Internal.DMVA.DMVAServerMessage","function":"Initialize","line":70,"pid":26612,"thread":"[28-27280]"}}
{"date":"2023-07-06\t15:28:09.653","level":"TRACE","msg":"DMVAServerMessage-Initialize:{"dmhMessage":{"messageHeader":{"messageType":"unsubscribe","sessionId":"00000000-0000-0000-0000-000000000000","messageId":"aeb65e55-a7c0-4e96-a960-dc2a252d6b2c","transactionType":"acknowledgement","clientType":"ttsChannel","version":"1.0","application":"DesktopSDK"},"messageResponse":{"resultCode":"SERVER_ERROR","errorMessage":"The subscription id does not exist, unsubscribe did not remove an entry"}}} ","traceId":"409e0d44-ad50-4f17-84c7-0521e01e11fc","spanId":"e0750e44-ad50-4f17-90a9-3b6940e0294b","resource":{"module":".NET","class":"Nuance.SpeechAnywhere.Internal.DMVA.DMVAServerMessage","function":"Initialize","line":70,"pid":26612,"thread":"[28-27280]"}}
My objective is remove/skip the first free text lines and keep the other json data and then move it to another blob for further transformation. I tried source transformation with derived column but my Data Flow is still showing json format error. I also used Copy Data activity via ForEach with "enableSkipIncompatibleRow": true, it doesn't work. It only works if I just work with single file, not when I try to iterate over many files and skip/remove those lines.
In order to remove the first few lines which are not in Json format using ADF dataflow, follow the below approach.
~
as column delimitter. So, the entire row is in the same column.substring(Column_name,1,1)=='{'
. This removes all the rows which do not begin with {
.Here, I took sink file as delimited dataset with column delimiter as No delimitter
and Quote character as No quote character
.
Data in Output file