Search code examples
azureazure-machine-learning-service

Azure Machine learning - Strip top X rows from dataset


I have a plain text csv file, which i am trying to read in Azure ML studio - the file format is pretty much like this

Geolife trajectory
WGS 84
Altitude is in Feet
Reserved 3
0,2,255,My Track,0,0,2,8421376
0
39.984702,116.318417,0,492,39744.1201851852,2008-10-23,02:53:04
39.984683,116.31845,0,492,39744.1202546296,2008-10-23,02:53:10
39.984686,116.318417,0,492,39744.1203125,2008-10-23,02:53:15
39.984688,116.318385,0,492,39744.1203703704,2008-10-23,02:53:20
39.984655,116.318263,0,492,39744.1204282407,2008-10-23,02:53:25
39.984611,116.318026,0,493,39744.1204861111,2008-10-23,02:53:30

The real data starts from Line 7, how could i strip it off, these files need to be downloaded on the fly so I don't think i would like to strip off the data by some code.


Solution

  • What is your source location - SQL or Blob or http?

    If SQL, then you can use query to start from line 6.

    If Blob/http, I would suggest reading a file as a single column TSV format, use simple R/Python script to drop first 6 rows and convert to csv