I have a file I am reading into a blob via datafactory. Its formatted in excel. Some of the column headers have special characters and spaces which isn't good if want to take it to csv or parquet and then SQL. Is there a way to correct this in the pipeline? Example "Activations in last 15 seconds high+Low" "first entry speed (serial T/a)"
Thanks
Normally, Data Flow can handle this for you by adding a Select transformation with a Rule:
SPECIAL CASE
There may be an issue with this is the column name contains forward slashes ("/"). I accidentally came across this in my testing:
Every one of the columns not mapped contains forward slashes. Unfortunately, I cannot explain why this would be the case as Data Flow is clearly aware of the column name. It can be addressed manually by adding a Fixed rule for EACH offending column, which is obviously less than ideal:
ANOTHER OPTION
The other thing you could try is to pre-process the text file with another Data Flow using a Source dataset that has no delimiters. This would give you the contents of each row as a single column. If you could get a handle on the just first row, you could remove the special characters.