Tags: azure, azure-machine-learning-service

Change data type using Azure Machine Learning Studio


Allow me to ask this question again, as the answers I found on the forum have not helped me so far.

I am trying to convert a column from the 'string' data type to a numeric data type.

The column has no missing values and no errors, and it comes from a CSV file. For the record, I tried modifying the format of the column in the CSV file and saving it as a number, but when I later imported the CSV file into Azure ML the column was still read as a string.

So far, I have tried the following options:

  • 'Execute Python Script' module. Unfortunately, it does not work: it returns an error when I run the experiment. The code I entered is:

    import pandas as df
    
    def azureml_main (df):
      df.age=pd.to_numeric(df.age,errors=’coerce’)
    
    return df
    
  • 'Edit Metadata' module. I selected 'Integer' or 'Floating point' as the data type, but I keep getting an error when running the experiment.

Please kindly let me know what your thoughts are.

Thanks for your help.

Josep Maria

P.S.: This is the second time I have posted in this forum. I hope the question is well formulated this time.

[Screenshot of the 'Execute Python Script' error]


Solution

  • It looks like the Python script just needs a little updating. :)

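    This should work, since you get dataframe1 automatically as a pandas data frame. In your version, pandas was imported as df but then used as pd, the return statement sat outside the function, and the quotes around coerce were curly quotes rather than plain ones.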

    import pandas as pd
    
    def azureml_main(dataframe1 = None, dataframe2 = None):
      # Values that cannot be parsed as numbers become NaN
      dataframe1.age = pd.to_numeric(dataframe1.age, errors="coerce")
    
      return dataframe1
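
    If you want to check the conversion outside Azure ML first, here is a minimal sketch you can run locally with plain pandas. The sample values are made up for illustration and are not from your CSV. Because errors="coerce" turns anything that cannot be parsed into NaN, counting the NaN values afterwards is a quick way to see whether the column contains hidden non-numeric entries, which may be why it was imported as a string.

```python
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    # Same conversion as in the script above.
    dataframe1["age"] = pd.to_numeric(dataframe1["age"], errors="coerce")
    return dataframe1


# Hypothetical sample data: "age" arrives as strings, one of them not a number.
sample = pd.DataFrame({"age": ["23", "41.5", "n/a"]})
result = azureml_main(sample.copy())

print(result["age"].dtype)         # float64 rather than object
print(result["age"].isna().sum())  # how many values could not be parsed
```

    If the NaN count is zero on your real data, every value was converted cleanly. If it is not, those rows are worth inspecting in the CSV.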