python azure databricks azure-databricks

NameError in a Python function: variable is not defined

I am currently working with Python on Azure Databricks. The cluster I work in was recently updated, and since then my solution no longer runs correctly. The exact error I get today is:

NameError: ("name 'ajuste_fecha' is not defined", 'occurred at index 0')

The code is the following:

from pyspark.sql.types import *
from pyspark.sql.functions import *
import pandas as pd
import datetime
# from datetime import datetime  
from pyspark.sql import SQLContext
import json
import re
import requests
from pandas.io.json import json_normalize
import json
import os
from os import listdir
from os.path import isfile, join
import io

def f_ajuste_fecha(fecha):
  global ajuste_fecha
  if fecha[0:3] == "Ene" :
    ajuste_fecha = "01"
  elif fecha[0:3] == "Feb" :
    ajuste_fecha = "02"
  elif fecha[0:3] == "Mar" :
    ajuste_fecha = "03"
  elif fecha[0:3] == "Abr" :
    ajuste_fecha = "04"
  elif fecha[0:3] == "May" :
    ajuste_fecha = "05"
  elif fecha[0:3] == "Jun" :
    ajuste_fecha = "06"
  elif fecha[0:3] == "Jul" :
    ajuste_fecha = "07"
  elif fecha[0:3] == "Ago" :
    ajuste_fecha = "08"
  elif fecha[0:3] == "Sep" :
    ajuste_fecha = "09"
  elif fecha[0:3] == "Oct" :
    ajuste_fecha = "10"
  elif fecha[0:3] == "Nov" :
    ajuste_fecha = "11"
  elif fecha[0:3] == "Dic" :
    ajuste_fecha = "12"
  return ajuste_fecha

res = requests.get("https://estadisticas.bcrp.gob.pe/estadisticas/series/api/PD04637PD/csv/")
df = pd.read_csv(io.StringIO(res.text.strip().replace("<br>","\n")), engine='python')

df.rename(columns={'Tipo de cambio - TC Interbancario (S/ por US$) - Compra':'valor'} , inplace=True)
df.rename(columns={'D&iacute;a/Mes/A&ntilde;o':'fecha'} , inplace=True)

df['periodo_temp'] = df.apply(lambda x : x['fecha'][3:9],axis=1)
df['periodo_temp_2'] = df.apply(lambda x : '20' +  x['fecha'][7:9] + '-' + f_ajuste_fecha(x['periodo_temp'][0:3])  + '-' + x['fecha'][0:2] ,axis=1)


df['periodo'] = pd.to_datetime(df['periodo_temp_2'], format='%Y-%m-%d')

df_temp = df[df['periodo'] == df['periodo'].max()]
df_2 = df_temp[['periodo', 'valor']]
df_dolar = df_2
df_dolar

As I mentioned, I did not have this problem before. I have tried initializing ajuste_fecha = '' outside the function, but that does not work either.

What am I doing wrong? What is causing this?

Thank you very much in advance, I will be attentive to your answers.

greetings!!

Solution

The API response you are getting does not match any of the conditions in your f_ajuste_fecha function. When no branch matches, the global ajuste_fecha is never assigned, so the return statement raises NameError: name 'ajuste_fecha' is not defined.

Maybe the update changed the output from Sep to Set. I just checked by calling the API, and this is what it returns:
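To illustrate why an unmatched value produces this error, here is a minimal sketch. The names `month_number`, `ajuste`, and `try_lookup` are hypothetical stand-ins for illustration, and the function is cut down to two branches; it is not the original code. It also shows a second hazard of the `global` approach: once any call has succeeded, a later unmatched value silently returns the previous result instead of failing.

```python
def month_number(fecha):
    # Hypothetical, cut-down stand-in for f_ajuste_fecha (two branches only).
    global ajuste
    if fecha[0:3] == "Ene":
        ajuste = "01"
    elif fecha[0:3] == "Sep":
        ajuste = "09"
    # If no branch matched and `ajuste` was never assigned, this raises NameError.
    return ajuste


def try_lookup(fecha):
    # Helper (assumption, for demonstration only): capture the error as text.
    try:
        return month_number(fecha)
    except NameError as exc:
        return f"NameError: {exc}"


# "Set" matches no branch and the global does not exist yet.
first = try_lookup("Set.20")

# A matching value assigns the global...
second = month_number("Ene.20")

# ...so the next unmatched value returns the stale "01" without any error.
third = month_number("Set.20")

print(first)
print(second, third)
```

This is consistent with the error being reported at index 0: the first row returned by the API is the first value the function sees.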

Día/Mes/Año,"Tipo de cambio - TC Interbancario (S/ por US$) - Compra"
"08.Set.20","3.54366666666667"
"09.Set.20","3.533"
"10.Set.20","3.54"
"11.Set.20","3.568"
"14.Set.20","3.567"
"15.Set.20","3.54733333333333"
"16.Set.20","3.53633333333333"
"17.Set.20","3.53116666666667"
"18.Set.20","3.52333333333333"
"21.Set.20","3.5455"
"22.Set.20","3.55183333333333"
"23.Set.20","3.57016666666667"
"24.Set.20","3.58033333333333"
"25.Set.20","3.593"
"28.Set.20","3.58816666666667"
"29.Set.20","3.5935"
"30.Set.20","3.59733333333333"
"01.Oct.20","3.60183333333333"
"02.Oct.20","3.618"

You need to update the condition in your code so that it matches Set rather than Sep:

  elif fecha[0:3] == "Set":      # replace "Sep" with "Set" here
    ajuste_fecha = "09"
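As a possible alternative to the if/elif chain, here is a sketch that uses a dictionary lookup instead of a global variable. The names `MESES` and `a_iso` are hypothetical additions; accepting both "Sep" and "Set" is an assumption made so the code keeps working if the abbreviation changes again; and the "20" century prefix simply mirrors the original code. An unknown abbreviation raises a descriptive ValueError rather than a NameError.

```python
# Month abbreviations mapped to month numbers.
# Both "Set" and "Sep" are accepted for September (assumption).
MESES = {
    "Ene": "01", "Feb": "02", "Mar": "03", "Abr": "04",
    "May": "05", "Jun": "06", "Jul": "07", "Ago": "08",
    "Set": "09", "Sep": "09", "Oct": "10", "Nov": "11", "Dic": "12",
}


def f_ajuste_fecha(fecha):
    """Return the two-digit month number for an abbreviation such as 'Set'."""
    abreviatura = fecha[0:3]
    try:
        return MESES[abreviatura]
    except KeyError:
        raise ValueError(
            f"Unrecognized month abbreviation: {abreviatura!r}"
        ) from None


def a_iso(fecha):
    """Convert a value such as '08.Set.20' to '2020-09-08' (hypothetical helper)."""
    dia, mes, anio = fecha.split(".")
    return f"20{anio}-{f_ajuste_fecha(mes)}-{dia}"


print(a_iso("08.Set.20"))
print(a_iso("01.Oct.20"))
```

With a helper like this, the two intermediate columns could be replaced by something along the lines of df['fecha'].apply(a_iso) before calling pd.to_datetime.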