Tags: python, pyspark, azure-synapse, gettime

Pyspark access data in folder XXX/2021/12/20


I have a pipeline running in Azure Synapse and I need to run PySpark code that creates a folder for the current date. The structure must be "2021/12/10" (the date of the latest pipeline run), with one folder each for year, month, and day.

path = 'dataupdated/yyyy/MM/dd'. I just need to automate the creation of these folders.

I think I have to use something like "get datetime".


Solution

  • You can build the path with the datetime module and create the folders with os.makedirs.

    import os
    from datetime import datetime as dt
    # Take one timestamp so year, month and day cannot disagree around midnight
    now = dt.now()
    # Produces a path such as "2021/12/10/file.extension"
    filename = f"{now.strftime('%Y/%m/%d')}/file.extension"
    # makedirs creates any missing directories; exist_ok=True tolerates existing ones
    os.makedirs(os.path.dirname(filename), exist_ok=True)
    with open(filename, "w") as f:
        f.write("test")
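
If the date folders need to sit under a base folder such as the 'dataupdated' from the question, the same idea can be wrapped in a small helper. This is only a sketch: `dated_folder` and `ensure_dated_folder` are hypothetical names, not part of any library, and the temporary directory exists just to keep the example self-contained. Note that `os.makedirs` acts on the local filesystem of the machine running the code; whether that reaches your storage account in Synapse depends on how the storage is mounted, which the question does not say.

```python
import os
import tempfile
from datetime import date
from typing import Optional


def dated_folder(base: str, day: Optional[date] = None) -> str:
    """Return '<base>/yyyy/MM/dd' for the given day (today when omitted)."""
    day = day or date.today()
    # date objects accept strftime codes in format specs
    return f"{base}/{day:%Y/%m/%d}"


def ensure_dated_folder(base: str, day: Optional[date] = None) -> str:
    """Create the dated folder if it is missing and return its path."""
    path = dated_folder(base, day)
    os.makedirs(path, exist_ok=True)  # no error if it already exists
    return path


# Assumption: a throwaway local directory stands in for the real storage root.
root = tempfile.mkdtemp()
created = ensure_dated_folder(f"{root}/dataupdated", date(2021, 12, 10))
print(created)
```

Calling `ensure_dated_folder("dataupdated")` without a date uses today's date, which matches the "current date folder" requirement for a daily run.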