Tags: python, web-scraping, cron, etl

Crontab not working properly but log shows the opposite


I'm working on a small ETL that collects data using web scraping, cleans and manipulates it, and loads it into a local SQLite3 database.

If I execute the command /virtualenv_path/python /script_path/script.py directly, it runs perfectly, but if I schedule the same command with crontab it does not work.

It just does not send any data. However, my log file shows that crontab is executing script.py using my venv, as expected.

So, what is going on? What should I do to solve this?

I don't think my script is incorrect, because when I execute it without crontab it works flawlessly, and even with crontab it does not show any error (as I said, the log file suggests that everything is going well).

This is my repository: https://github.com/raposofrct/wescraping-ETL. The ETL folder contains my script, the crontab command I'm using, and my SQLite database.

Thanks for any help or clues you can give me.


Solution

  • Your script is likely working, but it's not putting data into the database file you're looking at. The path hm_db.sqlite is resolved relative to whatever the current working directory happens to be:

    DataBase(dados,create_engine('sqlite:///hm_db.sqlite',echo=False))
    

    When cron runs the script, the current working directory (typically your home directory) is very unlikely to be the same directory you are in when you run the script manually. Either provide an absolute path or make the path relative to your script's directory, e.g.

    from pathlib import Path
    
    # Resolve the database file relative to the directory containing this
    # script, not the process's current working directory.
    root_directory = Path(__file__).parent
    database_file = root_directory / "hm_db.sqlite"
    
    DataBase(dados, create_engine(f"sqlite:///{database_file}", echo=False))
    
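    With this change, hm_db.sqlite is created next to script.py regardless of which directory cron (or you) starts the process from.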

    Alternatively, log os.getcwd() in your existing script to figure out where your cron job has been storing data.
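
    For example, a minimal sketch of that check (the logging setup here is just an assumption; wire the line into whatever logging your script already uses):

    import os
    import logging
    
    # Assumed basic logging setup for illustration; adapt to your existing log file.
    logging.basicConfig(level=logging.INFO)
    
    # Cron typically starts the process from your home directory, so this shows
    # where the relative hm_db.sqlite path actually resolves.
    logging.info("Current working directory: %s", os.getcwd())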