How to run a Python Jupyter notebook daily, automatically?
Update
Recently I came across papermill, which is a tool for executing and parameterizing notebooks.
https://github.com/nteract/papermill
papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
This seems better than nbconvert because you can pass parameters. You still have to trigger the command with a scheduler; below is an example using cron on Ubuntu.
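If you prefer to build the invocation from a scheduler script instead of hard-coding it, the CLI call above can be assembled programmatically. This is a minimal stdlib-only sketch; the helper `papermill_cmd` is hypothetical, only the command shape (`-p name value` pairs) follows papermill's documented CLI usage:

```python
import shlex

def papermill_cmd(input_nb, output_nb, params):
    """Build the papermill CLI string for a parameterized notebook run.

    Hypothetical helper: papermill itself must be installed for the
    resulting command to actually run.
    """
    cmd = ["papermill", input_nb, output_nb]
    for name, value in params.items():
        # each parameter becomes a "-p name value" pair
        cmd += ["-p", name, str(value)]
    return shlex.join(cmd)

print(papermill_cmd("local/input.ipynb", "s3://bkt/output.ipynb",
                    {"alpha": 0.6, "l1_ratio": 0.1}))
# → papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
```

The string can then be handed to cron, `subprocess.run`, or an Airflow BashOperator.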
Old Answer
nbconvert --execute
can execute a Jupyter notebook; embedded in a cronjob, this will do what you want.
Example setup on Ubuntu:
Create yourscript.sh with the following content:
#!/bin/bash
/opt/anaconda/envs/yourenv/bin/jupyter nbconvert \
--execute \
--to notebook /path/to/yournotebook.ipynb \
--output /path/to/yournotebook-output.ipynb
There are other output formats besides --to notebook. I like this one because you end up with a fully executed notebook afterwards, which serves as a log file.
I recommend running your notebook from a virtual environment, so that future package updates don't break your script. Don't forget to install nbconvert into that environment.
Now create a cronjob that runs every day, e.g. at 5:10 AM: type crontab -e
in your terminal and add this line:
10 5 * * * /path/to/yourscript.sh
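Cron jobs fail silently by default, so it helps to capture the script's output. A crontab entry like the following (the log path is just an example) appends both stdout and stderr to a file you can inspect after each run:

```shell
# Run daily at 5:10 AM; append stdout and stderr to a log for debugging
10 5 * * * /path/to/yourscript.sh >> /path/to/yourscript.log 2>&1
```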
Try the SeekWell Chrome Extension. It lets you schedule notebooks to run weekly, daily, hourly, or every 5 minutes, right from Jupyter Notebooks. You can also send DataFrames directly to Sheets or Slack if you like.
Here's a demo video, and there is more info in the Chrome Web Store link above as well.
Disclosure: I'm a SeekWell co-founder.
It's better to combine this with Airflow if you want higher-quality scheduling. I packaged them in a Docker image: https://github.com/michaelchanwahyan/datalab.
It works by modifying the open source package nbparameterize and passing in arguments such as execution_date. Graphs can be generated on the fly, and the output can be updated and saved inside the notebook.
When it is executed:
- the notebook is read and the parameters are injected
- the notebook is executed and the output overwrites the original path
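The injection step above can be sketched with the standard library alone, since a .ipynb file is just JSON. This is a simplified illustration of what nbparameterize-style tools do (real tools replace a cell tagged "parameters" rather than prepending one); `inject_parameters` is a hypothetical helper:

```python
import json

def inject_parameters(nb, params):
    """Prepend a code cell assigning the given parameters to a notebook dict.

    Simplified sketch: real parameterization tools locate and replace a
    tagged 'parameters' cell instead of inserting at the top.
    """
    source = "\n".join(f"{k} = {v!r}" for k, v in params.items())
    cell = {"cell_type": "code", "metadata": {}, "source": source,
            "outputs": [], "execution_count": None}
    if isinstance(nb, str):          # accept a JSON string or a dict
        nb = json.loads(nb)
    nb["cells"].insert(0, cell)      # injected cell runs before all others
    return nb

# minimal empty notebook skeleton
nb = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
out = inject_parameters(nb, {"execution_date": "2024-01-01"})
print(out["cells"][0]["source"])
# → execution_date = '2024-01-01'
```

After injection, the notebook would be executed (e.g. via nbconvert or papermill) and written back to the original path, as the two steps above describe.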
Besides that, the image also installs and configures common tools such as Spark, Keras, and TensorFlow.