How to schedule tasks on SageMaker

As of mid-2020, AWS provides several options for running a notebook as a cron job. The tooling uses Papermill to inject parameters for each run, and you can also use the CLI to run the notebook on demand.
You can (1) use the AWS APIs or CLI directly, (2) use a convenience package, or (3) use a JupyterLab extension.
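
To illustrate the Papermill part, here is a minimal sketch of a parameterized run. It is not the AWS tooling itself, only the underlying idea: `papermill.execute_notebook` is Papermill's real entry point, while `run_notebook`, `output_path_for`, the file names and the `epochs` parameter are hypothetical names chosen for the example. It assumes local paths and a notebook that has a cell tagged `parameters`.

```python
from datetime import datetime, timezone
from pathlib import Path


def output_path_for(input_path, output_dir, now=None):
    """Build a timestamped output path so each run keeps its own executed copy."""
    now = now or datetime.now(timezone.utc)
    stem = Path(input_path).stem
    return str(Path(output_dir) / f"{stem}-{now:%Y%m%dT%H%M%SZ}.ipynb")


def run_notebook(input_path, output_dir, parameters, execute=None, now=None):
    """Execute a notebook with injected parameters and return the output path.

    `execute` defaults to papermill.execute_notebook. It is a parameter only
    so that the function can be exercised without Papermill installed.
    """
    if execute is None:
        import papermill  # third-party: pip install papermill

        execute = papermill.execute_notebook
    out = output_path_for(input_path, output_dir, now)
    # Papermill overrides the values in the cell tagged "parameters".
    execute(input_path, out, parameters=parameters)
    return out
```

A cron entry or any other scheduler could then call something like `run_notebook("train.ipynb", "runs", {"epochs": 5})` once per night.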

See this tutorial and the Quick Start guide for examples.


I don't think there is any way to schedule tasks on SageMaker. The notebook is meant more for interacting with the SageMaker runtime, which is more for training and hosting ML models.

I am presuming you want to retrain your model every night. There are two ways of achieving that: retrain your model somewhere else, upload it to S3, and recreate your Docker container every night using an external script; or provide your own Docker container that has a cron job scheduled within it, and give that to SageMaker to deploy.


Amazon SageMaker is a set of APIs that can help with various machine learning and data science tasks. These APIs can be invoked from various sources, such as the CLI, an SDK, or specifically from scheduled AWS Lambda functions (see here for documentation: https://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html )
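
As a rough sketch of the scheduling side, the snippet below creates a scheduled rule and points it at a Lambda function. `put_rule` and `put_targets` are real methods of the boto3 `events` client; `aws_cron`, `schedule_lambda`, the rule name and the ARN are hypothetical names for this example. The client is passed in as an argument, so you would create it yourself with `boto3.client("events")`.

```python
def aws_cron(minute, hour, days_of_week="MON-FRI"):
    """Return an AWS cron schedule expression (evaluated in UTC)."""
    # Field order: minutes hours day-of-month month day-of-week year.
    # Day-of-month is "?" because day-of-week is specified.
    return f"cron({minute} {hour} ? * {days_of_week} *)"


def schedule_lambda(events_client, rule_name, expression, lambda_arn):
    """Create or update a scheduled rule and target a Lambda function with it."""
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=expression,
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "Id" only has to be unique within the rule.
        Targets=[{"Id": "1", "Arn": lambda_arn}],
    )
    return rule["RuleArn"]
```

For example, `schedule_lambda(boto3.client("events"), "nightly-stop", aws_cron(0, 20), lambda_arn)` would fire at 20:00 UTC on weekdays. The Lambda function also needs a resource-based permission that allows `events.amazonaws.com` to invoke it, which this sketch does not set up.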

The main parts of Amazon SageMaker are notebook instances, training and tuning jobs, and model hosting for real-time predictions. Each one has different types of schedules that you might want to have. The most popular are:

  • Stopping and Starting Notebook Instances - Since notebook instances are used for interactive ML model development, you don't really need them running at night or on weekends. You can schedule a Lambda function to call the stop-notebook-instance API at the end of the working day (8 PM, for example), and the start-notebook-instance API in the morning. Please note that you can also run crontab on the notebook instances (after opening the local terminal from the Jupyter interface).
  • Refreshing an ML Model - Automating the retraining of models on new data that keeps flowing into the system is a common problem that SageMaker makes easier to solve. Calling the create-training-job API from a scheduled Lambda function (or even from a CloudWatch Event that monitors the performance of the existing models), pointing to the S3 bucket where the old and new data reside, creates a refreshed model that you can then deploy into an A/B testing environment.

----- UPDATE (thanks to @snat2100 comment) -----

  • Creating and Deleting Real-time Endpoints - If your real-time endpoints are not needed 24/7 (for example, when they serve internal company users who work only during business days and hours), you can also create the endpoints in the morning and delete them at night.
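
The schedules listed above could be served by one small Lambda handler, sketched below under some assumptions. `stop_notebook_instance`, `start_notebook_instance` and `create_training_job` are real methods of the boto3 `sagemaker` client. The event shape (`action`, `notebook`, `training`), the helper `training_job_request`, and the instance type, volume size and runtime limit are hypothetical choices for this example.

```python
from datetime import datetime, timezone


def training_job_request(job_prefix, image_uri, role_arn, train_s3_uri,
                         output_s3_uri, now=None):
    """Build keyword arguments for create_training_job (illustrative defaults)."""
    now = now or datetime.now(timezone.utc)
    return {
        # Job names must be unique, so append a timestamp.
        "TrainingJobName": f"{job_prefix}-{now:%Y-%m-%d-%H-%M-%S}",
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Assumed sizing; adjust to the model being trained.
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def handler(event, context, client=None):
    """Lambda entry point: dispatch on a hypothetical event["action"] field."""
    if client is None:
        import boto3  # available in the AWS Lambda Python runtime

        client = boto3.client("sagemaker")
    action = event["action"]
    if action == "stop_notebook":
        client.stop_notebook_instance(NotebookInstanceName=event["notebook"])
    elif action == "start_notebook":
        client.start_notebook_instance(NotebookInstanceName=event["notebook"])
    elif action == "retrain":
        client.create_training_job(**training_job_request(**event["training"]))
    else:
        raise ValueError(f"unknown action: {action!r}")
    return {"action": action}
```

Each scheduled rule would then send a constant JSON payload such as `{"action": "stop_notebook", "notebook": "my-notebook"}`. The same pattern should extend to the endpoint case with the `create_endpoint` and `delete_endpoint` calls.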