mrjob: set up logging on EMR
Of all the options, the only one that really works is writing to stderr, either with a direct write (sys.stderr.write) or through a logger with a StreamHandler attached to stderr.
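Both approaches can be sketched as below. This is a minimal, stdlib-only illustration, not mrjob's own API: the helper name get_stderr_logger, the logger name "mrjob_task", and the standalone mapper function are hypothetical stand-ins for code you would place inside your own MRJob class.

```python
import logging
import sys


def get_stderr_logger(name="mrjob_task", level=logging.INFO):
    """Return a logger whose records go to stderr (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # Attach the handler only once, since a task may call this repeatedly.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    # Do not pass records on to the root logger, to avoid duplicate lines.
    logger.propagate = False
    return logger


def mapper(_, line):
    """Stand-in for an MRJob mapper method; yields (word, 1) pairs."""
    log = get_stderr_logger()
    log.info("processing line of length %d", len(line))
    # The direct-write alternative:
    sys.stderr.write("raw stderr message\n")
    for word in line.split():
        yield word, 1
```

Nothing here touches stdout, which matters because the task's stdout carries the job's actual output.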
Once the job has finished (successfully or with an error), the logs can be retrieved from:
[s3_log_uri]/[jobflow-id]/task-attempts/[job-id]/[attempt-id]/stderr
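If you fetch these files from a script, the path template above can be assembled with a small helper. This is only a sketch: the function name stderr_log_uri is hypothetical, and the values in the usage line are made-up placeholders, not real identifiers.

```python
def stderr_log_uri(s3_log_uri, jobflow_id, job_id, attempt_id):
    """Build the stderr log location for one task attempt (hypothetical helper)."""
    parts = [
        s3_log_uri.rstrip("/"),  # tolerate a trailing slash on the base URI
        jobflow_id,
        "task-attempts",
        job_id,
        attempt_id,
        "stderr",
    ]
    return "/".join(parts)


# Placeholder values for illustration only.
uri = stderr_log_uri("s3://my-bucket/logs/", "JOBFLOW_ID", "JOB_ID", "ATTEMPT_ID")
```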
Be sure your runners.emr.cleanup configuration is set to keep the logs.