Python logging into file as a dictionary or JSON
So based on @abarnert, i found this Link which provided a good path to making this concept work for the most part. The code as it stands is:
logger=logging.getLogger()
logger.setLevel(logging.DEBUG)
file_handler=logging.FileHandler('foo.log')
stream_handler=logging.StreamHandler()
stream_formatter=logging.Formatter(
'%(asctime)-15s %(levelname)-8s %(message)s')
file_formatter=logging.Formatter(
"{'time':'%(asctime)s', 'name': '%(name)s', \
'level': '%(levelname)s', 'message': '%(message)s'}"
)
file_handler.setFormatter(file_formatter)
stream_handler.setFormatter(stream_formatter)
logger.addHandler(file_handler)
logger.addHandler(stream_handler)
Although it does not fully meet the requirement, it doesnt require any pre processing, and allows me to create two log handlers.
Afterwards, i can use something like:
with open('foo.log') as f:
logs = f.read().splitlines()
for l in logs:
for key, value in eval(l):
do something ...
to pull dict
objects instead of fighting with improperly formatted JSON to accomplish what i had set out to accomplish.
Still am hoping for a more elegant solution.
I too dealt with this and I personally believe that an external library might be an overkill for something like this.
I studied a bit the code behind logging.Formatter and came up with a subclass which in my case does the trick (my goal was to have a JSON file that Filebeat can read to further log into ElasticSearch).
Class:
import logging
import json
class JsonFormatter(logging.Formatter):
"""
Formatter that outputs JSON strings after parsing the LogRecord.
@param dict fmt_dict: Key: logging format attribute pairs. Defaults to {"message": "message"}.
@param str time_format: time.strftime() format string. Default: "%Y-%m-%dT%H:%M:%S"
@param str msec_format: Microsecond formatting. Appended at the end. Default: "%s.%03dZ"
"""
def __init__(self, fmt_dict: dict = None, time_format: str = "%Y-%m-%dT%H:%M:%S", msec_format: str = "%s.%03dZ"):
self.fmt_dict = fmt_dict if fmt_dict is not None else {"message": "message"}
self.default_time_format = time_format
self.default_msec_format = msec_format
self.datefmt = None
def usesTime(self) -> bool:
"""
Overwritten to look for the attribute in the format dict values instead of the fmt string.
"""
return "asctime" in self.fmt_dict.values()
def formatMessage(self, record) -> dict:
"""
Overwritten to return a dictionary of the relevant LogRecord attributes instead of a string.
KeyError is raised if an unknown attribute is provided in the fmt_dict.
"""
return {fmt_key: record.__dict__[fmt_val] for fmt_key, fmt_val in self.fmt_dict.items()}
def format(self, record) -> str:
"""
Mostly the same as the parent's class method, the difference being that a dict is manipulated and dumped as JSON
instead of a string.
"""
record.message = record.getMessage()
if self.usesTime():
record.asctime = self.formatTime(record, self.datefmt)
message_dict = self.formatMessage(record)
if record.exc_info:
# Cache the traceback text to avoid converting it multiple times
# (it's constant anyway)
if not record.exc_text:
record.exc_text = self.formatException(record.exc_info)
if record.exc_text:
message_dict["exc_info"] = record.exc_text
if record.stack_info:
message_dict["stack_info"] = self.formatStack(record.stack_info)
return json.dumps(message_dict, default=str)
Usage:
The formatter must simply be passed to the logging handler.
json_handler = FileHandler("foo.json")
json_formatter = JsonFormatter({"level": "levelname",
"message": "message",
"loggerName": "name",
"processName": "processName",
"processID": "process",
"threadName": "threadName",
"threadID": "thread",
"timestamp": "asctime"})
json_handler.setFormatter(json_formatter)
Explanation :
While the logging.Formatter takes a string which it interpolates to output the formatted log record, the JsonFormatter takes a dictionary where the key will be the key of the logged value in the JSON string and the value is a string corresponding to an attribute of the LogRecord that can be logged. (List available in the docs here)
Main "problem" would be parsing dates and timestamps, and the default formatter implementation has these class attributes, default_time_format and default_msec_format.
default_msec_format gets passed to time.strftime() and default_msec_format is interpolated to append miliseconds as time.strftime() doesn't provide formatting options for those.
The principle is that those are now instance attributes which can be provided in the form of time_format and msec_format to customize how the parent's class (unchanged, as it's not overwritten) formatTime() method behaves.
You could technically override it if you want to customize time formatting, but I personally found that using something else would either be redundant or limit the actual formatting options. But feel free to adjust as to your needs.
Output:
An example JSON record logged by the formatting options above, with the default time formatting options set in the class, would be:
{"level": "INFO", "message": "Starting service...", "loggerName": "root", "processName": "MainProcess", "processID": 25103, "threadName": "MainThread", "threadID": 4721200640, "timestamp": "2021-12-04T08:25:07.610Z"}