How can I add a comment to a YAML file in Python
That is possible in principle, because you can round-trip such "start-of-file" comments, but it is not nicely supported in the current ruamel.yaml 0.10, and certainly not when "starting from scratch" (i.e. not changing an existing file). At the bottom is an easy and relatively nice solution, but I would first like to present an ugly workaround and a step-by-step explanation of how to get this done.
Ugly:
The ugly way to do this is to just add the comment to the file before you write the YAML data to it. That is, insert:
f.write('# Data for Class A\n')
just before ruamel.yaml.dump(...)
Step by step:
To insert the comment on the data structure, so the above hack is not necessary, you first need to make sure your d data is a CommentedMap. You can then see what is missing by comparing that d variable with one that has the comment, obtained by loading the commented YAML back into c:
import ruamel.yaml
from ruamel.yaml.comments import Comment, CommentedSeq, CommentedMap

d = CommentedMap()             # <<<<< most important
for m in ['B1', 'B2', 'B3']:
    d2 = {}
    for f in ['A1', 'A2', 'A3']:
        d2[f] = CommentedSeq(['test', 'test2'])
        if f != 'A2':
            d2[f].fa.set_flow_style()
    d[m] = d2

yaml_str = ruamel.yaml.dump(d, Dumper=ruamel.yaml.RoundTripDumper,
                            default_flow_style=False, width=50, indent=8)

assert not hasattr(d, Comment.attrib)  # no attribute on the CommentedMap

comment = 'Data for Class A'
commented_yaml_str = '# ' + comment + '\n' + yaml_str
c = ruamel.yaml.load(commented_yaml_str, Loader=ruamel.yaml.RoundTripLoader)
assert hasattr(c, Comment.attrib)  # c has the attribute
print(c.ca)                        # and this is what it looks like
print(d.ca)                        # accessing comment attribute creates it empty
assert hasattr(d, Comment.attrib)  # now the CommentedMap has the attribute
This prints:
Comment(comment=[None, [CommentToken(value=u'# Data for Class A\n')]],
items={})
Comment(comment=None,
items={})
A Comment has an attribute comment that needs to be set to a two-element list that consists of the EOL comment (always only one) and a list of preceding line comments (in the form of CommentTokens).
To create a CommentToken you need a (fake) start mark, here a StreamMark, that tells which column it starts in:
from ruamel.yaml.error import StreamMark
start_mark = StreamMark(None, None, None, 0, None, None) # column 0
Now you can create the token:
from ruamel.yaml.tokens import CommentToken
ct = CommentToken('# ' + comment + '\n', start_mark, None)
Assign the token as the first element of the preceding list on your CommentedMap:
d.ca.comment = [None, [ct]]
print(d.ca)  # in case you want to check
gives you:
Comment(comment=[None, [CommentToken(value='# Data for Class A\n')]],
items={})
And finally:
print(ruamel.yaml.dump(d, Dumper=ruamel.yaml.RoundTripDumper))
gives:
# Data for Class A
B1:
  A1: [test, test2]
  A3: [test, test2]
  A2:
  - test
  - test2
B2:
  A1: [test, test2]
  A3: [test, test2]
  A2:
  - test
  - test2
B3:
  A1: [test, test2]
  A3: [test, test2]
  A2:
  - test
  - test2
Of course you don't need to create the c object; that is just for illustration.
What you should use:
To make the whole exercise somewhat easier, you can just forget about the details and patch the following method into CommentedBase once:
from ruamel.yaml.comments import CommentedBase

def set_start_comment(self, comment, indent=0):
    """appends to any preceding comment lines on an object

    expects comment to be without `#` and possibly to have multiple lines
    """
    from ruamel.yaml.error import StreamMark
    from ruamel.yaml.tokens import CommentToken
    if self.ca.comment is None:
        pre_comments = []
        self.ca.comment = [None, pre_comments]
    else:
        pre_comments = self.ca.comment[1]
    if comment[-1] == '\n':
        comment = comment[:-1]  # strip final newline if there
    start_mark = StreamMark(None, None, None, indent, None, None)
    for com in comment.split('\n'):
        pre_comments.append(CommentToken('# ' + com + '\n', start_mark, None))

if not hasattr(CommentedBase, 'set_start_comment'):  # in case it is there
    CommentedBase.set_start_comment = set_start_comment
and then just do:
d.set_start_comment('Data for Class A')
Within your with block, you can write anything you want to the file. Since you just need a comment at the top, add a call to f.write() before you call ruamel:
with open('test.yml', "w") as f:
    f.write('# Data for Class A\n')
    ruamel.yaml.dump(
        d, f, Dumper=ruamel.yaml.RoundTripDumper,
        default_flow_style=False, width=50, indent=8)
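If the header needs more than one line, the same f.write() idea can be wrapped in a small helper. This is just a sketch using only the standard library: write_header is a hypothetical name, io.StringIO stands in for the open file, and the 'B1: {}' string is a placeholder for whatever the ruamel.yaml.dump(d, f, ...) call would write.

```python
import io


def write_header(stream, comment):
    """Write `comment` to `stream` as '# ' prefixed lines, one per input line."""
    for line in comment.rstrip('\n').split('\n'):
        stream.write('# ' + line + '\n')


buf = io.StringIO()  # stands in for the file object f
write_header(buf, 'Data for Class A\nDo not edit by hand')
buf.write('B1: {}\n')  # placeholder for the ruamel.yaml.dump(d, f, ...) call
print(buf.getvalue())
```

Because the comment is written to the stream directly, it never becomes part of the data structure; it only reappears on the data if the file is later loaded with a round-trip loader.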