How to change the color of all highlights in a PDF file?
I wrote a Python script to perform the task. It searches for all objects in the PDF file (marked by obj and endobj) and checks for every object if it is an annotation (/Type/Annot) of the highlight type (/Subtype/Highlight). If that is the case, the color definition (/C[...]) will be replaced.
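The steps described above can be sketched roughly as follows. This is a minimal illustration, not necessarily identical to the actual script: the function name recolor_highlights and the regular expressions are assumptions made for this example, and it works on the raw bytes of the file without any real PDF parsing.

```python
import re

# Assumed patterns for this sketch; real PDF files may need looser ones.
# One indirect object: "12 0 obj ... endobj" (non-greedy, across line breaks).
OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj\b.*?\bendobj", re.DOTALL)
ANNOT_RE = re.compile(rb"/Type\s*/Annot\b")
HIGHLIGHT_RE = re.compile(rb"/Subtype\s*/Highlight\b")
# The color entry "/C[...]"; does not match other keys such as /CA or /IC.
COLOR_RE = re.compile(rb"/C\s*\[[^\]]*\]")


def recolor_highlights(data: bytes, new_color: bytes) -> bytes:
    """Replace the /C[...] entry of every highlight annotation in data.

    The replacement is padded with spaces to the length of the old
    definition so that the byte positions of all objects stay the same.
    """

    def replace_color(match):
        old = match.group(0)
        if len(new_color) > len(old):
            raise ValueError("new color definition is longer than the old one")
        # bytes.ljust pads with ASCII spaces by default.
        return new_color.ljust(len(old))

    def process_object(match):
        obj = match.group(0)
        if ANNOT_RE.search(obj) and HIGHLIGHT_RE.search(obj):
            # Only the first /C[...] in the object is touched.
            return COLOR_RE.sub(replace_color, obj, count=1)
        return obj

    return OBJ_RE.sub(process_object, data)
```

To use it, read the PDF in binary mode with open(path, "rb"), pass the bytes together with something like b"/C[0 1 0]" (green), and write the result to a new file opened with "wb" rather than overwriting the original.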
There are some limitations:
- No real parsing of the PDF is done. The regular expressions used may not be suitable for some PDF files.
- This might not work for encrypted or compressed PDF files. (Since PDF 1.5, annotations can be stored inside compressed object streams, where the script would not find them.)
- The original file will be overwritten. Don't blame me for lost data! (The script is easily edited to create new files.)
- I assume that PDF files locate objects by their byte position in the file (the cross-reference table stores byte offsets). Thus, I prevent the file size from changing. This means the new color definition must not take up more bytes than the old one.
- The color definition is not validated. You might break your PDF with an invalid expression.
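
The last two limitations can be softened by generating the color definition instead of typing it by hand. The following is only a sketch under assumptions: the helper name make_color_definition and its max_len parameter are made up for this example, and it only covers RGB colors (three components between 0 and 1). It writes the numbers as compactly as possible and pads the result with spaces to a given byte length.

```python
from typing import Optional


def make_color_definition(r: float, g: float, b: float,
                          max_len: Optional[int] = None) -> bytes:
    """Build a "/C[r g b]" entry from RGB components in the range 0..1.

    If max_len is given, the result is padded with spaces to exactly
    max_len bytes; a ValueError is raised if it does not fit.
    """
    parts = []
    for value in (r, g, b):
        if not 0.0 <= value <= 1.0:
            raise ValueError("color components must be between 0 and 1")
        # At most three decimals, without trailing zeros: 1.0 -> "1", 0.5 -> "0.5".
        parts.append(format(value, ".3f").rstrip("0").rstrip("."))

    definition = ("/C[" + " ".join(parts) + "]").encode("ascii")

    if max_len is not None:
        if len(definition) > max_len:
            raise ValueError("color definition does not fit into max_len bytes")
        definition = definition.ljust(max_len)  # pad with ASCII spaces
    return definition
```

For example, make_color_definition(0, 1, 0) gives b"/C[0 1 0]", which is about as short as an RGB definition can get, so it should fit in place of most existing definitions.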