How to import, export and edit bookmarks of a pdf file?
Quite a variety of tools can extract bookmarks from a pdf to a plain text file and write them back again, including:
- pdftk
- iText toolbox (older versions only, get itext-2.0.1.jar)
- pdfWritebookmarks tool that I use
- JPdfBookmarks which even has a GUI.
Also, I have a script that can convert between the formats of many of these tools: bmconverter.py.
Another very nice way is to add bookmarks to a pdf via pdflatex.
You can use pdftk for this. More info: How to Export and Import PDF Bookmarks.
Export PDF bookmarks on the command-line like this:
pdftk C:\Users\Sid\Desktop\doc.pdf dump_data output C:\Users\Sid\Desktop\doc_data.txt
Import PDF bookmarks from a data file like this:
pdftk C:\Users\Sid\Desktop\doc.pdf update_info C:\Users\Sid\Desktop\doc_data.txt output C:\Users\Sid\Desktop\updated.pdf
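If you want to inspect or edit the exported bookmarks programmatically, the data file can be read with a few lines of Python. This is only a minimal sketch: it assumes the dump contains records of the form BookmarkBegin / BookmarkTitle / BookmarkLevel / BookmarkPageNumber (the layout written by recent pdftk versions; older versions may differ), and the function name parse_bookmarks is made up for illustration.

```python
# Maps pdftk field names to the shorter keys used in the result.
FIELDS = {
    "BookmarkTitle": "title",
    "BookmarkLevel": "level",
    "BookmarkPageNumber": "page",
}


def parse_bookmarks(dump_text):
    """Return a list of {'title', 'level', 'page'} dicts from dump_data text.

    Assumes every bookmark record starts with a 'BookmarkBegin' line.
    """
    bookmarks = []
    current = None
    for line in dump_text.splitlines():
        # Split on the first colon only, so titles may contain colons.
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "BookmarkBegin":
            current = {}
            bookmarks.append(current)
        elif current is not None and key in FIELDS:
            name = FIELDS[key]
            value = value.strip()
            current[name] = value if name == "title" else int(value)
    return bookmarks
```

For example, parse_bookmarks(open("doc_data.txt", encoding="utf-8").read()) would give one dictionary per bookmark, which is easier to sort or renumber than the raw text.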
The pdftk bookmark format is a little tedious to write by hand, so I created my own script using bash, sed, pdftk and python3. Check it out at this repo: https://github.com/SiddharthPant/booky
So now I can create a text file (bkmrks.txt) like this, which takes just 5 minutes to write even for a 1000-page pdf:
{
Title1, 1
Title2, 2
{
Subtitle1, 3
Subtitle2, 4
{
SubSubtitle1, 5
...
}
}
}
Then I run my script:
./booky.sh pdf_file.pdf bkmrks.txt
This automatically creates a pdf (pdf_file_new.pdf) that has my bookmarks in it.
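To make the idea behind the brace format concrete, here is a rough Python sketch of how such a file could be turned into pdftk bookmark records. It is not the code from the booky repo, just an illustration under stated assumptions: each "{" goes one level deeper, each "}" goes back up, every other non-empty line is "Title, page", and the page number follows the last comma. The function name to_pdftk is hypothetical.

```python
def to_pdftk(outline_text):
    """Convert a brace-nested 'Title, page' outline into pdftk bookmark records."""
    level = 0
    records = []
    for raw in outline_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line == "{":
            level += 1  # entering a deeper bookmark level
        elif line == "}":
            level -= 1  # leaving the current level
        else:
            # Split on the LAST comma so titles may contain commas themselves.
            title, _, page = line.rpartition(",")
            records += [
                "BookmarkBegin",
                f"BookmarkTitle: {title.strip()}",
                f"BookmarkLevel: {level}",
                f"BookmarkPageNumber: {int(page)}",
            ]
    return "\n".join(records)
```

A line without a valid page number raises ValueError here, which is a deliberate choice for the sketch: a typo in a long outline is easier to catch that way than a silently misplaced bookmark.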
This works on *nix systems. If you are on a Windows machine instead, first install python3 and pdftk, then use the booky.py file in the repo to convert bkmrks.txt to a pdftk-compatible format:
python3 booky.py < bkmrks.txt > output.txt
Then use the export command to generate a dumped data file. Remove the previous bookmarks from that file, paste in the content of output.txt instead, and import that data back.
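The copy-paste step can also be scripted. The following is a minimal sketch, not part of booky: it assumes every bookmark line in the dumped data file starts with "Bookmark" (as in the BookmarkBegin / BookmarkTitle / BookmarkLevel / BookmarkPageNumber records) and puts the new records where the old ones were. The function name replace_bookmarks is made up for this example.

```python
def replace_bookmarks(dump_text, new_bookmarks_text):
    """Swap the bookmark records in pdftk dump_data text for new ones.

    If the dump has no bookmarks at all, the new records are appended.
    """
    new_lines = new_bookmarks_text.splitlines()
    result = []
    inserted = False
    for line in dump_text.splitlines():
        if line.startswith("Bookmark"):
            # Drop the old record; insert the new ones once, at the first hit.
            if not inserted:
                result.extend(new_lines)
                inserted = True
            continue
        result.append(line)
    if not inserted:
        result.extend(new_lines)
    return "\n".join(result) + "\n"
```

You would read the dumped data file and output.txt, pass both texts to replace_bookmarks, write the result to a new file, and feed that file to the import command.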
If you have a version of a document that has bookmarks and want to copy them over, a much simpler way is to use PDF-XChange Viewer (I used v2.5.211). Open the PDF that has the bookmarks (the source PDF), select all the bookmarks in the bookmarks pane, copy them using Ctrl+C, open the PDF that doesn't have the bookmarks (the target PDF), and paste them (Ctrl+V) in that PDF's bookmarks pane. PDF-XChange Viewer preserves bookmark properties as they were in the source PDF (including any bold / italic formatting on the bookmark text). If some sections of the target PDF sit lower or higher on the page because of revisions made to the document, you can click the bookmark needing correction, scroll to where on the page you'd like the bookmark to open, right-click the bookmark again and click "Set Destination". Repeat this last part as needed for any offending bookmark. Save the target PDF when finished.
This worked great for me, was quite intuitive, and I was done in a few minutes. In my particular scenario, a co-worker had produced a very long document using Word for Mac which didn't have bookmarks. Due to the length of the document, I wanted bookmarks corresponding to the document's outline. I could get Word for Windows to save the document as a PDF with bookmarks, but some formatting differences between Word for Windows and Word for Mac threw the page count off quite a bit (in particular, there were differences in white space around footers, and differences in the spacing between figures and their captions). I was able to play around with the headers & footers and figure sizes to get the pagination correct in Word for Windows, then saved to PDF with bookmarks. Unfortunately, there were still some differences in the formatting, so I wished to just apply the bookmarks to the original PDF, and that's when I figured out the solution above.