How do I make pdfcrop output all pages of the same size?
I found that the --verbose flag makes pdfcrop print the bounding box it uses for each page. Since this was a "growing" animation, the last page is the largest.
So to get all the pages the same size, I ran pdfcrop with --verbose and took the last bounding box from its output:
%%HiResBoundingBox: 48.000022 299.872046 624.124950 420.127932
and then fed that to a second run of pdfcrop, specifying the bounding box:
pdfcrop --bbox "48.000022 299.872046 624.124950 420.127932" ~/animation.pdf
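Copying the box by hand is easy to get wrong, so here is a minimal sketch of pulling it out of the saved output. It assumes the --verbose output has been captured as text and contains %%HiResBoundingBox: lines in the form shown above; the function name last_hires_bbox and the first page's values in the sample are made up for illustration.

```python
import re

# Matches lines such as:
# %%HiResBoundingBox: 48.000022 299.872046 624.124950 420.127932
BBOX_RE = re.compile(r'%%HiResBoundingBox:\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)')

def last_hires_bbox(verbose_output):
    """Return the last HiResBoundingBox in the text as a --bbox string, or None."""
    matches = BBOX_RE.findall(verbose_output)
    if not matches:
        return None
    return ' '.join(matches[-1])

# Sample text; the first page's numbers are invented for this example.
sample = """\
%%BoundingBox: 48 350 300 400
%%HiResBoundingBox: 48.000022 350.100000 300.500000 400.200000
%%BoundingBox: 48 299 625 421
%%HiResBoundingBox: 48.000022 299.872046 624.124950 420.127932
"""
print('pdfcrop --bbox "%s" animation.pdf' % last_hires_bbox(sample))
```

This only helps when the last page really is the largest, as in my case.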
If the last page is not the largest, we need to compute the maximum width and height among all the pages, and then use these values to determine the right bounding boxes. Note that the four coordinates in a bounding box are:
- x-coordinate (distance from left edge of page) of lower-left corner,
- y-coordinate (distance from bottom edge of page) of lower-left corner,
- x-coordinate (distance from left edge of page) of upper-right corner,
- y-coordinate (distance from bottom edge of page) of upper-right corner.
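A simpler alternative, which is not what the script below does, is to crop every page to one box that encloses all the per-page boxes. Here is a minimal sketch, assuming the boxes have already been parsed into (xmin, ymin, xmax, ymax) tuples; the function name union_bbox and the sample values are made up for illustration.

```python
def union_bbox(boxes):
    """Smallest box (xmin, ymin, xmax, ymax) that contains every box in `boxes`."""
    xmins, ymins, xmaxs, ymaxs = zip(*boxes)
    return (min(xmins), min(ymins), max(xmaxs), max(ymaxs))

# Made-up per-page boxes: the widest page is not the tallest one.
boxes = [
    (48.0, 350.0, 300.0, 400.0),
    (48.0, 300.0, 200.0, 420.0),
    (60.0, 340.0, 624.0, 410.0),
]
print('pdfcrop --bbox "%s %s %s %s" animation.pdf' % union_bbox(boxes))
```

With a single enclosing box the content stays at the same position on every page, whereas the per-page approach centers each page's content in a box of the common size. Which one looks better probably depends on the animation.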
Computing the right bounding box for each page and using it could be done with an appropriate patch to the pdfcrop script (it's written in Perl), but as I'm not very comfortable with Perl, I did it in Python instead; here is the script in case it's useful to someone.
import re, sys

# Matches the page lines in pdfcrop's temporary TeX file:
# \page <number> [<xmin> <ymin> <xmax> <ymax>]<rest of line>
PAGE_RE = re.compile(r'\\page (\d*) \[([0-9.]*) ([0-9.]*) ([0-9.]*) ([0-9.]*)\](.*)', re.DOTALL)

lines = sys.stdin.readlines()
width = height = 0
# First pass: compute |width| and |height|.
for line in lines:
    m = PAGE_RE.match(line)
    if m:
        page, xmin, ymin, xmax, ymax, rest = m.groups()
        width = max(width, float(xmax) - float(xmin))
        height = max(height, float(ymax) - float(ymin))
# Second pass: change bounding boxes to have width |width| and height |height|.
for line in lines:
    m = PAGE_RE.match(line)
    if m:
        page, xmin, ymin, xmax, ymax, rest = m.groups()
        xmin = float(xmin)
        ymin = float(ymin)
        xmax = float(xmax)
        ymax = float(ymax)
        # We want |xmin| and |xmax| such that their difference is |width|
        addx = (width - (xmax - xmin)) / 2.0
        xmin -= addx
        xmax += addx
        # We want |ymin| and |ymax| such that their difference is |height|
        addy = (height - (ymax - ymin)) / 2.0
        ymin -= addy
        ymax += addy
        # |rest| still contains the line's trailing newline.
        sys.stdout.write(r'\page %s [%s %s %s %s]%s' % (page, xmin, ymin, xmax, ymax, rest))
    else:
        # Not a page line: copy it through unchanged.
        sys.stdout.write(line)
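The centering arithmetic in the second pass can be checked in isolation. This sketch pulls it into a small function (the name pad_to and the sample box are made up for illustration) and applies it to one box that is smaller than the target size:

```python
def pad_to(box, width, height):
    """Grow (xmin, ymin, xmax, ymax) symmetrically to the given width and height."""
    xmin, ymin, xmax, ymax = box
    addx = (width - (xmax - xmin)) / 2.0
    addy = (height - (ymax - ymin)) / 2.0
    return (xmin - addx, ymin - addy, xmax + addx, ymax + addy)

small = (100.0, 200.0, 300.0, 250.0)  # 200 wide, 50 tall (made-up values)
padded = pad_to(small, 400.0, 100.0)
print(padded)  # (0.0, 175.0, 400.0, 275.0)
```

The box grows by the same amount on opposite sides, so the original content ends up centered in the larger box.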
Usage:

1. Run the regular pdfcrop command with --debug, e.g.:

   pdfcrop --debug foo.pdf

   Because of --debug, it will not delete the tmp-pdfcrop-*.tex file it created. Also, note down the pdftex (or whatever) command it executed at the end, if you had passed in some special options to pdfcrop and it's therefore nontrivial.

2. Pass the tmp-pdfcrop-* file through the script above, e.g.:

   python find-common.py < tmp-pdfcrop-34423.tex > tmp-pdfcrop-common.tex

   This will write out tmp-pdfcrop-common.tex with different bounding boxes.

3. Run the pdftex (or whatever) command that pdfcrop called, with this file:

   pdftex -no-shell-escape -interaction=nonstopmode tmp-pdfcrop-common.tex

4. Check the resulting PDF file, and rename it to whatever you like:

   mv tmp-pdfcrop-common.pdf foo-crop.pdf
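If several --debug runs have left files behind, it is easy to pick the wrong tmp-pdfcrop-*.tex. Here is a small sketch for finding the most recent one. It assumes the file sits in the directory you pass in and that the newest match is the one you want; the function name newest_tmp_tex is made up for illustration.

```python
import glob
import os

def newest_tmp_tex(directory='.'):
    """Return the most recently modified tmp-pdfcrop-*.tex in `directory`, or None."""
    candidates = glob.glob(os.path.join(directory, 'tmp-pdfcrop-*.tex'))
    # Skip the output of the script above, which matches the same pattern.
    candidates = [c for c in candidates
                  if os.path.basename(c) != 'tmp-pdfcrop-common.tex']
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)

print(newest_tmp_tex())
```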