Fast pdf to jpg conversion on Linux wanted
Solution 1:
Using Ghostscript directly (instead of ImageMagick's convert command, which calls Ghostscript indirectly) is indeed faster, and it gives you more control over the conversion parameters. Try:
gs \
-sDEVICE=jpeg \
-o bar_%03d.jpg \
-dJPEGQ=95 \
-r600x600 \
-g4960x7016 \
foo.pdf
where
-o: determines output path+filename (and saves usage of -dBATCH -dNOPAUSE)
-dJPEGQ: sets JPEG quality to 95%
-r: sets resolution to 600dpi
-g: sets image size to 4960x7016px
-sDEVICE: sets output as JPEG
This command will probably still be too slow for you and create files bigger than expected. For smaller file sizes and faster execution, try this (which probably comes close to the output quality of your convert command line):
gs \
-sDEVICE=jpeg \
-o bar_%03d_200dpi_q80.jpg \
-dJPEGQ=80 \
-r200x200 \
-g1653x2339 \
foo.pdf
or even
gs \
-sDEVICE=jpeg \
-o bar_%03d_default_a4.jpg \
-sPAPERSIZE=a4 \
foo.pdf
(which gives 72dpi resolution, often good enough for most screens and for most web applications).
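The -g values used above follow from the page size and the -r resolution: pixels = inches x dpi. As a rough sketch (the helper name a4_pixels is made up for illustration, and it assumes an A4 page of 210mm x 297mm), the arithmetic can be scripted like this. Because it rounds to the nearest pixel, the result may differ by one pixel from the hand-rounded figures in the commands above.

```shell
#!/bin/sh
# a4_pixels DPI
# Hypothetical helper: prints WIDTHxHEIGHT in pixels for an A4 page
# (210mm x 297mm) at the given resolution, rounded to the nearest pixel.
a4_pixels() {
    awk -v dpi="$1" 'BEGIN {
        w = 210 / 25.4 * dpi    # page width in inches, times dpi
        h = 297 / 25.4 * dpi    # page height in inches, times dpi
        printf "%dx%d\n", w + 0.5, h + 0.5
    }'
}

a4_pixels 200   # prints 1654x2339
a4_pixels 600   # prints 4961x7016

# Possible usage with the command from above:
#   gs -sDEVICE=jpeg -o bar_%03d.jpg -dJPEGQ=80 -r200 -g"$(a4_pixels 200)" foo.pdf
```

If -g is left out, Ghostscript takes the page size from the PDF itself and scales it by -r, so the helper only matters when you want to force a fixed canvas.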
Solution 2:
BTW, one of the reasons ImageMagick is so much slower is that it calls Ghostscript twice. It does not convert PDF => PNG in one go, but uses 2 different steps:
- it first uses Ghostscript for PDF => PostScript conversion;
- it then uses Ghostscript for PostScript => PNG conversion.
You can learn about the detailed settings of ImageMagick's "delegates" (the external programs ImageMagick uses, such as Ghostscript) by typing
convert -list delegate
(On my system that's a list of 32 different commands.) Now to see which commands are used to convert to PNG, use this:
convert -list delegate | grep -i png
Ok, this was for Linux. If you are on Windows, try this:
convert -list delegate | findstr /i png
You'll discover that IM produces PNG only from PS or EPS input. So how does IM get (E)PS from your PDF? Easy:
convert -list delegate | findstr /i PDF
convert -list delegate | grep -i PDF
Ah! It uses Ghostscript to make a PDF => PS conversion, then uses Ghostscript again to make a PS => PNG conversion. Works, but isn't the most efficient way if you know that Ghostscript can do PDF => PNG in one go. And faster. And in much better quality.
About IM's handling of PDF conversion to images via the Ghostscript delegate you should know two things first and foremost:
- By default, if you don't give an extra parameter, Ghostscript will output images at a 72dpi resolution. That's why people here sometimes suggest adding -density 600 as a convert parameter, which tells Ghostscript to use a 600 dpi resolution for its image output.
- IM's detour of calling Ghostscript twice, first for PDF => PS and then for PS => PNG, is a real blunder, because you never gain and hardly ever keep quality in the first step, but very often lose some. Reasons:
  - PDF can handle transparencies, which PostScript cannot.
  - PDF can embed TrueType fonts, which PostScript cannot.
  - etc.
(Conversion in the opposite direction, PS => PDF, is therefore not that critical.)
That's why I suggest you convert your PDFs to PNG (or JPEG) in one go, using Ghostscript directly. And use the most recent version of Ghostscript (8.71 at the time of writing; 9.00 is soon to be released)...
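For the PNG case, a one-pass conversion might look like the sketch below. The function name pdf_to_png, the output naming scheme and the 150 dpi default are assumptions made for illustration; png16m is Ghostscript's 24-bit RGB PNG device, and -o works the same way as in Solution 1.

```shell
#!/bin/sh
# pdf_to_png INPUT.pdf [DPI]
# Hypothetical wrapper: renders every page of INPUT.pdf to INPUT_001.png,
# INPUT_002.png, ... in a single Ghostscript pass, without an intermediate
# PostScript file.
pdf_to_png() {
    src=$1
    dpi=${2:-150}    # 150 dpi is an arbitrary default for this sketch
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    command -v gs >/dev/null 2>&1 || { echo "gs not found" >&2; return 1; }
    # -q suppresses the startup banner; -o implies -dBATCH -dNOPAUSE
    gs -q -sDEVICE=png16m -r"$dpi" -o "${src%.pdf}_%03d.png" "$src"
}

# Example: pdf_to_png foo.pdf 200
```

To check which Ghostscript you are running before comparing speed or quality, gs --version prints the version number.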
Solution 3:
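The program pdftoppm from the poppler package is also able to create JPEGs, and for me it is about twice as fast as using gs as described above:
pdftoppm -jpeg -r 300 foo.pdf foo
(The last argument is an output prefix, not a filename: pdftoppm appends the page number and the .jpg extension itself.)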
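If you have a whole directory of PDFs, the same call can be wrapped in a loop. This is only a sketch: the function name pdfs_to_jpeg and the 150 dpi default are made up here, and it relies on pdftoppm treating its last argument as a prefix to which it adds the page number and .jpg.

```shell
#!/bin/sh
# pdfs_to_jpeg [DIR] [DPI]
# Hypothetical helper: converts every *.pdf in DIR (default: current
# directory) to one JPEG per page, written next to the source file.
pdfs_to_jpeg() {
    dir=${1:-.}
    dpi=${2:-150}    # arbitrary default for this sketch
    rc=0
    for pdf in "$dir"/*.pdf; do
        [ -f "$pdf" ] || continue    # the glob matched nothing
        # Output prefix is the PDF path without its .pdf extension
        pdftoppm -jpeg -r "$dpi" "$pdf" "${pdf%.pdf}" || rc=1
    done
    return "$rc"
}

# Example: pdfs_to_jpeg ./scans 300
```

The function returns non-zero if any single conversion failed, so it can be used in a larger script without silently skipping broken files.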
Solution 4:
In my experience, MuPDF is a lot faster than Ghostscript. It is a much newer project without much of the cruft in gs. See whether it fits your use case:
mudraw -w 1024 -h 768 -r 200 -c rgb -o bar%d.png foo.pdf
If you have an older Linux distribution and installed mupdf-tools from the repository, mudraw might still be called pdfdraw.
You then have to convert the PNGs to JPEG, for example with ImageMagick. It will still be faster than Ghostscript.
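The PNG to JPEG step could be done roughly like this. The function name pngs_to_jpeg is hypothetical; convert and its -quality option are ImageMagick's, and the JPEG is written next to each PNG with the extension swapped.

```shell
#!/bin/sh
# pngs_to_jpeg QUALITY FILE.png...
# Hypothetical helper: converts each PNG to a JPEG of the given quality
# (0-100) using ImageMagick, e.g. bar1.png -> bar1.jpg.
pngs_to_jpeg() {
    quality=$1
    shift
    rc=0
    for png in "$@"; do
        [ -f "$png" ] || { echo "skipping $png: not a file" >&2; rc=1; continue; }
        convert "$png" -quality "$quality" "${png%.png}.jpg" || rc=1
    done
    return "$rc"
}

# Example, after the mudraw call above: pngs_to_jpeg 80 bar*.png
```

Since no PDF rendering is involved at this point, Ghostscript is not called at all; ImageMagick only re-encodes the bitmaps.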