How to write shell script for finding number of pages in PDF?

Without any extra package:

strings < file.pdf | sed -n 's|.*/Count -\{0,1\}\([0-9]\{1,\}\).*|\1|p' \
    | sort -rn | head -n 1

Using pdfinfo:

pdfinfo file.pdf | awk '/^Pages:/ {print $2}'

Using pdftk:

pdftk file.pdf dump_data | grep NumberOfPages | awk '{print $2}'

You can also recursively sum the total number of pages in all PDFs via pdfinfo as follows:

find . -xdev -type f -name "*.pdf" -exec pdfinfo "{}" ";" | \
    awk '/^Pages:/ {n += $2} END {print n}'

The imagemagick library provides a tool called identify which in conjunction with counting the lines of output gets you what you are after...imagemagick is a easy install on osx with brew.

Here is a functional bash script that captures it to a shell variable and dumps it back to the screen...

#/bin/bash
pdfFile=$1
echo "Processing $pdfFile"
numberOfPages=$(/usr/local/bin/identify "$pdfFile" 2>/dev/null | wc -l | tr -d ' ')
#Identify gets info for each page, dump stderr to dev null
#count the lines of output
#trim the whitespace from the wc -l outout
echo "The number of pages is: $numberOfPages"

And the output of running it...

$ ./countPages.sh aSampleFile.pdf 
Processing aSampleFile.pdf
The number of pages is: 2
$

How to write shell script for finding number of pages in PDF?

Tags:

Pdf

Shell

Related

Recent Posts