How to convert all pdf files to text (within a folder) with one command?
Either of the following will convert all PDF files in the current directory:
for file in *.pdf; do pdftotext "$file" "${file%.pdf}.txt"; done
ls *.pdf | xargs -n1 pdftotext
xargs is often a quick solution for running the same command multiple times with just a small change each time. The -n1 option makes sure that only one PDF file is passed to pdftotext at a time.
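To see what -n1 changes, here is a small dry run. It is only a sketch: echo stands in for actually running pdftotext, and a.pdf and b.pdf are made-up names, so no files are read or written.

```shell
# Dry run: echo prints the command line that xargs would build.
# a.pdf and b.pdf are placeholder names; nothing is converted.

# Without -n1, both names land on a single command line.
# pdftotext would take the second name as its output file.
printf '%s\n' a.pdf b.pdf | xargs echo pdftotext
# prints: pdftotext a.pdf b.pdf

# With -n1, xargs runs the command once per name.
printf '%s\n' a.pdf b.pdf | xargs -n1 echo pdftotext
# prints: pdftotext a.pdf
#         pdftotext b.pdf
```

Once the printed commands look right, dropping the echo should run the real conversion.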
Edit: If you're worried about spaces or other special characters in filenames, you can use this alternative (note that find also descends into subdirectories):
find . -name '*.pdf' -print0 | xargs -0 -n1 pdftotext
Write a bash script:
for f in *.pdf; do
    pdftotext "$f"
done
or type it in a one-line command as follows:
for f in *.pdf; do pdftotext "$f"; done
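If you run this often, the loop can be wrapped in a small function. The following is a sketch rather than a tested tool: convert_pdfs and the PDFTOTEXT variable are names invented here for illustration, and the directory argument is an assumed convenience. It skips the loop body when no PDF files exist and names the output file explicitly.

```shell
# convert_pdfs [DIR]: hypothetical helper, converts every *.pdf in DIR
# (default: current directory) to a .txt file next to it.
# PDFTOTEXT is an assumed override so another converter can be swapped in.
convert_pdfs() {
    dir=${1:-.}
    for f in "$dir"/*.pdf; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        # Strip the .pdf suffix so report.pdf becomes report.txt.
        "${PDFTOTEXT:-pdftotext}" "$f" "${f%.pdf}.txt"
    done
}
```

Calling convert_pdfs with no argument processes the current directory. The quoting around "$f" keeps filenames with spaces intact.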
I hope this helps. I do not have a large batch of PDF files to test this on, but I use the same strategy to convert my .flac files to .ogg files.