Python/Java script to download all .pdf files from a website
Yes, it's possible.
In Python it is simple: urllib will help you download files from the net.
For example:
import urllib.request
# The second argument is the destination file path, not a directory.
urllib.request.urlretrieve("http://example.com/helo.pdf", "c:/home/helo.pdf")
Now you need to make a script that will find links ending with .pdf.
Example HTML page: Here's a link
You need to download the HTML page and then use an HTML parser or a regular expression to pull out the links.
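The parser route could look roughly like the sketch below, which uses the standard library's html.parser. The names PdfLinkParser and find_pdf_links, as well as the sample page, are made up for illustration. The HTML is passed in as a string, so the download step is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of <a href="..."> links that end in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().endswith(".pdf"):
                # Resolve relative links against the URL of the page.
                self.pdf_links.append(urljoin(self.base_url, value))


def find_pdf_links(html, base_url):
    """Return the PDF links found in an HTML string (hypothetical helper)."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links


# Hypothetical sample page, used here instead of a real download.
sample_html = (
    '<p><a href="docs/helo.pdf">Here\'s a link</a> '
    '<a href="/about.html">About</a></p>'
)
print(find_pdf_links(sample_html, "http://example.com/index.html"))
# prints ['http://example.com/docs/helo.pdf']
```

Each URL in the returned list can then be passed to urllib.request.urlretrieve as shown earlier.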
Yes, it's possible. For downloading PDF files you don't even need to use Beautiful Soup or Scrapy.
Downloading from Python is very straightforward: build a list of all the PDF links and download them.
For a reference on how to build a list of links, see: http://www.pythonforbeginners.com/code/regular-expression-re-findall
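As a rough sketch of that approach with re.findall and nothing outside the standard library: the function names, the regular expression, and the default target directory are my own assumptions, not taken from the linked tutorial. A regex like this only handles quoted href attributes, so treat it as a starting point.

```python
import os
import re
import urllib.request
from urllib.parse import urljoin, urlparse

# Matches the value of href="..." or href='...' when it ends in .pdf.
PDF_HREF = re.compile(r'href=["\']([^"\']+\.pdf)["\']', re.IGNORECASE)


def list_pdf_links(html, base_url):
    """Return absolute PDF URLs found in the HTML text, without duplicates."""
    seen = []
    for href in PDF_HREF.findall(html):
        url = urljoin(base_url, href)  # resolve relative links
        if url not in seen:
            seen.append(url)
    return seen


def download_all(page_url, target_dir="."):
    """Fetch page_url and save every linked PDF into target_dir."""
    with urllib.request.urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    for url in list_pdf_links(html, page_url):
        # Use the last path component of the URL as the local file name.
        filename = os.path.basename(urlparse(url).path)
        urllib.request.urlretrieve(url, os.path.join(target_dir, filename))
```

Calling download_all("http://example.com/", "pdfs") would then save each linked PDF into an existing pdfs directory; the URL and directory name are placeholders.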
If you need to crawl through several linked pages, then one of the frameworks might help. If you are willing to build your own crawler, here is a great tutorial, which by the way is also a good intro to Python: https://www.udacity.com/course/viewer#!/c-cs101