Rip javadocs from a doc site to a local zip file

In this case ... make your own javadocs!

First you need the source. At the time of writing, the Java 8 JDK comes with a zip file called src.zip. For unexplained reasons, Oracle don't always include the source, so for some older versions (and who knows about the future) you have to get hold of the Java source in another way. It's also worth being aware that, in the past, Oracle have sometimes included the source with the Linux version of the JDK but not with the Windows one.

I just unzipped this file... the top directories are "com", "java", "javax", "launcher" and "org". Directory launcher contains no files to document.

You can generate the javadocs very simply from any or all of these: cd at the command prompt/terminal to the directory ...\src, then run

javadoc -d docs -Xmaxwarns 10 -Xmaxerrs 10 -Xdoclint:none -sourcepath . -subpackages java:javax:org:com

NB there is a "." (meaning the current directory) after -sourcepath

Simple as that. Generating your own javadocs also has the huge advantage that you know they are precisely the right javadocs for the JDK you are using on your system.
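If you would rather script the unzip-and-generate steps above, here is a minimal sketch in Python. It assumes a JDK with javadoc on your PATH. The function names and the "src" working directory are my own choices for illustration, nothing standard. The javadoc arguments are exactly the ones from the command above.

```python
import subprocess
import zipfile
from pathlib import Path


def build_javadoc_command(out_dir="docs", packages=("java", "javax", "org", "com")):
    """Return the javadoc arguments from the command above as a list."""
    return [
        "javadoc",
        "-d", out_dir,
        "-Xmaxwarns", "10",
        "-Xmaxerrs", "10",
        "-Xdoclint:none",
        "-sourcepath", ".",
        # -subpackages takes a colon-separated list of package roots
        "-subpackages", ":".join(packages),
    ]


def generate_javadocs(src_zip, work_dir="src"):
    """Unzip src_zip into work_dir and run javadoc there.

    Hypothetical helper: assumes javadoc is on PATH and that the zip
    has the package directories (com, java, ...) at its top level.
    """
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(src_zip) as archive:
        archive.extractall(work_dir)
    # cwd=work_dir stands in for "cd ...\src", so "-sourcepath ." points at it
    subprocess.run(build_javadoc_command(), cwd=work_dir, check=True)
    return work_dir / "docs"
```

Calling generate_javadocs("src.zip") should leave the generated pages in src/docs, assuming src.zip is in the current directory.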

The same applies to documenting any and all Java .jars (with source) which you use. However, for most jars, every version can be found with its documentation available for download at Maven Central http://search.maven.org...


  1. First, make sure they don't already offer a download in zip form or similar.

  2. Then, make sure you are actually allowed to do this (this may depend on where you live, and on any conditions mentioned on the web site from which you want to pull the docs).

  3. Then, have a look at the Wget tool. It is part of the GNU system, thus included in many Linux distributions, but also available for Windows and Mac, I suppose.

Something like this works for me:

wget --no-parent --recursive --level inf --page-requisites --wait=1 \
   https://epaul.github.io/jsch-documentation/simple.javadoc/

(Enter it as one line, or keep the trailing \ backslash, which escapes the line break.)

Look up what each option does in the manual before trying this.

If you want to do this repeatedly, look into the --mirror option. For downloading other websites, --convert-links might also be useful, but I found that is not needed for Javadocs, which usually have the correct absolute and relative links.

This downloads many identical copies of the index.html file, saved under names with an appended ?... (one for the FRAMES link on each page). You can have these files removed after downloading by adding the --reject 'index.html\?*' option, but they will still be downloaded first (and checked for recursive links). I have not yet found out how to avoid downloading them at all. (See this related question on Serverfault.)

Maybe setting the right recursion level would help here (I haven't tried).
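If the duplicate index.html?... files are already on disk, you can also clear them out afterwards. Here is a minimal sketch in Python. The function name is made up for illustration, and it assumes the files were saved with a literal "?" in their names. Some Wget builds (Windows ones, as far as I know) use a different character, hence the separator parameter.

```python
from pathlib import Path


def remove_index_duplicates(root, separator="?"):
    """Delete files named 'index.html?...' below root; return how many were removed.

    separator is the character the downloader put between 'index.html'
    and the query string (an assumption, adjust it for your system).
    """
    prefix = "index.html" + separator
    removed = 0
    # list() first, so that nothing is deleted while the tree is still being walked
    for path in list(Path(root).rglob("*")):
        if path.is_file() and path.name.startswith(prefix):
            path.unlink()
            removed += 1
    return removed
```

The plain index.html has nothing after its name, so it never matches the prefix and is left alone.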

After downloading, you might want to zip the resulting directory to take less disk space. Use the zip tool of your choice for this.
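If you have no zip tool at hand, the zipping step can be sketched with the Python standard library. The function name and the example directory are hypothetical. shutil.make_archive is the real standard-library call doing the work.

```python
import shutil
from pathlib import Path


def zip_directory(directory):
    """Create '<directory>.zip' next to the directory and return the archive path."""
    directory = Path(directory).resolve()
    # root_dir/base_dir are chosen so that entries inside the zip start
    # with the directory's own name, e.g. 'simple.javadoc/index.html'
    return shutil.make_archive(
        str(directory),
        "zip",
        root_dir=directory.parent,
        base_dir=directory.name,
    )
```

For example, zip_directory("epaul.github.io") would write epaul.github.io.zip next to the downloaded directory, assuming that is what Wget named the directory. The original directory is kept, so delete it yourself once you have checked the archive.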