Convert all Linux man pages to text / html or markdown
I recommend trying Pandoc:
$ pandoc --from man --to html < input.1 > output.html
It produces HTML that is both readable and editable, the latter being important for my use case.
It can also produce a lot of other formats such as Markdown, which is nice when you're not sure which format you want to commit to yet.
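If you have not settled on a format yet, a small wrapper can emit several at once. This is only a sketch: the function name man_to_formats and the output directory argument are made up for illustration, and it assumes a Pandoc recent enough to have the man reader plus gzip on your PATH.

```shell
# Convert one man page source file (plain or gzipped) to HTML, Markdown and plain text.
# man_to_formats is a hypothetical helper name, not part of Pandoc.
man_to_formats() {
    local src="$1" out_dir="${2:-.}"
    local base fmt ext
    base=$(basename "$src" .gz)          # flex.1.gz -> flex.1
    mkdir -p "$out_dir" || return 1
    for fmt in html markdown plain; do
        case $fmt in
            html)     ext=html ;;
            markdown) ext=md ;;
            plain)    ext=txt ;;
        esac
        # gzip -dcf decompresses .gz input and copies uncompressed input through unchanged.
        gzip -dcf "$src" |
            pandoc --from man --to "$fmt" --output "$out_dir/$base.$ext" || return 1
    done
}
```

For example, man_to_formats /usr/share/man/man1/flex.1.gz ./out would write flex.1.html, flex.1.md and flex.1.txt into ./out (the path is only illustrative; use whatever page you have installed).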
There is a comment on the question that says Pandoc cannot convert from man, but that seems to be out of date. The current version (2.13) does a decent job converting man to html for my example.
Furthermore, while the accepted answer suggests using groff -mandoc -Thtml, that did not do as good a job for me as Pandoc. Specifically, I want to convert the old Flex-2.5.5 man page to html. groff (version 1.22.4) unfortunately mangled all of the code examples (no indentation, no fixed-width font), making them difficult to read, while Pandoc brought them over as pre sections. Additionally, the groff output is full of explicit inline styles, while the Pandoc output uses no CSS at all, making it a better starting point for editing.
(There is an existing answer that also mentions Pandoc, and I considered editing my information into it, but I wanted to say more about my experience using it.)
The command man -k '' lists all available man page names, which may be better than running find and zcat over the original man page data files. In addition, man has an option -T, --troff-device[=DEVICE] that can generate HTML for a given man page section and name. So the following bash script converts all man pages available on your Linux system into HTML files:
man -k '' | while read -r sLine; do
    # Each line looks like: name (section) - description
    sName=$(echo "$sLine" | cut -d' ' -f1)
    sSection=$(echo "$sLine" | cut -d')' -f1 | cut -d'(' -f2)
    echo "converting ${sName}(${sSection}) to ${sName}.${sSection}.html ..."
    man -Thtml "${sSection}" "${sName}" > "${sName}.${sSection}.html"
done
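The name and section can also be pulled out of each man -k line with shell parameter expansion instead of echo and cut, which avoids spawning extra processes per page. The sketch below assumes the man-db layout "name (section) - description"; the function name parse_apropos_line is hypothetical, and other apropos implementations may format lines differently.

```shell
# Print "name section" for one line of `man -k ''` output.
# Assumes the man-db layout: name (section) - description
parse_apropos_line() {
    local line="$1" head name section
    # Reject lines that have no "(...)" part at all.
    case $line in
        *"("*")"*) ;;
        *) return 1 ;;
    esac
    head=${line%%)*}       # everything before the first ")", e.g. "printf (1"
    name=${head%%(*}       # everything before the first "(", e.g. "printf "
    name=${name%% *}       # drop the space that separates name and section
    section=${head#*(}     # everything after the first "(", e.g. "1"
    printf '%s %s\n' "$name" "$section"
}
```

Inside the loop it could be used as read -r sName sSection <<EOF style input, or more simply by capturing its output and splitting on the space.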
On an intranet without Internet access, where online man page services are unavailable, putting these files on a static HTTP server such as Nginx with autoindex on is a good option: you can browse the listing and use Ctrl+F to find a page.
Yes. To convert one of them, say the man page for man itself:
zcat /usr/share/man/man1/man.1.gz | groff -mandoc -Thtml
If you want all of the pages installed on your PC, just iterate over them. For different output (text, for example), use a different 'device' (the -T argument).
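As a rough illustration of switching devices, the helper below takes the device name as a parameter. The name render_man is invented for this sketch. For the terminal devices (ascii, latin1, utf8) it additionally passes -c to the grotty postprocessor and pipes through col -bx so the result is plain text without escape sequences or overstrikes; that cleanup step is my assumption about what "text" output should look like, not something groff requires.

```shell
# Render one man page source file with a chosen groff device (default: html).
# render_man is a hypothetical helper name.
render_man() {
    local src="$1" device="${2:-html}"
    case $device in
        ascii|latin1|utf8)
            # -P-c asks grotty for overstrike output instead of SGR escapes;
            # col -bx then strips the overstrikes and expands tabs.
            gzip -dcf "$src" | groff -mandoc "-T$device" -P-c | col -bx
            ;;
        *)
            gzip -dcf "$src" | groff -mandoc "-T$device"
            ;;
    esac
}
```

For example, render_man /usr/share/man/man1/man.1.gz utf8 > man.1.txt would give a text version, and leaving out the second argument gives HTML as in the command above.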
Just in case... if the 'iteration' was the real problem, you can use:
OUT_DIR=...
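# Run from the man root (for example /usr/share/man); handles paths containing spaces.
find . -name '*.gz' | while IFS= read -r i; do
    dname=$(dirname "$i")
    mkdir -p "$OUT_DIR/$dname"
    zcat "$i" | groff -mandoc -Thtml > "$OUT_DIR/$i.html"
done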