Documenting and detailing a single script based on the comments inside
Comments are not suitable for documentation; typically they are used to highlight specific aspects that are relevant only to developers (not users). To achieve your goal, you can use __doc__ strings in various places:
- module-level
- class-level
- function-/method-level
If your _run method is really long and you feel the doc-string is too far from the actual code, this is a strong sign that the function is too long anyway. It should be split into multiple smaller functions to improve clarity, each of which can have its own doc-string. For example, the Google style guide suggests that if a function exceeds about 40 lines, you should consider whether it can be broken into smaller pieces.
Then you can use, for example, Sphinx to parse that documentation and convert it to PDF format.
Here's an example setup (using Google doc style):
# -*- coding: utf-8 -*-
"""
Small description and information.

@author: Author

Attributes:
    CONSTANT_1 (int): Some description.
    CONSTANT_2 (int): Some description.
"""

import numpy as np
import math
from scipy import signal

CONSTANT_1 = 5
CONSTANT_2 = 10


class Test():
    """Main class."""

    def __init__(self, run_id, parameters):
        """Some stuff not too important."""
        pass

    def _run(self, parameters):
        """Main program returning a result object.

        Uses `func1` to compute X and then `func2` to convert it to Y.

        Args:
            parameters (dict): Parameters for the computation

        Returns:
            result
        """
        X = self.func1(parameters)
        Y = self.func2(X)
        return Y

    def func1(self, p):
        """Information on this method."""
        pass

    def func2(self, x):
        """Information on this method."""
        pass
Then with Sphinx you can use the sphinx-quickstart command line utility to set up a sample project. To create documentation for the script you can use sphinx-apidoc. For that purpose, create a separate directory scripts, add an empty __init__.py file and place all your scripts inside that directory. After these steps the directory structure will look like the following (assuming you didn't separate build and source directories during sphinx-quickstart, which is the default):
$ tree
.
├── _build
├── conf.py
├── index.rst
├── make.bat
├── Makefile
├── scripts
│   ├── __init__.py
│   └── example.py
├── _static
└── _templates
For sphinx-apidoc to work, you need to enable the autodoc extension (sphinx.ext.autodoc).
Depending on the doc-style you use, you might also need to enable a corresponding extension. The above example uses Google doc style, which is handled by the Napoleon extension. These extensions can be enabled in conf.py:
extensions = ['sphinx.ext.autodoc', 'sphinx.ext.napoleon']
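If you want to be explicit about which doc-style is parsed and which members are documented, Napoleon and autodoc offer a few options in conf.py. The fragment below is only a sketch; turning off NumPy-style parsing and including private members by default are assumptions about what you want, not requirements.

```python
# conf.py (fragment) -- a sketch, not a complete configuration

# Napoleon: parse Google-style sections, skip NumPy-style ones (assumed preference).
napoleon_google_docstring = True
napoleon_numpy_docstring = False

# autodoc: document members, including private ones (names starting with "_").
autodoc_default_options = {
    'members': True,
    'private-members': True,
}
```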
Then you can run sphinx-apidoc as follows (-e puts every module/script on a separate page, -f overwrites existing doc files, -P documents private members, i.e. those starting with _, and -o sets the output directory):
$ sphinx-apidoc -efPo api scripts/
Creating file api/scripts.rst.
Creating file api/scripts.example.rst.
Creating file api/modules.rst.
This command creates the necessary instructions for the actual build command. For the build tool to be able to import and correctly document your scripts, you also need to set the import path accordingly. This can be done by uncommenting the following three lines near the top of conf.py:
import os
import sys
sys.path.insert(0, os.path.abspath('.'))
To make your scripts' docs appear in the documentation, you need to link them from within the main index.rst file:
Welcome to ExampleProject's documentation!
==========================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   api/modules
Finally, you can run the build command:
$ make latexpdf
Then the resulting documentation can be found at _build/latex/<your-project-name>.pdf.
This is a screenshot of the resulting documentation:
Note that there are various themes available to change the look of your documentation. Sphinx also supports plenty of configuration options to customize the build of your documentation.
Docstrings instead of comments
In order to make things easier for yourself, you probably want to make use of docstrings rather than comments:
A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.
This way, you can make use of the __doc__ attribute when parsing the scripts to generate documentation.
The triple-double-quoted string placed immediately after the function/module definition, which becomes the docstring, is just syntactic sugar. You can edit the __doc__ attribute programmatically as needed.
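As a small illustration of that last point, the sketch below reads a function's docstring and then extends it at runtime. The function area and the appended sentence are made up for the example.

```python
def area(width, height):
    """Return the area of a rectangle."""
    return width * height


# The docstring is an ordinary attribute and can be read ...
original = area.__doc__

# ... and extended or replaced at runtime.
area.__doc__ = original + "\n\nBoth arguments are expected in metres."
```

Tools that read __doc__ afterwards (help(), Sphinx autodoc, your own script) see the modified text.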
For instance, you can use decorators to make the creation of docstrings nicer in your specific case, e.g. to let you comment the steps inline while still adding those comments to the docstring (written in the browser, so it may contain errors):
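def with_steps(func):
    """Give func an add_step() helper that appends steps to its docstring."""
    seen = set()

    def add_step(n, doc):
        # Record each step only once, even if func is called repeatedly.
        if n not in seen:
            seen.add(n)
            func.__doc__ = (func.__doc__ or "") + "\nStep %d: %s" % (n, doc)

    func.add_step = add_step
    return func  # a decorator must return the (decorated) function


class Test:
    @with_steps
    def _run(self, parameters):
        """Initial description that is turned into the initial docstring"""
        self._run.add_step(1, "we start by doing this")
        # code to do it
        self._run.add_step(2, "then we do this")
        # code to do it
        # code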
Once _run has been called, this would create a docstring like this:
Initial description that is turned into the initial docstring
Step 1: we start by doing this
Step 2: then we do this
You get the idea.
Generating PDF from documented scripts
Sphinx
Personally, I'd just try the PDF builders available for Sphinx, via the bundled LaTeXBuilder or using rinoh if you don't want to depend on LaTeX.
However, you would have to use a docstring format that Sphinx understands, such as reStructuredText or Google Style Docstrings.
AST
An alternative is to use ast to extract the docstrings. Unlike the Sphinx autodoc extension, which imports the modules it documents, ast works on the source text without executing it. There are a few examples out there on how to do this, like this gist or this blog post.
This way you can write a script that parses the docstrings and outputs any format you want. For instance, you can output Markdown or reST and convert it to PDF using pandoc.
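A minimal sketch of such a script, using only the standard library, could look like this. The function names extract_docstrings and to_markdown and the SAMPLE source are made up for illustration; in practice you would read the source from your script file.

```python
import ast


def extract_docstrings(source):
    """Return a list of (name, docstring) pairs found in `source`."""
    tree = ast.parse(source)
    found = [("<module>", ast.get_docstring(tree))]
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            found.append((node.name, ast.get_docstring(node)))
    # Drop entries without a docstring.
    return [(name, doc) for name, doc in found if doc]


def to_markdown(docs):
    """Render the pairs as a simple Markdown document."""
    return "\n\n".join("## %s\n\n%s" % (name, doc) for name, doc in docs)


SAMPLE = '''
"""Module summary."""

class Test:
    """Main class."""

    def _run(self, parameters):
        """Main program."""
'''

print(to_markdown(extract_docstrings(SAMPLE)))
```

Because the source is only parsed, never imported, this also works for scripts whose dependencies are not installed on the machine that builds the documentation.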
You could write marked up text directly in the docstrings, which would give you a lot of flexibility. Let's say you wanted to write your documentation using markdown – just write markdown directly in your docstring.
def _run(self, parameters):
    """Example script
    ================

    This script does a, b, c

    1. Does something first
    2. Does something else next
    3. Returns something else

    Usage example:

        results = script(parameters)
        foo = [r.foo for r in results]
    """
This string can be extracted using ast and parsed/processed using whatever library you see fit.
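If you go the pandoc route mentioned above, the conversion step can be driven from Python as well. The sketch below assumes pandoc is installed and on the PATH (PDF output additionally needs a PDF engine such as LaTeX); the helper names and file names are hypothetical.

```python
import shutil
import subprocess


def pandoc_command(markdown_path, pdf_path):
    """Build the pandoc invocation that converts Markdown to PDF."""
    return ["pandoc", markdown_path, "-o", pdf_path]


def convert(markdown_path, pdf_path):
    """Run pandoc if it is installed; return True if it was run."""
    if shutil.which("pandoc") is None:
        return False  # pandoc is not available on this machine
    subprocess.run(pandoc_command(markdown_path, pdf_path), check=True)
    return True
```

For example, after writing the extracted docstrings to docs.md, convert("docs.md", "docs.pdf") would produce the PDF.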
Doxygen sounds suitable for this. It supports Python documentation strings and can also parse comments that start with ##, as described here:
https://www.doxygen.nl/manual/docblocks.html#pythonblocks
To get the output in PDF format you need to install a LaTeX processor, such as MiKTeX. When you run Doxygen it will create a latex folder that includes a "make" shell script. Run the shell script and the PDF file will be generated.
To include content that's generated elsewhere, e.g. the SHA1 hashes you mentioned, you could use the @include command within a comment. Note that Doxygen's @include commands will only work if you're using ## comments.
e.g.
## Documentation for a class.
#
# More details.
# @include PyClassSha1Hash.txt
class PyClass:
    pass
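The included file has to be generated before Doxygen runs. As a rough sketch, a small helper could write the SHA1 hash of a script to a text file; the function name write_sha1 and the "SHA1: ..." line format are assumptions, and the output file needs to be somewhere Doxygen looks for included files (as far as I know, that is controlled by the EXAMPLE_PATH setting).

```python
import hashlib
from pathlib import Path


def write_sha1(source_path, output_path):
    """Write the SHA-1 hex digest of `source_path` to `output_path`."""
    digest = hashlib.sha1(Path(source_path).read_bytes()).hexdigest()
    Path(output_path).write_text("SHA1: %s\n" % digest)
    return digest
```

For the example above you would call something like write_sha1("pyclass.py", "PyClassSha1Hash.txt"), where pyclass.py stands in for your actual script.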