How to count lines of code in a Jupyter notebook
The answer from @Jessime Kirk is really good, but it does not seem to work when the .ipynb file contains Chinese characters. So I modified the code to open the file explicitly as UTF-8, as below.
#!/usr/bin/env python
from json import load
from sys import argv

def loc(nb):
    # Open as UTF-8 so notebooks containing non-ASCII text (e.g. Chinese) load correctly.
    with open(nb, encoding='utf-8') as data_file:
        cells = load(data_file)['cells']
        return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(r"This file can count the code lines number in .ipynb files.")
    print(r"usage:python countIpynbLine.py xxx.ipynb")
    print(r"example:python countIpynbLine.py .\test_folder\test.ipynb")
    print(r"it can also count multiple code.ipynb lines.")
    print(r"usage:python countIpynbLine.py code_1.ipynb code_2.ipynb")
    print(r"start to count line number")
    print(run(argv[1:]))
This will give you the total number of LOC in one or more notebooks that you pass to the script via the command-line:
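#!/usr/bin/env python
from json import load
from sys import argv

def loc(nb):
    # Use a context manager so the file handle is closed after loading.
    with open(nb) as data_file:
        cells = load(data_file)['cells']
    # Each entry in a code cell's 'source' list is one line of code.
    return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(run(argv[1:]))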
So you could do something like $ ./loc.py nb1.ipynb nb2.ipynb to get results.
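One caveat worth checking: as far as I know, the notebook JSON format allows a cell's source to be stored either as a list of lines or as a single multi-line string, and in the second case len(c['source']) would count characters rather than lines. Below is a minimal sketch that normalizes both forms before counting. The helper name source_lines is hypothetical and not part of the scripts above.

```python
import json
import sys


def source_lines(cell):
    """Return a cell's source as a list of lines, whether stored as a list or a string."""
    src = cell.get('source', [])
    if isinstance(src, str):
        # keepends=True mirrors the list form, where each line keeps its trailing '\n'.
        return src.splitlines(keepends=True)
    return list(src)


def loc(nb_path):
    """Count the lines in all code cells of one notebook."""
    with open(nb_path, encoding='utf-8') as f:
        cells = json.load(f).get('cells', [])
    return sum(len(source_lines(c)) for c in cells if c.get('cell_type') == 'code')


if __name__ == '__main__':
    # Usage: python loc.py nb1.ipynb nb2.ipynb
    print(sum(loc(p) for p in sys.argv[1:]))
```

For notebooks that already store source as a list, this should give the same totals as the script above.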
The same can be done from the shell if you have the jq utility installed:
jq '.cells[] | select(.cell_type == "code") .source[]' nb1.ipynb nb2.ipynb | wc -l
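You can also use grep to filter the lines further, e.g. to exclude blank lines (the -v flag inverts the match, so lines consisting only of "\n" are dropped):
jq '.cells[] | select(.cell_type == "code") .source[]' nb1.ipynb nb2.ipynb | grep -v -e ^\"\\\\n\"$ | wc -l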
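If jq is not available, the blank-line filter can be approximated in Python. This is only a sketch: the function name count_nonblank_code_lines is made up for illustration, and unlike the grep pattern it also skips lines that contain only spaces or tabs, not just completely empty ones.

```python
import json


def count_nonblank_code_lines(nb_path):
    """Count code-cell lines that contain something other than whitespace."""
    with open(nb_path, encoding='utf-8') as f:
        cells = json.load(f)['cells']
    total = 0
    for cell in cells:
        if cell['cell_type'] != 'code':
            continue
        # Assumes 'source' is a list of lines, as in the scripts above.
        total += sum(1 for line in cell['source'] if line.strip())
    return total
```

Calling count_nonblank_code_lines('nb1.ipynb') for each notebook and summing the results gives a total comparable to the jq and grep pipeline.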