For Python2 to Python3 code conversion, Which version of Python & Django best suited?
My suggestion is to first upgrade to Django==1.11.26
, which is the most recent version of Django that is supporting both Python 2 and Python 3. Stay on your current version of Python 2.7 for now.
Read carefully the release notes for 1.10.x and 1.11.x, checking for deprecations and fixing anything that stopped working from your 1.9.x code. Things WILL break. Django moves fast. For a large Django project, there may be many code changes required, and if you're using a lot of 3rd-party plugins or libraries you may have to juggle their versions around. Some of your 3rd-party dependencies will probably have been abandoned entirely, so you have to find replacements or remove the features.
To find the release notes for each version upgrade, just google "What's new in Django ". The hits will meticulously document all the deprecations and changes:
- https://docs.djangoproject.com/en/2.2/releases/1.10/
- https://docs.djangoproject.com/en/2.2/releases/1.11/
Once the webapp appears to be working fine on Django 1.11, with all tests passing (you do have a test suite, right?) then you can do the Python 3 conversion, whilst keeping the Django version the same. Django 1.11 supports up to Python 3.7, so that would be a good version to target. Expect unicode all over the place, since the implicit conversions between bytes and text is gone now and many Python 2 webapps relied upon that.
Once the project appears to be working fine on Django 1.11 and Python 3.7, then you can think about upgrading to Django 3.0, following the same process as before - reading the release notes, making the necessary changes, running the test suite, and checking out the webapp in a dev server manually.
I thought I'd add a bit to the strategy advocated by Wim's answer - get the appropriate version of Django working on both 2.7 and 3.x first - and outline some tactics that worked for me.
Python 2.7 is your escape pod, until you pull the trigger on 3.x
- your tests should run on both
- don't use any 3.x specific features, like f-strings
- first Python 3.x, then only later Django 2.x which doesn't run on 2.7
- start early, don't over analyze, but avoid the big bang approach
- file by file at first.
- start with the lowest level code, like utility libraries, that you have test suites for.
- if possible, try to gradually merge your changes to the 2.7 production branches and keep your 3.x porting code up to date with prod changes.
Which minor version of Django to start with?
My criteria here is that Django migrations can be fairly involved (and actually require more thinking than 2=>3 work). So I would move to the latest and greatest 1.11 that way you're already providing some value to your 2.7 users. There's probably a good number of pre-2.x compatibility shims on 1.11 and you'll be getting its 2.x deprecation warnings.
Which minor version of Python 3.x to start with?
Best to consider all angles, such as the availability of your 3rd party libs, support from your CI/devops suite and availability on your chosen server OS images. You could always install 3.8 and try a pip install of your requirements.txt by itself, for example.
Leverage git (or whatever scm you use) and virtualenv.
- separate
requirement.txt
files, but... - if you have a file-based, git repo, you can point each venv at the same codeline with a
pip install -e <your directory>
. that means that, in 2 different terminals you can run 2.7 and 3.x against the same unittest(s). - you could even run 2.7 and 3.x Django servers side-by-side on different ports and point say Firefox and Chrome at them.
- commit often (on the porting branch at least) and learn about git bisect.
make use of 2to3
Yes, it will break 2.7 code and Django if you let it. So...
run it in preview mode or against a single file. see what it breaks but also see what it did right.
throttle it to only certain conversions that don't break 2.7 or Django.
print x
=>print (x)
andexcept(Exception) as e
are 2 no-brainers.
This is what my throttled command looked like:
2to3 $tgt -w -f except -f raise -f next -f funcattrs -f print
- run it file-by-file until you are really confident.
use sed or awk rather than your editor for bulk conversions.
The advantage is that, as you become more aware of your apps' specifics concerns, you can build a suite of changes that can be run on either 1 file or many files and do most of the work without breaking 2.7 or Django. Apply this after your suitably-throttled 2to3 pass. That leaves you with residual cleanups in your editor and getting your tests to pass.
(optional) start running black on 2.7 code.
black which is a code formatter, uses Python 3 ASTs to run its analysis. It doesn't try to run the code, but it will flag syntax errors that prevent it from getting to the AST stage. You will have to work some pip install global magic to get there though and you have to buy into black's usefulness.
Other people have done it - learn from them.
Listening to #155 Practical steps for moving to Python 3 should give you some ideas of the work. Look at the show links for it. They love to talk up the Instagram(?) move which involved a gradual adjustment of running 2.7 code to 3.x syntax on a common codebase, and on the same git branch, until pull-the-trigger day.
See also The Conservative Python 3 Porting Guide
and Instagram Makes a Smooth Move to Python 3 - The New Stack
Conclusion
Your time to Django 1.11 EOL (April 2020) is rather short, so if you have 2+ dev resources to throw at it, I'd consider doing the following in parallel:
DEV#1: start out on a Django 1.11 bump (the theory being that Django 1.11 is probably best positioned as a jump off point to Django 2.x), using 2.7.
DEV#2: get started on Python 3.6/3.7 of your non-Django utility code. Since the code is 2.7 compatible at this point, merge it into #1 as you go.
See how both tasks proceed, assess what the Django related project risk is and what the Python 3 pain looks like. You're already missing the Python 2.7 EOL, but an obsolete web framework is probably more dangerous than legacy Python 2.7, at least for a few months. So I wouldn't wait too long to start migrating off Django 1.9 and your work doing so won't be wasted. As you see the progress, you'll start seeing the project risks better.
Your initial 2to3 progress will be slow, but the tooling and guidance is good enough that you'll quickly pick up speed so don't overthink it before starting to gather experience. The Django side depends on your exposure to breaking changes in the framework which is why I think it's best to start early.
P.S. (controversial/personal opinion) I didn't use six or other canned 2-to-3 bridge libraries much.
It's not because I don't trust it - it's brilliant for 3rd party libs - but rather that I didn't want to add a complex permanent dependency (and I was too lazy to read its doc). I'd been writing 2.7 code in 3.x compatible syntax for a long time so I didn't really feel the need to use them. Your mileage may vary and don't set out on this path if it seems like a lot of work.
Instead, I created a py223.py (57 LOC incl. comments) with this type of content, most of which is concerned with workarounds for deprecations and name changes in the standard library.
try:
basestring_ = basestring
except (NameError,) as e:
basestring_ = str
try:
cmp_ = cmp
except (NameError,) as e:
# from http://portingguide.readthedocs.io/en/latest/comparisons.html
def cmp_(x, y):
"""
Replacement for built-in function cmp that was removed in Python 3
"""
return (x > y) - (x < y)
Then import from that py223 to work around those specific concerns. Later on I will just ditch the import and move those weird isinstance(x, basestr_)
to isinstance(x, str)
but I know in advance there is little to worry about.
I would upgrade to py3 first. You'll need to look at setup.py
in the Django repo on the stable/1.9.x branch (https://github.com/django/django/blob/stable/1.9.x/setup.py) to figure out that the py3 versions supported are 3.4 (dead) and 3.5.
Once you're on py3.5 and Django 1.9 you can upgrade one at a time until you get to the version you want to end at. E.g. Django 1.11 supports py3.5 and py3.7, so
py27/dj19 -> py35/dj19 -> py35/dj1.11 -> py37/dj1.11 ... -> py37/dj2.2
dj2.2 is the first version supporting py3.8, but I would probably stop at py37/dj2.2 if you're working in a normally conservative environment.
If you have other packages you'll need to find version combinations that will work together on each step. Having a plan is key, and upgrading only one component at a time will usually end up saving you time.
The future library (https://python-future.org/) will help you with many icky situations while you need code to run on both py27 and 3.x. six is great too. I would avoid rolling your own compatibility layer (why reinvent the wheel?)
If at all possible, try to get your unit test coverage up to 75-85% before starting, and definitely set up automatic testing on both "from" and "to" versions for each upgrade step. Make sure you read and fix all warnings from Django before upgrading to the next version -- Django cares very little about backward compatibility, so I would normally suggest hitting every minor version on the upgrade path (or at least make sure you read the "backwards incompatibilities" and deprecation lists for each minor version).
Good luck (we're upgrading a 300+Kloc code base from py27/dj1.7 right now, so I feel your pain ;-)