How do I compare RPM versions in Python?
Here's a working program based on rpmdev-vercmp from the rpmdevtools package. You shouldn't need anything special installed except yum (which provides the rpmUtils.miscutils Python module) for it to work.
The advantage over the other answers is you don't need to parse anything out, just feed it full RPM name-version strings like:
$ ./rpmcmp.py bash-3.2-32.el5_9.1 bash-3.2-33.el5.1
0:bash-3.2-33.el5.1 is newer
$ echo $?
12
Exit status 11 means the first one is newer, 12 means the second one is newer.
#!/usr/bin/python
# Python 2 syntax (print statements, tuple parameters)
import rpm
import sys
from rpmUtils.miscutils import stringToVersion

if len(sys.argv) != 3:
    print "Usage: %s <rpm1> <rpm2>" % sys.argv[0]
    sys.exit(1)

def vercmp((e1, v1, r1), (e2, v2, r2)):
    # Positive if the first label is newer, 0 if equal, negative if older
    return rpm.labelCompare((e1, v1, r1), (e2, v2, r2))

# stringToVersion splits each argument into (epoch, version, release)
(e1, v1, r1) = stringToVersion(sys.argv[1])
(e2, v2, r2) = stringToVersion(sys.argv[2])

rc = vercmp((e1, v1, r1), (e2, v2, r2))
if rc > 0:
    print "%s:%s-%s is newer" % (e1, v1, r1)
    sys.exit(11)
elif rc == 0:
    print "These are equal"
    sys.exit(0)
elif rc < 0:
    print "%s:%s-%s is newer" % (e2, v2, r2)
    sys.exit(12)
Based on Owen S's excellent answer, I put together a snippet that uses the system RPM bindings if available, but falls back to a regex-based emulation otherwise:
try:
    from rpm import labelCompare as _compare_rpm_labels
except ImportError:
    # Emulate RPM field comparisons
    #
    # * Search each string for alphabetic fields [a-zA-Z]+ and
    #   numeric fields [0-9]+ separated by junk [^a-zA-Z0-9]*.
    # * Successive fields in each string are compared to each other.
    # * Alphabetic sections are compared lexicographically, and the
    #   numeric sections are compared numerically.
    # * In the case of a mismatch where one field is numeric and one is
    #   alphabetic, the numeric field is always considered greater (newer).
    # * In the case where one string runs out of fields, the other is always
    #   considered greater (newer).

    import re
    import warnings
    warnings.warn("Failed to import 'rpm', emulating RPM label comparisons")

    try:
        from itertools import zip_longest
    except ImportError:
        # Python 2 name for the same function
        from itertools import izip_longest as zip_longest

    _subfield_pattern = re.compile(
        r'(?P<junk>[^a-zA-Z0-9]*)((?P<text>[a-zA-Z]+)|(?P<num>[0-9]+))'
    )

    def _iter_rpm_subfields(field):
        """Yield subfields as 2-tuples that sort in the desired order

        Text subfields are yielded as (0, text_value)
        Numeric subfields are yielded as (1, int_value)
        """
        for subfield in _subfield_pattern.finditer(field):
            text = subfield.group('text')
            if text is not None:
                yield (0, text)
            else:
                yield (1, int(subfield.group('num')))

    def _compare_rpm_field(lhs, rhs):
        # Short circuit for exact matches (including both being None)
        if lhs == rhs:
            return 0
        # Otherwise assume both inputs are strings
        lhs_subfields = _iter_rpm_subfields(lhs)
        rhs_subfields = _iter_rpm_subfields(rhs)
        for lhs_sf, rhs_sf in zip_longest(lhs_subfields, rhs_subfields):
            if lhs_sf == rhs_sf:
                # When both subfields are the same, move to next subfield
                continue
            if lhs_sf is None:
                # Fewer subfields in LHS, so it's less than/older than RHS
                return -1
            if rhs_sf is None:
                # More subfields in LHS, so it's greater than/newer than RHS
                return 1
            # Found a differing subfield, so it determines the relative order
            return -1 if lhs_sf < rhs_sf else 1
        # No relevant differences found between LHS and RHS
        return 0

    def _compare_rpm_labels(lhs, rhs):
        lhs_epoch, lhs_version, lhs_release = lhs
        rhs_epoch, rhs_version, rhs_release = rhs
        result = _compare_rpm_field(lhs_epoch, rhs_epoch)
        if result:
            return result
        result = _compare_rpm_field(lhs_version, rhs_version)
        if result:
            return result
        return _compare_rpm_field(lhs_release, rhs_release)
Note that I haven't tested this extensively for consistency with the C level implementation - I only use it as a fallback implementation that's at least good enough to let Anitya's test suite pass in environments where system RPM bindings aren't available.
In RPM parlance, 2.el5 is the release field; 2 and el5 are not separate fields. However, release need not have a . in it, as your examples show. Drop the \.(.*) from the end to capture the release field in one shot.
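As a rough sketch of capturing the fields without a regex at all: RPM does not allow hyphens in the version or release, though the name may contain them, so splitting from the right is enough. This assumes a bare name-version-release label with no epoch, architecture, or .rpm suffix, and split_nvr is a name made up here for illustration.

```python
def split_nvr(label):
    """Split 'name-version-release' into its three parts.

    Assumption: no epoch, architecture, or '.rpm' suffix is present.
    The name may contain hyphens; version and release may not, so the
    last two hyphens are the field separators.
    """
    name, version, release = label.rsplit("-", 2)
    return name, version, release


print(split_nvr("bash-3.2-32.el5_9.1"))        # ('bash', '3.2', '32.el5_9.1')
print(split_nvr("python-devel-2.4.3-56.el5"))  # ('python-devel', '2.4.3', '56.el5')
```

A label with fewer than two hyphens raises ValueError on the unpacking, which is probably what you want for malformed input.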
So now you have a package name, version, and release. The easiest way to compare them is to use rpm's python module:
import rpm

# t1 and t2 are tuples of (version, release)
def compare(t1, t2):
    v1, r1 = t1
    v2, r2 = t2
    return rpm.labelCompare(('1', v1, r1), ('1', v2, r2))
What's that extra '1', you ask? That's epoch, and it overrides other version comparison considerations. Further, it's generally not available in the filename. Here, we're faking it to '1' for purposes of this exercise, but that may not be accurate at all. This is one of two reasons your logic is going to be off if you're going by file names alone.
The other reason that your logic may be different from rpm's is the Obsoletes field, which allows a package to be upgraded to a package with an entirely different name. If you're OK with these limitations, then proceed.
If you don't have the rpm Python library at hand, here's the logic for comparing each of release, version, and epoch as of rpm 4.4.2.3:
- Search each string for alphabetic fields [a-zA-Z]+ and numeric fields [0-9]+ separated by junk [^a-zA-Z0-9]*.
- Successive fields in each string are compared to each other.
- Alphabetic sections are compared lexicographically, and the numeric sections are compared numerically.
- In the case of a mismatch where one field is numeric and one is alphabetic, the numeric field is always considered greater (newer).
- In the case where one string runs out of fields, the other is always considered greater (newer).
See lib/rpmvercmp.c in the RPM source for the gory details.
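To make the rules concrete, here is a compact sketch that follows the list above literally. It is not a port of rpmvercmp.c and has not been checked against it; the names fields and compare_field are invented for this example. Each string is broken into tagged fields so that ordinary list comparison gives the stated ordering: a numeric field outranks an alphabetic one, and a string that runs out of fields loses.

```python
import re


def fields(s):
    """Split a version or release string into tagged fields.

    Alphabetic fields become (0, text) and numeric fields (1, int), so a
    numeric field sorts above an alphabetic one in the same position.
    Anything that is not a letter or digit is treated as junk and skipped.
    """
    return [(1, int(tok)) if tok.isdigit() else (0, tok)
            for tok in re.findall(r"[a-zA-Z]+|[0-9]+", s)]


def compare_field(a, b):
    """Return 1 if a is newer, -1 if b is newer, 0 if they compare equal."""
    fa, fb = fields(a), fields(b)
    # List comparison goes field by field; a list that is a strict prefix
    # of the other compares as smaller, matching the last rule.
    return (fa > fb) - (fa < fb)


print(fields("32.el5_9.1"))                     # [(1, 32), (0, 'el'), (1, 5), (1, 9), (1, 1)]
print(compare_field("3.7.4", "3.7.4a"))         # -1: the longer string wins
print(compare_field("32.el5_9.1", "33.el5.1"))  # -1: 32 < 33 numerically
print(compare_field("1.0", "1.a"))              # 1: numeric beats alphabetic
```

Note that el5 tokenizes as two fields, el and 5, which is why 2.el5 and 2.el6 order the way you would expect.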
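RPM has Python bindings, and yum's rpmUtils.miscutils module wraps them in compareEVR. It takes two (epoch, version, release) tuples: the first element is the epoch (not the package name), the middle is the version, and the third is the release, i.e. the packaging version. In the example below the first and third elements are identical on both sides, so only the version decides the result. I'm trying to figure out where 3.7.4a gets sorted.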
[root@rhel56 ~]# python
Python 2.4.3 (#1, Dec 10 2010, 17:24:35)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import rpmUtils.miscutils
>>> rpmUtils.miscutils.compareEVR(("foo", "3.7.4", "1"), ("foo", "3.7.4", "1"))
0
>>> rpmUtils.miscutils.compareEVR(("foo", "3.7.4", "1"), ("foo", "3.7.4a", "1"))
-1
>>> rpmUtils.miscutils.compareEVR(("foo", "3.7.4a", "1"), ("foo", "3.7.4", "1"))
1