sorting dictionary python 3
dict
does not keep its elements' order. What you need is an OrderedDict: http://docs.python.org/library/collections.html#collections.OrderedDict
edit
Usage example:
>>> from collections import OrderedDict
>>> a = {'foo': 1, 'bar': 2}
>>> a
{'foo': 1, 'bar': 2}
>>> b = OrderedDict(sorted(a.items()))
>>> b
OrderedDict([('bar', 2), ('foo', 1)])
>>> b['foo']
1
>>> b['bar']
2
Python's ordinary dicts
cannot be made to provide the keys/elements in any specific order. For that, you could use the OrderedDict
type from the collections
module. Note that the OrderedDict
type merely keeps a record of insertion order. You would have to sort the entries prior to initializing the dictionary if you want subsequent views/iterators to return the elements in order every time. For example:
>>> myDic={10: 'b', 3:'a', 5:'c'}
>>> sorted_list=sorted(myDic.items(), key=lambda x: x[0])
>>> myOrdDic = OrderedDict(sorted_list)
>>> myOrdDic.items()
[(3, 'a'), (5, 'c'), (10, 'b')]
>>> myOrdDic[7] = 'd'
>>> myOrdDic.items()
[(3, 'a'), (5, 'c'), (10, 'b'), (7, 'd')]
If you want to maintain proper ordering for newly added items, you really need to use a different data structure, e.g., a binary tree/heap. This approach of building a sorted list and using it to initialize a new OrderedDict()
instance is just woefully inefficient unless your data is completely static.
Edit: So, if the object of sorting the data is merely to print it in order, in a format resembling a python dict
object, something like the following should suffice:
def pprint_dict(d):
strings = []
for k in sorted(d.iterkeys()):
strings.append("%d: '%s'" % (k, d[k]))
return '{' + ', '.join(strings) + '}'
Note that this function is not flexible w/r/t the types of the key, value pairs (i.e., it expects the keys to be integers and the corresponding values to be strings). If you need more flexibility, use something like strings.append("%s: %s" % (repr(k), repr(d[k])))
instead.
A modern and fast solution, for Python 3.7. May also work in some interpreters of Python 3.6.
TLDR
To sort a dictionary by keys use:
sorted_dict = {k: disordered[k] for k in sorted(disordered)}
Almost three times faster than the accepted answer; probably more when you include imports.
Comment on the accepted answer
The example in the accepted answer instead of iterating over the keys only - with key
parameter of sorted()
or the default behaviour of dict iteration - iterates over tuples (key, value)
, which suprisingly turns out to be much slower than comparing the keys only and accessing dictionary elements in a list comprehension.
How to sort by key in Python 3.7
The big change in Python 3.7 is that the dictionaries are now ordered by default.
- You can generate sorted dict using dict comprehensions.
- Using
OrderedDict
might still be preferable for the compatibility sake. - Do not use
sorted(d.items())
withoutkey
.
See:
disordered = {10: 'b', 3: 'a', 5: 'c'}
# sort keys, then get values from original - fast
sorted_dict = {k: disordered[k] for k in sorted(disordered)}
# key = itemgetter - slower
from operator import itemgetter
key = itemgetter(0)
sorted_dict = {k: v for k, v in sorted(disordered.items(), key=key)}
# key = lambda - the slowest
key = lambda item: item[0]
sorted_dict = {k: v for k in sorted(disordered.items(), key=key)}
Timing results:
Best for {k: d[k] for k in sorted(d)}: 7.507327548999456
Best for {k: v for k, v in sorted(d.items(), key=key_getter)}: 12.031082626002899
Best for {k: v for k, v in sorted(d.items(), key=key_lambda)}: 14.22885995300021
Best for dict(sorted(d.items(), key=key_getter)): 11.209122000000207
Best for dict(sorted(d.items(), key=key_lambda)): 13.289728325995384
Best for dict(sorted(d.items())): 14.231471302999125
Best for OrderedDict(sorted(d.items(), key=key_getter)): 16.609151654003654
Best for OrderedDict(sorted(d.items(), key=key_lambda)): 18.52622927199991
Best for OrderedDict(sorted(d.items())): 19.436101284998585
Testing code:
from timeit import repeat
setup_code = """
from operator import itemgetter
from collections import OrderedDict
import random
random.seed(0)
d = {i: chr(i) for i in [random.randint(0, 120) for repeat in range(120)]}
key_getter = itemgetter(0)
key_lambda = lambda item: item[0]
"""
cases = [
# fast
'{k: d[k] for k in sorted(d)}',
'{k: v for k, v in sorted(d.items(), key=key_getter)}',
'{k: v for k, v in sorted(d.items(), key=key_lambda)}',
# slower
'dict(sorted(d.items(), key=key_getter))',
'dict(sorted(d.items(), key=key_lambda))',
'dict(sorted(d.items()))',
# the slowest
'OrderedDict(sorted(d.items(), key=key_getter))',
'OrderedDict(sorted(d.items(), key=key_lambda))',
'OrderedDict(sorted(d.items()))',
]
for code in cases:
times = repeat(code, setup=setup_code, repeat=3)
print(f"Best for {code}: {min(times)}")
I don't think you want an OrderedDict. It sounds like you'd prefer a SortedDict, that is a dict that maintains its keys in sorted order. The sortedcontainers module provides just such a data type. It's written in pure-Python, fast-as-C implementations, has 100% coverage and hours of stress.
Installation is easy with pip:
pip install sortedcontainers
Note that if you can't pip install
then you can simply pull the source files from the open-source repository.
Then you're code is simply:
from sortedcontainers import SortedDict
myDic = SortedDict({10: 'b', 3:'a', 5:'c'})
sorted_list = list(myDic.keys())
The sortedcontainers module also maintains a performance comparison with other popular implementations.