The Best Way to Compare Two Dictionaries in Python
Python's dictionary is a very versatile data structure. We can use it for many tasks that require a key->value mapping. A dict is not only efficient in terms of performance but also easy to use. Despite these strengths, some operations are somewhat tricky. For example, Python has no built-in feature allowing us to:
- Compare two dictionaries and check how many pairs are equal
- Get the difference between two dicts
- Check if the dictionaries are equal, especially when they have nested structures
- Compare dicts that have floating-point numbers as values
In this article, I will show how you can do those operations and many more, so let's go.
Table of Contents
- Why You Need a Robust Way to Compare Dictionaries
- Using the Right Tool for the Job
- Getting a Simple Difference
- Ignoring String Case
- Comparing Float Values
- Comparing numpy Values
- Comparing Dictionaries With datetime Objects
- Comparing String Values
- Excluding Fields
- Conclusion
Why You Need a Robust Way to Compare Dictionaries
Let's imagine the following scenario: you have two simple dictionaries. How can we assert that they match? Easy, right?
Yeah! You could use the == operator, of course!
>>> a = {'number': 1, 'list': ['one', 'two']}
>>> b = {'list': ['one', 'two'], 'number': 1}
>>> a == b
True
That's expected, since the dictionaries are the same. But what if some value is different? The result will be False, but can we tell where they differ?
>>> a = {'number': 1, 'list': ['one', 'two']}
>>> b = {'list': ['one', 'two'], 'number': 2}
>>> a == b
False
Hum... Just False doesn't tell us much...
What about the strs inside the list? Let's say that we want to ignore their case.
>>> a = {'number': 1, 'list': ['ONE', 'two']}
>>> b = {'list': ['one', 'two'], 'number': 1}
>>> a == b
False
Oops...
What if the number was a float and we considered two floats to be the same if they agree to 3 decimal places? Put another way, we want to check only whether the first 3 digits after the decimal point match.
>>> a = {'number': 1, 'list': ['one', 'two']}
>>> b = {'list': ['one', 'two'], 'number': 1.00001}
>>> a == b
False
You might also want to exclude some fields from the comparison. As an example, we might want to leave the list key->value pair out of the check. Unless we create a new dictionary without it, there's no method to do that for you.
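That workaround, building new dictionaries without the unwanted key, might look like the sketch below. The helper name without_keys is made up for this illustration, and it only handles top-level keys.

```python
def without_keys(d, keys):
    """Return a shallow copy of d that omits the given top-level keys."""
    return {k: v for k, v in d.items() if k not in keys}


a = {'number': 1, 'list': ['ONE', 'two']}
b = {'list': ['one', 'two'], 'number': 1}

# The two dicts differ only in the 'list' value, so they match once it is dropped.
same = without_keys(a, {'list'}) == without_keys(b, {'list'})
print(same)  # True
```

It works, but it gets tedious quickly once the field to skip is nested a few levels deep.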
Can't it get any worse?
Yes, what if a value is a numpy array?
>>> import numpy as np
>>> a = {'number': 1, 'list': ['one', 'two'], 'array': np.ones(3)}
>>> b = {'list': ['one', 'two'], 'number': 1, 'array': np.ones(3)}
>>> a == b
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-eeadcaeab874> in <module>
----> 1 a == b
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Oh no, it raises an exception right in our faces!
Damn it, what can we do then?
Using the Right Tool for the Job
Since dicts cannot perform advanced comparisons on their own, there are only two ways of achieving that. You can either implement the functionality yourself or use a third-party library. At some point in your life you probably heard about not reinventing the wheel, so in this tutorial we'll use a library.
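To give a sense of what implementing it yourself involves, here is a minimal sketch of a hand-rolled recursive diff. The name simple_diff and its output format are my own invention for illustration. It only descends into nested dicts and knows nothing about lists, floats, string case or numpy arrays.

```python
_MISSING = object()  # sentinel for keys that exist on only one side


def simple_diff(old, new, path="root"):
    """Return a list of (path, old_value, new_value) tuples for values that differ."""
    if isinstance(old, dict) and isinstance(new, dict):
        changes = []
        # Visit the union of keys in a stable order so the result is deterministic.
        for key in sorted(old.keys() | new.keys(), key=repr):
            changes.extend(
                simple_diff(
                    old.get(key, _MISSING),
                    new.get(key, _MISSING),
                    f"{path}[{key!r}]",
                )
            )
        return changes
    return [] if old == new else [(path, old, new)]


a = {'number': 1, 'list': ['one', 'two']}
b = {'list': ['one', 'two'], 'number': 2}
print(simple_diff(a, b))  # [("root['number']", 1, 2)]
```

Every extra requirement from the previous section would mean more branches in this function, which is why a library is the more practical choice.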
We'll adopt a library called DeepDiff, from zepworks. DeepDiff can pick up the difference between dictionaries, iterables, strings and other objects. It accomplishes that by searching for changes recursively.
DeepDiff is not the only kid on the block. There's also Dictdiffer, developed by the folks at CERN. Dictdiffer is also cool but lacks a lot of the features that make DeepDiff so interesting. In any case, I encourage you to look at both and determine which one works best for you.
Getting a Simple Difference
In this example, we'll solve the first problem I showed you. We want to find the key whose value differs between the two dicts. Consider the following code snippet, this time using deepdiff.
In [1]: from deepdiff import DeepDiff

In [2]: a = {
   ...:     'number': 1,
   ...:     'list': ['one', 'two']
   ...: }

In [3]: b = {
   ...:     'list': ['one', 'two'],
   ...:     'number': 2
   ...: }

In [4]: diff = DeepDiff(a, b)

In [5]: diff
Out[5]: {'values_changed': {"root['number']": {'new_value': 2, 'old_value': 1}}}
Awesome! It tells us that the key 'number' had the value 1 in a, but the new dict, b, has a new value, 2.
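Since the result behaves like a nested dictionary, we can also walk it to build our own messages. The sketch below uses a plain dict literal copied from the output above instead of calling DeepDiff, so it runs with the standard library alone. In real code the variable would hold the DeepDiff result.

```python
# Literal copied from the output above; a stand-in for the real DeepDiff result.
diff = {'values_changed': {"root['number']": {'new_value': 2, 'old_value': 1}}}

# Turn each changed path into a short, human-readable line.
messages = [
    f"{path}: {change['old_value']!r} -> {change['new_value']!r}"
    for path, change in diff.get('values_changed', {}).items()
]
print("\n".join(messages))  # root['number']: 1 -> 2
```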
Ignoring String Case
In our second example, one element of the list was in uppercase, but we didn't care about that. We wanted to ignore case and treat "one" the same as "ONE". You can solve that by setting ignore_string_case=True.
In [10]: a = {
    ...:     'number': 1,
    ...:     'list': ['ONE', 'two']
    ...: }

In [11]: b = {
    ...:     'list': ['one', 'two'],
    ...:     'number': 1
    ...: }

In [12]: diff = DeepDiff(a, b, ignore_string_case=True)

In [13]: diff
Out[13]: {}
If we don't do that, a very helpful message is printed.
In [14]: diff = DeepDiff(a, b)

In [15]: diff
Out[15]: {'values_changed': {"root['list'][0]": {'new_value': 'one', 'old_value': 'ONE'}}}
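For intuition, ignoring case amounts to normalizing every string before comparing. Here is a rough standard-library sketch of that idea. The name lower_strings is made up, it only lowercases values (not keys), and I'm not claiming this is how DeepDiff implements the option.

```python
def lower_strings(value):
    """Recursively lowercase every string found inside dicts and lists."""
    if isinstance(value, str):
        return value.lower()
    if isinstance(value, dict):
        return {k: lower_strings(v) for k, v in value.items()}
    if isinstance(value, list):
        return [lower_strings(v) for v in value]
    return value  # numbers and other objects are left untouched


a = {'number': 1, 'list': ['ONE', 'two']}
b = {'list': ['one', 'two'], 'number': 1}

print(a == b)                                # False
print(lower_strings(a) == lower_strings(b))  # True
```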
Comparing Float Values
We also saw a case where we had a float and only wanted to check whether the first 3 digits after the decimal point were equal. With DeepDiff, the significant_digits option takes the exact number of digits AFTER the decimal point. Also, since floats differ from ints, we might want to ignore the type difference as well. We can do that by setting ignore_numeric_type_changes=True.
In [16]: a = {
    ...:     'number': 1,
    ...:     'list': ['one', 'two']
    ...: }

In [17]: b = {
    ...:     'list': ['one', 'two'],
    ...:     'number': 1.00001
    ...: }

In [18]: diff = DeepDiff(a, b)

In [19]: diff
Out[19]: {'type_changes': {"root['number']": {'old_type': int, 'new_type': float, 'old_value': 1, 'new_value': 1.00001}}}

In [24]: diff = DeepDiff(a, b, significant_digits=3, ignore_numeric_type_changes=True)

In [25]: diff
Out[25]: {}
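To make "digits after the decimal point" concrete, here is a small standard-library sketch that formats both numbers to a fixed number of decimal places before comparing. The helper same_to_decimals is hypothetical. It illustrates the idea and is not meant as a description of DeepDiff's internals.

```python
def same_to_decimals(x, y, digits=3):
    """Compare two numbers after formatting both to a fixed number of decimal places."""
    return f"{x:.{digits}f}" == f"{y:.{digits}f}"


print(same_to_decimals(1, 1.00001))            # True:  '1.000' == '1.000'
print(same_to_decimals(1, 1.002))              # False: '1.000' != '1.002'
print(same_to_decimals(1, 1.00001, digits=5))  # False: '1.00000' != '1.00001'
```

Note that the int 1 and the float 1.00001 compare equal here because both are turned into text first, which is the same reason the type difference has to be ignored in the DeepDiff call.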
Comparing numpy Values
When we tried comparing two dictionaries with a numpy array in them, we failed miserably. Fortunately, DeepDiff has our backs here. It supports numpy objects by default!
In [27]: import numpy as np

In [28]: a = {
    ...:     'number': 1,
    ...:     'list': ['one', 'two'],
    ...:     'array': np.ones(3)
    ...: }

In [29]: b = {
    ...:     'list': ['one', 'two'],
    ...:     'number': 1,
    ...:     'array': np.ones(3)
    ...: }

In [30]: diff = DeepDiff(a, b)

In [31]: diff
Out[31]: {}
What if the arrays are different?
No problem!
In [28]: a = {
    ...:     'number': 1,
    ...:     'list': ['one', 'two'],
    ...:     'array': np.ones(3)
    ...: }

In [32]: b = {
    ...:     'list': ['one', 'two'],
    ...:     'number': 1,
    ...:     'array': np.array([1, 2, 3])
    ...: }

In [33]: diff = DeepDiff(a, b)

In [34]: diff
Out[34]: {'type_changes': {"root['array']": {'old_type': numpy.float64, 'new_type': numpy.int64, 'old_value': array([1., 1., 1.]), 'new_value': array([1, 2, 3])}}}
It shows that not only do the values differ, but the types do too!
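For comparison, handling arrays by hand means special-casing them, because == on arrays returns another array instead of a single bool. A minimal sketch, assuming flat dicts and a made-up helper name dicts_equal, could use np.array_equal:

```python
import numpy as np


def dicts_equal(a, b):
    """Compare two flat dicts, using np.array_equal for values that are arrays."""
    if a.keys() != b.keys():
        return False
    for key in a:
        x, y = a[key], b[key]
        if isinstance(x, np.ndarray) or isinstance(y, np.ndarray):
            # array_equal returns one bool instead of an element-wise array
            if not np.array_equal(x, y):
                return False
        elif x != y:
            return False
    return True


a = {'number': 1, 'list': ['one', 'two'], 'array': np.ones(3)}
b = {'list': ['one', 'two'], 'number': 1, 'array': np.ones(3)}
c = {'list': ['one', 'two'], 'number': 1, 'array': np.array([1, 2, 3])}

print(dicts_equal(a, b))  # True
print(dicts_equal(a, c))  # False
```

Unlike the DeepDiff output, this only answers yes or no and says nothing about where the arrays differ.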
Comparing Dictionaries With datetime Objects
Another common use case is comparing datetime objects. This kind of object has the following signature:
class datetime.datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold=0)
In case we have a dict with datetime objects, DeepDiff allows us to compare only certain parts of them. For instance, if we only care about year, month, and day, then we can truncate everything below the day.
In [1]: import datetime

In [2]: from deepdiff import DeepDiff

In [3]: a = {
   ...:     'list': ['one', 'two'],
   ...:     'number': 1,
   ...:     'date': datetime.datetime(2020, 6, 17, 22, 45, 34, 513371)
   ...: }

In [4]: b = {
   ...:     'list': ['one', 'two'],
   ...:     'number': 1,
   ...:     'date': datetime.datetime(2020, 6, 17, 12, 12, 51, 115791)
   ...: }

In [5]: diff = DeepDiff(a, b, truncate_datetime='day')

In [6]: diff
Out[6]: {}
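If you're wondering what truncating to the day means in practice, the sketch below does it by hand with datetime.replace. The function truncate_to_day is a hypothetical helper for illustration, not part of DeepDiff.

```python
import datetime


def truncate_to_day(dt):
    """Zero out everything below the day so only year, month and day remain."""
    return dt.replace(hour=0, minute=0, second=0, microsecond=0)


d1 = datetime.datetime(2020, 6, 17, 22, 45, 34, 513371)
d2 = datetime.datetime(2020, 6, 17, 12, 12, 51, 115791)

print(d1 == d2)                                    # False
print(truncate_to_day(d1) == truncate_to_day(d2))  # True
```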
Comparing String Values
We've looked at interesting examples so far, and it's also common to use dicts to store string values. Having a better way of contrasting them can help us a lot! In this section I'm going to show you another lovely feature, the str diff.
In [13]: from pprint import pprint

In [17]: b = {
    ...:     'number': 1,
    ...:     'text': 'hi,\n my awesome world!'
    ...: }

In [18]: a = {
    ...:     'number': 1,
    ...:     'text': 'hello, my\n dear\n world!'
    ...: }

In [20]: ddiff = DeepDiff(a, b, verbose_level=2)

In [21]: pprint(ddiff, indent=2)
{ 'values_changed': { "root['text']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,3 +1,2 @@\n'
                                                '-hello, my\n'
                                                '- dear\n'
                                                '- world!\n'
                                                '+hi,\n'
                                                '+ my awesome world!',
                                        'new_value': 'hi,\n my awesome world!',
                                        'old_value': 'hello, my\n'
                                                     ' dear\n'
                                                     ' world!'}}}
That's nice! We can see the exact lines where the two strings differ.
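The 'diff' text is in the unified diff format, the same one you get from tools like git diff. Python's standard library can produce it too, so here's a small sketch with difflib on the same two strings. I'm only showing the format here and I'm not asserting anything about how DeepDiff builds it internally.

```python
import difflib

old = 'hello, my\n dear\n world!'
new = 'hi,\n my awesome world!'

# unified_diff works line by line, so split both strings into lines first.
diff_lines = list(
    difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm='')
)
print('\n'.join(diff_lines))
```

Lines starting with - come from the old string and lines starting with + come from the new one. The @@ -1,3 +1,2 @@ header says the hunk covers 3 lines of the old text and 2 lines of the new one, both starting at line 1.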
Excluding Fields
In this last example, I'll show you yet another common use case, excluding a field. We might want to exclude one or more items from the comparison. For instance, using the previous example, we might want to leave out the text field.
In [17]: b = {
    ...:     'number': 1,
    ...:     'text': 'hi,\n my awesome world!'
    ...: }

In [18]: a = {
    ...:     'number': 1,
    ...:     'text': 'hello, my\n dear\n world!'
    ...: }

In [26]: ddiff = DeepDiff(a, b, verbose_level=2, exclude_paths=["root['text']"])

In [27]: ddiff
Out[27]: {}
If you want even more advanced exclusions, DeepDiff also allows you to pass a regular expression. Check this out: https://zepworks.com/deepdiff/current/exclude_paths.html#exclude-regex-paths.
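When writing such a pattern, keep in mind that paths look like root['text'], so the square brackets must be escaped. The sketch below uses only the re module to show which paths a pattern would catch. The third path is hypothetical, and I'm assuming the pattern is searched against the path string, which is worth confirming on the linked page.

```python
import re

# Paths written the way DeepDiff reports them; "root['text_extra']" is made up.
paths = ["root['number']", "root['text']", "root['text_extra']"]

# Brackets are regex metacharacters, so they are escaped with backslashes.
pattern = re.compile(r"root\['text.*'\]")

kept = [p for p in paths if not pattern.search(p)]
print(kept)  # ["root['number']"]
```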
Conclusion
That's it for today, folks! I really hope you've learned something new and useful. Comparing dicts is a common use case since they can be used to store almost any kind of data. As a result, having a proper tool to ease this effort is indispensable. DeepDiff has many features and can do reasonably advanced comparisons. If you ever need to compare dicts, go check it out.
Other posts you may like:
- 3 Ways to Test API Client Applications in Python
- 5 Hidden Python Features You Probably Never Heard Of
- 7 pytest Features and Plugins That Will Save You Tons of Time
- Everything You Need to Know About Python's Namedtuples
See you next time!
This post was originally published on https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python
Original Link: https://dev.to/miguendes/the-best-way-to-compare-two-dictionaries-in-python-3g9l