
COMP0023: Research Software Engineering With Python


## Dictionaries

### The Python Dictionary

Python supports a container type called a dictionary.

This is also known as an "associative array", "map" or "hash" in other languages.

In a list, we use a number to look up an element:

In [1]:
names = "Martin Luther King".split(" ")

In [2]:
names[1]

Out[2]:
'Luther'

In a dictionary, we look up an element using another object of our choice:

In [3]:
chapman = {"name": "Graham", "age": 48,
           "Jobs": ["Comedian", "Writer"]}

In [4]:
chapman

Out[4]:
{'name': 'Graham', 'age': 48, 'Jobs': ['Comedian', 'Writer']}
In [5]:
chapman['Jobs']

Out[5]:
['Comedian', 'Writer']
In [6]:
chapman['age']

Out[6]:
48
In [7]:
type(chapman)

Out[7]:
dict

### Keys and Values

The objects we use to look things up are called keys:

In [8]:
chapman.keys()

Out[8]:
dict_keys(['name', 'age', 'Jobs'])

The things we can look up are called values:

In [9]:
chapman.values()

Out[9]:
dict_values(['Graham', 48, ['Comedian', 'Writer']])

When we test for containment on a dict with in, we test against the keys, not the values:

In [10]:
'Jobs' in chapman

Out[10]:
True
In [11]:
'Graham' in chapman

Out[11]:
False
In [12]:
'Graham' in chapman.values()

Out[12]:
True
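As a small sketch of how keys and values are used together, the cell below rebuilds the chapman dictionary from above, walks over its key-value pairs with items(), and uses get() to look up a key that may be missing. The "height" key and the "unknown" default are made up for illustration.

```python
# Same record as the `chapman` dictionary used above.
chapman = {"name": "Graham", "age": 48, "Jobs": ["Comedian", "Writer"]}

# items() yields (key, value) pairs.
pairs = list(chapman.items())
for key, value in pairs:
    print(key, "->", value)

# Indexing with a key that is not present raises KeyError...
try:
    chapman["height"]  # "height" is a hypothetical key, not in the dictionary
    missing_key_raised = False
except KeyError:
    missing_key_raised = True

# ...whereas get() returns a default value of our choice instead.
height = chapman.get("height", "unknown")
print(height)
```

Using get() is convenient when a missing key is an expected situation rather than an error.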

### Immutable Keys Only

The way in which dictionaries work is one of the coolest things in computer science: the "hash table". The details of this are beyond the scope of this course, but we will consider some aspects in the section on performance programming.

One consequence of this implementation is that keys must be hashable. In practice this means you can only use immutable things as keys, such as strings, numbers, and tuples whose elements are themselves immutable.

In [13]:
good_match = {
    ("Lamb", "Mint"): True,
    ("Bacon", "Chocolate"): False
}


but:

In [14]:
illegal = {
    ["Lamb", "Mint"]: True,
    ["Bacon", "Chocolate"]: False
}

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[14], line 1
----> 1 illegal = {
2     ["Lamb", "Mint"]: True,
3     ["Bacon", "Chocolate"]: False
4    }

TypeError: unhashable type: 'list'

Remember -- square brackets denote lists, which are mutable, and round brackets denote tuples, which are immutable.
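The "unhashable type" error can be reproduced directly with the built-in hash() function. The following is a minimal sketch, not a description of how dictionaries are implemented internally; the variable names are chosen for illustration only.

```python
# A tuple of strings is hashable, so hash() returns an integer.
tuple_key = ("Lamb", "Mint")
tuple_hash_is_int = isinstance(hash(tuple_key), int)

# A list is mutable and unhashable, so hash() raises TypeError.
try:
    hash(["Lamb", "Mint"])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# A tuple is only hashable if everything inside it is hashable too.
try:
    hash((["Lamb"], "Mint"))
    nested_is_hashable = True
except TypeError:
    nested_is_hashable = False

# frozenset is the immutable counterpart of set. It can be used as a key
# when the order of the elements should not matter.
unordered_match = {frozenset({"Lamb", "Mint"}): True}
lookup = unordered_match[frozenset({"Mint", "Lamb"})]

print(tuple_hash_is_int, list_is_hashable, nested_is_hashable, lookup)
```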

### No Guarantee of Order (Before Python 3.7)

Another consequence of the way dictionaries used to work is that there was no guaranteed order among the elements. However, since Python 3.7, the language guarantees that dictionaries return their elements in the order in which they were inserted. Read more about why that changed and how it is still fast.

In [15]:
my_dict = {'0': 0, '1':1, '2': 2, '3': 3, '4': 4}
print(my_dict)
print(my_dict.values())

{'0': 0, '1': 1, '2': 2, '3': 3, '4': 4}
dict_values([0, 1, 2, 3, 4])
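Insertion order has a few consequences that are easy to check. The sketch below assumes Python 3.7 or later, and the dictionaries first and second are hypothetical examples: two dictionaries with the same contents compare equal whatever their order, updating an existing key keeps its position, and deleting then re-adding a key moves it to the end.

```python
first = {"a": 1, "b": 2}
second = {"b": 2, "a": 1}

# Iterating over a dictionary yields its keys in insertion order.
first_order = list(first)
second_order = list(second)

# Equality compares contents only; order is ignored.
same_contents = first == second

# Updating the value of an existing key does not change its position.
first["a"] = 10
order_after_update = list(first)

# Deleting a key and inserting it again places it at the end.
del first["a"]
first["a"] = 1
order_after_reinsert = list(first)

print(first_order, second_order, same_contents)
print(order_after_update, order_after_reinsert)
```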


### Sets

A set is an unordered collection which cannot contain the same element twice. We make one by calling set() on any iterable, e.g. a list or string.

In [16]:
name = "Graham Chapman"
unique_letters = set(name)

In [17]:
unique_letters

Out[17]:
{' ', 'C', 'G', 'a', 'h', 'm', 'n', 'p', 'r'}

Or by writing a literal like a dictionary, but without the colons. (Empty braces {} create an empty dictionary, so an empty set must be written set().)

In [18]:
primes_below_ten = { 2, 3, 5, 7}

In [19]:
type(unique_letters)

Out[19]:
set
In [20]:
type(primes_below_ten)

Out[20]:
set
In [21]:
unique_letters

Out[21]:
{' ', 'C', 'G', 'a', 'h', 'm', 'n', 'p', 'r'}

This will be easier to read if we turn the set of letters back into a string, with join:

In [22]:
"".join(unique_letters)

Out[22]:
'rCGmnhp a'

A set has no particular order, so the joined string above may come out differently on another run, but sets are really useful for checking or storing unique values.

Set operations work as in mathematics:

In [23]:
x = set("Hello")
y = set("Goodbye")

In [24]:
x & y # Intersection

Out[24]:
{'e', 'o'}
In [25]:
x | y # Union

Out[25]:
{'G', 'H', 'b', 'd', 'e', 'l', 'o', 'y'}
In [26]:
y - x # y intersection with complement of x: letters in Goodbye but not in Hello

Out[26]:
{'G', 'b', 'd', 'y'}
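A few more operations follow the same mathematical pattern. The sketch below reuses x and y from above and sorts each result before printing, since sets themselves have no order. The small sets vowels_found and other_letters are made up for illustration.

```python
x = set("Hello")
y = set("Goodbye")

# Symmetric difference: letters in exactly one of the two words.
in_one_only = sorted(x ^ y)

# Subset test: is every element of the left-hand set also in x?
vowels_found = {"e", "o"}
is_subset = vowels_found <= x

# isdisjoint() is True when the two sets share no elements.
other_letters = set("xyz")
no_overlap = x.isdisjoint(other_letters)

# Sets are mutable: add() inserts an element, and adding a duplicate has no effect.
x.add("o")
size_after_duplicate_add = len(x)

print(in_one_only, is_subset, no_overlap, size_after_duplicate_add)
```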

Your programs will be faster and more readable if you use the appropriate container type for your data's meaning. Use a set for a collection which can't in principle contain the same item twice and whose order doesn't matter, and use a dictionary for anything which feels like a mapping from keys to values.
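To illustrate choosing a container by meaning, here is a minimal sketch that treats "how many times does each letter occur?" as a mapping from letter to count (a dictionary) and "which letters occur at all?" as a collection of unique items (a set). The variable letter_counts is introduced here for illustration.

```python
name = "Graham Chapman"

# A mapping from each letter to the number of times it appears: a dictionary.
letter_counts = {}
for letter in name:
    # get() supplies 0 the first time a letter is seen.
    letter_counts[letter] = letter_counts.get(letter, 0) + 1

# The distinct letters, with no counts and no order: a set.
unique_letters = set(name)

print(letter_counts["a"])
print(len(unique_letters))

# The keys of the dictionary are exactly the elements of the set.
print(set(letter_counts) == unique_letters)
```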