Grouping Data¶

🎨 dict assignment¶

In [ ]:
meals = {}
meals
In [ ]:
meals['breakfast'] = 'cereal'
In [ ]:
meals
In [ ]:
meals['lunch'] = 'sandwich'
In [ ]:
meals
In [ ]:
meals['second breakfast'] = "I don't think he knows about second breakfast!"
In [ ]:
meals

🖌 Counting¶

count_letters.py¶

NOTES

  • initialize an empty dictionary counts = {}
  • if letter not in counts: if we haven't seen that letter before, what do we do?
    • Why is this if block necessary? What will happen on counts[letter] += 1 if we don't have it?
  • counts[letter] += 1: what does this mean? What does it do? Lookup value, add 1, reassign to key.
  • step through with a debugger
    • watch how keys are initialized and counts are incremented
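To make the question about the missing `if` block concrete, here is a minimal sketch of what happens when `counts[letter] += 1` runs on a key that has not been initialized. The names `counts` and `missing_key` are for illustration only and are not taken from count_letters.py.

```python
counts = {}

try:
    # += must look up counts['b'] before it can add 1,
    # and that lookup fails because 'b' is not a key yet
    counts['b'] += 1
    missing_key = None
except KeyError as error:
    # the KeyError carries the missing key as its first argument
    missing_key = error.args[0]

print('missing key:', missing_key)

# Initializing the key first makes the increment safe
counts['b'] = 0
counts['b'] += 1
print(counts)
```

The failed increment leaves the dictionary untouched, which is why the `if letter not in counts` check has to come before the update.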
In [ ]:
%%file count_letters.py
def count_letters(text):
    counts = {}
    for letter in text:
        if letter not in counts:
            counts[letter] = 0
        counts[letter] += 1
    return counts

print(count_letters('banana abacus cabana'))
In [ ]:
! python count_letters.py

Common pattern:

  1. Define the key
  2. If I haven't seen the key before, initialize it
  3. Update the value for the key

The order matters: the key must be defined before it can be initialized, and it must be initialized before its value can be updated.
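The three steps above can be labeled directly in code. The sketch below uses a hypothetical function, `count_by_first_letter`, chosen only because its key differs from the item itself, so the "define the key" step is visible on its own line. It assumes every word is non-empty.

```python
def count_by_first_letter(words):
    counts = {}
    for word in words:
        key = word[0]            # 1. define the key
        if key not in counts:    # 2. initialize the key if it is new
            counts[key] = 0
        counts[key] += 1         # 3. update the value for the key
    return counts


fruits = ['apple', 'avocado', 'banana', 'cherry', 'blueberry']
print(count_by_first_letter(fruits))
```

In count_letters.py the key is simply the letter, so step 1 is implicit in the `for` loop.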

👩🏼‍🎨 Count Words¶

Given a passage of text, count the occurrences of each word. Strip punctuation and ignore case.

count_words.py¶

In [ ]:
def count_words(text):
    counts = {}
    for word in text.split():
        # Remove punctuation from both ends of the word, then lowercase it
        word = word.strip('.,!?;').lower()
        if word not in counts:
            counts[word] = 0
        counts[word] += 1
    return counts
In [ ]:
print(count_words("Row, row, row, your boat... and hope your boat floats!"))

🖌 Grouping¶

group_by_first_letter.py¶

NOTES

  • groups[key] = []: here we initialize a group with an empty list
    • draw out an example dictionary: each key pointing to a separate list
  • groups[key].append(word): here we append the word to the list in groups[key]
    • on the diagram, show how when the next word comes up, it gets assigned to one of those lists
  • key = word[0]: here we decide what the things in a group have in common
    • e.g. in this case, we are using the first letter of the word.
  • step through with debugger.
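The notes refer to group_by_first_letter.py, but the file itself is not shown here. The following is a sketch reconstructed from the three lines the notes quote (`key = word[0]`, `groups[key] = []`, `groups[key].append(word)`); the function signature and sample input are assumptions, and it assumes every word is non-empty.

```python
def group_by_first_letter(words):
    groups = {}
    for word in words:
        # decide what the items in a group have in common
        key = word[0]
        # first time we see this key: start a new, separate list
        if key not in groups:
            groups[key] = []
        # add the word to the list that belongs to its key
        groups[key].append(word)
    return groups


fruits = ['apple', 'avocado', 'banana', 'cherry', 'blueberry']
print(group_by_first_letter(fruits))
```

Each key ends up pointing at its own list, which matches the diagram suggested in the notes.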

👩🏾‍🎨 Group by size¶

Group a sequence of words by their length.

group_by_size.py¶

In [ ]:
def group_by_size(words):
    groups = {}
    for word in words:
        key = len(word)
        if key not in groups:
            groups[key] = []
        groups[key].append(word)
    return groups


states = 'Utah Idaho Oregon California Washington Arizona Iowa Ohio Mississippi Florida Kansas Maine'.split()
print(group_by_size(states))

Grouping Pattern¶

  • Given a list of items
  • Determine the key
    • The function used to determine the key is called the grouping function
    • The grouping function defines what the items in a group share in common
  • Initialize an empty group for novel keys
  • Add the current item to its respective group
    • The group might be a counter that increments or a collection that grows
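One way to see the role of the grouping function is to pass it in as a parameter. This is an illustrative sketch, not code from the lesson: `group_by` and `grouping_function` are hypothetical names, and the groups here are lists rather than counters.

```python
def group_by(items, grouping_function):
    groups = {}
    for item in items:
        # the grouping function determines the key
        key = grouping_function(item)
        # initialize an empty group for a novel key
        if key not in groups:
            groups[key] = []
        # add the current item to its group
        groups[key].append(item)
    return groups


words = ['Utah', 'Idaho', 'Iowa', 'Ohio', 'Maine']
print(group_by(words, len))                        # group by length
print(group_by(words, lambda word: word[0]))       # group by first letter
```

Swapping the grouping function changes what the items in each group share, while the rest of the loop stays the same.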

Key Ideas¶

  • Assigning values to keys in a dictionary
  • Counting
  • Grouping