Strings in Action¶

🖌 Look-back¶

Given a string composed of blocks of repeated letters, turn it into a list of those blocks.

Input:

'AAAAABBCCCDDDDDDDEAAAA'

Output:

['AAAAA','BB','CCC','DDDDDDD','E','AAAA']

split_blocks.py¶

test_split_blocks.py¶

NOTES

  • Draw it out
    • Where are the boundaries? How do we know it's a boundary?
    • If we are marching through the characters of the string, how do we know that the current character is "different"?
  • Strategy:
    • Call a contiguous block of identical letters a "block"
    • Keep a list of blocks
    • Keep a current block
    • Build the current block until the current letter doesn't match
      • Put the current block in the list
      • Start the current block over with the new letter
    • At the end, add the current block to the list
      • But discover this necessity through testing... ;)
In [ ]:
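def split_blocks(text):
    blocks = []
    current_block = None
    for char in text:
        if current_block is None:
            # Start first block
            current_block = char
        elif char == current_block[-1]:
            # Look back: same letter as the previous one, so extend the block
            current_block += char
        else:
            # Boundary: save the finished block and start a new one
            blocks.append(current_block)
            current_block = char
    # Add the final block, unless the text was empty and no block was started
    if current_block is not None:
        blocks.append(current_block)
    return blocks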
In [ ]:
split_blocks('AAAABBBCDDD')
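
The strategy notes suggest discovering the final append through testing. A minimal sketch of what test_split_blocks.py might contain is shown here; the test names are hypothetical, and a copy of split_blocks is included only so the sketch runs on its own (in the course layout it would presumably be imported from split_blocks.py). Treating an empty string as "no blocks" is an assumption, not something the exercise states.

```python
# test_split_blocks.py (sketch)
# Assumption: in the real file this would be `from split_blocks import split_blocks`.
# A copy is defined here so the example is self-contained.

def split_blocks(text):
    blocks = []
    current_block = None
    for char in text:
        if current_block is None:
            current_block = char
        elif char == current_block[-1]:
            current_block += char
        else:
            blocks.append(current_block)
            current_block = char
    # Assumption: an empty string produces an empty list of blocks
    if current_block is not None:
        blocks.append(current_block)
    return blocks


def test_example_from_the_notes():
    # Joining the expected blocks rebuilds the input string
    expected = ['AAAAA', 'BB', 'CCC', 'DDDDDDD', 'E', 'AAAA']
    assert split_blocks(''.join(expected)) == expected


def test_last_block_is_not_dropped():
    # Fails if the append after the loop is missing
    assert split_blocks('AAB') == ['AA', 'B']


def test_single_block():
    assert split_blocks('ZZZ') == ['ZZZ']


def test_empty_string():
    assert split_blocks('') == []
```

The second test is the one that exposes a missing final append: without it, the last block never reaches the list.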

🧑🏽‍🎨 Grouping Customers¶

Your customers often come in as family groups. You have a list of customer family names (for now we'll assume that all members of a family have the same family name), and you want to group those customers together so they can be served as a group instead of individually. You also want to know how many people are in each group.

Given a list of family names, print out the sequence of family names along with the number of individuals in each family.

Input:

['Smith', 'Smith', 'Smith', 'Smith', 'Ng', 'Ng', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Santos', 'Kowalska', 'Kowalska', 'Kowalska', 'Smith', 'Smith', 'Smith']

Output:

Smith: 4
Ng: 2
Nelson: 12
Santos: 1
Kowalska: 3
Smith: 3

customer_report.py¶

test_customer_report.py¶

NOTES

  • You'll have to create test_customer_report.py manually
  • How do you test a function that prints?
    • While there are ways to do it with pytest, let's instead design our functions to return values so they are easy to test
  • Test the contents of the list using a list literal
  • What boundary conditions should we include?
    • empty list (no customers)
    • families with one member
In [ ]:
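def group_customers(customers):
    groups = []
    current_group = None
    for name in customers:
        if current_group is None:
            # Start first group
            current_group = [name]
        elif name == current_group[-1]:
            # Look back: same family as the previous customer, so extend the group
            current_group.append(name)
        else:
            # Boundary: save the finished group and start a new one
            groups.append(current_group)
            current_group = [name]
    # Add the final group, unless the list was empty and no group was started
    if current_group is not None:
        groups.append(current_group)
    return groups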

def format_groups(groups):
    report = []
    for group in groups:
        report.append(f'{group[0]}: {len(group)}')
    return report
        
def generate_customer_report(customers):
    groups = group_customers(customers)
    report = format_groups(groups)
    return report
In [ ]:
customers = ['Smith', 'Smith', 'Smith', 'Smith', 'Ng', 'Ng', 
             'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 
             'Nelson', 'Nelson', 'Nelson', 'Nelson', 'Nelson', 
             'Nelson', 'Nelson', 'Santos', 'Kowalska', 
             'Kowalska', 'Kowalska', 'Smith', 'Smith', 'Smith']

report = generate_customer_report(customers)

for line in report:
    print(line)
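
Because generate_customer_report returns a list of strings instead of printing, it can be checked against list literals. The following is one possible sketch of test_customer_report.py covering the boundary conditions listed in the notes. The test names are hypothetical, and compact copies of the functions are included only so the sketch runs on its own (the real file would presumably import them from customer_report.py). Returning an empty report for an empty list is an assumption.

```python
# test_customer_report.py (sketch)
# Assumption: the real file would use
# `from customer_report import generate_customer_report`.

def group_customers(customers):
    groups = []
    current_group = None
    for name in customers:
        if current_group is None:
            current_group = [name]
        elif name == current_group[-1]:
            current_group.append(name)
        else:
            groups.append(current_group)
            current_group = [name]
    # Assumption: no customers means no groups
    if current_group is not None:
        groups.append(current_group)
    return groups


def generate_customer_report(customers):
    # One "Name: count" line per contiguous family group
    return [f'{group[0]}: {len(group)}' for group in group_customers(customers)]


def test_contiguous_families_are_counted():
    customers = ['Smith', 'Smith', 'Ng', 'Smith']
    assert generate_customer_report(customers) == ['Smith: 2', 'Ng: 1', 'Smith: 1']


def test_families_with_one_member():
    assert generate_customer_report(['Santos', 'Ng']) == ['Santos: 1', 'Ng: 1']


def test_empty_list():
    assert generate_customer_report([]) == []
```

Note that the first test expects two separate Smith entries: groups are contiguous runs, not totals per family name.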

Key Ideas¶

  • Grouping contiguous items using a look-back pattern
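
As a cross-check on the hand-written look-back loops, Python's standard library offers itertools.groupby, which also groups consecutive equal items. The sketch below is an alternative technique, not the approach used in these notes, and the function names are hypothetical.

```python
from itertools import groupby


def split_blocks_groupby(text):
    # groupby yields one (key, group) pair per run of equal consecutive items
    return [''.join(group) for _key, group in groupby(text)]


def count_families_groupby(customers):
    # len(list(group)) counts the members of each contiguous run
    return [f'{name}: {len(list(group))}' for name, group in groupby(customers)]


print(split_blocks_groupby('AAAABBBCDDD'))
# ['AAAA', 'BBB', 'C', 'DDD']
print(count_families_groupby(['Smith', 'Smith', 'Ng', 'Smith']))
# ['Smith: 2', 'Ng: 1', 'Smith: 1']
```

Comparing these results with split_blocks and generate_customer_report on the same inputs is a quick way to confirm that the look-back logic handles boundaries correctly.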