Coiteration with zip

There are times when we want to iterate over two lists at the same time. For example, we may have a list of fruits:

fruits = ['apple', 'pear', 'peach'] 

and a list of prices:

prices = [0.25, 0.40, 10.0]

and we want to print a list that shows each fruit and its price:

apple : $0.25
pear : $0.4
peach : $10.0

Yes, peaches are worth that much more than any other fruit. They are amazing.

zip()

To iterate over two lists at the same time, we can zip them together:

for fruit, price in zip(fruits, prices):
    print(f'{fruit} : ${price}')

This will produce the list of fruits and their prices as shown above.
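One detail worth knowing: the loop above prints each price the way Python displays a float, so 0.40 appears as 0.4. If you want every price to show two decimal places, a format specifier inside the f-string is one option. This is a minimal sketch; the lines list is only there so the result is easy to inspect and is not part of the original example.

```python
fruits = ['apple', 'pear', 'peach']
prices = [0.25, 0.40, 10.0]

lines = []
for fruit, price in zip(fruits, prices):
    # :.2f formats the price with exactly two digits after the decimal point
    lines.append(f'{fruit} : ${price:.2f}')

for line in lines:
    print(line)
# prints:
# apple : $0.25
# pear : $0.40
# peach : $10.00
```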

How is this working? The zip() function returns an iterator that produces a sequence of tuples, one for each pair of items:

('apple', 0.25)
('pear', 0.40)
('peach', 10.0)

You can then iterate over this collection using for ... in, just like with a list, a file, or a string.
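To see the tuples for yourself, you can pass the result of zip() to list(). The sketch below also shows that zip() hands back an iterator rather than a list, so it can only be looped over once. The names pairs, first_pass, and second_pass are made up for this illustration.

```python
fruits = ['apple', 'pear', 'peach']
prices = [0.25, 0.40, 10.0]

# zip() does not build a list; it returns an iterator that makes tuples on demand
pairs = zip(fruits, prices)

# list() pulls every tuple out of the iterator
first_pass = list(pairs)
print(first_pass)   # [('apple', 0.25), ('pear', 0.4), ('peach', 10.0)]

# the iterator is now used up, so a second pass finds nothing
second_pass = list(pairs)
print(second_pass)  # []
```

If you need to loop over the pairs more than once, either call zip() again or keep the list.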

You can zip() as many lists as you want

You don’t have to stop at zipping just two lists. Here is an example that zips three lists:

names = ['John', 'Juan', 'João', 'Giovanni']
ages = [23, 18, 24, 22]
majors = ['Chemistry', 'Animation', 'Sociology', 'Secondary Education']

for name, age, major in zip(names, ages, majors):
    print(f'{name} is {age} years old and studies {major}')

zip() stops zipping as soon as one of the lists runs out

If you zip these lists:

fruits = ['apple', 'pear', 'blueberry', 'grape', 'strawberry']
plant_types = ['tree', 'tree', 'bush', 'vine']

for fruit, plant_type in zip(fruits, plant_types):
    print(f'The {fruit} grows on a {plant_type}.')

Then you get this output:

The apple grows on a tree.
The pear grows on a tree.
The blueberry grows on a bush.
The grape grows on a vine.

Notice that we never get any output about strawberry because there is no accompanying plant type for this fruit.

zip() works with anything that is iterable

Since zip() works with anything that is iterable, you can use it to zip strings as well:

word1 = 'planter'
word2 = 'started'

for letter1, letter2 in zip(word1, word2):
    if letter1 == letter2:
        print(f'{letter1} == {letter2} ✅')
    else:
        print(f'{letter1} != {letter2}')

This will print:

p != s
l != t
a == a ✅
n != r
t == t ✅
e == e ✅
r != d
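The same loop can feed a counter if you only care how many positions match. This is a small sketch using the two words above; the variable matches is introduced just for this example.

```python
word1 = 'planter'
word2 = 'started'

matches = 0
for letter1, letter2 in zip(word1, word2):
    # count the positions where both words have the same letter
    if letter1 == letter2:
        matches += 1

print(matches)  # 3
```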

Example: top teams

You have a list of teams and four lists of games won, one list per season. In each season list, the item at a given position is the number of games won that season by the team at the same position in the teams list:

teams = ['ASU', 'BYU', 'UVU', 'UofU', 'USU', 'BSU']
season1 = [3, 12, 8, 11, 4, 7]
season2 = [2, 11, 3, 11, 2, 6]
season3 = [1, 10, 11, 11, 4, 12]
season4 = [3, 12, 7, 11, 1, 8]

You want to know the name and record of the winningest team, meaning the team that has won the most games, totaled over all seasons.

For example:

Top Team: BYU
Most Wins: 45
Wins Per Season: (12, 11, 10, 12)

The algorithm

You can solve this problem by thinking about it this way:

  • keep track of the most wins (starts at zero)
  • keep track of the top team (starts at None)
  • keep track of the top team’s record (starts at None)
  • iterate through tuples of (team, season1wins, season2wins, season3wins, season4wins) using zip
    • calculate the total wins for this team
    • if the total wins > most wins
      • set most wins = total wins
      • set top team to this team
      • set the top record to the wins for this team

The code

Here is code that uses this algorithm:

teams = ['ASU', 'BYU', 'UVU', 'UofU', 'USU', 'BSU']
season1 = [3, 12, 8, 11, 4, 7]
season2 = [2, 11, 3, 11, 2, 6]
season3 = [1, 10, 11, 11, 4, 12]
season4 = [3, 12, 7, 11, 1, 8]

most_wins = 0
top_team = None
top_record = None

# find the winningest team
# start by zipping the teams with the number of wins in each season
for name, s1, s2, s3, s4 in zip(teams, season1, season2, season3, season4):
    # each time through this loop we are looking at one team and all their wins in each of the four seasons
    # add up these wins to get the total wins across all 4 seasons
    total_wins = s1 + s2 + s3 + s4
    # check if this is larger than the most wins we have seen so far
    if total_wins > most_wins:
        # if this is the most wins we have seen so far, keep track of this team, their wins, and their record
        most_wins = total_wins
        top_team = name
        top_record = (s1, s2, s3, s4)

print()
print('Top Team:', top_team)
print('Most Wins:', most_wins)
print('Wins Per Season:', top_record)
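For comparison, here is a shorter sketch of the same idea that zips twice and lets the built-in max() do the comparison. It is an alternative, not a replacement for the loop above, and it uses a lambda, which you may not have seen yet. The names records and pair are made up for this illustration.

```python
teams = ['ASU', 'BYU', 'UVU', 'UofU', 'USU', 'BSU']
season1 = [3, 12, 8, 11, 4, 7]
season2 = [2, 11, 3, 11, 2, 6]
season3 = [1, 10, 11, 11, 4, 12]
season4 = [3, 12, 7, 11, 1, 8]

# first zip: one tuple of four win counts per team
records = list(zip(season1, season2, season3, season4))

# second zip: pair each team name with its record, then pick the pair
# whose record has the largest sum (max returns the first one if there is a tie)
top_team, top_record = max(zip(teams, records), key=lambda pair: sum(pair[1]))
most_wins = sum(top_record)

print('Top Team:', top_team)            # Top Team: BYU
print('Most Wins:', most_wins)          # Most Wins: 45
print('Wins Per Season:', top_record)   # Wins Per Season: (12, 11, 10, 12)
```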

Example: making reservations

You have a list of campsites and their availability. You and two other families want to go camping together for as long as possible. There is a trio of campsites right next to each other that you would like to reserve, but you need to determine which dates give you the longest time together.

Write a program that determines the dates in the longest window where all three sites are available.

The input to this program is three lists, each in this format:

[
    ('2022/07/04', 'Unavailable'),
    ('2022/07/05', 'Available'),
    ('2022/07/06', 'Available'),
    ('2022/07/07', 'Unavailable'),
    ('2022/07/08', 'Unavailable'),
]

Each list is the dates and the availability/unavailability for a single campsite. There are three lists, one for each campsite.

The output is a list of the format:

['2022/07/05', '2022/07/06']

This shows that the longest period of time where all three campsites are available is the period covering these two days.

The algorithm

  • keep track of the current window, meaning a list of consecutive dates on which all three campsites are available -> current_window
  • keep track of a list of all windows where all three campsites are available -> open_windows
  • loop through the zipped tuples (date, site_1_availability, site_2_availability, site_3_availability)
    • if all three sites are available
      • add this date to the current window
    • else
      • add the current window to the list of windows
      • reset the current window to an empty list
  • add the current window to the list of windows (covers the case where the last window is still open when we finish the loop)
  • set the max window to None
  • loop through all windows
    • if the max window is None or this window is longer than the max window
      • set the max window to this window
  • print the max window

The code

Here is code that implements this algorithm:

# we are going to keep track of every open window. An open window is a list of dates where all three sites
# are available
open_windows = []
# the current window is what we use to accumulate a list of dates where all three sites are available
current_window = []

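# loop through the zipped tuples (date, site1availability, site2availability, site3availability)
# this assumes the input has already been split into four parallel lists:
# dates holds the date strings, and site1, site2, and site3 hold the
# 'Available' / 'Unavailable' strings for each campsite
for date, s1, s2, s3 in zip(dates, site1, site2, site3):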
    # if all three sites are available...
    if s1 == s2 == s3 == 'Available':
        # ... add this date to the current window
        current_window.append(date)
    else:
        # ... otherwise, add the current window to our list of open windows
        open_windows.append(current_window)
        # and reset the current window to be empty
        current_window = []
# when we exit the loop, the last window may still be open, so add it to the list of windows
open_windows.append(current_window)

# Find the longest window out of all of the open windows
# Note that open_windows is a list of lists! We need to loop through it and find the longest list
max_window = None
for window in open_windows:
    if max_window is None or len(window) > len(max_window):
        max_window = window

# Print the dates in the maximum open window
print(max_window)
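The code above refers to dates, site1, site2, and site3 without showing where they come from. The sketch below is one way to run the same algorithm directly on three lists in the input format, by unpacking each (date, availability) tuple inside the for statement. The function name longest_open_window and the variable names are hypothetical. The data for campsite1 is the sample list shown earlier; the data for campsite2 and campsite3 is made up purely so the example can run. It assumes all three lists cover the same dates in the same order.

```python
def longest_open_window(campsite1, campsite2, campsite3):
    """Return the longest run of dates on which all three sites are available."""
    open_windows = []
    current_window = []
    # each item from zip is a tuple of three (date, status) tuples, one per campsite;
    # the date is taken from the first campsite and the others are ignored with _
    for (date, s1), (_, s2), (_, s3) in zip(campsite1, campsite2, campsite3):
        if s1 == s2 == s3 == 'Available':
            current_window.append(date)
        else:
            open_windows.append(current_window)
            current_window = []
    # the last window may still be open when the loop ends
    open_windows.append(current_window)
    # max with key=len returns the first window that has the greatest length
    return max(open_windows, key=len)


# campsite1 is the sample from the text; campsite2 and campsite3 are invented
campsite1 = [
    ('2022/07/04', 'Unavailable'),
    ('2022/07/05', 'Available'),
    ('2022/07/06', 'Available'),
    ('2022/07/07', 'Unavailable'),
    ('2022/07/08', 'Unavailable'),
]
campsite2 = [
    ('2022/07/04', 'Available'),
    ('2022/07/05', 'Available'),
    ('2022/07/06', 'Available'),
    ('2022/07/07', 'Available'),
    ('2022/07/08', 'Unavailable'),
]
campsite3 = [
    ('2022/07/04', 'Unavailable'),
    ('2022/07/05', 'Available'),
    ('2022/07/06', 'Available'),
    ('2022/07/07', 'Available'),
    ('2022/07/08', 'Available'),
]

result = longest_open_window(campsite1, campsite2, campsite3)
print(result)  # ['2022/07/05', '2022/07/06']
```

If no date works for all three sites, this sketch returns an empty list.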