Nested Data Structures and Nested Loops

A loop within a loop… within a loop.

Diane Tunnicliffe
Towards Data Science


Photo by Karen Ciocca on Unsplash

I am in my second week of the part-time data science program at Flatiron School, and so far everything has been going great! I made the decision to pursue a career in data science after many years of deliberating over what career path would suit me best. I have a dual Bachelor’s degree in both Psychology and Philosophy, but until recently, I had yet to find a realm within those fields that I am particularly passionate about. Thanks to the Covid-19 pandemic and months of quarantine, I have finally been able to pursue my interests in technology, statistics, and coding at Flatiron. I love my instructors and my cohort of friendly peers, and I am finding that I have a good grasp on the material so far.

There is one thing that has thrown me for a loop though (pun definitely intended): nested loops in Python.

At the onset of this program, we were encouraged to blog about aspects of the material that we find challenging, both to deepen our own understanding and to help others learn. Nested loops were tricky for me at first because they are loops inside of other loops. I have found that it helps a lot to break the structure down and pay attention to just one loop at a time as you build your way up to the nested loop.

The work that I did to write this post also involved an understanding of dictionaries in Python and list comprehension, two topics which were also a bit difficult at first. As with most obstacles, the more I’ve worked with them, the easier they have become. So with that in mind, here is some work with nested data in Python using nested loops.

I will start with some data that represents my family.

my_family = [
    {
        "family_name": "Tunnicliffe",
        "num_people": 4,
        "local": True,
        "city": "Bethpage, NY",
        "date_established": 2014,
        "names": ["Diane", "Steve", "Dylan", "Landon"],
        "number_of_children": 2,
        "children": [
            {
                "name": "Dylan",
                "age": 5,
                "favorite_color": "black",
                "nickname": "Dillybeans",
                "loves": "Super Mario",
            },
            {
                "name": "Landon",
                "age": 2,
                "favorite_color": "blue",
                "nickname": "Landybean",
                "loves": "trucks",
            }
        ]
    },
    {
        "family_name": "Agulnick",
        "num_people": 5,
        "local": False,
        "city": "Newton, MA",
        "date_established": 1987,
        "names": ["Ellen", "Mark", "Diane", "Joshua", "Allison"],
        "number_of_children": 3,
        "children": [
            {
                "name": "Diane",
                "age": 31,
                "favorite_color": "pink",
                "nickname": "Dini",
                "loves": "unicorns",
            },
            {
                "name": "Joshua",
                "age": 28,
                "favorite_color": "red",
                "nickname": "Joshie",
                "loves": "trains",
            },
            {
                "name": "Allison",
                "age": 26,
                "favorite_color": "purple",
                "nickname": "Alli",
                "loves": "candy",
            }
        ]
    }
]

Above, I have used the variable my_family to store a list that contains two dictionaries: one for the family unit that I live with (myself, my husband, and our two kids), and one for the family unit that I originated from (my parents, myself, and my siblings). A dictionary in Python is an object that stores data in key: value pairs. So in the first dictionary, for the key of "family_name", we have the value of "Tunnicliffe". This becomes useful when using keys and values to access information, especially in a nested loop.

Photo by Marco Secchi on Unsplash

Now, let’s say that I want to access the city that I live in. To obtain this data, I would call on the first dictionary and look for the value associated with the key of "city":

print(my_family[0]['city'])

and in response, I’d get:

Bethpage, NY

This works because I know that the data representing the family I live in is stored as the first item in the list, a dictionary, indexed as my_family[0]. And since 'city' is a key in that dictionary, it is clear and easy to say, for instance:

print(f"I live in {my_family[0]['city']}.")

and we’d have:

I live in Bethpage, NY.
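To make the two lookups in that expression easier to see, here is a small sketch that splits them into separate steps. It uses a trimmed copy of the data above, and the variable names first_unit and city are just illustrative choices of mine.

```python
# Trimmed copy of the my_family data above (only the keys this example needs).
my_family = [
    {"family_name": "Tunnicliffe", "city": "Bethpage, NY"},
    {"family_name": "Agulnick", "city": "Newton, MA"},
]

first_unit = my_family[0]    # step 1: index the list to get a dictionary
city = first_unit["city"]    # step 2: look up a key in that dictionary

print(type(my_family).__name__)   # list
print(type(first_unit).__name__)  # dict
print(city)                       # Bethpage, NY
```

Writing my_family[0]['city'] simply chains these two steps together on one line.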

Alright, so that makes sense and is pretty straightforward. Now, what if I want to create a list of names of everyone in both families, as one combined list? For that, I can make use of a nested for loop:

names = []
for unit in my_family:
    for name in unit['names']:
        names.append(name)
print(names)

There it is. A loop within a loop. The outer loop is designated by for unit in my_family:, which accesses both dictionaries in my list contained in my_family. And for name in unit['names'] is the inner loop, which accesses all the values of the key names for both dictionaries. Finally, our append method means that for each family unit dictionary in the list my_family, and for each name in the listed names for that family unit, add the names to a list called names. So in response, we get:

['Diane', 'Steve', 'Dylan', 'Landon', 'Ellen', 'Mark', 'Diane', 'Joshua', 'Allison']
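Since list comprehension came up earlier, here is a sketch of how the same nested loop could be written as a single comprehension. This is an alternative form, not a replacement for the loop version, and it uses a trimmed copy of the data so that it runs on its own.

```python
# Trimmed copy of the my_family data above (only the "names" key is needed).
my_family = [
    {"names": ["Diane", "Steve", "Dylan", "Landon"]},
    {"names": ["Ellen", "Mark", "Diane", "Joshua", "Allison"]},
]

# The for clauses read left to right in the same order as the nested loops:
# outer loop first (unit), then inner loop (name).
names = [name for unit in my_family for name in unit["names"]]
print(names)
```

The comprehension is shorter, but the explicit nested loop is often easier to read while you are still getting used to the structure.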

Cool! Now we’re getting somewhere. It is worth noting that where I wrote unit and name, you could put any variables that you want to represent elements. For example, I could have written for blob in my_family or for narwhal in unit['names']. But the goal here is for things to make more sense, not less, so I went with more logical (though less fun) choices for variable names. Let’s take it another step further. Now I want a list of just the children’s names. I could obtain that this way:

children = []
for unit in my_family:
    for child in unit['children']:
        children.append(child['name'])
print(children)

And I’d have:

['Dylan', 'Landon', 'Diane', 'Joshua', 'Allison']

Or if I wanted a list of their nicknames:

children = []
for unit in my_family:
    for child in unit['children']:
        children.append(child['nickname'])
print(children)

I would see this list:

['Dillybeans', 'Landybean', 'Dini', 'Joshie', 'Alli']

Please note that I am using the word “children” quite loosely here, as I have listed myself and my siblings as children, and we are all technically adults. (Although we may not feel like adults, and our interests, designated by the key 'loves': in our dataset, certainly illustrate this point.) If I wanted to find the ages of just the children in the family unit that I originated from, it helps to remember that we designated this unit as the one that is not local (geographically speaking). So to access this information, I could write something like this:

child_ages = []
for unit in my_family:
    # Only look inside the family unit that is not local
    if not unit['local']:
        for child in unit['children']:
            name_and_age = child['name'], child['age']
            child_ages.append(name_and_age)
print(child_ages)

And we would see a list of the names and ages of the children in the non-local family unit:

[('Diane', 31), ('Joshua', 28), ('Allison', 26)]

Turns out my siblings and I are, in fact, adults.
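The condition can also sit inside the inner loop and test each child rather than each family unit. As a sketch, the version below collects everyone listed under 'children' who is at least 18. The cutoff of 18 and the name ADULT_AGE are my own assumptions for illustration, and the data is a trimmed copy of the original.

```python
# Trimmed copy of the my_family data above (only the keys this example needs).
my_family = [
    {"children": [{"name": "Dylan", "age": 5},
                  {"name": "Landon", "age": 2}]},
    {"children": [{"name": "Diane", "age": 31},
                  {"name": "Joshua", "age": 28},
                  {"name": "Allison", "age": 26}]},
]

ADULT_AGE = 18  # assumed cutoff, not part of the original data

adults = []
for unit in my_family:
    for child in unit["children"]:
        # The filter is applied to each child, not to the family unit
        if child["age"] >= ADULT_AGE:
            adults.append((child["name"], child["age"]))

print(adults)  # [('Diane', 31), ('Joshua', 28), ('Allison', 26)]
```

With this data the result matches the non-local list, but it would still work if a family unit contained a mix of adults and kids.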

Just to reinforce our understanding of how this all works, let’s do a couple more. Suppose I want to find out which child is really interested in Super Mario. I could do that like this:

loves_Mario = None
for unit in my_family:
    for child in unit['children']:
        if child['loves'] == 'Super Mario':
            loves_Mario = child['name']
print(f"{loves_Mario} really loves Super Mario.")

Photo by Cláudio Luiz Castro on Unsplash

And our printed response would be:

Dylan really loves Super Mario.

It’s true! My son, Dylan, is absolutely obsessed with Super Mario. He’s wearing Super Mario pajamas and Bowser socks as I write this. Note that the == operator is used here to check whether the value stored under the key 'loves' for a given child is equal to the string 'Super Mario'. When the two are equal, the child’s name is saved. The same pattern could be used to access any information, from interests to favorite colors. For instance:

loves_trains = None
loves_trains_age = None
for unit in my_family:
    for child in unit['children']:
        if child['loves'] == 'trains':
            loves_trains = child['nickname']
            loves_trains_age = child['age']
print(f"{loves_trains} is still very much into trains, even at the age of {loves_trains_age}.")

And we have:

Joshie is still very much into trains, even at the age of 28.

Or:

likes_blue = None
for unit in my_family:
    for child in unit['children']:
        if child['favorite_color'] == 'blue':
            likes_blue = child['name']
print(f"{likes_blue}'s favorite color is blue.")

And our answer is:

Landon's favorite color is blue.
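Since the last three examples share the same shape, the pattern could be wrapped in a small function. The sketch below is one way to do it. The function name find_children and its parameters are hypothetical, made up for this illustration, and the data is a trimmed copy of the original.

```python
# Trimmed copy of the my_family data above (only the keys this example needs).
my_family = [
    {"children": [
        {"name": "Dylan", "favorite_color": "black", "loves": "Super Mario"},
        {"name": "Landon", "favorite_color": "blue", "loves": "trucks"},
    ]},
    {"children": [
        {"name": "Diane", "favorite_color": "pink", "loves": "unicorns"},
        {"name": "Joshua", "favorite_color": "red", "loves": "trains"},
        {"name": "Allison", "favorite_color": "purple", "loves": "candy"},
    ]},
]


def find_children(families, key, value):
    """Return the names of all children whose `key` equals `value`.

    Hypothetical helper written for this post, not a built-in.
    """
    matches = []
    for unit in families:
        for child in unit["children"]:
            # .get() returns None instead of raising if a key is missing
            if child.get(key) == value:
                matches.append(child["name"])
    return matches


print(find_children(my_family, "loves", "trains"))         # ['Joshua']
print(find_children(my_family, "favorite_color", "blue"))  # ['Landon']
```

Returning a list means the function can report several matches, or none at all, whereas the single-variable versions above keep only the last match they find.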

Once you get the hang of accessing the nested information, it becomes as logical as accessing any other dictionary item in Python. To wrap this up, let’s see if we can find the oldest and youngest children in my_family. I will be using nested loops and sorting for this, a task that gave me a pretty serious headache in a previous lab. But, I’ve had a little more practice now, so let’s give it a shot.

oldest_child = None
youngest_child = None
children = []
for unit in my_family:
    for child in unit['children']:
        children.append(child)
# Sort by age, oldest first
sorted_children = sorted(children, key=lambda child: child['age'], reverse=True)
oldest_child = sorted_children[0]['name']
youngest_child = sorted_children[-1]['name']
print(f"The oldest child is {oldest_child}. The youngest child is {youngest_child}.")

Drumroll please…

The oldest child is Diane. The youngest child is Landon.
Myself, my husband, and my kids. My (local) family, referenced in data above as my_family[0].
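As a possible alternative to sorting the whole list for the oldest and youngest lookup, Python's built-in max() and min() accept the same kind of key function as sorted(). The sketch below uses a trimmed copy of the data, and the variable names are my own.

```python
# Trimmed copy of the my_family data above (only the keys this example needs).
my_family = [
    {"children": [{"name": "Dylan", "age": 5},
                  {"name": "Landon", "age": 2}]},
    {"children": [{"name": "Diane", "age": 31},
                  {"name": "Joshua", "age": 28},
                  {"name": "Allison", "age": 26}]},
]

# Same nested loop as before: gather every child dictionary into one list
children = []
for unit in my_family:
    for child in unit["children"]:
        children.append(child)

# max() and min() compare the children by the value the key function returns
oldest_child = max(children, key=lambda child: child["age"])["name"]
youngest_child = min(children, key=lambda child: child["age"])["name"]

print(f"The oldest child is {oldest_child}. The youngest child is {youngest_child}.")
```

Sorting is still the better fit if you want the full ranking by age. max() and min() are just a shortcut when only the two ends matter.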

Well, the code doesn’t lie. I am the oldest child in the dataset, and my youngest son, Landon, is the youngest. Considering how frustrating this whole nested loops concept was for me just a few days ago, I have to say that writing this was actually more fun than I expected. (Not sure how my siblings will feel about this adventure, so let’s just not tell them about the trains, the candy, and listing their ages on the internet.)

I think the most important thing I have learned so far is that when it comes to Python, any problem is solvable with a combination of information (be it from class materials, instructors and peers, or Google) and lots of practice. I’m at the point where I can look at a block of code and typically figure out what the output of it will be. When it comes to nested data, it can all be broken down into tiny pieces, line by line, to figure out exactly what information you are trying to access and access it.
