By Ryan McBride in software engineering

A Crash Course in Data Structures

Why Your Phone Doesn't Freeze When Searching 10,000 Contacts
Imagine you have a massive collection of Spotify playlists, thousands of contacts in your phone, and a mountain of unread texts. When you search for your friend "Mario" in your contacts, your phone finds him instantly. It doesn't freeze, it doesn't panic, and it doesn't take five minutes.

How does it do that? The secret is something computer scientists call Data Structures.

A data structure is just a specific way of organizing information inside a computer’s memory. But here is the golden rule of computer science: You can’t have it all. There is always a trade-off between Time (speed) and Space (memory). If you want your app to be lightning-fast, you usually have to sacrifice some memory. If you want to save memory, your app might run a little slower.

Here is the ultimate breakdown of how we organize data to balance this out, from the clunkiest methods to the absolute holy grail of coding.


Arrays: The Rigid Speedster
Think of an array like a row of school lockers physically bolted together. Because the lockers sit side by side in one contiguous block, the computer can jump straight to locker #1 or locker #500 in a single step. And if you keep the contents in alphabetical order, finding something is incredibly fast: jump to the middle, see if you need to go left or right, and keep chopping the problem in half until you find what you need. (This is called Binary Search.)

The Catch: Arrays are stubborn. When you build one, you have to tell the computer exactly how many lockers you need. If you ask for 50 lockers, and a 51st friend comes along... you’re out of luck. You have to ask the computer to build a brand new, bigger row of 100 lockers, and then painstakingly move all 50 of your friends into the new row. It wastes a ton of time.
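Here is a minimal sketch of that halving trick in Python. (In real code you would reach for the built-in `bisect` module; this hand-rolled version just makes the left/right chopping visible.)

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2              # jump to the middle locker
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1                  # target must be to the right
        else:
            hi = mid - 1                  # target must be to the left
    return -1

contacts = ["Ash", "Link", "Luigi", "Mario", "Peach", "Zelda"]
print(binary_search(contacts, "Mario"))   # index 3, found in two hops
```

Six contacts take at most three hops; a million contacts take about twenty. That is the power of halving.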

Linked Lists: The Flexible Slowpoke
To fix the locker problem, programmers created the Linked List. Imagine a scavenger hunt. Instead of putting everyone in a neat row, you put your friends wherever there is empty space in the computer's memory. To keep track of them, you give the first friend a piece of paper (a "pointer") with the address of the second friend. The second friend has the address of the third, and so on.

The Catch: Adding a new friend is super easy—just write their address on the last person's paper. But searching? It’s a nightmare. You can’t just jump to the middle anymore. If you want to find friend #500, you have to ask friend #1, who points to friend #2, who points to friend #3... all the way to #500. It is incredibly slow.

(Side note: Programmers use these to build things like Queues—which work like a line at a store where the first person in is the first person out—and Stacks—which work like a pile of dirty laundry where the last shirt you threw on top is the first one you pick up!)
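The scavenger hunt can be sketched in a few lines of Python. (Note: Python's built-in `list` is not a linked list under the hood; the `Node` class here is purely illustrative.)

```python
class Node:
    """One friend in the scavenger hunt: a value plus the next address."""
    def __init__(self, value):
        self.value = value
        self.next = None          # the "piece of paper" pointing onward

def find(head, target):
    """Walk the chain one node at a time -- no jumping to the middle."""
    steps, node = 0, head
    while node is not None:
        steps += 1
        if node.value == target:
            return steps          # how many hops it took
        node = node.next
    return -1

# Build the chain: "Mario" -> "Luigi" -> "Peach"
head = Node("Mario")
head.next = Node("Luigi")
head.next.next = Node("Peach")
print(find(head, "Peach"))        # 3 hops: we had to visit everyone
```

Notice the trade: inserting a node is just one pointer update, but finding the last node means touching every node before it.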

Binary Search Trees: The Mashup
We want the lightning speed of the Array, but the easy-to-grow flexibility of the Linked List. Enter the Binary Search Tree.

Imagine a family tree, but with numbers. At the very top is a "root" number. Every number smaller than the root goes down a branch to the left. Every number bigger goes down a branch to the right.

If you are looking for the number 5, and the root is 10, you know instantly to ignore the entire right side of the tree. Just like the array, you are chopping the problem in half every single step! And just like the linked list, if you want to add a new number, you just draw a new branch.

The Catch: To make this work, every single item needs arrows pointing to its left and right "children." Each item now stores two pointers alongside its value, which can roughly triple the memory used per item. Space sacrificed for speed.
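Here is a minimal sketch of both halves of the mashup: growing a branch is as easy as a linked-list insert, and searching chops the problem in half like an array.

```python
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None    # branch for smaller values
        self.right = None   # branch for bigger values

def insert(root, value):
    """Grow a new branch -- linked-list-style flexibility."""
    if root is None:
        return TreeNode(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

def contains(root, value):
    """Chop the problem in half at every step -- array-style speed."""
    while root is not None:
        if value == root.value:
            return True
        root = root.left if value < root.value else root.right
    return False

root = None
for n in [10, 5, 15, 3, 7]:   # 10 becomes the root
    root = insert(root, n)
print(contains(root, 5))       # True: went left once from 10
print(contains(root, 12))      # False: the left half was never touched
```

One caveat the article skips: this only stays fast if the tree stays roughly balanced, which is why real libraries use self-balancing variants.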

Hash Tables: The Holy Grail
Even chopping a problem in half takes some time. What if we want to find Mario literally instantly? We use a Hash Table. This is the "Swiss Army Knife" of coding.

A hash table uses a math trick called a "hash function." Imagine taking your contacts and sorting them into 26 buckets, labeled A through Z. When you type in "Mario," the hash function looks at the "M" and knows instantly to drop him in the 13th bucket. Boom. Instant.

But what if you add "Luigi" and "Link"? They both start with L. They belong in the same bucket. This is called a collision. To fix this, programmers make each bucket a tiny Linked List. So, you instantly jump to the "L" bucket, and then do a super quick search through the few "L" names you have.

This is actually how restaurants like Sweetgreen organize pickup orders! They have shelves labeled A-D, E-J, etc. They "hash" your salad to a specific shelf so you can find it instantly without checking every bowl in the store.
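The 26-bucket idea can be sketched directly. (This first-letter hash function is a toy to match the analogy; real hash functions mix every character of the key so names spread across buckets far more evenly.)

```python
def bucket_for(name):
    """Toy hash function: first letter only, A=0 ... Z=25."""
    return ord(name[0].upper()) - ord("A")

buckets = [[] for _ in range(26)]   # each bucket holds a tiny chain

for name in ["Mario", "Luigi", "Link", "Peach"]:
    buckets[bucket_for(name)].append(name)

# "Luigi" and "Link" collide into the same bucket:
print(buckets[bucket_for("Link")])  # ['Luigi', 'Link']

def lookup(name):
    """Jump straight to one bucket, then scan its short chain."""
    return name in buckets[bucket_for(name)]

print(lookup("Mario"))              # True, after checking one bucket
```

Python's own `dict` is exactly this idea done properly, which is why `contacts["Mario"]` feels instant.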

Tries: The Insane Overkill
If you absolutely must have the fastest speed imaginable and you don't care about wasting memory, you use a Trie (pronounced "try"). It’s a massive web of arrays.

To find the word "CAT", the computer goes to the C bucket, which points to an A bucket, which points to a T bucket. Three letters = three steps. It doesn't matter if your dictionary has 10 words or 10 million words. Finding "CAT" will always take exactly 3 steps. It is the ultimate speed.

The Catch: You end up with thousands of empty buckets for weird letter combinations that don't exist (like "ZZZ"). It wastes an astronomical amount of memory.
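A minimal sketch of the "CAT in three steps" walk, using a dict of children at each node for brevity (a textbook trie would give every node a 26-slot array, most slots empty, which is exactly the memory waste described above):

```python
def make_node():
    """Each node maps a letter to its child node."""
    return {}

def add_word(root, word):
    node = root
    for ch in word:
        node = node.setdefault(ch, make_node())
    node["$"] = True                # marker: a full word ends here

def has_word(root, word):
    node = root
    for ch in word:                 # one step per letter, no matter
        if ch not in node:          # how big the dictionary is
            return False
        node = node[ch]
    return "$" in node

root = make_node()
for word in ["CAT", "CAR", "DOG"]:
    add_word(root, word)
print(has_word(root, "CAT"))        # True: C -> A -> T, three steps
print(has_word(root, "CA"))         # False: a prefix, not a stored word
```

Whether the trie holds 3 words or 3 million, looking up "CAT" is always exactly three steps.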

The Bottom Line
There is no "perfect" way to organize data. If you are building a video game where speed is everything, you might burn through memory using Hash Tables or Tries. If you are coding for a cheap smartwatch with very little memory, you might use Linked Lists. Being a good programmer is all about knowing which tool to pull out of your toolbox!