If you have only recently started programming, you might not even know what data structures and algorithms are, even though I can guarantee you that you have already used them.
One of the common problems of most online tutorials compared to going to school is that they have a lack of theoretical knowledge and focus only on the practice. Data structures and algorithms are the 101’s of programming, and all beginners should at least be aware of what they are.
What are algorithms and data structures?
Algorithms
Let’s start with the most obvious of the two: algorithms. Most of the pieces of code that you will write are algorithms. An algorithm is a concept where you take an initial input, transform it by running some instructions step by step, and generate a final output. A concrete example of algorithm would be asking the user for multiple numbers, reading those numbers, summing them, and returning the result of that sum.
To learn more about algorithms, I recommend the excellent free class Algorithms: Design and Analysis, by a Stanford professor.
Data structures
A data structure is a way to store data in the memory of the computer, during the duration of the execution of an algorithm. You might not know it, but you have already used one even if you are new to programming: arrays. To use the previous example of summing multiple numbers, those numbers would be stored in the RAM of the computer during the execution of your algorithm, and you would most likely use an array to store them before processing the sum.
Many data structures exist beside the arrays, and they all serve a different purpose, and should be used appropriately to maximize the performances of your algorithms. For example, it is easy to add a new element at the end of an array, but to add an element in the middle it is more complex, and you have to shift all the elements by one cell in the array. A better data structure for this use case is the “list”, which allows you to easily add or remove an element from anywhere in the list.
The most commonly used data structures are lists, sets, maps (or associative array) and arrays, but there are many others for different use cases. You can also use this huge 8 hours video that presents data structures in depth: Data Structures Easy to Advanced Course – Full Tutorial from a Google Engineer.
Why are algorithms and data structures important?
From the previous section, it should be clear that those are the foundations of programming and software engineering, and you just cannot make a program without using them. However, it might not be as obvious why different data structures should be used in different situations, and why algorithms require classes to teach about them, when “it’s just logic”.
To reuse an example from above, let’s say that you are storing some numbers in an array, and you need to add a new number. It’s fine if it has to be added at the end, you can probably just call a method array.push(<new value>)
or something similar. But what if you have to add it in the middle of the array? There’s no method for that, and you have to code it yourself (depending on the language):
void pushAtIndex(int[] array, int valueToInsert, int insertPosition) {
int precedentValue = array[insertPosition];
array[insertPosition] = valueToInsert;
for (int i = insertPosition + 1; i < array.size(); i++) {
int tmp = array[i];
array[i] = precedentValue;
precedentValue = tmp;
}
}
The real problem is not that you have to write it yourself, but as you can see, this algorithm has to go through the whole array to just add one value in the middle. But it’s going to run in microseconds anyway, so it doesn’t matter, right? Wrong. It will run in microseconds (even nanoseconds probably) for a small array with a few values. But what if you have millions, or even billions of values in your array? It will take time. Too much time.
This is where the list comes into play. If you have followed at least the first few chapters of the Algorithms class I recommended earlier, you know that different algorithms have different time complexity (the theoretical time it takes to execute it). Arrays have what’s called a linear time complexity to add a new element which means the bigger the array, the longer it will take to add a new value. On the other hand, the list has a constant time complexity to add a new value, which means it will always take as much time (nanoseconds) to add a new value, no matter the size of the list.
That’s amazing! Goodbye arrays, let’s use list all the time, right? Wrong. While it is faster to insert or delete from a list, it is faster to access to a specific value in an array! If you have an array and a list of 1 billion numbers and you want to know what the last value is you will have to go through the whole list until you reach the last element (linear time complexity). However, with an array, you will access the space in memory where the last value is stored in a single operation (constant time complexity).
I hope this example shows the necessity to have a good knowledge of data structures and algorithms. It might seem like a low impact at first, but as the software you build grows, you will need to think about optimization, execution time, etc.
How to practice?
As always, practice makes perfect, and you should get used to the different data structures and the different strategies to adopt when designing algorithms (watch the Stanford class). It will come with time, but I recommend one way to practice: LeetCode.
LeetCode is a website that offers algorithm challenges, where you will have an exercise to complete with an algorithm to write. The main interest, when using it to practice algorithms and data structures, is that your solution will be compared to the solution of everyone else, and you will see if the execution time was better or worst.
Especially when starting out, you will be able to solve the problems using basic algorithms, without leveraging properly the data structures, but you will see that compared to other people, your solution will be extremely slow. You can then look into the discussion part of the website to find betters way to solve the problem, explained by other users.
LeetCode is also a great way to practice interviews for software development jobs. I used it when practicing for my Amazon interviews, and it proved to be invaluable!
If you are interested, you can see my profile on LeetCode, to see which problems I have solved, etc.
This could be considered the 3rd article in my series about learning programming, and it is really programming 101. It’s a subject that isn’t really explicitly talked about when following online tutorials, but it is usually the first class that you will attend if you go to a Computer Science school.
It is extremely important to be comfortable with the different data structures, as they will make your programs more efficient, but also easier to code if you master those concepts.