
Boost Your Algorithmic Skills: Solving the Two Sum Problem

Posted on: October 15, 2023

Are you eager to boost your algorithmic skillset? It’s time to learn the Two Sum problem, a classic problem that is sometimes asked in technical interviews.

Introduction to the Two Sum Problem

The task is straightforward: given an array of numbers and a target sum, find the two numbers in the array that add up to the target. The problem tests your ability to find a pair of values that satisfies a specific condition.

Now, let’s get into the strategies to solve this problem. We’ll start by defining our objective and creating a simple test suite to guide our implementation. Let the dance begin!

Defining our objective: test-driven development (TDD) to the rescue

Any algorithm or coding problem can be difficult to implement at first. However, we can smooth out the road by using test driven development principles.

To start, we create a test suite that depicts the problem, much like how we start a video game with a rather clear roadmap 🚗 in mind to complete that level and, eventually, defeat the ultimate boss 🦹.

These tests will fail at first ❌, but that’s okay! 👍 It’s bound to happen, and we should embrace this kind of failure, as it’s a very good learning experience.

So, as with completing the levels of a video game 🎮, passing our tests is our main objective: going from a failing test to a passing one is like leveling up. We discover more about the problem and the reasoning needed to conquer it as we pass more tests.

This approach helps us to make slow, steady progress toward our goal, ensuring that we are on the correct track and satisfying the challenge’s requirements.

Let’s define our two simple tests:

describe('twoSum', () => {
  it('returns the indices of two values that sum up to the target', () => {
    const numbers = [3, 5, 8, 11];
    const target = 14;

    const result = twoSum(numbers, target);

    expect(result).toEqual([0, 3]);
  });

  it('returns an empty array if no two elements sum up to the target', () => {
    const numbers = [3, 5, 8, 11];
    const target = 7;

    const result = twoSum(numbers, target);

    expect(result).toEqual([]);
  });
});

As we just mentioned, our ultimate goal now is to complete the level and defeat the final boss; so, at this point, we are not worried at all about the algorithm’s performance. We’ll focus our attention on turning those tests green ✅! Only once we have reached that goal will we make it more performant. Time to warm up the engines! 🔥

function twoSum(numbers: number[], target: number): number[] {
  // We loop through each element in the array
  for (let firstIndex = 0; firstIndex < numbers.length; firstIndex++) {
    const firstValue = numbers[firstIndex];

    /*
    We loop through each element again,
    starting from the element after the current one,
    since the earlier pairs have already been checked
    */
    for (let secondIndex = firstIndex + 1; secondIndex < numbers.length; secondIndex++) {
      const secondValue = numbers[secondIndex];

      // If the sum of the two elements equals the target, we return their indices
      if (firstValue + secondValue === target) return [firstIndex, secondIndex];
    }
  }

  // If no elements add up to the target, we return an empty array
  return [];
}

Summary of the steps of the Brute Force approach:

  1. Start
  2. We loop through every element in the array
  3. For each item, we iterate over the remaining items in the array
  4. We check if the sum of the two items equals the target attribute
  5. If the sum equals the target we return the indices of the two values
  6. We repeat steps 2 through 5 until we find the target sum or all possible pairs have been checked
  7. Finish
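To sanity-check the steps above, here is the brute-force function in a condensed, standalone form (my own compacting of the implementation we just wrote), run against the two test cases from our suite:

```typescript
// Brute-force twoSum, repeated here so the snippet runs on its own
function twoSum(numbers: number[], target: number): number[] {
  for (let firstIndex = 0; firstIndex < numbers.length; firstIndex++) {
    for (let secondIndex = firstIndex + 1; secondIndex < numbers.length; secondIndex++) {
      // Return the pair of indices as soon as their values sum to the target
      if (numbers[firstIndex] + numbers[secondIndex] === target) {
        return [firstIndex, secondIndex];
      }
    }
  }
  // No pair found
  return [];
}

console.log(twoSum([3, 5, 8, 11], 14)); // [0, 3]
console.log(twoSum([3, 5, 8, 11], 7));  // []
```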

Okay, so picture this: the approach we just saw iterates over each pair of items in the array. Sounds simple enough, right? But here is the kicker: this implementation carries a time complexity of O(n²), where n is the number of items in the array. 💣 Boom! That’s some serious computational cost right there.

Put simply, this approach basically goes on a mad rampage through the array, checking every possible combination of two integers to see if their sum equals the value of the target. It’s like a wild gorilla 🦍 that pounces on each integer, comparing it with every other one in the array until it’s satisfied, or until it’s exhausted. And let me tell you, this can take a lot of time ⏳; especially for those beastly, gigantic arrays.

More in detail, the outer loop runs once for each element of the input array. If there are n elements in the array, the outer loop runs n times. This gives us a starting point of O(n).

Inside the outer loop, there’s an inner loop. For each iteration of the outer loop, the inner loop runs through the remaining elements of the array. On the first iteration of the outer loop, the inner loop runs n - 1 times. On the second iteration, it runs n - 2 times, and so on. In the worst-case scenario, where the outer loop runs for every single element, we sum up the number of times the inner loop runs: (n - 1) + (n - 2) + ... + 1. This is the arithmetic series that adds up to n * (n - 1) / 2.

If you’re wondering why we just divided by 2, here is why.

Imagine you’ve lined up a group of ninjas to take the ultimate sparring training, and there are n ninjas. You want every pair of ninjas to get into a sparring match and only once. So, the first ninja spars with everyone else, then steps out. The second ninja spars with everyone else except the first ninja, since they’ve already fought, and then the second ninja steps out, too.

If we think about this in terms of sparring matches:

  • The first ninja has n - 1 fights.
  • The second ninja has n - 2 fights (because the first ninja isn’t there anymore).
  • The third ninja has n - 3 fights.
  • The last ninja doesn’t have anyone left to fight with.

Now, why do we divide by 2? Because if we simply counted n - 1 fights for every one of the n ninjas, we’d be counting each fight twice! Every fight involves two ninjas, so it would show up once in each ninja’s tally.

To avoid this double counting, we realize that we can actually just take the total number of possible fights if everyone fought with everyone (including double-counted fights), which would be n * (n - 1) (everyone fights with n - 1 others), and then cut that number in half to correct for the double counting. This is why we divide by 2.

Hence, the total number of unique fights (or steps in our inner loop) is n * (n - 1) / 2: all possible pairs, with each pair counted only once.

When calculating Big O notation, we look for the most significant term, and we drop the constants and lower-order terms. Expanding n * (n - 1) / 2 gives n² / 2 - n / 2. As n gets very large, constants and coefficients become insignificant to the growth rate, and the dominant term is n². So we say the function twoSum has a time complexity of O(n²).
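We can verify the n * (n - 1) / 2 count empirically with a small helper (a hypothetical `pairComparisons` function, not part of the solution) that counts how many times the inner comparison would run:

```typescript
// Count how many pair comparisons the brute-force double loop performs
// for an array of length n
function pairComparisons(n: number): number {
  let count = 0;
  for (let i = 0; i < n; i++) {
    // Inner loop starts just past i, exactly like the brute-force solution
    for (let j = i + 1; j < n; j++) {
      count++;
    }
  }
  return count;
}

// Matches the closed form n * (n - 1) / 2
console.log(pairComparisons(4));  // 6  (= 4 * 3 / 2)
console.log(pairComparisons(10)); // 45 (= 10 * 9 / 2)
```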

So, ease sometimes comes at a price. That’s why we need to bring in the big guns 🔫: the hash table and two-pointer approaches. These bad boys pack a punch and can deliver some serious performance gains. So, if you want to up your computing game, don’t be afraid to explore these alternative approaches, beginning with the hash table. Believe me, your computer will thank you!

The Hash Table Approach

function twoSum(numbers: number[], target: number): number[] {
  // Create a new empty Map to store the values and indices
  const hashMap = new Map<number, number>();

  // Loop through each element in the numbers array
  for (let index = 0; index < numbers.length; index++) {

    // Get the current value of numbers array at the current index
    const currentValue = numbers[index];

    // Calculate the remaining value needed to reach the target
    const rest = target - currentValue;

    // Check if the remaining value exists in the hashMap
    if (hashMap.has(rest)) {

      // If it does, return the indices of the two values that sum up to the target
      return [hashMap.get(rest) as number, index];
    }

    // Otherwise, add the current value and its index to the hashMap
    hashMap.set(currentValue, index);
  }

  // If no two elements sum up to the target, return an empty array
  return [];
}

Summary of the steps of the hash table approach:

  1. Start
  2. Create an empty hash table.
  3. Iterate over each element in the array.
  4. Calculate the difference between the target and the current element.
  5. Check if the difference exists as a key in the hash table.
  6. If the difference exists, return the indices of the two values.
  7. Otherwise, add the current element and its index to the hash table.
  8. Repeat steps 3-7 until the target sum is found or all elements in the array have been processed.
  9. End.

The Hash Table approach for the Two Sum problem has a time complexity of O(n), where n is the length of the array.

This approach is faster 🚄 than the brute force approach because it uses a hash table to store the values and their indices, and can perform lookups 👀 in constant time. This means that the time it takes to search for a value in the hash table does not depend on the size of the input.

In simpler terms, the hash table approach allows us to find the solution to the Two Sum problem faster than the brute force approach because it only needs to look at each element in the array once. The hash table makes it easy to check if a value exists in the array without having to search through all the other elements.

Therefore, the hash table approach is a more efficient way to solve the Two Sum problem for larger input sizes.

Let’s analyze the time complexity in detail:

  1. Creating a new empty Map: O(1).
    The time complexity for initializing an empty Map is constant, as it doesn't depend on the size of the input array.
  2. Looping through each element: O(n).
    The for loop goes over each element in the array exactly once where n is the number of elements in the input array.

    Within this loop:
    1. Calculating the remaining value (rest) is a constant operation: O(1).
    2. Checking if rest exists in the hashMap is an operation that, on average, takes constant time: O(1).
    3. Getting the index from the hashMap, if it exists, is also a constant time operation: O(1).
    4. Setting a value in the hashMap is a constant time operation: O(1).

Since all the operations inside the loop run in constant time and the loop runs n times, the dominant factor here is the loop itself. Therefore, the overall time complexity of the function twoSum is O(n).
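Putting it all together, here is the hash table version again in a condensed, standalone form (my own compacting of the code above), run against our two test cases:

```typescript
// Hash-table twoSum: a single pass with constant-time Map lookups
function twoSum(numbers: number[], target: number): number[] {
  const hashMap = new Map<number, number>();

  for (let index = 0; index < numbers.length; index++) {
    // The value that would complete the pair for the current element
    const rest = target - numbers[index];

    // If we've already seen the complement, we have our answer
    if (hashMap.has(rest)) return [hashMap.get(rest) as number, index];

    // Otherwise, remember the current value and where we saw it
    hashMap.set(numbers[index], index);
  }

  return [];
}

console.log(twoSum([3, 5, 8, 11], 14)); // [0, 3]
console.log(twoSum([3, 5, 8, 11], 7));  // []
```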

The Two-Pointer Approach

function twoSum(numbers: number[], target: number): number[] {
  const numbersWithIndex = numbers
    // Map each number to its index
    .map((number, index) => ({ number, index }))

    // Sort the array of numbers in ascending order
    .sort((a, b) => a.number - b.number);

  // Set the left and right pointers
  // to the first and last indices of the array, respectively
  let leftPointer = 0;
  let rightPointer = numbersWithIndex.length - 1;

  // Loop through the array while the left pointer is less than the right pointer
  while (leftPointer < rightPointer) {

    // Calculate the sum of the current left and right values
    const sum = numbersWithIndex[leftPointer].number + numbersWithIndex[rightPointer].number;

    // If the sum equals the target, return the indices of the left and right values
    if (sum === target) {
      return [numbersWithIndex[leftPointer].index, numbersWithIndex[rightPointer].index];

    } else if (sum < target) {

      // If the sum is less than the target, move the left pointer to the right
      leftPointer++;

    } else {

      // If the sum is greater than the target, move the right pointer to the left
      rightPointer--;
    }
  }

  // Return an empty array if no matching indices were found
  return [];
}

Summary of the steps of the two-pointer approach:

  1. Start
  2. We sort the input array in ascending order.
  3. We initialize two pointers, one at the start of the array and one at the end of the array.
  4. We compute the sum of the values at the current pointers.
  5. If the sum is equal to the target value, we return the indices of the current pointers.
  6. If the sum is less than the target value, we increment the left pointer.
  7. If the sum is greater than the target value, we decrement the right pointer.
  8. We repeat steps 4 through 7 until the pointers meet or the sum is found.
  9. If the sum is not found, we return an empty array.
  10. End
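The steps above can be sketched as a compact, standalone version (my own compacting of the implementation; names like `left` and `right` are my own):

```typescript
// Two-pointer twoSum: sort values (keeping original indices),
// then walk inward from both ends
function twoSum(numbers: number[], target: number): number[] {
  const sorted = numbers
    .map((value, index) => ({ value, index }))
    .sort((a, b) => a.value - b.value);

  let left = 0;
  let right = sorted.length - 1;

  while (left < right) {
    const sum = sorted[left].value + sorted[right].value;
    if (sum === target) return [sorted[left].index, sorted[right].index];
    // Sum too small: we need a bigger value, so advance the left pointer;
    // sum too big: retreat the right pointer
    if (sum < target) left++;
    else right--;
  }

  return [];
}

console.log(twoSum([3, 5, 8, 11], 14)); // [0, 3]
console.log(twoSum([3, 5, 8, 11], 7));  // []
```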

The time complexity of the two-pointer approach is O(n log n), where n is the length of the input array. The reason for this is the following:

  1. We map each number to its index: O(n). The .map() function is iterating over every element in the input array once.
  2. We sort the array: O(n log n). The sort function is typically implemented with a sorting algorithm with O(n log n) average and worst-case time complexity.
  3. While loop: O(n). After sorting the array, we use two pointers, one at the beginning of the array and one at the end, and we move them towards each other until they meet or until the sum of the numbers at the pointers equals the target. Each iteration moves one pointer closer to the other, so together the pointers traverse the array at most once.

Combining these steps, the overall time complexity is dictated by the most expensive step:

  1. Mapping: O(n)
  2. Sorting: O(n log n)
  3. Two-pointer loop: O(n)

Since O(n log n) is the dominant term here, the overall time complexity is O(n log n).

Think of the entire process as organizing a bookshelf, where you first have to go through each book to tag them with a number (the mapping), which takes as long as the number of books you have — O(n).

Then, you sort the books by their assigned numbers in order, which takes longer as the number of books grows — O(n log n).

After sorting, you look for a pair of books that add up to the target price you have in your mind. To do this, you start at both ends of the shelf and move inward, only going through the books once — O(n).

Even though the organizing and searching take some time, the step that takes the longest time is the sorting part. So no matter how fast you are with the other steps, the time it takes to sort the books overshadows everything else. This is why the overall time taken is dictated by the sorting part — O(n log n).

When you see O(n log n), it doesn’t literally mean you multiply O(n) and O(log n) in the arithmetic sense. Instead, it represents a time complexity where the algorithm’s running time increases in a way proportional to n times the logarithm of n.

To better understand why sorting is O(n log n), let’s look at what n and log n represent. Here, n is simply the number of elements being sorted. log n is roughly how many times you can halve n before reaching 1; for example, an array of 8 elements can be halved 3 times (8 → 4 → 2 → 1), and log₂ 8 = 3.

Now, bringing these concepts together, when we say an algorithm is O(n log n), it means the runtime grows in a way that, for each of the n elements, we are doing a series of operations that can be broken down into log n parts.

Most comparison-based sorting algorithms are O(n log n) because they often use a divide-and-conquer strategy:

  1. They repeatedly divide the entire collection into smaller pieces (hence the log n part — imagine the repeated halving).
  2. They perform some action (like comparing and swapping) on each element in those pieces (the n part).
  3. Finally, they combine those pieces back together in a sorted order.

By doing n work at each of the log n levels of division, the operations pile up in a way that’s best described by multiplying n and log n, which is what O(n log n) complexity reflects in a sorting context.
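To make the divide-and-conquer pattern concrete, here is a minimal merge sort sketch (not part of the Two Sum solution, just an illustration): each recursive call halves the array, giving the log n levels, and merging touches every element, giving the n work per level:

```typescript
// Minimal merge sort: the recursion depth is the "log n" part,
// the merge pass over all elements is the "n" part
function mergeSort(values: number[]): number[] {
  if (values.length <= 1) return values;

  // Divide: split the array in half and sort each half recursively
  const middle = Math.floor(values.length / 2);
  const left = mergeSort(values.slice(0, middle));
  const right = mergeSort(values.slice(middle));

  // Conquer: merge the two sorted halves in a single linear pass
  const merged: number[] = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    merged.push(left[i] <= right[j] ? left[i++] : right[j++]);
  }
  return merged.concat(left.slice(i), right.slice(j));
}

console.log(mergeSort([11, 3, 8, 5])); // [3, 5, 8, 11]
```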

Wrap up

We explored three different approaches to solve this problem, each with pros and cons.

The brute force approach is simple and easy to understand, but its quadratic running time makes it unsuitable for large inputs. On the other hand, the hash table approach has excellent time complexity, but it requires additional space to store the hash table.

Finally, the two-pointer approach strikes a balance between time and space complexity, making it a very effective solution for the Two Sum problem.

Choosing the right approach for the problem at hand depends on several factors, such as the size of the input, the memory constraints and the time limits. Therefore, it’s essential to understand each approach’s trade-offs to select the optimal solution for the given problem.

By mastering the different approaches to the Two Sum problem, you can become a better problem solver and improve your algorithmic skills. With this knowledge, you’ll be better equipped to tackle more complex problems and build better software. 💯