3. Longest Substring Without Repeating Characters

Given a string s, find the length of the longest 

substring without repeating characters.

Example 1:

Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3:

Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.

Constraints:

  • 0 <= s.length <= 5 * 104
  • s consists of English letters, digits, symbols and spaces.

Problem Description

The task is to find the length of the longest substring within a given string s that does not contain any repeating characters. A substring is defined as a contiguous sequence of characters within a string. The goal is to seek out the longest possible substring where each character is unique; no character appears more than once.

Imagine you have a unique piece of string and you want to make the longest possible necklace with beads where each bead must have a different shape or color. How would you pick beads so that repeats are avoided? Similar to this, the problem requires us to find such a unique sequence of characters in the given string s.

Intuition

To solve this challenge, the approach revolves around using a sliding window to scan through the string s while keeping track of unique characters we’ve seen so far. This technique involves expanding and shrinking a window on different portions of the string to cover the non-repeating sequence of characters.

We use two pointers i (the start of the window) and j (the end of the window) to represent the current segment of the string we’re looking at. If we see a new character that hasn’t appeared in the current window, it’s safe to add this character to our current sequence and move the end pointer j ahead.

However, if we find a character that’s already in our current window, it means we’ve found a repeat and must therefore move the start pointer i forward. We do this by removing characters from the start of our window until the repeating character is eliminated from it.

During this process, we always keep track of the maximum length of the substring found that meets the condition. The length of the current unique character substring can be calculated by taking the difference of the end and start pointers j - i and adding 1 (since the length is the difference between the indexes plus one).

At the end of this process, when we’ve checked all characters in the string, the maximum length we tracked gives us the length of the longest substring without repeating characters.

Solution Approach

The solution implements a sliding window algorithm, which is an efficient way to maintain a subset of data in a larger dataset such as an array or string. In this context, we’re dealing with a string and aiming to find a length of substring, i.e. a continuous range of characters, with distinct values. Two pointers, often denoted as i and j, are used to represent the current window, starting from the beginning of the string and ending at the last unique character found.

The algorithm also incorporates a hash set, named ss in the code, to efficiently verify if a character has already been seen within the current window. Hash sets offer constant time complexity for add, remove, and check operations, which makes them an ideal choice for this algorithm.

Here is the step-by-step breakdown of the implementation:

  1. Initialize a hash set ss to record the characters in the current window. This will allow us to quickly check if a character is part of the current substring without having to scan all of it.
  2. Initialize two pointers, i and ji will point to the start of the current window, and j will iterate over the string to examine each character.
  3. Initialize a variable ans to keep track of the length of the longest substring found.
  4. Iterate over the string with j. For each character c located at the jth position:
    • While c is already in the set ss (implying a repeat and therefore a violation of the unique-substring rule), remove characters starting from the ith position and move i forward; this effectively shrinks the window from the left side until c can be added without creating a duplicate.
    • Once c is not in ss, add c to ss to include it in the current window.
    • Update ans if the current window size (j - i + 1) is larger than the maximum found so far.
  5. After iterating over all characters, return the value of ans.

The solution functions within O(n) time complexity—where n is the length of the string—since each character in the string is visited once by j and at most once by i as the window is moved forward.

To demonstrate the behavior of the sliding window algorithm, consider the Java template provided in the reference solution approach. This generic algorithm pattern consists of two pointers moving over a dataset and a condition checked in a while loop that modifies one of the pointers (j) based on some condition (check(j, i)) applied to the current range (or window) that the pointers define.

<pre class="wp-block-syntaxhighlighter-code">for (int i = 0, j = 0; i &lt; n; ++i) {
    while (j &lt; i &amp;&amp; check(j, i)) {
        ++j;
    }
    // logic of specific problem
}</pre>

In our solution, the “check” is finding if c is already in the set ss, and the logic after the while loop is the add-to-set and max-value-update operations.

Example Walkthrough

Let’s walk through the solution approach by using a simple example. Consider the input string s = "abcabcbb".

  1. Initialize variables:
    • A hash set ss to store characters of the current substring without repeating ones.
    • Two pointers: i (start of the window) set to 0, and j (end of the window) also set to 0.
    • A variable ans to store the length of the longest substring found, set to 0.
  2. Traverse the string with j:
    • The outer loop moves j from the left to the right across the string.
  3. Examining each character:
    • When j = 0, the character is 'a'. It’s not in ss, so add it to the set and ans is updated to 1.
    • Move j to 1, now the character is 'b'. It’s not in ss, so add it, and ans is updated to 2.
    • Move j to 2, the character is 'c'. It’s not in ss, so add it, and ans is updated to 3.
    • Move j to 3, we find 'a' again. It’s in ss, so we enter the while loop to start removing from the left side:
      • Remove 'a' from ss, and move i to 1.
    • Add 'a' back as its repeat was removed, and move j to 4.
    • Now j = 4 and the character is 'b'. Since 'b' is in ss, remove characters with the while loop:
      • Remove 'b' from ss, and move i to 2.
    • Because the repeat 'b' has been removed from ss, add the new 'b' and j moves to 5.
    • Continuing this pattern, j keeps moving to the right, adding unique characters to ss, and updating ans if we find a longer unique-substring along the way.
  4. Completing the traverse:
    • As j progresses to the end of the string (j = 8), we keep removing duplicates from the left and adding new characters to the right until our window contains unique characters only.
    • The length of each window is calculated and compared with ans, and ans is updated if a longer length is found.
  5. Return the result:
    • After the above process, we find that the longest substring without repeating characters is 'abc' with a length of 3, which is the final value of ans.

Applying this approach to our example string s = "abcabcbb", we successfully find that the longest substring without repeating characters is 3 characters long.

Solution Implementation


#include <string>
#include <unordered_set>
#include <algorithm>

class Solution {
public:
    int lengthOfLongestSubstring(std::string s) {
        // This unordered set is used to store the characters that are currently in the
        // longest substring without repeating characters.
        std::unordered_set<char> charSet;

        // The starting index of the substring.
        int start = 0;
        // The length of the longest substring without repeating characters.
        int maxLength = 0;

        // Iterate over the string.
        for (int end = 0; end < s.size(); ++end) {
            // If the character at the current ending index of the substring already exists
            // in the character set, continue to remove characters from the start of the
            // substring until the character is no longer in the set.
            while (charSet.count(s[end])) {
                charSet.erase(s[start]);
                start += 1;
            }

            // Insert the current character into the set.
            charSet.insert(s[end]);

            // Calculate the length of the current substring and update maxLength
            // if this length is the largest we've found so far.
            maxLength = std::max(maxLength, end - start + 1);
        }
        // Return the length of the longest substring found.
        return maxLength;
    }
};
Copied!
class Solution {
    public int lengthOfLongestSubstring(String s) {
        // Array to count the frequency of each character in the string (ASCII range of characters)
        int[] cnt = new int[128];
        int ans = 0;  // Variable to store the length of the longest substring found
        int n = s.length();  // Get the length of the input string
        
        // Use a sliding window approach with two pointers: left (l) and right (r)
        for (int l = 0, r = 0; r < n; ++r) {
            // Get the character at the right pointer
            char c = s.charAt(r);
            // Increment the count of the character at the right pointer
            ++cnt[c];
            
            // If the count of the character exceeds 1, shrink the window from the left side
            while (cnt[c] > 1) {
                // Move the left pointer to the right and decrease the count of the character at the left pointer
                --cnt[s.charAt(l++)];
            }
            
            // Update the maximum length of the substring found so far
            ans = Math.max(ans, r - l + 1);
        }
        
        // Return the length of the longest substring without repeating characters
        return ans;
    }
}
Copied!
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
  # Create a set to store the characters in the current substring
  chars = set()
  # Initialize left pointer (l) and result variable (res) to keep track of the longest length
  l, res = 0, 0

  # Iterate through the string with right pointer (r)
  for r in range(len(s)):
    # If the character at s[r] is already in the set, move the left pointer to the right
    while s[r] in chars:
      # Remove the character at s[l] from the set and move the left pointer forward
      chars.remove(s[l])
      l += 1
    # Add the current character at s[r] to the set
    chars.add(s[r])
    # Update the result with the maximum length found so far
    res = max(res, r - l + 1)
    
  # Return the length of the longest substring without repeating characters
  return res
Copied!
int lengthOfLongestSubstring(char* s) {
    int i, maxLength = 0;
    int pos[128] = {0};  // position of each character
    int start = 0;  // start index of the current substring

    for (i = 0; s[i]; i++) {
        char c = s[i];

        // If the character was already seen and its position is greater than or equal to start
        if (pos[c] > start) {
            start = pos[c];  // Move the start to one past the last occurrence of 'c'
        }

        // Update the position of the character to the current index
        pos[c] = i + 1;

        // Calculate the length of the current substring
        maxLength = maxLength > (i - start + 1) ? maxLength : (i - start + 1);
    }

    return maxLength;
}
Copied!
public class Solution {
    public int LengthOfLongestSubstring(string s) {
        int n = s.Length;  // Get the length of the input string
        int ans = 0;  // Variable to store the length of the longest substring found
        var cnt = new int[128];  // Array to count the frequency of each character in the string (ASCII range)

        // Use a sliding window approach with two pointers: left (l) and right (r)
        for (int l = 0, r = 0; r < n; ++r) {
            // Increment the count of the character at the right pointer
            ++cnt[s[r]];
            
            // If the count of the character exceeds 1, shrink the window from the left side
            while (cnt[s[r]] > 1) {
                // Move the left pointer to the right and decrease the count of the character at the left pointer
                --cnt[s[l++]];
            }
            
            // Update the maximum length of the substring found so far
            ans = Math.Max(ans, r - l + 1);
        }
        
        // Return the length of the longest substring without repeating characters
        return ans;
    }
}
Copied!
/**
* @param {string} s
* @return {number}
*/
var lengthOfLongestSubstring = function (s) {
    let ans = 0;  // Variable to store the length of the longest substring found
    const n = s.length;  // Get the length of the input string
    const cnt = new Map();  // Map to count the frequency of characters in the string

    // Use a sliding window approach with two pointers: left (l) and right (r)
    for (let l = 0, r = 0; r < n; ++r) {
        // Increment the count of the character at the right pointer
        cnt.set(s[r], (cnt.get(s[r]) || 0) + 1);
        
        // While the count of the character exceeds 1, shrink the window from the left side
        while (cnt.get(s[r]) > 1) {
            // Decrease the count of the character at the left pointer and move the left pointer to the right
            cnt.set(s[l], cnt.get(s[l]) - 1);
            ++l;
        }
        
        // Update the maximum length of the substring found so far
        ans = Math.max(ans, r - l + 1);
    }
    
    // Return the length of the longest substring without repeating characters
    return ans;
};
Copied!
function lengthOfLongestSubstring(s: string): number {
    let ans = 0;  // Variable to store the length of the longest substring found
    const cnt = new Map();  // Map to track the frequency of characters in the string
    const n = s.length;  // Get the length of the input string

    // Use a sliding window approach with two pointers: left (l) and right (r)
    for (let l = 0, r = 0; r < n; ++r) {
        // Increment the count of the character at the right pointer
        cnt.set(s[r], (cnt.get(s[r]) || 0) + 1);
        
        // If the count of the character exceeds 1, shrink the window from the left side
        while (cnt.get(s[r])! > 1) {
            // Decrease the count of the character at the left pointer and move the left pointer to the right
            cnt.set(s[l], cnt.get(s[l])! - 1);
            ++l;
        }
        
        // Update the maximum length of the substring found so far
        ans = Math.max(ans, r - l + 1);
    }
    
    // Return the length of the longest substring without repeating characters
    return ans;
}
Copied!
class Solution {
    function lengthOfLongestSubstring($s) {
        $n = strlen($s);  // Get the length of the input string
        $ans = 0;  // Variable to store the length of the longest substring found
        $cnt = array_fill(0, 128, 0);  // Array to track the frequency of characters (ASCII range)
        $l = 0;  // Left pointer for the sliding window

        // Iterate with the right pointer ($r) over the string
        for ($r = 0; $r < $n; ++$r) {
            // Increment the count of the character at the right pointer
            $cnt[ord($s[$r])]++;

            // While the count of the character at the right pointer is greater than 1 (duplicate found)
            while ($cnt[ord($s[$r])] > 1) {
                // Decrease the count of the character at the left pointer and move the left pointer to the right
                $cnt[ord($s[$l])]--;
                $l++;
            }

            // Update the maximum length of the substring found so far
            $ans = max($ans, $r - $l + 1);
        }

        // Return the length of the longest substring without repeating characters
        return $ans;
    }
}
Copied!
class Solution {
    func lengthOfLongestSubstring(_ s: String) -> Int {
        var start = 0
        var maxLength = 0
        var charToIndex = [Character: Int]()
        
        for (i, char) in s.enumerated() {
            if let index = charToIndex[char] {
                // If the character is already present in the charToIndex dictionary, move the start index to the next of the previous occurrence of the character.
                start = max(start, index + 1)
            }
            
            charToIndex[char] = i
            maxLength = max(maxLength, i - start + 1)
        }
        
        return maxLength
    }
}
Copied!
class Solution {
    fun lengthOfLongestSubstring(s: String): Int {
        val n = s.length  // Get the length of the input string
        var ans = 0  // Variable to store the length of the longest substring found
        val cnt = IntArray(128)  // Array to track the frequency of characters (ASCII range)
        var l = 0  // Left pointer for the sliding window
        
        // Iterate with the right pointer (r) over the string
        for (r in 0 until n) {
            // Increment the count of the character at the right pointer (r)
            cnt[s[r].toInt()]++
            
            // While the count of the character at the right pointer is greater than 1 (duplicate found)
            while (cnt[s[r].toInt()] > 1) {
                // Decrease the count of the character at the left pointer (l) and move the left pointer to the right
                cnt[s[l].toInt()]--
                l++
            }
            
            // Update the maximum length of the substring found so far
            ans = Math.max(ans, r - l + 1)
        }
        
        // Return the length of the longest substring without repeating characters
        return ans
    }
}
Copied!
class Solution {
    int lengthOfLongestSubstring(String s) {
      // Create an array to store the frequency count of characters (128 for ASCII characters)
      List cnt = List.filled(128, 0);
      int ans = 0, n = s.length;
      int l = 0;
  
      for (int r = 0; r < n; ++r) {
        // Get the character at index r
        int c = s.codeUnitAt(r);
        
        // Increment the count of the current character
        cnt[c]++;
  
        // If the character count is more than 1, move the left pointer (l)
        while (cnt[c] > 1) {
          cnt[s.codeUnitAt(l)]--;
          l++;
        }
  
        // Update the answer with the maximum length found so far
        ans = (ans > r - l + 1) ? ans : r - l + 1;
      }
      
      return ans;
    }
  }
Copied!
func lengthOfLongestSubstring(s string) (ans int) {
    // Initialize an array to count occurrences of each character (128 for ASCII characters)
    cnt := [128]int{}
    // Left pointer of the sliding window
    l := 0

    // Iterate through the string with the right pointer (r)
    for r, c := range s {
        // Increment the count of the current character at the right pointer
        cnt[c]++
        
        // If the character at the right pointer is duplicated (count > 1)
        // Move the left pointer (l) to shrink the window
        for cnt[c] > 1 {
            // Decrease the count of the character at the left pointer and move the left pointer to the right
            cnt[s[l]]--
            l++
        }
        
        // Calculate the maximum length of the substring found so far (r - l + 1)
        ans = max(ans, r-l+1)
    }

    // Return the length of the longest substring without repeating characters
    return
}
Copied!

Time and Space Complexity

Time Complexity

The time complexity of the code is O(2n) which simplifies to O(n), where n is the length of the string s. This is because in the worst case, each character will be visited twice by the two pointers i and j - once when j encounters the character and once when i moves past the character after it has been found in the set ss. However, each character is processed only a constant number of times, hence we consider the overall time complexity to be linear.

Space Complexity

The space complexity of the code is O(min(n, m)), where n is the size of the string s and m is the size of the character set (the number of unique characters in s). In the worst case, the set ss can store all unique characters of the string if all characters in the string are distinct. However, m is the limiting factor since it represents the size of the character set that can be stored in ss. Therefore, if n is larger than m, the space complexity is limited by m rather than n.

Leave a Reply

Your email address will not be published. Required fields are marked *