LeetCode 3713 - Longest Balanced Substring I

The problem is asking us to find the length of the longest substring of a given string s where all distinct characters in that substring appear the same number of times.

LeetCode Problem 3713

Difficulty: 🟡 Medium
Topics: Hash Table, String, Counting, Enumeration

Solution

Problem Understanding

The problem is asking us to find the length of the longest substring of a given string s where all distinct characters in that substring appear the same number of times. In other words, for any candidate substring, if you count how many times each unique character appears, all counts must be identical. The input is a string of lowercase English letters, with a maximum length of 1000 characters, which is small enough that algorithms with O(n²) complexity may be acceptable, but O(n³) may be too slow.

The output is a single integer representing the length of the longest balanced substring. If no substring is balanced, the minimum possible return value would be 1 since any single character substring is trivially balanced. Important edge cases include strings where all characters are identical, strings where no two characters repeat, and strings with interleaving repeating patterns, as these can trip up naive implementations.

Approaches

The brute-force approach is straightforward: we can generate all possible substrings of s and for each substring, count the occurrences of each distinct character using a hash map. If all counts are equal, the substring is balanced, and we update the maximum length. This method is correct but inefficient because it requires O(n²) substrings, and for each substring, counting frequencies can take up to O(n), giving O(n³) time complexity.

The optimal approach leverages the fact that the number of distinct characters in a substring is limited (maximum 26 for lowercase English letters). For each possible number of distinct characters k from 1 to 26, we can use a sliding window to find the longest substring where exactly k distinct characters appear, all with the same frequency. This reduces the time complexity significantly because we no longer recompute character counts from scratch for each substring, instead updating counts incrementally in a sliding window.

Approach Time Complexity Space Complexity Notes
Brute Force O(n³) O(26) Checks all substrings, counts frequencies for each
Optimal O(26 * n²) = O(n²) O(26) Sliding window for each possible distinct character count

Algorithm Walkthrough

  1. Initialize a variable max_length to store the length of the longest balanced substring found so far.

  2. Iterate k from 1 to 26, representing the number of distinct characters we want in the substring.

  3. For each k, use a sliding window approach:

  4. Maintain a hash map or array freq to track the frequency of each character within the window.

  5. Maintain a variable distinct_count to track the number of unique characters in the window and min_freq and max_freq to track the range of frequencies.

  6. Expand the right end of the window until distinct_count > k.

  7. While distinct_count <= k, check if all frequencies are equal by comparing min_freq and max_freq.

  8. If balanced, update max_length with the window length.

  9. Slide the left end of the window to try other substrings.

  10. After processing all k, return max_length.

Why it works: For every number of distinct characters, the algorithm checks all candidate substrings where exactly that number of distinct characters appears. By maintaining frequency counts in a sliding window, it ensures we never miss a candidate balanced substring while avoiding unnecessary recalculation of character counts for overlapping windows.

Python Solution

class Solution:
    def longestBalanced(self, s: str) -> int:
        max_length = 0
        n = len(s)
        
        for k in range(1, 27):  # number of distinct characters
            left = 0
            freq = [0] * 26
            distinct_count = 0
            
            for right in range(n):
                idx = ord(s[right]) - ord('a')
                if freq[idx] == 0:
                    distinct_count += 1
                freq[idx] += 1
                
                while distinct_count > k:
                    left_idx = ord(s[left]) - ord('a')
                    freq[left_idx] -= 1
                    if freq[left_idx] == 0:
                        distinct_count -= 1
                    left += 1
                
                # Check if all non-zero frequencies are equal
                non_zero_freqs = [f for f in freq if f > 0]
                if non_zero_freqs and min(non_zero_freqs) == max(non_zero_freqs):
                    max_length = max(max_length, right - left + 1)
        
        return max_length

The Python implementation first iterates over possible distinct character counts. Inside that loop, it uses a sliding window to maintain the current substring and its character frequencies. The frequencies are updated incrementally as the window expands or contracts. The check for balance is done by extracting non-zero frequencies and comparing their minimum and maximum, ensuring all distinct characters have the same count.

Go Solution

func longestBalanced(s string) int {
    maxLength := 0
    n := len(s)

    for k := 1; k <= 26; k++ {
        left := 0
        freq := [26]int{}
        distinctCount := 0

        for right := 0; right < n; right++ {
            idx := s[right] - 'a'
            if freq[idx] == 0 {
                distinctCount++
            }
            freq[idx]++

            for distinctCount > k {
                leftIdx := s[left] - 'a'
                freq[leftIdx]--
                if freq[leftIdx] == 0 {
                    distinctCount--
                }
                left++
            }

            nonZeroFreqs := []int{}
            for _, f := range freq {
                if f > 0 {
                    nonZeroFreqs = append(nonZeroFreqs, f)
                }
            }
            if len(nonZeroFreqs) > 0 {
                minFreq, maxFreq := nonZeroFreqs[0], nonZeroFreqs[0]
                for _, f := range nonZeroFreqs {
                    if f < minFreq {
                        minFreq = f
                    }
                    if f > maxFreq {
                        maxFreq = f
                    }
                }
                if minFreq == maxFreq {
                    if (right - left + 1) > maxLength {
                        maxLength = right - left + 1
                    }
                }
            }
        }
    }

    return maxLength
}

In Go, we use an array to maintain character frequencies and slices to extract non-zero frequencies for comparison. The logic mirrors the Python version. Differences include explicit type management and slice creation for frequency checking. Integer overflow is not a concern since string length is limited.

Worked Examples

Example 1: s = "abbac"

right left substring freq a freq b non-zero freqs balanced? max_length
0 0 "a" 1 0 [1] yes 1
1 0 "ab" 1 1 [1,1] yes 2
2 0 "abb" 1 2 [1,2] no 2
3 0 "abba" 2 2 [2,2] yes 4
4 0 "abbac" 2 2 [2,2,1] no 4

Longest balanced substring length is 4.

Example 2: s = "zzabccy"

Algorithm finds "zabc" with frequencies [1,1,1,1], length 4.

Example 3: s = "aba"

Algorithm finds "ab" or "ba" with frequencies [1,1], length 2.

Complexity Analysis

Measure Complexity Explanation
Time O(n²) Outer loop runs 26 times, inner sliding window iterates n, frequency check O(26)
Space O(26) Frequency array of size 26 for lowercase letters

The time complexity is dominated by the nested loops over the string and possible distinct counts. Space complexity is constant because the character set size is fixed.

Test Cases

sol = Solution()

# Provided examples
assert sol.longestBalanced("abbac") == 4  # longest "abba"
assert sol.longestBalanced("zzabccy") == 4  # longest "zabc"
assert sol.longestBalanced("aba") == 2  # longest "ab" or "ba"

# Single character
assert sol.longestBalanced("a") == 1  # trivially balanced

# All distinct
assert sol.longestBalanced("abcdef") == 1  # each char occurs once, substrings of length 1

# All identical
assert sol.longestBalanced("aaaa") == 4  # entire string is balanced

# Repeated pattern
assert sol.longestBalanced("aabbcc") == 6  # "aabbcc" each char 2 times

# Nested balance
assert sol.longestBalanced("ababcab") == 4  # "abab" or "babc"
Test Why
"abbac" tests repeated characters forming a balanced substring
"zzabccy" tests single occurrence of multiple distinct chars
"aba" tests minimal case with repeated characters
"a