LeetCode 3900 - Longest Balanced Substring After One Swap
We are given a binary string s containing only '0' and '1'. A substring is considered balanced when it contains exactly the same number of zeros and ones. Before choosing the substring, we are allowed to perform at most one swap between any two positions in the entire string.
Difficulty: 🟡 Medium
Topics: —
Solution
Problem Understanding
We are given a binary string s containing only '0' and '1'.
A substring is considered balanced when it contains exactly the same number of zeros and ones. Before choosing the substring, we are allowed to perform at most one swap between any two positions in the entire string. The swap is optional, so we may also choose not to swap anything.
The goal is to find the maximum possible length of a balanced substring that can exist after applying at most one swap.
A balanced substring must always have even length because it contains an equal number of zeros and ones. The input length can be as large as 10^5, which immediately rules out any algorithm that examines all substrings explicitly.
Several edge cases are important:
- A string containing only one type of character, such as
"111"or"00000", can never contain a non-empty balanced substring. - The optimal answer may come from a substring that is already balanced, meaning no swap is needed.
- The optimal answer may require a swap involving a character outside the chosen substring.
- Since only one swap is allowed, we cannot arbitrarily rearrange the string. The effect of a single swap is very limited and must be analyzed carefully.
Approaches
Brute Force
A direct approach would enumerate every possible swap, including the option of performing no swap. For each resulting string, we could enumerate every substring and check whether it is balanced.
There are O(n²) possible swaps and O(n²) possible substrings. Checking whether a substring is balanced takes additional time unless extensive preprocessing is used.
Even with optimizations, the overall complexity remains far beyond what is feasible for n = 10^5.
The brute force approach is correct because it explicitly tries every valid operation and every possible substring, but it is much too slow.
Key Observation
Let us encode:
'1'as+1'0'as-1
Then a substring is balanced exactly when its sum is 0.
Now consider a substring whose current sum is d.
A single swap can only affect the balance of the substring if one swapped position lies inside the substring and the other lies outside. In that case:
- Replacing an inside
0with an outside1increases the sum by2. - Replacing an inside
1with an outside0decreases the sum by2.
Therefore, one swap can only change the substring sum by ±2.
This means a substring can become balanced if and only if:
- Its current sum is
0, or - Its current sum is
2and we can decrease it by2, or - Its current sum is
-2and we can increase it by2.
No other sum can be fixed using a single swap.
The problem therefore reduces to finding the longest substring whose sum is:
02with a valid corrective swap-2with a valid corrective swap
Using prefix sums and binary search, all of these can be found efficiently.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n⁴) or worse | O(1) | Enumerates swaps and substrings |
| Optimal | O(n log n) | O(n) | Prefix sums plus binary search |
Algorithm Walkthrough
- Convert the string conceptually into values where
'1' = +1and'0' = -1. - Build a prefix sum array
pref, wherepref[i]represents the sum of the firsticharacters. - Find the longest substring whose sum is
0.
For a substring (i, j], its sum equals:
pref[j] - pref[i]
Therefore the sum is zero when the two prefix sums are equal. For every prefix value, store its earliest occurrence and compute the maximum distance.
4. Count the total number of zeros and ones in the entire string.
5. Consider substrings whose sum is 2.
Such a substring has one extra '1' compared to '0'.
To fix it, we need an inside '1' and an outside '0'.
Let the substring length be L.
Since:
ones - zeros = 2
we obtain:
zeros = (L - 2) / 2
A corrective swap is possible only if at least one zero exists outside:
zeros < totalZeros
Rearranging:
L < 2 * totalZeros + 2
Thus we need the longest sum-2 substring whose length satisfies that bound.
6. Apply the same reasoning for substrings whose sum is -2.
A valid swap requires:
ones < totalOnes
which becomes:
L < 2 * totalOnes + 2
7. Store all indices for each prefix sum value.
8. For each position j, find an earlier index i such that:
pref[j] - pref[i] = K
where K is either 2 or -2.
Additionally, enforce the required length bound. 9. Use binary search on the stored prefix-sum positions to locate the earliest valid index that satisfies the bound and maximize the substring length. 10. Return the largest answer found.
Why it works
A single swap can only alter a substring's sum by ±2. Therefore every substring that can become balanced must currently have sum 0, 2, or -2. The additional outside-character requirement is captured exactly by the derived length bounds. The algorithm enumerates all such candidate substrings through prefix sums and computes the maximum valid length, so it always returns the optimal answer.
Python Solution
from bisect import bisect_right
from collections import defaultdict
from typing import List
class Solution:
def longestBalanced(self, s: str) -> int:
n = len(s)
total_ones = s.count("1")
total_zeros = n - total_ones
prefix = [0]
for ch in s:
prefix.append(prefix[-1] + (1 if ch == "1" else -1))
ans = 0
earliest = {}
for i, value in enumerate(prefix):
if value not in earliest:
earliest[value] = i
ans = max(ans, i - earliest[value])
positions = defaultdict(list)
for i, value in enumerate(prefix):
positions[value].append(i)
def process(target_sum: int, length_bound: int) -> int:
best = 0
for j, value in enumerate(prefix):
needed = value - target_sum
if needed not in positions:
continue
idx_list = positions[needed]
pos = bisect_right(idx_list, j - length_bound)
if pos < len(idx_list) and idx_list[pos] < j:
best = max(best, j - idx_list[pos])
return best
ans = max(ans, process(2, 2 * total_zeros + 2))
ans = max(ans, process(-2, 2 * total_ones + 2))
return ans
The implementation begins by building the prefix-sum array using +1 and -1 values. The longest already-balanced substring is found using the classic technique of storing the first occurrence of every prefix sum.
Next, all positions of every prefix sum value are stored in sorted lists. This allows binary searching for valid starting positions.
The helper function process() handles the sum-2 and sum--2 cases. For each endpoint, it finds the earliest starting position that both produces the desired substring sum and satisfies the derived length bound. The maximum valid length is tracked and returned.
Finally, the best result among the sum 0, 2, and -2 cases is returned.
Go Solution
package main
import "sort"
func longestBalanced(s string) int {
n := len(s)
totalOnes := 0
for _, ch := range s {
if ch == '1' {
totalOnes++
}
}
totalZeros := n - totalOnes
prefix := make([]int, n+1)
for i := 0; i < n; i++ {
prefix[i+1] = prefix[i]
if s[i] == '1' {
prefix[i+1]++
} else {
prefix[i+1]--
}
}
ans := 0
earliest := make(map[int]int)
for i, v := range prefix {
if _, ok := earliest[v]; !ok {
earliest[v] = i
}
if i-earliest[v] > ans {
ans = i - earliest[v]
}
}
positions := make(map[int][]int)
for i, v := range prefix {
positions[v] = append(positions[v], i)
}
process := func(targetSum int, lengthBound int) int {
best := 0
for j, v := range prefix {
needed := v - targetSum
list, ok := positions[needed]
if !ok {
continue
}
pos := sort.Search(len(list), func(i int) bool {
return list[i] > j-lengthBound
})
if pos < len(list) && list[pos] < j {
if j-list[pos] > best {
best = j - list[pos]
}
}
}
return best
}
if v := process(2, 2*totalZeros+2); v > ans {
ans = v
}
if v := process(-2, 2*totalOnes+2); v > ans {
ans = v
}
return ans
}
The Go implementation follows exactly the same logic as the Python version. The primary difference is the use of sort.Search for binary search instead of bisect_right. All arithmetic comfortably fits within Go's int type because the string length is at most 10^5.
Worked Examples
Example 1
Input:
s = "100001"
Totals:
| Quantity | Value |
|---|---|
| Total Ones | 2 |
| Total Zeros | 4 |
Encoded values:
1 0 0 0 0 1
+1 -1 -1 -1 -1 +1
Prefix sums:
| Index | Prefix |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 0 |
| 3 | -1 |
| 4 | -2 |
| 5 | -3 |
| 6 | -2 |
Longest sum-0 substring:
- Prefix value
0appears at indices0and2 - Length =
2
Answer so far:
2
Now check sum -2.
Length bound:
2 * totalOnes + 2
= 2 * 2 + 2
= 6
Substring "0001" has:
ones = 1
zeros = 3
sum = -2
length = 4
Since 4 < 6, a corrective swap exists.
After one swap:
100001 -> 101000
A balanced substring of length 4 becomes possible.
Final answer:
4
Example 2
Input:
s = "111"
Totals:
| Quantity | Value |
|---|---|
| Total Ones | 3 |
| Total Zeros | 0 |
Prefix sums:
| Index | Prefix |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
No equal prefix sums exist, so there is no sum-0 substring.
There are no zeros anywhere in the string, so no swap can create a balanced substring.
Final answer:
0
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n) | Each prefix position performs binary searches |
| Space | O(n) | Prefix sums and index lists are stored |
The prefix array contains n + 1 entries. Every prefix position participates in a constant number of binary searches over stored index lists. Since each search takes O(log n) time, the overall complexity is O(n log n).
Test Cases
sol = Solution()
assert sol.longestBalanced("100001") == 4 # provided example
assert sol.longestBalanced("111") == 0 # all ones
assert sol.longestBalanced("0") == 0 # single character
assert sol.longestBalanced("1") == 0 # single character
assert sol.longestBalanced("01") == 2 # already balanced
assert sol.longestBalanced("10") == 2 # already balanced
assert sol.longestBalanced("0011") == 4 # whole string balanced
assert sol.longestBalanced("1100") == 4 # whole string balanced
assert sol.longestBalanced("111000") == 6 # balanced whole string
assert sol.longestBalanced("000111") == 6 # balanced whole string
assert sol.longestBalanced("00000") == 0 # all zeros
assert sol.longestBalanced("11111") == 0 # all ones
assert sol.longestBalanced("101") == 2 # small odd length
assert sol.longestBalanced("010") == 2 # small odd length
assert sol.longestBalanced("1001") == 4 # already balanced
assert sol.longestBalanced("11001") == 4 # requires checking swap case
assert sol.longestBalanced("101000") == 6 # whole string already balanced
Test Summary
| Test | Why |
|---|---|
"100001" |
Official example requiring a swap |
"111" |
Official example with no valid answer |
"0" |
Smallest zero-only input |
"1" |
Smallest one-only input |
"01" |
Smallest balanced string |
"10" |
Balanced in reverse order |
"0011" |
Entire string balanced |
"1100" |
Entire string balanced |
"111000" |
Large balanced region |
"000111" |
Symmetric balanced case |
"00000" |
No ones available |
"11111" |
No zeros available |
"101" |
Odd length input |
"010" |
Odd length input |
"1001" |
Balanced whole string |
"11001" |
Swap-related candidate |
"101000" |
Full-length balanced answer |
Edge Cases
All Characters Are Identical
Strings such as "11111" or "00000" contain only one type of character. A balanced substring requires both zeros and ones, so no non-empty balanced substring can exist. The algorithm handles this naturally because no valid sum-0, sum-2, or sum-(-2) substring satisfies the swap conditions.
The Best Substring Is Already Balanced
For inputs like "0011" or "111000", the optimal answer is the entire string and no swap is necessary. The longest sum-0 substring computation directly captures this case through repeated prefix sums.
A Swap Requires a Character Outside the Substring
A subtle bug source is assuming every sum-2 or sum-(-2) substring can be fixed. Consider a substring that already contains all zeros in the entire string. There is no outside zero available to swap in, so correction is impossible. The derived bounds L < 2 * totalZeros + 2 and L < 2 * totalOnes + 2 encode exactly this requirement and prevent invalid substrings from being counted.
Very Large Inputs
Since n can reach 100000, any quadratic solution will time out. The implementation stores prefix-sum positions and uses binary search, keeping the runtime at O(n log n) and making it efficient for the maximum input size.