LeetCode 3313 - Find the Last Marked Nodes in Tree
We are given an undirected tree with n nodes. A tree is a connected graph with exactly n - 1 edges and no cycles. For every node i, we perform the following process: - Node i is marked at time t = 0.
Difficulty: 🔴 Hard
Topics: Tree, Depth-First Search
Solution
Problem Understanding
We are given an undirected tree with n nodes. A tree is a connected graph with exactly n - 1 edges and no cycles.
For every node i, we perform the following process:
- Node
iis marked at timet = 0. - Every second afterward, any unmarked node that is adjacent to at least one marked node becomes marked.
- The marking process continues until every node is marked.
The key observation is that a node becomes marked exactly when the marking wave reaches it from the starting node.
If node i is the starting node, then a node v becomes marked after dist(i, v) seconds, where dist(i, v) is the number of edges on the unique path between them.
Therefore, the last node to be marked is simply a node whose distance from i is maximum.
In graph theory terms, for every node i, we must find any farthest node from i.
The constraints are important:
ncan be as large as100,000.- The graph is guaranteed to be a valid tree.
- Running a traversal from every node would be far too expensive.
The problem therefore requires a near-linear solution.
Important edge cases include very small trees, long chain-shaped trees, star-shaped trees, and trees with multiple farthest nodes at the same distance. The statement explicitly allows returning any valid farthest node when multiple answers exist.
Approaches
Brute Force
The most direct approach is to treat every node as a starting node.
For each node i:
- Run BFS or DFS from
i. - Compute distances to all other nodes.
- Find a node with maximum distance.
- Store that node as the answer for
i.
This is correct because BFS on an unweighted tree computes shortest path distances exactly, and the last marked node is simply the node with maximum distance from the starting node.
However, this approach requires traversing the entire tree for every starting node.
With n = 100,000, this becomes completely infeasible.
Key Insight
The last marked node from a starting node u is a farthest node from u.
A classical property of trees states:
Let
AandBbe the two endpoints of a diameter of the tree. For every nodeu, at least one ofAorBis a farthest node fromu.
This is an extremely powerful result.
Once we know the diameter endpoints:
- Compute distance from every node to
A. - Compute distance from every node to
B.
Then for any node u:
- If
dist(u, A) >= dist(u, B), we can returnA. - Otherwise return
B.
Since one of the diameter endpoints is always a farthest node, this gives a correct answer for every node.
The entire problem therefore reduces to:
- Find one diameter endpoint.
- Find the opposite diameter endpoint.
- Compute distances from both endpoints.
- Choose the farther endpoint for every node.
Approach Comparison
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(n) | BFS/DFS from every node |
| Optimal | O(n) | O(n) | Uses diameter endpoints and two distance arrays |
Algorithm Walkthrough
1. Build the adjacency list
Convert the edge list into an adjacency list representation.
Since the graph is a tree, this allows efficient traversal in O(n) time.
2. Find one endpoint of the diameter
Pick any node, for example node 0.
Run DFS (or BFS) to compute distances from node 0.
The farthest node found is guaranteed to be one endpoint of a diameter. Call it A.
3. Find the opposite endpoint
Run another DFS starting from A.
The farthest node reached from A is the other diameter endpoint. Call it B.
While performing this traversal, store all distances from A.
4. Compute distances from the second endpoint
Run a third DFS starting from B.
Store all distances from B.
Now we have:
distA[u] = dist(A, u)distB[u] = dist(B, u)
for every node u.
5. Determine the answer for every node
For each node u:
- If
distA[u] >= distB[u], returnA. - Otherwise return
B.
This works because one of the diameter endpoints is always a farthest node from u.
Why it works
Let A and B be diameter endpoints.
A standard tree property states that for any node u, the maximum distance from u to any node in the tree equals:
$$\max(dist(u,A), dist(u,B))$$
Therefore, at least one of A or B is a farthest node from u.
The algorithm explicitly computes both distances and chooses the farther endpoint. Consequently, the returned node is always a valid farthest node, which is exactly the node marked last when the marking process starts from u.
Python Solution
from typing import List
class Solution:
def lastMarkedNodes(self, edges: List[List[int]]) -> List[int]:
n = len(edges) + 1
graph = [[] for _ in range(n)]
for u, v in edges:
graph[u].append(v)
graph[v].append(u)
def get_distances(start: int) -> List[int]:
dist = [-1] * n
dist[start] = 0
stack = [start]
while stack:
node = stack.pop()
for nei in graph[node]:
if dist[nei] == -1:
dist[nei] = dist[node] + 1
stack.append(nei)
return dist
dist0 = get_distances(0)
A = max(range(n), key=lambda x: dist0[x])
distA = get_distances(A)
B = max(range(n), key=lambda x: distA[x])
distB = get_distances(B)
answer = [0] * n
for node in range(n):
if distA[node] >= distB[node]:
answer[node] = A
else:
answer[node] = B
return answer
The implementation begins by constructing the adjacency list of the tree.
The helper function get_distances performs an iterative DFS and computes distances from a given starting node to every other node. Iterative traversal is used instead of recursive DFS to avoid recursion depth issues on large trees.
The first traversal from node 0 finds a diameter endpoint A.
The second traversal from A finds the opposite diameter endpoint B, while simultaneously computing all distances from A.
The third traversal computes all distances from B.
Finally, for each node we compare its distance to the two diameter endpoints and return whichever endpoint is farther away. That endpoint is guaranteed to be a valid farthest node.
Go Solution
func lastMarkedNodes(edges [][]int) []int {
n := len(edges) + 1
graph := make([][]int, n)
for _, e := range edges {
u, v := e[0], e[1]
graph[u] = append(graph[u], v)
graph[v] = append(graph[v], u)
}
getDistances := func(start int) []int {
dist := make([]int, n)
for i := range dist {
dist[i] = -1
}
dist[start] = 0
stack := []int{start}
for len(stack) > 0 {
node := stack[len(stack)-1]
stack = stack[:len(stack)-1]
for _, nei := range graph[node] {
if dist[nei] == -1 {
dist[nei] = dist[node] + 1
stack = append(stack, nei)
}
}
}
return dist
}
dist0 := getDistances(0)
A := 0
for i := 1; i < n; i++ {
if dist0[i] > dist0[A] {
A = i
}
}
distA := getDistances(A)
B := 0
for i := 1; i < n; i++ {
if distA[i] > distA[B] {
B = i
}
}
distB := getDistances(B)
ans := make([]int, n)
for i := 0; i < n; i++ {
if distA[i] >= distB[i] {
ans[i] = A
} else {
ans[i] = B
}
}
return ans
}
The Go implementation follows exactly the same logic as the Python version. Iterative DFS is used through a slice acting as a stack, which avoids recursion depth concerns. Distances are stored in integer slices initialized to -1 to indicate unvisited nodes. Since tree distances are at most n - 1, standard Go int values are more than sufficient and no overflow concerns exist.
Worked Examples
Example 1
Input:
0
/ \
1 2
Edges:
[[0,1],[0,2]]
Find diameter
Distances from 0:
| Node | Distance |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 1 |
Choose:
A = 1
Distances from A=1:
| Node | distA |
|---|---|
| 0 | 1 |
| 1 | 0 |
| 2 | 2 |
Farthest node:
B = 2
Distances from B=2:
| Node | distB |
|---|---|
| 0 | 1 |
| 1 | 2 |
| 2 | 0 |
Build answers
| Node | distA | distB | Answer |
|---|---|---|---|
| 0 | 1 | 1 | 1 or 2 |
| 1 | 0 | 2 | 2 |
| 2 | 2 | 0 | 1 |
Valid output:
[2,2,1]
Example 2
Input:
0 - 1
Diameter endpoints:
A = 0
B = 1
Distance tables:
| Node | distA | distB |
|---|---|---|
| 0 | 0 | 1 |
| 1 | 1 | 0 |
Answers:
| Node | Result |
|---|---|
| 0 | 1 |
| 1 | 0 |
Output:
[1,0]
Example 3
Input:
0
/ \
1 2
/ \
3 4
One valid diameter is:
1 - 0 - 2 - 3
So:
A = 1
B = 3
Distance arrays:
| Node | distA | distB |
|---|---|---|
| 0 | 1 | 2 |
| 1 | 0 | 3 |
| 2 | 2 | 1 |
| 3 | 3 | 0 |
| 4 | 3 | 2 |
Answer selection:
| Node | Answer |
|---|---|
| 0 | 3 |
| 1 | 3 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
Output:
[3,3,1,1,1]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Three tree traversals, each visits every node once |
| Space | O(n) | Adjacency list, stack, and distance arrays |
The tree contains exactly n - 1 edges. Each DFS traverses every edge and node once, requiring O(n) time. Since we perform a constant number of traversals, the overall complexity remains linear. The adjacency list and distance arrays also require linear memory.
Test Cases
sol = Solution()
# Example 1
assert sol.lastMarkedNodes([[0, 1], [0, 2]])[1:] == [2, 1]
# Example 2
assert sol.lastMarkedNodes([[0, 1]]) == [1, 0]
# Example 3
assert sol.lastMarkedNodes([[0, 1], [0, 2], [2, 3], [2, 4]]) == [3, 3, 1, 1, 1]
# Smallest valid tree
assert sol.lastMarkedNodes([[0, 1]]) == [1, 0]
# Chain of length 4
assert sol.lastMarkedNodes([[0, 1], [1, 2], [2, 3], [3, 4]]) == [4, 4, 4, 0, 0]
# Star tree
ans = sol.lastMarkedNodes([[0, 1], [0, 2], [0, 3], [0, 4]])
assert ans[1] != 1 # leaf's farthest node is another leaf
assert ans[2] != 2
assert ans[3] != 3
assert ans[4] != 4
# Balanced binary-like tree
edges = [
[0, 1], [0, 2],
[1, 3], [1, 4],
[2, 5], [2, 6]
]
ans = sol.lastMarkedNodes(edges)
assert len(ans) == 7
# Long chain stress pattern
n = 100
edges = [[i, i + 1] for i in range(n - 1)]
ans = sol.lastMarkedNodes(edges)
assert ans[0] == n - 1
assert ans[n - 1] == 0
Test Summary
| Test | Why |
|---|---|
[[0,1],[0,2]] |
Multiple valid farthest nodes |
[[0,1]] |
Minimum tree size |
| Example 3 tree | Branching structure |
| Long chain | Diameter equals entire tree |
| Star tree | Many leaves share same center |
| Balanced tree | Multiple levels and branches |
| Chain of length 100 | Larger linear structure |
Edge Cases
Tree with Only Two Nodes
The smallest allowed tree contains exactly two nodes connected by one edge. Each node's farthest node is simply the other endpoint. Implementations that assume at least three nodes can fail here. The diameter-based approach handles this naturally because the two nodes are themselves the diameter endpoints.
Multiple Farthest Nodes
In a star-shaped tree, the center has several leaves at the same maximum distance. The problem allows returning any valid farthest node. When distA[node] == distB[node], the implementation chooses A, which satisfies the requirement.
Extremely Deep Tree
A chain containing 100,000 nodes produces a recursion depth of 100,000 if recursive DFS is used. That would overflow the call stack in both Python and many Go environments. The implementation uses iterative DFS with an explicit stack, ensuring it works safely within the constraints.
Diameter Endpoint Ties
Some trees have multiple valid diameter pairs. The algorithm may discover any one of them depending on traversal order. This is completely fine because the theorem guarantees that for any chosen diameter endpoints A and B, one of them is a farthest node from every vertex. Therefore the correctness does not depend on finding a specific diameter pair.