LeetCode 3313 - Find the Last Marked Nodes in Tree

We are given an undirected tree with n nodes. A tree is a connected graph with exactly n - 1 edges and no cycles. For every node i, we perform the following process: - Node i is marked at time t = 0.

LeetCode Problem 3313

Difficulty: 🔴 Hard
Topics: Tree, Depth-First Search

Solution

Problem Understanding

We are given an undirected tree with n nodes. A tree is a connected graph with exactly n - 1 edges and no cycles.

For every node i, we perform the following process:

  • Node i is marked at time t = 0.
  • Every second afterward, any unmarked node that is adjacent to at least one marked node becomes marked.
  • The marking process continues until every node is marked.

The key observation is that a node becomes marked exactly when the marking wave reaches it from the starting node.

If node i is the starting node, then a node v becomes marked after dist(i, v) seconds, where dist(i, v) is the number of edges on the unique path between them.

Therefore, the last node to be marked is simply a node whose distance from i is maximum.

In graph theory terms, for every node i, we must find any farthest node from i.

The constraints are important:

  • n can be as large as 100,000.
  • The graph is guaranteed to be a valid tree.
  • Running a traversal from every node would be far too expensive.

The problem therefore requires a near-linear solution.

Important edge cases include very small trees, long chain-shaped trees, star-shaped trees, and trees with multiple farthest nodes at the same distance. The statement explicitly allows returning any valid farthest node when multiple answers exist.

Approaches

Brute Force

The most direct approach is to treat every node as a starting node.

For each node i:

  1. Run BFS or DFS from i.
  2. Compute distances to all other nodes.
  3. Find a node with maximum distance.
  4. Store that node as the answer for i.

This is correct because BFS on an unweighted tree computes shortest path distances exactly, and the last marked node is simply the node with maximum distance from the starting node.

However, this approach requires traversing the entire tree for every starting node.

With n = 100,000, this becomes completely infeasible.

Key Insight

The last marked node from a starting node u is a farthest node from u.

A classical property of trees states:

Let A and B be the two endpoints of a diameter of the tree. For every node u, at least one of A or B is a farthest node from u.

This is an extremely powerful result.

Once we know the diameter endpoints:

  • Compute distance from every node to A.
  • Compute distance from every node to B.

Then for any node u:

  • If dist(u, A) >= dist(u, B), we can return A.
  • Otherwise return B.

Since one of the diameter endpoints is always a farthest node, this gives a correct answer for every node.

The entire problem therefore reduces to:

  1. Find one diameter endpoint.
  2. Find the opposite diameter endpoint.
  3. Compute distances from both endpoints.
  4. Choose the farther endpoint for every node.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) BFS/DFS from every node
Optimal O(n) O(n) Uses diameter endpoints and two distance arrays

Algorithm Walkthrough

1. Build the adjacency list

Convert the edge list into an adjacency list representation.

Since the graph is a tree, this allows efficient traversal in O(n) time.

2. Find one endpoint of the diameter

Pick any node, for example node 0.

Run DFS (or BFS) to compute distances from node 0.

The farthest node found is guaranteed to be one endpoint of a diameter. Call it A.

3. Find the opposite endpoint

Run another DFS starting from A.

The farthest node reached from A is the other diameter endpoint. Call it B.

While performing this traversal, store all distances from A.

4. Compute distances from the second endpoint

Run a third DFS starting from B.

Store all distances from B.

Now we have:

  • distA[u] = dist(A, u)
  • distB[u] = dist(B, u)

for every node u.

5. Determine the answer for every node

For each node u:

  • If distA[u] >= distB[u], return A.
  • Otherwise return B.

This works because one of the diameter endpoints is always a farthest node from u.

Why it works

Let A and B be diameter endpoints.

A standard tree property states that for any node u, the maximum distance from u to any node in the tree equals:

$$\max(dist(u,A), dist(u,B))$$

Therefore, at least one of A or B is a farthest node from u.

The algorithm explicitly computes both distances and chooses the farther endpoint. Consequently, the returned node is always a valid farthest node, which is exactly the node marked last when the marking process starts from u.

Python Solution

from typing import List

class Solution:
    def lastMarkedNodes(self, edges: List[List[int]]) -> List[int]:
        n = len(edges) + 1

        graph = [[] for _ in range(n)]
        for u, v in edges:
            graph[u].append(v)
            graph[v].append(u)

        def get_distances(start: int) -> List[int]:
            dist = [-1] * n
            dist[start] = 0

            stack = [start]
            while stack:
                node = stack.pop()

                for nei in graph[node]:
                    if dist[nei] == -1:
                        dist[nei] = dist[node] + 1
                        stack.append(nei)

            return dist

        dist0 = get_distances(0)
        A = max(range(n), key=lambda x: dist0[x])

        distA = get_distances(A)
        B = max(range(n), key=lambda x: distA[x])

        distB = get_distances(B)

        answer = [0] * n

        for node in range(n):
            if distA[node] >= distB[node]:
                answer[node] = A
            else:
                answer[node] = B

        return answer

The implementation begins by constructing the adjacency list of the tree.

The helper function get_distances performs an iterative DFS and computes distances from a given starting node to every other node. Iterative traversal is used instead of recursive DFS to avoid recursion depth issues on large trees.

The first traversal from node 0 finds a diameter endpoint A.

The second traversal from A finds the opposite diameter endpoint B, while simultaneously computing all distances from A.

The third traversal computes all distances from B.

Finally, for each node we compare its distance to the two diameter endpoints and return whichever endpoint is farther away. That endpoint is guaranteed to be a valid farthest node.

Go Solution

func lastMarkedNodes(edges [][]int) []int {
	n := len(edges) + 1

	graph := make([][]int, n)
	for _, e := range edges {
		u, v := e[0], e[1]
		graph[u] = append(graph[u], v)
		graph[v] = append(graph[v], u)
	}

	getDistances := func(start int) []int {
		dist := make([]int, n)
		for i := range dist {
			dist[i] = -1
		}

		dist[start] = 0
		stack := []int{start}

		for len(stack) > 0 {
			node := stack[len(stack)-1]
			stack = stack[:len(stack)-1]

			for _, nei := range graph[node] {
				if dist[nei] == -1 {
					dist[nei] = dist[node] + 1
					stack = append(stack, nei)
				}
			}
		}

		return dist
	}

	dist0 := getDistances(0)

	A := 0
	for i := 1; i < n; i++ {
		if dist0[i] > dist0[A] {
			A = i
		}
	}

	distA := getDistances(A)

	B := 0
	for i := 1; i < n; i++ {
		if distA[i] > distA[B] {
			B = i
		}
	}

	distB := getDistances(B)

	ans := make([]int, n)

	for i := 0; i < n; i++ {
		if distA[i] >= distB[i] {
			ans[i] = A
		} else {
			ans[i] = B
		}
	}

	return ans
}

The Go implementation follows exactly the same logic as the Python version. Iterative DFS is used through a slice acting as a stack, which avoids recursion depth concerns. Distances are stored in integer slices initialized to -1 to indicate unvisited nodes. Since tree distances are at most n - 1, standard Go int values are more than sufficient and no overflow concerns exist.

Worked Examples

Example 1

Input:

0
/ \
1  2

Edges:

[[0,1],[0,2]]

Find diameter

Distances from 0:

Node Distance
0 0
1 1
2 1

Choose:

A = 1

Distances from A=1:

Node distA
0 1
1 0
2 2

Farthest node:

B = 2

Distances from B=2:

Node distB
0 1
1 2
2 0

Build answers

Node distA distB Answer
0 1 1 1 or 2
1 0 2 2
2 2 0 1

Valid output:

[2,2,1]

Example 2

Input:

0 - 1

Diameter endpoints:

A = 0
B = 1

Distance tables:

Node distA distB
0 0 1
1 1 0

Answers:

Node Result
0 1
1 0

Output:

[1,0]

Example 3

Input:

    0
   / \
  1   2
     / \
    3   4

One valid diameter is:

1 - 0 - 2 - 3

So:

A = 1
B = 3

Distance arrays:

Node distA distB
0 1 2
1 0 3
2 2 1
3 3 0
4 3 2

Answer selection:

Node Answer
0 3
1 3
2 1
3 1
4 1

Output:

[3,3,1,1,1]

Complexity Analysis

Measure Complexity Explanation
Time O(n) Three tree traversals, each visits every node once
Space O(n) Adjacency list, stack, and distance arrays

The tree contains exactly n - 1 edges. Each DFS traverses every edge and node once, requiring O(n) time. Since we perform a constant number of traversals, the overall complexity remains linear. The adjacency list and distance arrays also require linear memory.

Test Cases

sol = Solution()

# Example 1
assert sol.lastMarkedNodes([[0, 1], [0, 2]])[1:] == [2, 1]

# Example 2
assert sol.lastMarkedNodes([[0, 1]]) == [1, 0]

# Example 3
assert sol.lastMarkedNodes([[0, 1], [0, 2], [2, 3], [2, 4]]) == [3, 3, 1, 1, 1]

# Smallest valid tree
assert sol.lastMarkedNodes([[0, 1]]) == [1, 0]

# Chain of length 4
assert sol.lastMarkedNodes([[0, 1], [1, 2], [2, 3], [3, 4]]) == [4, 4, 4, 0, 0]

# Star tree
ans = sol.lastMarkedNodes([[0, 1], [0, 2], [0, 3], [0, 4]])
assert ans[1] != 1  # leaf's farthest node is another leaf
assert ans[2] != 2
assert ans[3] != 3
assert ans[4] != 4

# Balanced binary-like tree
edges = [
    [0, 1], [0, 2],
    [1, 3], [1, 4],
    [2, 5], [2, 6]
]
ans = sol.lastMarkedNodes(edges)
assert len(ans) == 7

# Long chain stress pattern
n = 100
edges = [[i, i + 1] for i in range(n - 1)]
ans = sol.lastMarkedNodes(edges)
assert ans[0] == n - 1
assert ans[n - 1] == 0

Test Summary

Test Why
[[0,1],[0,2]] Multiple valid farthest nodes
[[0,1]] Minimum tree size
Example 3 tree Branching structure
Long chain Diameter equals entire tree
Star tree Many leaves share same center
Balanced tree Multiple levels and branches
Chain of length 100 Larger linear structure

Edge Cases

Tree with Only Two Nodes

The smallest allowed tree contains exactly two nodes connected by one edge. Each node's farthest node is simply the other endpoint. Implementations that assume at least three nodes can fail here. The diameter-based approach handles this naturally because the two nodes are themselves the diameter endpoints.

Multiple Farthest Nodes

In a star-shaped tree, the center has several leaves at the same maximum distance. The problem allows returning any valid farthest node. When distA[node] == distB[node], the implementation chooses A, which satisfies the requirement.

Extremely Deep Tree

A chain containing 100,000 nodes produces a recursion depth of 100,000 if recursive DFS is used. That would overflow the call stack in both Python and many Go environments. The implementation uses iterative DFS with an explicit stack, ensuring it works safely within the constraints.

Diameter Endpoint Ties

Some trees have multiple valid diameter pairs. The algorithm may discover any one of them depending on traversal order. This is completely fine because the theorem guarantees that for any chosen diameter endpoints A and B, one of them is a farthest node from every vertex. Therefore the correctness does not depend on finding a specific diameter pair.