Problem Statement
Pattern:
O(n^2) Solution
List<TreeNode> res;
HashMap<String, Integer> serialCount;
// post order serialise
String pos(TreeNode root){
if(root == null) return "#";
// serialise curr subtree
String serial = root.val + "," + pos(root.left) + pos(root.right);
// update serialCount and res
int freq = serialCount.getOrDefault(serial, 0);
serialCount.put(serial, ++freq);
if (freq == 2) res.add(root);
return serial;
}
public List<TreeNode> findDuplicateSubtrees (TreeNode root){
res = new LinkedList<>();
serialCount = new HashMap<>();
pos(root);
return res;
}
Notes
- For every node, serialise its subtree convert its
root, left, right
values into a string. and save this serials frequency in a hashmap. - since we only have to return a single instance of duplicate subtrees, check if
freq == 2
, and add to result - effectively postorder traverse and serialise
Now, Why O(N^2)
- at the bottom left most node/subtree , the size of the
serial
string is1
- at the root the size of the
serial
string is >n
where n is number of nodes. - why does this matter?
- because concatenation, and hashing operations complexity increases linearly with the size of the string
- so at the root the concatenation has to concatenate a string of characters for all nodes = n, its child n-1, for its child n-2, so on and so forth
- therefore O(n^2)
O(n) Solution
if we could instead of appending all the values, could somehow ‘cache’ the serial
, effectively assign the serial an id
, so we didnt have to concatenate, and append these serial
s over and over again, and we could simply identify , concatenate and hash a serial based off if its id
int uniqueSerials = 1;
HashMap<String, Integer> serialID;
HashMap<Integer, Integer> idCount;
List<TreeNode> res;
int getSerial(String serial) {
// get serialID if exists
if (serialID.containsKey(serial)) return serialID.get(serial);
// else get unique serial
serialID.put(serial, ++uniqueSerials);
return uniqueSerials;
}
// post order serialise
int pos(TreeNode root) {
if (root == null) return 0;
// serialise curr subtree
String serial = pos(root.left) + "," + root.val + "," + pos(root.right);
// get an ID for serial
int srID = getSerial(serial);
// update idCount and res
int freq = idCount.getOrDefault(srID, 0);
idCount.put(srID, ++freq);
if (freq == 2) res.add(root);
return srID;
}
public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
serialID = new HashMap<>();
idCount = new HashMap<>();
res = new LinkedList<>();
pos(root);
return res;
}
-
In the first approach, imagine that from left side you got a string like “2,3,4,5” etc while from right you got “7,8,9,10”, then you will be concatenating 4 + 4 operations(excluding commas and if you include them its even more).
-
Whereas with ids, this is always bounded to an integer at most from left side you will get 10 chars(-2147483647 or 2147483647) and from right you will get again max 10 chars. So at most your String concatenation will be 10 chars + 10 chars.