Heavy Light Decomposition

Long post with lot of explanation, targeting novice. If you are aware of how HLD is useful, skip to “Basic Idea”.

Why a Balanced Binary Tree is good?

Balanced Binary Tree
Balanced Binary Tree

A balanced binary tree with N nodes has a height of log N. This gives us the following properties:

  • You need to visit at most log N nodes to reach root node from any other node
  • You need to visit at most 2 * log N nodes to reach from any node to any other node in the tree

The log factor is always good in the world of competitive programming 🙂

Now, if a balanced binary tree with N nodes is given, then many queries can be done with O( log N ) complexity. Distance of a path, Maximum/Minimum in a path, Maximum contiguous sum etc etc.

Why a Chain is good?


A chain is a set of nodes connected one after another. It can be viewed as a simple array of nodes/numbers. We can do many operations on array of elements with O( log N ) complexity using segment tree / BIT / other data structures. You can read more about segment trees here – A tutorial by Utkarsh .

Now, we know that Balanced Binary Trees and arrays are good for computation. We can do a lot of operations with O( log N ) complexity on both the data structures.

Why an Unbalanced Tree is bad?

Height of unbalanced tree can be arbitrary. In the worst case, we have to visit O( N ) nodes to move from one node to another node. So Unbalanced trees are not computation friendly. We shall see how we can deal with unbalanced trees.

Consider this example..

Consider the following question: Given and B, calculate the sum of all node values on the path from to B.

Simple Tree

Here are details about the given images

  1. The tree in the image has nodes.
  2. We need to visit N/3 nodes to travel from to D.
  3. We need to visit >N/3 nodes to travel from B to D.
  4. We need to visit >N/2 nodes to travel from C to D.

It is clear that moving from one node to another can be up to O( N ) complexity.

This is important: What if we broke the tree in to 3 chains as shown in image below. Then we consider each chain as an independent problem. We are dealing with chains so we can use Segment Trees/other data structures that work well on linear list of data.

Decomposed Tree

Here are the details after the trick

  1. The tree still has nodes, but it is DECOMPOSED into 3 chains each of size N/3. See 3 different colors, each one is a chain.
  2. and D belong to the same chain. We can get the required sum of node values of path from to in O( log N ) time using segment tree data structure.
  3. belongs to yellow chain, and D belongs to blue chain. Path from to can be broken as B to T3 and T4 to D. Now we are dealing with 2 cases which are similar to the above case. We can calculate required sum in O( log N ) time for to T3 and O( log N ) time for T4 to D. Great, we reduced this to O( log N ).
  4. C belongs to red chain, and belongs to blue chain. Path from to can be broken as to T1, T2 to T3 and T4 to D. Again we are dealing with 3 cases similar to 2nd case. So we can again do it in O( log N ).

Awesome!! We used concepts of Decomposition and Segment Tree DS, reduced the query complexity from O( N ) to O( log N ). As I said before, competitive programmers always love the log factors 😀 😀

But wait the tree in the example is special, only 2 nodes had degree greater than 2. We did a simple decomposition and achieved better complexity, but in a general tree we need to do some thing little more complex to get better complexity. And that little more complex decomposition is called Heavy Light Decomposition.

Basic Idea

We will divide the tree into vertex-disjoint chains ( Meaning no two chains has a node in common ) in such a way that to move from any node in the tree to the root node, we will have to change at most log N chains. To put it in another words, the path from any node to root can be broken into pieces such that the each piece belongs to only one chain, then we will have no more than log N pieces.

Let us assume that the above is done, So what?. Now the path from any node A to any node can be  broken into two paths: to LCA( A, B ) and to LCA( A, B ). Details about LCA – Click Here or Here. So at this point we need to only worry about paths of the following format: Start at some node and go up the tree because to LCA( A, B ) and to LCA( A, B ) are both such paths.

What are we up to till now?

  • We assumed that we can break tree into chains such that we will have to change at most log N chains to move from any node up the tree to any other node.
  • Any path can be broken into two paths such both paths start at some node and move up the tree
  • We already know that queries in each chain can be answered with O( log N ) complexity and there are at most log N chains we need to consider per path. So on the whole we have O( log^2 N ) complexity solution. Great!!

Till now I have explained how HLD can be used to reduce complexity. Now we shall see details about how HLD actually decomposes the tree.

Note : My version of HLD is little different from the standard one, but still everything said above holds.


Common tree related terminology can be found here.

Special Child : Among all child nodes of a node, the one with maximum sub-tree size is considered as Special child. Each non leaf node has exactly one Special child.

Special Edge : For each non-leaf node, the edge connecting the node with its Special child is considered as Special Edge.

Random Graph

Read the next 3 paras until you clearly understand every line of it, every line makes sense (Hope!). Read it 2 times 3 times 10 times 2 power 10 times .. , until you understand!!

What happens if you go to each node, find the special child and special edge and mark all special edges with green color and other edges are still black? Well, what happens is HLD. What would the graph look like then? Colorful yes. Not just colorful. Green edges actually forms vertex disjoint chains and black edges will be the connectors between chains. Let us explore one chain, start at root, move to the special child of root (there is only one special child, so easy pick), then to its special child and so on until you reach a leaf node, what you just traveled is a chain which starts at root node. Let us assume that root node has child nodes. Note that all m-1 normal child nodes are starting nodes of some other chains.

What happens if you move from a node to a normal child node of it. This is the most important part. When you move from a node to any of its normal child, the sub-tree size is at most half the sub-tree size of current node. Consider a node whose sub-tree size is and has m child nodes. If m=1, then the only child is special child (So there is no case of moving to normal child). For m>=2, sub-tree size of any normal child is <=s/2. To prove that, let us talk about the sub-tree size of special child. What is the least sub-tree size possible for special child? Answer is ceil( s/m ) (what is ceil? click here). To prove it, let us assume it is less than ceil( s/m ). As this child is the one with maximum sub-tree size, all other normal child nodes will be at most as large as special child, child nodes with each less than ceil( s/m ) will not sum up to s, so with this counter-intuition. We have the following: The mininum sub-tree size possible for special child is ceil( s/m ). This being said, the maximum size for normal child is  s/2. So when ever you move from a node to a normal child, the sub-tree size is at most half the sub-tree size of parent node.

We stated early that to move from root to any node (or viceversa) we need to change at most log N chains. Here is the proof; Changing a chain means we are moving for a node to a normal child, so each time we change chain we are at least halving the sub-tree size. For a tree with size N, we can halve it at most log N times (Why? Well, take a number and keep halving, let me know if it takes more than ceil( log N ) steps).

HL Decomposed Graph

At this point, we know what HLD is, we know why one has to use HLD, basic idea of HLD, terminology and proof. We shall now see implementation details of HLD and few related problems.



HLD(curNode, Chain):
    Add curNode to curChain
    If curNode is LeafNode: return                    //Nothing left to do
    sc := child node with maximum sub-tree size       //sc is the special child
    HLD(sc, Chain)                                    //Extend current chain to special child
    for each child node cn of curNode:                //For normal childs
        if cn != sc: HLD(cn, newChain)                //As told above, for each normal child, a new chain starts

Above algorithm correctly does HLD. But we will need bit more information when solving HLD related problems. We should be able to answer the following questions:

  1. Given a node, to which chain does that node belong to.
  2. Given a node, what is the position of that node in its chain.
  3. Given a chain, what is the head of the chain
  4. Given a chain, what is the length of the chain

So let us see a C++ implementation which covers all of the above

int chainNo=0,chainHead[N],chainPos[N],chainInd[N],chainSize[N];
void hld(int cur) {
    if(chainHead[chainNo] == -1) chainHead[chainNo]=cur;
    chainInd[cur] = chainNo;
    chainPos[cur] = chainSize[chainNo];

    int ind = -1,mai = -1;
    for(int i = 0; i < adj[cur].sz; i++) {         if(subsize[ adj[cur][i] ] > mai) {
            mai = subsize[ adj[cur][i] ];
            ind = i;

    if(ind >= 0) hld( adj[cur][ind] );

    for(int i = 0; i < adj[cur].sz; i++) {
        if(i != ind) {
            hld( adj[cur][i] );

Initially all entries of chainHead[] are set to -1. So in line 3 when ever a new chain is started, chain head is correctly assigned. As we add a new node to chain, we will note its position in the chain and increment the chain length. In the first for loop (lines 9-14) we find the child node which has maximum sub-tree size. The following if condition (line 16) is failed for leaf nodes. When the if condition passes, we expand the chain to special child. In the second for loop (lines 18-23) we recursively call the function on all normal nodes. chainNo++ ensures that we are creating a new chain for each normal child.


Problem : SPOJ – QTREE
Solution : Each edge has a number associated with it. Given 2 nodes and B, we need to find the edge on path from to B with maximum value. Clearly we can break the path into to LCA( A, B ) and to LCA( A, B ), calculate answer for each of them and take the maximum of both. As mentioned above as the tree need not be balanced, it may take upto O( N ) to travel from to LCA( A, B ) and find the maximum. Let us use HLD as detailed above to solve the problem.

Solution Link : Github – Qtree.cpp (well commented solution)
I will not explain all the functions of the solution. I will explain how query works in detail

main() : Scans the tree, calls all required functions in order.
dfs() : Helper function. Sets up depth, subsize, parent of each node.
LCA() : Returns Lowest Common Ancestor of two node
make_tree() : Segment tree construction
update_tree() : Segment tree update. Point Update
query_tree() : Segment tree query. Range Update
HLD() : Does HL-Decomposition, similar to one explained above
change() : Performs change operation as given in problem statemenm

query() : We shall see in detail about the query function.

int query(int u, int v) {
    int lca = LCA(u, v);
    return max( query_up(u, lca), query_up(v, lca) );

we calculate LCA(u, v). we call query_up function twice once for the path u to lca and again for the path v to lca. we take the maximum of those both as the answer.

query_up() : This is important. This function takes 2 nodes u, v such that v is ancestor of u. That means the path from u to v is like going up the tree. We shall see how it works.

int query_up(int u, int v) {
    int uchain, vchain = chainInd[v], ans = -1;

    while(1) {
        if(uchain == vchain) {
            int cur = query_tree(1, 0, ptr, posInBase[v]+1, posInBase[u]+1);
            if( cur > ans ) ans = cur;
        int cur = query_tree(1, 0, ptr, posInBase[chainHead[uchain]], posInBase[u]+1);
        if( cur > ans ) ans = cur;
        u = chainHead[uchain];
        u = parent(u);
    return ans;

uchain and vchain are chain numbers in which and are present respectively. We have a while loop which goes on until we move up from till v. We have 2 cases, one case is when both and belong to the same chain, in this case we can query for the range between and v. We can stop our query at this point because we reached v.
Second case is when and v are in different chains. Clearly is above in the tree than u. So we need to completely move up the chain of u and go to next chain above u. We query from chainHead[u] to u, update the answer. Now we need to change chain. Next node after current chain is the parent of chainHead[u].

Following image is the same tree we used above. I explained how query works on a path from to v.

Example Path


SPOJ – QTREE – http://www.spoj.com/problems/QTREE/
SPOJ – QTREE2 – http://www.spoj.com/problems/QTREE2/
SPOJ – QTREE3 – http://www.spoj.com/problems/QTREE3/
SPOJ – QTREE4 – http://www.spoj.com/problems/QTREE4/
SPOJ – QTREE5 – http://www.spoj.com/problems/QTREE5/
SPOJ – QTREE6 – http://www.spoj.com/problems/QTREE6/
SPOJ – QTREE7 – http://www.spoj.com/problems/QTREE7/
SPOJ – COT – http://www.spoj.com/problems/COT/
SPOJ – COT2 – http://www.spoj.com/problems/COT2/
SPOJ – COT3 – http://www.spoj.com/problems/COT3/
SPOJ – GOT – http://www.spoj.com/problems/GOT/
SPOJ – GRASSPLA – http://www.spoj.com/problems/GRASSPLA/
SPOJ – GSS7 – http://www.spoj.com/problems/GSS7/
CODECHEF – GERALD2 – http://www.codechef.com/problems/GERALD2
CODECHEF – RRTREE – http://www.codechef.com/problems/RRTREE
CODECHEF – QUERY – http://www.codechef.com/problems/QUERY
CODECHEF – QTREE – http://www.codechef.com/problems/QTREE
CODECHEF – DGCD – http://www.codechef.com/problems/DGCD
CODECHEF – MONOPLOY – http://www.codechef.com/problems/MONOPLOY

All codechef problems have Editorials.
QTREE : Explained above as example
QTREE2 : Segment trees this time should support sum operation over range. There is no update operation. Note that you can also use BIT. For finding the Kth element on a path, we can use LCA style jumping, it can be done in O( log N ).
QTREE3 : Simple. Segment tree should answer the following: index of left most black node in a range.
QTREE4 : You need the maximum distance. Clearly we can take the left most white node and right most white node. Distance between them will be the required answer. So our segment tree should be able to answer left most white node index as well as range sum.
QTREE5 : Bit hard. Read editorial of QTREE6 to get idea. (Correct me if I am wrong, I did not solve it yet)
QTREE6 : Editorial – Click Here
QTREE7 : Hard. Read editorial of MONOPOLY and GERALD2.
MONOPOLY : Editorial – Click Here
GERALD2 : Editorial – Click Here

The End!

Finallyyy!!! Long one!! When I was learning various algorithms and doing programming, the topic that bugged me the most was HLD, I did not understand it for a long time, also there was not any good tutorial. So I decided to start with HLD, I hope it will help a lot 🙂 This is my first attempt in writing a post/tutorial. Do share your views on it. Also I took a lot of time setting up this blog on EC2 (3 days, well I am new to this stuff), you can use contact me to write anything related to this blog, about cautions or bugs or will-do-good changes.
Share this post. Learn and let learn! 😀

  • Huy Thái Nguyễn

    nice blog. thank you!. 2020 but it’s very good for me


    Great Explanation.

    Java Implementation of QTREE : https://ideone.com/kIH2N5

  • Akash Bhalotia

    This is such an AMAZING tutorial! Thanks a ton!! 🙂

  • Mahatej Varma

    Thop anna nuvu..Intha manchi blog unchukuni evevo chusa. Thank you for this.

  • Test Zero

    For QTREE7 is this approach right? Use heavy light decomposition and for each node store what the answer would be if the node were white/black. For each decomposed chain maintain a persistent segment tree answering max updates. If say a node is changed from black to white that we propagate the white segment tree update upward, and for black we find the most upward node whose value is <= our node's value in the current chain and then we revert to the previous version of the corresponding nodes in the segment tree. Is this approach right?

  • Test Zero

    Can someone point out the issue with this submission for QTREE3 on SPOJ? All testcase I try pass but i get 0 on spoj.


  • Apurva Saraogi

    Its a great tutorial, Thanks 😀

  • LetsTrie

    Clear Code of HLD : Link

  • Ziyi Zhang

    I am first time learning this topic. This blog is really helpful, thank you!!

  • Erfan

    A great tutorial ! thanks.

  • Nishant Arya Kumawat

    what is special child if n=3,and edges are : 1-2,1-3,1-4.

  • soham roy

    do we not have to build a seg tree for every chain??please tell me how its working

    • darth vader

      So the thing is that baseArray[] contains all edge costs. It contains them such that all costs of chain 0 are stored first, then all edge costs of chain 1, then chain 2 and so on.
      And then we build a segment tree over this BaseArray.
      So in a way you can say that we have built numerous segtrees for all the chains, only difference is that all chains are present in concatenated manner in the same BaseArray.
      If we query the segTree for range from start to end of one chain, it would be equivalent of querying the segTree for the first chain.

  • Rafid Bin Mostofa

    Loved the way explained. 😀

  • p k

    in QTREE4 as you said we just need to find the farthest node. Is it really work because there are negative weight also! If not please explain how to solve it.

    and Thanks for such a nice editorial 🙂

  • yash jobanputra

    getting tle please help me.
    link: https://ideone.com/lbmoC3

  • Hemakshi Sachdev

    While changing the chain in line 92 of the QTREE code.. why have we not considered the edge between u and it’s parent??

  • Harish Chandra

    Can anyone explain how the lca works dynamically in this code. Thanks 🙂

  • Abhishek Tiwari

    Best explanation on HLD imo.

    the only thing that i couldn’t get was in your query_up() method u didn’t check at every point if the depth of the head of chain containing node ‘u’ is greater than that of head of chain containing node ‘v’ …bcoz if both nodes belong to different chains and the head of the other chain ( the chain having node v ) is at greater depth… node ‘u’ ( that is going up ) will never reach chain of node ‘v’ as a swap would be required in the while loop ( if i’ve comprehended your solution correctly )…
    well that costed me too many TLEs on spoj before getting an AC.

    I hope it helps someone getting TLE due to that inifinite trap.

    PS : Thanks for the post. Great work. 🙂

    • Himanshu Singh

      I think u missed ” v is an ancestor of u “.

      • Abhishek Tiwari

        the case can still arise ( even if v is an ancestor of u ) if there are more than n-1 edges, but given its a tree so there is no need of that…
        but i did check for that condition indeed and got an AC…earlier i was getting tle… don’t know why.

  • prashant upadhyay

    Hi Anudeep.. it is exceptional blog.
    I tried this approach in of the question i was solving. Though I just tried to understand the concept and tried implementing on my own. On my local machine it does seem to provide me logN chain hops at max. However When I am trying it online judge, it is going way higher than logN chain hops. Locally I am trying to create big testcases randomly. Still it tells me no more than logN hops. Can you help me debug.

    DFS_SIZE function is preprocessing for DFS i.e. special child caching.
    DFS is the original which creates the cache for querying further.
    can you help ? I dont see any reason for chain hops going over logN

    • Archit

      Can you share the code that you are using?

  • Pravin Parihar

    Why am i getting TLE ?? Can anybody please explain ??

    • Pravin Parihar

      UPD : Hi Guys, Due to tight constraints , I was getting TLE. I just replaced cin,cout with scanf, printf and it worked 🙂

  • Zeeshan

    On the code in github for QTREE
    why you took LN = 14

    • Traveler

      LN means LOGN
      search for LCA algorithm

  • Zeeshan

    why LN = 14?

    • SHA AAA

      because 2^14>10000

  • shahidul_brur

    Why I am getting TLE?
    Here is my code: http://ideone.com/wCmdPH

    • shahidul_brur

      Now got AC !!

      • Ajay verma

        i am also getting tle, can you help?

        • Jalak Patel

          for c++ :
          don’t use cin/cout
          use printf/scanf

  • Junaid

    pa[i][j] = pa[i-1][pa[i-1][j]]; what is this statement for?

    • Richard Yi

      pa[i][j] stores the (2**i)-th parent of node j
      It can be calculated by finding the 2**(i-1)th parent of the 2**(i-1)th parent of j, which is what this code does

  • Yeah its true.. Competitive programmers love the log factor more..

  • Great explanation, enjoyed reading this.. keep up the good work man (Y)

  • Thanks a ton! This post is helpful 🙂

  • Sarthak Taneja
  • Sarthak Taneja

    Hey, I am getting TLE with the code written by me though it almost same as that whose link you have shared.
    If possible please tell me mistake….

  • it’s the best guidance i have ever seen, thx very much!

  • PrudhviRaj

    Nice one

  • abhishek

    no explanation can be better than your’s.

  • ibahadir.altun

    I’ve never seen such a awesome tutorial before. Everything has explained clearly, thank you very much 🙂

  • bholagabbar

    Great tutorial. I think there are a couple of things I would like to clarify. I think it could help clear concepts for others too.

    1. We are using the terminology ‘special child’ and ‘special edge’. AFAIK, in the original Heavy-Light Decomposition, an edge is called ‘Heavy’ IFF the *subtree rooted at the child vertex is >= (1/2) size of subtree rooted at parent. It is entirely possible that a node has no ‘heavy children’.
    However, here in this tutorial, I believe we are replacing the word ‘heavy’ with ‘special’ and are removing the 1/2 constraint. Every parent node has a special child, it should be the one with max subtree size that’s it. Am I correct in stating this?

    2. In your implementation, (https://github.com/anudeep2011/programming/blob/master/qtree.cpp#L194), can you please explain line 84? Why are we querying posInBase[v]+1, posInBase[u]+1? Why the +1? I’m not sure but it it because the edges are stored at a position +1 from the vertex?

    The rest seems pretty clear. Thanks

    @anudeep, I’d love if you could clarify my doubts 🙂

  • Ty for the tutorial.
    I have a question. Can we keep the answer of each chain and for the whole chain answer the query in O(1)? We would have O(log(N)) for each given query then.
    Waiting for a reply.

  • Jaswant Singh

    nice tutorial .. 🙂 was trying for an generalized code in Java , i found one at Github : https://github.com/indy256/codelibrary/blob/master/java/src/HeavyLight.java
    but unable to understand functions [modifyOperation() , deltaEffectOnSegment() , getNeutralDelta()].
    ……….please provide some explanation for these functions so . It will be helpful.
    Thanks .

  • Vidit

    i couldn’t understand the function of otherEnd array and how change function works, please help me understand it clearly.

  • Vidit

    In the code for qtree i can’t understand the use of otherEnd array

    • tharun

      Let’s say we fix the root(in this case 0). Now, we can uniquely represent an edge with just a single vertex(child) since each vertex has only one parent. You can use a simple diagram for better understanding.
      Let’s say we have an edge(b, a) or edge(a,b), assume level(b) > level(a) i.e., ‘b’ is child of ‘a’, we can represent both edges with just a vertex ‘b’.
      So, representative of edge(a, b) = representative of edge(b, a) = b(child in vertices ‘a’ and ‘b’ i.e., vertex with higher level).
      ‘otherEnd’ array stores the ‘representative’ for corresponding edge-index(starting from 0 to n-2).
      Since, each edge can be represented with single element, we can represent the edge-cost to corresponding ‘representative’ in an array. ‘baseArray’ is used for this purpose.
      In change() function, we are provided with edge index(between 0 and n-2, inclusive). Now, for this particular edge, we find its representative using ‘otherEnd’. Now, we need to change the corresponding edge-cost of this representative in ‘baseArray’. We can find the index of this representative in ‘baseArray’ using the ‘posInBase’ array.
      As we know, ‘posInBase’ array is used for grouping of chains’ elements. Here, index is the vertex and value is position in ‘baseArray’.
      Hope, this helps.

  • Excellent explanation! Loved it!!

  • parijat

    could you tell me why there is +1 in ->query_tree(1, 0, ptr, posInBase[v]+1, posInBase[u]+1);

    • parijat

      got it thanks.
      amazing tutorial. please continue this good work

      • Shamit

        Can you explain why +1 is used. I understood +1 for v because v is the ancestor node and the cost of edge v-x ( where v is parent of x) is stored in node x. But why is +1 used for “u”?

        • ibahadir.altun

          I have been unstable at that point, too. But then I realized that, in make_tree the base-case is as follows :
          if(s == e-1) {
          st[cur] = baseArray[s];
          So, e equals to (end_of_current_segment + 1). That’s the reason behind why we used (posInBase[u] + 1). So you can easily build tree like, make_tree(1, 0, ptr – 1), and change the query_tree function to the query_tree(1, 0, ptr – 1, posInBase[v]+1, posInBase[u]). There’s is no need to use +1 if you change functions as above.

  • I_am_looser

    I have read post on topcoder for LCA.In that post they are checking for condition P[p][i]!=-1 …but in your code you ain’t why?

  • gaurav chandel

    Brilliant explanation…You have done a lot of hardwork….Its easy and its fun…..Thank you very much……and please , dont stop writing such tutorials …

  • Please can you tell me how you are calculating the LCA of two nodes using DP and that recursive function ?

  • shubham

    nice tutorial

  • Parshant garg

    awesome tutorial. I need more tutorial on some topics trie, suffix array.

  • sai krishna

    colored one are different chains am i right right?

  • Saurabh Verma

    You made it too easy for me 🙂

  • there are many algos for lca,what algorithm would you prefer to use for finding lca ?

  • What is the purpose of this array otherEnd[] ? I’m struct in here.

    • Traveler

      just for update query, you need to know what vertices are connected with ith edge. That is what otherEnd[] is for. It is not part of HLD algorithm, it is just for the problem QTREE.

  • I want more super-easy-to-understand posts like this. Thank you so much!

  • Saurabh Hota

    In QTREE4 explanation you have not taken care of the negetive weights

  • ankit

    hi anudeep..can u plz explain the dynamic programming approach u used for computing lca..

  • Nikhil Gupta

    Fantastic tutorial!

  • Shubham

    Hi, In query_up function,why in this line “query_tree(1, 0, ptr, posInBase[v]+1, posInBase[u]+1);'” posInBase[v]+1 & posInBase[u]+1 is used ?
    Shouldn’t be posInBase[v] & posInBase[u]?
    Thnx. for the great tutorial !!!

  • Aditya

    Anudeep, in your explanation og QTREE4 can you what you mean by the left most and right most white nodes? In the question it is stated that the distance between nodes can be negative.

    • anudeep2011

      Note that the solution is not trivial.
      We need a, b such that a and b are white and distance(a, b) is maximum.
      What ever the a, b be, for node LCA(a, b) one will be in its left subtree, other will be in its right subtree.
      For each node x, if you have the left farthest white node and right farthest white node, you can get the answer.

      • Rishit

        Anudeep, can you elaborate more on this? I am trying to solve using above mentioned technique, but unable to proceed further. I am stuck in finding out the information to store in segment tree base array for answering query in logarithmic time.

      • Mohammad Faizan

        But what about the negative weights on edges??

  • Vladimir

    “Each non leaf node has exactly one Special child.” And what if:
    Who is special here?

    • anudeep2011

      It can be either of them.

    • Isn’t that 4 actually 5?

  • Vivek Rai

    Thats really great 🙂 =D !!!!!! Thanks !!

  • Tom

    “The mininum sub-tree size possible for special child is ceil( s/m ). This being said, the maximum size for normal child is ceil( s/m ) and m>2.” – I think the first sentence is true, but the second one is not.

    • anudeep2011

      Corrected 🙂

  • I liked the way you chained a node with random node when there was no heavy edge connected to it. Nice optimization.

  • Thank you very much!

  • Sharad

    Hi, anudeep i think there is some problem with the 2nd para in which you said that the sub-tree size at least becomes half and on the other hand you are saying that it’s size should be <=s/2 which indicates that instead of at-least it should be at-most which is ambiguous in your description .

    • anudeep2011

      Made it clear. Thank you!

  • Aravind Srinivas

    Nice tutorial.

  • Ahmed Morst

    your post is really helpful thnx alot 😀

  • sunil

    anudeep can u please add a driver main for which this fuction hld will run


  • Really beautiful technique.. and explanation made it so easy to grab, which otherwise would have been tough to get at once… 🙂
    If you have time, please write about O(n) suffix array and LCP construction.. there are lots of O(nlg^2n) material on web, but for O(n) either mosly links are redirected to research papers or explained in very cryptic style..

    • anudeep2011

      Thank you.
      About the O(N) suffix array,
      1) I never tried to learn it, for competitive programming it was not worth it.
      2) Almost always a O(N * log N) is good enough.

      I will anyway add that to my TODO list (To learn it) and then decide about writing a blog or not 🙂

  • Great tutorial man, you make great hard work to understand this, but make a way for us to learn the thing only in 10 mins! 🙂 Hats off to you buddy! Carry on! 🙂

  • Jules

    “This being said, the maximum size for normal child is ceil( s/m ) and m>2.”
    This is false, right?
    You can only say it’s s/2 maximum, but for example if m=3, there can be one child with 5, another with 5 and the third one with 2 ; and ceil(s/m) = 4.
    (What you actually proved is that not all normal child can be greater than ceil(s/m))

    That being said, thank you for the tutorial ! 🙂

    • anudeep2011

      Corrected! Thank you

  • Pratik kumar

    Hey Anudeep nice tutorial…..easy to understand and to the point.
    Please look my code for errors:

    I used the same concept but kept edge cost at the child side of the edge so as to avoid ambiguity as you did and hence queried using the child sided node…..but still it returns sigsegv error on submission on spoj.

    • anudeep2011

      Looking into code and finding bug is just very hard. I would not want to do that.

  • Sikander

    Had been postponing learning HLD since months because of it being a hard topic, learned it in less than an hour thanks to your amazing explanation.

  • Can u explain How can we solve http://www.spoj.com/problems/COT/ using HLD bcz i didn’t figure out.

    • You need to use persistent segment tree too. Trying to code right now 😀

  • Wow! Just landed up on your blog and it’s very informative.
    You explain stuff very well. Should leave Google and start conducting workshops in India. 😛

    PS. If you can please do a blog article on different uncommon data structures(not the common ones). 🙂

    • anudeep2011

      Thank you! Yes, I always try to blog about stuff that has least resources online, or the ones that are not common with sub-continent programmers!

  • rahul kumar

    hi anudeep ,thank u for this greate tutorial ..finally i learnt how the hld works…
    i implemented QTREE but it gives TLE …i used similar code for HLD as of yours
    here is the link https://github.com/princerk/code/blob/master/.gitignore
    i uesd segment tree discusssed in utkarsh’s blog and calculated LCA using HLD
    (im sure it wont take more than 5 min of urs)

  • rohit verma

    nice tutorial 🙂

  • Mriganka Basu Roy Chowdhury

    This is great. 😀 Yes, totally agreed, HLD has very less number of tutorials; so thanks a lot. Could you please refer me / write one (this is even better :D) tutorial on 2D segtrees? This year I went to IOITC, and there was a problem on 2D BIT; in IOI 2013 there was one on 2D seg trees, so I thought knowing 2D segtrees would help.

    • anudeep2011

      Thank you. I will surely consider that topic.