MO’s Algorithm (Query square root decomposition)

Once again I found a topic that is useful, interesting but has very less resources online. Before writing this I did a small survey and surprised that almost none of the Indian programmers knew this algorithm. Its important to learn this, all the red programmers on codeforces use this effectively in div 1 C and D problems. There were not any problems on this an year and half back, but from then there is a spike up! We can expect more problems on this in future contests.

1) State a problem
2) Explain a simple solution which takes O(N^2)
3) Slight modification to above algorithm. It still runs in O(N^2)
4) Explain an algorithm to solve above problem and state its correctness
5) Proof for complexity of above algorithm – O(Sqrt(N) * N)
6) Explain where and when we can use above algorithm
7) Problems for practice and sample code

State a problem

Given an array of size N. All elements of array <= N. You need to answer M queries. Each query is of the form L, R. You need to answer the count of values in range [L, R] which are repeated at least 3 times.
Example: Let the array be {1, 2, 3, 1, 1, 2, 1, 2, 3, 1} (zero indexed)
Query: L = 0, R = 4. Answer = 1. Values in the range [L, R] = {1, 2, 3, 1, 1} only 1 is repeated at least 3 times.
Query: L = 1, R = 8. Answer = 2. Values in the range [L, R] = {2, 3, 1, 1, 2, 1, 2, 3} 1 is repeated 3 times and 2 is repeated 3 times. Number of elements repeated at least 3 times = Answer = 2.

Explain a simple solution which takes O(N^2)

For each query, loop from L to R, count the frequency of elements and report the answer. Considering M = N, following runs in O(N^2) in worst case.

for each query:
  answer = 0
  count[] = 0
  for i in {l..r}:
    count[array[i]]++
    if count[array[i]] == 3:
      answer++

Slight modification to above algorithm. It still runs in O(N^2)

add(position):
  count[array[position]]++
  if count[array[position]] == 3:
    answer++

remove(position):
  count[array[position]]--
  if count[array[position]] == 2:
    answer--

currentL = 0
currentR = 0
answer = 0
count[] = 0
for each query:
  // currentL should go to L, currentR should go to R
  while currentL &amp;lt; L:
    remove(currentL)
    currentL++
  while currentL &amp;gt; L:
    add(currentL)
    currentL--
  while currentR &amp;lt; R:
    add(currentR)
    currentR++
  while currentR &amp;gt; R:
    remove(currentR)
    currentR--
  output answer

Initially we always looped from L to R, but now we are changing the positions from previous query to adjust to current query.
If previous query was L=3, R=10, then we will have currentL=3 and currentR=10 by the end of that query. Now if the next query is L=5, R=7, then we move the currentL to 5 and currentR to 7.
add function means we are adding the element at position to our current set. And updating answer accordingly.
remove function means we are deleting the element at position from our current set. And updating answer accordingly.
Edit: Have a look a Shubajit Saha’s comment. And also this.

Explain an algorithm to solve above problem and state its correctness

MO’s algorithm is just an order in which we process the queries. We were given M queries, we will re-order the queries in a particular order and then process them. Clearly, this is an offline algorithm. Each query has L and R, we will call them opening and closing. Let us divide the given input array into Sqrt(N) blocks. Each block will be N / Sqrt(N) = Sqrt(N) size. Each opening has to fall in one of these blocks. Each closing has to fall in one of these blocks.

A query belongs to P’th block if the opening of that query fall in P’th block. In this algorithm we will process the queries of 1st block. Then we process the queries of 2nd block and so on.. finally Sqrt(N)’th block. We already have an ordering, queries are ordered in the ascending order of its block. There can be many queries that belong to the same block.

From now, I will ignore about all the blocks and only focus on how we query and answer block 1. We will similarly do for all blocks. All of these queries have their opening in block 1, but their closing can be in any block including block 1. Now let us reorder these queries in ascending order of their R value. We do this for all the blocks.

How does the final order look like?
All the queries are first ordered in ascending order of their block number (block number is the block in which its opening falls). Ties are ordered in ascending order of their R value.
For example consider following queries and assume we have 3 blocks each of size 3.
{0, 3} {1, 7} {2, 8} {7, 8} {4, 8} {4, 4} {1, 2}
Let us re-order them based on their block number.
{0, 3} {1, 7} {2, 8} {1, 2} {4, 8} {4, 4} {7, 8}
Now let us re-order ties based on their R value.
{1, 2} {0, 3} {1, 7} {2, 8} {4, 4} {4, 8} {7, 8}

Now we use the same code stated in previous section and solve the problem. Above algorithm is correct as we did not do any changes but just reordered the queries.

Proof for complexity of above algorithm – O(Sqrt(N) * N)

We are done with MO’s algorithm, it is just an ordering. Awesome part is its runtime analysis. It turns out that the O(N^2) code we wrote works in O(Sqrt(N) * N) time if we follow the order i specified above. Thats awesome right, with just reordering the queries we reduced the complexity from O(N^2) to O(Sqrt(N) * N), and that too with out any further modification to code. Hurray! we will get AC with O(Sqrt(N) * N).

Have a look at our code above, the complexity over all queries is determined by the 4 while loops. First 2 while loops can be stated as “Amount moved by left pointer in total”, second 2 while loops can be stated as “Amount moved by right pointer”. Sum of these two will be the over all complexity.

Most important. Let us talk about the right pointer first. For each block, the queries are sorted in increasing order, so clearly the right pointer (currentR) moves in increasing order. During the start of next block the pointer possibly at extreme end will move to least R in next block. That means for a given block, the amount moved by right pointer is O(N). We have O(Sqrt(N)) blocks, so the total is O(N * Sqrt(N)). Great!

Let us see how the left pointer moves. For each block, the left pointer of all the queries fall in the same block, as we move from query to query the left pointer might move but as previous L and current L fall in the same block, the moment is O(Sqrt(N)) (Size of the block). In each block the amount left pointer movies is O(Q * Sqrt(N)) where Q is number of queries falling in that block. Total complexity is O(M * Sqrt(N)) for all blocks.

There you go, total complexity is O( (N + M) * Sqrt(N)) = O( N * Sqrt(N))

Explain where and when we can use above algorithm

As mentioned, this algorithm is offline, that means we cannot use it when we are forced to stick to given order of queries. That also means we cannot use this when there are update operations. Not just that, there is one important possible limitation: We should be able to write the functions add and remove. There will be many cases where add is trivial but remove is not. One such example is where we want maximum in a range. As we add elements, we can keep track of maximum. But when we remove elements it is not trivial. Anyways in that case we can use a set to add elements, remove elements and report minimum. In that case the add and delete operations are O(log N) (Resulting in O(N * Sqrt(N) * log N) algorithm).

There are many cases where we can use this algorithm. In few cases we can also use other Data Structures like segment trees, but for few problems using MO’s algorithm is a must. Lets discuss few problems in the next section.

Problems for practice and sample code

DQUERY – SPOJ: Number of distinct elements in a range = number of elements with frequency >= 1. So it is the same problem we discussed above.
Click here for sample code
Note: That code will give TLE on submission, it will give AC if fast I/O is added. Removed fast I/O to keep code clean.
Powerful array – CF Div1 D: This is an example where MO’s algorithm is a must. I cannot think of any other solution. CF Div1 D means it is a hard problem. See how easy it is using MO’s algorithm in this case. You only need to modify add(), remove() functions in above code.
GERALD07 – Codechef
GERALD3 – Codechef
Tree and Queries – CF Div1 D
Powerful Array – CF Div1 D
Jeff and Removing Periods – CF Div1 D
Sherlock and Inversions – Codechef

I am sure there are more problems, if you know any of them, do comment, i will add them.
While this algorithm has a special name “MO”, it is just smart square root decomposition.

Signing off! Wish you a happy new year 🙂

Share this post! Learn and let learn 🙂

  • abhijeet kaurav
  • abhijeet kaurav

    here is the code …* only add and remove are modified*……..
    Please help

    #include
    using namespace std;

    #define N 200010
    #define A 1111111
    #define BLOCK 445 // ~sqrt(N)

    int cnt[A], a[N], ans[N], answer = 0,sum = 0;

    struct node {
    int L, R, i;
    }q[N];

    bool cmp(node x, node y) {
    if(x.L/BLOCK != y.L/BLOCK) {
    // different blocks, so sort by block.
    return x.L/BLOCK < y.L/BLOCK;
    }
    // same block, so sort by R value
    return x.R < y.R;
    }

    void add(int position) {
    sum -= pow(cnt[a[position]],2)*a[position];
    cnt[a[position]]++;
    sum += pow(cnt[a[position]],2)*a[position];
    }

    void remove(int position) {
    sum -= pow(cnt[a[position]],2)*a[position];
    cnt[a[position]]–;
    sum += pow(cnt[a[position]],2)*a[position];
    }

    int main() {

    int n;
    scanf("%d", &n);
    int m;
    scanf("%d", &m);
    for(int i=0; i<n; i++)
    scanf("%d", &a[i]);

    for(int i=0; i<m; i++) {
    scanf("%d%d", &q[i].L, &q[i].R);
    q[i].L–; q[i].R–;
    q[i].i = i;
    }

    sort(q, q + m, cmp);

    int currentL = 0, currentR = 0;
    for(int i=0; i<m; i++) {
    int L = q[i].L, R = q[i].R;
    while(currentL L) {
    add(currentL-1);
    currentL–;
    }
    while(currentR R+1) {
    remove(currentR-1);
    currentR–;
    }
    ans[q[i].i] = sum;
    }

    for(int i=0; i<m; i++)
    printf("%dn", ans[i]);
    return 0;
    }

  • abhijeet kaurav

    on codeforces on test case n. 6

  • abhijeet kaurav

    but it is exceeding time limit by this method on powerful array

    • N(i)tr

      Use int instead of long long int for all except for sum variable and ans array you will get AC.

  • devbishnoi

    Some of you are asking that why the block size is equal to sqrt(n)
    lets assume that block size is x.
    than no of blocks are n / x.
    No of queries in each block is = x = block size.
    we have two pointer left and right in problem.
    Here for each query left will move at most x ( within query ). And for each block right will move at most n( from index 0 to end of array)

    so now for each block left pointer will move = x * x ( x of each query at most) , and right will move n (at most)

    so for each block total moves = left pointer move + right pointer move = x * x + n.

    As we know there are n/x block.

    for all blocks total moves = (n/x) * ( x * x + n).
    f(x) = (n/x) * (x * x + n) = n * x + (n *n ) / x.
    To minimise this function f(x)
    f ‘(x) = 0;
    n + (n*n) * ( -1 / ( x*x ) ) = 0
    n = (n*n)/(x*x)
    x * x = n ==> x = sqrt(n);

  • The links for the exercise problem are broken. The same link goes for last three problems.

  • Gaurav “gkcs” Sen

    Hi Anudeep!
    Thank you for the brilliant blog, this is where I learnt about Mo’s algorithm. For anyone interested in a video tutorial, I made one on https://youtu.be/hqaRYgsLpUI with the SPOJ problem of DQUERY as an example.

    • Shivam Kumar

      why you are alive?

      • Shivam Kumar

        worst video.he don’t know what is MO’s algorithm.

  • Mohammad Faizan

    Read it for the very first time and guess what?? digested it all in one go. Man.. it’s an amazing explanation, loved it.

    • Gaurav “gkcs” Sen

      The problem there is that the right pointer might jump too much. Imagine the scenario with N=100:
      1 100
      2 2
      3 100
      4 4
      5 100
      6 6
      7 100
      8 8
      9 100
      If we sort on L and R, we will traverse about 500 blocks. If we sort on the left block index first, we traverse as:
      BLOCK 1:
      2 2
      1 100
      3 100
      BLOCK 2:
      4 4
      6 6
      5 100
      BLOCK 3:
      8 8
      7 100
      9 100
      Now the right pointer traverses about 300 blocks. Left pointer traversal is still negligible. Hence we go for this sorting technique.

      • Mohammad Faizan

        Thankyou brother, your comment helped me get the beauty of the algo..

  • Arpit Agrawal

    This is my implementation of DeQuery http://pastebin.com/Buic4fVr
    I am getting NZEC can anyone please look into my code and find the bug.

  • Arpit Agrawal

    What is the logic behind taking a block size of sqrt(N).

    • devbishnoi

      see my latest comment above

  • Saurabh Srivastava

    Why the block size SQRT(N), why not something else like log(N) ?

    • nabil1997

      If your block size is log(N), then number of block sizes become N/log(N). In that case, you have to traverse N/log(N) blocks to get the answer in the worst case.

    • devbishnoi

      see my latest comment above

  • Sriraj R

    In Powerful Array, I took the BLOCK value to be 448=sqrt(200000). However it gave a TLE. Increasing block value to 1000 gave me AC. Can you please explain the reason?

    • Sandeep

      I solved this problem today, and your comment helped me get over TLE.
      The actual complexity using MO for Powerfull Array is O((N+T)*B), where N is the size of the array, T is the number of queries and B is no.of blocks.

      In Worst case, when N = T = 200000
      1. Block size = (sqrt(n)), No.of operatios = (200000+200000)*(N/sqrt(N)) = 178885438
      2. Block size = (1000), No.of operatios = (200000+200000)*(N/1000) = 80000000

      Difference in no. of operations: 98885438 (2nd one is faster in worst case).

      What I can say is sqrt(N) is just an optimal size which works better than other values most of the time, not always.

      Though, it would be better if @anudeep2011 explains this.

  • Hi Anudeep,thanks for the amazing post. Although not a bug but I would like to mention to the readers that the code snippet given in the blog suffers from a logical counter intuitiveness that remove operation of an element can happen before add operation on it. For example n = 8, and we have 2 queries (0,7) ans (1,4) then after ordering the queries appear in (1,4) followed by (0,7) and the code snippet removes 0th element while increasing the left pointer while it was not added previously. Now this may cause serious problem because in some problems where this assumption is necessary to ensure certain quantity does not become negative. For problems that require algebraic manipulations e.g powerful array , this is not a problem because algebraic operations like addition and subtraction are order independent. For example in problem http://codeforces.com/problemset/problem/375/D from codeforces you need to maintain a BIT of frequency of occurrences of an element and if remove is permitted before addition than this frequency can go negative and there by causing out of index or infinite loop problems. A simple get around is to first group the add while loops followed by remove while loops instead of grouping them on the basis of left and right pointers to maintain the logical intuitiveness. I think it will be wise to mention this in the blog.

  • Why my code getting runtime error ?? As i guess, itsindex out of bound, but Where is index out of bound happening ?? Anyone can help ??
    Dquery Problem Link : http://www.spoj.com/problems/DQUERY/
    Ideone code link : http://ideone.com/kv6j17

  • Kishore

    algorithms involving segment trees seem far better than this i think…
    constructing segment tree takes O(N) time and then if Q queries are there then it could be done in just O(Q * lg N) time.. Where N is the size of array…
    where this MO’s algorithm could beat the Segment trees??
    Could you give some scenarios??

  • alphaguy4

    https://www.codechef.com/problems/CLOSEFAR
    This problem brought me here… !

  • can anyone give me any idea about https://www.codechef.com/problems/GERALD07
    this problem….
    can not understand the editorial
    thanks

    • coded_mind

      I haven’t seen the editorial but I think that the problem can be solved using SQRT_Decmoposition and MO’s approach.
      First Point : let’s say there are e no. of edges, write down all the edges from 1 to e as 1,2,3, . . . , e
      now do the sqrt decomposition of this list and for each chunk calculate “how many connected components will remain if the graph only had these edges”.
      After decomposition is over , be ready to answer the queries via MO’s approach by sorting them in the correct order.
      Now assume that answer for [a,b] is known in previous step , the left point of this interval also falls in the same chunk in which the left point of previous one falls ,this will look either like
      [a+k,b+j] or [a-k,b+j] or [a+k,b] or [a-k,b], where j,k are positive, j takes at most e-1 and k takes at most sqrt(e), hence guaranteeing the MO’s suggested complexity.
      The big issue is to find out how do we calculate the answer for this new interval, but this should not be an issue if you know well how to use UNION-FIND Data Structure.

  • Abhishek Kumar

    why I am getting WA 🙁
    can anyone provide some testcase where my code gives wrong answer,I have tried many testcases but couldn’t find any mistake.
    problem- http://www.spoj.com/problems/DQUERY/
    code-https://ideone.com/4RJi9p

  • Ashish Tilokani

    Short and easy to understand explanation.

    This problem is from a recent contest which motivated me to understand this algo.

    http://codeforces.com/contest/617/problem/E

  • Jaswant Singh

    Another very useful and new algo … thanks anudeep for such a nice effort .

  • With fast I/o and inline declarations
    Powerful array : http://ideone.com/mc0fNT
    Dquery :http://ideone.com/vwX6mY

    • Sahil Arora

      Powerful array does not need fast IO, if you use cin.tie(NULL);

    • Suraj

      Can you please tell mistake in my add() and remove() function? My Solution
      Thank You.

  • Sir why we take block size always 555

  • Anudeep i could not understand the update function in Powerful Array

    I used the following which gave me TLE :

    void add(long long int position) {
    cnt[a[position]]++;
    answer=answer+cnt[a[position]]*cnt[a[position]]*a[position]-(cnt[a[position]]-1)*(cnt[a[position]]-1)*a[position];
    }

    • Understood 😛 It was a small modification 🙂

    • Sahil Arora

      Hey, Look who I found! 😀

  • We can use a heap with square root decomposition to find the minimum of the range.

  • Pandu

    What should do if sqrt(n) is floating point value?

  • http://www.spoj.com/problems/ZQUERY/
    This problem can be done using Mo’s algorithm.

  • prakhar

    very nice approach to MO’s algorithm….i just dont understand why do we get wrong answer if we change the code in DQUERY for the right pointer to
    while(current_right right) {
    remove(current_right]);
    current_right–;
    }
    and give the initial value of current_right =-1 and current_left=0
    its give the correct answer for test cases but wrong on SPOJ

  • Hey Anudeep I tried to solve this : https://www.hackerearth.com/july-clash-15/algorithm/something-genuine Using the above method.

    But the only difference in this problem is that I have to calculate the number of unique elements in each sub array and there are N*(N+1)/2 possible sub arrays.How do I solve it ?

  • another problem on MO’s algorithm:

    http://www.spoj.com/problems/ZQUERY/

  • In your solution of DQUERY you fixed the block size according to max no of elements possible. So we do not have to change the block size according to no of elements given in the question every time?
    Like if n = 10 then also block size remains 555 ?

  • no need for fast i/o for dquery.. got acc using your concept just declare remove and add inline functions.

  • Shubham Sharma

    http://ideone.com/mpSL6G
    Can you please tell me where am I going wrong ..
    I implemented Mo’s Algorithm and yet I get a TLE…
    Is the implementation correct…
    COmpared with your code I found it almost identical….

  • ma5termind

    hello Anudeep,

    problem : powerful array can be solved online with the same complexity O((Q+N)*sqrt(N)). we can discuss it if u want.

  • darkshadow

    Hi, can you explain me how to modify ADD and REMOVE function to solve powerful array problem of codeforces.

  • Peeyush Yadav

    How to find max. element in range. can you please explain the remove part in detail ?
    I am not able to understand this “we can use a set to add elements, remove elements and report minimum”

  • why we can’t sort on basis of l value instead of block value.please explain????

  • Rajat De

    http://www.spoj.com/problems/FREQUENT/
    Sir i solved this problem using segmentree but i think it can be solved with MO’s algorithm too . if it can be solved using MO’s algorithm , can you tell how to do so?

  • Thank you for this valuable post

  • Rajon Bardhan
  • Himanshu Jaju

    Hey, the links to the problems are not linked properly I guess.

  • Sir ,
    Tanks for a nice tutorial !!

  • Utkarsh Saxena

    Another example can be
    Number of inversions in the subarray [L,R]
    (O(N * Sqrt(N) * log N) algorithm)

    • anudeep2011

      Yes, I added a problem from codechef which asks for the same. Thank you

  • we get Merge Sort Algorithms code please

  • update your solution of DQUERY.there are lot of bugs there.
    ie:
    range of N is 3*10^4 not 3*105^5
    BLOCK should be 174 not 555
    here 1<=a[i]<=10^6 not a[i]<=N ..so size of the arrays that you declared are wrong.
    BTW thank you for this nice explanation on MOs algorithm.Learnt a lot 🙂

  • Deepak

    This question also uses the same concept, I guess 😉 :
    http://codeforces.com/contest/86/problem/D

    • anudeep2011

      Thank you.

  • Aditya Shah
    • anudeep2011

      Thanks for the links.

  • cbgiri

    Sir,
    I think there is bug in your slide modification code which run in 0(n*n).

    if suppose if have an array of 9 element indexing from 0

    1 2 3 1 2 3 1 2 3
    and there are
    6 queries
    query:-
    1) L=0 R=8
    answer should be 3 but came 2 //wrong answer
    2) L=1 R=7
    answer should be 1 and also came 1 //right answer
    3) L=2 R=8
    answer should be 1 and also came 1 //right answer
    4) L=4 R=7
    answer should be 0 but came 1 //wrong answer
    5) L=2 R=8
    answer should be 1 and also came 1 //right answer
    3) L=2 R=7
    answer should be 0 but came 1 //wrong answer

    Please correct me if i am wrong.

    • cbgiri

      Sir, u wrote it correctly in sample code but in blog u wrote it wrongly . so please update this 🙂 .

      • anudeep2011

        I know it, but to keep things simple i did not include the border cases clearly.

        • cbgiri

          k sir.
          and happy New Year 🙂

    • anudeep2011

      Yes, it can be wrong. I really did not look into the boundary conditions and implementation details, scope of this article is to introduce and explain the algorithm and provide enough code so that the reader can implement it on its own.

  • Dhruv

    Hi Anudeep,
    Thanks a lot for this blog, after attempting RRJAM in Dec cookoff I had been trying to read about sqrt decomposition, fenwick trees, etc. I also looked at your solution, but couldn’t follow it completely. I know its a lot to ask, but if you could explain your RRJAM solution, it would really be helpful 🙂
    PS: If you have any relevant resources to learns these related topics, pls do share them too.

    Thanks

  • Pulkit

    Hi! Wonderful explanation. 🙂
    I have been trying GERALD07 http://www.codechef.com/MARCH14/problems/GERALD07 , but unable to understand . Can you post something for the problem to help more like me facing the same issue.
    Thank you 🙂

  • red

    very good and clean explanation thanks Anudeep 🙂

  • programmer

    DQUERY Done 🙂

  • Out of curiosity, is this name used commonly in the literature? Or is this name used specifically by competitive programming community? I searched online for a reference to MO’s algorithm and could not find it.

    • anudeep2011

      MO name is used only in china.

  • i just confused with : what are these lines saying ??

    How does the final order look like?
    All the queries are first ordered in ascending order of their block number (block number is the block in which its opening falls). Ties are ordered in ascending order of their R value.
    For example consider following queries and assume we have 3 blocks each of size 3.
    {0, 3} {1, 7} {2, 8} {7, 8} {4, 8} {4, 4} {1, 2}
    Let us re-order them based on their block number.
    {0, 3} {1, 7} {2, 8} {1, 2} {4, 8} {4, 4} {7, 8}
    Now let us re-order ties based on their R value.
    {1, 2} {0, 3} {1, 7} {2, 8} {4, 4} {4, 8} {7, 8}

    • anudeep2011

      What exactly you did not understand? Those lines summarize the paras above those and gives an example.

      • Hi Anudeep,
        Firstly, good initiative with posting algorithms that are not very familiar to most. Please do continue this.
        I too didn’t understand how (1,2) came after (2,8) after block wise ordering. Isn’t the division of inputs into blocks is based on L-value And then ties are sorted based on R-value?

        • anudeep2011

          Initial ordering is block based. As i told the block size is 3, L = 0,1,2 will be in the same block.

          • how the order has been done means
            firstly it has been done on blocks
            and then on block numbers and on the basis of R values
            so could you help please

          • anudeep2011

            1) Block number.
            2) On ties with block number, then on R value

  • Awesome algo !!!
    Thnk u for posting sir 🙂

  • ma5termind

    if some one feel need here is link to my solution .. I think i have coded in such a manner that one can understand .. after putting some efforts ..

    https://www.hackerrank.com/contests/w12/challenges/white-falcon-and-tree/submissions/code/2445084

  • ma5termind

    Hello Anudeep2011

    Add this to the above list . This one is a tough problem which i solved a month ago using sqrt decomposition + heavy light decomposition ..