Logic used behind Array Manipulation of HackerRank
These two places helped me to understand this algorithm more clearly.
Prefix_sum_array_and_difference_array
Stack Overflow
If you want a simple and direct explanation:
Initial, the array is 0 0 0 0 0
cpp
after the first operation, 1 2 100
it will become
seq1: 100 100 0 0 0
and after second 2 5 100
seq2: 0 100 100 100 100
and after 3 4 100
seq2: 0 0 100 100 0
but when we apply difference array
at every step, we will get
cpp
diff seq of seq1: 100 0 -100 0 0
diff seq of seq2: 0 100 0 0 0 -100
diff seq of seq3: 0 0 100 0 -100
One important property is the difference sequence of the sum of the sequences is the sum of the difference sequences.
it will give us,
cpp
100 100 0 0 -100 -100(for clarity purpose only)
or you can add all the sequences as
cpp
seq1+seq2+seq3 = 100 200 200 200 100
and then find the difference seq or the difference array which is 100 100 0 0 -100 and then find the prefix array.
Why we ignore the first 100??? Read the first article about difference array and prefix sum array!!!!
and after this, do prefix sum
cpp
100 200 200 200 100 0
Ignore last 0 as the last index we considered is for clarity purpose only.
so,both these steps find the difference array for us:)
cpp
a[p]+=sum;
if((q+1)<=N) a[q+1]-=sum;
We are basically storing the increment in the starting position and one past the last index in the range. For a b k
we will increase +k
for all elements in index [a,b]
but then the next elements will not be increased. So we are subtracting it, because w.r.t the previous increment all elements to the right of the range will be lesser by -k
. We are basically storing all the final values via this increment/decrement.
At last we are calculating the elements on the fly from left to right. If you think more deeply, it is just storing how much one element is bigger than the previous element.
Initially the array will be 0 0 0 0 0
.
After the first operation 1 3 3
originally the array elements should be
3 3 3 0 0
but we are storing it like this
3 0 0 -3 0
Meaning
First element is 3 greater than 0. Second -> 0 greater than index 1 element. Third -> 0 greater than index 2 element Fourth -> -3 greater than index 3 element. fifth -> 0 greater than index 4 element.
After the second operation 2 4 4
originally the array will be 3 7 7 4 0
but we store it like this 3 4 0 -3 -4
. Just like I described in detail keep in mind that and think that way, you will see that we are not losing information. We just store it in a different way.
Final values will be
0+(3) 0+3+(4) 0+3+4+(0) 0+3+4+0+(-3) 0+3+4+0-3+(-4)
3 7 7 4 0 matches with what we got earlier.
Note how we calculate each element. Just adding previous element with the value by which current element is greater.
Note that this solution works because it is being queried only once. If it is queried m
times, then this solution doesn't work because it will time out. Then you will have to dig deeper using advanced data structures like segment trees and/or binary indexed trees.
I'll try to explain my understanding of this:
Each line of input basically describes a sequence, and you are asked to find the maximum value of the sum of these sequences.
For example, if N
is given as 5
:
the line 2 4 13
describes the sequence [0, 13, 13, 13, 0]
the line 3 5 11
describes the sequence [0, 0, 11, 11, 11]
.
If those are the only lines we get the result sequence from the pointwise sum of the two, which is [0, 13, 24, 24, 11]
.
Now another way we can describe the sequences above are by the difference sequences, that is, at index i
we will keep the difference between the element at index i
and the element at index i-1
, and we can get the original sequence by a running sum of the difference sequence.
In the case of the above sequences, the difference sequences are:[0, 13, 0, 0, -13]
for the sequence described by 2 3 13
[0, 0, 11, 0, 0]
for the sequence described by 3 5 11
[0, 13, 11, 0, -13
for the sum of the sequences.
One important property is the difference sequence of the sum of the sequences is the sum of the difference sequences.
So what the solution does, for every line, is to sum the difference sequences (which only requires up to 2 operations due to the nature of the sequences), then to find the maximum it takes the running total of the difference sequence, thus getting the elements of the sequence, and holds the maximum value of that running total.
While the example I gave has only 2 lines, this same idea works for any number of lines.
I hope this gives good intuition as to the idea behind the solution.