How to determine the longest increasing subsequence using dynamic programming?
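Speaking of DP solutions, I found it surprising that no one mentioned that LIS can be reduced to LCS (longest common subsequence). All you need to do is sort a copy of the original sequence, remove all duplicates, and compute the LCS of the original sequence and the result. Removing the duplicates is what makes the subsequence strictly increasing. In pseudocode: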
def LIS(S):
    T = sort(S)
    T = removeDuplicates(T)
    return LCS(S, T)
And here is a full implementation written in Go. It keeps the whole DP matrix for simplicity; if you do not need to reconstruct the solution, you do not need to maintain the whole n^2 matrix, since each row depends only on the previous one.
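// lcs returns the length of the longest strictly increasing subsequence of
// arr1, computed as the LCS of arr1 and its sorted, de-duplicated copy.
// Requires: import "sort"
func lcs(arr1 []int) int {
    if len(arr1) == 0 {
        return 0
    }
    // arr2 is a sorted copy, so the caller's slice is left untouched.
    arr2 := make([]int, len(arr1))
    copy(arr2, arr1)
    sort.Ints(arr2)
    // arr3 is the sorted copy with duplicates removed.
    arr3 := []int{arr2[0]}
    for _, v := range arr2[1:] {
        if v != arr3[len(arr3)-1] {
            arr3 = append(arr3, v)
        }
    }
    n1, n2 := len(arr1), len(arr3)
    // M is an (n2+1) x (n1+1) table backed by a single flat allocation.
    M := make([][]int, n2+1)
    e := make([]int, (n1+1)*(n2+1))
    for i := range M {
        M[i] = e[i*(n1+1) : (i+1)*(n1+1)]
    }
    for i := 1; i <= n2; i++ {
        for j := 1; j <= n1; j++ {
            if arr1[j-1] == arr3[i-1] {
                M[i][j] = M[i-1][j-1] + 1
            } else if M[i-1][j] > M[i][j-1] {
                M[i][j] = M[i-1][j]
            } else {
                M[i][j] = M[i][j-1]
            }
        }
    }
    return M[n2][n1]
}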
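The remark about not needing the whole matrix can be sketched as follows. This is a minimal Python version of the same LIS-via-LCS reduction that keeps only two rows of the table; the function name `lis_length_via_lcs` is made up for illustration and it returns only the length, not the subsequence.

```python
def lis_length_via_lcs(seq):
    """Length of the longest strictly increasing subsequence of seq,
    computed as LCS(seq, sorted unique values of seq) with two DP rows."""
    # Sorted copy with duplicates removed (the "T" of the pseudocode).
    target = sorted(set(seq))
    prev_row = [0] * (len(target) + 1)
    for x in seq:
        curr_row = [0] * (len(target) + 1)
        for j, t in enumerate(target, start=1):
            if x == t:
                # Extend the common subsequence ending just before both items.
                curr_row[j] = prev_row[j - 1] + 1
            else:
                # Otherwise carry over the better of the two neighbours.
                curr_row[j] = max(prev_row[j], curr_row[j - 1])
        prev_row = curr_row
    return prev_row[-1]
```

Memory drops from O(n^2) to O(n), but the running time is still O(n^2), so this is mainly of interest as an illustration of the reduction.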
OK, I will first describe the simplest solution, which is O(N^2), where N is the size of the collection. There is also an O(N log N) solution, which I will describe afterwards. Look here for it at the section Efficient algorithms.
I will assume the indices of the array run from 0 to N - 1. Define DP[i] to be the length of the LIS (longest increasing subsequence) that ends at the element with index i. To compute DP[i], we look at all indices j < i and check both that DP[j] + 1 > DP[i] and that array[j] < array[i] (we want it to be increasing). If both hold, we can update the current optimum for DP[i]. To find the global optimum for the array, take the maximum value from DP[0...N - 1].
int maxLength = 1, bestEnd = 0;
DP[0] = 1;
prev[0] = -1;

for (int i = 1; i < N; i++)
{
    DP[i] = 1;
    prev[i] = -1;

    for (int j = i - 1; j >= 0; j--)
        if (DP[j] + 1 > DP[i] && array[j] < array[i])
        {
            DP[i] = DP[j] + 1;
            prev[i] = j;
        }

    if (DP[i] > maxLength)
    {
        bestEnd = i;
        maxLength = DP[i];
    }
}
I use the array prev so that I can later recover the actual sequence, not only its length. Start at bestEnd and follow prev[bestEnd], prev[prev[bestEnd]], and so on in a loop. The value -1 is the signal to stop.
OK, now to the more efficient O(N log N) solution:
Let S[pos] be defined as the smallest integer that ends an increasing sequence of length pos. Now iterate through every integer X of the input set and do the following:
1. If S is empty or X > last element in S, then append X to the end of S. This essentially means we have found a new longest LIS.
2. Otherwise, find the smallest element in S that is >= X, and change it to X. Because S is sorted at all times, the element can be found using binary search in O(log N).
Total runtime: N integers and a binary search for each of them, so N * log(N) = O(N log N).
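The two rules above can be sketched in a few lines of Python. This is a minimal sketch that returns only the length; the function name `lis_length` is made up for illustration, and the standard library's `bisect_left` plays the role of the binary search.

```python
from bisect import bisect_left


def lis_length(values):
    """Length of the longest strictly increasing subsequence of values."""
    # tails[k] is the smallest value that ends an increasing
    # subsequence of length k + 1 (the list S from the text, 0-based).
    tails = []
    for x in values:
        # Position of the smallest element in tails that is >= x.
        pos = bisect_left(tails, x)
        if pos == len(tails):
            # x is larger than every tail: a longer subsequence exists.
            tails.append(x)
        else:
            # Replace that element with the smaller ending value x.
            tails[pos] = x
    return len(tails)
```

Using `bisect_left` (rather than `bisect_right`) is what keeps the subsequence strictly increasing: an element equal to an existing tail replaces it instead of extending the list.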
Now let's do a real example:
Collection of integers:
2 6 3 4 1 2 9 5 8
Steps:
0. S = {} - Initialize S to the empty set
1. S = {2} - New largest LIS
2. S = {2, 6} - New largest LIS
3. S = {2, 3} - Changed 6 to 3
4. S = {2, 3, 4} - New largest LIS
5. S = {1, 3, 4} - Changed 2 to 1
6. S = {1, 2, 4} - Changed 3 to 2
7. S = {1, 2, 4, 9} - New largest LIS
8. S = {1, 2, 4, 5} - Changed 9 to 5
9. S = {1, 2, 4, 5, 8} - New largest LIS
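So the length of the LIS is 5 (the size of S). Note that S itself is not necessarily a valid subsequence of the input: here 4 appears in the input before 1 and 2.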
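As a check on the steps listed above, here is a small Python sketch that records a snapshot of S after each input element. The function name `trace_tails` is hypothetical; the update rule is the same append-or-replace rule described earlier.

```python
from bisect import bisect_left


def trace_tails(values):
    """Returns the list of snapshots of S after processing each value."""
    tails = []
    snapshots = []
    for x in values:
        pos = bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)   # new longest subsequence found
        else:
            tails[pos] = x    # smaller ending value for length pos + 1
        snapshots.append(list(tails))
    return snapshots


for step, s in enumerate(trace_tails([2, 6, 3, 4, 1, 2, 9, 5, 8]), start=1):
    print(step, s)
```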
To reconstruct the actual LIS we will again use a parent array.
Let parent[i] be the predecessor of the element with index i in the LIS ending at the element with index i.
To make things simpler, we can keep in the array S not the actual integers, but their indices (positions) in the set. We do not keep {1, 2, 4, 5, 8}, but keep {4, 5, 3, 7, 8}.
That is input[4] = 1, input[5] = 2, input[3] = 4, input[7] = 5, input[8] = 8.
If we update the parent array properly, the actual LIS is:
input[S[lastElementOfS]],
input[parent[S[lastElementOfS]]],
input[parent[parent[S[lastElementOfS]]]],
........................................
Now to the important thing: how do we update the parent array? There are two options:
1. If X > last element in S, then parent[indexX] = indexLastElement, where indexLastElement is the index stored at the end of S. This means the parent of the newest element is the last element. We just append X to the end of S.
2. Otherwise, find the index of the smallest element in S that is >= X, and change it to X. Here parent[indexX] = S[index - 1] (if index is 0, the element has no parent).
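The parent-array update can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the names `lis_with_parents`, `tails` and `tail_idx` are made up for illustration, and a separate list of tail values is kept next to the list of indices only so that the standard `bisect_left` can be used for the binary search.

```python
from bisect import bisect_left


def lis_with_parents(seq):
    """Returns one longest strictly increasing subsequence of seq."""
    tails = []     # tails[k]: smallest value ending a subsequence of length k + 1
    tail_idx = []  # tail_idx[k]: index in seq of tails[k] (S kept as indices)
    parent = [None] * len(seq)
    for i, x in enumerate(seq):
        pos = bisect_left(tails, x)
        # The predecessor is whatever currently ends the subsequence
        # that is one element shorter; position 0 has no predecessor.
        if pos > 0:
            parent[i] = tail_idx[pos - 1]
        if pos == len(tails):
            tails.append(x)
            tail_idx.append(i)
        else:
            tails[pos] = x
            tail_idx[pos] = i
    # Walk back from the end of the longest subsequence.
    result = []
    i = tail_idx[-1] if tail_idx else None
    while i is not None:
        result.append(seq[i])
        i = parent[i]
    result.reverse()
    return result
```

For the example input 2 6 3 4 1 2 9 5 8, the walk visits indices 8, 7, 3, 2, 0, which gives the subsequence 2 3 4 5 8 after reversing.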
Petar Minchev's explanation helped clear things up for me, but it was hard for me to parse what everything was, so I made a Python implementation with overly-descriptive variable names and lots of comments. I did a naive recursive solution, the O(n^2) solution, and the O(n log n) solution.
I hope it helps clear up the algorithms!
The Recursive Solution
def recursive_solution(remaining_sequence, bigger_than=None):
    """Finds the longest increasing subsequence of remaining_sequence that is
    bigger than bigger_than and returns it. This solution is O(2^n)."""
    # Base case: nothing is remaining.
    if len(remaining_sequence) == 0:
        return remaining_sequence
    # Recursive case 1: exclude the current element and process the remaining.
    best_sequence = recursive_solution(remaining_sequence[1:], bigger_than)
    # Recursive case 2: include the current element if it's big enough.
    first = remaining_sequence[0]
    # Test for None first so that first is never compared against None.
    if (bigger_than is None) or (first > bigger_than):
        sequence_with = [first] + recursive_solution(remaining_sequence[1:], first)
        # Choose whichever of case 1 and case 2 was longer.
        if len(sequence_with) >= len(best_sequence):
            best_sequence = sequence_with
    return best_sequence
The O(n^2) Dynamic Programming Solution
def dynamic_programming_solution(sequence):
    """Finds the longest increasing subsequence in sequence using dynamic
    programming. This solution is O(n^2)."""
    # An empty input has an empty longest increasing subsequence.
    if len(sequence) == 0:
        return []
    longest_subsequence_ending_with = []
    backreference_for_subsequence_ending_with = []
    current_best_end = 0
    for curr_elem in range(len(sequence)):
        # It's always possible to have a subsequence of length 1.
        longest_subsequence_ending_with.append(1)
        # If a subsequence is length 1, it doesn't have a backreference.
        backreference_for_subsequence_ending_with.append(None)
        for prev_elem in range(curr_elem):
            subsequence_length_through_prev = (
                longest_subsequence_ending_with[prev_elem] + 1)
            # If the prev_elem is smaller than the current elem (so it's increasing)
            # And if the longest subsequence from prev_elem would yield a better
            # subsequence for curr_elem.
            if ((sequence[prev_elem] < sequence[curr_elem]) and
                    (subsequence_length_through_prev >
                     longest_subsequence_ending_with[curr_elem])):
                # Set the candidate best subsequence at curr_elem to go through prev.
                longest_subsequence_ending_with[curr_elem] = (
                    subsequence_length_through_prev)
                backreference_for_subsequence_ending_with[curr_elem] = prev_elem
        # If the new end is the best, update the best.
        if (longest_subsequence_ending_with[curr_elem] >
                longest_subsequence_ending_with[current_best_end]):
            current_best_end = curr_elem
    # Output the overall best by following the backreferences.
    best_subsequence = []
    current_backreference = current_best_end
    while current_backreference is not None:
        best_subsequence.append(sequence[current_backreference])
        current_backreference = (
            backreference_for_subsequence_ending_with[current_backreference])
    best_subsequence.reverse()
    return best_subsequence
The O(n log n) Dynamic Programming Solution
def find_smallest_elem_as_big_as(sequence, subsequence, elem):
    """Returns the index of the smallest element in subsequence as big as
    sequence[elem]. sequence[elem] must not be larger than every element in
    subsequence. The elements in subsequence are indices in sequence. Uses
    binary search."""
    low = 0
    high = len(subsequence) - 1
    while high > low:
        # Integer division, so that mid is a valid index in Python 3.
        mid = (high + low) // 2
        # If the current element is not as big as elem, throw out the low half of
        # sequence.
        if sequence[subsequence[mid]] < sequence[elem]:
            low = mid + 1
        # If the current element is as big as elem, throw out everything bigger, but
        # keep the current element.
        else:
            high = mid
    return high
def optimized_dynamic_programming_solution(sequence):
    """Finds the longest increasing subsequence in sequence using dynamic
    programming and binary search (per
    http://en.wikipedia.org/wiki/Longest_increasing_subsequence). This solution
    is O(n log n)."""
    # An empty input has an empty longest increasing subsequence.
    if len(sequence) == 0:
        return []
    # Both of these lists hold the indices of elements in sequence and not the
    # elements themselves.
    # This list will always be sorted.
    smallest_end_to_subsequence_of_length = []
    # This array goes along with sequence (not
    # smallest_end_to_subsequence_of_length). Following the corresponding element
    # in this array repeatedly will generate the desired subsequence.
    parent = [None for _ in sequence]
    for elem in range(len(sequence)):
        # We're iterating through sequence in order, so if elem is bigger than the
        # end of longest current subsequence, we have a new longest increasing
        # subsequence.
        if (len(smallest_end_to_subsequence_of_length) == 0 or
                sequence[elem] > sequence[smallest_end_to_subsequence_of_length[-1]]):
            # If we are adding the first element, it has no parent. Otherwise, we
            # need to update the parent to be the previous biggest element.
            if len(smallest_end_to_subsequence_of_length) > 0:
                parent[elem] = smallest_end_to_subsequence_of_length[-1]
            smallest_end_to_subsequence_of_length.append(elem)
        else:
            # If we can't make a longer subsequence, we might be able to make a
            # subsequence of equal size to one of our earlier subsequences with a
            # smaller ending number (which makes it easier to find a later number that
            # is increasing).
            # Thus, we look for the smallest element in
            # smallest_end_to_subsequence_of_length that is at least as big as elem
            # and replace it with elem.
            # This preserves correctness because if there is a subsequence of length n
            # that ends with a number smaller than elem, we could add elem on to the
            # end of that subsequence to get a subsequence of length n+1.
            location_to_replace = find_smallest_elem_as_big_as(
                sequence, smallest_end_to_subsequence_of_length, elem)
            smallest_end_to_subsequence_of_length[location_to_replace] = elem
            # If we're replacing the first element, we don't need to update its parent
            # because a subsequence of length 1 has no parent. Otherwise, its parent
            # is the subsequence one shorter, which we just added onto.
            if location_to_replace != 0:
                parent[elem] = (
                    smallest_end_to_subsequence_of_length[location_to_replace - 1])
    # Generate the longest increasing subsequence by backtracking through parent.
    curr_parent = smallest_end_to_subsequence_of_length[-1]
    longest_increasing_subsequence = []
    while curr_parent is not None:
        longest_increasing_subsequence.append(sequence[curr_parent])
        curr_parent = parent[curr_parent]
    longest_increasing_subsequence.reverse()
    return longest_increasing_subsequence
The following C++ implementation also includes some code that builds the actual longest increasing subsequence using an array called prev.
#include <stack>
#include <vector>

std::vector<int> longest_increasing_subsequence (const std::vector<int>& s)
{
    int best_end = 0;
    int sz = s.size();

    if (!sz)
        return std::vector<int>();

    std::vector<int> prev(sz, -1);
    // Every element on its own is an increasing subsequence of length 1.
    std::vector<int> memo(sz, 1);

    int max_length = 1;

    for ( auto i = 1; i < sz; ++i)
    {
        for ( auto j = 0; j < i; ++j)
        {
            if ( s[j] < s[i] && memo[i] < memo[j] + 1 )
            {
                memo[i] = memo[j] + 1;
                prev[i] = j;
            }
        }

        if ( memo[i] > max_length )
        {
            best_end = i;
            max_length = memo[i];
        }
    }

    // Code that builds the longest increasing subsequence using "prev"
    std::vector<int> results;
    results.reserve(sz);

    std::stack<int> stk;
    int current = best_end;

    while (current != -1)
    {
        stk.push(s[current]);
        current = prev[current];
    }

    while (!stk.empty())
    {
        results.push_back(stk.top());
        stk.pop();
    }

    return results;
}
The same implementation without the stack; it just reverses the vector:
#include <algorithm>
#include <vector>

std::vector<int> LIS( const std::vector<int> &v ) {
    // Use a signed size so the int loop counters compare cleanly.
    const int sz = static_cast<int>(v.size());
    if(!sz)
        return v;

    // Every element on its own is an increasing subsequence of length 1.
    std::vector<int> memo(sz, 1);
    std::vector<int> prev(sz, -1);
    int best_end = 0;
    int max_length = 1;

    for (auto i = 1; i < sz; ++i) {
        for ( auto j = 0; j < i ; ++j) {
            if (v[j] < v[i] && memo[i] < memo[j] + 1) {
                memo[i] = memo[j] + 1;
                prev[i] = j;
            }
        }

        if(memo[i] > max_length) {
            best_end = i;
            max_length = memo[i];
        }
    }

    // create results
    std::vector<int> results;
    results.reserve(v.size());

    auto current = best_end;
    while (current != -1) {
        results.push_back(v[current]);
        current = prev[current];
    }

    std::reverse(results.begin(), results.end());
    return results;
}