calendar scheduler algorithm

Hmmm, this reminds me of a task in the university, I'll describe what i can remember The run-time is O(n*logn) which is pretty good.

This is a greedy approuch.. i will refine your request abit, tell me if i'm wrong.. Algorithem should return the MAX subset of non colliding tasks(in terms of total length? or amount of activities? i guess total length)

I would first order the list by the finishing times(first-minimum finishing time,last-maximum) = O(nlogn)

Find_set(A):
  G<-Empty set;
  S<-A
  f<-0
  while S!='Empty set' do
    i<-index of activity with earliest finish time(**O(1)**)
    if S(i).finish_time>=f
      G.insert(S(i)) \\add this to result set
      f=S(i).finish_time
    S.removeAt(i) \\remove the activity from the original set
  od
  return G

Run time analysis: initial ordering :nlogn each iteration O(1)*n = O(n)

Total O(nlogn)+O(n) ~ O(nlogn) (well, given the O notation weakness to represent real complexety on small numbers.. but as the scale grow, this is a good algo)

Enjoy.

Update:

Ok, it seems like i've misread the post, you can alternatively use dynamic programming to reduce running time, there is a solution in link text page 7-19.

you need to tweak the algorithm a bit, first you should build the table, then you can get all variations on it fairly easy.


I would use an Interval Tree for this.

After you build the data structure, you can iterate each event and perform an intersection query. If no intersections are found, it is added to your schedule.


This can be solved using graph theory. I would create an array, which contains the items sorted by start time and end time for equal start times: (added some more items to the example):

no.:  id: [ start  -   end  ] type
---------------------------------------------------------
 0:  234: [08:00AM - 09:00AM] Breakfast With Mindy
 1:  400: [09:00AM - 07:00PM] Check out stackoverflow.com
 2:  219: [11:40AM - 12:40PM] Go to Gym
 3:   79: [12:00PM - 01:00PM] Lunch With Steve
 4:  189: [12:40PM - 01:20PM] Lunch With Steve
 5:  270: [01:00PM - 05:00PM] Go to Tennis
 6:  300: [06:40PM - 07:20PM] Dinner With Family
 7:  250: [07:20PM - 08:00PM] Check out stackoverflow.com

After that i would create a list with the array no. of the least item that could be the possible next item. If there isn't a next item, -1 is added:

 0 |  1 |  2 |  3 |  4 |  5 |  6 |  7
 1 |  7 |  4 |  5 |  6 |  6 |  7 | -1

With that list it is possible to generate a directed acyclic graph. Every vertice has a connection to the vertices starting from the next item. But for vertices where already is a vertices bewteen them no edge is made. I'll try to explain with the example. For the vertice 0 the next item is 1. So a edge is made 0 -> 1. The next item from 1 is 7, that means the range for the vertices which are connected from vertice 0 is now from 1 to (7-1). Because vertice 2 is in the range of 1 to 6, another edge 0 -> 2 is made and the range updates to 1 to (4-1) (because 4 is the next item of 2). Because vertice 3 is in the range of 1 to 3 one more edge 0 -> 3 is made. That was the last edge for vertice 0. That has to be continued with all vertices leading to such a graph:

example graph

Until now we are in O(n2). After that all paths can be found using a depth first search-like algorithm and then eliminating the duplicated types from each path. For that example there are 4 solutions, but none of them has all types because it is not possible for the example to do Go to Gym, Lunch With Steve and Go to Tennis.

Also this search for all paths has a worst case complexity of O(2n). For example the following graph has 2n/2 possible paths from a start vertice to an end vertice.

example graph2
(source: archive.org)

There could be made some more optimisation, like merging some vertices before searching for all paths. But that is not ever possible. In the first example vertice 3 and 4 can't be merged even though they are of the same type. But in the last example vertice 4 and 5 can be merged if they are of the same type. Which means it doesn't matter which activity you choose, both are valid. This can speed up calculation of all paths dramatically.

Maybe there is also a clever way to consider duplicate types earlier to eliminate them, but worst case is still O(2n) if you want all possible paths.

EDIT1:

It is possible to determine if there are sets that contain all types and get a t least one such solution in polynomial time. I found a algorithm with a worst case time of O(n4) and O(n2) space. I'll take an new example which has a solution with all types, but is more complex.

no.:  id: [ start  -   end  ] type
---------------------------------------------------------
 0:  234: [08:00AM - 09:00AM] A
 1:  400: [10:00AM - 11:00AM] B
 2:  219: [10:20AM - 11:20AM] C
 3:   79: [10:40AM - 11:40AM] D
 4:  189: [11:30AM - 12:30PM] D
 5:  270: [12:00PM - 06:00PM] B
 6:  300: [02:00PM - 03:00PM] E
 7:  250: [02:20PM - 03:20PM] B
 8:  325: [02:40PM - 03:40PM] F
 9:  150: [03:30PM - 04:30PM] F
10:  175: [05:40PM - 06:40PM] E
11:  275: [07:00PM - 08:00PM] G

example graph3

1.) Count the different types in the item set. This is possible in O(nlogn). It is 7 for that example.

2.) Create a n*n-matrix, that represents which nodes can reach the actual node and which can be reached from the actual node. For example if position (2,4) is set to 1, means that there is a path from node 2 to node 4 in the graph and (4,2) is set to 1 too, because node 4 can be reached from node 2. This is possible in O(n2). For the example the matrix would look like that:

111111111111
110011111111
101011111111
100101111111
111010111111
111101000001
111110100111
111110010111
111110001011
111110110111
111110111111
111111111111

3.) Now we have in every row, which nodes can be reached. We can also mark each node in a row which is not yet marked, if it is of the same type as a node that can be reached. We set that matrix positions from 0 to 2. This is possible in O(n3). In the example there is no way from node 1 to node 3, but node 4 has the same type D as node 3 and there is a path from node 1 to node 4. So we get this matrix:

111111111111
110211111111
121211111111
120121111111
111212111111
111121020001
111112122111
111112212111
111112221211
111112112111
111112111111
111111111111

4.) The nodes that still contains 0's (in the corresponding rows) can't be part of the solution and we can remove them from the graph. If there were at least one node to remove we start again in step 2.) with the smaller graph. Because we removed at least one node, we have to go back to step 2.) at most n times, but most often this will only happend few times. If there are no 0's left in the matrix we can continue with step 5.). This is possible in O(n2). For the example it is not possible to build a path with node 1 that also contains a node with type C. Therefore it contains a 0 and is removed like node 3 and node 5. In the next loop with the smaller graph node 6 and node 8 will be removed.

5.) Count the different types in the remainig set of items/nodes. If it is smaller than the first count there is no solution that can represent all types. So we have to find another way to get a good solution. If it is the same as the first count we now have a smaller graph which still holds all the possible solutions. O(nlogn)

6.) To get one solution we pick a start node (it doesn't matter which, because all nodes that are left in the graph are part of a solution). O(1)

7.) We remove every node that can't be reached from the choosen node. O(n)

8.) We create a matrix like in step 2.) and 3.) for that graph and remove the nodes that can not reach nodes of any type like in step 4.). O(n3)

9.) We choose one of the next nodes from the node we choosen before and continue with 7.) until there we are at a end node and the graph only has one path left.

That way it is also possible to get all paths, but that can still be exponential many. After all it should be faster than finding solutions in the original graph.