Spaced repetition (SRS) for learning
What you want is a number X_i for every question i. You can normalize these numbers (make their sum 1) and pick one at random with the corresponding probability.
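As a rough illustration of the normalize-and-pick step, here is a minimal Python sketch. The function name pick_question and the fallback to a uniform choice when all numbers are zero are my own assumptions, not part of the method described above; weights are assumed to be non-negative.

```python
import random

def pick_question(weights, rng=random):
    """Return a question index, chosen with probability proportional to its weight.

    `weights` plays the role of the X_i values; they are assumed non-negative.
    """
    total = sum(weights)
    if total <= 0:
        # Assumption: with no information at all, fall back to a uniform choice.
        return rng.randrange(len(weights))
    # Normalize so the probabilities sum to 1.
    probabilities = [w / total for w in weights]
    return rng.choices(range(len(weights)), weights=probabilities, k=1)[0]
```

Strictly speaking random.choices accepts unnormalized weights too, so the division is only there to mirror the description.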
If N is the number of different questions and M is the number of times each question has been answered on average, then you could find X in M*N time like this:
- Create the array X[N] set to 0.
- Run through the data, and every time you see question i answered wrong, increase X[i] by f(t), where t is the answering time and f is an increasing function.
Because f is increasing, a question answered wrong a long time ago has less impact than one answered wrong yesterday. You can experiment with different f to get a nice behaviour.
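The pass over the answer history could be sketched like this in Python. The record layout (question index, answering time in days, correct flag) and the exponential form of f with a 7-day half-life are assumptions chosen for illustration; any increasing f would fit the description.

```python
def make_recency(now, half_life_days=7.0):
    """Build a hypothetical f(t): increasing in t, equal to 1 at t == now."""
    def f(t):
        # An answer half_life_days old counts half as much as one given now.
        return 0.5 ** ((now - t) / half_life_days)
    return f

def compute_weights(num_questions, history, f):
    """Compute X in a single pass over the history (about M*N records).

    `history` is an iterable of (question, answered_at, was_correct) tuples.
    """
    x = [0.0] * num_questions
    for question, answered_at, was_correct in history:
        if not was_correct:
            x[question] += f(answered_at)
    return x

# Usage: question 1 was answered wrong more recently than question 0.
history = [(0, 1.0, False), (1, 9.0, False), (1, 10.0, True), (2, 3.0, True)]
x = compute_weights(3, history, make_recency(now=10.0))
```

Measuring t relative to now keeps the values small, which avoids overflow when the raw timestamps are large.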
The smarter way
A faster way is not to generate X[] every time you choose questions, but to save it in a database table.
You won't be able to apply f with this solution. Instead just add 1 every time the question is answered wrongly, and then run through the table regularly - say every midnight - and multiply all X[i] by a constant - say 0.9.
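A minimal sketch of the table-based variant, using Python's built-in sqlite3 module. The table name weights, its columns, and the function names are hypothetical; scheduling the midnight job (cron or similar) is left out.

```python
import sqlite3

def create_table(conn, num_questions):
    """Create the (hypothetical) weights table with every X[i] set to 0."""
    conn.execute(
        "CREATE TABLE weights (question INTEGER PRIMARY KEY, x REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO weights (question, x) VALUES (?, 0.0)",
        [(i,) for i in range(num_questions)],
    )

def record_wrong_answer(conn, question):
    """Add 1 to X[question] whenever the question is answered wrongly."""
    conn.execute("UPDATE weights SET x = x + 1 WHERE question = ?", (question,))

def nightly_decay(conn, factor=0.9):
    """Run regularly, e.g. every midnight: multiply all X[i] by a constant."""
    conn.execute("UPDATE weights SET x = x * ?", (factor,))

def load_weights(conn):
    """Return X as a list ordered by question index."""
    rows = conn.execute("SELECT x FROM weights ORDER BY question")
    return [row[0] for row in rows]

# Usage with an in-memory database.
conn = sqlite3.connect(":memory:")
create_table(conn, 3)
record_wrong_answer(conn, 1)
nightly_decay(conn)
record_wrong_answer(conn, 1)
```

The repeated multiplication has the same effect as an exponentially increasing f, so recent wrong answers still weigh more than old ones.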
Update: Actually you should base your data on correct answers, not wrong ones. Otherwise, questions that have not been answered at all (neither correctly nor wrongly) for a long time will have a smaller chance of being chosen. It should be the opposite.
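One possible reading of this update, sketched in Python: keep a decaying score of correct answers per question and turn it into a selection weight that shrinks as the score grows. The mapping 1 / (1 + score) is purely my assumption for illustration; the update above does not say how to convert the scores into probabilities.

```python
def record_correct_answer(scores, question):
    """Add 1 to the score of a question that was answered correctly."""
    scores[question] += 1.0

def decay(scores, factor=0.9):
    """Periodic decay, in place: old correct answers gradually stop counting."""
    for i in range(len(scores)):
        scores[i] *= factor

def selection_weights(scores):
    """Assumed mapping: the higher the correct-answer score, the lower the weight."""
    return [1.0 / (1.0 + c) for c in scores]

# Usage: question 1 was answered correctly three times, question 0 never.
scores = [0.0, 0.0]
for _ in range(3):
    record_correct_answer(scores, 1)
weights_now = selection_weights(scores)
```

With this choice, a question that is left alone for a long time drifts back towards the maximum weight of 1, so it becomes more likely to be picked rather than less.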
Anki is an open source program implementing spaced repetition. Because it is open source, you can browse the source of libanki, a spaced repetition library for Anki. As of January 2013, Anki version 2 sources can be browsed here.
The sources are in Python, the executable pseudo code language.
Reading the source to understand the algorithm may be feasible. The data model is defined using SQLAlchemy, the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL.