Recurrent neural network in 11.1 explicit examples?
Here is a simple example that may help you get started. In this example, we are going to predict a simple time series: a sine wave.
data = Table[Sin[x], {x, 0, 100, 0.04}];
We will cut the data into windows of 51 data points. The first 50 points of each window together form our X, and the last data point is our Y.
training = RandomSample[
   List /@ Most[#] -> List@Last[#] & /@ Partition[data, 51, 1]];
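As a quick sanity check (using the definitions above), each training example pairs a 50×1 input matrix with a length-1 output vector:
Dimensions /@ First[training]
(* {50, 1} -> {1} *)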
We use a single gated recurrent layer in our neural network
net = NetChain[{
GatedRecurrentLayer[10],
LinearLayer[1]}, "Input" -> {50, 1}, "Output" -> 1
]
and train with the training data
trained = NetTrain[net, training]
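Once trained, we can sanity-check the single-step prediction on one window from the training range (a rough check; the exact numbers depend on how training converged):
window = List /@ data[[1 ;; 50]];     (* one 50×1 input window *)
{First@trained[window], data[[51]]}   (* predicted vs. actual next point *)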
After training, we can use the network to predict the time series. We first feed 50 data points into it, and then repeatedly feed the values it generates back into the network to produce subsequent data points. Here is a comparison between the ground truth and our predictions, which shows very good agreement.
ListPlot[{Flatten@
NestList[Append[Rest[#], trained[#]] &,
List /@ Sin[Range[-49*0.04, 0, 0.04]], 500][[All, -1]],
Table[Sin[x], {x, 0, 500*0.04, 0.04}]}, Joined -> True,
PlotLegends -> {"predicted", "ground truth"}]
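To quantify the agreement rather than just eyeballing the plot, one can reuse the same expressions and look at the worst pointwise error (the value will vary from run to run):
predicted = Flatten@NestList[Append[Rest[#], trained[#]] &,
     List /@ Sin[Range[-49*0.04, 0, 0.04]], 500][[All, -1]];
truth = Table[Sin[x], {x, 0, 500*0.04, 0.04}];
Max@Abs[predicted - truth]   (* worst-case absolute error over the predicted horizon *)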
Taking inspiration from the answer by xslittlegrass, I came up with the following solution.
Recall the sample data from this question: we have an observable obs that we want to predict, and three parameters par1, par2, par3 that are correlated with the observable to some extent.
We only use the data for the first 700 time steps to train the model, and will try to predict the next 300 time steps.
We create a training set in which each example consists of tlen consecutive data points of length featn, with the immediately following data point of length featn as the output. Then we train a model that returns featn outputs:
dat = Transpose[{par1/Max[par1], par2/Max[par2], par3/Max[par3], obs/Max[obs]}];
tlen = 300; featn = Length[dat[[1]]];
training = Table[dat[[i ;; i + tlen - 1]] -> dat[[i + tlen]], {i, 1, Length[dat] - tlen}];
net = NetChain[{
   GatedRecurrentLayer[tlen,
    "Dropout" -> {"VariationalInput" -> 0.1, "VariationalState" -> 0.5}],
   LinearLayer[featn]},
  "Input" -> {tlen, featn}, "Output" -> {featn}]
trained = NetTrain[net, training, Method -> {"ADAM", "InitialLearningRate" -> 0.0001}]
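Before training, it is worth checking the shapes (here dat comes from the question's data; with 1000 total time steps you would get 700 training pairs):
Dimensions /@ First[training]
(* {300, 4} -> {4}: tlen consecutive feature vectors mapped to the next one *)
Length[training]   (* Length[dat] - tlen pairs *)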
The training takes about two minutes. Finally, we can iteratively predict the future 300 steps
datt = dat;
Dynamic[i]   (* shows the progress of the loop below *)
Do[
  start = datt[[-tlen ;;]];          (* the last tlen feature vectors *)
  AppendTo[datt, trained[start]];    (* predict the next vector and append it *)
  , {i, 1, 300}]
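A functionally equivalent way to write this loop, if you prefer to avoid the explicit Do/AppendTo (same trained, dat and tlen as above):
datt = Nest[Append[#, trained[#[[-tlen ;;]]]] &, dat, 300];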
Amazingly, the prediction is qualitatively correct, with amplitude deviation growing to about 15% over the course of 300 time steps!
ListPlot[{datt[[;; , 4]], tab}]  (* tab: the ground-truth (normalized) observable from the question *)
Any suggestions for how to improve upon the above?
For instance, let's assume you have sequences of length 3 and 8 input variables (features) in X.
Let Y be the output, with values "yes" or "no" for each sequence of X.
Let X have 195 rows (of 8 features each).
You partition X into sequences of 3:
Xpartition = Partition[X, 3]
Now, you create your trainingData:
trainingData = MapThread[Rule, {Xpartition, Y}]
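For the rules to line up, Y needs one label per sequence, i.e. 195/3 = 65 labels; a quick check (assuming X and Y as described above):
Dimensions[Xpartition]   (* {65, 3, 8} *)
Length[Y]                (* must also be 65 *)
First[trainingData]      (* a 3×8 matrix -> "yes" | "no" *)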
You build your model:
net = NetChain[
{ LongShortTermMemoryLayer[32]
, SequenceLastLayer[]
, LinearLayer[2]
, SoftmaxLayer[]
}
, "Input" -> {3, 8}
, "Output" -> NetDecoder[{"Class", {"no", "yes"}}]
]
where 3 is the number of vectors in the sequence, and 8 is the length of each vector.
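A minimal sketch of how you might then train and query this classifier, assuming the trainingData defined above:
trained = NetTrain[net, trainingData]
trained[Xpartition[[1]]]                    (* -> "yes" or "no" *)
trained[Xpartition[[1]], "Probabilities"]   (* class probabilities from the decoder *)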