Setting correct input for RNN
You can begin with a snippet that you mention in the question.
Any help on how the X and Y in model.fit should look like for my case?
X should be a numpy array of shape [num samples, sequence length, D], where D is the number of values per timestamp. I suppose D=1 in your case, because you only pass the temperature value.
y should be a vector of target values (as in the snippet), one per sample: either binary (alarm/not_alarm) or continuous (e.g. max temperature deviation). In the latter case you'd need to replace the sigmoid output activation with something else, e.g. a linear one.
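As a rough illustration of those shapes, here is a minimal sketch in plain numpy. The temperature readings and alarm labels are made-up toy values, and the names raw_series and labels are hypothetical; it assumes all series already have the same length.

```python
import numpy as np

# Hypothetical toy data: 4 temperature series with 6 readings each.
raw_series = [
    [20.1, 20.3, 20.2, 20.8, 21.5, 22.9],
    [19.8, 19.9, 20.0, 20.1, 20.0, 19.9],
    [21.0, 21.4, 22.1, 23.0, 24.2, 25.7],
    [20.5, 20.4, 20.6, 20.5, 20.3, 20.4],
]
# Hypothetical binary targets: 1 = alarm, 0 = not_alarm.
labels = [1, 0, 1, 0]

# Stack into [num samples, sequence length], then add the trailing
# feature axis so the shape becomes [num samples, sequence length, D] with D=1.
X = np.asarray(raw_series, dtype=np.float32)[..., np.newaxis]
y = np.asarray(labels, dtype=np.float32)

print(X.shape)  # (4, 6, 1)
print(y.shape)  # (4,)
```

Arrays shaped like this are what you would then pass as X and y to model.fit.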
Should I normalise the data beforehand?
Yes, it's essential to preprocess your raw data. I see 2 crucial things to do here:
- Normalise the temperature values with min-max scaling or standardization (wiki, sklearn preprocessing). Plus, I'd add a bit of smoothing.
- Drop some fraction of the last timestamps from every time series to avoid an information leak: the final readings may already reveal the outcome you are trying to predict.
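The two steps above could be sketched roughly as follows, using only numpy. The function name preprocess, the 10% drop fraction, the moving-average smoother, the window of 3, and the order of the steps are all assumptions chosen for illustration, not recommended values; the input is random toy data.

```python
import numpy as np

def preprocess(series, drop_fraction=0.1, window=3):
    """Hypothetical helper; series has shape [num samples, sequence length]."""
    series = np.asarray(series, dtype=np.float64)

    # Drop the last `drop_fraction` of timestamps from every series.
    keep = series.shape[1] - int(series.shape[1] * drop_fraction)
    series = series[:, :keep]

    # Smooth each series with a simple moving average.
    # mode="valid" shortens every series by window - 1 points.
    kernel = np.ones(window) / window
    smoothed = np.stack(
        [np.convolve(row, kernel, mode="valid") for row in series]
    )

    # Standardize with a single mean/std over all values.
    # In practice, compute these on the training set only and
    # reuse them for validation and test data.
    mean = smoothed.mean()
    std = smoothed.std()
    if std == 0:
        std = 1.0  # guard against constant input
    normalized = (smoothed - mean) / std

    # Add the feature axis: [num samples, sequence length, 1].
    return normalized[..., np.newaxis], (mean, std)

# Toy input: 4 random series of 20 readings around 20 degrees.
rng = np.random.default_rng(0)
raw = rng.normal(loc=20.0, scale=2.0, size=(4, 20))
X_prepared, (mean, std) = preprocess(raw)
print(X_prepared.shape)  # (4, 16, 1): 20 - 2 dropped - 2 lost to smoothing
```

Min-max scaling would work the same way, subtracting the minimum and dividing by the range instead of using the mean and standard deviation.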
Finally, I'd say that this task is more complex than it seems. You might want to find either a good starter tutorial on time-series classification or a course on machine learning in general. I believe you can find a better method than an RNN.