Finding the min max of a stock chart
You'll notice that a-lot of the answers go for derivatives with some sort of low-pass filtering. A moving average of some kind, if you will. Both the fft, the square-window moving average and the exponential moving average are all pretty similar on a fundamental level. However, given the choice over all moving averages, which is the best one?
The answer: The Gaussian moving average; That of the normal distribution, of which you are aware.
The reason: The Gaussian filter is the only filter that will never produce a "fake" maximum; A maximum that was not there to begin with. This has been theoretically proven for both continuous and discrete data (Make sure you use the discrete Gaussian for discrete data though!). As you increase the Gaussian sigma, the local maxima and minima will merge, in a most intuitive way. Thus if you want there to be no more than one local maximum per day, you set sigma to one, ET cetera.
I usually use a combination of Moving Average and Exponential Moving Average. It proved (empirically) to be well fitted for the task (enough for my needs, at least). The results are tuned with only two parameters. Here is a sample:
Edit
In case it is useful for somebody, here is my Mathematica code:
f[sym_] := Module[{l},
(*get data*)
l = FinancialData[sym, "Jan. 1, 2010"][[All, 2]];
(*perform averages*)
l1 = ExponentialMovingAverage[MovingAverage[l, 10], .2];
(*calculate ma and min positions in the averaged list*)
l2 = {#[[1]], l1[[#[[1]]]]} & /@
MapIndexed[If[#1[[1]] < #1[[2]] > #1[[3]], #2, Sequence @@ {}] &,
Partition[l1, 3, 1]];
l3 = {#[[1]], l1[[#[[1]]]]} & /@
MapIndexed[If[#1[[1]] > #1[[2]] < #1[[3]], #2, Sequence @@ {}] &,
Partition[l1, 3, 1]];
(*correlate with max and mins positions in the original list*)
maxs = First /@ (Ordering[-l[[#[[1]] ;; #[[2]]]]] + #[[1]] -
1 & /@ ({4 + #[[1]] - 5, 4 + #[[1]] + 5} & /@ l2));
mins = Last /@ (Ordering[-l[[#[[1]] ;; #[[2]]]]] + #[[1]] -
1 & /@ ({4 + #[[1]] - 5, 4 + #[[1]] + 5} & /@ l3));
(*Show the plots*)
Show[{
ListPlot[l, Joined -> True, PlotRange -> All,
PlotLabel ->
Style[Framed[sym], 16, Blue, Background -> Lighter[Yellow]]],
ListLinePlot[ExponentialMovingAverage[MovingAverage[l, 10], .2]],
ListPlot[{#, l[[#]]} & /@ maxs,
PlotStyle -> Directive[PointSize[Large], Red]],
ListPlot[{#, l[[#]]} & /@ mins,
PlotStyle -> Directive[PointSize[Large], Black]]},
ImageSize -> 400]
]
I don't know what you mean by "simple derivatives". I understand it to mean you have tested a gradient descent and found it unsatisfactory because of the abundance of local extrema. If so, you want to look at simulated annealing:
Annealing is a metallurgical process used to temper metals through a heating and cooling treatment. (...). These irregularities are due to atoms being stuck in the wrong place of the structure. In the process of annealing, the metal is heated up and then allowed to cool down slowly. Heating up gives the atoms the energy they need to get un-stuck, and the slow cool-down period allows them to move to their correct location in the structure.
(...) However, in order to escape local optima, the algorithm will have a probability of taking a step in a bad direction: in other words, of taking a step that increases the value for a minimization problem or that decreases the value for a maximization problem. To simulate the annealing process, this probability will depend in part on a "temperature" parameter in the algorithm, which is initialized at a high value and decreased at each iteration. Consequently, the algorithm will initially have a high probability of moving away from a nearby (likely local) optimum. Over the iterations that probability will decrease and the algorithm will converge on the (hopefully global) optimum it did not have the chance to escape from. (source, cuts &, emphasis mine)
I know that local optima is precisely what the circles in your drawing represent, above, and hence what you want to find. But, as I interpret the quote "so many local mins and maxes simple derivatives won't work.", this is also precisely what you find too much of. I assume you have trouble with all the "zig-zag" you curve makes between two circled points.
All that seems to differentiate the optima you circle from the rest of the points of the curve is their globality, precisely: to find a lower point than the first point you circle on the left you have to go further away either way in the x coordinate than you need to do the same for its close neighbors. That's what annealing gives you: depending on the temperature parameter, you can control the size of the jumps you allow yourself to make. There has to be a value for which you catch the "big" local optima, and yet miss the "small" ones. What I'm suggesting isn't revolutionary: there are several examples (e.g. 1 2) where people have obtained nice results from such noisy data.