What is behind the experimental function FindFormula?

The experimental function FindFormula currently uses a combination of methods: it combines nonlinear regression with Markov chain Monte Carlo methods (e.g. the Metropolis–Hastings algorithm). In the future (possibly in V10.3) there will be an option allowing the user to choose which method to use.
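To make the idea of combining regression with a Metropolis-type search concrete, here is a purely illustrative toy. It is not FindFormula's actual implementation: the candidate list, the temperature value, the toy data and the names toy, candidates, rss and step are all assumptions made up for this sketch. Each candidate form is fitted with FindFit (the regression step), and a Metropolis chain with a uniform, symmetric proposal then wanders over the forms, favouring those with a small residual sum of squares.

```mathematica
SeedRandom[1];

(* hypothetical toy data: 2 Exp[Sin[x]] plus small centred noise *)
xs = N[Range[0, 60]/3];
toy = Transpose[{xs, 2 Exp[Sin[xs]] + RandomReal[{-0.05, 0.05}, Length[xs]]}];

(* hypothetical search space of model forms with parameters a and b *)
candidates = {a + b x, a + b Sin[x], a + b Cos[x], a Exp[b Sin[x]], a Exp[b Cos[x]]};

(* regression step: fit a and b, return the residual sum of squares (memoized) *)
rss[model_] := rss[model] = Module[{fit = FindFit[toy, model, {a, b}, x]},
   Total[(toy[[All, 2]] - (model /. fit /. x -> toy[[All, 1]]))^2]];

(* Metropolis step: propose a random form, accept with probability
   min(1, Exp[-(rss[prop] - rss[current])/temperature]); compared in logs *)
temperature = 0.1;  (* assumed value, chosen only for illustration *)
step[current_] := Module[{prop = RandomChoice[candidates]},
   If[Log[RandomReal[]] < -(rss[prop] - rss[current])/temperature, prop, current]];

chain = NestList[step, First[candidates], 200];

(* the form the chain visited most often *)
First[Commonest[chain]]
```

With only five candidates an exhaustive comparison would of course be simpler; a stochastic search of this kind only pays off when the space of formulas is too large to enumerate.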


I doubt that this is very robust. Consider a simple change to the differential equation example in the documentation:

sol = y /. NDSolve[{y'[x] == y[x] Cos[x], y[0] == 2}, y, {x, -5, 300}][[1]];
times = N[Range[-5, 600]/9];
data = Transpose[{times, sol[times] + RandomReal[0.05, Length[times]]}];
lp = ListPlot[data, PlotRange -> All]

Now

FindFormula[data, x, 1, TargetFunctions -> {Exp, Sin, Cos}]

returns 2.27414 Sin[x] + 2.5479 as the best formula, whereas a much better one, obviously compatible with the selected TargetFunctions, is 2 Exp[Sin[x]], the exact solution of the differential equation.
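The comparison can be made quantitative with a quick check. The sketch below assumes that data from the snippet above is still defined; the helper name rms is introduced here only for illustration. DSolve is used to confirm the closed-form solution of the differential equation, and the root-mean-square residual of each formula is then computed against the noisy data.

```mathematica
(* assumes `data` from the snippet above is already defined *)

(* symbolic solution of the ODE, for comparison with 2 Exp[Sin[x]] *)
DSolve[{y'[x] == y[x] Cos[x], y[0] == 2}, y[x], x]

(* root-mean-square residual of a candidate formula f against data *)
rms[f_] := Sqrt[Mean[(data[[All, 2]] - (f /@ data[[All, 1]]))^2]]

rms[2.27414 Sin[#] + 2.5479 &]  (* formula returned by FindFormula *)
rms[2 Exp[Sin[#]] &]            (* exact solution of the ODE *)
```

Because RandomReal[0.05, n] draws noise from the interval 0 to 0.05, the noise is not centred on zero, so even the exact solution leaves a small residual.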


The following reveals the definitions:

<< GeneralUtilities`
PrintDefinitions@FindFormula

As usual, one can click the symbols to find the definitions of functions "further down". Note also that FindFormula is listed in the Machine Learning guide, which is consistent with symbol names like SymbolicMachineLearning`PackageScope`ImputArgumentsTestFindFormula shown further down by PrintDefinitions.
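For a quick overview without clicking through the definitions, the built-in functions Names and Options can be used. This is only a sketch: it assumes the supporting code has already been loaded (for example by calling FindFormula once in the session), and the context name is taken from the PrintDefinitions output mentioned above, so it may differ between versions or return an empty list.

```mathematica
(* assumption: FindFormula has been used once, so its package is loaded *)

(* list the symbols in the internal context seen in the PrintDefinitions output *)
Names["SymbolicMachineLearning`PackageScope`*"]

(* options exposed by FindFormula in the installed version *)
Options[FindFormula]
```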