Interpolate the noisy data
This gets the data:
data1 = Get["https://pastebin.com/raw/Z3GGhH2b"];
There are outliers in the data. This can be seen with this command:
qs = Range[0, 0.95, 0.25]~Join~Range[0.95, 1, 0.01];
TableForm[Transpose[{qs, Quantile[Abs[Log10@data1], qs]}]]
Or with this plot:
ListPlot[data1, PlotRange -> All]
This finds the outlier positions:
Import["https://raw.githubusercontent.com/antononcube/\
MathematicaForPrediction/master/OutlierIdentifiers.m"]
olPos = OutlierPosition[data1, SPLUSQuartileIdentifierParameters]
(* {92, 93, 95, 99, 105, 109, 112, 114} *)
Here is the plot of the data without the outliers:
ListPlot[Delete[data1, List /@ olPos], PlotRange -> All]
We have identified that the outlier presence causes the problems in question. There are three ways to deal with that situation:
ignore the outliers (this answer),
replace the outlier values with average from neighbors (george2079 answer),
use a more robust filter,
MedianFilter
, (bill s answer).
Below is the modified GaussianFilter
plots code in the question over data with ignored outliers. Note that ignoring the outliers is not a simple removal from the original 1D data array. We remove the outliers of the corresponding time series (2D array) and then do the filtering. Also, we have to use TimeSeries
in order to make GaussianFilter
work over the 2D array.
Block[{data = Transpose[{Range[Length[data1]], data1}]},
data = Delete[data, List /@ olPos];
Table[Show[{
ListPlot[data, PlotTheme -> "Detailed"],
ListLinePlot[GaussianFilter[TimeSeries[data], l], PlotStyle -> Red]
}, PlotLabel -> Row[{"l=", l}], ImageSize -> 300], {l, 0, 30,
10}]]
For impulsive noises, you are probably better off with a Median filter than with a Gaussian Filter, since it is better able to remove the effect of outliers.
data = Uncompress[FromCharacterCode[
Flatten[ImageData[Import["http://i.stack.imgur.com/RZcpj.png"],"Byte"]]]];
smoothed = MedianFilter[data, 5];
Show[ListPlot[data], ListPlot[smoothed, PlotStyle -> Green]]
And here is the same filtering applied to your second data set:
data2 = Uncompress[FromCharacterCode[
Flatten[ImageData[Import["http://i.stack.imgur.com/WYcxd.png"], "Byte"]]]];
Show[ListPlot[data2], ListPlot[MedianFilter[data2, 5], PlotStyle -> Green]]
One thing to beware of, if you simply delete your outliers you are effectively shifting all of the following data. This may or may not be important obviously depending on what you want to do with the result, and on how many outliers you need to drop.
Here is an approach where we keep the outlier positions and replace the bad values with local averages:
outliers = Flatten[Position[data , x_ /; x > 1 || x < 0]]
(data[[#]] = Mean[Select[data[[# - 3 ;; # + 3]], 0 < # < 1 &]] ) & /@
outliers;
Show[{
ListPlot[GaussianFilter[data, 3], Joined -> True] ,
ListPlot[data, PlotStyle -> Red]}]
If you want you can go back and replace the outlier values with the filter data.
data[[outliers]] = GaussianFilter[data, 3][[outliers]]
Now look at the data we "made up"
Show[{
ListPlot[GaussianFilter[data, 3], Joined -> True] ,
ListPlot[data, PlotStyle -> Red]},
Epilog -> {PointSize[0.015],
Point[Transpose[{outliers, data[[outliers]]}]]},
PlotRange -> {{60, 140}, {0, 10^-5}}, AxesOrigin -> {60, 0}]