Detecting and fixing outliers in a GPS trajectory
Algorithm I use.
- Calculate Euclidean minimum spanning tree of points:
- Find 2 points most far apart from each other on this network
- Find shortest route between them:
As one can see it might cut corner on a sharp turn.
I have ArcGIS python implementation of above algorithm, it uses networkx module. Let me know if this is of interest and I'll update my answer with the script
UPDATE:
# Connects points to make polyline. Makes 1 line at a time
# Tool assumes that 1st layer in Table of Conternt is TARGET polyline feature class,
# second layer in TOC is SOURCE point fc.
# If no selection found in SOURCE layer, works on entire dataset
import arcpy, traceback, os, sys
import itertools as itt
from math import sqrt
sys.path.append(r'C:\Users\felix_pertziger\AppData\Roaming\Python\Python27\site-packages')
import networkx as nx
from networkx import dijkstra_path_length
try:
def showPyMessage():
arcpy.AddMessage(str(time.ctime()) + " - " + message)
def CheckLayerLine(infc):
d=arcpy.Describe(infc)
theType=d.shapeType
if theType!="Polyline":
arcpy.AddWarning("\nTool designed to work with polylines as TARGET!")
raise NameError, "Wrong input\n"
return d
def CheckLayerPoint(infc):
d=arcpy.Describe(infc)
theType=d.shapeType
if theType!="Point":
arcpy.AddWarning("\nTool designed to work with points as SOURCE!")
raise NameError, "Wrong input\n"
return d
mxd = arcpy.mapping.MapDocument("CURRENT")
layers = arcpy.mapping.ListLayers(mxd)
if len(layers)<=1:
arcpy.AddWarning("\nNot enough layers in the view!")
raise NameError, "Wrong input\n"
destLR, sourceLR=layers[0],layers[1]
a = CheckLayerPoint(sourceLR);d = CheckLayerLine(destLR)
# copy all points to manageable list
g=arcpy.Geometry()
geometryList=arcpy.CopyFeatures_management(sourceLR,g)
nPoints=len(geometryList)
arcpy.AddMessage('Computing minimum spanning tree')
list2connect=[p.firstPoint for p in geometryList]
# create network
p=list(itt.combinations(range(nPoints), 2))
arcpy.SetProgressor("step", "", 0, len(p),1)
G=nx.Graph()
for f,t in p:
p1=list2connect[f]
p2=list2connect[t]
dX=p2.X-p1.X;dY=p2.Y-p1.Y
lenV=sqrt(dX*dX+dY*dY)
G.add_edge(f,t,weight=lenV)
arcpy.SetProgressorPosition()
arcpy.AddMessage(len(G.edges()))
mst=nx.minimum_spanning_tree(G)
del G
# find remotest pair
arcpy.AddMessage(len(mst.edges()))
length0=nx.all_pairs_dijkstra_path_length(mst)
lMax=0
for f,t in p:
lCur=length0[f][t]
if lCur>lMax:
lMax=lCur
best=(f,t)
gL=nx.dijkstra_path(mst,best[0],best[1])
del mst
nPoints=len(gL)
ordArray=arcpy.Array()
for i in gL: ordArray.add(list2connect[i])
# append line to TARGET
curT = arcpy.da.InsertCursor(destLR,"SHAPE@")
curT.insertRow((arcpy.Polyline(ordArray),))
arcpy.RefreshActiveView()
del curT
except:
message = "\n*** PYTHON ERRORS *** "; showPyMessage()
message = "Python Traceback Info: " + traceback.format_tb(sys.exc_info()[2])[0]; showPyMessage()
message = "Python Error Info: " + str(sys.exc_type)+ ": " + str(sys.exc_value) + "\n"; showPyMessage()
One idea is to create a script that lists the angles (and maybe the length also) of every segment of your path. Now you can compare the values of every segment with its direct neighbours (and possibly the second neighbours also to increase accuracy) and select all those points where the values exceed a given threashold-value. Finally simply delete the points from your path.
Also worth looking at is the Median-5 method.
Each x (or y) coordinate is set to the median of the 5 x (or y) values around it in sequence (i.e. itself, the two previous values and the two subsequent values).
e.g. x3 = median(x1,x2,x3,x4,x5) y3 = median(y1,y2,y3,y4,y5) etc.
Method is quick and is also easy to use on streaming data.