Plane fitting in a 3d point cloud

Update

This functionality is now integrated in https://github.com/daavoo/pyntcloud and makes the plane fitting process much simplier:

Given a point cloud:

enter image description here

You just need to add a scalar field like this:

is_floor = cloud.add_scalar_field("plane_fit")

Wich will add a new column with value 1 for the points of the plane fitted.

You can visualize the scalar field:

enter image description here


Old answer

I think that you could easily use PCA to fit the plane to the 3D points instead of regression.

Here is a simple PCA implementation:

def PCA(data, correlation = False, sort = True):
""" Applies Principal Component Analysis to the data

Parameters
----------        
data: array
    The array containing the data. The array must have NxM dimensions, where each
    of the N rows represents a different individual record and each of the M columns
    represents a different variable recorded for that individual record.
        array([
        [V11, ... , V1m],
        ...,
        [Vn1, ... , Vnm]])

correlation(Optional) : bool
        Set the type of matrix to be computed (see Notes):
            If True compute the correlation matrix.
            If False(Default) compute the covariance matrix. 
            
sort(Optional) : bool
        Set the order that the eigenvalues/vectors will have
            If True(Default) they will be sorted (from higher value to less).
            If False they won't.   
Returns
-------
eigenvalues: (1,M) array
    The eigenvalues of the corresponding matrix.
    
eigenvector: (M,M) array
    The eigenvectors of the corresponding matrix.

Notes
-----
The correlation matrix is a better choice when there are different magnitudes
representing the M variables. Use covariance matrix in other cases.

"""

mean = np.mean(data, axis=0)

data_adjust = data - mean

#: the data is transposed due to np.cov/corrcoef syntax
if correlation:
    
    matrix = np.corrcoef(data_adjust.T)
    
else:
    matrix = np.cov(data_adjust.T) 

eigenvalues, eigenvectors = np.linalg.eig(matrix)

if sort:
    #: sort eigenvalues and eigenvectors
    sort = eigenvalues.argsort()[::-1]
    eigenvalues = eigenvalues[sort]
    eigenvectors = eigenvectors[:,sort]

return eigenvalues, eigenvectors

And here is how you could fit the points to a plane:

def best_fitting_plane(points, equation=False):
""" Computes the best fitting plane of the given points

Parameters
----------        
points: array
    The x,y,z coordinates corresponding to the points from which we want
    to define the best fitting plane. Expected format:
        array([
        [x1,y1,z1],
        ...,
        [xn,yn,zn]])
        
equation(Optional) : bool
        Set the oputput plane format:
            If True return the a,b,c,d coefficients of the plane.
            If False(Default) return 1 Point and 1 Normal vector.    
Returns
-------
a, b, c, d : float
    The coefficients solving the plane equation.

or

point, normal: array
    The plane defined by 1 Point and 1 Normal vector. With format:
    array([Px,Py,Pz]), array([Nx,Ny,Nz])
    
"""

w, v = PCA(points)

#: the normal of the plane is the last eigenvector
normal = v[:,2]
   
#: get a point from the plane
point = np.mean(points, axis=0)


if equation:
    a, b, c = normal
    d = -(np.dot(normal, point))
    return a, b, c, d
    
else:
    return point, normal    

However as this method is sensitive to outliers you could use RANSAC to make the fit robust to outliers.

There is a Python implementation of ransac here.

And you should only need to define a Plane Model class in order to use it for fitting planes to 3D points.

In any case if you can clean the 3D points from outliers (maybe you could use a KD-Tree S.O.R filter to that) you should get pretty good results with PCA.

Here is an implementation of an S.O.R:

def statistical_outilier_removal(kdtree, k=8, z_max=2 ):
""" Compute a Statistical Outlier Removal filter on the given KDTree.

Parameters
----------                        
kdtree: scipy's KDTree instance
    The KDTree's structure which will be used to
    compute the filter.
    
k(Optional): int
    The number of nearest neighbors wich will be used to estimate the 
    mean distance from each point to his nearest neighbors.
    Default : 8
    
z_max(Optional): int
    The maximum Z score wich determines if the point is an outlier or 
    not.
    
Returns
-------
sor_filter : boolean array
    The boolean mask indicating wherever a point should be keeped or not.
    The size of the boolean mask will be the same as the number of points
    in the KDTree.
    
Notes
-----    
The 2 optional parameters (k and z_max) should be used in order to adjust
the filter to the desired result.

A HIGHER 'k' value will result(normally) in a HIGHER number of points trimmed.

A LOWER 'z_max' value will result(normally) in a HIGHER number of points trimmed.

"""

distances, i = kdtree.query(kdtree.data, k=k, n_jobs=-1) 

z_distances = stats.zscore(np.mean(distances, axis=1))

sor_filter = abs(z_distances) < z_max

return sor_filter

You could feed the function with a KDtree of your 3D points computed maybe using this implementation