What are the implications of invalid geometries?
Keeping malformed data is a bad idea, because you can never predict when and where the failure will occur. Moreover, malformed data can cause Heisenbugs, the most vicious and elusive type of bug.
I think it is a bit pointless to enumerate every possible outcome of storing invalid geometries. That said, the consequences can include:
- Wrong results (that is, ST_Distance will return inaccurate or plain wrong figures).
- Database performance issues: keeping malformed data can seriously damage database performance and create huge log files, because every function call will write an error to the log and disrupt ordinary database work.
- Database crashes.
- Application crashes, caused either by receiving malformed data from the database or by receiving an unreasonable result (a negative distance, for example).
- Phantom behaviour (see link above). This is the worst consequence of all. You'll have strange things happening: slowdowns, data loss, crashes, unreasonable results, long pauses, unresponsiveness and many other curses. You might not be able to spot them or reproduce them, because they all fall under the "undefined" category in every documentation.
My advice - if small buffers do not significantly harm your data consistency, use them to prevent any of the above from happening. Keep your data valid.
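As a rough sketch of what such a repair could look like in PostGIS: the table name my_table below is a placeholder, not something from a real schema, and the two statements are alternatives, so run only one of them. A zero-width buffer may drop parts of some self-intersecting shapes, and ST_MakeValid may return a multi-geometry or a collection, so check the results against your column type before committing.

```sql
-- Hypothetical table "my_table" with a geometry column "geom".

-- Option 1: zero-width buffer, applied only to rows that are currently invalid.
UPDATE my_table
SET geom = ST_Buffer(geom, 0)
WHERE NOT ST_IsValid(geom);

-- Option 2 (alternative to option 1): ST_MakeValid, available in PostGIS 2.0 and later.
UPDATE my_table
SET geom = ST_MakeValid(geom)
WHERE NOT ST_IsValid(geom);
```

Running either statement inside a transaction lets you inspect the repaired rows and roll back if the output is not what you expected.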
You can prevent invalid geometries from entering your database in the first place. For PostgreSQL/PostGIS users, this is simple to do with check constraints. For example, for a table public.my_valid_table with a column of polygon geometries geom, use the following SQL/DDL:
ALTER TABLE public.my_valid_table
ADD CONSTRAINT enforce_valid_geom CHECK (st_isvalid(geom));
Note: every existing row in this table must already hold a valid polygon before you add the constraint; otherwise the ALTER TABLE statement fails.
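To see which rows would block the constraint, a query along these lines may help. It is only a sketch: the id column is an assumption for illustration, so substitute whatever key your table actually has.

```sql
-- List rows whose geometry is invalid, together with the reason PostGIS reports.
-- "id" is a hypothetical key column; replace it with your own.
SELECT id,
       ST_IsValidReason(geom) AS reason
FROM public.my_valid_table
WHERE NOT ST_IsValid(geom);
```

If the query returns no rows, the existing data should not prevent the constraint from being added.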
If you then try to insert/add an invalid geometry, you will see an error:
ERROR: new row for relation "my_valid_table" violates check constraint "enforce_valid_geom"
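One way to check that the constraint is working is to try inserting a deliberately invalid shape, such as a self-intersecting "bowtie" polygon. This sketch assumes the table has no other required columns and that the geom column accepts a geometry without an explicit SRID; adjust it to your schema otherwise.

```sql
-- A "bowtie": the first and third edges cross at (0.5, 0.5),
-- so the polygon is self-intersecting and ST_IsValid returns false.
INSERT INTO public.my_valid_table (geom)
VALUES (ST_GeomFromText('POLYGON((0 0, 1 1, 1 0, 0 1, 0 0))'));
```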