DeleteDuplicates[] does not work as expected on floating point values
It helps to understand the difference between tools intended for structural operations and those intended for mathematical operations. DeleteDuplicates is, generally speaking, one of the former. As such it compares the exact FullForm of the objects, or at least something close (caveat).
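To make the distinction concrete, here is a minimal sketch using the real parts of the first two entries of m as they appear in InputForm. The names a and b are only illustrative. Equal (==) is a numerical comparison that tolerates differences in the last few bits of machine numbers, SameQ (===) tolerates far less, and an explicit tolerance lets you decide for yourself what counts as "the same":

```mathematica
(* a and b are hypothetical names; the values are taken from m // InputForm *)
a = 1.0642275928442373;
b = 1.0642275928442366;

a == b                 (* Equal: numerical comparison with a built-in tolerance *)
a === b                (* SameQ: much stricter, close to a structural comparison *)
Abs[a - b] < 10^-12    (* explicit tolerance, as in a custom test function *)
```

Evaluating the three lines side by side shows which notion of equality each test applies to numbers that differ only in their last digits.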
As b.gatessucks recommends in a comment, you can use a mathematical comparison function as the equivalence test of DeleteDuplicates, e.g.:
DeleteDuplicates[m, Abs[#1 - #2] < 10^-12 &]
{1.06423 + 0.0968739 I, 0.0250407 + 1.00352 I}
Incidentally, you could also use Union, but the syntax is a bit different. Note the ( ).
Union[m, SameTest -> (Abs[#1 - #2] < 10^-12 &)]
{0.0250407 + 1.00352 I, 1.06423 + 0.0968739 I}
Using InputForm to show all of the digits of your expression, you can see that they are not structurally identical in the (approximate) way that Mathematica "sees" them:
m // InputForm
{1.0642275928442373 + 0.09687392021742822*I, 1.0642275928442366 + 0.09687392021742817*I, 1.0642275928442366 + 0.09687392021742797*I, 1.064227592844237 + 0.09687392021742822*I, 1.0642275928442373 + 0.09687392021742852*I, 1.0642275928442366 + 0.09687392021742793*I, 1.0642275928442368 + 0.09687392021742801*I, 0.025040728196256346 + 1.0035162552538588*I, 1.0642275928442377 + 0.0968739202174282*I, 1.0642275928442375 + 0.0968739202174283*I}
Performance
Yves reminded me to mention something about the performance of using a custom comparison function in DeleteDuplicates or Union as I did above. For long lists this is always considerably slower than using the default method. I gave an example with timings in How to represent a list as a cycle.
To apply that method here we could Round the numbers beforehand:
Round[m, 10^-12] // DeleteDuplicates // N
{1.06423 + 0.0968739 I, 0.0250407 + 1.00352 I}
I added // N to convert back to machine precision, but the values will not be precisely the same as the originals. This probably doesn't matter if you consider numbers this close to be duplicates, but should you want the unchanged numbers you could use GatherBy and get performance that is not far behind:
First /@ GatherBy[m, Round[#, 10^-6] &]
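For a rough sense of the performance difference, here is a timing sketch on made-up data. The names base, big, r1, r2, t1 and t2, the list size and the noise level are all assumptions for illustration, and absolute timings will vary by machine and version. The list contains 1000 random values plus a near-copy of each, so there are many distinct elements for the pairwise test to work through:

```mathematica
(* Hypothetical test data: 1000 random values followed by slightly perturbed copies *)
SeedRandom[1];
base = RandomReal[1, 1000];
big = Join[base, base + RandomReal[{-1, 1}*10.^-15, 1000]];

(* Custom pairwise test: each element is compared against the elements kept so far *)
t1 = First @ AbsoluteTiming[
    r1 = DeleteDuplicates[big, Abs[#1 - #2] < 10^-12 &];];

(* Grouping by a rounded key, keeping the first original value of each group *)
t2 = First @ AbsoluteTiming[
    r2 = First /@ GatherBy[big, Round[#, 10^-12] &];];

{t1, t2}
Length /@ {r1, r2}
```

The two counts should be close but need not be identical: two values that lie within the tolerance of each other can still fall on opposite sides of a rounding boundary and so end up in different groups.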
Version 10.0 introduced DeleteDuplicatesBy, which works similarly to the GatherBy method; it has the following syntax:
DeleteDuplicatesBy[m, Round[#, 10^-6] &]
However, it may not perform as well as GatherBy; see:
- DeleteDuplicatesBy is not performing as I'd hoped. Am I missing something?
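
As a quick sanity check that the GatherBy and DeleteDuplicatesBy forms select the same elements, here is a small sketch. It uses a shortened copy of m (three of the near-duplicates plus the one distinct value, copied from the InputForm output above) under the hypothetical name mShort, and it requires version 10.0 or later:

```mathematica
(* mShort is an illustrative, shortened copy of m *)
mShort = {
   1.0642275928442373 + 0.09687392021742822 I,
   1.0642275928442366 + 0.09687392021742817 I,
   0.025040728196256346 + 1.0035162552538588 I,
   1.0642275928442366 + 0.09687392021742797 I};

viaGatherBy = First /@ GatherBy[mShort, Round[#, 10^-6] &];
viaDeleteDuplicatesBy = DeleteDuplicatesBy[mShort, Round[#, 10^-6] &];

(* Both keep the first element seen for each rounded key, in order of first appearance *)
viaGatherBy === viaDeleteDuplicatesBy
```

Both results hold the original, unrounded numbers, so unlike the Round-then-DeleteDuplicates form no // N is needed afterwards.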