How to compare two structures via pointer equality in Elixir / Erlang
There is no "official" way to do this, and I would say that if you think you actually need to do this, you're doing something wrong and should ask another question about how to achieve the goal you want to achieve. So this answer is offered in the spirit of playfulness and exploration, in the hope that it spreads some interesting knowledge about the Erlang/Elixir VM.
There is a function, erts_debug:size/1
, that tells you how many memory "words" an Erlang/Elixir term occupies. This table tells you how many words various terms use. In particular, a tuple uses 1 word, plus 1 word for each element, plus the storage space for any elements that are "non-immediate". We're using small integers as elements, and they are "immediates" and thus "free". So this checks out:
> :erts_debug.size({1,2})
3
Now let's make a tuple containing two of those tuples:
> :erts_debug.size({{1,2}, {1,2}})
9
That makes sense: the two inner tuples are 3 words each, and the outer tuple is 1+2 words, for a total of 9 words.
But what if we put the inner tuple in a variable?
> x = {1, 2}
{1, 2}
> :erts_debug.size({x, x})
6
Look, we saved 3 words! That's because the contents of x
only counts once; the outer tuple points to the same inner tuple twice.
So let's write a little function that does this for us:
defmodule Test do
def same?(a, b) do
a_size = :erts_debug.size(a)
b_size = :erts_debug.size(b)
# Three words for the outer tuple; everything else is shared
a_size == b_size and :erts_debug.size({a,b}) == a_size + 3
end
end
System working? Seems to be:
> Test.same? x, {1,2}
false
> Test.same? x, x
true
Goal accomplished!
However, say we're trying to call this function from another function in a compiled module, not from the iex shell:
def try_it() do
x = {1, 2}
a1 = {"a", {1, 2}}
a2 = {"a", {1, 2}}
a3 = {"a", x}
IO.puts "a1 and a2 same? #{same?(a1,a2)}"
IO.puts "a1 and a3 same? #{same?(a1,a3)}"
IO.puts "a3 and a2 same? #{same?(a3,a2)}"
end
That prints:
> Test.try_it
a1 and a2 same? true
a1 and a3 same? true
a3 and a2 same? true
That's because the compiler is smart enough to see that those literals are equal, and coalesces them to one term while compiling.
Note that this sharing of terms is lost when terms are sent to another process, or stored in / retrieved from an ETS table. See the Process Messages section of the Erlang Efficiency Guide for details.
Let me answer my question:
There is no need for developers to do pointer comparison explicitly, because Elixir already does that internally, in pattern matching and in operators ==
and ===
(via the corresponding Erlang operators).
For example, given
a1 = {0, {1, 2}}
a2 = {1, {1, 2}}
x = {a1, a2}
s = {1, 2}
b1 = {0, s}
b2 = {1, s}
y = {b1, b2}
in IEx we have
Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> a1 = {0, {1, 2}}
{0, {1, 2}}
iex(2)> a2 = {1, {1, 2}}
{1, {1, 2}}
iex(3)> x = {a1, a2}
{{0, {1, 2}}, {1, {1, 2}}}
iex(4)> s = {1, 2}
{1, 2}
iex(5)> b1 = {0, s}
{0, {1, 2}}
iex(6)> b2 = {1, s}
{1, {1, 2}}
iex(7)> y = {b1, b2}
{{0, {1, 2}}, {1, {1, 2}}}
iex(8)> :erts_debug.size(x)
15
iex(9)> :erts_debug.size(y)
12
iex(10)> x == y
true
iex(11)> x === y
true
That is, x
and y
are content equal, but memory different, because y
occupies less memory than x
as it internally shares substructure s
.
In short, ==
and ===
do both content and pointer comparison. Pointer comparison is the most efficient way for Erlang to avoid traversing the same substructure on both sides of the comparison, thus saving lots of time for large shared substructures.
Now, if structural duplication across two structures is a concern, like when they are loaded from two large files with similar content, then one must compress them into two new structures sharing the parts in which they are content equal. This was the case of a1
and a2
which were compressed as b1
and b2
.
Erlang/OTP 22 (and possibly earlier) provides :erts_debug.same/2
, which will allow you to do the desired memory pointer test. However, note the function is undocumented and in a module named erts_debug
, so you should only rely on it for debugging and testing, and never in production code.
In my almost 9 years using Erlang/Elixir, I have only used it once, which is to test that we are not needlessly allocating structs in Ecto. Here is the commit for reference.