Why does Enumerable not have a length attribute in Ruby?
Enumerable has the count method, which is usually going to be the intuitive "length" of the enumeration.
But why not call it "length"? Well, because it operates very differently. In Ruby's built-in data structures like Array and Hash, length simply retrieves the pre-computed size of the data structure. It should always return instantly.
For Enumerable#count, however, there's no way for it to know what sort of structure it's operating on and thus no quick, clever way to get the size of the enumeration (this is because Enumerable is a module, and can be included in any class). The only way for it to get the size of the enumeration is to actually enumerate through it and count as it goes. For infinite enumerations, count will (appropriately) loop forever and never return.
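To make the "count as it goes" point concrete, here is a minimal sketch. The Countdown class and its yields counter are hypothetical names invented for this illustration; the only thing taken from Ruby itself is that Enumerable#count is built on top of #each.

```ruby
# Hypothetical class for illustration: it mixes in Enumerable and records
# how many elements #each has yielded so far.
class Countdown
  include Enumerable

  attr_reader :yields

  def initialize(from)
    @from = from
    @yields = 0
  end

  # The only method Enumerable needs from the including class.
  def each
    @from.downto(1) do |n|
      @yields += 1
      yield n
    end
  end
end

countdown = Countdown.new(5)

puts countdown.count                 # 5, obtained by walking every element
puts countdown.yields                # 5, proof that #each ran to the end
puts countdown.respond_to?(:length)  # false, Enumerable adds no #length
```

Calling count a second time would walk the sequence again, which is why it is a poor fit for anything expensive or unbounded.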
Enumerables are not guaranteed to have lengths. The only requirement for a class that mixes in Enumerable is that it implements #each, which yields each item in the series in turn. Methods such as #max, #min, and #sort additionally need the yielded items to respond to #<=> so that they can be compared with one another. Methods like #sort will enumerate the entire collection over the course of sorting, but may not know the bounds of the set ahead of time. Consider:
class RandomSizeEnumerable
  include Enumerable

  # Yields random integers in 0...1000 until it happens to draw 500.
  def each
    value = rand 1000
    while value != 500
      yield value
      value = rand 1000
    end
  end

  # No #<=> is defined here: #max, #min, and #sort compare the yielded
  # elements using the elements' own #<=>, which Integer already provides.
end
Each call to #each keeps yielding random values until it draws 500, at which point enumeration stops. #sort collects the yielded values and sorts them. However, a #length method is meaningless in this context, because the length is unknowable until the iterator has been exhausted!
We can call #length on the result of things like #sort, though, since they return an array:
p RandomSizeEnumerable.new.sort.length # e.g. 321 (varies per run)
p RandomSizeEnumerable.new.sort.length # e.g. 227
p RandomSizeEnumerable.new.sort.length # e.g. 299
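As a side note on where comparison happens, the sketch below suggests that #sort relies on the yielded elements responding to #<=>, not on the class that includes Enumerable. Bag is a hypothetical wrapper class made up for this example.

```ruby
# Hypothetical wrapper used only for illustration.
class Bag
  include Enumerable

  def initialize(*items)
    @items = items
  end

  # Delegates iteration to the underlying array.
  def each(&block)
    @items.each(&block)
    self
  end
end

# Integers implement a meaningful #<=>, so sorting works although Bag
# itself defines no comparison method.
p Bag.new(3, 1, 2).sort  # [1, 2, 3]

# Plain Objects cannot be ordered against each other, so #sort raises.
begin
  Bag.new(Object.new, Object.new).sort
rescue ArgumentError => e
  puts "sort failed: #{e.message}"
end
```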
Conventionally, #length is used when the length is known and can be returned in constant time, whereas #count (and sometimes #size) tends to be used when the length may not be known ahead of time and has to be computed by iterating over the result set (thus taking linear time). If you need the size of the result set provided by an Enumerable, use #count rather than .to_a.length, which builds an intermediate array just to measure it.
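A short sketch of the difference, assuming a generator-style enumerator whose length Ruby cannot know without running it. The evens helper is a hypothetical name invented for this example; Enumerator#size is documented to return nil when the size cannot be calculated lazily.

```ruby
# Hypothetical helper: builds an enumerator from a block, so Ruby has no
# way to know how many values it will produce without running the block.
def evens
  Enumerator.new do |yielder|
    2.step(10, 2) { |n| yielder << n }
  end
end

p [1, 2, 3].each.size  # 3, the underlying array already knows its length
p evens.size           # nil, the size cannot be determined lazily
p evens.count          # 5, found by iterating through every value
p evens.to_a.length    # 5, same answer but allocates an array first
```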