Scala: guidelines on return types - when to prefer Seq, Iterable, or Traversable

  • Use Seq by default everywhere.
  • Use IndexedSeq when you need to access by index.
  • Use anything else only in special circumstances.
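
A minimal sketch of the first two guidelines. The names `Guidelines`, `evens`, and `middle` are hypothetical and chosen only for illustration:

```scala
object Guidelines {
  // Default: accept and return Seq, whatever the underlying implementation is.
  def evens(xs: Seq[Int]): Seq[Int] = xs.filter(_ % 2 == 0)

  // IndexedSeq signals that the method relies on access by index.
  def middle[A](xs: IndexedSeq[A]): Option[A] =
    if (xs.isEmpty) None else Some(xs(xs.length / 2))
}
```

Callers can pass a List or a Vector to `evens` unchanged, while `middle` only accepts collections that advertise indexed access.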

These are the "common-sense" guidelines. They are simple and work well in practice while balancing principles and performance. The principles are:

  1. Use a type that reflects how the data is organized (thanks OP and ziggystar).
  2. Use interface types in both method arguments and return types. Both inputs and return types of an API benefit from the flexibility of generality.

Seq satisfies both principles. As described in http://docs.scala-lang.org/overviews/collections/seqs.html:

A sequence is a kind of iterable that has a [finite] length and whose elements have fixed index positions, starting from 0.

90% of the time, your data is a Seq.

Other notes:

  • List is an implementation type, so you shouldn't use it in an API. A Vector, for instance, can't be used as a List without going through a conversion.
  • Iterable doesn't define length. Iterable abstracts across finite sequences and potentially infinite streams. Most of the time you are dealing with finite sequences, which do have a length, and Seq reflects that. You often won't actually use length, but it is needed often enough, and is easy to provide, so use Seq.
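
To illustrate the point about List being an implementation type, here is a small sketch. `ApiTypes`, `sumList`, and `sumSeq` are made-up names, not part of any library:

```scala
object ApiTypes {
  // Hypothetical API that demands the implementation type List.
  def sumList(xs: List[Int]): Int = xs.sum

  // The same API written against the interface type Seq.
  def sumSeq(xs: Seq[Int]): Int = xs.sum

  val data: Vector[Int] = Vector(1, 2, 3)

  // sumList(data) would not compile: a Vector is not a List.
  val viaConversion: Int = sumList(data.toList) // builds a new List first
  val direct: Int = sumSeq(data)                // the Vector is passed as-is
}
```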

Drawbacks:

There are some slight downsides to these "common-sense" conventions.

  • You can't use List cons pattern matching, i.e. case head :: tail => ... . You can match with the +: and :+ extractors instead. Importantly, however, matching on Nil still works, as described in "Scala: Pattern matching Seq[Nothing]".
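
A short sketch of pattern matching on a Seq without the List cons extractor. `SeqMatching`, `describe`, and `lastOf` are hypothetical names used only for this example:

```scala
object SeqMatching {
  // Nil matches any empty Seq because the pattern compares with ==.
  def describe(xs: Seq[Int]): String = xs match {
    case Nil          => "empty"
    case head +: tail => s"head=$head, rest has ${tail.length}"
  }

  // :+ splits off the last element instead of the first.
  def lastOf(xs: Seq[Int]): Option[Int] = xs match {
    case _ :+ last => Some(last)
    case _         => None
  }
}
```

Both methods work the same way whether the caller passes a List or a Vector.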

Footnotes:

  • I'm not discussing Map here because the question, sensibly, doesn't ask about it.
  • I'm only addressing immutable collections here.
  • The guidelines I suggest are consistent with "Should I use List[A] or Seq[A] or something else?"

This is a good question. You have to balance two concerns:

  • (1) try to keep your API general, so you can change the implementation later
  • (2) give the caller some useful operations to perform on the collection

Here (1) asks you to be as unspecific as possible about the type (e.g. Iterable over Seq), and (2) asks for the opposite.

Even if the return type is just Iterable, you can still return, say, a Vector. A caller that wants extra power can then call .toSeq or .toIndexedSeq on the result, and that operation is cheap for a Vector.
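
As a rough sketch of that idea, with `GeneralReturn` and `ids` as hypothetical names:

```scala
object GeneralReturn {
  // Declared as Iterable, implemented with a Vector.
  def ids: Iterable[Int] = Vector(10, 20, 30)

  // The caller opts in to indexed access when it needs it.
  val indexed: IndexedSeq[Int] = ids.toIndexedSeq
  val second: Int = indexed(1)
}
```

The API stays free to swap the Vector for another collection later; only the cost of the conversion on the caller's side would change.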

As a measure of the balance, I would add a third point:

  • (3) use a type that reflects how the data is organised. E.g. when you can assume that the data has an order, give a Seq. If you can assume that no two equal objects can occur, give a Set. Etc.
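
For instance, the same input can be exposed under two different types depending on what the result means. This is only a sketch, and `DataShape`, `wordsInOrder`, and `distinctWords` are invented names:

```scala
object DataShape {
  // Order and duplicates matter here, so the return type is Seq.
  def wordsInOrder(text: String): Seq[String] = text.split(" ").toVector

  // Only membership matters here, so the return type is Set.
  def distinctWords(text: String): Set[String] = wordsInOrder(text).toSet
}
```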

Here are my rules of thumb:

  • try to use only a small set of collections: Set, Map, Seq, IndexedSeq
  • I often violate the previous rule, though, by using List instead of Seq. It allows the caller to do pattern matching with the cons extractor
  • use immutable types only (e.g. collection.immutable.Set, collection.immutable.IndexedSeq)
  • do not use concrete implementations (Vector), but the general type (IndexedSeq) which gives the same API
  • if you are encapsulating a mutable structure, only return Iterator instances; the caller can then easily generate a strict structure, e.g. by calling toList on it
  • if your API is small and clearly tuned towards "big data throughput", use IndexedSeq
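
The rule about encapsulating a mutable structure could look roughly like this. `EventLog`, `record`, and `events` are hypothetical names, and the sketch assumes the caller materialises the iterator before the log is modified again:

```scala
import scala.collection.mutable.ArrayBuffer

final class EventLog {
  // The mutable buffer is private and never handed out directly.
  private val buffer = ArrayBuffer.empty[String]

  def record(event: String): Unit = { buffer += event }

  // Only an Iterator escapes; callers build their own strict copy from it.
  def events: Iterator[String] = buffer.iterator
}
```

A caller that needs a stable snapshot calls `log.events.toList` and owns the resulting immutable List.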

Of course, this is my personal choice, but I hope it sounds sane.


Make your method's return type as specific as possible. Then if the caller wants to keep it as a SuperSpecializedHashMap or type it as a GenTraversableOnce, they can. This is why the compiler infers the most specific type by default.
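
A small sketch of this opposing view, with `SpecificReturn` and `inferred` as made-up names:

```scala
object SpecificReturn {
  // No annotation: the compiler infers the specific type Vector[Int].
  def inferred = Vector(1, 2, 3)

  // The caller decides whether to keep the specific type or widen it.
  val asVector: Vector[Int] = inferred
  val asIterable: Iterable[Int] = inferred
}
```

The trade-off is that the specific type becomes part of the API, so changing the implementation later can break callers that relied on it.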
