Why prefer monoids over semigroups in Haskell? Why do we need mempty?

mempty is certainly needed for some applications. For instance, what should the result of foldMap over an empty list be? There's no value you could feed to the mapped function to obtain an m, so there needs to be a “default”. It's nice if this default is the identity element of an associative operation: for example, that allows arbitrary regrouping/chunking of the fold without changing the result. When folding over a tree, quite a lot of memptys may show up in the middle, but you always know they will get “squeezed out” of the end result, so you don't depend on the exact tree layout for reproducible results.
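As a minimal sketch of the point about empty input and chunking, assuming the Sum wrapper from Data.Monoid as the monoid (the names total and totalChunked are made up for illustration):

```haskell
import Data.Monoid (Sum (..))

-- Total of a list, computed in one pass with foldMap.
-- For an empty list the result is mempty, i.e. Sum 0.
total :: [Int] -> Int
total = getSum . foldMap Sum

-- Same total, but each chunk is folded separately and the per-chunk
-- results are combined afterwards. Empty chunks contribute mempty,
-- which disappears from the final result.
totalChunked :: [[Int]] -> Int
totalChunked = getSum . foldMap (foldMap Sum)
```

Here total [] is 0, and totalChunked [[1,2],[],[3]] gives the same answer as total [1,2,3], namely 6, regardless of how the elements are split into chunks.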

That said, you're right with your concern: Semigroup would quite often be sufficient, and it would probably be better if this class were used wherever mempty isn't needed (since there are actually a few quite nifty types that are Semigroup but not Monoid). However, the early design of the standard library apparently did not consider this important enough to warrant the extra class, so lots of code came to rely on Monoid without really needing a monoid.
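One example of such a type, sketched under the assumption that Max from Data.Semigroup is a fair representative: Max Integer is a Semigroup, but since Integer has no Bounded instance there is no smallest value to serve as mempty. The helper name largest is hypothetical.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import Data.Semigroup (Max (..), sconcat)

-- Max Integer is a Semigroup but not a Monoid: the Monoid instance for
-- Max needs Bounded, and Integer is unbounded.
-- sconcat only needs Semigroup because its input is a NonEmpty list.
largest :: NonEmpty Integer -> Integer
largest = getMax . sconcat . fmap Max
```

For instance, largest (3 :| [10, 7]) evaluates to 10. There is no sensible way to give this function a plain list argument without either a Bounded constraint or a Maybe result.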

So, much the same issue as we used to have with Monad: lots of code really only needed Applicative or even just Functor, but for historical reasons was stuck with Monad anyway until the Applicative-Monad Proposal (AMP). Ultimately the problem is that Haskell class hierarchies can't really be refined after the fact: you can extend them downwards by adding new subclasses, but not upwards, since inserting a new superclass above an existing class breaks existing instances.

These days base provides classes for both Semigroup and Monoid (and since base 4.11, shipped with GHC 8.4, Semigroup is a superclass of Monoid), so you don't have to choose between the two abstractions: just use the weakest assumption suitable for your purpose.
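To illustrate picking the weakest assumption, here is a rough sketch with two hypothetical helpers, joinWith and joinWithList. It assumes a base version in which Monoid implies Semigroup (4.11 or later).

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- Only Semigroup is required: the input is guaranteed to be non-empty,
-- so there is always a first element to start from.
joinWith :: Semigroup a => a -> NonEmpty a -> a
joinWith sep (x :| xs) = foldl (\acc y -> acc <> sep <> y) x xs

-- Monoid is required here, because an empty list still needs a result.
joinWithList :: Monoid a => a -> [a] -> a
joinWithList _ [] = mempty
joinWithList sep (x : xs) = joinWith sep (x :| xs)
```

With String as the type, joinWith ", " ("a" :| ["b", "c"]) gives "a, b, c", and joinWithList ", " [] gives the empty string.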

That said, Monoid does seem to be somewhat more useful in daily practice than Semigroup. Why is this? I can think of a couple of lines of argument:

Things that are useful in programming, such as collections, tend to have zero-or-many semantics; Monoid naturally abstracts composable collections of things in a way that Semigroup doesn't. (For example, [], the free Monoid, is a classic example of a collection.)
Types that are Semigroups but not Monoids tend to lose precision when you start composing them. Given two non-empty lists xs and ys, we know that xs <> ys has at least two elements. But its type only promises that it has at least one; useful information has been discarded.
From a social perspective, folks are scared enough of Cabal Hell to avoid pulling in extra dependencies just to deal with "non-empty things". When you want to work with monads-sans-return and categories-sans-id, installing the semigroupoids package requires a chunk of the "Kmett Platform".

It's the same as with all the other abstractions: the more operations you have at your disposal, the more complex things you can abstract over. Concerning Monoid specifically, here are a few of the most popular functions for which Semigroup is just not enough:

foldMap :: (Foldable t, Monoid m) => (a -> m) -> t a -> m

fold :: (Foldable t, Monoid m) => t m -> m

mconcat :: Monoid a => [a] -> a
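If all you have is a Semigroup, one common workaround is to lift it into Maybe, which is a Monoid whenever its contents are a Semigroup (in base 4.11 or later). The following is a sketch only; foldMapMaybe and smallest are hypothetical names, not functions from base.

```haskell
import Data.Semigroup (Min (..))

-- Maybe m is a Monoid whenever m is a Semigroup: Nothing plays the
-- role of mempty, and Just values are combined with (<>).
foldMapMaybe :: (Foldable t, Semigroup m) => (a -> m) -> t a -> Maybe m
foldMapMaybe f = foldMap (Just . f)

-- Smallest element of a container, or Nothing if it is empty.
-- Min a is only a Semigroup here, since no Bounded constraint is assumed.
smallest :: (Foldable t, Ord a) => t a -> Maybe a
smallest = fmap getMin . foldMapMaybe Min
```

So smallest [3, 1, 2] is Just 1, while smallest on an empty list is Nothing: the missing mempty shows up in the result type instead.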
