What does deriving do/mean in Haskell?

In short:

deriving automatically implements the functions of a few of Haskell's typeclasses, such as Show and Eq. This cannot be done with arbitrary typeclasses, but the ones deriving does work for are simple enough for automatic implementation.

The Show typeclass defines functions for representing values of a data type as a String.

More extensively:

Are you familiar with typeclasses?

https://www.haskell.org/tutorial/classes.html

Typeclasses are similar to interfaces in Java: they declare a few functions, and any data type that wants to use those functions can implement them.

For instance, say we have a class like this:

class Comparable a where
    lessThan :: a -> a -> Bool
    equalsTo :: a -> a -> Bool

Beware of the word class. It means typeclass in the Haskell setting, not the kind of "class" you would hear about in object-oriented languages. The a here is a type variable: a placeholder type, similar to a template parameter in C++ or a generic type parameter in Java.

Let's say we define a data type as follows:

data Color = Red | Green | Blue

To make Comparable work with Color, we implement an instance of Comparable:

instance Comparable Color where
    lessThan Red   Green = True
    lessThan Red   Blue  = True
    lessThan Green Blue  = True
    lessThan _     _     = False

    equalsTo Red   Red   = True
    equalsTo Green Green = True
    equalsTo Blue  Blue  = True
    equalsTo _     _     = False

Roughly speaking, this now allows you to "compare" Red, Green, and Blue with each other. But was there any way that GHC could have automagically guessed that this was the exact "order" you wanted?

Taking a step back, the typeclass Show has a similar structure:

https://hackage.haskell.org/package/base-4.9.1.0/docs/src/GHC.Show.html#Show

class  Show a  where
    showsPrec :: Int -> a -> ShowS
    show      :: a   -> String
    showList  :: [a] -> ShowS

    showsPrec _ x s = show x ++ s
    show x          = shows x ""
    showList ls   s = showList__ shows ls s

Something to notice is that functions within a typeclass may be defined in terms of each other. Indeed, we could have also easily done:

class Comparable a where
    lessThan :: a -> a -> Bool
    equalsTo :: a -> a -> Bool

    greaterThan :: a -> a -> Bool
    greaterThan lhs rhs = not (lessThan lhs rhs || equalsTo lhs rhs)
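
To see the default method in action, here is a minimal, self-contained sketch that combines the class with the Color instance from earlier. It assumes greaterThan should mean "neither less than nor equal"; the instance defines only lessThan and equalsTo and gets greaterThan for free.

```haskell
class Comparable a where
    lessThan :: a -> a -> Bool
    equalsTo :: a -> a -> Bool

    -- Default method, assumed here to mean "neither less than nor equal".
    greaterThan :: a -> a -> Bool
    greaterThan lhs rhs = not (lessThan lhs rhs || equalsTo lhs rhs)

data Color = Red | Green | Blue

-- Only lessThan and equalsTo are written by hand;
-- greaterThan falls back to the default in the class.
instance Comparable Color where
    lessThan Red   Green = True
    lessThan Red   Blue  = True
    lessThan Green Blue  = True
    lessThan _     _     = False

    equalsTo Red   Red   = True
    equalsTo Green Green = True
    equalsTo Blue  Blue  = True
    equalsTo _     _     = False

main :: IO ()
main = do
    print (greaterThan Blue Red)    -- True
    print (greaterThan Red Blue)    -- False
    print (greaterThan Green Green) -- False
```

An instance can still supply its own greaterThan if the default is not what it wants.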

However, the key point is this: for arbitrary user-defined typeclasses, GHC has no idea how their functions should be implemented when you try to associate the typeclass with a data type such as Color or BaseballPlayer. For some typeclasses such as Show, Eq, Ord, etc., where the functionality is simple enough, GHC can generate the instance for you. If you want different behavior, you can of course write the instance yourself instead of deriving it.
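
For comparison, here is a small sketch of what deriving gives you for the built-in classes. For Ord, the derived ordering follows the order in which the constructors are declared, so for this declaration Red < Green < Blue without any hand-written instance.

```haskell
-- Eq, Ord and Show are all derivable out of the box.
data Color = Red | Green | Blue
    deriving (Show, Eq, Ord)

main :: IO ()
main = do
    print (Red < Green)                -- True
    print (Blue == Blue)               -- True
    print (compare Blue Green)         -- GT
    print (maximum [Green, Red, Blue]) -- Blue
```

If you wanted a different order, you would either reorder the constructors or write the Ord instance by hand.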

Indeed, let's experiment by trying to compile the following:

data Color = Red | Green | Blue deriving (Comparable)

The result I get is this:

test.hs:9:43:
    Can't make a derived instance of ‘Comparable Color’:
      ‘Comparable’ is not a derivable class
      Try enabling DeriveAnyClass
    In the data declaration for ‘Color’

Of course, some GHC extensions can be used to extend the power of deriving, but that's for another day :)
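
As a small taste of what the error message hints at, here is a hedged sketch using the DeriveAnyClass extension. The Describable class is hypothetical and made up for this illustration. With this extension GHC generates an empty instance, so every method falls back to its default; it only makes sense for classes whose methods all have defaults.

```haskell
{-# LANGUAGE DeriveAnyClass #-}

-- Hypothetical class for illustration; its one method has a default.
class Describable a where
    describe :: a -> String
    describe _ = "<no description>"

-- Show is derived the usual way; Describable is derived via
-- DeriveAnyClass, which amounts to `instance Describable Color`.
data Color = Red | Green | Blue
    deriving (Show, Describable)

main :: IO ()
main = do
    putStrLn (describe Red) -- <no description>
    print Green             -- Green
```

Note that this would not rescue the Comparable class above as written: lessThan and equalsTo have no defaults, so an empty instance would leave them undefined.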


In this specific case, it generates a Show instance for your type, as follows:

instance Show BaseballPlayer where
    show Pitcher    = "Pitcher"
    show Catcher    = "Catcher"
    show Infielder  = "Infielder"
    show Outfielder = "Outfielder"

In this way, values of type BaseballPlayer can be converted to a String, e.g. by print.

The string is chosen so that it is a valid Haskell expression which, once evaluated, reconstructs the original value.

The general case is a bit more complex, but follows the same idea: converting a value into a Haskell expression-string. For instance:

data T = T Int (Maybe Bool) deriving Show

will make the instance so that

show (T 1 Nothing) = "T 1 Nothing"
show (T 2 (Just False)) = "T 2 (Just False)"

Note how the parentheses are also generated in the last case. This is done using the showsPrec class member, but it's not that important.
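
For the curious, here is a minimal sketch of both points: the precedence argument of showsPrec decides whether parentheses are added, and the resulting string can be read back into the original value. Deriving Read and Eq is an addition for this illustration only.

```haskell
-- Read and Eq are derived here only so the round trip can be checked.
data T = T Int (Maybe Bool)
    deriving (Show, Read, Eq)

main :: IO ()
main = do
    let v = T 2 (Just False)
    putStrLn (show v)                       -- T 2 (Just False)
    -- Precedence 11 means "appearing as a constructor argument",
    -- so a compound value gets wrapped in parentheses.
    putStrLn (showsPrec 11 (Just False) "") -- (Just False)
    putStrLn (showsPrec 0  (Just False) "") -- Just False
    -- The shown string parses back to an equal value.
    print (read (show v) == v)              -- True
```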


Deriving means that the compiler automatically generates instances of certain type classes for your data type. In this case BaseballPlayer derives Show, which means any function that requires an instance of Show can be used with BaseballPlayer.

Automatic deriving helps you avoid boilerplate. The most common type classes to derive are Show and Eq, because the compiler can generate very sensible implementations for them.
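
As a rough sketch of what that buys you, assume BaseballPlayer is declared with the four constructors mentioned above (the exact declaration is a guess). Functions such as print (needs Show) and elem (needs Eq) then work without any hand-written instance:

```haskell
-- Assumed declaration; Eq is added alongside Show for illustration.
data BaseballPlayer = Pitcher | Catcher | Infielder | Outfielder
    deriving (Show, Eq)

main :: IO ()
main = do
    print Pitcher                                  -- Pitcher
    print (Catcher == Catcher)                     -- True
    print (Infielder `elem` [Pitcher, Outfielder]) -- False
    putStrLn ("Position: " ++ show Outfielder)     -- Position: Outfielder
```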