What's the difference between the "data" and "type" keywords?
The type keyword declares a type synonym: a new name for an existing type. For example, this is how String is defined in the standard library:
type String = [Char]
String is another name for a list of Char values. GHC replaces every use of String in your program with [Char] at compile time.
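As a small sketch of what that substitution means in practice, a function written against [Char] accepts a String, or any further synonym of it, without conversion. The names Name, shout, greeting, shouted and nameLength are made up for this example.

```haskell
-- Hypothetical synonym, introduced only for this example.
type Name = String

-- Written against [Char], yet it accepts String and Name values unchanged.
shout :: [Char] -> [Char]
shout s = s ++ "!"

greeting :: Name
greeting = "haskell"

-- A Name goes in, and the result can be annotated as a String.
shouted :: String
shouted = shout greeting

-- List functions apply directly, because a Name is a list of Char.
nameLength :: Int
nameLength = length greeting
```

Loading this in GHCi and evaluating shouted or nameLength should type-check with no wrapping or unwrapping anywhere.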
To be clear, a String literally is a list of Chars. It's just an alias, so you can use all the standard list functions on String values:
-- length :: [a] -> Int
ghci> length "haskell"
7
-- reverse :: [a] -> [a]
ghci> reverse "functional"
"lanoitcnuf"
The data keyword declares a new data type which, unlike a type synonym, is distinct from every other type. A data type has a number of constructors defining the possible cases of the type. For example, this is how Bool is defined in the standard library:
data Bool = False | True
A Bool value can be either True or False. Data types support pattern matching, allowing you to perform a runtime case analysis on a value of a data type.
yesno :: Bool -> String
yesno True = "yes"
yesno False = "no"
Types declared with data can have multiple constructors (as with Bool), can be parameterised by other types, can contain other types inside them, and can refer to themselves recursively. Here's a model of exceptions which demonstrates this; an Error a contains an error message of type a, and possibly the error which caused it.
data Error a = Error { value :: a, cause :: Maybe (Error a) }
type ErrorWithMessage = Error String
myError1, myError2 :: ErrorWithMessage
myError1 = Error "woops" Nothing
myError2 = Error "myError1 was thrown" (Just myError1)
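Because Error a refers to itself, functions over it are naturally recursive too. The following is only a sketch: the helpers messages and rootCause are hypothetical and not part of the original answer, and the Error declaration is repeated so the snippet loads on its own.

```haskell
data Error a = Error { value :: a, cause :: Maybe (Error a) }

-- Hypothetical helper: walk the chain of causes, outermost message first.
messages :: Error a -> [a]
messages (Error msg Nothing)      = [msg]
messages (Error msg (Just inner)) = msg : messages inner

-- Hypothetical helper: the record fields are ordinary functions,
-- so they can be used instead of pattern matching.
rootCause :: Error a -> a
rootCause e = maybe (value e) rootCause (cause e)

myError1, myError2 :: Error String
myError1 = Error "woops" Nothing
myError2 = Error "myError1 was thrown" (Just myError1)
```

Here messages myError2 should give the two messages in order, and rootCause myError2 should give "woops".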
It's important to realise that data declares a new type which is separate from every other type in the system. If String had been declared as a data type containing a list of Chars (rather than as a type synonym), you wouldn't be able to use list functions on it directly.
data String = MkString [Char]
myString = MkString ['h', 'e', 'l', 'l', 'o']
myReversedString = reverse myString -- type error
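To get at the list inside such a wrapper you have to pattern match on the constructor first. A minimal sketch, using the hypothetical names MyString, MkMyString, reverseMyString and toChars (renamed so the type does not clash with the Prelude's String):

```haskell
-- MyString is a hypothetical stand-in for the wrapped String above.
data MyString = MkMyString [Char]

-- Pattern match to reach the list, reverse it, then wrap it again.
reverseMyString :: MyString -> MyString
reverseMyString (MkMyString cs) = MkMyString (reverse cs)

-- Unwrap to hand the contents to ordinary list functions.
toChars :: MyString -> [Char]
toChars (MkMyString cs) = cs

myString :: MyString
myString = MkMyString "hello"
```

Calling reverse on myString is still a type error; reverseMyString myString is what type-checks, because it does the unwrapping explicitly.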
There's one more variety of type declaration: newtype. This works rather like a data declaration (it introduces a new data type separate from any other type, and can be pattern matched), except that you are restricted to a single constructor with a single field. In other words, a newtype is a data type which wraps up an existing type.
The important difference is the cost of a newtype: the compiler promises that a newtype is represented in the same way as the type it wraps. There's no runtime cost to packing or unpacking a newtype. This makes newtypes useful for making administrative (rather than structural) distinctions between values.
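As an illustration of such an administrative distinction, here is a sketch with two identifiers that are both plain Ints underneath. UserId, OrderId, userLabel, alice and someOrder are invented for this example.

```haskell
-- Two hypothetical identifier types; both wrap an Int.
newtype UserId  = UserId Int  deriving (Eq, Show)
newtype OrderId = OrderId Int deriving (Eq, Show)

-- Hypothetical function that only accepts a UserId.
userLabel :: UserId -> String
userLabel (UserId n) = "user-" ++ show n

alice :: UserId
alice = UserId 42

someOrder :: OrderId
someOrder = OrderId 42

-- userLabel someOrder
-- The line above would be rejected by the type checker: an OrderId is
-- not a UserId, even though both wrap the same number.
```

The two types cannot be mixed up by accident, yet the wrapper itself adds nothing to the runtime representation.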
newtypes interact well with type classes. For example, consider Monoid, the class of types with a way to combine elements (mappend, also spelled <>) and a special 'empty' element (mempty). Int can be made into a Monoid in many ways, including addition with 0 and multiplication with 1. How can we choose which one to use for a possible Monoid instance of Int? It's better not to express a preference, and to use newtypes to enable either usage with no runtime cost. Paraphrasing the standard library:
-- introduce a type Sum with a constructor Sum which wraps an Int, and an extractor getSum which gives you back the Int
newtype Sum = Sum { getSum :: Int }
-- since GHC 8.4, Semigroup is a superclass of Monoid and holds the combining operation (<>); mappend defaults to it
instance Semigroup Sum where
  Sum x <> Sum y = Sum (x + y)
instance Monoid Sum where
  mempty = Sum 0
newtype Product = Product { getProduct :: Int }
instance Semigroup Product where
  Product x <> Product y = Product (x * y)
instance Monoid Product where
  mempty = Product 1
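To see the two instances pick different behaviour for the same list of Ints, here is a usage sketch. It assumes GHC 8.4 or later, where the combining operation is defined in Semigroup as <>; the definitions are repeated so the snippet loads on its own, and total and factorial4 are hypothetical names.

```haskell
newtype Sum = Sum { getSum :: Int }
newtype Product = Product { getProduct :: Int }

instance Semigroup Sum where
  Sum x <> Sum y = Sum (x + y)
instance Monoid Sum where
  mempty = Sum 0

instance Semigroup Product where
  Product x <> Product y = Product (x * y)
instance Monoid Product where
  mempty = Product 1

-- mconcat folds a list with <> and mempty; the wrapper chooses the monoid.
total :: Int
total = getSum (mconcat (map Sum [1, 2, 3, 4]))

factorial4 :: Int
factorial4 = getProduct (mconcat (map Product [1, 2, 3, 4]))
```

The same mconcat call adds in one case and multiplies in the other, purely because of which newtype the values are wrapped in.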
type
works just like let
: it allows you to give a re-usable name to something, but that something will always work just as if you had inlined the definition. So
type ℝ = Double
f :: ℝ -> ℝ -> ℝ
f x y = let x2 = x^2
        in x2 + y
behaves exactly the same way as
f' :: Double -> Double -> Double
f' x y = x^2 + y
That is, anywhere in your code you can replace f with f' and vice versa, and nothing would change.
On the other hand, both data and newtype create an opaque abstraction. They are more like a class constructor in OO: even though some value is implemented simply in terms of a single number, it doesn't necessarily behave like such a number. For instance,
newtype LogScaledℝ = LogScaledℝ { getLogscaled :: Double }
instance Num LogScaledℝ where
  LogScaledℝ a + LogScaledℝ b = LogScaledℝ $ a*b
  LogScaledℝ a - LogScaledℝ b = LogScaledℝ $ a/b
  LogScaledℝ a * LogScaledℝ b = LogScaledℝ $ a**b
Here, although LogScaledℝ is data-wise still just a Double, it clearly behaves differently from Double.
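A quick way to see the difference is to add the same two numbers with and without the wrapper. This sketch uses the ASCII name LogScaled as a hypothetical stand-in for the type above, and fills the Num methods the original snippet leaves out with error stubs, which is an assumption made only so that the instance is complete.

```haskell
newtype LogScaled = LogScaled { getLogScaled :: Double }

instance Num LogScaled where
  LogScaled a + LogScaled b = LogScaled (a * b)
  LogScaled a - LogScaled b = LogScaled (a / b)
  LogScaled a * LogScaled b = LogScaled (a ** b)
  -- Not given in the original; stubbed out for this sketch.
  abs         = error "abs: not defined in this sketch"
  signum      = error "signum: not defined in this sketch"
  fromInteger = error "fromInteger: not defined in this sketch"

-- Ordinary addition on Double.
plainSum :: Double
plainSum = 2 + 8

-- The same operator on the wrapper multiplies the underlying values.
wrappedSum :: Double
wrappedSum = getLogScaled (LogScaled 2 + LogScaled 8)
```

plainSum is 10 while wrappedSum is 16: the stored representation is the same Double, but the type decides what + means.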