How does compiler figure out fixed point of a functor and how cata work at leaf level?

The only way the compiler can know about the relationship between ListF e and [e] is if you tell it. You haven't provided enough context to answer exactly how, but I can infer that unFix has type

unFix :: [e] -> ListF e [e]

that is, it unrolls the top layer. unFix may be more general, for example in recursion-schemes a type family is used to associate data types with their underlying functors. But this is where the two types are connected.

As for your second question, you need to be clearer about when you have a list and when you have a ListF. They are completely different.

fmap (cata lenAlg) Nil

Here the functor you are mapping over is ListF e for whatever e you like. That is, it's this fmap:

fmap :: (a -> b) -> ListF e a -> ListF e b

If you implement instance Functor (ListF e) yourself (always a good exercise) and get it to compile, you will find that Nil must map to Nil, so the cata lenAlg didn't matter at all.

Let's look at the type of Cons 1 (Cons 2 Nil):

Nil                 :: ListF e a
Cons 2 Nil          :: ListF Int (ListF e a)
Cons 1 (Cons 2 Nil) :: ListF Int (ListF Int (ListF e a))

Something is awry here. The trouble is that we are forgetting to do the opposite of unFix to roll the ListF back up into a regular list. I will call this function

roll :: ListF e [e] -> [e]

Now we have

Nil                                      :: ListF e a
roll Nil                                 :: [e]
Cons 2 (roll Nil)                        :: ListF Int [Int]
roll (Cons 2 (roll Nil))                 :: [Int]
Cons 1 (roll (Cons 2 (roll Nil)))        :: ListF Int [Int]
roll (Cons 1 (roll (Cons 2 (roll Nil)))) :: [Int]

The types are staying nice and small now, it's a good sign. For the execution trace, let's just note that unFix . roll = id, however they work. It's important here to notice that

fmap f (Cons a b) = Cons a (f b)
fmap f Nil        = Nil

That is, fmap on ListF just applies the function on the "recursive part" of the type.

I'm going to switch the Cons case to lenAlg (ConsF e n) = 1 + n to make the trace a tiny bit more readable. Still kind of a mess, good luck.

cata lenAlg (roll (Cons 1 (roll (Cons 2 (roll Nil)))))
(lenAlg . fmap (cata lenAlg) . unFix) (roll (Cons 1 (roll (Cons 2 (roll Nil)))))
lenAlg (fmap (cata lenAlg) (unFix (roll (Cons 1 (roll (Cons 2 (roll Nil)))))))
lenAlg (fmap (cata lenAlg) (Cons 1 (roll (Cons 2 (roll Nil)))))
lenAlg (Cons 1 (cata lenAlg (roll (Cons 2 (roll Nil)))))
1 + cata lenAlg (roll (Cons 2 (roll Nil)))
1 + (lenAlg . fmap (cata lenAlg) . unFix) (roll (Cons 2 (roll Nil)))
1 + lenAlg (fmap (cata lenAlg) (unFix (roll (Cons 2 (roll Nil)))))
1 + lenAlg (fmap (cata lenAlg) (Cons 2 (roll Nil)))
1 + lenAlg (Cons 2 (cata lenAlg (roll Nil)))
1 + 1 + cata lenAlg (roll Nil)
1 + 1 + (lenAlg . fmap (cata lenAlg) . unFix) (roll Nil)
1 + 1 + lenAlg (fmap (cata lenAlg) (unFix (roll Nil)))
1 + 1 + lenAlg (fmap (cata lenAlg Nil))
1 + 1 + lenAlg Nil
1 + 1 + 0
2

See also my other answer about catamorphisms.

No, unFix exposes the structure, and then fmap f applies a function f on it. If it's a Leaf, fmap f will do its thing that it is defined to do for leaves - i.e., nothing. It's fmap that "knows" i.e. is defined to handle each case, as usual in Haskell's case-based definitions.

Fix (ListF e) = ListF e (Fix (ListF e)) 
              = NilF | ConsF e (Fix (ListF e))

so renaming Fix (ListF e) to Listof e we get

Listof     e  = NilF | ConsF e (Listof e)

Listof e is a recursive type. ListF e r is a non-recursive type. Fix makes a recursive type out of it. ListF e being a Functor means fmap works on the r "meat", ListF e being the "hard outer shell" of this "fruit".

data ListF e a = NilF | ConsF e a

newtype Fix f = X { unFix    :: (f       (Fix f)) }

-- unFix ::    Fix f          -> f       (Fix f        )
-- unFix (_ :: Fix (ListF e)) :: ListF e (Fix (ListF e))

lenAlg :: ListF e Int -> Int
lenAlg (ConsF e n) = n + 1
lenAlg NilF        = 0

instance Functor (ListF e) where
    -- fmap :: (a -> b) -> (ListF e a) -> (ListF e b)
    fmap f NilF        = NilF
    fmap f (ConsF e r) = ConsF e (f r)

cata :: (ListF e Int -> Int) -> Fix (ListF e) -> Int
cata lenAlg x = (lenAlg . fmap (cata lenAlg) . unFix) x
              =  lenAlg ( fmap (cata lenAlg) ( unFix  x ))
--------
        x       ::       Fix (ListF e)
        unFix x ::            ListF e (Fix (ListF e))
        fmap (cata lenAlg) :: ListF e (Fix (ListF e)) -> ListF e (Int)
              cata lenAlg  ::          Fix (ListF e)  ->          Int

        fmap (cata lenAlg)   (unFix  x              ) :: ListF e  Int 
 lenAlg (_                                            :: ListF e  Int ) :: Int

See? All the wires go to their proper places. fmap transforms the inner part r recursively, and then lenAlg algebra applies one last transformation, one last step in collapsing the structure where all its inner parts were already collapsed into a recursive result. Thus producing the final result.

As a concrete example, for a list of two numbers, 1 and 2, we have

-- newtype Fix f  = X { unFix :: (f      (Fix f         )) }
--             _\_______       ____\____      _\________
onetwo :: Fix (ListF Int) -- ~ ListF Int (Fix (ListF Int))
onetwo = X (ConsF 1
                  (X (ConsF 2
                            (X NilF))))

cata lenAlg :: Fix (ListF e) -> Int
cata lenAlg    onetwo
  = {- definition of cata   -}
    lenAlg . fmap (cata lenAlg) . unFix  $ onetwo
  = {- definition of onetwo -}
    lenAlg . fmap (cata lenAlg) . unFix  $ 
                           X (ConsF 1 (X (ConsF 2 (X NilF))))
  = {- definition of unFix  -}
    lenAlg . fmap (cata lenAlg) $ 
                              ConsF 1 (X (ConsF 2 (X NilF)))
  = {- definition of fmap   -}
    lenAlg $ ConsF 1 (cata lenAlg     (X (ConsF 2 (X NilF))))
  = {- definition of lenAlg -}
    (+ 1)  $          cata lenAlg     (X (ConsF 2 (X NilF)))
  = {- definition of cata   -}
    (+ 1)  $ lenAlg . fmap (cata lenAlg) . unFix $ 
                                       X (ConsF 2 (X NilF))
  = {- definition of unFix  -}
    (+ 1) $ lenAlg . fmap (cata lenAlg) $ ConsF 2 (X NilF)
  = {- definition of fmap   -}
    (+ 1) $ lenAlg $ ConsF 2 $ cata lenAlg        (X NilF)
  = {- definition of lenAlg -}
    (+ 1) $ (+ 1)  $           cata lenAlg        (X NilF)
  = {- definition of cata   -}
    (+ 1) $ (+ 1)  $ lenAlg . fmap (cata lenAlg) . unFix $ 
                                                  (X NilF)
  = {- definition of unFix  -}
    (+ 1) $ (+ 1)  $ lenAlg . fmap (cata lenAlg) $   NilF
  = {- definition of fmap   -}
    (+ 1) $ (+ 1)  $ lenAlg $                        NilF
  = {- definition of lenAlg -}
    (+ 1) $ (+ 1)  $ 0
  = (+ 1) $ 1
  = 2

Also,

squaringAlg :: ListF Int [Int] -> [Int]
squaringAlg (ConsF e r) = e*e : r
squaringAlg NilF        = []

filteringAlg :: (e -> Bool) -> ListF e [e] -> [e]
filteringAlg p (ConsF e r) | p e       = e : r
                           | otherwise = r
filteringAlg _ NilF                    = []

etc.

"List is the fixed point of ListF" is a fast-and-loose figure of speech. While fast and loose reasoning is morally correct, you always need to keep in mind the boring correct thing. Which is as follows:

any least fixpoint of ListF e is isomorphic to [e].

Now "the compiler" (that's a fast-and-loose word for "the Haskell language" by the way) is not aware of isomorphisms of this kind. You can write isomorphic types all day

data []    x = []   | (:)   x ([]    x)    -- the imaginary built-in definition of []
data ListA x = NilA | ConsA x (ListA x)
data ListB x = NilB | ConsB x (ListB x)
data ListC x = NilC | ConsC x (ListC x)

and the compiler will never treat them as "the same type". Nor will it know that the fixpoint of ListF e is the same as [e], or indeed what a fixpoint is. If you want to use these isomorphisms, you need to teach the compiler about them by writing some code (e.g. by defining instances of Data.Types.Isomorphic), and then specifying the isomorphism explicitly each time you want to use it!

Having this is mind, let's look at cata you have defined. The first thing that mets the eye is that the attempt at defining the type signature is a syntax error. Let's remove it and just define the function in GHCi (I changed the name of the parameter to f to avoid confusion, and fixed a few typos in the definition of ListF)

Main> data ListF e a = NilF | ConsF e a
Main> let lenAlg :: ListF e Int -> Int
Main|     lenAlg (ConsF e n) = n + 1
Main|     lenAlg NilF = 0
Main|
Main>
Main> :m + Data.Fix
Main Data.Fix> cata f = f . fmap (cata f) . unFix
Main Data.Fix> :t cata
cata :: Functor f => (f b -> b) -> Fix f -> b
Main Data.Fix> :t cata lenAlg
cata lenAlg :: Functor (ListF e) => Fix (ListF e) -> Int

This says that in order to use lenAlg, you need to:

define an instance of Functor for ListF e
use Fix (ListF e) (which is a fixpoint of ListF) explicitly. Wishing "the fixpoint of ListF" into existence just doesn't work. There's no magic.

So let's do this:

Main Data.Fix> instance Functor (ListF e) where
Main Data.Fix|     fmap f NilF = NilF
Main Data.Fix|     fmap f (ConsF e a) = ConsF e (f a)
Main Data.Fix>
Main Data.Fix> :t cata lenAlg
cata lenAlg :: Fix (ListF e) -> Int

Great, now we can calculate the length of a ListF-based Fix-wrapped list. But let's define a few helper definitions first.

Main Data.Fix> type List e = Fix (ListF e)
Main Data.Fix> nil = Fix (NilF)
Main Data.Fix> let infixr 7 ~~                   -- using an infix operator for cons
Main Data.Fix|     h ~~ t = Fix (ConsF h t)
Main Data.Fix|
Main Data.Fix>
Main Data.Fix> myList = (1::Int) ~~ 2 ~~ 3 ~~ 4 ~~ nil
Main Data.Fix> :t myList
myList :: Fix (ListF Int)

Here's our "list" (a member of a type that is isomorphic to [Int] to be precise). Let's cata lenAlg it:

Main Data.Fix> cata lenAlg myList
4

Success!

Bottom line: the compiler knows nothing,you need to explain it everything.

How does compiler figure out fixed point of a functor and how cata work at leaf level?

Tags:

Haskell

Catamorphism

Recursion Schemes

Fixpoint Combinators

Category Theory

Related

Recent Posts