In Haskell, finalize foreign pointers immediately after they are no longer referenced
You are thinking about this in the wrong way for Haskell.
In C++, RAII is used to ensure that resources are released -- promptly. Since C++ lacks a finally construct, there is no other way to ensure that resources are released in the presence of exceptions. Also, since C++ lacks a garbage collector, reference counting and RAII are the order of the day.
In Haskell (and other garbage-collected languages), however, the situation is different. One does not rely on finalizers running promptly. In fact, one should not rely on finalizers running at all: if plenty of memory is available, they can be delayed for an arbitrary amount of time, and they may never run if the program terminates before the garbage collector (and hence the finalizer) gets a chance to run after the object became unreachable.
Instead, one uses explicit resource deallocation. This seems bad, but isn't. For reasons of memory safety, deallocation should put the object in a "zombie" state, so that any further attempt to use it throws an exception (since such a use is a bug).
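To make the "zombie" idea concrete, here is a minimal sketch using only base. The names Buffer, newBuffer, freeBuffer, withLive and UseAfterFree are hypothetical and chosen for illustration, and the sketch assumes single-threaded use (a concurrent freeBuffer could still race with a running withLive).

```haskell
import Control.Exception (Exception, throwIO, try)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Hypothetical exception raised when a released buffer is used.
data UseAfterFree = UseAfterFree deriving Show

instance Exception UseAfterFree

-- A buffer is either live (Just ptr) or a zombie (Nothing).
newtype Buffer = Buffer (IORef (Maybe (Ptr Word8)))

newBuffer :: Int -> IO Buffer
newBuffer size = do
  p <- mallocBytes size
  Buffer <$> newIORef (Just p)

-- Explicit, idempotent deallocation: the first call frees the memory,
-- later calls find Nothing and do nothing.
freeBuffer :: Buffer -> IO ()
freeBuffer (Buffer ref) = do
  old <- atomicModifyIORef' ref (\m -> (Nothing, m))
  mapM_ free old

-- Every access goes through this check, so a zombie is never dereferenced.
withLive :: Buffer -> (Ptr Word8 -> IO a) -> IO a
withLive (Buffer ref) act = readIORef ref >>= maybe (throwIO UseAfterFree) act

main :: IO ()
main = do
  buf <- newBuffer 16
  withLive buf $ \p -> pokeByteOff p 0 (42 :: Word8)
  v <- withLive buf $ \p -> peekByteOff p 0 :: IO Word8
  print v
  freeBuffer buf
  r <- try (withLive buf $ \p -> peekByteOff p 0 :: IO Word8)
  case r of
    Left UseAfterFree -> putStrLn "use after free detected"
    Right _ -> putStrLn "unexpected: zombie buffer was readable"
```

The point of the design is that a use-after-free becomes a Haskell exception rather than memory corruption, which is what makes explicit deallocation tolerable.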
Alternatively, if the resources are such that they are automatically deallocated on process exit, one can rely on finalizers -- but note that they may not be called promptly (as you mentioned), so an explicit performGC may be needed if the resource is exhausted. I suspect that not knowing, at least conservatively, when the life of a truly scarce resource is over is a code smell even in C++: it means that there is no upper bound on the amount of the resource consumed.
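If a finalizer is kept as a safety net, base also lets you trigger it yourself instead of waiting for a collection: finalizeForeignPtr from Foreign.ForeignPtr runs the finalizers attached to a ForeignPtr immediately. The following is a small sketch of that; newBuf and the freed counter are hypothetical helpers added only to make the effect observable.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Data.Word (Word8)
import Foreign.Concurrent (newForeignPtr)
import Foreign.ForeignPtr (ForeignPtr, finalizeForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (free, mallocBytes)

-- Allocate with malloc and attach a Haskell finalizer that frees the
-- memory and bumps a counter so we can observe when it ran.
newBuf :: IORef Int -> Int -> IO (ForeignPtr Word8)
newBuf freed size = do
  p <- mallocBytes size
  newForeignPtr p (free p >> modifyIORef' freed (+ 1))

main :: IO ()
main = do
  freed <- newIORef 0
  fp <- newBuf freed 1024
  withForeignPtr fp $ \_ -> putStrLn "using buffer"
  -- Run the finalizer now rather than at some later garbage collection.
  finalizeForeignPtr fp
  n <- readIORef freed
  putStrLn ("finalizers run so far: " ++ show n)
```

After finalizeForeignPtr the pointer is dangling, so nothing may touch fp again; that is the same "zombie" obligation as with any explicit deallocation.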
If we want to be independent of GHC's garbage collection, we need to introduce some kind of determinism and therefore explicit deallocation. Allocation is usually something of type IO a, and the corresponding deallocation of type a -> IO () (just as in your example).
Now, what if we had the following functions?
autoAllocate :: IO a -> (a -> IO ()) -> Alloc a
runAlloc :: Alloc a -> IO a
autoAllocate should take both an allocation and a deallocation and give you the result of the allocation in the new (hypothetical) Alloc monad, and runAlloc runs the actions and then all registered deallocations. Your example wouldn't change that much, except for the end:
allocateFPtr size = autoAllocate (newFPtr size) freeFPtr
main :: IO ()
main = forM_ [1 .. 5] $ runAlloc . const work
  where
    work = do
      x <- allocateFPtr 1024
      return ()
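For illustration only, one way such an Alloc monad could be built on top of IO is sketched below. This is an assumption about how autoAllocate and runAlloc might work, not how any library implements them: Alloc carries a mutable list of pending cleanups, autoAllocate pushes onto it, and runAlloc runs the list when the block ends. The sketch ignores the case where one cleanup itself throws (later cleanups would then be skipped).

```haskell
import Control.Exception (finally, mask_)
import Control.Monad (forM_)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical Alloc: an IO action with access to a list of pending cleanups.
newtype Alloc a = Alloc { unAlloc :: IORef [IO ()] -> IO a }

instance Functor Alloc where
  fmap f (Alloc g) = Alloc (fmap f . g)

instance Applicative Alloc where
  pure x = Alloc (\_ -> pure x)
  Alloc f <*> Alloc g = Alloc (\r -> f r <*> g r)

instance Monad Alloc where
  Alloc g >>= k = Alloc (\r -> g r >>= \x -> unAlloc (k x) r)

-- Acquire a resource and register its release action. mask_ keeps an
-- asynchronous exception from arriving between the two steps.
autoAllocate :: IO a -> (a -> IO ()) -> Alloc a
autoAllocate acquire release = Alloc $ \ref -> mask_ $ do
  x <- acquire
  modifyIORef' ref (release x :)
  return x

-- Run the block, then run all registered cleanups (most recent first),
-- whether the block returned normally or threw.
runAlloc :: Alloc a -> IO a
runAlloc (Alloc g) = do
  ref <- newIORef []
  g ref `finally` (readIORef ref >>= sequence_)

main :: IO ()
main = forM_ [1 .. 3 :: Int] $ \i -> runAlloc $ do
  _ <- autoAllocate (putStrLn ("alloc " ++ show i) >> return i)
                    (\n -> putStrLn ("free " ++ show n))
  return ()
```

Each iteration allocates and frees before the next one starts, which is the deterministic behaviour the question asks for.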
Now, autoAllocate, runAlloc and Alloc already exist in resourcet as allocate, runResourceT and ResourceT, and the actual code would look like this:
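import Control.Monad (forM_)
import Control.Monad.Trans.Resource (allocate, runResourceT)

-- newFPtr and freeFPtr are the allocation and deallocation functions from the question.
allocateFPtr size = fmap snd $ allocate (newFPtr size) freeFPtr
main :: IO ()
main = forM_ [1 .. 5] $ runResourceT . const work
  where
    work = do
      x <- allocateFPtr 1024
      return ()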
Result:
falloc(1024, 0x1e04014, 0); 1 Result: 0 ffree(0x6abc60)
falloc(1024, 0x1e04020, 0); 2 Result: 0 ffree(0x6abc60)
falloc(1024, 0x1e0402c, 0); 3 Result: 0 ffree(0x6abc60)
falloc(1024, 0x1e04038, 0); 4 Result: 0 ffree(0x6abc60)
falloc(1024, 0x1e04044, 0); 5 Result: 0 ffree(0x6abc60)
But you said that some of your pointers should actually live longer. That's also not a problem, since allocate actually returns m (ReleaseKey, a), and the ReleaseKey can be used to either release the memory earlier than runResourceT would (using release) or remove the automatic release mechanism (using unprotect, which returns the deallocation action).
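A small sketch of both uses of a ReleaseKey follows. It assumes a reasonably recent resourcet, where unprotect returns the release action wrapped in a Maybe; the putStrLn "resources" are stand-ins for real allocations and are not from the question.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Resource (allocate, release, runResourceT, unprotect)

main :: IO ()
main = do
  keepAlive <- runResourceT $ do
    -- Two stand-in resources; real code would allocate pointers here.
    (shortKey, _) <- allocate (putStrLn "acquire short") (\_ -> putStrLn "release short")
    (longKey, _) <- allocate (putStrLn "acquire long") (\_ -> putStrLn "release long")
    -- Free one resource before the block ends.
    release shortKey
    liftIO (putStrLn "still inside runResourceT")
    -- Detach the other one: runResourceT will no longer free it, and we
    -- get its release action back (Nothing if it was already released).
    unprotect longKey
  putStrLn "left runResourceT"
  -- The caller now owns the long-lived resource and must free it itself.
  sequence_ keepAlive
```

Once a resource is unprotected, the deterministic guarantee is gone for it, so the returned action should be handed to whatever scope takes over ownership.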
So, all in all, I guess your scenario could be handled well with ResourceT. After all, its synopsis is "Deterministic allocation and freeing of scarce resources".
In the very limited case where you are only concerned with freeing memory that can live on the Haskell heap, there is a special edge case available to you.
mallocForeignPtr allocates the memory as a pinned mutable byte array on the Haskell heap, so when the ForeignPtr (and the mutable byte array) get GC'd, the memory gets automatically reclaimed with no finalizer invocation.
This is considerably cheaper than adding a manual hook to call some free corresponding to a system malloc, but only in the limited circumstances where you can live with the limitations.
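As a minimal sketch of that edge case, the following uses mallocForeignPtrBytes (the byte-count sibling of mallocForeignPtr in Foreign.ForeignPtr) to get a scratch buffer that needs no free call and no finalizer. The buffer size and the values written are arbitrary.

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Write two bytes into a GC-managed pinned buffer and read them back.
sumFirstTwo :: IO Word8
sumFirstTwo = do
  fp <- mallocForeignPtrBytes 1024 :: IO (ForeignPtr Word8)
  withForeignPtr fp $ \p -> do
    pokeByteOff p 0 (1 :: Word8)
    pokeByteOff p 1 (2 :: Word8)
    a <- peekByteOff p 0 :: IO Word8
    b <- peekByteOff p 1 :: IO Word8
    return (a + b)

main :: IO ()
main = sumFirstTwo >>= print
```

The pointer is only valid inside withForeignPtr; the memory itself is reclaimed by the garbage collector like any other heap object once fp is unreachable.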
However, if you are relying on freeing another resource (e.g. through a file handle object, or memory or resource IDs out on the GPU or something else) you're still hosed.
In general, don't rely on the GC to free up valuable external resources for you, except as a sort of "apology" pass for leaked things during, say, exceptions or the like. Your usual control flow should still free up the external resources you use.
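For the "usual control flow" part, bracket from Control.Exception is the standard tool in base: it runs the release action whether the body returns or throws. A minimal sketch, where withResource and the string "resource" are hypothetical stand-ins for a real external handle:

```haskell
import Control.Exception (ErrorCall (..), bracket, throwIO, try)

-- Acquire a stand-in resource, run the body, and always release it.
withResource :: String -> (String -> IO a) -> IO a
withResource name = bracket acquire release
  where
    acquire = putStrLn ("acquire " ++ name) >> return name
    release r = putStrLn ("release " ++ r)

main :: IO ()
main = do
  -- Normal path: release runs after the body finishes.
  withResource "A" $ \r -> putStrLn ("using " ++ r)
  -- Exceptional path: release still runs before the exception propagates.
  r <- try (withResource "B" $ \_ -> throwIO (ErrorCall "boom")) :: IO (Either ErrorCall ())
  case r of
    Left (ErrorCall msg) -> putStrLn ("caught: " ++ msg)
    Right () -> putStrLn "no exception"
```

A finalizer can still be attached on top of this as the "apology" pass, but with bracket the common paths never depend on it.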