How to use "managed library expressions" to free the memory?

The purpose of managed library expressions is to provide "garbage collection" (i.e. automatic cleanup) for data structures created in C through LibraryLink.

The documentation is here:

  • Managed Library Expressions

It comes with a full demo with source code which you should read.


What are managed library expressions and what are they good for?

Why they are useful can be more easily seen through an example. Consider how the TriangleLink package works. It provides a low-level interface to the triangle library. This library manipulates meshes, which are stored in a data structure implemented purely in C, thus not directly accessible from Mathematica. To start, we must create a mesh:

<< TriangleLink`

tri = TriangleCreate[]
(* TriangleExpression[1] *)

The C side data structure is represented as an integer wrapped with the head TriangleExpression. The integer is a unique reference to a data structure stored on the C side. The C code is in fact keeping track of a mapping between integers and allocated meshes, but for as long as we use the library only from Mathematica, we do not need to be concerned with this.

We can now manipulate this expression. There are a number of functions that the package provides for this. We can for example add some points to it:

SeedRandom[10];
pts = RandomReal[1, {50, 2}];

TriangleSetPoints[tri, pts]
(* 0 *)

We can compute a Delaunay triangulation, which returns a second mesh data structure. Let's store this in tri2:

tri2 = TriangleTriangulate[tri, ""]
(* TriangleExpression[2] *)

And finally we can retrieve the triangles from tri2 and visualize them:

Graphics[
 GraphicsComplex[
  TriangleGetPoints[tri2], 
  {FaceForm[None], EdgeForm[Black], Polygon@TriangleGetElements[tri2]}]
 ]


Now that we have the result, we do not need these C-side data structures anymore. We should free them so they do not take up any more memory.

TriangleDelete[tri]
(* 0 *)

TriangleDelete[tri2]
(* 0 *)

What if we want to wrap this up into an easy-to-use, high level Delaunay triangulation function? Normally Mathematica does not require users to be concerned with memory management (TriangleCreate and TriangleDelete), so all this should be hidden inside the implementation. We could write something like this, which just encapsulates all of the steps from above:

triangulate[pts_] :=
 Module[{tri, tri2, result},
  tri = TriangleCreate[];
  TriangleSetPoints[tri, pts];
  tri2 = TriangleTriangulate[tri, ""];
  result = Graphics[
    GraphicsComplex[
     TriangleGetPoints[tri2], {FaceForm[None], EdgeForm[Black], 
      Polygon@TriangleGetElements[tri2]}]
    ];
  TriangleDelete[tri];
  TriangleDelete[tri2];
  result
 ]

But this is far from perfect. What if one of the steps fails? Notice those 0 values returned by functions such as TriangleSetPoints? They are error codes, which should be checked. We should also check that TriangleTriangulate doesn't return a LibraryFunctionError. And we must update the TriangleDelete calls accordingly. For example:

triangulate[pts_] :=
 Module[{tri, tri2, result, err},
  tri = TriangleCreate[];
  err = TriangleSetPoints[tri, pts];
  If[err != 0,
    TriangleDelete[tri];
    Return[$Failed]; 
  ];
  tri2 = TriangleTriangulate[tri, ""];
  (* we should check for errors here too *)
  result = Graphics[
    GraphicsComplex[
     TriangleGetPoints[tri2], {FaceForm[None], EdgeForm[Black], 
      Polygon@TriangleGetElements[tri2]}]
    ];
  TriangleDelete[tri];
  TriangleDelete[tri2];
  result
 ]

Also consider what happens if the user aborts the function in the middle. TriangleDelete won't be called so we have a memory leak. Dealing with this requires even more complex code, or even the use of undocumented functions such as Internal`WithLocalSettings.

In the end, manually managing memory and making sure that TriangleDelete is called at the appropriate times becomes difficult. It's easy to make a mistake. This is the problem managed library expressions are designed to solve. They make it possible to omit TriangleDelete and have a TriangleExpression automatically freed as soon as it is no longer referenced. When our Module exits, tri and tri2 automatically get Cleared, which automatically triggers the cleanup code that we had to invoke manually before (using TriangleDelete).
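
As a loose analogy only (this is not a description of how the kernel implements it, and none of the names below come from TriangleLink or LibraryLink), the same idea can be sketched in C++ with a reference-counted handle that runs the cleanup when the last reference goes away. `Mesh`, `create_mesh` and `live_meshes` are hypothetical names invented for this illustration.

```cpp
#include <memory>

static int live_meshes = 0;  // number of meshes not yet cleaned up

struct Mesh {
    Mesh() { ++live_meshes; }
    ~Mesh() { --live_meshes; }
};

// Hypothetical analogue of TriangleCreate: the mesh is freed automatically
// once no copy of the returned handle remains.
std::shared_ptr<Mesh> create_mesh() {
    return std::shared_ptr<Mesh>(new Mesh());
}

// Analogue of the Module in triangulate: no explicit delete on any path.
bool triangulate(bool fail_early) {
    std::shared_ptr<Mesh> tri = create_mesh();
    if (fail_early) return false;  // tri is released here
    std::shared_ptr<Mesh> tri2 = create_mesh();
    return true;                   // tri and tri2 are released here
}
```

Whichever way `triangulate` exits, `live_meshes` is back to zero afterwards, which is the property the manual TriangleDelete calls were trying to guarantee.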

TriangleLink does in fact use managed library expressions since version 10, so TriangleDelete is not necessary. I showed it here to demonstrate the problem.


Simplified example problem

To create an example which could benefit from managed library expressions, let us modify your example a bit to require explicit memory management (malloc and free). Of course we don't really need this when dealing only with two variables of type int, but consider this a model of a more complicated case where malloc is unavoidable.

Let us move the code into a separate file, demo.c.

#include "WolframLibrary.h"

#include <stdlib.h>

int *result1, *result2;

DLLEXPORT int doCompute(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    mint I0;
    I0 = MArgument_getInteger(Args[0]);
    result1 = malloc(sizeof(int));
    result2 = malloc(sizeof(int));
    *result1 = I0*I0;
    *result2 = (*result1) * (*result1);
    return LIBRARY_NO_ERROR;
}

DLLEXPORT int getSquare(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    MArgument_setInteger(Res, *result1);
    return LIBRARY_NO_ERROR;
}

DLLEXPORT int getPow4(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    MArgument_setInteger(Res, *result2);
    return LIBRARY_NO_ERROR;
}

DLLEXPORT int freeResult(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    free(result1);
    free(result2);
    return LIBRARY_NO_ERROR;
}

We can compile and load the library like this:

Needs["CCompilerDriver`"]

lib = CreateLibrary[{"demo.c"}, "demo"];

doCompute = LibraryFunctionLoad[lib, "doCompute", {Integer}, "Void"];
getSquare = LibraryFunctionLoad[lib, "getSquare", {}, Integer];
getPow4 = LibraryFunctionLoad[lib, "getPow4", {}, Integer];
freeResult = LibraryFunctionLoad[lib, "freeResult", {}, "Void"];

And use it like this:

doCompute[20]

getSquare[]
(* 400 *)

getPow4[]
(* 160000 *)

freeResult[]

freeResult will free the memory allocated to store the result of the computation. We must use this very carefully. It must be called precisely once for every doCompute. Not calling it wastes memory (memory leak). Calling it twice causes a crash.

Note: Of course it would be better practice to modify freeResult like this:

DLLEXPORT int freeResult(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    free(result1); result1 = NULL;
    free(result2); result2 = NULL;
    return LIBRARY_NO_ERROR;
}

Setting the pointers to NULL avoids the crash if freeResult is called twice because free is safe to call on a null pointer. But in the general case (when freeing arbitrary resources), it is best to ensure that a double free is avoided.
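
The null-after-free idiom can be tried outside LibraryLink. The following is a minimal standalone sketch, compiled as C++: `compute` and `release` are hypothetical stand-ins for doCompute and freeResult without the LibraryLink signatures. Having `compute` call `release` first is my own addition, so that two computations in a row do not leak the earlier allocation.

```cpp
#include <cstdlib>

// Stand-ins for the globals in demo.c.
static int *result1 = nullptr;
static int *result2 = nullptr;

// Stand-in for freeResult: safe to call any number of times, because the
// pointers are reset and free() on a null pointer does nothing.
void release() {
    std::free(result1); result1 = nullptr;
    std::free(result2); result2 = nullptr;
}

// Stand-in for doCompute. Returns false if an allocation fails.
bool compute(int x) {
    release();  // assumption: drop any previous result before allocating
    result1 = static_cast<int *>(std::malloc(sizeof(int)));
    result2 = static_cast<int *>(std::malloc(sizeof(int)));
    if (!result1 || !result2) { release(); return false; }
    *result1 = x * x;
    *result2 = (*result1) * (*result1);
    return true;
}
```

After `compute(20)`, `*result1` is 400 and `*result2` is 160000; calling `release()` twice in a row afterwards is harmless.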


Example using managed library expressions

To implement managed library expressions for our library, we must:

  1. Define a data structure, of which we can create multiple instances.
  2. Write code to initialize and free the data structure
  3. Create a dictionary that maps integer IDs to instances of this data structure

This is much easier to do in C++, as the C++ standard library has a "map" ("dictionary") data structure. If you want to do it in pure C, you must implement such a data structure on your own, which is really beyond the scope of this question.

Thus the source code will start with

// demo.cpp

#include "WolframLibrary.h"

#include <cstdlib>
#include <map>

// our data structure
struct PowerExpr {
    int pow2;
    int pow4;
};

std::map<mint, PowerExpr*> dict;

DLLEXPORT void manage_PowerExpr(WolframLibraryData libData, mbool mode, mint id)
{
    if (mode == 0) { // create
        dict[id] = new PowerExpr();
    } else { // destroy
      if (dict.find(id) == dict.end()) { // check that id exists in the map
        libData->Message("noinst");
        return;
      }
      delete dict[id];
      dict.erase(id);
    }
}

The function manage_PowerExpr can have any name you wish, but its signature must follow this pattern, as described in the documentation. When it is called with mode=0, it must create a new data structure and associate it with the given ID. When called with mode=1, it must destroy the data structure with the given ID.
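
The create/destroy contract can be exercised without the kernel. The following is a standalone sketch, not the LibraryLink API itself: `mint_t` is a plain typedef standing in for `mint`, the `libData` argument is dropped, and the hypothetical `manage` function returns a bool instead of issuing a message. Using `std::unique_ptr` is my own choice here, so that erasing a map entry also frees the object.

```cpp
#include <map>
#include <memory>

typedef long long mint_t;  // assumption: stand-in for LibraryLink's mint

struct PowerExpr {
    int pow2 = 0;
    int pow4 = 0;
};

// Owning map: erasing an entry destroys the PowerExpr it holds.
static std::map<mint_t, std::unique_ptr<PowerExpr>> instances;

// Hypothetical analogue of the manager function.
// mode == 0: create an instance for id; otherwise: destroy it.
// Returns false when asked to destroy an id that does not exist.
bool manage(int mode, mint_t id) {
    if (mode == 0) {
        instances[id].reset(new PowerExpr());
        return true;
    }
    return instances.erase(id) == 1;
}
```

Calling `manage(0, 1)` and then `manage(1, 1)` leaves the map empty, and a second `manage(1, 1)` returns false instead of deleting anything twice.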

Then we need code to initialize the library. The initialization code will register the function that manages the data structure and associate it with a name (here "PowerExpr") which we can use to refer to it in Mathematica.

/* Initialize Library */
EXTERN_C DLLEXPORT int WolframLibrary_initialize( WolframLibraryData libData) {
    return (*libData->registerLibraryExpressionManager)("PowerExpr", &manage_PowerExpr);
}

/* Uninitialize Library */
EXTERN_C DLLEXPORT void WolframLibrary_uninitialize( WolframLibraryData libData) {
    int err = (*libData->unregisterLibraryExpressionManager)("PowerExpr");
}

EXTERN_C DLLEXPORT mint WolframLibrary_getVersion() {
    return WolframLibraryVersion;
}

Then each of our functions that operates on the "PowerExpr" data structure must take an ID as an argument, so it knows which instance to retrieve from the dictionary.

We check that the given ID exists. If it doesn't, we issue a LibraryFunction::noinst message (this is up to you).

EXTERN_C DLLEXPORT int doCompute(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    mint id, x;
    id = MArgument_getInteger(Args[0]);
    x  = MArgument_getInteger(Args[1]);

    if (dict.find(id) == dict.end()) { 
        libData->Message("noinst"); 
        return LIBRARY_FUNCTION_ERROR; 
    }

    int sqr = x*x; 
    dict[id]->pow2 = sqr;
    dict[id]->pow4 = sqr*sqr;

    return LIBRARY_NO_ERROR;
}

EXTERN_C DLLEXPORT int getSquare(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    mint id;

    id = MArgument_getInteger(Args[0]);
    if (dict.find(id) == dict.end()) { 
        libData->Message("noinst"); 
        return LIBRARY_FUNCTION_ERROR; 
    }

    MArgument_setInteger(Res, dict[id]->pow2);

    return LIBRARY_NO_ERROR;
}

EXTERN_C DLLEXPORT int getPow4(WolframLibraryData libData, mint Argc, MArgument *Args, MArgument Res) {
    mint id;

    id = MArgument_getInteger(Args[0]);
    if (dict.find(id) == dict.end()) { 
       libData->Message("noinst"); 
       return LIBRARY_FUNCTION_ERROR; 
    }

    MArgument_setInteger(Res, dict[id]->pow4);

    return LIBRARY_NO_ERROR;
}

Notice that since we are using C++, the extension of the source file must be .cpp (or other standard C++ extension) and that each library function must be preceded by EXTERN_C (a macro defined in WolframLibrary.h that expands to extern "C").
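
The ID check repeated in doCompute, getSquare and getPow4 could be factored into a helper. This is only a sketch under simplifying assumptions: `lookup` and `get_square` are hypothetical names, `mint_t` stands in for `mint`, and a bool return replaces the "noinst" message and LIBRARY_FUNCTION_ERROR so that the code compiles without WolframLibrary.h.

```cpp
#include <map>

typedef long long mint_t;  // assumption: stand-in for LibraryLink's mint

struct PowerExpr {
    int pow2;
    int pow4;
};

static std::map<mint_t, PowerExpr *> dict;

// Hypothetical helper: returns the instance for id, or nullptr if none.
// Uses a single find() instead of find() followed by operator[].
PowerExpr *lookup(mint_t id) {
    std::map<mint_t, PowerExpr *>::iterator it = dict.find(id);
    return it == dict.end() ? nullptr : it->second;
}

// Core logic of getSquare; returns false where the library function
// would report that the instance does not exist.
bool get_square(mint_t id, mint_t &out) {
    PowerExpr *p = lookup(id);
    if (!p) return false;
    out = p->pow2;
    return true;
}
```

A side benefit of going through `find` is that an unknown ID never creates an empty entry in the map, which `dict[id]` would do if it were ever reached without the preceding check.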

Now we are ready to compile and load the library.

lib = CreateLibrary[{"demo.cpp"}, "demo"];

doCompute = LibraryFunctionLoad[lib, "doCompute", {Integer, Integer}, "Void"];
getSquare = LibraryFunctionLoad[lib, "getSquare", {Integer}, Integer];
getPow4 = LibraryFunctionLoad[lib, "getPow4", {Integer}, Integer];

Let us turn off history tracking so we can demonstrate the automatic cleanup of these data structures later:

$HistoryLength = 0;

Let's create two instances of the data structure:

exp1 = CreateManagedLibraryExpression["PowerExpr", PowerExpr]
exp2 = CreateManagedLibraryExpression["PowerExpr", PowerExpr]
(* PowerExpr[1] *)
(* PowerExpr[2] *)

We can check that these are managed library expressions like so:

ManagedLibraryExpressionQ[PowerExpr[1]]
(* True *)

And retrieve their ID in a safe manner using ManagedLibraryExpressionID:

ManagedLibraryExpressionID[PowerExpr[1]]
(* 1 *)

We could use First[exp1] to extract the ID, but ManagedLibraryExpressionID also checks that the ID is actually associated with a C-side data structure:

ManagedLibraryExpressionID[PowerExpr[1]]
(* 1 *)

ManagedLibraryExpressionID[PowerExpr[137]]
(* $Failed *)

Let us now use the data structures:

doCompute[ManagedLibraryExpressionID[exp1], 20]

doCompute[ManagedLibraryExpressionID[exp2], 30]

getSquare[ManagedLibraryExpressionID[exp1]]
(* 400 *)

getSquare[ManagedLibraryExpressionID[exp2]]
(* 900 *)

getPow4[ManagedLibraryExpressionID[exp1]]
(* 160000 *)

getPow4[ManagedLibraryExpressionID[exp2]]
(* 810000 *)

These data structures will be automatically freed when exp1 and exp2 are cleared. We can check that PowerExpr[1] exists like so:

ManagedLibraryExpressionQ[PowerExpr[1]]
(* True *)

Now let us get rid of all references to PowerExpr[1]:

Clear[exp1]

We set $HistoryLength=0 earlier so that a reference to PowerExpr[1] would not be kept in Out (and prevent automatic cleanup).

ManagedLibraryExpressionQ[PowerExpr[1]]
(* False *)

Once we have all this set up, we can create a high level function which encapsulates all the steps:

pow24[x_] := 
  Module[{exp = CreateManagedLibraryExpression["PowerExpr", PowerExpr]}, 
    doCompute[ManagedLibraryExpressionID[exp], x];
    {getSquare[ManagedLibraryExpressionID[exp]], getPow4[ManagedLibraryExpressionID[exp]]}
  ]

At this point you might rightly think that doing this is just too much work and simply not worth the trouble. That is true for such a simple example. It becomes worth doing this when you already have some sort of framework set up to manage multiple instances of some data structure on the C side, and you want to add automatic cleanup to it. TriangleLink and TetGenLink are good examples of when using this feature is useful.

But you will also notice that most of the extra code we needed to write is quite straightforward and could be generated automatically. This is exactly what LTemplate does. It makes it much easier to work with managed library expressions by automating most of the tedious tasks we had to do to set up everything.


Using LTemplate to accomplish the same

We can do the same with LTemplate using much less code:

<< LTemplate`

template = LClass[
   "PowerExpr",
   {
    LFun["doCompute", {Integer}, "Void"],
    LFun["getSquare", {}, Integer],
    LFun["getPow4", {}, Integer]
    }
   ];

SetDirectory[$TemporaryDirectory];
code = "
  struct PowerExpr {
      int pow2, pow4;

      void doCompute(mint x) { pow2 = x*x; pow4 = pow2*pow2; }
      mint getSquare() const { return pow2; }
      mint getPow4() const { return pow4; }
  };
  ";
Export["PowerExpr.h", code, "String"];

CompileTemplate[template]

LoadTemplate[template]

exp = Make[PowerExpr]
(* PowerExpr[1] *)

exp@"doCompute"[20]

exp@"getSquare"[]
(* 400 *)

exp@"getPow4"[]
(* 160000 *)

This is short enough that it is actually worth using it for returning multiple results even in relatively simple cases.

You can evaluate SystemOpen["LTemplate-PowerExpr.cpp"] to see all the code that LTemplate generated for us.

It looks like this:

#include "LTemplate.h"
#include "LTemplateHelpers.h"
#include "PowerExpr.h"


#define LTEMPLATE_MESSAGE_SYMBOL  "LTemplate`LTemplate"

#include "LTemplate.inc"


std::map<mint, PowerExpr *> PowerExpr_collection;

DLLEXPORT void PowerExpr_manager_fun(WolframLibraryData libData, mbool mode, mint id)
{
    if (mode == 0) { // create
      PowerExpr_collection[id] = new PowerExpr();
    } else {  // destroy
      if (PowerExpr_collection.find(id) == PowerExpr_collection.end()) {
        libData->Message("noinst");
        return;
      }
      delete PowerExpr_collection[id];
      PowerExpr_collection.erase(id);
    }
}

extern "C" DLLEXPORT int PowerExpr_get_collection(WolframLibraryData libData, mint Argc, MArgument * Args, MArgument Res)
{
    mma::IntTensorRef res = mma::detail::get_collection(PowerExpr_collection);
    mma::detail::setTensor<mint>(Res, res);
    return LIBRARY_NO_ERROR;
}


extern "C" DLLEXPORT mint WolframLibrary_getVersion()
{
    return WolframLibraryVersion;
}

extern "C" DLLEXPORT int WolframLibrary_initialize(WolframLibraryData libData)
{
    mma::libData = libData;
    {
        int err;
        err = (*libData->registerLibraryExpressionManager)("PowerExpr", PowerExpr_manager_fun);
        if (err != LIBRARY_NO_ERROR) return err;
    }
    return LIBRARY_NO_ERROR;
}

extern "C" DLLEXPORT void WolframLibrary_uninitialize(WolframLibraryData libData)
{
    (*libData->unregisterLibraryExpressionManager)("PowerExpr");
    return;
}


extern "C" DLLEXPORT int PowerExpr_doCompute(WolframLibraryData libData, mint Argc, MArgument * Args, MArgument Res)
{
    mma::detail::MOutFlushGuard flushguard;
    const mint id = MArgument_getInteger(Args[0]);
    if (PowerExpr_collection.find(id) == PowerExpr_collection.end()) { libData->Message("noinst"); return LIBRARY_FUNCTION_ERROR; }

    try
    {
        mint var1 = MArgument_getInteger(Args[1]);

        (PowerExpr_collection[id])->doCompute(var1);
    }
    catch (const mma::LibraryError & libErr)
    {
        libErr.report();
        return libErr.error_code();
    }
    catch (const std::exception & exc)
    {
        mma::detail::handleUnknownException(exc.what(), "PowerExpr::doCompute()");
        return LIBRARY_FUNCTION_ERROR;
    }
    catch (...)
    {
        mma::detail::handleUnknownException(NULL, "PowerExpr::doCompute()");
        return LIBRARY_FUNCTION_ERROR;
    }

    return LIBRARY_NO_ERROR;
}


extern "C" DLLEXPORT int PowerExpr_getSquare(WolframLibraryData libData, mint Argc, MArgument * Args, MArgument Res)
{
    mma::detail::MOutFlushGuard flushguard;
    const mint id = MArgument_getInteger(Args[0]);
    if (PowerExpr_collection.find(id) == PowerExpr_collection.end()) { libData->Message("noinst"); return LIBRARY_FUNCTION_ERROR; }

    try
    {
        mint res = (PowerExpr_collection[id])->getSquare();
        MArgument_setInteger(Res, res);
    }
    catch (const mma::LibraryError & libErr)
    {
        libErr.report();
        return libErr.error_code();
    }
    catch (const std::exception & exc)
    {
        mma::detail::handleUnknownException(exc.what(), "PowerExpr::getSquare()");
        return LIBRARY_FUNCTION_ERROR;
    }
    catch (...)
    {
        mma::detail::handleUnknownException(NULL, "PowerExpr::getSquare()");
        return LIBRARY_FUNCTION_ERROR;
    }

    return LIBRARY_NO_ERROR;
}


extern "C" DLLEXPORT int PowerExpr_getPow4(WolframLibraryData libData, mint Argc, MArgument * Args, MArgument Res)
{
    mma::detail::MOutFlushGuard flushguard;
    const mint id = MArgument_getInteger(Args[0]);
    if (PowerExpr_collection.find(id) == PowerExpr_collection.end()) { libData->Message("noinst"); return LIBRARY_FUNCTION_ERROR; }

    try
    {
        mint res = (PowerExpr_collection[id])->getPow4();
        MArgument_setInteger(Res, res);
    }
    catch (const mma::LibraryError & libErr)
    {
        libErr.report();
        return libErr.error_code();
    }
    catch (const std::exception & exc)
    {
        mma::detail::handleUnknownException(exc.what(), "PowerExpr::getPow4()");
        return LIBRARY_FUNCTION_ERROR;
    }
    catch (...)
    {
        mma::detail::handleUnknownException(NULL, "PowerExpr::getPow4()");
        return LIBRARY_FUNCTION_ERROR;
    }

    return LIBRARY_NO_ERROR;
}

For returning multiple values, you'll find it much easier to return them via a WSTP/MathLink link object. Start from this:

#include <mathlink.h>
#pragma comment(lib, "ml64i4m") // only 4 seems to work?
#include <WolframLibrary.h> // include after <mathlink.h>

EXTERN_C DLLEXPORT int WolframLibrary_initialize(WolframLibraryData libData) {
    return 0;
}
EXTERN_C DLLEXPORT mint WolframLibrary_getVersion() {
    return WolframLibraryVersion;
}

// load like LibraryFunctionLoad[..., "compute", LinkObject, LinkObject]
// k_Integer :> {k^2, k^4}
EXTERN_C DLLEXPORT int compute(WolframLibraryData libData, MLINK mlp)
{
    // Note: arguments are always wrapped in a list implicitly
    long argc; MLCheckFunction(mlp, "List", &argc);

    int k; MLGetInteger(mlp, &k);

    MLNewPacket(mlp); // Note explicit creation of reply packet (different from WSTP applications)
    MLPutFunction(mlp, "List", 2);
    MLPutInteger(mlp, k*k);
    MLPutInteger(mlp, k*k*k*k);

    return LIBRARY_NO_ERROR;
}

and use like

dll = "J:\\Masterarbeit\\Implementation\\Scratch\\ScratchLibraryLink\\\
x64\\Debug\\ScratchLibraryLink.dll";
compute = LibraryFunctionLoad[dll, "compute", LinkObject, LinkObject]
compute[2]

giving

{4, 16}

IMO, "managed library expressions" are meant for conceptual 'objects' that have a certain lifetime and resources associated with them, not single-fire big results. In particular, they also often have mutators associated with them.
