How to implement a triplet network with NetSharedArray?

The problem is that I'm not sure how to write the triplet hinge loss in Mathematica.

Suppose we have three images: an anchor $A$, a positive $P$, and a negative $N$. For each image we have a feature vector, $\mathbf{a}=f(A)$ etc. The triplet loss is then: $$\operatorname{loss}(\mathbf{a}, \mathbf{p}, \mathbf{n}) = \max\left(\Vert\mathbf{a} - \mathbf{p}\Vert^{2} - \Vert\mathbf{a} - \mathbf{n}\Vert^{2} + \alpha,\ 0\right)$$
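
As a quick sanity check of the formula, here is a plain (non-net) function; tripletLossFn is just an illustrative name, not part of the net below:

(* ordinary function version of the triplet loss, for checking by hand *)
tripletLossFn[a_, p_, n_, alpha_] :=
  Max[Total[(a - p)^2] - Total[(a - n)^2] + alpha, 0]
tripletLossFn[{3, 2}, {1.2, 2.1}, {3, 2}, 0.2]  (* 3.45 *)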

First, define a NetGraph that computes the squared Euclidean distance $\Vert\mathbf{x} - \mathbf{y}\Vert^{2}$ between two vectors:

l2norm = NetGraph[{ThreadingLayer[(#1 - #2)^2 &], SummationLayer[]}, {1 -> 2}]
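
This subnet can be evaluated on its own (assuming the automatically generated port names "Input1" and "Input2"; check the displayed net if they differ in your version):

l2norm[<|"Input1" -> {3, 2}, "Input2" -> {1.2, 2.1}|>]  (* 3.25 *)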

Now the triplet loss is simply (with an $\alpha=0.2$):

alpha = 0.2;
tripletloss = NetGraph[{l2norm, l2norm, ThreadingLayer[Max[#1 - #2 + alpha, 0]&]}, 
  {{NetPort["a"], NetPort["p"]} -> 1 -> 3, {NetPort["a"], NetPort["n"]} -> 2 -> 3}]

We can evaluate this loss for three feature vectors:

tripletloss[<|"a" -> {3, 2}, "n" -> {3, 2}, "p" -> {1.2, 2.1}|>]
Out[145]= 3.45

This can be used as a loss in your triplet net.


I'm adding this post because it is the complete answer to the original question (I accepted Sebastian's partial answer because it was very helpful).

The code below is a minimal working example (MWE) of what I had in mind, followed by a few follow-up questions/remarks:

(* Define the loss from Sebastian's post above *)
l2norm = NetGraph[{ThreadingLayer[(#1 - #2)^2 &], 
   SummationLayer[]}, {1 -> 2}]; alpha = 0.2;
tripletloss = NetGraph[{l2norm, l2norm, 
   ThreadingLayer[Max[#1 - #2 + alpha, 0] &]}, {{NetPort["a"], NetPort["p"]} -> 
    1 -> 3, {NetPort["a"], NetPort["n"]} -> 2 -> 3}]

(* generate triplet training data, i.e. (anchor, positive, negative) batches *)
resource = ResourceObject["MNIST"];
trainingData = ResourceData[resource, "TrainingData"];
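(* trainingData is a list of 60,000 rules of the form image -> digit label *)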
tripleBatchGenerator[assn_Association] := Module[{pi, pos, a, p, n},
  Table[
   (* pick the digit class for the anchor/positive pair *)
   pi = RandomInteger[{0, 9}];
   (* positions of all training examples labeled pi *)
   pos = Position[trainingData, x_ -> pi];
   (* anchor and positive: two distinct examples of class pi *)
   {a, p} = Extract[trainingData, RandomSample[pos, 2]][[All, 1]];
   (* negative: a random example whose label is not pi *)
   n = Part[trainingData,
      RandomChoice[
       Complement[Range[Length@trainingData], Flatten[pos]]]][[1]];
   <|"a" -> a, "p" -> p, "n" -> n|>, assn["BatchSize"]]
  ]
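
The generator can be tested by hand; NetTrain normally supplies the association argument, so here we pass a "BatchSize" ourselves:

(* manual test: ask the generator for a batch of 2 triplets *)
tripleBatchGenerator[<|"BatchSize" -> 2|>]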

(* Question: is this a correct/good way to make the siamese portion? *)
(* take LeNet up to its last hidden layer, dropping the final classification layers *)
evalnet = NetTake[NetModel["LeNet"], {1, -3}]
(* calling NetInsertSharedArrays on the same net yields copies whose arrays
   have identical shared names, so the three branches share their weights *)
aevalnet = NetInsertSharedArrays[evalnet];
pevalnet = NetInsertSharedArrays[evalnet];
nevalnet = NetInsertSharedArrays[evalnet];

(* Assemble the pieces into a NetGraph for training *)
enc = NetEncoder[{"Image", {28, 28}, "Grayscale"}];
net = NetGraph[<|
   "aevalnet" -> aevalnet,
   "pevalnet" -> pevalnet,
   "nevalnet" -> nevalnet,
   "loss" -> tripletloss|>, {
   NetPort["a"] -> "aevalnet" -> NetPort["loss", "a"], 
   NetPort["p"] -> "pevalnet" -> NetPort["loss", "p"], 
   NetPort["n"] -> "nevalnet" -> NetPort["loss", "n"],
   "loss" -> NetPort["Loss"]}, "a" -> enc, "p" -> enc, "n" -> enc]

(* Train it! *)
trainResult = NetTrain[net, tripleBatchGenerator, All, MaxTrainingRounds -> 500]

It starts training correctly:

[NetTrain progress panel showing the loss decreasing]
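
Once training finishes, one branch can be pulled out as the embedding network. A sketch (the NetReplacePart re-attaches the image encoder, since the extracted subnet may not carry one):

(* extract one trained branch to use as an image embedder *)
trainedNet = trainResult["TrainedNet"];
embedder = NetReplacePart[NetExtract[trainedNet, "aevalnet"], "Input" -> enc];
embedder[trainingData[[1, 1]]]  (* feature vector for the first training image *)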

However, here are some follow-up questions/remarks:

  1. When using a generator function, specifying the ValidationSet doesn't seem to work (e.g. ValidationSet -> Scaled[.1]).
  2. Is there a good way to automate a search for the hyperparameter alpha (which was arbitrarily set to 0.2)? A rough sketch of one approach follows this list.
  3. Are the three calls to NetInsertSharedArrays right, or would using NetMapOperator be better?
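
For question 2, here is a crude grid-search sketch. makeTripletNet is a hypothetical helper that rebuilds the training net for each candidate alpha; note that raw losses aren't directly comparable across different alpha values, so a real search should score embedding quality (e.g. retrieval accuracy) on held-out data:

makeTripletNet[a_] := Module[{loss},
  (* same loss as above, but with the margin a baked in *)
  loss = NetGraph[{l2norm, l2norm,
     ThreadingLayer[Max[#1 - #2 + a, 0] &]},
    {{NetPort["a"], NetPort["p"]} -> 1 -> 3,
     {NetPort["a"], NetPort["n"]} -> 2 -> 3}];
  NetGraph[<|"aevalnet" -> aevalnet, "pevalnet" -> pevalnet,
    "nevalnet" -> nevalnet, "loss" -> loss|>,
   {NetPort["a"] -> "aevalnet" -> NetPort["loss", "a"],
    NetPort["p"] -> "pevalnet" -> NetPort["loss", "p"],
    NetPort["n"] -> "nevalnet" -> NetPort["loss", "n"],
    "loss" -> NetPort["Loss"]}, "a" -> enc, "p" -> enc, "n" -> enc]]

(* short run per candidate; compare the round-loss curves *)
searchResults = Table[
   {a, NetTrain[makeTripletNet[a], tripleBatchGenerator, "RoundLossList",
     MaxTrainingRounds -> 50]}, {a, {0.1, 0.2, 0.5}}];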

Perhaps @Sebastian can address these to help put the finishing touches on it?