Neural Networks: Does Mathematica (v11) experimental code support state-of-the-art models?
Mathematica's neural network functionality is based on MXNet, so you can import models pre-trained for MXNet, or build and train state-of-the-art models yourself with NetGraph.
For example, pre-trained Inception-V3:
https://github.com/dmlc/mxnet-model-gallery/blob/master/imagenet-1k-inception-v3.md
(* download the archive, then unpack it into the current working directory *)
archive = FileNameJoin[{$UserDocumentsDirectory, "inception-v3.tar.gz"}];
URLDownload["http://data.dmlc.ml/mxnet/models/imagenet/inception-v3.tar.gz", archive];
ExtractArchive[archive];
Needs["NeuralNetworks`"]
(* paths are relative to the directory the archive was unpacked into *)
net = NeuralNetworks`ImportMXNetModel[
  "model/Inception-7-symbol.json",
  "model/Inception-7-0001.params"
]
The newest model, Xception, cannot be replicated right now, because SeparableConv2D and GlobalAveragePooling2D layers are not available. MXNet has no SeparableConv2D layer; even in Keras, SeparableConv2D is available only with the TensorFlow backend. Global (average or max) pooling exists in MXNet but is not yet exposed in Mathematica.
UPDATE
Since V11.1 we can use AggregationLayer for global pooling.
SeparableConv2D can be built from other layers:
n = 128; h = 3; w = 3; depth = 2;
NetChain[
  {
    ReplicateLayer[1],
    TransposeLayer[],
    NetMapOperator[ConvolutionLayer[depth, {h, w}]],
    FlattenLayer[1],
    ConvolutionLayer[n, {1, 1}]
  },
  "Input" -> {32, 9, 9}
]
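To see why a separable convolution is attractive, it may help to compare parameter counts. The following is only a back-of-the-envelope sketch using the values from the snippet above (32 input channels, n = 128, 3 x 3 kernels, depth = 2). It assumes one bias per output channel in every stage, which is my assumption and differs between implementations. The names standard, depthwise, pointwise and separable are just illustrative.

```mathematica
(* values taken from the snippet above; cin is the number of input channels *)
cin = 32; n = 128; h = 3; w = 3; depth = 2;

(* ordinary convolution: one h x w filter per (input, output) channel pair, plus biases *)
standard = cin*n*h*w + n;

(* depthwise stage: depth filters of size h x w for each input channel, plus biases *)
depthwise = cin*depth*h*w + cin*depth;

(* pointwise stage: a 1 x 1 convolution from cin*depth channels to n channels, plus biases *)
pointwise = cin*depth*n + n;

separable = depthwise + pointwise;

{standard, separable}
(* {36992, 8960} *)
```

Note that NetMapOperator presumably applies one and the same ConvolutionLayer to every channel, so the construction above would share its spatial filters across channels and have even fewer parameters than a true depthwise stage; I have not verified this against the actual weight arrays.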
Bringing in pre-trained models is sometimes very useful. Alexey's answer is somewhat brief, so here I add some examples that will hopefully be helpful.
We can load the trained network by
net = NeuralNetworks`ImportMXNetModel[
"model/Inception-7-symbol.json",
"model/Inception-7-0001.params"
]
and attach a final softmax layer to calculate the probability of each class:
net2 = NetGraph[{net, SoftmaxLayer[]}, {1 -> 2},
"Input" -> NetEncoder[{"Image", {299, 299}, ColorSpace -> "RGB"}],
"Output" -> NetDecoder[{"Class", Range[1008]}]]
The prediction label/text mapping is in the file synset.txt:
labels = Import["model/synset.txt", "Table"]
We can then use the inception network to identify images. For example
imgs = EntityValue[#, "Image"] & /@ {Entity["Species",
"Infraspecies:CanisLupusFamiliaris"],
Entity["Species", "Species:FelisCatus"],
Entity["Species", "Species:PantheraTigris"],
Entity["Species", "Genus:Macropus"]}
and the labels are identified fairly accurately
labels[[net2[ImageResize[#, {299, 299}]]]] & /@ imgs
(* {{"n02099601", "golden", "retriever"},
{"n02127052", "lynx,", "catamount"},
{"n02129604", "tiger,", "Panthera", "tigris"},
{"n01877812", "wallaby,", "brush", "kangaroo"}} *)
We can also try to visualize the weights in its layers. For example, here are the weights of the 32 filters of the first convolution layer:
weight = NetExtract[net, {1, "Weights"}];
ImageCollage[
Table[ImageAdjust[
Image[weight[[n, All, All]], ColorSpace -> "RGB"]], {n, 1, 32}],
ImagePadding -> 1]
And we can see what these convolution filters do to the input image:
conv = NetChain[{NetExtract[net, 1]},
"Input" -> NetEncoder[{"Image", {299, 299}, ColorSpace -> "RGB"}]];
data = conv@ImageResize[#, {299, 299}] &@imgs[[1]];
ImageCollage[ImageAdjust@Image[#] & /@ data]
We can also use Take to cut the inception model at some layer, and visualize the propagated input image at that layer:
layers = Take[net, {"conv_conv2d", "mixed_2_tower_1_conv_1_conv2d"}]
net3 = NetChain[{layers},
"Input" -> NetEncoder[{"Image", {299, 299}, ColorSpace -> "RGB"}]]
data2 = net3@ImageResize[#, {299, 299}] &@imgs[[1]];
ImageCollage[ImageAdjust@Image[#] & /@ data2, ImagePadding -> 1]
XCEPTION
https://arxiv.org/abs/1610.02357
entry = NetGraph[
<|
"conv_1" -> ConvolutionLayer[32, {3, 3}, "Stride" -> 2],
"relu_1" -> Ramp,
"conv_2" -> ConvolutionLayer[64, {3, 3}],
"relu_2" -> Ramp,
"resid_1" -> ConvolutionLayer[128, {1, 1}, "Stride" -> 2],
"sep_conv_1" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[128, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"relu_3" -> Ramp,
"sep_conv_2" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[128, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"max_pool_1" ->
PoolingLayer[{3, 3}, "Stride" -> 2, "PaddingSize" -> 1],
"add_1" -> ThreadingLayer[Plus],
"resid_2" -> ConvolutionLayer[256, {1, 1}, "Stride" -> 2],
"relu_4" -> Ramp,
"sep_conv_3" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[256, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"relu_5" -> Ramp,
"sep_conv_4" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[256, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"max_pool_2" ->
PoolingLayer[{3, 3}, "Stride" -> 2, "PaddingSize" -> 1],
"add_2" -> ThreadingLayer[Plus],
"resid_3" -> ConvolutionLayer[728, {1, 1}, "Stride" -> 2],
"relu_6" -> Ramp,
"sep_conv_5" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[728, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"relu_7" -> Ramp,
"sep_conv_6" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[728, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"max_pool_3" ->
PoolingLayer[{3, 3}, "Stride" -> 2, "PaddingSize" -> 1],
"add_3" -> ThreadingLayer[Plus]
|>
,
{
NetPort["Input"] ->
"conv_1" ->
"relu_1" -> "conv_2" -> "relu_2" -> "resid_1" -> "add_1",
"relu_2" ->
"sep_conv_1" ->
"relu_3" -> "sep_conv_2" -> "max_pool_1" -> "add_1",
"add_1" -> "resid_2" -> "add_2",
"add_1" ->
"relu_4" ->
"sep_conv_3" ->
"relu_5" -> "sep_conv_4" -> "max_pool_2" -> "add_2",
"add_2" -> "resid_3" -> "add_3",
"add_2" ->
"relu_6" ->
"sep_conv_5" ->
"relu_7" -> "sep_conv_6" -> "max_pool_3" -> "add_3"
}
,
"Input" -> NetEncoder[{"Image", {299, 299}, ColorSpace -> "RGB"}]
]
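The middle and exit flows below take a {728, 19, 19} input. As a rough check of where the 19 comes from, the spatial size can be traced through the entry flow by hand. This sketch assumes the usual floor formula for convolution and pooling output sizes; outSize is a hypothetical helper, not a built-in function.

```mathematica
(* output length along one dimension for a given kernel, stride and padding *)
outSize[in_, kernel_, stride_, pad_] := Floor[(in + 2 pad - kernel)/stride] + 1

s1 = outSize[299, 3, 2, 0]; (* conv_1: 3 x 3, stride 2, no padding *)
s2 = outSize[s1, 3, 1, 0];  (* conv_2: 3 x 3, stride 1, no padding *)

(* main branch: three 3 x 3 max-pools with stride 2 and padding 1;
   the separable convolutions keep the size because of their padding of 1 *)
main = NestList[outSize[#, 3, 2, 1] &, s2, 3]
(* {147, 74, 37, 19} *)

(* residual branch: three 1 x 1 convolutions with stride 2 *)
skip = NestList[outSize[#, 1, 2, 0] &, s2, 3]
(* {147, 74, 37, 19} *)
```

Both branches agree at every stage, which is what each ThreadingLayer[Plus] needs, and the final size of 19 matches the input declared for the next two graphs.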
middle = NetGraph[
<|
"relu_1" -> Ramp,
"sep_conv_1" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[728, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"relu_2" -> Ramp,
"sep_conv_2" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[728, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"relu_3" -> Ramp,
"sep_conv_3" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[728, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"add" -> ThreadingLayer[Plus]
|>
,
{
NetPort["Input"] ->
"relu_1" ->
"sep_conv_1" ->
"relu_2" -> "sep_conv_2" -> "relu_3" -> "sep_conv_3" -> "add",
NetPort["Input"] -> "add"
}
,
"Input" -> {728, 19, 19}
]
exit = NetGraph[
<|
"resid" -> ConvolutionLayer[1024, {1, 1}, "Stride" -> 2],
"relu_1" -> Ramp,
"sep_conv_1" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[728, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"relu_2" -> Ramp,
"sep_conv_2" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[1024, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"max_pool" ->
PoolingLayer[{3, 3}, "Stride" -> 2, "PaddingSize" -> 1],
"add" -> ThreadingLayer[Plus],
"sep_conv_3" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[1536, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"relu_3" -> Ramp,
"sep_conv_4" ->
NetGraph[
{
ReplicateLayer[1],
TransposeLayer[],
NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],
FlattenLayer[1],
ConvolutionLayer[2048, {1, 1}]
},
{1 -> 2 -> 3 -> 4 -> 5}
],
"relu_4" -> Ramp,
"global_pool" -> AggregationLayer[Mean],
"softmax" -> {2048, SoftmaxLayer[]}
|>
,
{
NetPort["Input"] -> "resid" -> "add",
NetPort["Input"] ->
"relu_1" ->
"sep_conv_1" -> "relu_2" -> "sep_conv_2" -> "max_pool" -> "add",
"add" ->
"sep_conv_3" ->
"relu_3" -> "sep_conv_4" -> "relu_4" -> "global_pool" -> "softmax"
}
,
"Input" -> {728, 19, 19},
"Output" -> NetDecoder[{"Class", Range[2048]}]
]
xception = NetChain[
<|
"entry_flow" -> entry,
"middle_flow" -> NetNestOperator[middle, 8],
"exit_flow" -> exit
|>
]
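The same five-layer block is repeated for every separable convolution in the listings above, differing only in the number of output channels. A possible way to shorten the code is a small helper; sepConv is a hypothetical name of mine, and the sketch assumes the block behaves the same inside a NetChain as inside the NetGraph used above.

```mathematica
(* separable convolution with n output channels, built exactly as in the listings above *)
sepConv[n_Integer] := NetChain[{
   ReplicateLayer[1],                                                (* add a leading dimension *)
   TransposeLayer[],                                                 (* swap it with the channel dimension *)
   NetMapOperator[ConvolutionLayer[1, {3, 3}, "PaddingSize" -> 1]],  (* 3 x 3 convolution on each channel *)
   FlattenLayer[1],                                                  (* merge the two leading dimensions again *)
   ConvolutionLayer[n, {1, 1}]                                       (* 1 x 1 pointwise convolution *)
  }]
```

With this, an entry such as "sep_conv_1" in the entry flow could be written as "sep_conv_1" -> sepConv[128], and the middle-flow entries as sepConv[728].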