How to inspect the learned ClassifierFunction decision tree model?
Here is one way to visualize/interpret Classify's tree structure from Henrik Schumacher's answer.
SeedRandom[432]
fn[n_] := Which[n < 1, 1, 1 <= n < 2, 2, 2 <= n, 3];
data = Table[(r = RandomReal[{0, 4}]; r -> fn[r]), {i, 1, 1000}];
c = Classify[data, Method -> "DecisionTree"];
tree = c[[1, "Model", "Tree"]];
fromRawArray[a_RawArray] := Developer`FromRawArray[a];
fromRawArray[a_] := a;
Map[Normal, fromRawArray /@ tree[[1]]]
(* <|"FeatureIndices" -> {1, 1},
"NumericalThresholds" -> {-0.894867, -0.0252423},
"NominalSplits" -> {}, "Children" -> {{-2, -3}, {1, -1}},
"LeafValues" -> {{1, 1, 508}, {246, 1, 1}, {1, 249, 1}},
"RootIndex" -> 2, "NominalDimension" -> 0|> *)
Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/AVCDecisionTreeForest.m"]
dtree = BuildDecisionTree[List @@@ data]
(* {{0.374931, 1.9972, 1, Number,
1000}, {{0.499981, 1.0175, 1, Number,
493}, {{{245, 1}}}, {{{248, 2}}}}, {{{507, 3}}}} *)
LayeredGraphPlot[DecisionTreeToRules[dtree],
VertexLabeling -> True]
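There is a discrepancy of 1 in the obtained values: each leaf count in Classify's LeafValues is one higher than the corresponding count in the second tree (508 vs. 507, 246 vs. 245, 249 vs. 248). Otherwise the second tree seems to approximate Classify's tree well. The splitting thresholds of Classify's tree are most likely computed on data that has been transformed by some embedding/hashing/normalization.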
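To make the comparison concrete, here is a minimal sketch of how the arrays in the Classify output above could be walked by hand. It relies on assumptions that are not documented anywhere I know of and are only inferred from the numbers shown: a positive entry in "Children" refers to another split node, a negative entry -k refers to row k of "LeafValues", and the first child is taken when the feature value is below the threshold. The names treeData and walkTree are hypothetical helpers, not part of any Wolfram API, and the input must already be in the transformed feature space that the thresholds live in.

```mathematica
(* Tree data copied verbatim from the Classify output above,
   so this sketch runs without calling Classify. *)
treeData = <|
   "FeatureIndices" -> {1, 1},
   "NumericalThresholds" -> {-0.894867, -0.0252423},
   "NominalSplits" -> {},
   "Children" -> {{-2, -3}, {1, -1}},
   "LeafValues" -> {{1, 1, 508}, {246, 1, 1}, {1, 249, 1}},
   "RootIndex" -> 2,
   "NominalDimension" -> 0|>;

(* ASSUMPTION (inferred, not documented): positive child index = split node,
   negative child index -k = leaf k, first child taken when value < threshold. *)
walkTree[t_Association, x_List] :=
  Module[{node = t["RootIndex"], feature, child},
   While[node > 0,
    feature = x[[t["FeatureIndices"][[node]]]];
    child = If[feature < t["NumericalThresholds"][[node]], 1, 2];
    node = t["Children"][[node, child]]];
   (* node is now negative; its absolute value is the leaf row *)
   t["LeafValues"][[-node]]];

(* Inputs are values in the transformed feature space, not the raw r. *)
walkTree[treeData, {-1.0}]   (* {246, 1, 1} *)
walkTree[treeData, {-0.5}]   (* {1, 249, 1} *)
walkTree[treeData, {0.5}]    (* {1, 1, 508} *)

(* Position of the largest count in the leaf reached *)
First[Ordering[walkTree[treeData, {0.5}], -1]]   (* 3 *)
```

Under this reading the three leaves line up with the three branches of fn, and the leaf rows match the counts of the second tree up to the offset of 1 noted above. That offset might come from some form of additive smoothing of the counts, but that is a guess.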
In general, many objects generated by Mathematica can be inspected by trying things of the form InputForm[classifier] or classifier[string], with string being one of the elements of classifier["Properties"]. Recently implemented objects such as ClassifierFunction are mere wrappers for well-structured Associations (thumbs up for this approach!), so classifier[[1]] can be very revealing.
The following reveals that tree = c[[1, "Model", "Tree"]] is another such object (with head MachineLearning`DecisionTree).
Inspecting tree[[1]] reveals that MachineLearning`DecisionTree objects are partially composed of RawArrays. These can be converted to ordinary arrays as follows:
fromRawArray[a_RawArray] := Developer`FromRawArray[a];
fromRawArray[a_] := a;
fromRawArray /@ tree[[1]]
<|"FeatureIndices" -> {1, 1}, "NumericalThresholds" -> {0.402823, 0.82983}, "NominalSplits" -> {}, "Children" -> {{-1, 2}, {-2, -3}}, "LeafValues" -> {{619, 1, 1}, {1, 130, 1}, {1, 1, 254}}, "RootIndex" -> 1, "NominalDimension" -> 0|>
But I cannot tell you how to interpret this data. I would have to learn first what a decision tree is and how it is constructed...