How can I regroup elements in a list into a tree based on their values?
This is a job for recursion:
ClearAll[f];
f[{fst : {{left__}, _}, sub : Longest[{{left__, __}, _} ...], rest___}] :=
{{fst, f[{sub}]} /. {el_, {}} :> el, Sequence @@ f[{rest}]};
f[x_] := x;
The usage is
f[your-initial-structure]
It would be nice to have a real big sample, but I hope it works anyway. Here is the approach: I would extract the information level by level and build up the structure. Level 1 contains the sections and this is basically your starting point
Select[input, Length[First[#]] === 1 &]
All section number lists which have a length of one are your sections. The first one looks like this
{{"1"}, "Setup code"}
Lets say our current output is a list of all those entries. What you want to do now is to extract all sections of level 2 which have the same first number like the above. Lets say we found 3 subsections call them s1,s2,s3
and we call the section entry from above elm
then we want to replace in our current output
elm -> {elm, {s1, s2, s3}}
This can be done level by level until we are finished. To make it a bit more concrete: the front of our rule above must ensure, that we work only on the elements of the current level, therefore, we check the length of the section numbering
elm : {l : {__String}, what_String} /; Length[l] === 1 :> blub
To extract s1,...
we Select
from our input all entries which are one level deeper than the current one (this means Length[..]===2
here) and which have the same section numbers up to the current level. For instance, all childs of {2,2,3}
have to look like {2,2,3,..}
. In a generic solution this call is therefore
Select[input, Length[First[#]] === i + 1 && First[#][[Range[i]]] === l &]
where i
is the current level. We need the First[#]
because the section numbers are in the first part of every entry.
One last thing: If an entry has no sub-entries, the Select
returns {}
which I simply filter in a second rule. The nice thing is that when we are finished, the output does not change anymore. This is perfect for FixedPoint
TransformDoc[input_] :=
Block[{i = 0},
FixedPoint[
(i++; # /. elm : {l : {__String}, what_String} /; Length[l] === i :>
{elm,Select[input, Length[First[#]] === i + 1 &&
First[#][[Range[i]]] === l &]} /.
{whatever__, {}} :> whatever) &,
Select[input, Length[First[#]] === 1 &]]
]
I hope I explained all details enough to understand the rule I used.
input = {{{"1"}, "Setup code"}, {{"1", "1"},
"Local variables"}, {{"1", "2"}, "Option variables"}, {{"1", "3"},
"Check for null Input"}, {{"2"}, "Argument Logic"}, {{"2", "1"},
"X Spec"}, {{"2", "2"}, "One Dimensional Case"}, {{"2", "2", "1"},
"Tick and Label Generation"}, {{"2", "2", "2"},
"plot generation"}, {{"2", "2", "3"},
"bin names generation"}, {{"2", "2", "4"},
"bar chart generation"}, {{"2", "3"}, "Y Spec"}};
TransformDoc[input] // ToOutline
Finally, please give me a chance to maybe make you reconsider your approach. Why do you need the section numbering at all? Look at LaTeX, as long as you don't want to mix different sections at different places you only need to markup whether you need a section, a subsection, ... The numbering and the structure can simply be build from this. Wouldn't it be easier to just give a number defining the depth of your comment and the rest is calculated automatically?
ReplaceRepeated
with some pre-processing to re-structure outline
into the desired form.
The key ideas are
- construct a tree with root
0
using the information in the first column ofoutline
- pre-process the vertex list to turn leaf indices to strings
- Use
ReplaceRepeated
starting from the root node, and replacing each processed nodei
with the list{"i", {children of i}}
.
First, few simple helper functions:
ClearAll[fatherF, rootsF, chldrnListF]
fatherF = Function[{item, list}, If[Length[item] == 1, {0},
Pick[list, (item[[;; -2]] == #) & /@ list]]];
(* create a list of children lists - the leaf nodes of the tree marked as strings in each part *)
chldrnListF = Function[{grph},
VertexOutComponent[grph, #, 1]& /@ VertexList[grph] /.
i_Integer /; VertexOutComponent[grph, {i}, 1] == {i} :> ToString[i]];
(* to extract roots *)
rootsF = Function[{grph},
Pick[VertexList[grph], (VertexInComponent[grph, {#}, 1] == {#}) & /@
VertexList[grph]]];
and some pre-processing:
indices = First /@ outline;
fathers = fatherF[#, indices] & /@ indices;
edgelist = (Thread[fathers -> indices] /.
Rule[a_, b_] :>
Rule[Position[indices, First@a, 1], Position[indices, b, 1]] /. {} -> {{0}}) //
Map[First@(Sequence @@ #) &, #, {2}] &
(* {0->1,1->2,1->3,1->4,0->5,5->6,5->7,7->8,7->9,7->10,7->11,5->12} *)
(* if needed remove the zero node to get a forest *)
edgelistb = edgelist /. Rule[0, _] :> Sequence[]
and a long "one"-liner:
({0} //. (i_Integer :> {ToString[i], Rest[chldrnListF[Graph@edgelist][[i + 1]]]}) /.
x_String :> ToExpression[x] // Part[#, 1, 2] &) /. i_Integer :> outline[[i]];
% == answer
(* True *)
outline
as a tree or a forest:
labels = MapIndexed[First[#2] -> Placed[Rotate[ToString[#1], 15 Degree], Below] &, outline];
{grph, grphb} = {Graph[edgelist, VertexLabels -> Prepend[labels, {"0"}],
ImagePadding -> 50, ImageSize -> 600],
Graph[edgelistb, VertexLabels -> labels, ImagePadding -> 50, ImageSize -> 400]};
Panel@Row[{grph, grphb}, Spacer[10]]
Details of the steps:
grph = Graph[edgelist];
tmp = If[MemberQ[rootsF[grph], 0],
rootsF[grph] //. (i_Integer :> {ToString[i], Rest[chldrnListF[grph][[i + 1]]]}),
rootsF[ grph] //. (i_Integer :> {ToString[i], Rest[chldrnListF[grph][[i]]]})] /.
x_String :> ToExpression[x]
(* {{"0",{{"1",{"2","3","4"}},{"5",{"6",{"7",{"8","9","10","11"}},"12"}}}}}*)
(* post process to transform strings back to integers and get the appropriate part*)
treeList = If[MemberQ[rootsF[grph], 0], Part[#, 1, 2], #] &[tmp]
(*{{1,{2,3,4}},{5,{6,{7,{8,9,10,11}},12}}} *)
(* map back to outline *)
tocTreeList = treeList /. i_Integer :> outline[[i]]
(* {{{{"1"},"Setup code"},
{{{"1","1"},"Local variables"},
{{"1","2"},"Option variables"},
{{"1","3"},"Check for null Input"}}},
{{{"2"},"Argument Logic"},
{{{"2","1"},"X Spec"},
{{{"2","2"},"One Dimensional Case"},
{{{"2","2","1"},"Tick and Label Generation"},
{{"2","2","2"},"plot generation"},
{{"2","2","3"},"bin names generation"},
{{"2","2","4"},"bar chart generation"}}},
{{"2","3"},"Y Spec"}}}} *)
answer == tocTreeList
(* True *)