Add lists with unequal lengths together to create a matrix
I don't have a free kernel to try your question on, so here's a smaller generalized approach that I could construct in my head. In the end, I'll mention how you can adapt it to yours.
First, consider a non-rectangular list similar to yours:
list = {{{1, a}, {2, b}},
{{3, a}, {2, c}, {4, f}},
{{3, b}, {6, f}, {4, c}, {5, e}}};
The output expected here is a 3x6 matrix with each column corresponding to the unique elements and each row having the corresponding element or zero. Obtaining the unique elements and constructing a larger matrix, as the following, we get:
With[{un = Union@Flatten@list[[All, All, 1]]},
(un /. Rule@@@#&)/@list /. _Integer -> 0
]
(* Out[1] = {{a, b, 0, 0, 0, 0},
{0, c, a, f, 0, 0},
{0, 0, b, c, e, f}}
*)
which, I believe, is what you want. If you want the first row to be the list of unique elements, simply join un
with the rest as:
With[{un = Union@Flatten@list[[All, All, 1]]},
{un} ~Join~ ((un /. Rule@@@#&)/@list /. _Integer -> 0)
]
(* Out[2]= {{1, 2, 3, 4, 5, 6},
{a, b, 0, 0, 0, 0},
{0, c, a, f, 0, 0},
{0, 0, b, c, e, f}}
*)
Now the only change that's required for this to work on your data is to use _SQLDateTime
instead of _Integer
in the With
construct above.
This is based on SparseArray
, like rcollyer's answer, but it's a one-liner since everything can by done by Apply
ing Rule
at the proper level and then using PadRight
to turn a ragged array into a rectangular one. Let
list = {{{1, a}, {2, b}},
{{3, a}, {2, c}, {4, f}},
{{3, b}, {6, f}, {4, c}, {5, e}}};
Then:
PadRight[Normal /@ SparseArray /@ (Apply[Rule, list, {2}])]
{{a, b, 0, 0, 0, 0}, {0, c, a, f, 0, 0}, {0, 0, b, c, e, f}}
EDIT to add: Of course, the real trick is getting the first row right, like @rcollyer points out in the comments below. It's possible to extend this solution in a manner similar to the one in R.M.'s answer, but I'm not sure how much that wins you:
Pillsy`Rectangularize[lists : {___List}] :=
With[{columns = Union@Cases[lists, {n_, _} :> n, {2}]},
Prepend[
PadRight[Normal /@ SparseArray /@
(lists /.
MapThread[{#1, v_} :> (#2 -> v) &, {columns,
Range@Length@columns}])],
columns]]
I tested the function with the following:
namedList = list /. Thread[Range[6] -> CharacterRange["A", "F"]]
{{{"A", a}, {"B", b}}, {{"C", a}, {"B", c}, {"D", f}}, {{"C", b}, {"F", f}, {"D", c}, {"E", e}}}
Pillsy`Rectangularize[namedList]
{{"A", "B", "C", "D", "E", "F"}, {a, b, 0, 0, 0, 0}, {0, c, a, f, 0, 0}, {0, 0, b, c, e, f}}
I did come up with a totally different solution based on Ordering
and DeleteDuplicates
which I kind of like:
Pillsy`AnotherRectangularize[lists_] :=
With[{columns = Union @@ lists[[All, All, 1]]},
Prepend[
Part[PadRight[#[[All, -1]], Length@columns],
Ordering@DeleteDuplicates[#[[All, 1]]~Join~columns]] & /@ lists,
columns]]
Much belated EDIT, using Association
functionality:
ClearAll[Pillsy`Rectangularize];
Pillsy`Rectangularize[lists : {{{_, _} ...} ..}] :=
Transpose@KeyValueMap[Prepend[#2, #1] &]@
KeySort@Merge[
KeyUnion[Map[Transpose /* Apply[AssociationThread], lists], 0 &],
Identity]
Here is an alternative method based upon Reap
and Sow
.
Reap[
MapIndexed[
Sow[#2[[1]] -> #1[[2]], First@#1] &,
list, {2} (* <-- Note level specifier *)
], _,
Flatten@{#1, Normal@SparseArray[#2, Length@list]} &][[2]] //
SortBy[#, First] & // Transpose
If you are not familiar with these two functions, Sow
tags each item you indicate and Reap
organizes the items by each tag. Among the advantages of Reap
and Sow
over GatherBy
is the ability to simultaneously transform each item while tagging it with one of its original properties. Here I take advantage of this by using MapIndexed
to tag each datum with its first element (second parameter in Sow
) while associating the second element of the datum with the index of its data set. (This is done by using the {2}
as the level specifier.) For example,
MapIndexed[#2[[1]]->#1[[2]], list, {2}]
with
list = {
{{1, a}, {2, b}},
{{3, a}, {2, c}, {4, f}},
{{3, b}, {6, f}, {4, c}, {5, e}}
};
gives
{{1 -> a, 1 -> b}, {2 -> a, 2 -> c, 2 -> f}, {3 -> b, 3 -> f, 3 -> c, 3 -> e}}
By creating the association with the data set index, we are marking which row it will end up in after Reap
collects them. Reap
allows us to specify an additional transformation taking 2 parameters the tag and a list of items with that tag as its third argument. For example,
Reap[MapIndexed[Sow[#2[[1]] -> #1[[-1]], First@#1] &, list, {2}],
_, Rule][[2]]
using list
from above gives
{1 -> {1 -> a}, 2 -> {1 -> b, 2 -> c}, 3 -> {2 -> a, 3 -> b}, 4 -> {2 -> f, 3 -> c}, 6 -> {3 -> f}, 5 -> {3 -> e}}
As you can see, this has collected the data according to the first element of each datum while retaining information about which dataset it belongs to.
Since the tagged data has the form, {_Integer -> value ..}
, we can use SparseArray
to generate each row while automatically filling in the empty spaces with 0
. This is what I do with
Flatten@{#1, Normal@SparseArray[#2, Length@list]} &
as the third parameter.
Then, I just SortBy
the First
element of each sub-list, and Transpose
it into its final form
{{1, 2, 3, 4, 5, 6}, {a, b, 0, 0, 0, 0}, {0, c, a, f, 0, 0}, {0, 0, b, c, e, f}}
As to the specific dataset involving SQLDateTime
, it appears that SQLDateTime
contains a full date-time stamp. It is unlikely that any two datum will have the same date-time stamp, so if you want to gather within a full day only, change
Sow[#2[[1]] -> #1[[2]], First@#1] &
to
Sow[#2[[1]] -> #1[[2]], { #1[[1, ;;3]] } ] &
where #1[[1, ;;3]]
only extracts what appears to be the date part of SQLDateTime
. This could be extended to include hours or even minutes by changing ;;3
to ;;4
or ;;5
, respectively. Note, I wrapped the tag in an extra set of braces so that the entire date is considered a tag, and not a collection of individual tags.