pivot_wider for Datasets

This is the simplest way I could think of:

ds = Dataset[{
  <|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2017, "Score" -> 115|>,
  <|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2018, "Score" -> 230|>,
  <|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2017, "Score" -> 70|>,
  <|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2018, "Score" -> 300|>
}];
ds[
  GroupBy[#Name &],
  ResourceFunction["MergeByKey"][{{"Name", "Born"} -> First}]
][
  Values,
  Append[KeyDrop[#, {"Year", "Score"}],
    AssociationThread["Score_" <> ToString[#]& /@ #Year, #Score]
  ]&
]

If different people have different years, the resulting rows will not all have the same keys, so you might want to use KeyUnion to make the dataset square again. MergeByKey is documented in the Wolfram Function Repository.
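
To illustrate the KeyUnion remark, here is a minimal sketch. The rows below are hypothetical (they are not produced by the code above): they mimic a wide result in which "Joe" has no 2017 score. KeyUnion gives every association the union of all keys and fills the gaps with Missing["KeyAbsent", key]:

```wolfram
(* Hypothetical wide rows: "Joe" has no score for 2017 *)
rows = {
  <|"Name" -> "Steven", "Born" -> 1980, "Score_2017" -> 115, "Score_2018" -> 230|>,
  <|"Name" -> "Joe", "Born" -> 1981, "Score_2018" -> 300|>
};

(* Every row now has the same keys; Joe's "Score_2017" becomes Missing["KeyAbsent", "Score_2017"] *)
Dataset[KeyUnion[rows]]
```

The same idea should carry over to the pivoted dataset by applying KeyUnion to its list of rows.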

You can use the package "DataReshape.m" as shown below. (It is on my to-do list to submit the corresponding functions to the Wolfram Function Repository very soon.)

Load the package:

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/DataReshape.m"]

Here is the original dataset:

dsData = Dataset[{
   <|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2017, "Score" -> 115|>,
   <|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2018, "Score" -> 230|>,
   <|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2017, "Score" -> 70|>,
   <|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2018, "Score" -> 300|>
   }];
dsData

Let us convert it to a “proper” long form, using every column except “Score” as an ID variable:

dsLong = ToLongForm[dsData, {"Name", "Born", "Year"}, Automatic]

Here we append the value of the column “Year” as a suffix to each value of the column “Variable”:

dsLong2 = 
 dsLong[All, 
  Join[KeyTake[#, {"Name", "Born", "Value"}], <|
     "Variable" -> 
      StringRiffle[{#Variable, ToString[#Year]}, "_"]|>] &]

Here we convert to wide form with ID variables “Name” and “Born”:

ToWideForm[dsLong2, {"Name", "Born"}, "Variable", "Value"]

Both Sjoerd's and Anton's answers work nicely. Building on Sjoerd's solution, which uses MergeByKey from the Wolfram Function Repository, I just want to add for completeness that almost the same solution works with the built-in Merge from the Wolfram Language:

ds = Dataset[{
  <|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2017, "Score" -> 115|>,
  <|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2018, "Score" -> 230|>,
  <|"Name" -> "Joe",    "Born" -> 1981, "Year" -> 2017, "Score" -> 70 |>,
  <|"Name" -> "Joe",    "Born" -> 1981, "Year" -> 2018, "Score" -> 300|>
}];
ds[GroupBy["Name"], Merge[#, Identity] &][Values,
  Append[Map[First, KeyDrop[#, {"Year", "Score"}]], 
    AssociationThread["Score_" <> ToString[#] & /@ #Year, #Score]] &]
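
To see why this works, it may help to look at a single group in isolation. The sketch below uses my own illustrative names (group, merged) and takes the two "Joe" rows by hand. Merge[..., Identity] collects the values of each key into a list, which is why First is mapped over the constant columns and why "Year" and "Score" can be threaded together:

```wolfram
(* The two rows that GroupBy["Name"] would collect for "Joe" *)
group = {
  <|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2017, "Score" -> 70|>,
  <|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2018, "Score" -> 300|>
};

(* Merge with Identity gathers the values of each key into a list *)
merged = Merge[group, Identity]
(* <|"Name" -> {"Joe", "Joe"}, "Born" -> {1981, 1981}, "Year" -> {2017, 2018}, "Score" -> {70, 300}|> *)

(* The years become new column names and the scores become their values *)
AssociationThread["Score_" <> ToString[#] & /@ merged["Year"], merged["Score"]]
(* <|"Score_2017" -> 70, "Score_2018" -> 300|> *)
```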
