Using Dataset with TimeObjects in EventSeries
We can operate upon data contained within a dataset by applying query operators. Assume that the dataset described in the question has been assigned to the variable ds. Then, for example, we can convert the embedded association into an event series by applying the EventSeries operator:
ds[EventSeries]
Alternatively, we could produce a plot by composing the EventSeries and DateListPlot operators:
ds[EventSeries /* DateListPlot]
It is likely that these operators can be applied directly to your original dataset. Let's consider the following dataset:
ds2 = Query[Dataset, DateObject] @
{{2016, 1, 1}, {2016, 1, 1}, {2017, 1, 1}, {2017, 1, 1}, {2017, 1, 1}, {2018, 1, 1}};
As in the question, we could use Counts @ ds2 to get the number of occurrences of each date as an association (contained in a dataset). But instead, let's express this operation in query form:
ds2[Counts]
The advantage of using query operator syntax is that we can now compose it with our other query operators to produce the plot directly from the source dataset:
ds2[Counts /* EventSeries /* DateListPlot]
Dataset query syntax is quite elaborate. It is described in detail by the Dataset and Query documentation.
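As a minimal sketch of how the bracket syntax relates to Query: applying an operator as ds[op] should give the same result as Query[op] @ ds. The names dates, viaBrackets and viaQuery are made up for this illustration, and the three dates are sample data rather than the data from the question.

(* build a small dataset of dates, in the same way as ds2 above *)
dates = Query[Dataset, DateObject] @ {{2016, 1, 1}, {2016, 1, 1}, {2017, 1, 1}};
(* the same Counts operator, applied in bracket form and in explicit Query form *)
viaBrackets = dates[Counts];
viaQuery = Query[Counts] @ dates;
(* both forms should yield the same underlying association *)
Normal[viaBrackets] === Normal[viaQuery]
(* the counts for the two distinct dates *)
Values @ Normal[viaBrackets]

The explicit Query form can be handy when the same operator chain needs to be stored in a variable and reused on several datasets.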
Applying a Function to Multiple Columns
As for the second question, a simple way to apply a function to multiple columns is to use named slot syntax (e.g. #columnName). For example, consider this dataset:
ds3 = Query[Dataset, AssociationThread[{"a", "b", "c"} -> #]&] @ RandomInteger[10, {5, 3}]
We can add together the columns a and c by means of the query operator #a + #c &:
ds3[All, #a + #c &]
Alternatively, we could produce a bar chart of those sums:
ds3[BarChart, #a + #c &]
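Because ds3 is random, here is a small deterministic sketch of the same named-slot idea. The dataset rows, the variables sums and withSum, and the column name "sum" are all hypothetical names chosen for this example. The second query keeps the original columns and appends the computed value as a new one.

(* a fixed two-row dataset with columns a, b and c *)
rows = Dataset[{<|"a" -> 1, "b" -> 2, "c" -> 3|>, <|"a" -> 4, "b" -> 5, "c" -> 6|>}];
(* per-row sum of columns a and c *)
sums = rows[All, #a + #c &];
(* keep every existing column and append the sum as a new column *)
withSum = rows[All, <|#, "sum" -> #a + #c|> &];
Normal[sums]  (* {4, 10} *)

Here # stands for the whole row (an association), so <|#, "sum" -> ...|> merges the existing keys with the new one.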
Dataset has only been around since version 10.0, and not all functions support it yet. However, Association has much broader support and is what Dataset is based on. You can use Normal to convert the Dataset into an Association, and EventSeries can work with associations.
With ds as the Dataset from the post above:
EventSeries[Normal@ds]
(* EventSeries[Time: 31 Dec 2015 12:00:00 to 31 Dec 2015 23:00:00 Data points: 3] *)