jq: group and key by property
The accepted answer doesn't produce a single valid JSON document. Instead, it emits a stream of separate objects:
{
"name1": [
"1.1.1.1",
"1.1.1.2"
]
}
{
"name2": [
"1.1.1.3",
"1.1.1.4"
]
}
The name1 and name2 objects are each valid JSON, but the output as a whole isn't a single JSON document.
The following jq filter produces the desired output as specified in the question:
group_by(.component) | map({ key: (.[0].component), value: [.[] | .ip] }) | from_entries
Output:
{
"name1": [
"1.1.1.1",
"1.1.1.2"
],
"name2": [
"1.1.1.3",
"1.1.1.4"
]
}
Suggestions for simpler approaches are welcome.
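One possibly simpler approach, offered as a sketch rather than the definitive way, is to fold the array with reduce and append each ip under its component key. The sample input below is an assumption reconstructed from the output above, since the question's original data isn't reproduced here.

```shell
# Assumed sample input, reconstructed from the output above (hypothetical).
input='[
  {"component": "name1", "ip": "1.1.1.1"},
  {"component": "name1", "ip": "1.1.1.2"},
  {"component": "name2", "ip": "1.1.1.3"},
  {"component": "name2", "ip": "1.1.1.4"}
]'

# Start from an empty object and, for each element, append its ip to the
# array stored under its component. A missing key is null, and in jq
# null + [x] yields [x], so no initialisation per key is needed.
result=$(printf '%s' "$input" |
  jq -c 'reduce .[] as $x ({}; .[$x.component] += [$x.ip])')

echo "$result"
```

Unlike group_by, which sorts the groups by key, this keeps keys in the order they are first seen in the input.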
If human readability is preferred over valid JSON, I'd suggest something like ...
jq -r 'group_by(.component)[] | "IPs for " + .[0].component + ": " + (map(.ip) | tostring)'
... which results in ...
IPs for name1: ["1.1.1.1","1.1.1.2"]
IPs for name2: ["1.1.1.3","1.1.1.4"]
As a further example of @replay's technique, after many failures using other methods, I finally built a filter that condenses this Wazuh report (excerpted for brevity):
{
"took" : 228,
"timed_out" : false,
"hits" : {
"total" : {
"value" : 2806,
"relation" : "eq"
},
"hits" : [
{
"_source" : {
"agent" : {
"name" : "100360xx"
},
"data" : {
"vulnerability" : {
"severity" : "High",
"package" : {
"condition" : "less than 78.0",
"name" : "Mozilla Firefox 68.11.0 ESR (x64 en-US)"
}
}
}
}
},
{
"_source" : {
"agent" : {
"name" : "100360xx"
},
"data" : {
"vulnerability" : {
"severity" : "High",
"package" : {
"condition" : "less than 78.0",
"name" : "Mozilla Firefox 68.11.0 ESR (x64 en-US)"
}
}
}
}
},
...
Here is the jq filter I use to produce a stream of objects, each mapping an agent name to an array of the names of that agent's vulnerable packages:
jq ' .hits.hits |= unique_by(._source.agent.name, ._source.data.vulnerability.package.name) | .hits.hits | group_by(._source.agent.name)[] | { (.[0]._source.agent.name): [.[]._source.data.vulnerability.package | .name ]}'
Here is an excerpt of the output produced by the filter:
{
"100360xx": [
"Mozilla Firefox 68.11.0 ESR (x64 en-US)",
"VLC media player",
"Windows 10"
]
}
{
"WIN-KD5C4xxx": [
"Windows Server 2019"
]
}
{
"fridxxx": [
"java-1.8.0-openjdk",
"kernel",
"kernel-headers",
"kernel-tools",
"kernel-tools-libs",
"python-perf"
]
}
{
"mcd-xxx-xxx": [
"dbus",
"fribidi",
"gnupg2",
"graphite2",
...
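To make the deduplication step easier to see, here is a minimal sketch that runs the same filter on a tiny made-up report. The agent names (agentA, agentB) and package names (pkg1, pkg2) are hypothetical, and the input keeps only the fields the filter actually reads.

```shell
# Hypothetical, heavily trimmed report: agentA lists pkg1 twice.
report='{"hits":{"hits":[
  {"_source":{"agent":{"name":"agentA"},"data":{"vulnerability":{"package":{"name":"pkg1"}}}}},
  {"_source":{"agent":{"name":"agentA"},"data":{"vulnerability":{"package":{"name":"pkg1"}}}}},
  {"_source":{"agent":{"name":"agentA"},"data":{"vulnerability":{"package":{"name":"pkg2"}}}}},
  {"_source":{"agent":{"name":"agentB"},"data":{"vulnerability":{"package":{"name":"pkg1"}}}}}
]}}'

# unique_by with two comma-separated paths dedupes on the (agent, package)
# pair; group_by then collects the remaining hits per agent.
out=$(printf '%s' "$report" | jq -c '
  .hits.hits |= unique_by(._source.agent.name, ._source.data.vulnerability.package.name)
  | .hits.hits
  | group_by(._source.agent.name)[]
  | { (.[0]._source.agent.name): [.[]._source.data.vulnerability.package | .name] }')

echo "$out"
```

With this input the duplicate pkg1 entry for agentA is dropped before grouping. Since only .hits.hits is used afterwards, writing .hits.hits | unique_by(...) instead of the update-assignment should give the same result.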
I figured it out myself. I first group by .component and then create new lists of IPs, each keyed by the component of the first object in its group:
jq ' group_by(.component)[] | {(.[0].component): [.[] | .ip]}'
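If a single object is wanted rather than one object per group, a small variation is to keep the groups in an array and merge the per-group objects with add. This is only a sketch, and the sample input is an assumption (hypothetical data matching the component/ip fields used above).

```shell
# Assumed sample input (hypothetical).
input='[
  {"component": "name1", "ip": "1.1.1.1"},
  {"component": "name1", "ip": "1.1.1.2"},
  {"component": "name2", "ip": "1.1.1.3"},
  {"component": "name2", "ip": "1.1.1.4"}
]'

# The filter as written above: emits one object per group (a stream).
stream=$(printf '%s' "$input" |
  jq -c 'group_by(.component)[] | {(.[0].component): [.[] | .ip]}')

# Variation: build the same per-group objects inside map, then merge
# them into one object with add.
merged=$(printf '%s' "$input" |
  jq -c 'group_by(.component) | map({(.[0].component): map(.ip)}) | add')

echo "$stream"
echo "$merged"
```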