jq: select when any value is in array
You can combine jq with shell arrays to produce the filter. First, define the keywords as a bash array, as below. Note that bash array definitions separate elements with whitespace and do not accept a comma as a separator. Then join the elements with the regex alternation operator | to build a single pattern that matches any of the keywords:
filter=("first" "second")
echo "$(IFS="|"; echo "${filter[*]}")"
first|second
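The join step can also be wrapped in a small helper so the pattern lands in a variable. This is only a sketch: the function name join_by and the variable re are made up for illustration, and it assumes bash. Keywords are used as-is, so any regex metacharacters in them are not escaped.

```shell
# Hypothetical helper: join all remaining arguments with the first argument.
join_by() {
  local IFS="$1"        # "$*" joins arguments with the first character of IFS
  shift
  printf '%s\n' "$*"
}

filter=("first" "second")
re=$(join_by "|" "${filter[@]}")
echo "$re"              # prints first|second
```

Because IFS is declared local, the change does not leak into the rest of the script, which is the same reason the one-liner above runs the assignment inside a subshell.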
You haven't mentioned whether the keyword must appear at the start or the end of .title, or whether it can appear anywhere. This regex matches the keyword anywhere in the string.
Now use this pattern in jq to match against the .title string, as below. Note the use of not to negate the result; to select the entries that do match, remove the |not part.
jq --arg re "$(IFS="|"; echo "${filter[*]}")" '[.[] | select(.title|test($re)|not)]' < json
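To see both variants side by side, here is a minimal end-to-end sketch. The sample JSON is invented for illustration (the question's real data is not shown here), and the variable names non_matching and matching are likewise assumptions; it needs bash and jq 1.5 or later for test.

```shell
# Hypothetical sample data; only the .title field matters here.
json='[{"title":"first post"},{"title":"something else"},{"title":"the second one"}]'

filter=("first" "second")
re=$(IFS="|"; echo "${filter[*]}")

# Titles that match none of the keywords (note the |not).
non_matching=$(jq -c --arg re "$re" '[.[] | select(.title | test($re) | not)]' <<<"$json")

# Titles that match at least one keyword (|not removed).
matching=$(jq -c --arg re "$re" '[.[] | select(.title | test($re))]' <<<"$json")

echo "$non_matching"    # [{"title":"something else"}]
echo "$matching"        # [{"title":"first post"},{"title":"the second one"}]
```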
Problems that involve the word "any" can often be solved with jq's any, e.g. using your shell variable (which must contain valid JSON, such as an array of strings, because it is passed with --argjson):
jq --argjson filter "$filter" '
map((.title | split(" ")) as $title
| select(any( $title[] as $t
| $filter[] as $kw
| $kw == $t )))' input.json
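For a self-contained run of this approach, the sketch below builds the JSON array from a bash array and applies the same jq program to invented sample data. The names words, json and result are assumptions for illustration, and the pipeline through jq -R and jq -s is just one possible way to produce the JSON text.

```shell
# Hypothetical keyword list, converted to JSON text for --argjson.
words=("first" "second")
filter=$(printf '%s\n' "${words[@]}" | jq -R . | jq -cs .)
echo "$filter"          # ["first","second"]

# Hypothetical sample data.
json='[{"title":"first post"},{"title":"firstly, no"},{"title":"the second one"}]'

result=$(jq -c --argjson filter "$filter" '
  map((.title | split(" ")) as $title
      | select(any( $title[] as $t
                    | $filter[] as $kw
                    | $kw == $t )))' <<<"$json")
echo "$result"          # [{"title":"first post"},{"title":"the second one"}]
```

Because titles are split on spaces and compared for equality, "firstly, no" is not selected here, whereas a regex test for first would match it.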
Negation
As in formal logic, you can use all or any (in conjunction with negation) to solve the negated problem. If you use not, remember that jq's not is a zero-arity filter: it takes no arguments and is applied with a pipe, as in X | not.
jq --argjson filter "$filter" '
map((.title | split(" ")) as $title
| select(all( $title[] as $t
| $filter[] as $kw
| $kw != $t )))' input.json
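As a quick check of that equivalence, the sketch below runs the all form and an any form followed by | not on the same invented sample data; both should select the same objects. The variable names and the data are assumptions for illustration.

```shell
filter='["first","second"]'   # JSON text, as required by --argjson
# Hypothetical sample data.
json='[{"title":"first post"},{"title":"firstly, no"},{"title":"the second one"}]'

# Negation expressed with all and != .
with_all=$(jq -c --argjson filter "$filter" '
  map((.title | split(" ")) as $title
      | select(all( $title[] as $t
                    | $filter[] as $kw
                    | $kw != $t )))' <<<"$json")

# Negation expressed with any and == , then piped into the zero-arity not.
with_not=$(jq -c --argjson filter "$filter" '
  map((.title | split(" ")) as $title
      | select(any( $title[] as $t
                    | $filter[] as $kw
                    | $kw == $t ) | not))' <<<"$json")

echo "$with_all"        # [{"title":"firstly, no"}]
echo "$with_not"        # [{"title":"firstly, no"}]
```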
Other approaches
The above uses "keyword matching" as that is what the question specifies, but of course the above jq expressions can easily be modified to use regexes or some other type of matching.
If the list of keywords is very long, then a better algorithm for array-intersection would no doubt be desirable.
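One possible direction for long keyword lists, offered only as a sketch: turn the keywords into an object used as a lookup table, so each title word is checked with has instead of being compared against every keyword. The names $set, json and result and the sample data are assumptions, and whether this is actually faster for a given input has not been measured here.

```shell
filter='["first","second"]'   # JSON text, as required by --argjson
# Hypothetical sample data.
json='[{"title":"first post"},{"title":"firstly, no"},{"title":"the second one"}]'

result=$(jq -c --argjson filter "$filter" '
  # Build {"first": true, "second": true} once, then reuse it for every title.
  (reduce $filter[] as $kw ({}; .[$kw] = true)) as $set
  | map(select(any(.title | split(" ")[];      # each word of the title
                   . as $w | $set | has($w)))) # is the word a keyword?
' <<<"$json")
echo "$result"          # [{"title":"first post"},{"title":"the second one"}]
```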