Match only the first paragraph using bash
This is a tailor-made problem for GNU awk, thanks to its support for a regex record separator. We can set RS so that it splits the input on 2 or more line breaks, each being an optional \r followed by \n:
awk -v RS='(\r?\n){2,}' 'NR == 1' file
This outputs:
Foo
* Bar
If you want awk to be more efficient when the input is very big, exit after printing the first record instead of reading the rest of the file:
awk -v RS='(\r?\n){2,}' '{print; exit}' file
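You can use GNU grep like this (-P enables PCRE syntax, -o prints only the matched text, and -z makes grep use NUL instead of newline as the line terminator, so the pattern can match across line breaks):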
grep -Poz '(?s)^.+?(?=\R{2}|$)' file
See the PCRE regex demo.
Details
(?s) - a DOTALL inline modifier that makes . match all chars, including line break chars
^ - start of the whole string
.+? - any 1 or more chars, as few as possible
(?=\R{2}|$) - a positive lookahead that matches a location immediately followed by a double line break sequence (\R{2}) or the end of string ($).
For GNU awk, if the paragraphs are separated by \r\n\r\n or \n\n:
$ awk -v RS="\r?\n\r?\n" '{print $0;exit}' file
Output:
Foo
* Bar
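One detail worth checking when the whole file uses \r\n line endings: the record separator only consumes the paragraph break, so any \r inside the first paragraph stays in the record. The sketch below is an assumption-laden illustration, not part of the original answer: the sample content, the temporary file, the awk_bin variable, and the added gsub call are all hypothetical additions that strip those carriage returns before printing.

```shell
# Prefer gawk if it is on PATH; fall back to whatever awk is available.
awk_bin=$(command -v gawk || command -v awk)

# Hypothetical sample that uses CRLF line endings throughout.
crlf_sample=$(mktemp)
printf 'Foo\r\n* Bar\r\n\r\nBaz\r\n' > "$crlf_sample"

# Same RS as above; gsub removes the carriage returns left inside the record.
first_clean=$("$awk_bin" -v RS="\r?\n\r?\n" '{gsub(/\r/, ""); print; exit}' "$crlf_sample")

# Holds the two lines "Foo" and "* Bar", without any \r characters.
printf '%s\n' "$first_clean"

rm -f "$crlf_sample"
```

If the carriage returns should be preserved in the output, drop the gsub call and keep the original {print $0;exit} action.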