Separate fields with a multi-character delimiter using awk
I think the problem you are facing is related to the following statement in the (GNU) awk manpage [1]:
If FS is a single character, fields are separated by that character. If FS is the null string, then each individual character becomes a separate field. Otherwise, FS is expected to be a full regular expression.
Since your field-delimiting pattern contains characters that have a special meaning in regular expressions (the | and the ^), you need to escape them properly. Because awk parses the value twice (first as a string literal, where \\ becomes \, and then as a regular expression), you need to write the escapes with double backslashes, as in:
awk -F '\\|~\\^' '{print $2}' input.txt
Resulting output for your example:
20200425
abc
abc
abc
abc
abc
abc
20200425
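As a self-contained illustration of the two cases from the manpage quote, here is a minimal sketch. The sample lines are hypothetical (the original input.txt is not reproduced here), and the variable names sample, single and multi are made up for the demonstration.

```shell
# Hypothetical sample data standing in for input.txt.
sample=$(mktemp)
printf '%s\n' \
  'H|~^20200425|~^abc' \
  'D|~^abc|~^123' \
  'T|~^20200425|~^2' > "$sample"

# Single-character FS: the | is taken literally, no escaping needed.
single=$(printf 'a|b\n' | awk -F '|' '{print $2}')

# Multi-character FS: treated as a regular expression, so | and ^ are
# escaped, with the backslashes doubled because the string is parsed first.
multi=$(awk -F '\\|~\\^' '{print $2}' "$sample")

printf '%s\n' "$single" "$multi"
rm -f "$sample"
```

With this sample, the first command should print b, and the second should print the second field of each of the three lines.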
To consider only those lines starting with T, use
awk -F '\\|~\\^' '/^T/ {print $2}' input.txt
or alternatively, by selecting only lines where a certain field (here, the first field) has a value of T:
awk -F '\\|~\\^' '$1=="T" {print $2}' input.txt
Result for your example in both cases:
20200425
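The two variants agree on your example, but they are not equivalent in general: /^T/ matches any line that begins with the letter T, while $1=="T" requires the first field to be exactly T. A small sketch of the difference, using hypothetical data that includes a made-up TOTAL line:

```shell
# Hypothetical sample data; the TOTAL line is invented to show the difference.
sample=$(mktemp)
printf '%s\n' \
  'H|~^20200425|~^abc' \
  'TOTAL|~^999|~^x' \
  'T|~^20200425|~^2' > "$sample"

# Regex on the whole line: selects both the TOTAL line and the T line.
by_regex=$(awk -F '\\|~\\^' '/^T/ {print $2}' "$sample")

# Exact comparison on the first field: selects only the T line.
by_field=$(awk -F '\\|~\\^' '$1=="T" {print $2}' "$sample")

printf 'regex match:\n%s\n' "$by_regex"
printf 'field match:\n%s\n' "$by_field"
rm -f "$sample"
```

If other record types in your file could start with T, the field comparison is probably the safer choice.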
Note that, in general, the combined use of awk, grep and sed is rarely necessary. Furthermore, all of these tools can read files directly, so using cat to feed them the text to process is also unnecessary.
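To make that concrete, here is a sketch comparing a pipeline of that kind with a single awk call. The pipeline, the date= prefix and the sample data are all hypothetical and only serve to show that both forms can produce the same result.

```shell
# Hypothetical sample data standing in for input.txt.
sample=$(mktemp)
printf '%s\n' \
  'H|~^20200425|~^abc' \
  'D|~^abc|~^123' \
  'T|~^20200425|~^2' > "$sample"

# A made-up pipeline chaining cat, grep, awk and sed.
piped=$(cat "$sample" | grep '^T' | awk -F '\\|~\\^' '{print $2}' | sed 's/^/date=/')

# The same selection, field extraction and prefixing in one awk call
# that reads the file directly.
direct=$(awk -F '\\|~\\^' '/^T/ {print "date=" $2}' "$sample")

printf '%s\n' "$piped" "$direct"
rm -f "$sample"
```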
[1]: As an (unrelated) side note: The part with the "null string" does not work on all Awk variants. The GNU Awk manual states "This is a common extension; it is not specified by the POSIX standard".