system() yields inconsistent results
Using shell
Try:
$ while read -r line; do date +%s -d "${line%%,*}"; done < input.csv
1597725964
1597726023
1597726083
1597726144
How it works
while read -r line; do
    starts a while loop and reads a line from stdin.
"${line%%,*}"
    strips the first comma and everything after it from the line.
date +%s -d "${line%%,*}"
    prints the date as epoch.
done
    completes the while loop.
<input.csv
    provides the stdin to the loop.
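To make the parameter expansion concrete, here is a small sketch using the first record from the sample output; the variable names first and short are just illustrative. It contrasts %% (longest match, cut at the first comma) with % (shortest match, cut at the last comma).

```shell
line='08/17/2020 21:46:04 -700 , 1 , 2 , 3'

# %% removes the longest suffix matching ",*", i.e. everything from the first comma on.
first="${line%%,*}"
# %  removes the shortest suffix matching ",*", i.e. only the last field.
short="${line%,*}"

printf '[%s]\n' "$first"   # [08/17/2020 21:46:04 -700 ]
printf '[%s]\n' "$short"   # [08/17/2020 21:46:04 -700 , 1 , 2 ]
```

The trailing space left in first is harmless here, since date ignores it when parsing.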
Variation
This prints the full line and adds the epoch as the final column:
$ while read -r line; do printf "%s, %s\n" "$line" "$(date +%s -d "${line%%,*}")"; done < input.csv
08/17/2020 21:46:04 -700 , 1 , 2 , 3, 1597725964
08/17/2020 21:47:03 -700 , 1 , 2 , 3, 1597726023
08/17/2020 21:48:03 -700 , 1 , 2, 1597726083
08/17/2020 21:49:04 -700 , 1 , 2, 1597726144
In awk you can pipe a command's output into getline instead of using system():
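< input.csv awk -F' , ' '{
    # \047 is a single quote; the date string is passed as one quoted argument
    cmd = "date +%s -d \047" $1 "\047"
    cmd | getline date
    close(cmd)   # release the pipe so long inputs do not exhaust file descriptors
    print date
}'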
1597725964
1597726023
1597726083
1597726144
With the help of Inian and oguz ismail in the comments, and gawk, we came up with a better solution, which writes to date's stdin instead of passing the arguments to it on the command line. That's better because interpolating variables into a command line always comes with the risk of shell command injection (via input.csv).
< input.csv gawk -F' , ' '{
    cmd = "date +%s -f-"
    print $1 |& cmd
    close(cmd, "to")
    if ((cmd |& getline line) > 0)
        print line
    close(cmd)
}'
1597725964
1597726023
1597726083
1597726144
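To illustrate the injection risk mentioned above, here is a minimal sketch. The record is hypothetical (it is not from input.csv), and the payload is a harmless echo. It uses the variant that interpolates the field into the command line.

```shell
# Hypothetical malicious record: the first field closes the quote and appends a command.
record="not-a-date'; echo INJECTED # , 1 , 2"

# The shell ends up running:  date +%s -d 'not-a-date'; echo INJECTED #'
injected=$(printf '%s\n' "$record" | awk -F' , ' '{
    cmd = "date +%s -d \047" $1 "\047"
    while ((cmd | getline out) > 0)
        print out
    close(cmd)
}' 2>/dev/null)

printf '%s\n' "$injected"
```

The output should include the line INJECTED, which shows that text from the data file was executed as a command. With the date -f- variant the same record only reaches date as data on its stdin, so date merely rejects it as an invalid date.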
Thanks to both!
The call to system(...) returns zero, thus tmp is assigned $(0), i.e. the whole input line. Observe:
$ echo a b c d | awk '{ x = $(system("exit 3")); print x }'
c
You can't capture a shell command's output using the system function in awk; hek2mgl's answer demonstrates how to do it correctly.
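A short sketch of that point, with an arbitrary command chosen for illustration: the command's output goes straight to awk's stdout, while the value that system() hands back is only the exit status.

```shell
# "hello" is written by the child shell itself; rc receives the exit status (3).
out=$(awk 'BEGIN { rc = system("echo hello; exit 3"); print "rc=" rc }')

printf '%s\n' "$out"
# hello
# rc=3
```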
Then in the printf(...) call $tmp is expanded to $8, because the longest prefix in $0 that constitutes a valid number is 08; hence the commas in the output. This can be demonstrated like so:
$ echo foo bar | awk '{ x = "0002junk"; print $x }'
bar
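The same effect can be sketched with the first record from the sample output, assuming awk's default whitespace field splitting (which is what the explanation above implies). Used as a number, the line evaluates to 8, and the eighth whitespace-separated field is a lone comma.

```shell
line='08/17/2020 21:46:04 -700 , 1 , 2 , 3'

# tmp + 0 forces the string-to-number conversion; $tmp then means $8.
result=$(printf '%s\n' "$line" | awk '{ tmp = $0; print tmp + 0; print $tmp }')

printf '%s\n' "$result"
# 8
# ,
```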
Anyway, to achieve the task described in the question, you don't really need awk. A combination of cut and GNU date yields the desired output.
$ cut -d, -f1 input.csv | date -f- +%s
1597725964
1597726023
1597726083
1597726144
And using paste, you can append these timestamps to the corresponding records, if you don't mind the missing spaces around the final comma.
$ cut -d, -f1 input.csv | date -f- +%s | paste -d, input.csv -
08/17/2020 21:46:04 -700 , 1 , 2 , 3,1597725964
08/17/2020 21:47:03 -700 , 1 , 2 , 3,1597726023
08/17/2020 21:48:03 -700 , 1 , 2,1597726083
08/17/2020 21:49:04 -700 , 1 , 2,1597726144