Fastest way to convert a tab-delimited file to CSV in Linux

If all you need to do is translate all tab characters to comma characters, tr is probably the way to go.

Here printf emits a literal tab between the two words:

$ printf 'hello\tworld\n' | tr "\\t" ","
hello,world

Of course, if you have embedded tabs inside string literals in the file, this will incorrectly translate those as well; but embedded literal tabs would be fairly uncommon.
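
To make the caveat concrete, here is a minimal Python sketch of what a blind tab-to-comma translation does to awkward input. The helper name naive_convert and the sample rows are made up for illustration; the function only imitates tr and is not a recommended converter.

```python
import csv
import io


def naive_convert(line: str) -> str:
    """Imitate tr: replace every tab with a comma, ignoring any quoting."""
    return line.replace("\t", ",")


# A field that already contains a comma turns into two fields.
converted = naive_convert("Smith, John\t42")
print(converted)                                 # Smith, John,42
print(next(csv.reader(io.StringIO(converted))))  # ['Smith', ' John', '42']

# A tab inside a quoted field is silently rewritten as a comma.
print(naive_convert('"a\tb"\tc'))                # "a,b",c
```

The first row started with two fields and ends up with three, which is the case the csv-aware approaches further down are meant to handle.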


perl -lpe 's/"/""/g; s/^|$/"/g; s/\t/","/g' < input.tab > output.csv

Perl is generally faster at this sort of thing than sed, awk, or Python.
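
For readers who do not read Perl regexes fluently, the three substitutions in the one-liner can be sketched roughly in Python as below. The function name quote_all_fields is hypothetical, and the sketch works on a single line with its newline already removed; it is meant to explain the logic, not to match Perl on every edge case or on speed.

```python
def quote_all_fields(line: str) -> str:
    """Turn one TSV line (newline already stripped) into a fully quoted CSV line."""
    escaped = line.replace('"', '""')      # like s/"/""/g: double any embedded quotes
    joined = escaped.replace("\t", '","')  # like s/\t/","/g: each tab becomes ","
    return '"' + joined + '"'              # like s/^|$/"/g: quote both ends of the line


print(quote_all_fields('say "hi"\tx,y'))   # "say ""hi""","x,y"
```

Because every field ends up quoted, commas inside a field no longer split it.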


If you're worried about embedded commas then you'll need to use a slightly more intelligent method. Here's a Python script that takes TSV lines from stdin and writes CSV lines to stdout:

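import csv
import sys

# Read tab-separated rows from stdin and write comma-separated rows to stdout.
# csv.writer quotes any field that contains a comma, a quote, or a newline.
tabin = csv.reader(sys.stdin, dialect=csv.excel_tab)
commaout = csv.writer(sys.stdout, dialect=csv.excel)
for row in tabin:
    commaout.writerow(row)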

Run it from a shell as follows:

python script.py < input.tsv > output.csv
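
If you would rather pass file names than use shell redirection, the same idea can be wrapped in a function. This is a sketch under assumptions: tsv_to_csv and its parameter names are invented here, and the files are opened with newline="" as the csv module documentation recommends.

```python
import csv


def tsv_to_csv(src_path: str, dst_path: str) -> None:
    """Rewrite the TSV file at src_path as a CSV file at dst_path."""
    # newline="" leaves line-ending handling to the csv module.
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, dialect=csv.excel_tab)
        writer = csv.writer(dst, dialect=csv.excel)
        writer.writerows(reader)
```

Note that the excel dialect ends each row with \r\n; passing lineterminator="\n" to csv.writer gives Unix line endings instead.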

  • If you want to convert the whole TSV file into a CSV file:

    $ tr "\\t" "," < data.tsv > data.csv

  • If you want to omit some fields:

    $ cut -f1,2,3 data.tsv | tr "\\t" "," > data.csv

    The above command converts data.tsv to data.csv, keeping only the first three fields.
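
The field-selection variant can also be sketched in Python for cases where fields may contain commas. The function name select_columns and its keep parameter are hypothetical, and the indexes are zero-based, so (0, 1, 2) corresponds to cut -f1,2,3.

```python
import csv


def select_columns(src, dst, keep=(0, 1, 2)):
    """Copy the columns at the zero-based indexes in keep from a TSV stream to a CSV stream."""
    writer = csv.writer(dst)
    for row in csv.reader(src, dialect=csv.excel_tab):
        # Skip indexes that a short row does not have instead of raising IndexError.
        writer.writerow([row[i] for i in keep if i < len(row)])
```

Calling select_columns(sys.stdin, sys.stdout) after importing sys would let it be used in a pipeline in the same way as the script above.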
