how to remove fields from a file in linux and output to a new file code example
Example: bash remove columns
# Basic syntax using cut:
cut -d'delimiter' -f fields/columns input_file
# Where:
# - Fields/columns specify the columns you want to keep. They can
# be specified in the following ways:
# -f 2,7,8 where only columns 2, 7, and 8 are returned
# -f 2-8 where columns 2 to 8 inclusive are returned
# -f -8 where columns 1-8 inclusive are returned
# -f 8- where columns 8-end inclusive are returned
# Note, the default delimiter is tab, so you can omit the -d'delimiter'
# if your file is tab-delimited
# Example usage:
cut -f 2,7 input_file # Return columns 2, 7 of a tab-delimited input_file
cut -d' ' -f 2-5 input_file # Return columns 2 to 5 of a space-delimited input_file
cut -d',' -f 3- input_file # Return columns 3 to the end of a comma-delimited input_file
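# To save the result to a new file, redirect cut's output with >.
# A minimal sketch, assuming a hypothetical tab-delimited sample.tsv and
# made-up output names (kept.tsv, dropped.tsv). Note, --complement is a
# GNU cut extension and may not be available on other systems.

```shell
# Build a small tab-delimited sample file (hypothetical data)
printf 'id\tname\tcity\tage\n1\tAnn\tOslo\t31\n2\tBob\tRome\t45\n' > sample.tsv

# Keep columns 1, 2, and 4 (i.e. remove column 3) and write to a new file
cut -f 1,2,4 sample.tsv > kept.tsv

# GNU cut only (assumption): name the column to remove instead of the ones to keep
cut --complement -f 3 sample.tsv > dropped.tsv
```

# With GNU cut, both output files should hold the same three columns.
# Never redirect to the input file itself, since the shell truncates it
# before cut reads it.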
# Basic syntax using awk:
awk 'BEGIN {FS="input_delimiter"; OFS="output_delimiter"}; {print fields/columns}' input_file
# Where:
# - BEGIN specifies what to do before processing the input_file (here
# we set the input and output delimiters)
# - The input_delimiter specifies the delimiter/separator of the
# input and output_delimiter specifies the delimiter of the output
# - Fields/columns specify the columns you want to keep.
# Note, in awk, by default any run of spaces and/or tabs and/or newlines
# is treated as a single field/column separator for the input, and
# leading and trailing whitespace is ignored. Consequently, if your file
# is whitespace-delimited, the FS portion can be omitted.
# Note, in awk, the default output delimiter is a space.
# Note, use commas between the fields to be printed to use the
# output_delimiter between those fields.
# Example usage:
awk '{print $2$7$8}' input_file # Return fields 2, 7, and 8 from the
# input_file separated by nothing (i.e. squished together)
awk '{print $2,$7,$8}' input_file # Return fields 2, 7, and 8 from the
# input_file separated by spaces
awk 'BEGIN {FS=","} {print $2,$7}' input_file # Return comma-separated
# fields 2 and 7 from the input_file separated by spaces
awk 'BEGIN {FS=" "; OFS="\t"} {print $2,$7}' input_file # Return
# space-separated fields 2 and 7 from the input_file separated by tabs
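# To write awk's output to a new file, redirect it with > as well.
# A minimal sketch, assuming a hypothetical comma-delimited sample.csv and
# made-up output names (kept_awk.tsv, dropped_awk.csv). The loop is one
# possible way to remove a single column by number, not the only one.

```shell
# Build a small comma-delimited sample file (hypothetical data)
printf 'id,name,city,age\n1,Ann,Oslo,31\n2,Bob,Rome,45\n' > sample.csv

# Keep comma-separated fields 1, 2, and 4 and write them tab-separated to a new file
awk 'BEGIN {FS=","; OFS="\t"} {print $1,$2,$4}' sample.csv > kept_awk.tsv

# Remove field 3 by rebuilding each line from every other field
awk 'BEGIN {FS=","; OFS=","}
{
    line = ""; sep = ""
    for (i = 1; i <= NF; i++) {
        if (i == 3) continue      # skip the column being removed
        line = line sep $i
        sep = OFS                 # use the output delimiter after the first kept field
    }
    print line
}' sample.csv > dropped_awk.csv
```

# The loop form is handy when a file has many columns and only one or two
# need to go, since the columns to keep do not have to be listed.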