Replacing the values in one file with the values in another file in bash

You can use awk for this:

awk -F',|, ' 'NR==FNR{a[$2]=$3} NR>FNR{$8=a[$8];print}' OFS=',' "$file2" "$file1"

This reads model-list.csv, storing all the models and their descriptions into a string-indexed array (e.g. a["Z800"] == "HP Z800 Workstation"). Then it reads through the list data, replacing each model with the corresponding description string from the array.

Explanation:

  • -F',|, ' - this sets the field separator using a regex pattern; in this case the field separator will be either a single comma or a comma followed by a single space.
  • NR==FNR{a[$2]=$3} - NR is an awk internal variable which keeps track of the total number of rows read since the program began. FNR is similar, but keeps track of the number of rows of the current file which have been read. So NR==FNR is an awk idiom which means "if this is the first file to be read", and the associated action is a[$2]=$3 which saves the value of field 3 in the array a, with the string index being set to the value of field 2.
  • NR>FNR{$8=a[$8];print} - similar to the previous, but this time operates only on files other than the first to be read. For each line, we use the value of field 8 as the index to look up the value in the array, then re-assign field 8 to that array value. Finally, the whole line is printed.
  • OFS=',' "$file2" "$file1" - sets the output field separator to a comma (default is space), then reads in 2 files in the specified order.
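A quick way to see this in action, using invented sample rows that match the layout described (the file contents here are assumptions for illustration):

```shell
# Sample lookup table: manufacturer, model number, description
cat > model-list.csv <<'EOF'
HP, Z800, HP Z800 Workstation
Dell, T3500, Dell Precision T3500
EOF

# Sample data file: the model number is the 8th field
cat > List.csv <<'EOF'
NYC,10.0.0.1,host1,example.com,HostOne,jsmith,HP,Z800,asset1
BOS,10.0.0.2,host2,example.com,HostTwo,mjones,Dell,T3500,asset2
EOF

file1=List.csv
file2=model-list.csv

out=$(awk -F',|, ' 'NR==FNR{a[$2]=$3} NR>FNR{$8=a[$8];print}' OFS=',' "$file2" "$file1")
printf '%s\n' "$out"
```

Note that as soon as $8 is reassigned, awk rebuilds the whole record using OFS, which is why the output is uniformly comma-separated.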

Some notes:

  • Bash is a terrible language for database emulation. Are you sure you cannot use a relational database for this?
  • Avoid useless uses of cat. You can do grep ",$ModelNo," "$file1".
  • You can do while IFS=, read -r _ ModelNo ModelName _ to avoid the awk lines.
  • In Bash you can do my_command <<< "$variable" instead of echo "$variable" | my_command.
  • You should use $(my_command) instead of `my_command` for readability.
  • grep -F will search for literal strings.
  • You can check the exit code of grep to see if it found anything. That should be faster than checking the file size.
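A few of those tips combined, on a hypothetical one-line List.csv (the file name, row and ModelNo are invented for illustration):

```shell
printf '%s\n' 'NYC,10.0.0.1,host1,example.com,HostOne,jsmith,HP,Z800,asset1' > List.csv
ModelNo='Z800'

# No cat needed: grep reads the file itself; -F matches the string literally,
# and the exit code tells us whether anything matched.
if match=$(grep -F ",$ModelNo," List.csv); then
    found=yes
else
    found=no
fi

# $(...) instead of backticks, and a here-string instead of echo "$match" | cut ...
model=$(cut -d, -f8 <<< "$match")
printf '%s\n' "$model"
```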

In bash, assuming a bash version >= 4, you can do this very easily using associative arrays:

#!/usr/bin/env bash

## declare models as an associative array
declare -A models

## read the 1st file, loading the Value => Model
## pairs into the models array. Note that I'm setting bash's
## Internal Field Separator ($IFS) to comma (,) and that I first pass
## the file through sed to remove the spaces after the commas.
## For more on why I'm using <() instead of a pipe, see 
## http://stackoverflow.com/q/9985076/1081936
while IFS=, read -r man val mod; 
do 
    models["$val"]="$mod" 
done <  <(sed  's/, /,/g' "$1") 


## Read the second file. I am defining 9 variables, 8 for
## the first 8 fields, up to the model and $rest for the rest of 
## the fields, up to the end of the line.
while IFS=',' read -r loc ip host dom dnam user manu model rest; 
do
   printf "%s,%s,%s,%s,%s,%s,%s,%s,%s\n" "$loc" "$ip" "$host" "$dom" \
          "$dnam" "$user" "$manu" "${models[$model]}" "$rest";
done <  <(sed  's/, /,/g' "$2") 
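To see the same logic run end to end, here is an inline version on invented sample files (the file contents are assumptions, and this needs bash 4+ for declare -A):

```shell
#!/usr/bin/env bash
cat > model-list.csv <<'EOF'
HP, Z800, HP Z800 Workstation
EOF
cat > List.csv <<'EOF'
NYC,10.0.0.1,host1,example.com,HostOne,jsmith,HP,Z800,asset1
EOF

# Build the model => description lookup table
declare -A models
while IFS=, read -r man val mod; do
    models["$val"]="$mod"
done < <(sed 's/, /,/g' model-list.csv)

# Re-emit the data file with field 8 replaced from the table
out=$(
    while IFS=',' read -r loc ip host dom dnam user manu model rest; do
        printf '%s,%s,%s,%s,%s,%s,%s,%s,%s\n' "$loc" "$ip" "$host" "$dom" \
               "$dnam" "$user" "$manu" "${models[$model]}" "$rest"
    done < <(sed 's/, /,/g' List.csv)
)
printf '%s\n' "$out"
```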

Caveats:

  1. This will fail on the first line of the specific List.csv you posted because model-list.csv has Model Name where List.csv has Model. This means that there will be no match for ${models[$model]} on the first line. You can fix this either by editing the header of one of the files so that the field names are identical or by using this version instead:

    #!/usr/bin/env bash
    
    declare -A models
    while IFS=, read -r man val mod; 
    do 
        models["$val"]="$mod" 
    done <  <(sed  's/, /,/g' "$1") 
    ## Set up a counter to hold the line numbers
    c=0;
    
    while IFS=',' read -r loc ip host dom dnam user manu model rest; 
    do
        ## Increment the line number
        (( c++ ));
        ## If this is the 1st line, print
        if [ "$c" -eq 1 ]; then
            printf "%s,%s,%s,%s,%s,%s,%s,%s,%s\n" "$loc" "$ip" "$host" "$dom" \
                "$dnam" "$user" "$manu" "$model" "$rest";
        else
            printf "%s,%s,%s,%s,%s,%s,%s,%s,%s\n" "$loc" "$ip" "$host" "$dom" \
                "$dnam" "$user" "$manu" "${models[$model]}" "$rest";
        fi
    done <  <(sed  's/, /,/g' "$2") 
    
  2. This assumes that your file is as simple as you show, that all fields are defined by commas and that no fields can contain commas.
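To illustrate that caveat: read splitting on IFS=',' knows nothing about CSV quoting, so a field containing a comma (the quoted row below is invented) silently shifts every later field:

```shell
# A row where the 6th field is quoted and contains a comma
line='NYC,10.0.0.1,host1,example.com,HostOne,"Doe, Jane",jsmith,Z800,asset1'

IFS=',' read -r loc ip host dom dnam user manu model rest <<EOF
$line
EOF

# The embedded comma splits "Doe, Jane" in two, so $model now holds
# the username instead of the model number.
printf '%s\n' "$model"
```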


In Perl this could of course be done much more simply:

perl -F',\s*' -lane '$k{$F[1]}=$F[2]; next if $#F < 4; s/$F[7]/$k{$F[7]}/; print' model-list.csv List.csv 

Explanation

  • -F sets the field delimiter (here a , followed by 0 or more whitespace characters) which is used with -a which automatically splits each input line into the @F array.
  • -l turns on automatic removal of the \n at the end of each line and also adds an implicit \n to each print statement.
  • -n means read the input file line by line and apply whatever script was passed with -e to it.
  • $k{$F[1]}=$F[2] : this populates the hash %k where the 2nd field of each line is the key and the value is the 3rd field. This is only relevant for model-list.csv but will also be run for List.csv. That is harmless as long as List.csv never contains an 8th field that is also present as a 2nd field in model-list.csv.
  • next if $#F < 4 : skip to the next line if this one has fewer than 4 fields. This prevents the final print from printing the lines of model-list.csv, which have only 3 fields.
  • s/$F[7]/$k{$F[7]}/; print : substitute the 8th field of the current line with whatever is stored in the %k hash for that field and print the line.