How to split a file name into variables?
With zsh:
file='INT_V1_<Product>_<ID>_<Name>_<ddmmyy>.csv'
setopt extendedglob
if [[ $file = (#b)*_(*)_(*)_(*)_(*).csv ]]; then
  product=$match[1] id=$match[2] name=$match[3] date=$match[4]
fi
With bash 4.3 or newer, ksh93t or newer, or zsh in sh emulation, you could split the string on _ characters and reference the fields from the end (though in zsh, you'd rather simply do field=("${(@s:_:)file}") for splitting than use the nonsensical split+glob operator of sh):
IFS=_
set -o noglob
field=($file) # split+glob operator
date=${field[-1]%.*}
name=${field[-2]}
id=${field[-3]}
product=${field[-4]}
Or (bash 3.2 or newer):
if [[ $file =~ .*_(.*)_(.*)_(.*)_(.*)\.csv$ ]]; then
  product=${BASH_REMATCH[1]}
  id=${BASH_REMATCH[2]}
  name=${BASH_REMATCH[3]}
  date=${BASH_REMATCH[4]}
fi
(That assumes $file contains valid text in the current locale, which is not guaranteed for file names unless you fix the locale to C or another locale with a single-byte-per-character charset.)
Like zsh's * above, the .* is greedy, so the first one will eat as many *_ as possible and the remaining .* will only match _-free strings.
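To see the greediness in action, here is a small sketch with a made-up file name (Widget, 42, Alice and 010124 are hypothetical values standing in for the placeholders). The prefix INT_V1 itself contains an underscore, yet the leading .* swallows it and the four captures still come from the right-hand end. The regex is kept in a variable, which is only a stylistic choice here:

```shell
# bash 3.2 or newer. The file name below is a made-up example.
file='INT_V1_Widget_42_Alice_010124.csv'

# Leading .* is greedy: it takes "INT_V1", leaving exactly four
# underscore-free fields for the capture groups.
re='.*_(.*)_(.*)_(.*)_(.*)\.csv$'

if [[ $file =~ $re ]]; then
  product=${BASH_REMATCH[1]}
  id=${BASH_REMATCH[2]}
  name=${BASH_REMATCH[3]}
  date=${BASH_REMATCH[4]}
  printf '%s\n' "product=$product id=$id name=$name date=$date"
fi
```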
With ksh93, you could do:
pattern='*_(*)_(*)_(*)_(*).csv'
product=${file//$pattern/\1}
id=${file//$pattern/\2}
name=${file//$pattern/\3}
date=${file//$pattern/\4}
In a POSIX sh script, you could use the standard ${var#pattern} and ${var%pattern} parameter expansion operators:
rest=${file%.*} # remove .csv suffix
date=${rest##*_} # remove everything on the left up to the rightmost _
rest=${rest%_*} # remove one _* from the right
name=${rest##*_}
rest=${rest%_*}
id=${rest##*_}
rest=${rest%_*}
product=${rest##*_}
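As a minimal sketch, the same steps can be wrapped in a function so a script can reuse them. The function name split_name and the sample file name are made up for illustration; the expansions are the ones shown above.

```shell
# split_name FILE: sets product, id, name and date from the last four
# _-separated fields of FILE (hypothetical helper, plain POSIX sh).
split_name() {
  rest=${1%.*}        # remove the extension
  date=${rest##*_}    # keep what follows the rightmost _
  rest=${rest%_*}     # drop that field
  name=${rest##*_}
  rest=${rest%_*}
  id=${rest##*_}
  rest=${rest%_*}
  product=${rest##*_}
}

# Made-up file name following the INT_V1_..._.csv layout
split_name 'INT_V1_Widget_42_Alice_010124.csv'
printf '%s\n' "$product" "$id" "$name" "$date"
```

Note that the variables are global, as POSIX sh has no local variables.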
Or use the split+glob operator again:
IFS=_
set -o noglob
set -- $file
shift "$(($# - 4))"
product=$1 id=$2 name=$3 date=${4%.*}
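Both split+glob snippets change IFS and the noglob option for the rest of the script. One possible way to contain that, sketched here under the assumption that IFS was set beforehand (it is by default), is to do the splitting in a function that puts both settings back. The function name split_fields and the sample file name are hypothetical, and the check for at least four fields is an addition, since shift would otherwise be given a negative count.

```shell
# split_fields FILE: sets product, id, name, date; returns 1 if FILE
# has fewer than four _-separated fields (hypothetical helper, POSIX sh).
split_fields() {
  # Remember whether globbing was already disabled ($- contains f).
  case $- in *f*) had_noglob=1 ;; *) had_noglob=0 ;; esac
  old_ifs=$IFS              # assumes IFS was set
  IFS=_
  set -o noglob
  set -- $1                 # split+glob; sets the function's own $1, $2...
  IFS=$old_ifs
  [ "$had_noglob" -eq 1 ] || set +o noglob
  [ "$#" -ge 4 ] || return 1
  shift "$(($# - 4))"
  product=$1 id=$2 name=$3 date=${4%.*}
}

# Made-up file name following the INT_V1_..._.csv layout
split_fields 'INT_V1_Widget_42_Alice_010124.csv' &&
  printf '%s\n' "$product" "$id" "$name" "$date"
```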
You can extract the value of the <Name> field with this command:
cut -d'<' -f4 < csvlist | sed -e 's/>_//g'
(or with awk):
awk -F'<' '{print $4}' < csvlist | sed -e 's/>_//g'
And you can put them in a variable like this:
variable_name=$(cut -d'<' -f4 < csvlist | sed -e 's/>_//g')
or
variable_name=$(awk -F'<' '{print $4}' < csvlist | sed -e 's/>_//g')
It is not clear from the question whether you want a single variable holding all the values or a separate variable for each of them.