Simplest way to extract a substring in a Unix shell?
cut might be useful:
$ echo hello | cut -c1,3
hl
$ echo hello | cut -c1-3
hel
$ echo hello | cut -c1-4
hell
$ echo hello | cut -c4-5
lo
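To use the result in a script rather than just print it, the output of cut can be captured with command substitution. A minimal sketch, assuming the string lives in a variable called word (a name picked here purely for illustration); the open-ended ranges N- and -M are standard cut syntax:

```shell
word=hello                                          # hypothetical sample value

first_three=$(printf '%s\n' "$word" | cut -c1-3)    # characters 1 through 3
from_fourth=$(printf '%s\n' "$word" | cut -c4-)     # character 4 to end of line
up_to_two=$(printf '%s\n' "$word" | cut -c-2)       # start of line through character 2

echo "$first_three $from_fourth $up_to_two"         # prints: hel lo he
```

Note that each of these spawns an external process, which is usually fine for one-off use but can add up inside a loop.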
Shell builtins are good for this too. Here is a sample script:
#!/bin/bash
# Demonstrates the shell's built-in ability to split strings. Saves on
# using sed and awk in shell scripts, which can help performance.
shopt -s -o nounset # Treat unset variables as errors; without -s, shopt only reports the setting
declare -rx FILENAME=payroll_2007-06-12.txt
# Splits
declare -rx NAME_PORTION=${FILENAME%.*} # Left of last .
declare -rx EXTENSION=${FILENAME#*.} # Right of first .
declare -rx NAME=${NAME_PORTION%_*} # Left of _
declare -rx DATE=${NAME_PORTION#*_} # Right of _
declare -rx YEAR_MONTH=${DATE%-*} # Left of last -
declare -rx YEAR=${YEAR_MONTH%-*} # Left of -
declare -rx MONTH=${YEAR_MONTH#*-} # Right of -
declare -rx DAY=${DATE##*-} # Right of last -
clear
echo " Variable: (${FILENAME})"
echo " Filename: (${NAME_PORTION})"
echo " Extension: (${EXTENSION})"
echo " Name: (${NAME})"
echo " Date: (${DATE})"
echo "Year/Month: (${YEAR_MONTH})"
echo " Year: (${YEAR})"
echo " Month: (${MONTH})"
echo " Day: (${DAY})"
That outputs:
Variable: (payroll_2007-06-12.txt)
Filename: (payroll_2007-06-12)
Extension: (txt)
Name: (payroll)
Date: (2007-06-12)
Year/Month: (2007-06)
Year: (2007)
Month: (06)
Day: (12)
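The script above splits on delimiters. When the position of the substring is known instead, bash (and also ksh93 and zsh, but not plain POSIX sh) supports ${var:offset:length} with a zero-based offset. A short sketch reusing the same file name; the lowercase variable names are made up for illustration:

```shell
#!/bin/bash
filename=payroll_2007-06-12.txt   # same sample value as in the script above

prefix=${filename:0:7}            # 7 characters starting at offset 0
year=${filename:8:4}              # 4 characters starting at offset 8
suffix=${filename: -3}            # last 3 characters; the space before - is required

echo "$prefix $year $suffix"      # prints: payroll 2007 txt
```

This only works when the field widths are fixed; for variable-width names the delimiter-based % and # expansions are the safer choice.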
And as per Gnudif above, there is always sed, awk, or perl for when the going gets really tough.
Unix shells do not traditionally have regex support built-in. Bash and Zsh both do, so if you use the =~ operator to compare a string to a regex, then:
You can get the substrings from the $BASH_REMATCH array in bash.
In Zsh, if the BASH_REMATCH shell option is set, the value is in the $BASH_REMATCH array, else it's in the $MATCH/$match tied pair of variables (one scalar, the other an array). If the RE_MATCH_PCRE option is set, then the PCRE engine is used, else the system regexp libraries, for an extended regexp syntax match, as per bash.
So, most simply: if you're using bash:
if [[ "$variable" =~ unquoted(.*)regex ]]; then
  matched_portion="${BASH_REMATCH[0]}"   # the entire matched text
  first_substring="${BASH_REMATCH[1]}"   # text captured by the first (...) group
fi
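As a concrete illustration of the pattern above, here is a small sketch that pulls a date out of a sample file name. The variable names and the regex are assumptions made for the example. Storing the regex in a variable and expanding it unquoted on the right-hand side of =~ avoids quoting surprises:

```shell
#!/bin/bash
filename=payroll_2007-06-12.txt          # sample value for illustration

# Extended regex with three capture groups: year, month, day.
pattern='_([0-9]{4})-([0-9]{2})-([0-9]{2})\.'

if [[ "$filename" =~ $pattern ]]; then
    whole=${BASH_REMATCH[0]}             # entire matched text: _2007-06-12.
    year=${BASH_REMATCH[1]}              # first parenthesized group
    month=${BASH_REMATCH[2]}             # second group
    day=${BASH_REMATCH[3]}               # third group
    echo "$year $month $day"             # prints: 2007 06 12
fi
```

If the right-hand side were quoted ("$pattern"), bash would compare it as a literal string instead of a regex.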
If you're not using Bash or Zsh, it gets more complicated as you need to use external commands.
Consider also /usr/bin/expr.
$ expr substr hello 2 3
ell
You can also match patterns against the beginning of strings. The result is the number of characters matched, or 0 if the pattern does not match at the start.
$ expr match hello h
1
$ expr match hello hell
4
$ expr match hello e
0
$ expr match hello 'h.*o'
5
$ expr match hello 'h.*l'
4
$ expr match hello 'h.*e'
2
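The match and substr keywords are not specified by POSIX and are missing from some expr implementations. The portable spelling is expr STRING : REGEX, which anchors a basic regular expression at the start of the string in the same way. When the regex contains a \( \) group, expr prints the captured text instead of the match length. A minimal sketch, with a sample file name and variable names chosen for illustration:

```shell
filename=payroll_2007-06-12.txt                   # sample value for illustration

match_len=$(expr "$filename" : 'payroll')         # number of characters matched at the start
year=$(expr "$filename" : '[^_]*_\([0-9]*\)')     # text captured by the \( \) group

echo "$match_len $year"                           # prints: 7 2007
```

Keep in mind that expr exits with a non-zero status when the result is empty or 0, which matters in scripts that run under set -e.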