String replacement using a dictionary
Here's one way with sed:
sed '
s|"\(.*\)"[[:blank:]]*:[[:blank:]]*"\(.*\)"|\1\
\2|
h
s|.*\n||
s|[\&/]|\\&|g
x
s|\n.*||
s|[[\.*^$/]|\\&|g
G
s|\(.*\)\n\(.*\)|s/\1/\2/g|
' dictionary.txt | sed -f - novel.txt
How it works:
The 1st sed turns dictionary.txt into a script file (editing commands, one per line). This is piped to the 2nd sed (note the -f -, which means read commands from stdin), which executes those commands, editing novel.txt.
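To make the two stages visible, here is a minimal sketch that writes the generated script to a file instead of piping it, so it can be inspected before it is applied. The sample dictionary entries, the sample sentence, and the scratch file names are assumptions made up for illustration.

```shell
# Hypothetical sample files in a scratch directory (not from the question).
tmp=$(mktemp -d)
printf '%s\n' '"cat" : "dog"' '"a.b" : "x/y&z"' > "$tmp/dictionary.txt"
printf '%s\n' 'The cat sat on a.b, not on axb.' > "$tmp/novel.txt"

# Stage 1: turn each dictionary line into one s/LHS/RHS/g command.
sed '
s|"\(.*\)"[[:blank:]]*:[[:blank:]]*"\(.*\)"|\1\
\2|
h
s|.*\n||
s|[\&/]|\\&|g
x
s|\n.*||
s|[[\.*^$/]|\\&|g
G
s|\(.*\)\n\(.*\)|s/\1/\2/g|
' "$tmp/dictionary.txt" > "$tmp/script.sed"

# Inspect the generated commands:
#   s/cat/dog/g
#   s/a\.b/x\/y\&z/g
generated=$(cat "$tmp/script.sed")
printf '%s\n' "$generated"

# Stage 2: run the generated commands against the text.
result=$(sed -f "$tmp/script.sed" "$tmp/novel.txt")
printf '%s\n' "$result"   # The dog sat on x/y&z, not on axb.
```

Going through a file is equivalent to the pipe with -f - and may be handy with a sed that does not accept - as a script file name.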
This requires translating your format "STRING" : "REPLACEMENT" into a sed command, escaping any special characters in the process for both LHS and RHS:
s/ESCAPED_STRING/ESCAPED_REPLACEMENT/g
So the first substitution
s|"\(.*\)"[[:blank:]]*:[[:blank:]]*"\(.*\)"|\1\
\2|
turns "STRING" : "REPLACEMENT" into STRING\nREPLACEMENT (\n is a newline char). The result is then copied to the hold space with h.
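As a quick check of that first step in isolation, the sketch below feeds a single line in the dictionary format through the first substitution only. The input line is just the placeholder format from above, used as sample data.

```shell
# Run only the first substitution on one dictionary-style line.
split=$(printf '%s\n' '"STRING" : "REPLACEMENT"' |
  sed 's|"\(.*\)"[[:blank:]]*:[[:blank:]]*"\(.*\)"|\1\
\2|')

# The pattern space now holds two lines:
#   STRING
#   REPLACEMENT
printf '%s\n' "$split"
```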
Next, s|.*\n|| deletes the first part, keeping only REPLACEMENT, then s|[\&/]|\\&|g escapes the characters that are special in a replacement: the backslash, the ampersand and the / delimiter (this is the RHS).
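The effect of the RHS escaping can be seen on its own with a made-up replacement string (the sample text is an assumption for illustration):

```shell
# Escape \, & and / so the text is safe as the replacement part of s/LHS/RHS/.
rhs=$(printf '%s\n' 'AT&T and/or C:\dir' | sed 's|[\&/]|\\&|g')
printf '%s\n' "$rhs"   # AT\&T and\/or C:\\dir
```

Without this step an unescaped & would be expanded to the matched text and an unescaped / would end the replacement early.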
It then exchanges (x) the hold buffer with the pattern space, s|\n.*|| deletes the second part, keeping only STRING, and s|[[\.*^$/]|\\&|g does the escaping (this is the LHS).
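A small sketch of the LHS escaping by itself, again on made-up sample text, followed by a check that an escaped pattern only matches the literal string:

```shell
# Escape the characters that are special in a basic regular expression,
# plus the / delimiter.
lhs=$(printf '%s\n' 'a.b*[c]^$ 1/2' | sed 's|[[\.*^$/]|\\&|g')
printf '%s\n' "$lhs"   # a\.b\*\[c]\^\$ 1\/2

# With the dot escaped, "a.b" is matched literally and "axb" is left alone.
literal=$(printf '%s\n' 'a.b axb' | sed 's/a\.b/X/g')
printf '%s\n' "$literal"   # X axb
```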
The content of the hold buffer is then appended to the pattern space via G, so now the pattern space content is ESCAPED_STRING\nESCAPED_REPLACEMENT.
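The h / x / G sequence can be tried separately from the escaping. In this sketch the two halves come from two made-up input lines joined with N, and the bracket and angle markers stand in for the two different escaping steps; none of this is part of the original script.

```shell
# Edit each half of a two-line pattern space independently, then rejoin.
out=$(printf '%s\n' 'left' 'right' | sed '
N
h
s|.*\n||
s|.*|[&]|
x
s|\n.*||
s|.*|<&>|
G
')

# Result:
#   <left>
#   [right]
printf '%s\n' "$out"
```

Here h saves a copy of both halves, the second half is edited in the pattern space, x swaps the edited second half into the hold space while bringing the saved copy back, and G appends the edited second half after the edited first half.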
The final substitution
s|\(.*\)\n\(.*\)|s/\1/\2/g|
transforms it into s/ESCAPED_STRING/ESCAPED_REPLACEMENT/g