Replace single backslash in R
One quite universal solution is
gsub("\\\\", "", str)
Thanks to the comment above.
Since there isn't any direct way to deal with single backslashes, here's the closest solution to the problem, as provided by David Arenburg in the comments section:
gsub("[^A-Za-z0-9]", "", str) # remove everything except letters and digits
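As a rough illustration of the trade-off, the sketch below compares the character-class approach with a literal backslash removal. The sample string and the variable name s are made up for this example (s is used rather than str only to avoid masking the base function of that name).

```r
# Hypothetical sample: one backslash, plus a space, an underscore and a comma.
s <- "this\\is my_string, v2"

# The character class drops everything that is not a letter or digit,
# so the space, underscore and comma disappear along with the backslash.
only_alnum <- gsub("[^A-Za-z0-9]", "", s)

# A fixed (non-regex) pattern removes the backslash and nothing else.
no_backslash <- gsub("\\", "", s, fixed = TRUE)

print(only_alnum)    # "thisismystringv2"
print(no_backslash)  # "thisis my_string, v2"
```

So the character-class version is only the right tool when stripping all punctuation and whitespace is acceptable.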
When inputting backslashes from the keyboard, always escape them.
str <-"this\\is\\my\\string" # note doubled backslashes -> 'this\is\my\string'
gsub("\\", "", str, fixed=TRUE) # ditto
str2 <- "a\\f\\r" # ditto -> 'a\f\r'
gsub("\\", "", str2, fixed=TRUE)# ditto
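To make the escaping concrete, here is a small sketch (the variable name s is just for illustration) showing that the doubled backslashes in the source code are single characters in the string, and that the regex form and the fixed form give the same result.

```r
# Each "\\" typed in the source is ONE backslash character in the string.
s <- "this\\is\\my\\string"
n_chars <- nchar(s)   # 14 letters + 3 backslashes = 17

# cat() prints the raw characters; print() would show the escaped form.
cat(s, "\n")          # this\is\my\string

# Regex form: the R string "\\\\" is the two-character regex \\,
# which matches one literal backslash.
via_regex <- gsub("\\\\", "", s)

# Fixed form: the R string "\\" is a single backslash, matched literally.
via_fixed <- gsub("\\", "", s, fixed = TRUE)

print(via_regex)      # "thisismystring"
print(via_fixed)      # "thisismystring"
```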
Note that if you do
str <- "a\f\r"
then str
contains no backslashes. It consists of the 3 characters a, \f (a form feed, which is not normally printable except as \f), and \r (a carriage return, likewise).
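A quick way to check this for yourself, sketched with an illustrative variable name s:

```r
# Single backslashes in the source start escape sequences here.
s <- "a\f\r"

n_chars <- nchar(s)                          # 3 characters, not 5
has_backslash <- grepl("\\", s, fixed = TRUE) # FALSE: no backslash is present
codes <- utf8ToInt(s)                        # code points of the 3 characters

print(n_chars)        # 3
print(has_backslash)  # FALSE
print(codes)          # 97 12 13  (a, form feed, carriage return)
```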
And just to head off a possible question: if your data was read from a file, the file doesn't have to have doubled backslashes. For example, if you have a file test.txt
containing
a\b\c\d\e\f
and you do
str <- readLines("test.txt")
then str
will contain the string a\b\c\d\e\f
as you'd expect: 6 letters separated by 5 single backslashes. But you still have to type doubled backslashes if you want to work with it.
str <- gsub("\\", "", str, fixed=TRUE) # now contains abcdef
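The file case can be reproduced end to end with a temporary file. This is only a sketch: the temporary file stands in for test.txt, and the variable names are made up for the example.

```r
# Create a file whose single line is a\b\c\d\e\f (single backslashes on disk).
tmp <- tempfile(fileext = ".txt")
writeLines("a\\b\\c\\d\\e\\f", tmp)

from_file <- readLines(tmp)
n_chars <- nchar(from_file)   # 6 letters + 5 backslashes = 11

print(from_file)              # "a\\b\\c\\d\\e\\f"  (print() escapes for display)
cat(from_file, "\n")          # a\b\c\d\e\f         (the actual characters)

# The pattern still has to be typed with a doubled backslash.
cleaned <- gsub("\\", "", from_file, fixed = TRUE)
print(cleaned)                # "abcdef"

unlink(tmp)                   # remove the temporary file
```

The doubled backslashes shown by print() are only a display convention; nchar() confirms the string holds single backslashes.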
From the dput
, it looks like what you've got there is UTF-16 encoded text, which probably came from a Windows machine. According to
- https://en.wikipedia.org/wiki/Unicode#Character_General_Category
- https://en.wikipedia.org/wiki/UTF-16
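it encodes glyphs in the Supplementary Multilingual Plane, which is pretty obscure. I'll guess that you need to declare the encoding when you read in the file. Note that the encoding argument of readLines only marks strings as Latin-1 or UTF-8 and does not re-encode the input, so supply encoding="UTF-16" to a file() connection and pass that connection to readLines.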
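For what it's worth, here is a minimal sketch of reading UTF-16 text through a connection. It assumes little-endian UTF-16 without a byte order mark (hence "UTF-16LE"), builds its own small sample file, and uses made-up variable names; the original poster's file may need a different encoding name.

```r
# Build a small UTF-16LE file containing the line a\b\c (assumed sample data).
tmp <- tempfile(fileext = ".txt")
bytes <- iconv("a\\b\\c\n", from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(bytes, tmp)

# Declare the encoding on the connection so the input is re-encoded on read.
con <- file(tmp, encoding = "UTF-16LE")
txt <- readLines(con)
close(con)

# Once read correctly, the backslashes can be removed as usual.
cleaned <- gsub("\\", "", txt, fixed = TRUE)
print(cleaned)   # "abc"

unlink(tmp)      # remove the temporary file
```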