Remove parentheses and text within from strings in R
You could also use:
library(qdap)
companies$Name <- genX(companies$Name, " (", ")")
companies
Name
1 Company A Inc
2 CompanyB
3 Company C Inc.
4 Company D Inc.
5 CompanyE
A gsub
should work here
gsub("\\s*\\([^\\)]+\\)","",as.character(companies$Name))
# or using "raw" strings as of R 4.0
gsub(r"{\s*\([^\)]+\)}","",as.character(companies$Name))
# [1] "Company A Inc" "Company B" "Company C Inc."
# [4] "Company D Inc." "Company E"
Here we just replace occurrences of "(...)" with nothing (also removing any leading space). R makes it look worse than it is with all the escaping we have to do for the parenthesis since they are special characters in regular expressions.
You could use stringr::str_replace
. It's nice because it accepts factor variables.
companies <- data.frame(Name=c("Company A Inc (COMPA)","Company B (BEELINE)",
"Company C Inc. (Coco)", "Company D Inc.",
"Company E"))
library(stringr)
str_replace(companies$Name, " \\s*\\([^\\)]+\\)", "")
# [1] "Company A Inc" "Company B" "Company C Inc."
# [4] "Company D Inc." "Company E"
And if you still want to use strsplit
, you could do
companies$Name <- as.character(companies$Name)
unlist(strsplit(companies$Name, " \\(.*\\)"))
# [1] "Company A Inc" "Company B" "Company C Inc."
# [4] "Company D Inc." "Company E"