Using urltools::url_parse with UTF-8 domains
I could reproduce the issue. I was able to convert the domain column to UTF-8 by re-reading it with readr::parse_character and latin1 encoding:
library(urltools)
library(tidyverse)
url <- "www.cordes-tiefkühlprodukte.de"
parts <- url_parse(url) %>%
  mutate(domain = parse_character(domain, locale = locale(encoding = "latin1")))
parts
  scheme                         domain port path parameter fragment
1   <NA> www.cordes-tiefkühlprodukte.de <NA> <NA>      <NA>     <NA>
I guess that the encoding you have to specify (here latin1) depends only on your locale and not on the URL's special characters, but I'm not 100% sure about that.
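To make the re-encoding step more concrete, here is a minimal base R sketch of what I assume is happening: bytes that are really latin1 get reinterpreted and converted to UTF-8. It uses iconv instead of readr, and the hand-built byte string is only an illustration, not output from url_parse. The l10n_info() call shows what your own session's locale reports, which is what I'd check first when picking the encoding.

```r
# Bytes of "kühl" in latin1: the u-umlaut is the single byte 0xFC.
# (Illustrative input built by hand, not taken from url_parse.)
raw_latin1 <- rawToChar(as.raw(c(0x6b, 0xfc, 0x68, 0x6c)))

# Reinterpret those bytes as latin1 and convert them to UTF-8.
fixed <- iconv(raw_latin1, from = "latin1", to = "UTF-8")

# The umlaut is now the two-byte UTF-8 sequence c3 bc.
charToRaw(fixed)

# Inspect the current session's locale; the result is platform-dependent.
str(l10n_info())
```

If the locale turns out to be something other than latin1 (for example a different Windows code page), the from argument would presumably have to change accordingly.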