Using 'rvest' to extract links

I was also able to clean up the results of the approach above, which were quite noisy for me. Starting from every link on the page:

links <- page %>% html_nodes("a") %>% html_attr("href")

I kept only the URLs that contain a common element, using simple regex matching:

links <- links[grepl('common-url-element', links)]
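
In case it helps, here is a small self-contained sketch of the same filtering step that runs without hitting a live site. The inline HTML, the "/biz/" pattern, and the `biz_links` name are all made up for illustration; swap in whatever element your target URLs have in common.

```r
library(rvest)

# Hypothetical inline HTML standing in for a fetched page (an assumption,
# not real Yelp markup)
html <- '<html><body>
  <a href="/biz/first-place">First</a>
  <a href="/search?page=2">Next page</a>
  <a>No href here</a>
  <a href="/biz/second-place">Second</a>
</body></html>'

page <- read_html(html)

# All hrefs on the page; anchors without an href come back as NA
links <- page %>% html_nodes("a") %>% html_attr("href")

# Keep only hrefs containing the (assumed) common element "/biz/".
# grepl() returns FALSE for NA, so missing hrefs are dropped as well.
biz_links <- links[grepl("/biz/", links, fixed = TRUE)]
print(biz_links)
```

Using `fixed = TRUE` treats the pattern as a literal string, which avoids surprises when the common element contains regex metacharacters such as `.` or `?`.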

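library(rvest)
# Fetch and parse the Yelp search results page
page <- read_html("http://www.yelp.com/search?find_loc=New+York,+NY,+USA")
# Select elements with class "biz-name" and extract each one's href attribute
page %>% html_nodes(".biz-name") %>% html_attr('href')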

Hope this simplifies your problem.
