rvest - R: Scrape some info from an HTML


I have an HTML file that follows this format:

<div id='1' class='location element' style='width:100px; top:5068px; left: 3332px;'><div class='position'></div><div class='time'></div><div class='age'></div>name</div>

and I would like to retrieve the string from the first div's class (in this case 'location'), and the name.

So far, I can retrieve the name using the id number:

    html_file %>% html_nodes("#1") %>% html_text()

How can I retrieve the first field of 'class'? Thanks.

Use html_attr():

    library(rvest)
    library(dplyr)

    html_file %>%
      html_nodes("#1") %>%
      html_attr("class")

    [1] "location element"
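Since the question asks for only the first field of the class ('location'), one way to go from there is to split the returned string on whitespace. This is just a minimal sketch: the inline HTML string is a copy of the sample from the question so the snippet runs on its own, and the variable names (node, class_attr, first_class, node_name) are made up for illustration.

```r
library(rvest)

# Inline copy of the sample markup from the question (assumption: your real
# html_file was produced by read_html() in a similar way).
html_file <- read_html(
  "<div id='1' class='location element' style='width:100px; top:5068px; left: 3332px;'><div class='position'></div><div class='time'></div><div class='age'></div>name</div>"
)

node <- html_nodes(html_file, "#1")

# Full class attribute of the matched node.
class_attr <- html_attr(node, "class")

# Split each class string on whitespace and keep only the first token.
first_class <- vapply(strsplit(class_attr, "\\s+"), `[`, character(1), 1)

# Text content of the node, which is the name here because the inner divs are empty.
node_name <- html_text(node)
```

Using vapply() rather than indexing a single element keeps this working if the selector ever matches more than one node.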

NB: if you use html_attrs() you get all of the attributes out, and you can go from there too:

    library(rvest)
    library(dplyr)

    html_file %>%
      html_nodes("#1") %>%
      html_attrs()

    [[1]]
                                          id                                    class
                                         "1"                       "location element"
                                       style
    "width:100px; top:5068px; left: 3332px;"
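To "go from there", the result of html_attrs() can be indexed by attribute name, since it is a list holding one named character vector per matched node. A small sketch, assuming a simplified copy of the sample div (the variable names attrs, node_class and node_style are only for illustration):

```r
library(rvest)

# Simplified inline copy of the sample markup so the snippet is self-contained.
html_file <- read_html(
  "<div id='1' class='location element' style='width:100px; top:5068px; left: 3332px;'>name</div>"
)

# One named character vector of attributes per matched node.
attrs <- html_attrs(html_nodes(html_file, "#1"))

# Pick individual attributes of the first matched node by name.
node_class <- attrs[[1]][["class"]]
node_style <- attrs[[1]][["style"]]
```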
