r - Name is not XML Namespace compliant


I'm trying to read a table on this site:

http://spacefem.com/pregnant/due.php?use=edd&m=09&d=10&y=16

I'm using rvest, but I get an error:

    library(rvest)
    read_html("http://spacefem.com/pregnant/due.php?use=edd&m=09&d=10&y=16")

    Error: Name spoiler:3tbt4d3m is not XML Namespace compliant [202]

What does this error mean, and is there a way around it?

I've gotten as far as pinpointing the internal function causing the error: xml2:::doc_parse_raw. However, xml2:::doc_parse_raw calls internal C code, which makes debugging the issue substantially more difficult.

Another option is to use htmltidy to "clean" the document. You need v0.3.0 or higher, which, as of the date of this answer, means using the development version rather than the CRAN version until 0.3.0+ reaches CRAN:

    library(rvest)
    library(htmltidy) # devtools::install_github("hrbrmstr/htmltidy")
    library(httr)

    url <- "http://spacefem.com/pregnant/due.php?use=edd&m=09&d=10&y=16"

    # the site was not returning content for me without a more browser-like user agent
    res <- GET(url, user_agent("Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36"))

    # run the raw response text through tidy before handing it to the parser
    cleaned <- tidy_html(content(res, as="text", encoding="UTF-8"),
                         list(TidyDocType="html5"))

    pg <- read_html(cleaned)
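Once the cleaned document parses, the table the question asks about can probably be pulled out with rvest's html_table(). This is only a minimal sketch: it assumes the data sits in an ordinary <table> element, and the inline HTML string is a made-up stand-in for `cleaned` (its columns and values are hypothetical) so that the example runs without hitting the site.

```r
library(rvest)

# Hypothetical stand-in for `cleaned`; the real page content will differ.
cleaned <- "<html><body><table>
  <tr><th>week</th><th>note</th></tr>
  <tr><td>1</td><td>example</td></tr>
</table></body></html>"

pg <- read_html(cleaned)

# html_table() on a node set returns a list with one data frame per <table>
tables <- html_table(html_nodes(pg, "table"))
tables[[1]]
```

On the real page there may be several tables, so it is worth checking length(tables) and inspecting each element before picking one.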
