Sunday, September 20, 2015
R-Function GScholarScraper to Webscrape Google Scholar Search Result
NOTE: You'll find the update HERE and HERE.NOTE: The script is currently not working because the code of the Google-Scholar site has changed...
I'll see for this as soon as I find some spare time for it!
NOTE: If you try to access GoogleScholar programatically consider this words of caution:
http://stackoverflow.com/questions/7523961/google-scholar-with-matlab/7587994#7587994
...
Based on my previous post on Web Scraping I coded and uploaded the Function "GScholarScraper" HERE for testing!
The function will pull all (!) results, processing pages in chunks of 100 results/titles, and return a file with all titles, links, etc. It will also produce a word cloud using the words in the publication titles.
Please try your own search strings and report errors, etc.!
Build and run properly under:
R version 2.13.0 (2011-04-13) and R version R-2.13.2 (2011-09-30)
Platform: i386-pc-mingw32/i386 (32-bit) locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] stringr_0.5 tm_0.5-6 wordcloud_1.2 Rcpp_0.9.7
loaded via a namespace (and not attached):
[1] plyr_1.5.1 slam_0.1-23
PS: Errors reported lately (see comments) were resolved, the source code was updated..
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment