Sunday, September 20, 2015
Some Simple but Probably Useful Regex Examples with R-Package stringr...
I found that examples of regex use in R are rather rare. Thus, I will provide some examples from my own learning materials, mostly stolen from the help pages, with small but maybe illustrative adaptations. PS: I will occasionally extend this list of examples HERE.
library(stringr)
shopping_list <- c("bread & Apples §$%&/()=?4", "flouR", "sugar", "milk x2")
str_extract(shopping_list, "[A-Z].*[1-9]")
# this extracts the partial string that starts with an upper-case letter
# and ends with a digit from 1 to 9, for each element of the input vector.
# "." matches any single character, "*" means the preceding item is
# matched zero or more times, so ".*" is the regex for a run of
# arbitrary characters of any length (as long as possible).
# output:
[1] "Apples §$%&/()=?4" NA NA NA
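Since ".*" is greedy, the match above runs up to the last possible digit. A minimal sketch of the difference between the greedy ".*" and the lazy ".*?" follows; the input string `x` is a made-up example and not part of the shopping list.

```r
library(stringr)

# hypothetical input with two digits, chosen only for illustration
x <- "Apples 4 and 7 Pears"

# greedy: ".*" takes as much as possible, so the match ends at the last digit
greedy <- str_extract(x, "[A-Z].*[1-9]")

# lazy: ".*?" takes as little as possible, so the match ends at the first digit
lazy <- str_extract(x, "[A-Z].*?[1-9]")

print(greedy)  # "Apples 4 and 7"
print(lazy)    # "Apples 4"
```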
str_extract(shopping_list, "[a-z]{1,4}")
# this extracts the first run of 1 to 4 lowercase letters,
# for each element of the input vector.
# output:
[1] "brea" "flou" "suga" "milk"
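Note that str_extract() only returns the first match per element. If all matches are wanted, str_extract_all() should do the job; it returns a list with one character vector per input element. A small sketch, using a shortened version of the vector (the name `short_list` is just for illustration):

```r
library(stringr)

# shortened, hypothetical version of the shopping list
short_list <- c("sugar", "milk x2")

# every run of 1 to 4 lowercase letters, not only the first one
all_runs <- str_extract_all(short_list, "[a-z]{1,4}")

print(all_runs)
# [[1]]
# [1] "suga" "r"
#
# [[2]]
# [1] "milk" "x"
```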
str_extract(shopping_list, "\\b[a-z]{1,4}\\b")
# this extracts the first whole word made up of 1 to 4 lowercase letters
# ("\\b" marks a word boundary), for each element of the input vector.
# output:
[1] NA NA NA "milk"
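To see what the length limit does here, one can drop it and use "+" (one or more) instead of "{1,4}". The sketch below uses a simplified copy of the shopping list without the special characters (an assumption made to keep the example short); the word boundaries still exclude "flouR", because the lowercase run there is followed directly by another word character.

```r
library(stringr)

# simplified copy of the shopping list, special characters left out
simple_list <- c("bread & Apples", "flouR", "sugar", "milk x2")

# first whole word consisting only of lowercase letters, of any length
whole_words <- str_extract(simple_list, "\\b[a-z]+\\b")

print(whole_words)  # "bread" NA "sugar" "milk"
```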
str <- c("&George W. Bush", "Lyndon B. Johnson?")
gsub("[^[:alnum:][:space:].]", "", str)
# keep alphanumeric characters, whitespace AND the full stop, remove
# anything else, that is, all other punctuation. the caret at the start
# of the bracket expression negates it: it lists what should NOT be matched.
# output:
[1] "George W. Bush" "Lyndon B. Johnson"
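The gsub() call above is base R. As a sketch of how the same clean-up might look within stringr, str_replace_all() can be used with the same pattern; the variable name `presidents` is only chosen here to avoid masking the base function str().

```r
library(stringr)

presidents <- c("&George W. Bush", "Lyndon B. Johnson?")

# replace everything that is not alphanumeric, whitespace or a full stop
# with the empty string
cleaned <- str_replace_all(presidents, "[^[:alnum:][:space:].]", "")

print(cleaned)  # "George W. Bush" "Lyndon B. Johnson"
```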
Labels: R, Regex, str_extract, String-Manipulation, stringr