How To Convert Uppercase To Lowercase In R

Uppercase and lowercase are the two forms of each alphabetic character: uppercase letters are the capital letters (A, B, C) and lowercase letters are the small letters (a, b, c).

We need to convert uppercase to lowercase for consistency and standardization in text analysis or data processing tasks, as uppercase and lowercase forms of the same word can be treated as different entities by a computer.

Why do we need to convert uppercase to lowercase in R?

Here are some reasons to convert uppercase to lowercase in R:

  • Consistency: Converting all text to lowercase ensures that all words are written in the same format, making it easier to compare and analyze.
  • Standardization: In some cases, data may contain words in uppercase, lowercase, or mixed cases. Converting all text to lowercase helps to standardize the data and make it easier to work with.
  • Text analysis: Many text analysis workflows, such as sentiment analysis and other natural language processing tasks, lowercase the text first so that forms like "Good" and "good" are counted as the same word.
  • Data processing: Converting text to lowercase can also improve data processing tasks such as string matching, data deduplication, and indexing.
  • Eliminating duplicates: Converting all text to lowercase can help identify and eliminate duplicate entries that differ only in case.
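
As a rough illustration of the points above, the sketch below uses a small made-up vector (city_names is hypothetical data, not from any real dataset) to show how case differences hide duplicates and break exact string matching until the text is lowercased.

```r
# Hypothetical data: the same cities typed with inconsistent capitalization
city_names <- c("Paris", "PARIS", "paris", "London", "LONDON")

# Without normalization, every spelling counts as a distinct value
n_raw <- length(unique(city_names))

# After lowercasing, values that differed only in case collapse together
city_clean <- tolower(city_names)
n_clean <- length(unique(city_clean))

# Case-sensitive matching misses rows; matching on lowercase text finds them
matches_raw <- sum(city_names == "paris")
matches_clean <- sum(city_clean == "paris")

cat("Distinct values before:", n_raw, "and after:", n_clean, "\n")
cat("Rows matching 'paris' before:", matches_raw, "and after:", matches_clean, "\n")
```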

How to convert uppercase to lowercase in R

Here are some approaches for converting uppercase to lowercase in R:

  • tolower() function
  • str_to_lower() function
  • chartr() function
  • gsub() function

Approaches

Approach 1: tolower() function

The tolower() function is one of the simplest and most efficient ways to convert all uppercase characters in a string to lowercase. It takes a character vector as input and returns a vector of the same length with all uppercase letters converted to lowercase.

# create a character vector with some uppercase letters
my_string <- "Hello World!"

# convert all uppercase characters to lowercase using tolower() function
my_string_lower <- tolower(my_string)

# print the original string and the lowercase string
cat("Original string: ", my_string, "\n")
cat("Lowercase string: ", my_string_lower, "\n")

Output:

Original string:  Hello World! 
Lowercase string:  hello world!

Explanation:

  • The code creates a character vector my_string that contains a mix of uppercase and lowercase letters.
  • The tolower() function is applied to my_string. It converts all uppercase characters to lowercase.
  • The resulting lowercase string is stored in a new variable called my_string_lower.
  • The cat() function is used to print both the original string and the lowercase string to the console.
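
Because tolower() is vectorized, the same call also works on a whole vector or a data frame column. The following is a minimal sketch of that idea; the data frame survey and its columns are invented for illustration.

```r
# Hypothetical survey data with inconsistently capitalized answers
survey <- data.frame(
  id = 1:4,
  answer = c("Yes", "YES", "no", "No"),
  stringsAsFactors = FALSE
)

# tolower() is vectorized, so one call lowercases every element of the column
survey$answer <- tolower(survey$answer)

# Count the answers now that case no longer splits the categories
answer_counts <- table(survey$answer)
print(answer_counts)
```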

Approach 2: str_to_lower() function

The str_to_lower() function is part of the stringr package in R and provides a convenient way to convert all strings in a character vector to lowercase. We pass the character vector to str_to_lower() and it returns a new vector with every string in lowercase.

# install the stringr package (only needed once per R installation)
install.packages("stringr")
# load the stringr package
library(stringr)

# create a character vector with some uppercase letters
my_strings <- c("This is A SamPLe StriNG", "ANOTHER STRING", "yET aNoTHer STRIng")

# convert all strings to lowercase using str_to_lower() function
my_strings_lower <- str_to_lower(my_strings)

# print the original strings and the lowercase strings
cat("Original strings: ", my_strings, "\n")
cat("Lowercase strings: ", my_strings_lower, "\n")

Output:

Original strings:  This is A SamPLe StriNG ANOTHER STRING yET aNoTHer STRIng 
Lowercase strings:  this is a sample string another string yet another string 

Explanation:

  • The code first installs and loads the stringr package, which contains the str_to_lower() function.
  • A character vector my_strings is created that contains a mix of uppercase and lowercase letters in different strings.
  • The str_to_lower() function is applied to my_strings, which converts all strings to lowercase.
  • The resulting lowercase strings are stored in a new variable called my_strings_lower.
  • The cat() function is used to print both the original strings and the lowercase strings to the console.
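
str_to_lower() also accepts a locale argument (default "en") for language-specific case rules. The sketch below is written so that it runs whether or not stringr is installed: it checks with requireNamespace() and otherwise falls back to base tolower(). The helper name to_lower_safe is hypothetical and not part of stringr.

```r
# Hypothetical helper: use stringr when it is installed, base R otherwise
to_lower_safe <- function(x, locale = "en") {
  if (requireNamespace("stringr", quietly = TRUE)) {
    # locale selects language-specific case-mapping rules
    stringr::str_to_lower(x, locale = locale)
  } else {
    # base R fallback; it ignores the locale argument
    tolower(x)
  }
}

titles <- c("This is A SamPLe StriNG", "ANOTHER STRING")
titles_lower <- to_lower_safe(titles)
print(titles_lower)
```

Calling stringr::str_to_lower() with the package prefix avoids attaching the whole package, which can be handy inside functions. With stringr available, a locale such as "tr" applies Turkish case rules, which treat the letter I differently from English.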

Approach 3: chartr() function

The chartr() function in R translates each character in one set to the character at the same position in another set. By mapping the uppercase alphabet onto the lowercase alphabet, it can replace all uppercase letters in a string with their lowercase equivalents.

# create a character vector with some uppercase letters
my_string <- "HeLLo WoRLd"

# convert all uppercase characters to lowercase using chartr() function
my_string_lower <- chartr("ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz", my_string)

# print the original string and the lowercase string
cat("Original string: ", my_string, "\n")
cat("Lowercase string: ", my_string_lower, "\n")

Output:

Original string:  HeLLo WoRLd 
Lowercase string:  hello world 

Explanation:

  • The code first creates a character vector my_string that contains a mix of uppercase and lowercase letters.
  • The chartr() function is applied to my_string, which converts all uppercase characters to lowercase.
  • The function takes three arguments: the uppercase characters to be replaced, the lowercase characters to replace them with, and the input string to be modified. Characters that are not listed in the first argument are left unchanged.
  • The resulting lowercase string is stored in a new variable called my_string_lower.
  • The cat() function is used to print both the original string and the lowercase string to the console.
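
Typing out both alphabets by hand is error-prone. As a small sketch, the two character sets can instead be built from R's built-in LETTERS and letters constants; the input string here is made up for illustration.

```r
# Build the two character sets from R's built-in constants
upper_set <- paste(LETTERS, collapse = "")
lower_set <- paste(letters, collapse = "")

droid_names <- "R2-D2 & C-3PO"

# chartr() maps each character in upper_set to the one at the same position
# in lower_set; digits, spaces, and punctuation are not listed, so they stay
droid_lower <- chartr(upper_set, lower_set, droid_names)

cat("Lowercase string:", droid_lower, "\n")
```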

Approach 4: gsub() function

The gsub() function in R replaces every match of a regular expression with a replacement string, so it can also be used to replace all uppercase letters in a string with their lowercase equivalents. It is more versatile than chartr() and can handle more complex replacements.

# create a character vector with some uppercase letters
my_string <- "This is A SamPLe StriNG"

# convert all uppercase characters to lowercase using gsub() function
my_string_lower <- gsub("([A-Z])", "\\L\\1", my_string, perl = TRUE)

# print the original string and the lowercase string
cat("Original string: ", my_string, "\n")
cat("Lowercase string: ", my_string_lower, "\n")

Output:

Original string:  This is A SamPLe StriNG 
Lowercase string:  this is a sample string 

Explanation:

  • The code first creates a character vector my_string that contains a mix of uppercase and lowercase letters.
  • The gsub() function is applied to my_string, which converts all uppercase characters to lowercase.
  • The function takes four arguments: a regular expression pattern, ([A-Z]), whose parentheses capture each matched uppercase character; the replacement string "\\L\\1", where \\1 refers to the captured character and \\L tells the function to convert it to lowercase; the input string to be modified; and the option perl = TRUE, which enables the \\L conversion in the replacement string.
  • The resulting lowercase string is stored in a new variable called my_string_lower.
  • The cat() function is used to print both the original string and the lowercase string to the console.
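
The extra flexibility of gsub() shows when only part of the text should change. As a sketch (the headline string is made up), the first call below lowercases only the first letter of each word by capturing it in parentheses and referring to it as \\1 after \\L. The second call lowercases every run of uppercase letters for comparison.

```r
headline <- "NASA Launches New Rocket"

# \\b anchors at a word boundary, ([A-Z]) captures one uppercase letter,
# and \\L\\1 re-inserts the captured letter in lowercase (needs perl = TRUE)
first_lower <- gsub("\\b([A-Z])", "\\L\\1", headline, perl = TRUE)

# For comparison, lowercase every run of uppercase letters
all_lower <- gsub("([A-Z]+)", "\\L\\1", headline, perl = TRUE)

cat("First letters only:", first_lower, "\n")
cat("All letters:", all_lower, "\n")
```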

Best Approach

The tolower() function is the best approach to convert uppercase to lowercase in R. Here are some reasons:

  • Availability: As it is a built-in function in R, it’s always available and easy to use without any additional packages or installations.
  • Optimization: It’s optimized for the task of converting uppercase to lowercase and tends to be more efficient than other approaches.
  • Usability: It is a simple and straightforward function that has just one argument.
  • Updates: As it is part of the base R package, it is well maintained and unlikely to become outdated or unsupported in the future.
  • Memorability: It is easy to remember and intuitive, since it’s named after the common English phrase “to lower”.
  • Compatibility: It is compatible with both single characters and longer strings, making it versatile and flexible for different use cases.
  • Acceptance: It is widely used in the R community and has been thoroughly tested and vetted, so it is generally considered a reliable and safe choice for converting uppercase to lowercase in R.
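
One way to check these claims on your own machine is sketched below. It runs the three base R approaches on the same made-up input, confirms that they agree for plain ASCII text, and prints rough timings. The input size is an arbitrary assumption, and the timings vary by machine and R version, so no particular numbers are claimed here.

```r
# Made-up test input: 10,000 copies of a mixed-case string
sample_text <- rep("This is A SamPLe StriNG", 10000)

# Run each base R approach on the same input
res_tolower <- tolower(sample_text)
res_chartr <- chartr(paste(LETTERS, collapse = ""),
                     paste(letters, collapse = ""), sample_text)
res_gsub <- gsub("([A-Z])", "\\L\\1", sample_text, perl = TRUE)

# For plain ASCII text, all three should agree
same_result <- identical(res_tolower, res_chartr) &&
  identical(res_tolower, res_gsub)
cat("All approaches agree:", same_result, "\n")

# Rough timings; the numbers depend on the machine and R version
print(system.time(tolower(sample_text)))
print(system.time(gsub("([A-Z])", "\\L\\1", sample_text, perl = TRUE)))
```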

Sample Problems:

Sample Problem 1:

Write R code to convert the outcomes of a series of coin tosses, c("H", "T", "T", "T", "H", "H", "T", "H", "T", "H", "H", "H", "T"), which are all uppercase characters, to lowercase using base functions.

Solution:

  • We define a vector outcome with the outcome of a coin toss, where “H” represents heads and “T” represents tails.
  • We use the tolower() function to convert all characters in the outcome vector to lowercase.
  • We store the lowercase vector in a new variable outcome_lower.
  • We use the cat() function to print both the original outcome and the outcome_lower vectors to the console.
  • The output shows that all characters in the outcome_lower vector are now in lowercase.

# Define a vector with the outcome of a coin toss
outcome <- c("H", "T", "T", "T", "H", "H", "T", "H", "T", "H", "H", "H", "T")

# Convert the vector to lowercase using the tolower() function
outcome_lower <- tolower(outcome)

# Print the original and lowercase vectors
cat("Original outcome: ", outcome, "\n")
cat("Lowercase outcome: ", outcome_lower, "\n")

Output:

Original outcome:  H T T T H H T H T H H H T 
Lowercase outcome:  h t t t h h t h t h h h t 

Sample Problem 2:

Write R code to convert all uppercase characters in the character vector c("GAME OF THRONES", "BREAKING BAD", "STRANGER THINGS") to lowercase using the stringr package from the tidyverse.

Solution:

  • The library(stringr) line loads the stringr package, which provides a number of functions for working with strings.
  • We define the character vector vec with three elements, each of which is a TV show in all caps.
  • We use the str_to_lower() function to convert all uppercase characters to lowercase in the vec vector.
  • We assign the result to a new variable vec_lower.
  • We view the converted vector using vec_lower, which now contains the TV show titles in lowercase.
  • The str_to_lower() function works by taking a character vector as input and returning a new vector with all uppercase characters converted to lowercase.

# Load the stringr package
library(stringr)

# Define the character vector
vec <- c("GAME OF THRONES", "BREAKING BAD", "STRANGER THINGS")

# Convert all uppercase characters to lowercase
vec_lower <- str_to_lower(vec)

# View the converted vector
vec_lower

Output:

[1] "game of thrones" "breaking bad"    "stranger things"

Sample Problem 3:

Write R code to convert all uppercase characters in the title "THE BIG BANG THEORY" to lowercase using chartr().

Solution:

  • chartr() is a base R function that can be used to translate characters from one set to another.
  • In this code, we define the original string as title.
  • The first argument in the chartr() function specifies the set of characters to be replaced (in this case, all uppercase letters).
  • The second argument specifies the set of replacement characters (in this case, all lowercase letters).
  • We store the result of the conversion back in title, overwriting the original string.
  • Finally, we print the result using the print() function.

# Define the string to be converted
title <- "THE BIG BANG THEORY"

# Use chartr() to convert all uppercase characters to lowercase
title <- chartr("ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz", title)

# Print the result
print(title)

Output:

[1] "the big bang theory"

Sample Problem 4:

Write R code to convert all uppercase characters in a matrix of movie titles, built from c("AVATAR", "INCEPTION", "TITANIC", "STAR WARS"), to lowercase using the gsub() function.

Solution:

  • We first create a matrix movie_titles with four movie titles in uppercase letters.
  • We use the gsub() function to replace all uppercase letters in the matrix with lowercase letters.
  • The regular expression ([A-Z]) matches each uppercase letter in the matrix and captures it in a group.
  • The replacement string \\L\\1 converts the captured uppercase letter to lowercase.
  • The perl = TRUE argument is used to enable Perl-compatible regular expressions, which the \\L conversion requires.
  • We print the converted matrix by evaluating its name; gsub() keeps the matrix dimensions.

# Create a 2 x 2 matrix of movie titles (filled column by column)
movie_titles <- matrix(c("AVATAR", "INCEPTION", "TITANIC", "STAR WARS"), nrow = 2)

# Convert all uppercase characters to lowercase using gsub()
movie_titles <- gsub("([A-Z])", "\\L\\1", movie_titles, perl = TRUE)

# Print the converted matrix
movie_titles

Output:

     [,1]        [,2]       
[1,] "avatar"    "titanic"  
[2,] "inception" "star wars"

Conclusion

Converting uppercase characters to lowercase is a common task in data cleaning and text processing in R. The base tolower() function, str_to_lower() from the stringr package, chartr(), and gsub() are all useful tools for this task.

Of these, tolower() offers the simplest and most efficient way to convert uppercase characters to lowercase. Regardless of the tool used, converting uppercase to lowercase is often an important step in preparing text data for analysis and visualization.