Functions in R - Creating your first R function

By: Karthik Janar in data-science Tutorials on 2018-05-06

Functions are one of the fundamental building blocks of the R language. They are small pieces of reusable code that can be treated like any other R object. Functions are usually characterized by the name of the function followed by parentheses.

Sys.Date()

## [1] "2018-05-06"

Most functions in R return a value. Functions like Sys.Date() return a value based on your computer's environment, while other functions manipulate input data in order to compute a return value.

The mean() function takes a vector of numbers as input, and returns the average of all of the numbers in the input vector. Inputs to functions are often called arguments. Providing arguments to a function is also sometimes called passing arguments to that function. Arguments you want to pass to a function go inside the function's parentheses. Try passing the argument c(2, 4, 5) to the mean() function.

mean(c(2, 4, 5))

## [1] 3.666667

Functions usually take arguments, which are the values that the function operates on. For example, the mean() function takes a vector as an argument, as in mean(c(2, 6, 8)). The mean() function then adds up all of the numbers in the vector and divides that sum by the length of the vector.

The last R expression to be evaluated in a function will become the return value of that function. Below we will create a function called first_function. This function takes the argument x as input, and returns the value of x without modifying it.

first_function <- function(x) {
  x
}

Now that you've created your first function, let's test it.

first_function('My first function!')

## [1] "My first function!"
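The rule that the last evaluated expression becomes the return value is easier to see in a function with more than one expression. The sketch below uses two hypothetical function names, last_wins and explicit_return, that are not part of the original tutorial; return() is base R's explicit alternative to relying on the last expression.

```r
# Hypothetical example: only the final expression is returned.
last_wins <- function(x) {
  x + 1  # evaluated, but the result is discarded
  x * 2  # last expression, so this becomes the return value
}

# The same behavior written with an explicit return() call.
explicit_return <- function(x) {
  return(x * 2)
}

last_wins(5)
explicit_return(5)
```

Both calls evaluate to 10, because x + 1 is computed and then thrown away.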

If you want to see the source code for any function, just type the function name without parentheses or arguments.

first_function

## function(x) {
##   x
## }

We're going to replicate the functionality of the mean() function by creating a function called my_mean(). Remember that to calculate the average of all of the numbers in a vector, you find the sum of all the numbers in the vector and then divide that sum by the number of elements in the vector.

my_mean <- function(my_vector) {
  # Remember: the last expression evaluated will be returned! 
  sum(my_vector)/length(my_vector)
}

Let us test it.

my_mean(c(4,5,10))

## [1] 6.333333
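One quick way to check that my_mean() agrees with the built-in mean() is to compare the two on the same input. This is only a sketch: test_vector is a hypothetical name, and all.equal() is used rather than == because it tolerates tiny floating-point differences.

```r
my_mean <- function(my_vector) {
  sum(my_vector) / length(my_vector)
}

# Hypothetical test data, reusing the values from the example above.
test_vector <- c(4, 5, 10)

# TRUE when both functions agree up to floating-point tolerance.
all.equal(my_mean(test_vector), mean(test_vector))
```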

Next, let's try writing a function with default arguments. You can set default values for a function's arguments, and this can be useful if you think someone who uses your function will set a certain argument to the same value most of the time.

increment <- function(number, by = 1) {
  number + by
}

If you take a look in between the parentheses you can see that "by" is equal to 1. This means that the "by" argument will have the default value of 1. We can now use the increment function without providing a value for "by":

increment(5)

## [1] 6

However, if you want to provide a value for the "by" argument, you still can! The expression increment(5, 2) will evaluate to 7.

increment(5, 2)

## [1] 7
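As a small sketch of how the default behaves, the calls below use the same increment() definition with the default, with a named override, and with a vector input. The vector call is an extra illustration and relies only on R's ordinary element-wise arithmetic.

```r
increment <- function(number, by = 1) {
  number + by
}

increment(5)            # "by" falls back to its default of 1
increment(5, by = 10)   # the default is overridden by naming the argument
increment(c(1, 2, 3))   # the default is added to every element
```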

Let us write a function called "remainder". remainder() will take two arguments, "num" and "divisor", where "num" is divided by "divisor" and the remainder is returned. Imagine that you usually want to know the remainder when you divide by 2, so set the default value of "divisor" to 2. Please be sure that "num" is the first argument and "divisor" is the second argument.

remainder <- function(num, divisor=2) {
  num %% divisor
}

Let's do some testing of the remainder function. Run remainder(5) and see what happens.

remainder(5)

## [1] 1

Let's take a moment to examine what just happened. You provided one argument to the function, and R matched that argument to "num" since "num" is the first argument. The default value for "divisor" is 2, so the function used the default value you provided.

Now let's test the remainder function by providing two arguments. Type: remainder(11, 5) and let's see what happens.

remainder(11,5)

## [1] 1

You can also explicitly specify arguments by name when calling a function. When you explicitly designate argument values by name, the ordering of the arguments becomes unimportant. You can try this out by typing: remainder(divisor = 11, num = 5).

remainder(divisor = 11, num = 5)

## [1] 5

As you can see, there is a significant difference between remainder(11, 5) and remainder(divisor = 11, num = 5): the first divides 11 by 5, while the second divides 5 by 11.
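The calls below put positional and named matching side by side. The last call mixes the two styles: R matches the named argument first and then fills the remaining argument by position, so the unnamed 11 becomes "num".

```r
remainder <- function(num, divisor = 2) {
  num %% divisor
}

remainder(11, 5)                  # positional: num = 11, divisor = 5
remainder(divisor = 11, num = 5)  # named: num = 5, divisor = 11
remainder(divisor = 5, 11)        # mixed: the unnamed 11 is matched to num
```

The first and third calls both evaluate to 1, and the second evaluates to 5.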

R can also partially match arguments. Try typing remainder(4, div = 2) to see this feature in action.

remainder(4, div = 2)

## [1] 0

With all of this talk about arguments, you may be wondering if there is a way you can see a function's arguments (besides looking at the documentation). Thankfully, you can use the args() function!

args(remainder)

## function (num, divisor = 2) 
## NULL
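If you need the argument list as data rather than as printed output, base R also provides formals(). The sketch below assumes the remainder() function defined earlier and pulls out the argument names and the default value of "divisor".

```r
remainder <- function(num, divisor = 2) {
  num %% divisor
}

# Names of all arguments, in order.
names(formals(remainder))

# Default value of a single argument.
formals(remainder)$divisor
```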

You may not realize it but I just tricked you into doing something pretty interesting! args() is a function, remainder() is a function, yet remainder was an argument for args(). Yes it's true: you can pass functions as arguments! This is a very powerful concept. Let's write a script to see how it works.

evaluate <- function(func, dat){
  # Remember: the last expression evaluated will be returned! 
  func(dat)
}

Let's take your new evaluate() function for a spin! Use evaluate to find the standard deviation of the vector c(1.4, 3.6, 7.9, 8.8).

evaluate(sd,c(1.4, 3.6, 7.9, 8.8))

## [1] 3.514138
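Because evaluate() simply calls whatever function it receives, the same data can be summarized in different ways by swapping the first argument. The variable name dat_example below is hypothetical; median(), max(), and length() are all base R functions.

```r
evaluate <- function(func, dat) {
  func(dat)
}

# Hypothetical variable holding the vector from the example above.
dat_example <- c(1.4, 3.6, 7.9, 8.8)

evaluate(median, dat_example)  # middle of the sorted values: 5.75
evaluate(max, dat_example)     # largest value: 8.8
evaluate(length, dat_example)  # number of elements: 4
```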

The idea of passing functions as arguments to other functions is an important and fundamental concept in programming. You may be surprised to learn that you can pass a function as an argument without first defining the passed function. Functions that are not named are appropriately known as anonymous functions.

Let's use the evaluate function to explore how anonymous functions work. For the first argument of the evaluate function we're going to write a tiny function that fits on one line. In the second argument we'll pass some data to the tiny anonymous function in the first argument.

Type the following command and then we'll discuss how it works:

evaluate(function(x){x+1}, 6)

## [1] 7

The first argument is a tiny anonymous function that takes one argument x and returns x+1. We passed the number 6 into this function so the entire expression evaluates to 7.

Try using evaluate() along with an anonymous function to return the first element of the vector c(8, 4, 0). Your anonymous function should only take one argument which should be a variable x.

evaluate(function(x){x[1]},c(8,4,0))

## [1] 8

Now try using evaluate() along with an anonymous function to return the last element of the vector c(8, 4, 0). Your anonymous function should only take one argument which should be a variable x.

evaluate(function(x){x[length(x)]},c(8,4,0))

## [1] 0
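Anonymous functions are used in the same way with R's built-in functions that accept other functions. As a sketch that goes slightly beyond the tutorial, sapply() from base R applies a function to each element of a vector; the name plus_one is hypothetical.

```r
# Apply an anonymous function to every element of a vector.
plus_one <- sapply(c(8, 4, 0), function(x) x + 1)

plus_one
```

Here the anonymous function is called three times, once per element, and the results are collected into the vector 9, 5, 1.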

For the rest of the course we're going to use the paste() function frequently. The first argument of paste() is ... which is referred to as an ellipsis or simply dot-dot-dot. The ellipsis allows an indefinite number of arguments to be passed into a function. In the case of paste() any number of strings can be passed as arguments and paste() will return all of the strings combined into one string.

paste("Java-Samples", "is", "fun!")

## [1] "Java-Samples is fun!"

Time to write our own modified version of paste(). Telegrams used to be peppered with the words START and STOP in order to demarcate the beginning and end of sentences. Write a function below called telegram that formats sentences for telegrams. For example the expression telegram("Good", "morning") should evaluate to: "START Good morning STOP"

telegram <- function(...){
  paste("START", ..., "STOP")
}

Let us test it.

telegram("bla","bla2","bla3")

## [1] "START bla bla2 bla3 STOP"
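Two details about the ellipsis are worth seeing in code. Named arguments passed through ... are forwarded along with everything else, so a "sep" given to telegram() reaches paste(), and list(...) captures the arguments so they can be counted or inspected. The helper count_args below is a hypothetical name used only for illustration.

```r
telegram <- function(...) {
  paste("START", ..., "STOP")
}

# "sep" travels through the ellipsis and is matched by paste().
telegram("Good", "morning", sep = "-")

# Hypothetical helper: count how many arguments the ellipsis received.
count_args <- function(...) {
  length(list(...))
}

count_args("bla", "bla2", "bla3")
```

The first call evaluates to "START-Good-morning-STOP" and the second to 3.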

Let's explore how to "unpack" arguments from an ellipsis when you use the ellipsis as an argument in a function. The example function below, mad_libs, unpacks three explicitly named arguments called place, adjective, and noun.

mad_libs <- function(...){
  # Do your argument unpacking here!
  arguments <- list(...)
  place <- arguments[["place"]]
  adjective <- arguments[["adjective"]]
  noun <- arguments[["noun"]]
  
  # Notice the variables you'll need to create in order for the code below to
  # be functional!
  paste("News from", place, "today where", adjective, "students took to the streets in protest of the new", noun, "being installed on campus.")
}

Time to use your mad_libs function. Make sure to name the place, adjective, and noun arguments in order for your function to work.

mad_libs(place="Singapore",adjective="Java Samples",noun="wonder")

## [1] "News from Singapore today where Java Samples students took to the streets in protest of the new wonder being installed on campus."
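Since mad_libs() relies on callers naming their arguments, one possible refinement is to check the names before using them. The sketch below is an assumption about how such a check could look, not part of the original exercise; mad_libs_checked and its shortened sentence are hypothetical.

```r
# Hypothetical variant of mad_libs() that validates the argument names.
mad_libs_checked <- function(...) {
  arguments <- list(...)
  required <- c("place", "adjective", "noun")

  # Find any required names that the caller did not supply.
  missing_names <- setdiff(required, names(arguments))
  if (length(missing_names) > 0) {
    stop("Missing named argument(s): ", paste(missing_names, collapse = ", "))
  }

  paste("News from", arguments[["place"]], "where", arguments[["adjective"]],
        "students protest the new", arguments[["noun"]])
}

mad_libs_checked(place = "Singapore", adjective = "curious", noun = "fountain")
```

Calling mad_libs_checked(place = "Singapore", adjective = "curious") would stop with an error naming "noun" instead of silently building an incomplete sentence.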

You're familiar with adding, subtracting, multiplying, and dividing numbers in R. To do this you use the +, -, *, and / symbols. These symbols are called binary operators because they take two inputs, an input from the left and an input from the right.

In R you can define your own binary operators.

User-defined binary operators have the following syntax: %[whatever]% where [whatever] represents any valid variable name.

Write your own binary operator below from absolute scratch! Your binary operator must be called %p% so that the expression "Good" %p% "job!" will evaluate to: "Good job!"

"%p%" <- function(left,right){ # Remember to add arguments!
  paste(left,right)
}

You made your own binary operator! Let's test it out. Paste together the strings "I", "love", and "Java Samples!" using your new binary operator.
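One way to run that test is sketched below. User-defined %...% operators are evaluated from left to right, so the first two strings are pasted together before the third is added; the variable name sentence is hypothetical.

```r
"%p%" <- function(left, right) {
  paste(left, right)
}

# Chained left to right: ("I" %p% "love") %p% "Java Samples!"
sentence <- "I" %p% "love" %p% "Java Samples!"

sentence
```

The result is the single string "I love Java Samples!".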