Chapter 16 Conditional statements

Quite often you’ll want to write some code that will do one thing in some circumstances and something else the rest of the time. For this sort of conditional expression R has if() and if() else. Like for(), these work with a set of instructions in braces after the brackets. The simplest kind of conditional expression is one where there’s only one option: if().

if(logical expression) { 
  some instructions
}

Here, if the logical expression returns TRUE R will carry out the instructions in the braces, and if it returns FALSE it won’t. Here’s an example of an if() statement.

# Generate a single random number drawn from 
# a standard normal distribution
X1 <- rnorm(1)

# Tell us what the number is
X1
[1] 0.92434

if(X1>0) { 
  print("The number is positive")
}
[1] "The number is positive"

Here we generate a single random number drawn from a standard normal distribution (meaning one with a mean of 0 and standard deviation 1). The if statement then asks whether the number is greater than zero, and if it is, it prints a message to let you know that the number is positive. In this case the number was greater than zero so we get told this.
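
Because the outcome above depends on a random draw, it can help to see both cases with fixed values. The following is a minimal sketch: the names x_neg, x_pos, result_neg and result_pos are made up for illustration and are not part of the example above.

```r
# Hand-picked values (hypothetical), used instead of rnorm()
# so the result is the same every time
x_neg <- -1.5
x_pos <- 2.3

# Start with a placeholder that is only overwritten
# if the condition is TRUE
result_neg <- "no message"
result_pos <- "no message"

# Condition is FALSE: the code in the braces is skipped
if (x_neg > 0) {
  result_neg <- "The number is positive"
}

# Condition is TRUE: the code in the braces runs
if (x_pos > 0) {
  result_pos <- "The number is positive"
}

result_neg
result_pos
```

Only result_pos is changed, because if() on its own does nothing at all when the logical expression is FALSE.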

16.1 if() else

The slightly more complex version of if() is if() else, which will do one thing if the logical expression you’ve used is true and another if it is false. if() else uses two sets of braces, one before the “else” and one after, like this.

if(logical expression) { 
  some instructions on what to do if the logical expression is true
} else {
  some other instructions on what to do if the logical expression is false
}

This lets us extend our previous example by also returning a message if the value is less than zero.

# Generate a single random number drawn from 
# a standard normal distribution
X1 <- rnorm(1)

# Tell us what the number is
X1
[1] -0.3075

if(X1>0) { 
  print("The number is positive")
} else {
  print("The number is negative")
}
[1] "The number is negative"

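Now we get a message whether the random number is negative or positive. There is one problem with this, however: what if the number is exactly zero? A draw from a continuous distribution is very unlikely to be exactly zero, but if it were, the code above would wrongly report it as negative, because else catches everything that is not greater than zero. To account for this we can put else and if together to include a second logical test.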

# Generate a single random number drawn from 
# a standard normal distribution
X1 <- rnorm(1)

# Tell us what the number is
X1
[1] -0.62641

if(X1>0) { 
  print("The number is positive")
} else if (X1 < 0) {
  print("The number is negative")
} else {
  print("The number is zero")
}
[1] "The number is negative"

That runs. Let’s test it with a zero.


X1 <- 0

if(X1>0) { 
  print("The number is positive")
} else if (X1 < 0) {
  print("The number is negative")
} else {
  print("The number is zero")
}
[1] "The number is zero"
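
Retyping the whole if() else if() else chain for every test value gets tedious. One way to avoid this, sketched here as an illustration, is to wrap the chain in a small function. The name describe_sign is hypothetical, and the function assumes it is given a single number, since if() expects a single TRUE or FALSE.

```r
# describe_sign() is a hypothetical helper for illustration.
# It assumes x is a single number, not a vector.
describe_sign <- function(x) {
  if (x > 0) {
    "The number is positive"
  } else if (x < 0) {
    "The number is negative"
  } else {
    "The number is zero"
  }
}

# Test all three branches with fixed values
describe_sign(3.2)
describe_sign(-0.5)
describe_sign(0)
```

The function returns whichever string belongs to the branch that ran, so each of the three calls exercises a different branch.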

16.2 Using if() in a function

One very common use of if() is to allow us to have more than one possible output from a function depending on the value given to one or more arguments to that function. We’ve already looked at writing functions to calculate confidence intervals in previous chapters. Now we’re going to take a function that uses resampling to calculate bootstrap confidence intervals for the mean of a vector, and adjust it so that the user can choose to have the confidence intervals calculated for the mean or for the median.

Here is a function that calculates bootstrap confidence intervals for a mean. Note that I’ve written this as a pipeline, which makes the flow of the calculations easier to see, and that I’m using the apply() function to calculate the column means.

boot.conf <- function(x, conf = 95) {
  
  replicate(1000, sample(x, size = length(x),
                         replace = TRUE)) |> 
    apply(2, mean) |> 
    (\(x) c(quantile(x, (100 - conf) / 200),
      quantile(x, 1 - (100 - conf) / 200)))()
}     
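
The two quantile() calls in the function turn conf, a percentage, into a pair of probabilities. As a quick check of that arithmetic on its own, here is a short sketch. The names lower_p and upper_p are made up for illustration and do not appear in the function.

```r
# The confidence level as a percentage, as in the function's default
conf <- 95

# Share of the distribution left out in each tail:
# (100 - 95) / 200 = 5 / 200
lower_p <- (100 - conf) / 200

# The upper probability is the mirror image of the lower one
upper_p <- 1 - (100 - conf) / 200

c(lower_p, upper_p)
```

With conf = 95 the 5% outside the interval is split evenly between the two tails, which is why the output of the function is labelled 2.5% and 97.5%.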

There are currently two arguments: x, which is the vector, and conf, which is the confidence level (as a percentage) for the interval we would like to calculate. The default value for conf is 95, so by default we will get the 95% confidence intervals. Now we will add a third argument, calc, which tells the function whether we want the confidence intervals for the mean or the median. As a default we’ll have the mean.

boot.conf <- function(x, conf = 95, calc = "mean") {

To calculate the confidence intervals on the median rather than the mean we can use the same code as previously, but we need to replace mean with median, so that apply(2, mean) |> becomes apply(2, median) |>. Knowing this we can use an if statement to make a function that will do either depending on what we ask for. We use an else if to handle the median, and we’re also going to include an option for when calc is something other than mean or median, by finishing with an else.

boot.conf <- function(x, conf = 95, calc = "mean") {
  
  # Calculations for mean
  if (calc == "mean") {
    replicate(1000, sample(x, size = length(x),
                           replace = TRUE)) |> 
      apply(2, mean) |> 
      (\(x) c(quantile(x, (100 - conf) / 200),
              quantile(x, 1 - (100 - conf) / 200)))()
  }
  
  # Calculations for median
  else if (calc == "median") {
    replicate(1000, sample(x, size = length(x),
                           replace = TRUE)) |> 
      apply(2, median) |> 
      (\(x) c(quantile(x, (100 - conf) / 200),
              quantile(x, 1 - (100 - conf) / 200)))()
  }
  
  # If calc is something else
  else {
    print("calc must be either 'median' or 'mean'")
  }
}

Let’s see if it works.

test1<-rnorm(100,20,5)
boot.conf(test1)
  2.5%  97.5% 
19.016 21.018 

Because we’ve not given a value for either of the two arguments with default values (conf and calc) we get the default options, so this returns the 95% confidence limits on the mean.

boot.conf(test1,calc = "median")
  2.5%  97.5% 
17.662 22.041 

Now that we’ve specified calc = "median" we get the bootstrap 95% confidence limits on the median.

boot.conf(test1,calc = "marmite")
[1] "calc must be either 'median' or 'mean'"

Love it or hate it, it’s not a measure of central tendency and we can’t calculate confidence intervals for it.
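
The function above repeats the resampling code in both branches and uses print() for the unrecognised case. One possible variation, offered as a sketch rather than as the approach used in this chapter, is to use if() only to choose the summary function and to call stop() so that a bad value of calc raises an error instead of printing a message. The names boot.conf2, summary_fun and test2 are hypothetical.

```r
# boot.conf2() is a hypothetical variant of boot.conf()
boot.conf2 <- function(x, conf = 95, calc = "mean") {

  # Use if() only to pick which summary function to apply
  if (calc == "mean") {
    summary_fun <- mean
  } else if (calc == "median") {
    summary_fun <- median
  } else {
    # stop() halts the function and signals an error
    stop("calc must be either 'median' or 'mean'")
  }

  # The resampling pipeline is now written only once
  replicate(1000, sample(x, size = length(x),
                         replace = TRUE)) |>
    apply(2, summary_fun) |>
    (\(x) c(quantile(x, (100 - conf) / 200),
            quantile(x, 1 - (100 - conf) / 200)))()
}

# Fix the seed so the example is repeatable
set.seed(42)
test2 <- rnorm(100, 20, 5)

ci_mean <- boot.conf2(test2)
ci_median <- boot.conf2(test2, calc = "median")
ci_mean
ci_median
```

Whether an error is preferable to a printed message depends on how the function will be used. An error stops any code that relies on the result, whereas print() lets it carry on with a character string where numbers were expected.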

16.3 ifelse()

ifelse() is a vectorised version of if() else, written as a single function call. It applies the logical test to each element of a vector in turn, which makes it useful for applying a conditional statement to every element at once. The syntax is ifelse(logical test, what to return if it's true, what to return if it's false). As an example, here’s a vector of weights (kg) for 15 tortoises.

weights<-c(1.74,1.83,1.61,1.53,1.52,
           1.78,1.88,1.68,1.78,1.82,
           1.89,1.80,1.75,1.61,1.64)

If you wanted to classify each tortoise as either at or above the median weight, or below it, then ifelse() is ideal.

tort_weight <- ifelse(weights >= median(weights), "Above or equal", "Below")

tort_weight
 [1] "Below"          "Above or equal" "Below"          "Below"         
 [5] "Below"          "Above or equal" "Above or equal" "Below"         
 [9] "Above or equal" "Above or equal" "Above or equal" "Above or equal"
[13] "Above or equal" "Below"          "Below"         
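
Just as else and if can be chained, one ifelse() can be placed inside another to give more than two categories. The sketch below uses the same weights to separate tortoises above, below and exactly at the median. The names med and tort_class are made up for illustration.

```r
# Same tortoise weights (kg) as above
weights <- c(1.74, 1.83, 1.61, 1.53, 1.52,
             1.78, 1.88, 1.68, 1.78, 1.82,
             1.89, 1.80, 1.75, 1.61, 1.64)

# Store the median once rather than recalculating it
med <- median(weights)

# The inner ifelse() is only used for elements where
# the outer test (weights > med) is FALSE
tort_class <- ifelse(weights > med, "Above",
                     ifelse(weights < med, "Below", "Equal"))

# Count how many tortoises fall in each category
table(tort_class)
```

Nesting more than two or three levels quickly becomes hard to read, so this is best kept for a small number of categories.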

16.4 Exercises