Chapter 16 Conditional statements
Quite often you’ll want to write some code that will do one thing in some circumstances and something else the rest of the time. For this sort of conditional expression R has if() and if() else. Like for(), these work with a set of instructions in braces after the brackets. The simplest kind of conditional expression is one where there’s only one option: if().
if(logical expression) {
  some instructions
}
Here, if the logical expression returns TRUE, R will carry out the instructions in the braces, and if it returns FALSE it won’t. Here’s an example of an if() statement.
# Generate a single random number drawn from
# a standard normal distribution
X1 <- rnorm(1)

# Tell us what the number is
X1
[1] 0.92434

if(X1>0) {
  print("The number is positive")
}
[1] "The number is positive"
Here we generate a single random number drawn from a standard normal distribution (meaning one with a mean of 0 and standard deviation 1). The if statement then asks whether the number is greater than zero, and if it is, it prints a message to let you know that the number is positive. In this case the number was greater than zero so we get told this.
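Because rnorm() gives a different number each time, it can help to try the same if() statement on values we choose ourselves. The sketch below uses two made-up values (0.5 and -0.5, not from the example above) and collects any messages in a vector called messages, a name introduced here purely for illustration. Only the positive value should produce a message, because nothing happens when the logical expression is FALSE.

```r
# Hypothetical fixed values, used instead of rnorm() so the result is repeatable
values <- c(0.5, -0.5)
messages <- character(0)

for (X1 in values) {
  if (X1 > 0) {
    # Only reached when the logical expression is TRUE
    messages <- c(messages, paste(X1, "is positive"))
  }
}

# Show the messages that were collected
messages
```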
16.1 if() else
The slightly more complex version of if() is if() else, which will do one thing if the logical expression you’ve used is true and another if it is false. if() else uses two sets of braces, one before the “else” and one after, like this.
if(logical expression) {
  some instructions on what to do
  if the logical expression is true
} else {
  some other instructions on what to do
  if the logical expression is false
}
This lets us extend our previous example by also returning a message if the value is less than zero.
# Generate a single random number drawn from
# a standard normal distribution
X1 <- rnorm(1)

# Tell us what the number is
X1
[1] -0.3075

if(X1>0) {
  print("The number is positive")
} else {
  print("The number is negative")
}
[1] "The number is negative"
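Now we get a message whether the random number is negative or positive. There is one problem with this, however: what if the number is equal to zero? The else branch catches everything that is not greater than zero, so a zero would be reported as negative. An exact zero is very unlikely from rnorm(), but it is possible in principle. To account for this we can put else and if together to include a second logical test.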
# Generate a single random number drawn from
# a standard normal distribution
X1 <- rnorm(1)

# Tell us what the number is
X1
[1] -0.62641

if(X1>0) {
  print("The number is positive")
} else if (X1 < 0) {
  print("The number is negative")
} else {
  print("The number is zero")
}
[1] "The number is negative"
That runs. Let’s test it with a zero.
X1 <- 0

if(X1>0) {
  print("The number is positive")
} else if (X1 < 0) {
  print("The number is negative")
} else {
  print("The number is zero")
}
[1] "The number is zero"
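To check all three branches without retyping the whole statement, one option is to wrap the same if() / else if / else chain in a small function. This is only a sketch: the name sign_message() is made up for illustration, and it returns the message rather than printing it so that the result can be stored or compared.

```r
# Hypothetical helper: same three-way test as above, wrapped in a function
sign_message <- function(x) {
  if (x > 0) {
    "The number is positive"
  } else if (x < 0) {
    "The number is negative"
  } else {
    "The number is zero"
  }
}

# Try one value for each branch
sign_message(2.5)
sign_message(-1)
sign_message(0)
```

Each call runs the tests in order and stops at the first one that is TRUE, so the final else is only reached when the value is neither above nor below zero.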
16.2 Using if() in a function
One very common use of if() is to allow us to have more than one possible output from a function depending on the value given to one or more arguments to that function. We’ve already looked at writing functions to calculate confidence intervals in previous chapters. Here we’ll take a function that uses resampling to calculate bootstrap confidence intervals for the mean of a vector and adjust it so that the user can choose to have the confidence intervals calculated for the mean or for the median.
Here is a function that calculates bootstrap confidence intervals for a mean. Note that I’ve written this as a pipeline, which makes the flow of the calculations easier to see, and that I’m using the apply() function to calculate the column means.
boot.conf <- function(x, conf = 95) {
  replicate(1000, sample(x, size = length(x),
                         replace = TRUE)) |>
    apply(2, mean) |>
    (\(x) c(quantile(x, (100 - conf) / 200),
            quantile(x, 1 - (100 - conf) / 200)))()
}
There are currently two arguments: x, which is the vector, and conf, which is the confidence level (as a percentage) for the interval we would like to calculate. The default value for conf is 95, so by default we will get the 95% confidence intervals. Now we will add a third argument, calc, which tells the function whether we want the confidence intervals for the mean or the median. As a default we’ll have the mean.
boot.conf <- function(x, conf = 95, calc = "mean") {
To calculate the confidence intervals on the median rather than the mean we can use the same code as previously, but we need to replace the mean in apply(2, mean) |> with median, so it becomes apply(2, median) |>. Knowing this we can use an if statement to make a function that will do either depending on what we ask for. We use an else if for the median, and we’re also going to include an option for when calc is something other than mean or median by adding a final else.
boot.conf <- function(x, conf = 95, calc = "mean") {

  # Calculations for mean
  if (calc == "mean") {
    replicate(1000, sample(x, size = length(x),
                           replace = TRUE)) |>
      apply(2, mean) |>
      (\(x) c(quantile(x, (100 - conf) / 200),
              quantile(x, 1 - (100 - conf) / 200)))()
  }
  # Calculations for median
  else if (calc == "median") {
    replicate(1000, sample(x, size = length(x),
                           replace = TRUE)) |>
      apply(2, median) |>
      (\(x) c(quantile(x, (100 - conf) / 200),
              quantile(x, 1 - (100 - conf) / 200)))()
  }
  # If calc is something else
  else {
    print("calc must be either 'median' or 'mean'")
  }
}
Let’s see if it works.
test1 <- rnorm(100,20,5)
boot.conf(test1)
  2.5%  97.5%
19.016 21.018
Because we’ve not given a value for either of the two arguments with default values (conf and calc) we get the default options, so this returns the 95% confidence limits on the mean.
boot.conf(test1,calc = "median")
2.5% 97.5%
17.662 22.041
Now that we’ve specified calc = "median" we get the bootstrap 95% confidence limits on the median.
boot.conf(test1,calc = "marmite")
[1] "calc must be either 'median' or 'mean'"
Love it or hate it, it’s not a measure of central tendency and we can’t calculate confidence intervals for it.
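The two main branches of boot.conf() differ only in the summary function passed to apply(), so one possible variation is to choose that function first and then run a single pipeline. The sketch below is an alternative, not the version used in this chapter: the name boot_conf_alt() is made up, it uses switch() to pick the function, and it calls stop() so that an invalid calc raises an error instead of printing a message. It also asks quantile() for both limits in one call.

```r
# Hypothetical alternative to boot.conf(): choose the summary function first
boot_conf_alt <- function(x, conf = 95, calc = "mean") {

  # switch() returns NULL when calc matches neither name
  stat <- switch(calc, mean = mean, median = median)

  # Raise an error rather than printing a message
  if (is.null(stat)) {
    stop("calc must be either 'median' or 'mean'")
  }

  # Lower tail probability, e.g. 0.025 when conf = 95
  lower <- (100 - conf) / 200

  replicate(1000, sample(x, size = length(x), replace = TRUE)) |>
    apply(2, stat) |>
    quantile(c(lower, 1 - lower))
}

# Example use with simulated data
test2 <- rnorm(100, 20, 5)
boot_conf_alt(test2, calc = "median")
```

Whether an error or a printed message is better depends on how the function will be used. An error stops any code that depends on the result, which is often what you want when an argument is wrong.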
16.3 ifelse()
ifelse() is a way of putting an if() else function in a single set of brackets. It’s useful for applying repeated conditional statements to each element of a vector. The syntax is ifelse(logical test, what to do if it's true, what to do if it's false). As an example, here’s a vector of weights (kg) for 15 tortoises.
weights <- c(1.74,1.83,1.61,1.53,1.52,
             1.78,1.88,1.68,1.78,1.82,
             1.89,1.80,1.75,1.61,1.64)
If you wanted to classify these tortoises as greater than or equal to, or below median weight, then ifelse() is ideal.
tort_weight <- ifelse(weights >= median(weights), "Above or equal", "Below")

tort_weight
 [1] "Below"          "Above or equal" "Below"          "Below"
 [5] "Below"          "Above or equal" "Above or equal" "Below"
 [9] "Above or equal" "Above or equal" "Above or equal" "Above or equal"
[13] "Above or equal" "Below"          "Below"
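ifelse() calls can also be nested when there are more than two outcomes, which gives a vectorised counterpart to the if() / else if / else chain from earlier in the chapter. The sketch below uses a small made-up vector (x_vals, a name introduced here for illustration) and classifies each element as positive, negative or zero.

```r
# Hypothetical example vector with one value for each case
x_vals <- c(-1.5, 0, 2.3)

# The inner ifelse() is only used for elements where x_vals > 0 is FALSE
sign_labels <- ifelse(x_vals > 0, "positive",
                      ifelse(x_vals < 0, "negative", "zero"))

sign_labels
# [1] "negative" "zero"     "positive"
```

Nesting more than two or three levels quickly becomes hard to read, so it is worth keeping nested ifelse() calls short.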