7 Boolean

We have finally arrived at the last data type that needs a deeper explanation: the boolean (or logical).
Programming languages have a particular type of data, the boolean, that represents what we can think of as true or false. In R it has two values: TRUE and FALSE. T and F also work as shortcuts, but they are ordinary variables that can be overwritten, so the full names are safer. We can see an example here:

t_value <- TRUE
f_value <- FALSE

typeof(t_value)
[1] "logical"
typeof(f_value)
[1] "logical"

R calls this type logical. Logicals are hard to explain without context, so let's see some basic practical applications here. Be patient: in the next lessons we will see more examples.
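A small aside on the T and F shortcuts: the sketch below shows why the full names are usually preferred. TRUE and FALSE are reserved words, while T and F are ordinary variables that can be reassigned. The reassignment here is done on purpose, only for illustration.

```r
# T and F start out as shortcuts for TRUE and FALSE
is.logical(T)
identical(F, FALSE)

# but they are ordinary variables, so they can be overwritten by accident
T <- 0
typeof(T)   # no longer "logical"
isTRUE(T)   # T does not mean TRUE anymore

# clean up: remove our copy so that T points to the original value again
rm(T)
identical(T, TRUE)
```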

Booleans as results of comparisons

The most common way to get a logical is to evaluate a comparison, for example between numbers or between words. Quickly, for numbers:

num1 <- 3
num2 <- 4

# greater than
num1 > num2
[1] FALSE
# greater than or equal to
num1 >= num2
[1] FALSE
# less than
num1 < num2
[1] TRUE
# less than or equal to
num1 <= num2
[1] TRUE
# equality
num1 == num2
[1] FALSE
# not equal to
num1 != num2
[1] TRUE

IMPORTANT: I hope you noticed that we used == for the equality comparison, and I hope you get why we didn't use a single =. If you don't, remember that a single equal sign assigns a value to a variable, so in this case you would have overwritten num1 with the value of num2.
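To make the difference concrete, here is a minimal sketch that reuses num1 and num2 from above (is_equal is a name made up for this example). It first compares the two numbers, then shows what a single equal sign does instead.

```r
num1 <- 3
num2 <- 4

# two equal signs: a comparison, num1 is left untouched
is_equal <- num1 == num2
is_equal       # FALSE
num1           # still 3

# one equal sign: an assignment, num1 now holds the value of num2
num1 = num2
num1           # 4
num1 == num2   # TRUE, but only because we overwrote num1
```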
And what about characters? They behave in this way:

# define some variables
ch1 <- "Mapk13"
ch2 <- "MAPK13"
ch3 <- "Xist"

# Greater than
ch1 > ch2
[1] FALSE
ch3 > ch1
[1] TRUE

With characters, < and > do not take into account the number of characters in the string, as one might expect. The comparison is based on alphabetical order, and in most locales a lowercase letter comes before the same uppercase letter (that's why Mapk13 is greater than neither MAPK13 nor Xist). The exact order depends on the locale R is running in, so a result that hinges only on case may differ between systems.
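A short sketch of the point about string length, using two made-up strings (short_word and long_word are not part of the lesson's data). The shorter string is still the "greater" one, because only the alphabetical order counts.

```r
short_word <- "z"      # hypothetical example strings
long_word  <- "aaaa"

nchar(short_word) < nchar(long_word)   # TRUE: "z" has fewer characters
short_word > long_word                 # TRUE: "z" still comes after "a"

# sort() follows the same alphabetical order used by < and >
sort(c("Xist", "Mapk13"))

# the locale that decides the order on this machine
Sys.getlocale("LC_COLLATE")
```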
Two strings are equal only if they contain the same letters, numbers and special characters, in the same order and in the same case. For example:

ch4 <- "Peg3"
ch5 <- "Peg3"
ch6 <- "peg3"

ch4 == ch5
[1] TRUE
ch5 == ch6
[1] FALSE
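Gene names are often written with different capitalization, so it can be handy to compare strings while ignoring the case. One possible way, sketched here with the base R function toupper(), is to convert both sides to uppercase before comparing:

```r
ch5 <- "Peg3"
ch6 <- "peg3"

ch5 == ch6                     # FALSE: the case differs
toupper(ch5) == toupper(ch6)   # TRUE: both become "PEG3"
```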

Logical operators

Up to now, we have seen logicals as the result of comparisons, but what if we want to combine several comparisons? I know it sounds abstract, but here is a practical case. We will work with dataframes, vectors, matrices and other structures, and we will constantly filter them on some conditions: let's say that we want to extract the data that are below 10 but above 5. These are two comparisons: x < 10 and x > 5. Here we combine two logicals, derived from the two comparisons.
The main logical operators, the ones that will be useful for us, are AND, OR and NOT.
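The "below 10 but above 5" condition can be sketched as follows. The value of x and the names below_10, above_5 and in_range are made up for this example, and & is the AND operator described in the next part.

```r
x <- 7   # hypothetical value to test

below_10 <- x < 10   # TRUE
above_5  <- x > 5    # TRUE

# both conditions at once
in_range <- below_10 & above_5
in_range   # TRUE
```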

AND

The AND operator works as follows:

  • TRUE and TRUE = TRUE
  • TRUE and FALSE = FALSE
  • FALSE and TRUE = FALSE
  • FALSE and FALSE = FALSE

An easy trick to remember it: the result is TRUE only if both are TRUE, otherwise it is FALSE.
In R, the AND operator is &. Some examples:

expr <- 50

(expr < 60) & (expr > 40) # TRUE & TRUE
[1] TRUE
(expr < 60) & (expr < 40) # TRUE & FALSE
[1] FALSE
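The four possible combinations for AND can also be checked directly in the console, which is a quick way to verify the rule whenever in doubt:

```r
TRUE  & TRUE    # TRUE
TRUE  & FALSE   # FALSE
FALSE & TRUE    # FALSE
FALSE & FALSE   # FALSE
```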

OR

The OR operator works as follows:

  • TRUE or TRUE = TRUE
  • TRUE or FALSE = TRUE
  • FALSE or TRUE = TRUE
  • FALSE or FALSE = FALSE

An easy trick to remember it: if at least one is TRUE, the result is TRUE; if both are FALSE, the result is FALSE.
In R, the OR operator is |. Some examples:

expr <- 50

(expr < 60) | (expr > 40) # TRUE | TRUE
[1] TRUE
(expr < 60) | (expr < 40) # TRUE | FALSE
[1] TRUE
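As for AND, the four possible combinations for OR can be checked directly in the console:

```r
TRUE  | TRUE    # TRUE
TRUE  | FALSE   # TRUE
FALSE | TRUE    # TRUE
FALSE | FALSE   # FALSE
```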

NOT

The NOT operator is used to negate an expression. We have seen an example before, when we compared two numbers to see if they were not equal (!=). It is placed before the expression to evaluate, in this form:

expr <- 50

!(expr < 60) | (expr < 40) # (NOT TRUE) | FALSE
[1] FALSE
!((expr < 60) & (expr < 40)) # NOT (TRUE & FALSE)
[1] TRUE

Here we see two important things:

  • The NOT operator is placed before a parenthesis (if it contains a comparison) or directly before a variable holding TRUE or FALSE
  • As for mathematical expressions, order and parentheses matter: parentheses are evaluated first, then ! is applied before &, and & before |
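The effect of order and parentheses can be seen in a small sketch (a and b are made-up logical variables). Moving the parentheses changes which part is negated or combined first, and therefore the result.

```r
a <- TRUE
b <- TRUE   # a and b are hypothetical logical variables

# ! is applied before |, so only a is negated here
!a | b      # (NOT TRUE) | TRUE -> TRUE
# with parentheses, the whole OR is negated
!(a | b)    # NOT (TRUE | TRUE) -> FALSE

# & is applied before |
TRUE | FALSE & FALSE     # TRUE | (FALSE & FALSE) -> TRUE
(TRUE | FALSE) & FALSE   # TRUE & FALSE           -> FALSE
```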

Exercises

Write an R script with the following exercises. They are pro level (I know you can do it):

Exercise 7.1 Write the expression you would use to evaluate the following statement: we want to see if the patient is in childhood (2-8 years), and either their weight is below 45 kg (used as a threshold variable) or their mother has diabetes (we know it is true), and their nationality is not USA.
Tip: here we have 8 variables. I know I didn't give the age, the weight or the nationality; you can create these variables and give them the values you want. This exercise is for practicing the writing and the logic.

Solution
# create patient variables
patient_age <- 5
patient_weight <- 66
mother_diabetes <- TRUE
patient_state <- "Italy"

# set thresholds and values
age_inf_threshold <- 2
age_sup_threshold <- 8

weight_sup_threshold <- 45

nationality <- "USA"

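# in childhood (2 to 8 years, bounds included) AND
# (low weight OR mother with diabetes) AND not from the USA
((patient_age >= age_inf_threshold) & (patient_age <= age_sup_threshold)) &
  ((patient_weight < weight_sup_threshold) | mother_diabetes) &
  (patient_state != nationality)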
[1] TRUE

It is best practice to use extra parentheses: they help readability for both humans and R.
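Another way to keep a long condition readable, sketched here as one possible style rather than the only one, is to store each part in a named logical and combine them at the end. The names is_child, at_risk and not_usa are made up for this example, the patient values are the same as above, and the age bounds are assumed to be included.

```r
# same hypothetical patient as in the solution
patient_age <- 5
patient_weight <- 66
mother_diabetes <- TRUE
patient_state <- "Italy"

# one named logical per condition
is_child <- (patient_age >= 2) & (patient_age <= 8)
at_risk  <- (patient_weight < 45) | mother_diabetes
not_usa  <- patient_state != "USA"

is_child & at_risk & not_usa   # TRUE
```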
OK, the next chapter will be on vectors, and we will take another big step towards practical applications and exercises, with real biological questions.