5  Operations on Vectors

In this chapter we will learn how to index vectors, create sequences and repeated values, and compute summary statistics for vectors in R.

5.1 Indexing

Suppose we have a vector a with 5 elements and we wanted to isolate the 3rd element of it. We can do this with what is called indexing. To get the 3rd element of a vector a, we do a[3]. Let’s see this with an example:

a <- c(1, 2, 4, 3, 2)
a[3]
[1] 4

We can also extract multiple elements of the vector using a vector of indices inside the []. For example, suppose we wanted to get the 1st, 3rd and 4th element of a. We would put the vector c(1, 3, 4) inside the square brackets:

a[c(1, 3, 4)]
[1] 1 4 3
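
The indices do not have to be in increasing order, and the same index may appear more than once. As a small illustration, assuming the same vector a as above (picked is just a name chosen for this sketch):

```r
a <- c(1, 2, 4, 3, 2)
# Elements come back in the order the indices are given; repeats are allowed
picked <- a[c(4, 1, 1)]
picked
# [1] 3 1 1
```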

We can also extract elements of a vector using a logical vector. Doing this extracts the elements where the logical vector is TRUE. The logical vector should have the same length as the vector we are indexing (R recycles a shorter one, which is rarely what we want). As above, if we want the 1st, 3rd and 4th elements of a, we can use a vector with TRUE in the 1st, 3rd and 4th positions and FALSE everywhere else:

a[c(TRUE, FALSE, TRUE, TRUE, FALSE)]
[1] 1 4 3
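
In practice we rarely type the TRUEs and FALSEs by hand. A comparison such as a > 2 produces a logical vector of the same length as a, which can then be used as the index. A minimal sketch, where keep is just a name chosen for this illustration:

```r
a <- c(1, 2, 4, 3, 2)
keep <- a > 2   # TRUE where the element is greater than 2
keep
# [1] FALSE FALSE  TRUE  TRUE FALSE
a[keep]         # only the elements where keep is TRUE
# [1] 4 3
```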

Suppose I want everything in a vector except one element. For example, suppose I want to see the entire vector a except the 2nd element. We can do this by putting -2 in the brackets:

a[-2]
[1] 1 4 3 2
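
Several elements can be dropped at once by negating a vector of indices. Note that this only changes what is displayed: a itself stays the same unless we assign the result to a name. A short sketch with the same a (shorter is an illustrative name):

```r
a <- c(1, 2, 4, 3, 2)
a[-c(1, 2)]        # drop the 1st and 2nd elements
# [1] 4 3 2
a                  # the original vector is untouched
# [1] 1 2 4 3 2
shorter <- a[-2]   # assign the result if we want to keep it
length(shorter)
# [1] 4
```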

5.2 Sequences

Often it is useful to create a sequence of numbers. For a simple sequence like 1, 2, 3, …, 10, we can just do:

1:10
 [1]  1  2  3  4  5  6  7  8  9 10

We can also make the sequence go backwards by reversing the numbers:

10:1
 [1] 10  9  8  7  6  5  4  3  2  1

For sequences that don’t jump in 1s we can use the seq() function. Suppose we wanted to have a sequence from 10 to 100 with steps of 10. We do that with:

seq(from = 10, to = 100, by = 10)
 [1]  10  20  30  40  50  60  70  80  90 100
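
If the steps do not land exactly on the end value, seq() stops at the last value that does not go past to. As an illustration with a made-up end point of 95:

```r
# 95 cannot be reached from 10 in steps of 10, so the sequence stops at 90
s <- seq(from = 10, to = 95, by = 10)
s
# [1] 10 20 30 40 50 60 70 80 90
```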

Instead of specifying the step length with by, we can alternatively specify the length of the sequence. Suppose I wanted to have a sequence going from 0 to 1 in equal steps with 5 numbers in total. I can do that using the length.out option:

seq(from = 0, to = 1, length.out = 5)
[1] 0.00 0.25 0.50 0.75 1.00
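
With length.out, R works out the step for us: it is (to - from) / (length.out - 1), which here is (1 - 0) / 4 = 0.25. One way to check this is sketched below using the base R function diff(), which returns the differences between neighbouring elements:

```r
s <- seq(from = 0, to = 1, length.out = 5)
steps <- diff(s)   # each element minus the one before it
steps
# [1] 0.25 0.25 0.25 0.25
```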

5.3 Repeating Numbers

If I wanted to create a vector which is 1 repeated 5 times, I could do:

c(1, 1, 1, 1, 1)
[1] 1 1 1 1 1

But this would get very annoying to type and I could easily make a mistake if I wanted to make many more 1s. If we want to repeat a number many times, we can use the rep() function. For example, if we want to make 100 1s, we would do:

rep(1, times = 100)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

At the start of Chapter 3 we briefly mentioned that the [1] at the start of the output means that the number next to it is the first element. Because this example has many numbers that run onto multiple lines, we can see a [38] at the start of the 2nd line. This means that the first number on the 2nd line is the 38th element of the vector. Similarly, the [75] on the 3rd line means that the first number on that line is the 75th element.
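
We can confirm how these position labels work by indexing. In the sketch below, x is a hypothetical vector chosen so that each value is ten times its position:

```r
x <- seq(from = 10, to = 1000, by = 10)   # 100 elements: 10, 20, ..., 1000
length(x)
# [1] 100
x[38]   # the 38th element
# [1] 380
x[75]   # the 75th element
# [1] 750
```

Where the lines break when a long vector is printed depends on the width of the console, so the labels on your screen may differ from the ones shown in the text.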

The rep() function can also be combined with vectors. Suppose I wanted to repeat 1, 2, 3 four times:

rep(1:3, times = 4)
 [1] 1 2 3 1 2 3 1 2 3 1 2 3

And if I instead wanted to repeat 1, 2, 3, each 4 times, I would use the each option:

rep(1:3, each = 4)
 [1] 1 1 1 1 2 2 2 2 3 3 3 3
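
The two options can also be combined. In that case each is applied first and the resulting pattern is then repeated times times. A small sketch (r is an illustrative name):

```r
r <- rep(1:3, each = 2, times = 2)
r
#  [1] 1 1 2 2 3 3 1 1 2 2 3 3
length(r)   # 3 values x 2 each x 2 times
# [1] 12
```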

5.4 Summary Statistics for Vectors

We can get summary statistics for vectors using functions. Let’s look at some common ones using a simple vector with the sequence 1 to 10:

a <- 1:10
a
 [1]  1  2  3  4  5  6  7  8  9 10

Get the number of elements of a:

length(a)
[1] 10

Get the minimum value in a:

min(a)
[1] 1

Get the maximum value in a:

max(a)
[1] 10

Get the average of all elements in a:

mean(a)
[1] 5.5

Get the median of all elements in a:

median(a)
[1] 5.5

Note on the median: To find the median, we sort the elements of the vector and take the one in the middle. When the vector has an even number of elements, as a does here (10 elements), there is no single middle element, so the median is the average of the two middle values after sorting. Because a is already sorted, these middle values are 5 and 6, so the median is (5+6)/2 = 5.5.
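
The same calculation can be written out by hand. The sketch below uses an unsorted vector b (made up for this illustration) to show that the sorting step comes first:

```r
b <- c(7, 1, 10, 3)
sorted <- sort(b)     # 1 3 7 10
n <- length(sorted)   # 4, an even number
# Average of the two middle values of the sorted vector
by_hand <- (sorted[n / 2] + sorted[n / 2 + 1]) / 2
by_hand
# [1] 5
median(b)
# [1] 5
```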

Get the sum of all elements in a:

sum(a)
[1] 55
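
These functions fit together: the mean is the sum of the elements divided by the number of elements. A quick check, assuming a is still the sequence 1 to 10:

```r
a <- 1:10
by_hand <- sum(a) / length(a)   # 55 / 10
by_hand
# [1] 5.5
```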

A useful way to quickly summarize a numeric vector is with the summary() function, which gives the minimum, first quartile, median, mean, third quartile and maximum:

summary(a)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00    3.25    5.50    5.50    7.75   10.00 
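
The interquartile range is the distance between the 1st Qu. and 3rd Qu. values. It is not printed by summary(), but it can be computed with the base R functions quantile() and IQR(). A sketch, again with a as the sequence 1 to 10 (q is an illustrative name):

```r
a <- 1:10
q <- unname(quantile(a, probs = c(0.25, 0.75)))   # first and third quartiles
q
# [1] 3.25 7.75
q[2] - q[1]   # third quartile minus first quartile
# [1] 4.5
IQR(a)        # the same number in one step
# [1] 4.5
```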

Another useful way of summarizing data is to tabulate it: to count the number of occurrences of each value. We can do that with the table() function:

a <- c(1, 3, 2, 4, 4, 2, 4)
table(a)
a
1 2 3 4 
1 2 1 3 

The output here means that 1 appeared once, 2 appeared twice, 3 appeared once and 4 appeared three times.
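
The result of table() can be stored and indexed like a vector, using the values (as text) as names. A minimal sketch, where counts is an illustrative name:

```r
a <- c(1, 3, 2, 4, 4, 2, 4)
counts <- table(a)
names(counts)            # the distinct values, stored as text
# [1] "1" "2" "3" "4"
as.vector(counts["4"])   # how often the value 4 appears
# [1] 3
sum(counts)              # the counts add up to the length of a
# [1] 7
```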

5.5 Examples from Mathematics for E&BI

5.6 Example 1: A Linear Mortgage

You take out a linear mortgage for €150,000 with a 15-year term and an interest rate of 7%. How much interest will you pay on this loan?

Solution

In a linear mortgage, each year you will make a payment of €10,000 against the value of the loan. This comes from $\frac{150000}{15} = 10000$. We can create a sequence of how the debt evolves over time:

debt <- seq(from = 150000, to = 0, by = -10000)
debt
 [1] 150000 140000 130000 120000 110000 100000  90000  80000  70000  60000
[11]  50000  40000  30000  20000  10000      0

We use by = -10000 because the debt reduces by 10000 each year.
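
Note that debt has 16 elements rather than 15: the opening balance followed by the balance after each of the 15 repayments. A quick check, using diff() from base R to look at the change from one year to the next:

```r
debt <- seq(from = 150000, to = 0, by = -10000)
length(debt)         # opening balance + 15 year-end balances
# [1] 16
unique(diff(debt))   # every yearly change is the same repayment
# [1] -10000
```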

Each year you also need to pay interest of 7% on the debt still outstanding in that year. The sequence of interest payments is then:

interest <- debt * 0.07
interest
 [1] 10500  9800  9100  8400  7700  7000  6300  5600  4900  4200  3500  2800
[13]  2100  1400   700     0

To get the total interest payments, we just need to sum up the values in interest:

sum(interest)
[1] 84000
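
Because the outstanding debt falls by the same amount every year, the total can also be checked with the formula for the sum of an arithmetic series, 1 + 2 + ... + n = n(n + 1)/2. A sketch of that check, with n, repayment, rate and total as illustrative names:

```r
n <- 15              # number of yearly repayments
repayment <- 10000   # amount repaid each year
rate <- 0.07         # interest rate
# The debts are 15, 14, ..., 1, 0 repayments in size; 15 + 14 + ... + 1 = n(n + 1)/2
total <- rate * repayment * n * (n + 1) / 2
total
# [1] 84000
```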

We could also have done all of these steps in one go:

sum(seq(from = 150000, to = 0, by = -10000) * 0.07)
[1] 84000

5.7 Example 2: Sum of Geometric Sequences

The parents of a child choose to invest €1,250 each year at 5% annual interest, from when the child is born up to and including the 20th birthday. On the 20th birthday, how much will the investment be worth?

Solution

The first investment at birth will have been in the bank for 20 years and will have grown in value to $1250 \times (1.05)^{20} = 3316.62$. The second investment, from the child’s first birthday, will have grown in value to $1250 \times (1.05)^{19} = 3158.69$. We can proceed with this logic to get the value of all the individual investments. The last investment on the 20th birthday will not have grown, so its value remains at $1250 \times (1.05)^{0} = 1250$.

We can use sequences with R to create a vector which gives the value of each individual investment on the 20th birthday:

values <- 1250 * 1.05^(20:0)
values
 [1] 3316.622 3158.688 3008.274 2865.023 2728.593 2598.660 2474.914 2357.061
 [9] 2244.820 2137.924 2036.118 1939.160 1846.819 1758.876 1675.120 1595.352
[17] 1519.383 1447.031 1378.125 1312.500 1250.000
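
The brackets around 20:0 matter. R applies the ^ operator before the : operator, so without the brackets it would first compute 1.05^20 and then count down from that number in steps of 1. A short sketch of the difference (both variable names are illustrative):

```r
with_brackets <- 1.05^(20:0)    # 1.05 raised to each of the powers 20, 19, ..., 0
length(with_brackets)
# [1] 21
without_brackets <- 1.05^20:0   # read as (1.05^20):0, a countdown from about 2.65
length(without_brackets)
# [1] 3
```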

The total value of the investment is just the sum of these values, which we can calculate using the sum() function:

sum(values)
[1] 44649.06
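
As a check, the same total follows from the formula for the sum of a geometric series, 1 + r + ... + r^n = (r^(n+1) - 1)/(r - 1), here with r = 1.05 and n = 20. A sketch with illustrative names r, n and closed_form:

```r
r <- 1.05   # growth factor per year
n <- 20     # highest power in the sum
closed_form <- 1250 * (r^(n + 1) - 1) / (r - 1)
closed_form
# [1] 44649.06
```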