Programming and Quantitative Skills Sample Exam
Short-Answer Questions (5 points)
Question 1 (1 point)
Calculate the following using R:
\[ \frac{2^4 + 10}{8\times \sqrt{4}} \]
Question 2 (1 point)
Calculate \(\log_2\left(64\right)\) using R.
Question 3 (1 point)
If we create the following vector in R, what class will it be?
c(1, 2, "c", "d")
- Numeric.
- Character.
- Both numeric and character.
- Neither numeric nor character.
Note: You do not need to supply your code for this question.
Question 4 (1 point)
Write an R command to generate a numeric vector containing the following sequence:
\[ (10, 10, 20, 20, 30, 30, 40, 40, \dots, 470, 470, 480, 480, 490, 490, 500, 500) \]
Question 5 (1 point)
The logical vectors a and b have equal length. Which of the following options is always the same as !(a | b), regardless of the contents of a and b?
- !a & !b
- !a | !b
- a | b
Note: You do not need to supply your code for this question.
Data Analysis (5 points)
Download the dataset ceosal.csv. The dataset contains information on chief executive officers (CEOs) at different companies. The variable descriptions are:
- salary: CEO compensation in 1990 (in dollars).
- age: CEO age.
- college: \(=1\) if the CEO attended college and \(=0\) otherwise.
- grad: \(=1\) if the CEO attended post-graduate education and \(=0\) otherwise.
- comten: Years the CEO worked with the company.
- ceoten: Years as CEO with the company.
- profits: Profits of the company in 1990 (in dollars).
When reading the dataset into R, assign it to df.
Question 6 (1 point)
How many observations are in the dataset?
Also assign your answer to answer6 in your code.
Question 7 (1 point)
What is the median of the variable comten?
Also assign your answer to answer7 in your code.
Question 8 (1 point)
What is the mean salary of the CEOs who attended post-graduate education?
Also assign your answer to answer8 in your code.
Question 9 (1 point)
How many people in the dataset attended college but didn’t attend post-graduate education?
Also assign your answer to answer9 in your code.
Question 10 (1 point)
In the dataset, what is the longest time (in years) that someone worked at a company before becoming CEO?
Also assign your answer to answer10 in your code.
Data Cleaning (4 points)
Download the dataset euro-dollar-2022.csv. The dataset contains the closing Euro-Dollar exchange rate (variable “Price”) each day throughout 2022, together with the opening, highest and lowest exchange rate. Furthermore, it includes the volume traded and the percentage daily change in the closing price.
In this question you will need to clean this dataset to answer the questions that follow.
When reading the dataset into R, assign it to df.
You should do the following cleaning tasks to your dataframe df:
- Format the Date variable to a date.
- Sort the data by Date ascending (the earliest date in the data should be first, the most recent date last).
- Drop rows with any missing data.
- Convert Price, Open, High and Low to numeric.
- Convert Vol to numeric. For example, "33.87K" should be 33870. Hint: First use the gsub() function to remove the K. Then convert the variable to numeric format. Finally multiply it by 1,000.
- Convert Change to numeric. Tip: Use gsub("\\%", "", x) to remove a percentage symbol from x.
- Convert all variable names to lower case.
If you did all the steps correctly, you should have 260 observations. The average of the high variable should be 0.9561. The average of the vol variable should be 79,920. If only some of these match your cleaned dataset, you will still be able to answer some of the questions correctly.
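The cleaning steps above could be sketched as follows. This is only an illustration on a small made-up data frame, not the exam data: the values, the "%m/%d/%Y" date format and the use of NA to mark missing data are all assumptions, so check how dates and missing values actually appear in your file before reusing any of it.

```r
# Toy stand-in for the raw file. All values are invented for illustration,
# and the date format (month/day/year) is an assumption.
df <- data.frame(
  Date   = c("01/04/2022", "01/03/2022", "01/05/2022"),
  Price  = c("1.1286", "1.1297", NA),
  Open   = c("1.1298", "1.1370", "1.1287"),
  High   = c("1.1323", "1.1380", "1.1347"),
  Low    = c("1.1272", "1.1280", "1.1277"),
  Vol    = c("33.87K", "41.20K", "38.05K"),
  Change = c("-0.10%", "0.25%", "0.24%"),
  stringsAsFactors = FALSE
)

# Format Date as a date (the format string is an assumption).
df$Date <- as.Date(df$Date, format = "%m/%d/%Y")

# Sort ascending by date.
df <- df[order(df$Date), ]

# Drop rows with any missing value.
df <- na.omit(df)

# Convert the four price columns to numeric.
price_cols <- c("Price", "Open", "High", "Low")
df[price_cols] <- lapply(df[price_cols], as.numeric)

# Remove the "K", convert to numeric, then scale by 1,000.
df$Vol <- as.numeric(gsub("K", "", df$Vol)) * 1000

# Remove the percentage symbol and convert to numeric.
df$Change <- as.numeric(gsub("\\%", "", df$Change))

# Lower-case all variable names.
names(df) <- tolower(names(df))

str(df)
```

In this toy example one row is dropped because its Price is missing, leaving two rows sorted from the earliest date to the latest.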
Question 11 (1 point)
Create a variable called hml which is the high variable minus the low variable. What is the mean of this variable?
Also assign your answer to answer11 in your code.
Question 12 (1 point)
What is the median of the vol variable?
Also assign your answer to answer12 in your code.
Question 13 (1 point)
What was the largest negative daily price change in the data?
Write the percentage change without the % symbol.
Also assign your answer to answer13 in your code.
Question 14 (1 point)
On which date was the largest volume traded?
Also assign your answer to answer14 in your code.
Optimization (3 points)
The next 3 questions involve the following mathematical function, defined over all real numbers \(x\):
\[f(x) = -x^2 + 2x - 5\]
Question 15 (1 point)
Plot the function between the \(x\) values \(-3\) and \(+5\). Choose the answer below which best describes the shape of this function:
- Straight line
- Flat
- U shape
- Inverted U shape (upside-down U)
Note: you do not need to save your answer in your R script for this question.
Question 16 (1 point)
Use R to find the value of \(x\) that maximizes this function.
Also assign your answer to answer16 in your code.
Question 17 (1 point)
What value does the function take at its maximum?
Also assign your answer to answer17 in your code.
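One possible way to approach questions of this kind is base R's optimize() function, sketched below on a different quadratic. The function g and its coefficients are hypothetical and chosen only for illustration; they are not the exam's \(f(x)\).

```r
# Hypothetical example function (NOT the exam's f): a downward-opening
# parabola with its peak at x = 1.5, where it takes the value 4.
g <- function(x) -(x - 1.5)^2 + 4

# Plot the function over the interval of interest.
curve(g, from = -3, to = 5)

# Search for the maximum on the same interval.
res <- optimize(g, interval = c(-3, 5), maximum = TRUE)

x_star <- res$maximum    # the x value at which g is largest
g_star <- res$objective  # the value of g at that point
```

optimize() only searches inside the interval it is given, so the interval should be chosen to contain the peak seen in the plot.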
Aggregating, Merging and Reshaping (3 points)
Download the two datasets:
- gdp-per-capita-growths.csv - this dataset contains 3 variables: country, year and gdp_pc_growth. The variable gdp_pc_growth is the growth rate of the country’s per capita gross domestic product (GDP) in that year.
- lending-rates.csv - this dataset contains 3 variables: country, year and lending_rate. The variable lending_rate is the lending interest rate in that country in that year.
Assign the first dataset gdp-per-capita-growths.csv to df1 when reading it into R.
Assign the second dataset lending-rates.csv to df2 when reading it into R.
Question 18 (1 point)
Using the dataset gdp-per-capita-growths.csv, calculate the average per capita GDP growth rate across countries by year.
In what year in the data was the average GDP per capita growth rate the smallest?
Also assign your answer to answer18 in your code.
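Averaging a variable by year can be sketched with base R's aggregate(), as below. The data frame toy and its values are made up for illustration; only the column names follow the dataset description.

```r
# Toy data with the same column names as the real dataset (values invented).
toy <- data.frame(
  country       = c("A", "B", "A", "B"),
  year          = c(2000, 2000, 2001, 2001),
  gdp_pc_growth = c(2.0, 4.0, -1.0, 1.0)
)

# Mean growth rate across countries within each year.
by_year <- aggregate(gdp_pc_growth ~ year, data = toy, FUN = mean)

# Year with the smallest average growth rate.
smallest_year <- by_year$year[which.min(by_year$gdp_pc_growth)]
```

Note that the formula interface of aggregate() drops rows with missing values by default.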
Question 19 (1 point)
Merge the datasets gdp-per-capita-growths.csv and lending-rates.csv together by the variables "country" and "year", dropping observations without a match. Your merged dataset should have 2,572 observations and 4 variables.
Report the mean growth rate in GDP per capita in the merged dataset.
Also assign your answer to answer19 in your code.
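A merge that keeps only matched observations can be sketched with base R's merge(), whose default behaviour is to drop rows without a match in the other dataset. The data frames toy1 and toy2 below are made up for illustration; only the column names follow the dataset descriptions.

```r
# Two toy datasets sharing the key columns "country" and "year" (values invented).
toy1 <- data.frame(
  country       = c("A", "A", "B"),
  year          = c(2000, 2001, 2000),
  gdp_pc_growth = c(2.0, -1.0, 4.0)
)
toy2 <- data.frame(
  country      = c("A", "B", "B"),
  year         = c(2000, 2000, 2001),
  lending_rate = c(5.0, 7.5, 7.0)
)

# By default merge() keeps only rows whose keys appear in both datasets.
merged <- merge(toy1, toy2, by = c("country", "year"))

dim(merged)                          # number of rows and columns after the merge
mean_growth <- mean(merged$gdp_pc_growth)
```

Here only the (A, 2000) and (B, 2000) observations appear in both toy datasets, so the merged result has 2 rows and 4 variables.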
Question 20 (1 point)
Create a subset of the merged data from Question 19 which only includes data on the Netherlands. This is when the country variable equals "Netherlands".
Reshape the Netherlands data to long format and use this long-format data to create a line plot of per capita GDP growth and the lending rate over time in the Netherlands. Choose the answer below which best describes what the plot shows in 2009:
- GDP per capita growth fell sharply, but the lending rate increased.
- GDP per capita growth rose sharply, but the lending rate decreased.
- GDP per capita growth fell sharply, and the lending rate also decreased.
- GDP per capita growth rose sharply, and the lending rate also increased.
Hint: Your long-format data for the Netherlands should have 4 variables: "country", "year", "variable" and "value", where:
- "country" is "Netherlands" everywhere.
- "year" takes on the values 2000-2013 repeated twice.
- "variable" contains the two variable names ("gdp_pc_growth" and "lending_rate") repeated for each year.
- "value" contains the values of those variables in those years.
The first 3 rows of your reshaped data should look like:
country year variable value
Netherlands 2000 gdp_pc_growth 3.4535383
Netherlands 2001 gdp_pc_growth 1.5574581
Netherlands 2002 gdp_pc_growth -0.4203677
Note: You do not need to save your answer in your R script for this question.
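The wide-to-long reshape described in the hint can be sketched in base R as below. The country "Exampleland" and all values are made up for illustration. Packages such as reshape2 or tidyr offer functions for the same task, but this sketch assumes only base R.

```r
# Toy wide-format data: one row per year, one column per measure (values invented).
wide <- data.frame(
  country       = "Exampleland",
  year          = 2000:2002,
  gdp_pc_growth = c(1.0, 2.0, 3.0),
  lending_rate  = c(4.0, 5.0, 6.0)
)

# Stack the two measure columns into "variable"/"value" pairs.
measure_cols <- c("gdp_pc_growth", "lending_rate")
long <- do.call(rbind, lapply(measure_cols, function(v) {
  data.frame(
    country  = wide$country,
    year     = wide$year,
    variable = v,
    value    = wide[[v]]
  )
}))

# Line plot of both series over time, one line per variable.
gdp  <- long[long$variable == "gdp_pc_growth", ]
rate <- long[long$variable == "lending_rate", ]
plot(gdp$year, gdp$value, type = "l", ylim = range(long$value),
     xlab = "year", ylab = "value")
lines(rate$year, rate$value, lty = 2)
legend("topleft", legend = measure_cols, lty = 1:2)
```

The resulting long data frame lists every year for the first variable, then every year for the second, which matches the row order shown in the hint.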