# Linear Model of Mathematics

Major: Linear model of mathematics
Course code:
Reference:
Requirement:
You only need to do the second and fifth questions.
You need to use R Markdown.

Course (please circle): MATH2831 / MATH2931
I (We) declare that this assessment item is my (our) own work, except where
acknowledged, and has not been submitted for academic credit elsewhere, and
acknowledge that the assessor of this item may, for the purpose of assessing this
item:
• Reproduce this assessment item and provide a copy to another member of the University; and/or,
• Communicate a copy of this assessment item to a plagiarism checking service (which may then retain a copy of the assessment item on its database for the purpose of future plagiarism checking).
I (We) certify that I (We) have read and understood the University Rules in

| | Surname | Given name | Student ID | Signature | Date |
|---|---|---|---|---|---|
| 1 | | | | | |
| 2 | | | | | |
| 3 | | | | | |


Please follow the instructions below for completing the assignment, which is worth 20% of your final mark. You may do this assignment in groups of up to 3 people.
Instructions
• Your assignment must be typeset as one continuous LaTeX (.pdf) or R Markdown (knitted to .pdf) document (no separate documents stapled together).
• Each question should be numbered using section or enumerate environments in LaTeX or with hashes in R Markdown.
• All the content for each part of each question should be consecutive; do not refer to appendices or put questions out of order.
• You must do all your calculations in R and provide all code and relevant output using the verbatim environment in LaTeX or inside R Markdown chunks.
• You must submit a hard copy printed assignment with a completed cover page (above).
• Font size should be easily readable (10 to 12 pt).
• When stating conclusions for hypothesis tests, you must answer the question. E.g. Using α = 0.05, we have evidence (p = 0.004) that latitude is related to tree height, after controlling for temperature.


Assignment 1 – Questions
1. (MATH2831 and MATH2931)
(a) For a simple linear regression model covered in the lecture notes, derive the relationship between the coefficient of determination $R^2$ and the sample correlation coefficient $r$ given by

$$r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}.$$

(b) Consider a simple linear regression model with a known intercept parameter

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \dots, n,$$

where $\beta_0$ is known, $\beta_1$ is an unknown slope parameter, and the errors $\varepsilon_i$ are uncorrelated with zero mean and common variance $\sigma^2$.

i. Find the least squares estimator of $\beta_1$ (you must justify your answer). Does your estimator differ from the estimator obtained in lectures for the case where $\beta_0$ is unknown?
ii. Find the maximum likelihood estimator of $\beta_1$ (you must justify your answer).
iii. Find the mean and variance of the least squares estimator $b_1$ (you must justify your answer).
iv. Prove for the model above that the following identity holds

$$\sum_{i=1}^{n} (y_i - \beta_0)^2 = \sum_{i=1}^{n} (\hat{y}_i - \beta_0)^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$

where $\hat{y}_i$ denotes the fitted value $\beta_0 + b_1 x_i$.
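
Before writing out the proofs in Question 1, it can help to check the two claims numerically. The sketch below is only a sanity check on simulated data, not a derivation. The sample size, the coefficients and the "known" intercept value of 2 are all assumptions made for illustration, and fitting the known-intercept model by regressing the shifted response on x without an intercept is one possible way to obtain b1 in R.

```r
# Hypothetical simulated data; none of these values come from the assignment.
set.seed(1)
n <- 50
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n)

# Part (a), unknown intercept: compare R^2 from lm() with the squared
# sample correlation coefficient.
fit <- lm(y ~ x)
r_squared <- summary(fit)$r.squared
r <- cor(x, y)
check_a <- all.equal(r_squared, r^2)

# Part (b), known intercept (assumed here to be 2): regress y - beta0 on x
# with the intercept removed from the model formula.
beta0 <- 2
fit0 <- lm(I(y - beta0) ~ 0 + x)
b1 <- unname(coef(fit0)[1])
y_hat <- beta0 + b1 * x

# Left and right hand sides of the identity in part (b)(iv).
lhs <- sum((y - beta0)^2)
rhs <- sum((y_hat - beta0)^2) + sum((y - y_hat)^2)
check_b <- all.equal(lhs, rhs)

print(c(check_a = check_a, check_b = check_b))
```

A numerical agreement like this does not replace the justification the question asks for; it only indicates that the algebra is heading in the right direction.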


2. (MATH2831 and MATH2931) To answer the following question, download the 'auction.txt' file from Moodle. In the data set, the selling prices at auction of 30 antique grandfather clocks were recorded. Also recorded are the age of each clock and the number of people who made a bid.

| Variable | Description |
|---|---|
| Age | Age of the clock (years) |
| Bidders | Number of individuals participating in the bidding |
| Price | Selling price (pounds sterling) |

(a) Obtain R summary output generated by fitting a simple linear regression model with Price as the response and Age as the predictor. Include the summary in your assignment.
(b) What are the least squares estimates of the intercept and slope, and
what is the estimated error variance for the fitted model?
(c) How much does price increase or decrease on average, when the age of
the clock increases by one year?
(d) What percentage of variation in the response is explained by the values
of the predictor?
(e) From the R output, state the value of an F test statistic for testing $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$, where $\beta_1$ is the slope term in the model. Also state the p-value for this test, and the conclusion of the test using a 5% level of significance.
(f) State the observed value of a t test statistic equivalent to the F test considered above in part (e). How would the computation of the p-value for the t test be modified for testing $H_0: \beta_1 = 0$ versus the one-sided alternative $H_1: \beta_1 > 0$?
(g) Forecast the price of an antique grandfather clock if it is 170 years
old. Also construct a 95 percent prediction interval for the price, and
give a 95 percent confidence interval for the mean when the age of the
clock is 170 years.
(h) Use Bonferroni adjustment to compute a joint confidence interval for $\beta_0$ and $\beta_1$ with at least 95% confidence level.
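
The parts of Question 2 map onto a small number of base R calls, sketched below under stated assumptions. The data frame here is simulated as a stand-in, because the layout of 'auction.txt' is not shown in this document; the column names follow the variable table, while the simulated values, the seed and the object names (`auction`, `fit`, `new_clock`) are hypothetical. Any numbers this sketch prints are therefore not answers to the question.

```r
# Hypothetical stand-in for auction.txt: 30 simulated clocks with the three
# columns named in the variable table. With the real file one might use
#   auction <- read.table("auction.txt", header = TRUE)
# assuming the file has a header row (an assumption about its layout).
set.seed(2831)
auction <- data.frame(
  Age = round(runif(30, 100, 200)),
  Bidders = sample(5:15, 30, replace = TRUE)
)
auction$Price <- round(-200 + 10 * auction$Age + rnorm(30, sd = 250))

fit <- lm(Price ~ Age, data = auction)
fit_summary <- summary(fit)
print(fit_summary)                    # part (a): summary output

coef(fit)                             # part (b): intercept and slope estimates
sigma_hat_sq <- fit_summary$sigma^2   # part (b): estimated error variance
fit_summary$r.squared                 # part (d): proportion of variation explained
fit_summary$fstatistic                # part (e): F statistic and its degrees of freedom

# Part (f): t statistic for the slope, and an upper-tail p-value for the
# one-sided alternative H1: beta1 > 0.
t_slope <- fit_summary$coefficients["Age", "t value"]
p_one_sided <- pt(t_slope, df = fit$df.residual, lower.tail = FALSE)

# Part (g): prediction and confidence intervals at Age = 170.
new_clock <- data.frame(Age = 170)
pred_int <- predict(fit, new_clock, interval = "prediction", level = 0.95)
conf_int <- predict(fit, new_clock, interval = "confidence", level = 0.95)

# Part (h): Bonferroni splits alpha = 0.05 across the two parameters, so each
# individual interval is computed at level 1 - 0.05 / 2.
joint_ci <- confint(fit, level = 1 - 0.05 / 2)
```

In an R Markdown document each step would normally sit in its own chunk under the matching part heading, so that the code and its output appear together.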


3. (MATH2831 and MATH2931) Let $y = (y_1, \dots, y_n)^\top$ be a set of responses, and consider the linear model

$$y = \boldsymbol{\mu} + \varepsilon,$$

where $\boldsymbol{\mu} = (\mu, \dots, \mu)^\top$ and $\varepsilon$ is a vector of zero mean, uncorrelated errors with variance $\sigma^2$. This is a linear model in which the responses have a constant but unknown mean $\mu$. We will call this model the location model.
(a) If we write the location model in the usual form of the linear model

$$y = X\beta + \varepsilon,$$

then what is the design matrix $X$? What is $\beta$?
(b) Find $X^\top X$, $(X^\top X)^{-1}$ and $X^\top y$.

(c) What is the least squares estimator of μ? Show that this least squares
estimator is unbiased.
(d) Using the results we have proved for the general linear models, derive an expression for an unbiased estimator of $\sigma^2$ in the location model.
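
The location model can also be explored numerically in R as an intercept-only regression. The sketch below is a check on hand calculations, not a substitute for them; the simulated responses and the object names are hypothetical, and treating `y ~ 1` as the location model is the assumption being illustrated.

```r
# Hypothetical responses; the values are simulated, not taken from the assignment.
set.seed(3)
y <- rnorm(20, mean = 5, sd = 2)
n <- length(y)

# Intercept-only fit: lm() builds the design matrix itself.
fit <- lm(y ~ 1)
X <- model.matrix(fit)    # the design matrix R uses for this model
XtX <- crossprod(X)       # X'X
Xty <- crossprod(X, y)    # X'y
b <- solve(XtX, Xty)      # solution of the normal equations

mu_hat <- unname(coef(fit)[1])
sigma_sq_hat <- summary(fit)$sigma^2   # residual sum of squares / residual df

print(c(mu_hat = mu_hat, sigma_sq_hat = sigma_sq_hat))
```

Comparing `mu_hat` and `sigma_sq_hat` with familiar summary statistics of `y` is a quick way to see whether the expressions derived in parts (c) and (d) are plausible.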

4. (MATH2931 only) In this question, we will prove the sums of squares identity

$$SS_{total} = SS_{reg} + SS_{res}$$

stated in lectures for the general linear model.
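(a) If $y$ is the $n \times 1$ vector of response values, show that the vector $\bar{\mathbf{y}}$, which is the $n \times 1$ vector where all entries are $\bar{y}$, is given by

$$\bar{\mathbf{y}} = \mathbf{1}(\mathbf{1}^\top \mathbf{1})^{-1} \mathbf{1}^\top y,$$

where $\mathbf{1}$ is the $n \times 1$ vector of ones.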
(b) Show that

$$SS_{total} = y^\top B^\top B y,$$

where $B = I - \mathbf{1}(\mathbf{1}^\top \mathbf{1})^{-1} \mathbf{1}^\top$ and $I$ is the $n \times n$ identity matrix.

(c) Show that the matrix $B$ is symmetric and idempotent.
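(d) If $X$ is the design matrix, show that

$$SS_{reg} = y^\top \left(H - \mathbf{1}(\mathbf{1}^\top \mathbf{1})^{-1} \mathbf{1}^\top\right) y,$$

where $H = X(X^\top X)^{-1} X^\top$ is the $n \times n$ hat matrix. Hint: Write

$$SS_{reg} = \sum_{i=1}^{n} \hat{y}_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}.$$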

(e) Recall from lectures that (you don't need to show this)

$$SS_{res} = y^\top (I - H) y.$$

Hence prove the sums of squares identity

$$SS_{total} = SS_{reg} + SS_{res}.$$
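
The matrix expressions in Question 4 can be checked numerically before attempting the proofs. The following is a minimal sketch on simulated data; the design with two predictors, the sample size and all object names are assumptions chosen for illustration. It builds each matrix directly from its definition and compares the quadratic forms.

```r
# Hypothetical simulated design with an intercept and two predictors.
set.seed(4)
n <- 25
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 - x2 + rnorm(n)

X <- cbind(1, x1, x2)
one <- matrix(1, nrow = n, ncol = 1)
I_n <- diag(n)

P1 <- one %*% solve(t(one) %*% one) %*% t(one)   # 1 (1'1)^{-1} 1'
H <- X %*% solve(t(X) %*% X) %*% t(X)            # hat matrix, n x n
B <- I_n - P1

# Quadratic forms from parts (b), (d) and (e).
ss_total <- drop(t(y) %*% t(B) %*% B %*% y)
ss_reg <- drop(t(y) %*% (H - P1) %*% y)
ss_res <- drop(t(y) %*% (I_n - H) %*% y)

checks <- c(
  mean_vector = isTRUE(all.equal(drop(P1 %*% y), rep(mean(y), n))),  # part (a)
  symmetric   = isTRUE(all.equal(B, t(B))),                          # part (c)
  idempotent  = isTRUE(all.equal(B %*% B, B)),                       # part (c)
  identity    = isTRUE(all.equal(ss_total, ss_reg + ss_res))         # part (e)
)
print(checks)
```

Agreement up to floating point tolerance is only evidence that the statements hold for this one data set; the question still requires the algebraic argument.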
5. [...] Moodle. It contains the first n observations from the data set Combined Cycle Power Plant Data Set on the Machine Learning Repository. The data set
contains data points collected from a Combined Cycle Power Plant over 6
years (2006-2011), when the power plant was set to work with full load.
The response and predictor variables of this data set are listed below:
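
| Role | Variable | Units |
|---|---|---|
| Response | Net Hourly Electrical Energy Output (PE) | MW (megawatts) |
| Predictor | Temperature (AT) | °C |
| Predictor | Exhaust Vacuum (V) | cm Hg |
| Predictor | Ambient Pressure (AP) | millibar |
| Predictor | Relative Humidity (RH) | % |

The predictors are hourly average ambient variables.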

(a) Obtain the summary and anova outputs from the multiple linear regression model fitted with all the predictors listed above. Include the outputs in your assignment.

Use the outputs from part (a) to answer the following questions. Use α = 0.01 for all hypothesis tests.
(b) State the value of the F statistic used to test the hypothesis that $\beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ versus $\beta_1 \neq 0$ or $\beta_2 \neq 0$ or $\beta_3 \neq 0$ or $\beta_4 \neq 0$. What is the conclusion from this test?
(c) Is there evidence that a model with AT and V is better than a model
with just AT? State the relevant test statistic, p-value and conclusion.
(d) Conduct the appropriate sequential F test to test whether a model
containing all the predictors is preferred over a model with AT as the
predictor. State the relevant test statistic, p-value and conclusion.
(e) Is there evidence that AP is related to the response in the presence of
AT, V and RH? State the relevant test statistic, p-value and conclusion.
(f) Obtain a 99% prediction interval for the response PE when the observed
values of the predictors are given by (AT, V, AP, RH) = (27, 51,
1017, 44).
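
A possible workflow for Question 5 is sketched below. The data are simulated as a stand-in because the file name for this question is not given in this document; the column names follow the variable list, but the coefficients, the seed and the object names (`plant`, `full`, `reduced`) are hypothetical, so none of the printed numbers answer the question. The sketch also shows how the sequential F statistic in part (d) might be assembled by hand from the anova table, with `anova(reduced, full)` used only as a cross-check.

```r
# Hypothetical stand-in data with the column names from the variable list.
set.seed(5)
n <- 200
plant <- data.frame(
  AT = runif(n, 5, 35),
  V  = runif(n, 25, 80),
  AP = runif(n, 995, 1030),
  RH = runif(n, 25, 100)
)
plant$PE <- 450 - 2 * plant$AT - 0.25 * plant$V + 0.06 * plant$AP -
  0.15 * plant$RH + rnorm(n, sd = 4.5)

# Part (a): predictors entered in the order AT, V, AP, RH, which fixes the
# order of the sequential sums of squares in the anova table.
full <- lm(PE ~ AT + V + AP + RH, data = plant)
print(summary(full))   # overall F test and partial t tests
tab <- anova(full)     # sequential (Type I) sums of squares
print(tab)

# Part (d): sequential F test for adding V, AP and RH to a model with AT,
# built from the anova table rows.
ss_extra <- sum(tab[c("V", "AP", "RH"), "Sum Sq"])
mse <- tab["Residuals", "Mean Sq"]
f_manual <- (ss_extra / 3) / mse
p_manual <- pf(f_manual, df1 = 3, df2 = full$df.residual, lower.tail = FALSE)

# Cross-check against a direct comparison of the two nested models.
reduced <- lm(PE ~ AT, data = plant)
seq_test <- anova(reduced, full)

# Part (f): 99% prediction interval at the stated predictor values.
new_obs <- data.frame(AT = 27, V = 51, AP = 1017, RH = 44)
pred_99 <- predict(full, new_obs, interval = "prediction", level = 0.99)
print(pred_99)
```

Because the sums of squares are sequential, changing the order of the predictors in the model formula changes the rows of the anova table, so the order above is a deliberate choice for parts (c) and (d).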


6. (MATH2931 only) Suppose we have the full rank linear model $y = X\beta + \varepsilon$ with $n \times p$ design matrix $X$ and normal errors $\varepsilon \sim N(0, \sigma^2 I_{n \times n})$. Let $b$ be the least squares estimator of $\beta$.
(a) Show that for any symmetric and idempotent matrix $A$, all eigenvalues are either zero or one, and $\operatorname{rank}(A) = \operatorname{tr}(A)$.
Hint: Apply spectral decomposition to $A$.
(b) Prove that $\operatorname{rank}(H) = \operatorname{tr}(H) = p$ and $\operatorname{rank}(I - H) = \operatorname{tr}(I - H) = n - p$, where $H = X(X^\top X)^{-1} X^\top$ is the hat matrix.

(c) Prove that

$$\frac{(b - \beta)^\top X^\top X (b - \beta)}{\sigma^2}$$

follows the $\chi^2_p$ distribution.

Hint: Write $Xb$ in terms of $X$, $\beta$ and $\varepsilon$.
(d) Hence derive a $100(1 - \alpha)\%$ joint confidence region of $\beta$ given in notes

$$\frac{(b - \beta)^\top X^\top X (b - \beta)}{p \hat{\sigma}^2} \leq F_{\alpha; p, n-p},$$

where $F_{\alpha; p, n-p}$ denotes the upper $\alpha$th quantile of the $F_{p, n-p}$ distribution.
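
The rank and trace claims in Question 6 can be illustrated numerically for one simulated full rank design, as in the sketch below. The dimensions n = 12 and p = 3, the seed and the object names are assumptions made for illustration. The last lines show how the upper α quantile of the F distribution in part (d) might be obtained in R, taking α = 0.05 as an example value.

```r
# Hypothetical full rank design: intercept plus p - 1 simulated predictors.
set.seed(6)
n <- 12
p <- 3
X <- cbind(1, matrix(rnorm(n * (p - 1)), nrow = n))

H <- X %*% solve(crossprod(X)) %*% t(X)   # hat matrix
M <- diag(n) - H                          # I - H

# Part (a) and (b): eigenvalues, trace and numerical rank.
eig_H <- eigen(H, symmetric = TRUE)$values
trace_H <- sum(diag(H))
rank_H <- qr(H)$rank
trace_M <- sum(diag(M))
rank_M <- qr(M)$rank

print(round(eig_H, 8))
print(c(trace_H = trace_H, rank_H = rank_H, trace_M = trace_M, rank_M = rank_M))

# Part (d): the upper alpha quantile of F_{p, n-p} is the (1 - alpha) quantile.
alpha <- 0.05
f_crit <- qf(1 - alpha, df1 = p, df2 = n - p)
```

The numerical rank returned by `qr()` depends on a tolerance, so it is a check on the result rather than an argument for it.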