Milestone 3 – Model development of property prices
Belle also recently heard in the media that the proportions of sales by region are as follows:
• 20% of sales are located in the Eastern Suburbs;
• 20% of sales are located in the Northern Suburbs;
• 5% of sales are located in Sydney CBD;
• 30% of sales are located in the Western Suburbs; and
• 25% of sales are located in the Southern Suburbs.
Belle would like to know whether the regional proportions in the sample data are statistically
different from those the media is suggesting. Remember to state your assumptions and the
limitations of the result.
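One common way to compare sample proportions against a stated distribution is a chi-square goodness-of-fit test. The sketch below is only an illustration of that idea, not a prescribed method: the observed counts are invented, the function names are hypothetical, and the p-value uses a closed form that holds only when the degrees of freedom are even (here 5 regions give 4 degrees of freedom). It also assumes independent observations and expected counts of at least 5 in every region.

```python
import math

# Media-reported proportions, taken from the list above.
media_props = {
    "Eastern Suburbs": 0.20,
    "Northern Suburbs": 0.20,
    "Sydney CBD": 0.05,
    "Western Suburbs": 0.30,
    "Southern Suburbs": 0.25,
}

# HYPOTHETICAL sample counts, invented for illustration only.
# Replace these with the counts from your own sample data.
observed = {
    "Eastern Suburbs": 45,
    "Northern Suburbs": 38,
    "Sydney CBD": 12,
    "Western Suburbs": 55,
    "Southern Suburbs": 50,
}


def chi_square_gof(observed_counts, expected_props):
    """Return the chi-square statistic and its degrees of freedom."""
    n = sum(observed_counts.values())
    stat = 0.0
    for region, prop in expected_props.items():
        expected = n * prop  # expected count if the media figures were true
        stat += (observed_counts[region] - expected) ** 2 / expected
    return stat, len(expected_props) - 1


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; exact closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term            # adds (x/2)^i / i!
        term *= half / (i + 1)
    return math.exp(-half) * total


stat, df = chi_square_gof(observed, media_props)
p_value = chi2_sf_even_df(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, p-value = {p_value:.3f}")
```

A p-value below the chosen significance level (commonly 5%) would suggest the sample proportions differ from the media figures; a larger p-value means the test does not detect a difference, which is not the same as proving the proportions are equal.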
In addition to this, Belle would like you to develop a proprietary model to predict Sydney house
prices. To begin, Belle would like you to run a simple linear regression of
• price as the dependent variable; and
• intsize as the independent variable.
As part of your reporting, you need to interpret the coefficients of the model and discuss
whether they are economically and statistically significant. Also report the confidence
interval of price and interpret the results.
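The following is a minimal sketch of a simple linear regression fitted by ordinary least squares, with a standard error and a 95% confidence interval for the slope. The data values, the units (square metres and $000s) and the function name simple_ols are assumptions made for illustration; the critical value is read from a t-table for this toy sample size and must be changed for your own data.

```python
import math

# HYPOTHETICAL data, invented for illustration only.
intsize = [80, 100, 120, 150, 200]     # assumed unit: square metres
price = [600, 700, 850, 1000, 1300]    # assumed unit: $000s


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance uses n - 2 degrees of freedom (two estimated parameters).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": math.sqrt(s2 / sxx),
        "se_intercept": math.sqrt(s2 * (1 / n + x_bar ** 2 / sxx)),
        "df": n - 2,
    }


fit = simple_ols(intsize, price)

# Two-sided 5% critical value of the t distribution with 3 degrees of freedom,
# taken from a t-table. Look up the value that matches your own df.
T_CRIT = 3.182

t_stat = fit["slope"] / fit["se_slope"]
ci_low = fit["slope"] - T_CRIT * fit["se_slope"]
ci_high = fit["slope"] + T_CRIT * fit["se_slope"]

print(f"slope = {fit['slope']:.3f}, intercept = {fit['intercept']:.3f}")
print(f"t-statistic for slope = {t_stat:.2f} (df = {fit['df']})")
print(f"95% CI for slope: [{ci_low:.3f}, {ci_high:.3f}]")
```

Under these assumed units, the slope would be read as the estimated change in price (in $000s) for each additional square metre. Statistical significance is judged from the t-statistic or the confidence interval, while economic significance depends on whether the size of that change matters in practice.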
Next, Belle wants you to run a multiple linear regression. As part of this exercise you need to:
• Explain why a multiple linear regression is beneficial, i.e. justify the need for multiple
linear regression and discuss the issues associated with running a simple linear regression.
• Run the model with the following independent variables:
o No. of bedrooms;
o Total Income;
o Overseas; and
o Owner Occupied.
and price as the dependent variable.
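As a rough sketch of what running this model involves, the example below estimates the coefficients of a multiple linear regression by solving the normal equations (X'X)b = X'y. Everything in the data is hypothetical: the values are invented, Overseas and Owner Occupied are assumed to be 0/1 dummy variables, and the prices are generated without noise from known coefficients so the output can be checked. In practice a statistics package would be used, since it also reports standard errors and p-values.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: regressors may be perfectly collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def multiple_ols(rows, y):
    """Return [intercept, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + [float(v) for v in row] for row in rows]  # add intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# HYPOTHETICAL observations, invented for illustration only:
# (bedrooms, total_income, overseas dummy, owner_occupied dummy)
rows = [
    (2, 80, 0, 1), (3, 120, 0, 1), (4, 150, 1, 0), (3, 95, 1, 1),
    (5, 200, 0, 0), (2, 60, 1, 0), (4, 170, 0, 1), (1, 50, 0, 0),
]

# Prices built from known, made-up coefficients so the fit can be verified.
true_coefs = [100.0, 50.0, 2.0, -30.0, 20.0]
price = [
    true_coefs[0] + sum(c * v for c, v in zip(true_coefs[1:], row))
    for row in rows
]

names = ["intercept", "bedrooms", "total_income", "overseas", "owner_occupied"]
coefs = multiple_ols(rows, price)
for name, value in zip(names, coefs):
    print(f"{name:>15}: {value:.3f}")
```

Each slope is interpreted holding the other regressors fixed. For a dummy variable, the coefficient is the estimated difference in price between the two groups, all else equal.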
As part of the reporting requirements for the multiple linear regression:
• Interpret the coefficients.
• Define statistical significance and comment on whether each of the coefficients is
statistically significant. Remember to state your assumptions.
• Define economic significance and comment on whether each of the coefficients is
economically significant. Remember to state your assumptions.
• Discuss the limitations of, and issues with, your model. If you decide to use technical
terms, e.g. multicollinearity, homoskedasticity, bias, consistency etc., you need to
explain what these terms mean, place them in the context of your problem and explain
how they will affect your results.
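To make one of these terms concrete, the sketch below screens two regressors for multicollinearity using their pairwise correlation. The data are invented and the variable names are hypothetical. The variance inflation factor formula 1 / (1 - r^2) used here is exact only when the model has two regressors; with more regressors, r^2 is replaced by the R-squared from regressing one regressor on all the others.

```python
import math

# HYPOTHETICAL regressor values, invented for illustration only.
bedrooms = [2, 3, 4, 3, 5, 2, 4, 1]
total_income = [80, 120, 150, 95, 200, 60, 170, 50]


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(bedrooms, total_income)

# Two-regressor variance inflation factor: how much the variance of a
# coefficient estimate is inflated by correlation with the other regressor.
vif = 1 / (1 - r ** 2)

print(f"correlation = {r:.3f}, VIF (two-regressor case) = {vif:.1f}")
```

A correlation close to 1 (and a correspondingly large VIF) would indicate that the two regressors carry overlapping information, which inflates standard errors and makes individual coefficients harder to interpret.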
The final report unifies all three milestones/sections that you have completed and
showcases all the statistical techniques you have learnt to date. This report will consist of five
sections:
• Introduction and executive summary (1/2 page max for executive summary).
• Section 1 (using Milestone 1) which comprises a summary of key features of data
based on your first milestone and any feedback received.
• Section 2 (using Milestone 2) comprises a summary of statistical tests conducted
based on your second milestone and any feedback received.
• Section 3 (using Milestone 3) will investigate the development of regression models
to predict sale prices of Sydney’s property market.
• Concluding remarks.
Please refer to the Milestone 1 document.
Length: No longer than 1,500 words, including labels of graphs (inclusive of all
numbers on the x-axis and y-axis, as they are also labels), captions,
footnotes, titles, appendices and all tables.
A hard limit of 1,800 words has been imposed. This limit includes labels
of graphs, captions, footnotes, titles, appendices, headers and footers,
and all tables.
Should you exceed the limit of 1,800 words, a 50% penalty will be
imposed, and an additional 10% will be added should you exceed this
limit by 10%, e.g. if you had 1,950 words the penalty will be 60%. Total
length, including all discussion, tables and graphs, should not exceed 15
pages. Similarly, should you exceed this limit, a 50% penalty will be
imposed and, in the same spirit, an additional 10% will be added should
you exceed this limit on a per-page basis, e.g. if you had 17 pages the
penalty will be 60%.