Examination May 2020
18 April 2020 to 24 May 2020
Student Number

MGT B399F (2000)
This examination is arranged in the form of an extended essay and a reflective journal. The
weighting of the extended essay is 40% of the marks for this course. The weighting of the
reflective journal is 20% of the marks for this course.
Part A: Extended Essay
You are required to write the following extended essay of 1000 – 1200 words.
Under the current situation in Hong Kong (i.e. the threat of COVID-19), Hong Kong has been
undergoing an economic downturn. Assume you have identified a business opportunity under these
circumstances in a specific industry and you are going to start a company to exploit this
opportunity.
Please prepare a business proposal including the following:
• Identify the business opportunity. Please provide concrete evidence to support your idea. (150
  – 200 words)
• Based on the above business opportunity, describe the type of industry that you are going to
  compete in, products/services that your company is going to provide, and types of customer
  needs that these products and services are going to satisfy. (250 – 300 words)
• What strategic actions will you take to compete in the industry? (600 – 700 words)
  - Who are your target customers? What are your selling points?
  - What is your business-level strategy?
  - What is your corporate-level strategy?
Justify your choices and make assumptions when necessary.
The word limit for each section is just for your reference. You can adjust the number of
words for each section according to your own needs.

Written Report

Page limit: 30 pages, excluding the title page and Appendix. You should present your final models in the main report, as well as an outline of your model building or tuning parameter selection strategy. In principle, the main report should not contain any R code or R output, except graphs. The model building process, i.e. the technical details of how you arrived at the final models, should be included in the Appendix. You should make references to Appendix sections or pages in the main report. Your report should include, but is not limited to, the following sections and information:

Executive summary: On a separate page. In the first paragraph, provide your findings in the context of the project. For example, highlight the implications/interpretations that you have arrived at during your modeling/prediction process. Avoid technical (mathematical/statistical) language as much as possible in this part of the report. This is the section that is usually read carefully by managers, who may not have any statistical background. In the second paragraph, highlight your achievements, such as the ranking achieved or additional methods attempted that gave good results. Include a table of the prediction error and ranking for each method.

Introduction: (Briefly) describe the objective and organization of your report.

Data: Here you can perform some descriptive analysis on the data. For example, you can provide relevant tables or graphs.

Preprocessing (optional): This is the place to discuss missing data, outliers, and/or any other problems in the data. If you perform any feature reduction, transformation, and/or imputation, detail your solution/steps here. Some aspects, such as outlier detection and handling, can be postponed until later modeling steps.

Smoothing methods: Use smoothing methods (e.g. spline or local regression) for prediction. Describe your model building strategy and your final model.

Random Forests: Use random forest models on the data, describe the tuning parameter selection process and the parameters selected, and report the importance of variables.

Boosting: Use boosting methods on the data, describe the tuning parameter selection process and the parameters selected, and report the importance of variables.

Additional methods (optional): Describe any other method you have tried.


Statistical Conclusions: Compare your smoothing, random forests and boosting models or any other method you have attempted, and pick the best candidate with respect to some criteria (prediction error, ease of use, computation time, etc.). This is where you provide your statistical conclusion on the models.

Future work: Any aspect of the project that you wish could have been done better. Any weakness of the methods you tried that you would like to see improved. Any other statistical/machine learning methods you wish to learn or try in the future.

Contribution: For teams with more than one student, include a table of the tasks and the contribution percentage of each team member. The tasks can include, but are not limited to: data cleaning, methods (smoothing, random forests, etc.), and report writing. Each member should contribute at least 50% to one of the statistical methods. That is, the division of labor cannot be one member performing all statistical methods and model fitting and the other member doing everything else.


Use R Markdown to generate the Appendix and attach the generated PDF after the main report. The Appendix is not counted in the page limit. It should contain at least:

Modeling details: It is unlikely that all the models you have fitted are useful or interesting to report. Some models are only of intermediate value, and some models may need to be checked but turn out to fare worse than the current one. You can detail your model building process and tuning parameter selection here to justify the models you presented in the main text. Please cross-reference these details in the report.

Additional Information

Cleaning data with Python: Some of you have asked whether you can use Python. It is not recommended, but you can use Python to clean the data before analysis if you prefer. In this case, you need to embed Python in R, i.e. call Python from R. Also include comments so that your cleaning steps can be understood.
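
As an illustration of the level of commenting expected, here is a minimal sketch of a commented cleaning step in Python. The data, the column names (id, age, income), the missing-value codes, and the choice of median imputation are all hypothetical and are not part of the project data or a required method. The sketch does not show how the script is called from R; the reticulate package is one possible way to do that, but that choice is an assumption rather than a course requirement.

```python
import csv
import io
import statistics

# Hypothetical raw data: column names and values are made up for illustration.
RAW_CSV = """id,age,income
1,34,52000
2,,61000
3,29,NA
4,41,58000
"""

# Strings treated as missing values (an assumption about how the data is coded).
MISSING = {"", "NA"}


def clean_rows(text, numeric_cols):
    """Parse CSV text and fill missing numeric values with the column median."""
    rows = list(csv.DictReader(io.StringIO(text)))

    for col in numeric_cols:
        # Step 1: collect the observed (non-missing) values of this column.
        observed = [float(r[col]) for r in rows if r[col].strip() not in MISSING]
        # Step 2: compute the median of the observed values.
        median = statistics.median(observed)
        # Step 3: convert to float, imputing the median where a value is missing.
        for r in rows:
            value = r[col].strip()
            r[col] = median if value in MISSING else float(value)
    return rows


cleaned = clean_rows(RAW_CSV, ["age", "income"])
```

Each step carries a comment stating what is done and why, so that a reader of the Appendix can follow the cleaning without running the code.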

LaTeX: R Markdown will create a LaTeX file in addition to the PDF file if you set "keep_tex: yes" under the "output: pdf_document" section. You can modify the LaTeX file for better control of the layout and margins of the document if you prefer.
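
For reference, a minimal sketch of an R Markdown header with this option set is given below. The title is a hypothetical placeholder; only the output section matters here.

```yaml
---
title: "Appendix"   # hypothetical title, replace with your own
output:
  pdf_document:
    keep_tex: yes   # keep the intermediate .tex file next to the PDF
---
```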