CIMA: Introduction to Data Analytics (2018)

Contact

e-mail

Recommended Textbooks

Kabacoff, R. I. (2011), R in Action. Data Analysis and Graphics with R, Manning Publications Co., Shelter Island.

Software

R:
download R

RStudio:
download RStudio

R Hints

Please download all R scripts and CSV data files to the same directory.

Presentation: introduction to R
download file

Course Outline

Block 1: Introduction to R [Saturday]

Activity: Importing data into R; Downloading and installing packages; Standard data mining functions; Descriptive statistics

Learning Outcomes - Data Preparation: which-function, ifelse-function, sort-function, data frame, replication, sequence
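The data preparation functions listed above can be sketched in a few lines of base R. This is a minimal illustration only: the data frame `rates` and its values are hypothetical and are not taken from the course files.

```r
# Hypothetical toy data (illustrative values, not the course data set)
rates <- data.frame(
  day  = seq(1, 6),                                  # sequence 1, 2, ..., 6
  pair = rep(c("USD/PLN", "EUR/PLN"), times = 3),    # replication of labels
  rate = c(3.40, 4.20, 3.45, 4.25, 3.38, 4.31)
)

# which(): row positions where the condition holds
idx <- which(rates$rate > 4)

# ifelse(): element-wise recoding into a new column
rates$level <- ifelse(rates$rate > 4, "high", "low")

# sort(): values in decreasing order
sorted_rates <- sort(rates$rate, decreasing = TRUE)

print(rates)
print(idx)
print(sorted_rates)
```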

Learning Outcomes - Statistics: measures of location (arithmetic and geometric averages, median), measures of dispersion (standard deviation, variance, coefficient of variation), outliers, skewness and kurtosis, hypothesis testing (two means, two variances)
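As a rough sketch of these measures in base R, the example below uses a small hypothetical sample `x` and two simulated samples `a` and `b`; none of these come from the course data. Skewness and kurtosis are computed from sample moments by hand, which is one of several common conventions, and the 1.5 x IQR rule is only one possible outlier criterion.

```r
# Hypothetical sample with one large value (not course data)
x <- c(2, 4, 4, 5, 7, 9, 25)

# Measures of location
arith_mean <- mean(x)
geo_mean   <- exp(mean(log(x)))    # geometric mean, requires x > 0
med        <- median(x)

# Measures of dispersion
s  <- sd(x)
v  <- var(x)
cv <- s / arith_mean               # coefficient of variation

# Moment-based skewness and kurtosis (population-moment convention)
m2   <- mean((x - arith_mean)^2)
skew <- mean((x - arith_mean)^3) / m2^1.5
kurt <- mean((x - arith_mean)^4) / m2^2

# Outliers by the 1.5 x IQR rule (an assumed criterion)
q        <- quantile(x, c(0.25, 0.75))
iqr      <- q[2] - q[1]
outliers <- x[x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr]

# Hypothesis tests on two simulated samples
set.seed(1)
a <- rnorm(30, mean = 0,   sd = 1)
b <- rnorm(30, mean = 0.5, sd = 1)
t_res <- t.test(a, b)      # two means (Welch test by default)
f_res <- var.test(a, b)    # two variances (F test)

print(c(mean = arith_mean, geo = geo_mean, median = med, cv = cv))
print(outliers)
print(t_res$p.value)
print(f_res$p.value)
```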

R Script: importing data to R
download file

R Script: basic operations
download file

R Script: time series operations, descriptive statistics
download file

R Script: average vs. median
download file

Data and Exercises: Foreign exchange rate: USD/PLN and EUR/PLN
download file

R Script: hypothesis testing
download file

Block 2: Introduction to Dashboard Creation [Saturday]

Activity: Standard interactive graphs in the Plotly package

Learning Outcomes - Dashboarding: box plot, histogram, time series plots, scatter plot, adding multiple dimensions to standard 2D plots
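One way to add further dimensions to a 2D scatter plot is to map extra variables to marker colour and size. The sketch below assumes the `plotly` package is installed and uses R's built-in `mtcars` data set rather than any course data; the choice of variables is purely illustrative.

```r
# Scatter plot with four dimensions: x, y, colour and marker size.
# Uses the built-in mtcars data set; runs only if plotly is available.
p <- NULL
if (requireNamespace("plotly", quietly = TRUE)) {
  p <- plotly::plot_ly(
    data  = mtcars,
    x     = ~wt,            # dimension 1: weight
    y     = ~mpg,           # dimension 2: miles per gallon
    color = ~factor(cyl),   # dimension 3: number of cylinders
    size  = ~hp,            # dimension 4: horsepower
    type  = "scatter",
    mode  = "markers"
  )
}
p   # in RStudio this opens the interactive plot in the Viewer pane
```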

Learning Outcomes - Statistics: how to depict descriptive statistics; statistics and visual inspection: what to inspect and what to expect

Time: 90 minutes

R Script: scatter plot with 4 dimensions
download file

R Script: box plot (box and whiskers plot)
download file

R Script: time series plot and histogram
download file

R Script: customising labels and detecting outliers
download file

Block 3: Multiple Linear Regression [Saturday/Sunday]

Activity: estimation, significance, goodness-of-fit, diagnostics, marginal effects and elasticity
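A minimal sketch of these steps with `lm()` follows. The data are simulated; the variables `price`, `age` and `mileage` and all coefficients are hypothetical and do not reproduce the Audi Q5 data set. The elasticity is evaluated at the sample means, which is one common convention.

```r
# Simulated, hypothetical used-car data (not the course data set)
set.seed(42)
n       <- 100
age     <- runif(n, 1, 10)                        # years
mileage <- 15 * age + rnorm(n, 0, 10)             # thousand km
price   <- 200 - 8 * age - 0.3 * mileage + rnorm(n, 0, 5)
cars_df <- data.frame(price, age, mileage)

# Estimation
fit <- lm(price ~ age + mileage, data = cars_df)

# Significance: estimates, standard errors, t statistics, p-values
coefs <- summary(fit)$coefficients

# Goodness-of-fit
r2 <- summary(fit)$r.squared

# Marginal effect of age is its coefficient; elasticity at the means
me_age   <- coef(fit)["age"]
elas_age <- me_age * mean(cars_df$age) / mean(cars_df$price)

# Basic diagnostics: residuals against fitted values
plot(fitted(fit), resid(fit), xlab = "Fitted", ylab = "Residuals")

print(coefs)
print(r2)
print(elas_age)
```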

R Script: estimation, significance, and goodness-of-fit in R
download file

Data: The curious case of used Audi Q5
download file

Block 4: Introduction to Probability Modelling: Binary Choice Models [Sunday]

Activity: estimation, significance, goodness-of-fit, diagnostics, marginal effects, conditional probability, probability-response analysis
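For binary choice models, these steps can be sketched with `glm()`. The example uses simulated data; the variables `buy` and `income` are hypothetical. The marginal effect is evaluated at the mean of the regressor, and McFadden's pseudo R-squared is used as one possible goodness-of-fit measure.

```r
# Simulated, hypothetical binary-choice data
set.seed(7)
n      <- 500
income <- rnorm(n, mean = 50, sd = 10)
buy    <- rbinom(n, 1, plogis(-5 + 0.1 * income))
d      <- data.frame(buy, income)

# Estimation: logit and probit
logit_fit  <- glm(buy ~ income, family = binomial(link = "logit"),  data = d)
probit_fit <- glm(buy ~ income, family = binomial(link = "probit"), data = d)

# Goodness-of-fit: McFadden pseudo R-squared (valid for 0/1 responses)
pseudo_r2 <- 1 - logit_fit$deviance / logit_fit$null.deviance

# Marginal effect of income at the sample mean: density(x'b) * beta
xb_logit  <- sum(coef(logit_fit)  * c(1, mean(d$income)))
xb_probit <- sum(coef(probit_fit) * c(1, mean(d$income)))
me_logit  <- dlogis(xb_logit) * coef(logit_fit)["income"]
me_probit <- dnorm(xb_probit) * coef(probit_fit)["income"]

# Conditional probability: P(buy = 1 | income = 60)
p_60 <- predict(logit_fit, newdata = data.frame(income = 60),
                type = "response")

print(summary(logit_fit)$coefficients)
print(c(pseudo_r2 = pseudo_r2))
print(c(me_logit, me_probit))
print(p_60)
```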

Lecture: non-linear procedures - probit and logit
download file

R Script: probit/logit - estimation and goodness-of-fit
download file

R Script: marginal effects
download file

R Script: conditional probability simulations
download file

Block 5: Other Non-Linear and Time Series Models: Introductory Remarks [Sunday]

Activity: ordinal choice data, count data models, modelling percentages

Hints: model selection for various explanatory variables
download file

R Script: trend adjustment (Hodrick-Prescott)
download file
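The Hodrick-Prescott trend can be sketched in base R without extra packages, as the solution of a penalised least-squares problem: the trend solves (I + lambda * D'D) * trend = y, where D is the second-difference matrix. The helper `hp_trend`, the simulated series and the value lambda = 1600 (a conventional choice for quarterly data) are assumptions for illustration and are not taken from the course script.

```r
# Hodrick-Prescott trend via the closed-form solution
# trend = (I + lambda * D'D)^(-1) y, with D the second-difference operator.
# hp_trend is a hypothetical helper written for this sketch.
hp_trend <- function(y, lambda = 1600) {
  n <- length(y)
  D <- diff(diag(n), differences = 2)          # (n - 2) x n matrix
  as.numeric(solve(diag(n) + lambda * crossprod(D), y))
}

# Simulated series: linear trend plus a cycle plus noise
set.seed(123)
t_idx <- 1:80
y     <- 2 + 0.05 * t_idx + sin(t_idx / 4) + rnorm(80, 0, 0.2)

trend <- hp_trend(y, lambda = 1600)
cycle <- y - trend                             # trend-adjusted series

plot(t_idx, y, type = "l", xlab = "Period", ylab = "Value")
lines(t_idx, trend, lwd = 2)
```

A purely linear series has zero second differences, so the filter returns it unchanged; this gives a quick sanity check of the implementation.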

Data: annualised CPI inflation in Poland
download file

Block 7: Dashboards: Plotting Regression Results [Sunday]

Activity: residuals and goodness-of-fit, marginal effects, simulations and forecast

Presentation: Plotly graphs for OLS estimations