Statistical modelling basics

Probability distributions are the foundation of statistical models. Natural processes generate data, and the empirical shape of that data can often be approximated by a mathematical function.

Some well-known continuous probability distributions are: (i) Normal distribution, (ii) Uniform distribution, (iii) Cauchy distribution, (iv) t distribution, (v) F distribution, (vi) Chi-Square distribution, (vii) Exponential distribution, (viii) Gamma distribution.
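To make the idea of a distribution as a data-generating function concrete, here is a minimal sketch that draws samples from four of the distributions listed above using Python's standard `random` module and compares each sample mean with the theoretical mean. The parameter values, the seed, and the sample size are arbitrary choices for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 20000                # sample size, chosen arbitrarily

# Draw samples from four continuous distributions.
samples = {
    "normal(mu=0, sigma=1)": [rng.gauss(0.0, 1.0) for _ in range(n)],
    "uniform(0, 1)": [rng.uniform(0.0, 1.0) for _ in range(n)],
    "exponential(rate=2)": [rng.expovariate(2.0) for _ in range(n)],
    "gamma(shape=2, scale=3)": [rng.gammavariate(2.0, 3.0) for _ in range(n)],
}

# Theoretical means: mu, (a + b) / 2, 1 / rate, shape * scale.
theoretical_means = {
    "normal(mu=0, sigma=1)": 0.0,
    "uniform(0, 1)": 0.5,
    "exponential(rate=2)": 0.5,
    "gamma(shape=2, scale=3)": 6.0,
}

sample_means = {name: sum(xs) / n for name, xs in samples.items()}

for name in samples:
    print(f"{name:26s} sample mean={sample_means[name]:.3f} "
          f"theoretical mean={theoretical_means[name]:.3f}")
```

With a large enough sample, each sample mean should land close to its theoretical value, which is one simple way of seeing that the mathematical function describes the data.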

Exploratory data analysis (EDA) can be used to build intuition about the data, understand its shape, and connect your understanding of the process that generated the data to the data itself.
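A first EDA pass can be as small as a few summary statistics and a rough histogram. The sketch below uses only the standard library; the data is simulated (hypothetical waiting times drawn from an exponential process), and the bin count is an arbitrary choice.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical data: waiting times simulated from an exponential process.
data = [rng.expovariate(1.0) for _ in range(5000)]

mean = statistics.mean(data)
median = statistics.median(data)
stdev = statistics.stdev(data)
print(f"mean={mean:.3f} median={median:.3f} stdev={stdev:.3f}")

# Crude text histogram: equal-width bins between the min and the max.
n_bins = 10
lo, hi = min(data), max(data)
width = (hi - lo) / n_bins
counts = [0] * n_bins
for x in data:
    idx = min(int((x - lo) / width), n_bins - 1)  # put the max value in the last bin
    counts[idx] += 1

# Scale the bars so the tallest bin is 50 characters wide.
for i, c in enumerate(counts):
    left = lo + i * width
    print(f"{left:6.2f} | {'#' * (c * 50 // max(counts))}")
```

A mean noticeably larger than the median, together with a histogram that piles up on the left and has a long right tail, suggests a right-skewed process, so a symmetric distribution such as the Normal would probably be a poor first guess for this data.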

Fitting a model – Fitting a model means estimating the parameters of the model from the observed data. This usually involves an estimation method such as maximum likelihood estimation, together with an optimization algorithm that finds the parameter values.
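As an illustration of maximum likelihood estimation, here is a minimal sketch that fits a Normal distribution to simulated data. For the Normal distribution the likelihood has a closed-form maximum (the sample mean, and the standard deviation computed with divisor n), so no numerical optimizer is needed here; many other models require one. The function names and the "true" parameter values are assumptions made for this example.

```python
import math
import random

def normal_log_likelihood(data, mu, sigma):
    """Log-likelihood of i.i.d. Normal data for the given mu and sigma."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            - squared_error / (2 * sigma ** 2))

def fit_normal_mle(data):
    """Closed-form maximum likelihood estimates for a Normal model."""
    n = len(data)
    mu_hat = sum(data) / n
    # The MLE of the variance divides by n, not n - 1.
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in data) / n)
    return mu_hat, sigma_hat

rng = random.Random(0)
# Simulated observations; mu=5 and sigma=2 are the assumed "true" values.
sample = [rng.gauss(5.0, 2.0) for _ in range(1000)]

mu_hat, sigma_hat = fit_normal_mle(sample)
print(f"mu_hat={mu_hat:.3f} sigma_hat={sigma_hat:.3f}")

# The fitted parameters should score at least as well as nearby alternatives.
best = normal_log_likelihood(sample, mu_hat, sigma_hat)
shifted = normal_log_likelihood(sample, mu_hat + 0.1, sigma_hat)
print(f"log-likelihood at MLE={best:.2f}, at shifted mu={shifted:.2f}")
```

The estimates should fall near the values used to simulate the data, and moving either parameter away from its estimate lowers the log-likelihood.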

Overfitting – Overfitting means that you used a dataset to estimate the parameters of your model, but the model captures the quirks of that particular sample and is not good at describing reality beyond the sampled data.
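One way to see overfitting is to compare a simple model with a very flexible one on data the models have not seen. The sketch below is an illustration under assumed settings: the data comes from a straight line plus noise, the simple model is a least-squares line, and the flexible model is a polynomial that passes through every training point (Lagrange interpolation). All names and constants are hypothetical.

```python
import random

def make_data(rng, n, noise=0.3):
    """Simulate y = 2x + 1 plus Gaussian noise on x in [0, 1]."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Least-squares straight line: two parameters."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def fit_interpolant(xs, ys):
    """Polynomial through every training point: as many parameters as points."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    basis *= (x - xj) / (xi - xj)
            total += yi * basis
        return total
    return predict

def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(3)
train_x, train_y = make_data(rng, 10)
test_x, test_y = make_data(rng, 200)

line = fit_line(train_x, train_y)
interp = fit_interpolant(train_x, train_y)

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
interp_train, interp_test = mse(interp, train_x, train_y), mse(interp, test_x, test_y)

print(f"line:        train MSE={line_train:.4f} test MSE={line_test:.4f}")
print(f"interpolant: train MSE={interp_train:.4f} test MSE={interp_test:.4f}")
```

The interpolating polynomial reproduces the training data exactly, noise included, so its training error is zero while its error on fresh data is worse than that of the plain line. That gap between training and test error is the practical signature of overfitting.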

