Probability distributions are the foundation of statistical models. Natural processes generate data, and the empirical shape of that data can often be approximated by mathematical functions.
Some well-known continuous probability distributions are: (i) Normal distribution, (ii) Uniform distribution, (iii) Cauchy distribution, (iv) t distribution, (v) F distribution, (vi) Chi-Square distribution, (vii) Exponential distribution, (viii) Gamma distribution.
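As a rough illustration of how sampled data relates to these functions, the sketch below draws from four of the listed distributions with Python's standard `random` module and compares each sample mean with the theoretical mean. The parameter values, sample size, and seed are arbitrary choices made for this example.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable (arbitrary choice)
n = 20000                 # sample size (arbitrary choice)

# Each entry: name -> (sampler, theoretical mean for the chosen parameters)
distributions = {
    "normal(mu=0, sigma=1)": (lambda: rng.gauss(0.0, 1.0), 0.0),
    "uniform(0, 1)": (lambda: rng.uniform(0.0, 1.0), 0.5),
    "exponential(rate=2)": (lambda: rng.expovariate(2.0), 1.0 / 2.0),
    # gammavariate(alpha, beta) has mean alpha * beta
    "gamma(shape=2, scale=3)": (lambda: rng.gammavariate(2.0, 3.0), 2.0 * 3.0),
}

sample_means = {}
for name, (draw, theoretical_mean) in distributions.items():
    sample = [draw() for _ in range(n)]
    sample_means[name] = statistics.fmean(sample)
    print(f"{name:26s} sample mean = {sample_means[name]:.3f}, "
          f"theoretical mean = {theoretical_mean:.3f}")
```

With a large enough sample, each sample mean should land close to its theoretical value. The Cauchy distribution is left out on purpose: it has no finite mean, so this particular comparison does not apply to it.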
Exploratory data analysis (EDA) can be used to build intuition about the data, understand its shape, and connect your understanding of the process that generated the data to the data itself.
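One simple way to look at the shape of a dataset is to combine a few summary statistics with a histogram. The sketch below does this with the standard library only. The data is simulated from an exponential distribution purely for illustration, and `text_histogram` is a hypothetical helper written for this example, not a library function.

```python
import random
import statistics

def text_histogram(data, bins=10, width=40):
    """Count values per equal-width bin and render a crude text bar chart."""
    lo, hi = min(data), max(data)
    step = (hi - lo) / bins or 1.0       # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for x in data:
        idx = min(int((x - lo) / step), bins - 1)   # clamp the maximum into the last bin
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, c in enumerate(counts):
        left_edge = lo + i * step
        bar = "#" * round(width * c / peak)
        lines.append(f"{left_edge:8.2f} | {bar} {c}")
    return counts, lines

rng = random.Random(1)                                # arbitrary seed
data = [rng.expovariate(1.0) for _ in range(1000)]    # simulated, right-skewed data

summary = {
    "mean": statistics.fmean(data),
    "median": statistics.median(data),
    "stdev": statistics.stdev(data),
}
counts, lines = text_histogram(data)

print(summary)
print("\n".join(lines))
```

A mean noticeably larger than the median, together with bars that shrink from left to right, hints at a right-skewed process, which is the kind of observation that helps narrow down candidate distributions.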
Fitting a model – Fitting a model means estimating the parameters of the model from the observed data. It typically involves optimization methods and algorithms, such as maximum likelihood estimation, to find the parameter values that best explain the data.
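As a minimal sketch of maximum likelihood estimation, consider fitting a normal distribution. This case is convenient because the maximum likelihood estimates have a closed form (the sample mean and the population-style standard deviation), so no numerical optimizer is needed. Most other models do require one. The true parameters, sample size, seed, and function names below are assumptions made for the example.

```python
import math
import random
import statistics

def normal_log_likelihood(data, mu, sigma):
    """Log-likelihood of the data under a Normal(mu, sigma) model."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return (-n * math.log(sigma)
            - 0.5 * n * math.log(2 * math.pi)
            - squared_error / (2 * sigma ** 2))

def fit_normal_mle(data):
    """Closed-form maximum likelihood estimates for a normal distribution."""
    mu_hat = statistics.fmean(data)
    # pstdev divides by n (not n - 1), which is the maximum likelihood estimate
    sigma_hat = statistics.pstdev(data, mu=mu_hat)
    return mu_hat, sigma_hat

rng = random.Random(0)                                   # arbitrary seed
data = [rng.gauss(5.0, 2.0) for _ in range(5000)]        # simulated observations

mu_hat, sigma_hat = fit_normal_mle(data)
print(f"estimated mu = {mu_hat:.3f}, estimated sigma = {sigma_hat:.3f}")

# The fitted parameters should score at least as well as any other choice,
# including the parameters that were used to simulate the data.
print(normal_log_likelihood(data, mu_hat, sigma_hat))
print(normal_log_likelihood(data, 5.0, 2.0))
```

The final two lines illustrate what "maximum likelihood" means in practice: the fitted parameters give a log-likelihood that is never lower than the one obtained from any other parameter pair on the same data.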
Overfitting – Overfitting means that you used a dataset to estimate the parameters of your model, but the resulting model does not capture reality well beyond the data you sampled.
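Overfitting is usually detected by comparing error on the data used for fitting with error on fresh data from the same process. The sketch below contrasts a straight-line fit with a deliberately extreme "memorizer" that returns the response of the nearest training point. The data-generating process (a line plus Gaussian noise), the sample sizes, the seed, and all function names are assumptions made for this illustration.

```python
import random

def make_data(rng, n):
    """Simulate y = 2x + 1 + noise (an assumed process for this example)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(xs, ys):
    """Predict the response of the closest training point (noise included)."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(7)                    # arbitrary seed
train_x, train_y = make_data(rng, 50)
test_x, test_y = make_data(rng, 200)      # fresh data from the same process

results = {}
for name, fit in [("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    results[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(f"{name:10s} train MSE = {results[name][0]:.3f}, "
          f"test MSE = {results[name][1]:.3f}")
```

The memorizer reproduces the training data perfectly, so its training error is zero, yet it carries the training noise into every prediction on new data. A large gap between training error and test error is the practical signature of overfitting.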