The data analysis process is composed of the following steps:
- State the problem
- Obtain your data
- Clean the data
- Normalize the data
- Transform the data
- Exploratory statistics
- Exploratory visualization
- Predictive modelling
- Validate your model
- Visualize and interpret your results
- Deploy your solution
All of the above activities can be grouped as follows:
The Problem → Data Preparation → Data Exploration → Predictive Modelling → Visualization of Results
The problem is defined by asking a high-level question, such as: what will the price of gold be next month?
Data preparation covers how to obtain, clean, normalize, and transform the data into a form suitable for modelling.
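The cleaning, normalizing, and transforming steps can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the price values are invented, the helper names (`clean`, `normalize`, `transform`) are hypothetical, and min-max scaling and a log transform are just one possible choice for each step.

```python
import math

# Hypothetical raw data: monthly gold prices (USD per ounce), invented for
# illustration. None marks a missing observation.
raw_prices = [1800.0, None, 1825.5, 1790.0, None, 1850.0]


def clean(values):
    """Clean: drop missing entries."""
    return [v for v in values if v is not None]


def normalize(values):
    """Normalize: rescale values to the [0, 1] range (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def transform(values):
    """Transform: apply a natural-log transform (requires positive values)."""
    return [math.log(v) for v in values]


cleaned = clean(raw_prices)
normalized = normalize(cleaned)
logged = transform(cleaned)
```

In practice the right choice for each step depends on the data and on the model that will consume it, so these functions should be read as placeholders rather than a recommended pipeline.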
Data exploration is used to find patterns, connections, and relations in the data, by looking at the data in graphical and statistical form.
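As a small sketch of the statistical side of exploration, the snippet below computes summary statistics for one variable and the Pearson correlation between two. The data and the variable names (`ad_spend`, `sales`) are invented for illustration, and the `pearson` helper is a hypothetical hand-written function rather than a library call.

```python
import math
import statistics

# Hypothetical paired observations, invented for illustration.
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [25.0, 45.0, 65.0, 85.0, 105.0]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Summary statistics describe a single variable.
summary = {
    "mean": statistics.mean(sales),
    "median": statistics.median(sales),
    "stdev": statistics.stdev(sales),
}

# The correlation describes the relation between two variables.
r = pearson(ad_spend, sales)
```

A value of `r` close to 1 or -1 suggests a strong linear relation; the graphical counterpart would be a scatter plot of the same two variables.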
Predictive modelling is a process used in data analysis to create or choose a statistical model that best predicts the probability of an outcome.
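To make the modelling and validation steps concrete, here is a minimal sketch that fits a straight line by ordinary least squares and checks it against held-out data. It predicts a numeric value (in the spirit of the gold price question) rather than a probability. The prices are invented, the names `fit_line` and `predict` are hypothetical, and a linear trend is assumed purely for simplicity.

```python
# Hypothetical data: month index and gold price (USD per ounce),
# invented for illustration.
months = [1, 2, 3, 4, 5, 6, 7, 8]
prices = [1800.0, 1810.0, 1820.0, 1830.0, 1840.0, 1850.0, 1860.0, 1870.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Fit on the first six months and hold out the last two for validation.
slope, intercept = fit_line(months[:6], prices[:6])


def predict(month):
    """Predicted price for a given month index."""
    return slope * month + intercept


# Validation: mean absolute error on the held-out months.
holdout_errors = [abs(predict(m) - p) for m, p in zip(months[6:], prices[6:])]
mean_abs_error = sum(holdout_errors) / len(holdout_errors)

# Prediction for the month after the last observation.
next_month_price = predict(9)
```

Holding out the most recent observations, rather than a random subset, is a common choice when the data is ordered in time, since the model is then judged on values it could not have seen.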
Visualization of results is about how the results are going to be presented.
Quantitative versus qualitative data analysis
Quantitative data: numerical measurements expressed in terms of numbers (structured data, statistical analysis, objective conclusions).
Qualitative data: categorical measurements expressed in terms of natural language descriptions (unstructured data, summary, subjective conclusions).
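The difference shows up in how each kind of data is summarized. In the sketch below, a numeric field is summarized with a mean while a categorical field is summarized by counting labels. The survey records and field names (`age`, `feedback`) are invented for illustration.

```python
import statistics
from collections import Counter

# Hypothetical survey records, invented for illustration.
responses = [
    {"age": 34, "feedback": "satisfied"},
    {"age": 28, "feedback": "neutral"},
    {"age": 45, "feedback": "satisfied"},
    {"age": 39, "feedback": "dissatisfied"},
]

# Quantitative field: arithmetic is meaningful, so a mean can be computed.
ages = [r["age"] for r in responses]
mean_age = statistics.mean(ages)

# Qualitative field: labels cannot be averaged, so they are counted instead.
feedback_counts = Counter(r["feedback"] for r in responses)
most_common_label, most_common_count = feedback_counts.most_common(1)[0]
```

Free-text answers would need further processing, such as manual coding into categories, before even this kind of counting is possible.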