It is the process of understanding data and applying statistics to gain insights that serve a business objective.
1 Objective :
First and foremost, understand the objective: why are you doing this, and what outcome do you want from it?
If these questions are not answered clearly, the rest of the work is in vain. The same holds in the SDLC (Software Development Life Cycle) model: if the requirements are not clear, you may develop or test the wrong software.
2 Understanding the data :
Once you have understood the objective, the next crucial step is understanding the data. If your data is organised in rows and columns, whether in Excel or on a server, check what each row represents and what each column means. If you have a large dataset, take a sample and explore it to understand what it contains. Most importantly, check whether the data can actually fulfil the objective. Sometimes the data provided to you is missing crucial information.
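As a rough illustration of the sampling and completeness check described above, here is a minimal Python sketch that uses only the standard library. The column names and the inline CSV text are hypothetical stand-ins for a real extract; in practice the rows would come from a file or a database.

```python
import csv
import io
import random

# Hypothetical sales extract (assumption: real data would be read from a file or server).
RAW = """order_id,order_date,region,amount
1,2023-01-02,North,120.50
2,2023-01-07,South,
3,2023-01-08,,89.99
4,2023-01-10,North,45.00
5,2023-01-14,East,230.10
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def take_sample(rows, k, seed=0):
    """Return up to k randomly chosen rows; a fixed seed keeps the sample repeatable."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))

def missing_counts(rows):
    """Count empty values per column to see where information is missing."""
    if not rows:
        return {}
    counts = {col: 0 for col in rows[0]}
    for row in rows:
        for col, value in row.items():
            if value is None or value.strip() == "":
                counts[col] += 1
    return counts

rows = load_rows(RAW)
sample = take_sample(rows, 3)   # a small sample to read through by eye
missing = missing_counts(rows)  # how many gaps each column has

for row in sample:
    print(row)
print(missing)
```

If a column that the objective depends on (here, `amount`) shows many gaps, that is a sign the data may not be able to answer the question as it stands.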
3 Data cleaning and Data transformation :
These two go hand in hand. Data cleaning includes removing or replacing junk data and filling in gaps where they exist. Data transformation means reshaping the data as your objective requires, for example by creating metrics for sales data such as weekend vs weekday sales, or seasonality (spring, summer, etc.). Sometimes, after cleaning and transforming the data, you will find you need another round of one or the other. If you have a large dataset, try cleaning the data before you start transforming it; this will reduce your effort.
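The cleaning and transformation steps above could look something like the following Python sketch. The records, the rule of dropping rows without an amount, the "Unknown" placeholder, and the Northern Hemisphere month-to-season mapping are all assumptions made for illustration, not rules from this article.

```python
from datetime import date

# Hypothetical raw records; empty strings stand in for missing values.
raw_rows = [
    {"order_date": "2023-01-07", "region": "South", "amount": "100.0"},  # a Saturday
    {"order_date": "2023-04-12", "region": "", "amount": "50.5"},        # region missing
    {"order_date": "2023-07-16", "region": "North", "amount": ""},       # amount missing
    {"order_date": "2023-10-02", "region": "East", "amount": "75.25"},   # a Monday
]

def season_of(month):
    """Map a month number to a season (assumption: Northern Hemisphere)."""
    if month in (12, 1, 2):
        return "Winter"
    if month in (3, 4, 5):
        return "Spring"
    if month in (6, 7, 8):
        return "Summer"
    return "Autumn"

def clean(rows):
    """Drop rows with no amount, fill missing regions, and convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # a sale without an amount cannot be analysed
        cleaned.append({
            "order_date": date.fromisoformat(row["order_date"]),
            "region": row["region"].strip() or "Unknown",
            "amount": float(row["amount"]),
        })
    return cleaned

def transform(rows):
    """Add derived columns: weekend/weekday flag and season."""
    out = []
    for row in rows:
        d = row["order_date"]
        day_type = "Weekend" if d.weekday() >= 5 else "Weekday"  # 5 = Sat, 6 = Sun
        out.append({**row, "day_type": day_type, "season": season_of(d.month)})
    return out

def sales_by(rows, key):
    """Total the amount for each distinct value of the given column."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0.0) + row["amount"]
    return totals

tidy = transform(clean(raw_rows))  # clean first, then transform
by_day_type = sales_by(tidy, "day_type")
by_season = sales_by(tidy, "season")
print(by_day_type)
print(by_season)
```

Cleaning before transforming means the derived columns are only computed for rows that survive, which is the effort saving mentioned above.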
4 Data Enhancement :
Data enhancement means adding value to the data given to you by bringing in external sources or non-traditional data. Many new data channels are available today that can be leveraged to meet the business objective.
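One simple form of enhancement is joining your own records to an external lookup. The Python sketch below attaches a holiday flag to sales rows. The `holidays` dictionary is a hypothetical stand-in for whatever external source you choose (a public holiday calendar, weather data, and so on), and the sales figures are made up.

```python
# Hypothetical internal data.
sales = [
    {"order_date": "2023-12-24", "amount": 180.0},
    {"order_date": "2023-12-25", "amount": 320.0},
    {"order_date": "2023-12-27", "amount": 95.0},
]

# Hypothetical external lookup keyed by date (assumption: loaded from an outside source).
holidays = {"2023-12-25": "Christmas Day"}

def enrich(sales_rows, holiday_lookup):
    """Left-join sales rows to the holiday lookup; rows without a match are kept."""
    enriched = []
    for row in sales_rows:
        name = holiday_lookup.get(row["order_date"])
        enriched.append({
            **row,
            "is_holiday": name is not None,
            "holiday_name": name,
        })
    return enriched

enriched = enrich(sales, holidays)
for row in enriched:
    print(row)
```

Keeping unmatched rows (a left join) matters here: the external source adds context, but it should not silently remove your original records.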
5 Data Analytics :
Once all the data is in the desired format, you perform the analytics that yield insights for the business and support decision making. Depending on the requirement, you can use techniques such as linear regression, clustering, decision trees and many more to reach a conclusion. This can be done with the R language, which is open source.
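To make one of these techniques concrete, here is a minimal sketch of simple linear regression (ordinary least squares with a single predictor), written in plain Python rather than R so that it runs without extra packages. The advertising spend and sales figures are invented for illustration; in R the equivalent fit would normally be done with the built-in `lm` function.

```python
from statistics import mean

# Hypothetical data: advertising spend (x) and resulting sales (y), in arbitrary units.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 4.2, 5.9, 8.1, 9.7]

def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict(slope, intercept, x):
    """Predict y for a new x using the fitted line."""
    return intercept + slope * x

slope, intercept = fit_line(ad_spend, sales)
forecast = predict(slope, intercept, 6.0)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, forecast at x=6: {forecast:.2f}")
```

The slope is the insight a business reader usually cares about: roughly how much sales change for each extra unit of spend, within the range of the data observed.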
6 Data Visualisation :
What good is an analysis if you cannot present it well? So take the time to visualise your data; for this you can use Microsoft Power BI. The way you present the outcome of the analytics matters. The more appealing and user-friendly the presentation is, the more users will engage with it.
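Before building a polished dashboard, it can help to eyeball the numbers in the simplest possible chart. The Python sketch below prints a plain-text bar chart. It is only a quick sanity check under made-up figures, not a replacement for a proper visualisation tool.

```python
def text_bar_chart(totals, width=40):
    """Render a dict of label -> value as horizontal bars, largest first."""
    if not totals:
        return ""
    peak = max(totals.values())
    label_width = max(len(label) for label in totals)
    lines = []
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    for label, value in ordered:
        # Scale each bar relative to the largest value.
        bar_len = round(value / peak * width) if peak > 0 else 0
        lines.append(f"{label.ljust(label_width)} | {'#' * bar_len} {value:,.0f}")
    return "\n".join(lines)

# Hypothetical seasonal sales totals.
season_sales = {"Spring": 1200, "Summer": 1800, "Autumn": 900, "Winter": 1500}
chart = text_bar_chart(season_sales, width=30)
print(chart)
```

Sorting the bars from largest to smallest is a small presentation choice that lets the reader see the ranking at a glance.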
Data analytics is storytelling: it tells you what is happening and what you should do to reach your objective.
So start building your story…