The amount of data produced every day is mind-boggling. It is hard to imagine a day without a mobile phone or social media: Facebook, Twitter, YouTube, Snapchat, WhatsApp, Pinterest and so on. People use social media to communicate, to comment, to give a thumbs up or down, and to share what they like on the internet. Huge amounts of data are produced on websites, in mobile apps and through social media interactions.
2.5 quintillion bytes of data are created each day. – www.domo.com
This will grow manyfold as sensors and other connected devices spread with wider adoption of the Internet of Things (IoT).
Data, Data and More Data
Here are some interesting numbers to give you an idea of how much data we generate every day, every minute, and every second.
- Every second, Google gets 40,000 search requests
- Every minute, Snapchat users share 527,760 photos
- Every minute, more than 120 professionals join LinkedIn
- Every minute, users watch 4,146,600 YouTube videos
- Every minute, 456,000 tweets are sent on Twitter
- Every minute, Instagram users post 46,740 photos
- Every day, 1.5 billion users are active on Facebook
- Every day, 300 million photos are uploaded
That is really huge! But why are we discussing all this? First, is it even possible to study this kind of data, and second, is it really useful for a business? The answer is yes. Not just yes, but a big YES. Let us see how.
Data warehousing and business intelligence
In the 1970s, when people started using databases, they were mostly concerned with automating manual processes. Data from daily business transactions was collected and analyzed. Slowly, as businesses grew, many more variables came into the picture, such as seasonality, location preferences, the different tastes of different customer segments, and market conditions. A longer-term view of the data was then required. Sales data for various products across multiple locations was stored in a separate, larger repository year after year. This data was aggregated and plotted as graphs to find patterns. Most businesses relied on this in-house data to derive intelligence. The tools and technologies used to accomplish this are known as data warehousing and business intelligence (DW/BI).
What is Data Science?
From the 1990s onward, as users across the world took to the internet and later to social media, the picture changed completely. When someone bought a product or watched a promotional marketing campaign, he or she started posting comments about it on social media. People started liking and forwarding or sharing those comments with their friends and groups on social media. Every business became interested in such trends because all of this data was shared voluntarily. Processing this data was highly valuable because it showed customer sentiment live and on a mass scale.
To process such voluminous data and discover insights, a new discipline called Data Science was born. This type of computing was previously called data mining. In recent years, however, it has been glamourized under the names data science and machine learning. This discipline deals with processing raw or transactional data and extracting intelligence from it.
There are many technologies and tools used in data science. One can divide the study of data science into two main sections: data engineering and data analytics.
Data Engineering is the section that covers the various data sources from which data is read, processed, transformed and stored in different ways and on different storage media. This includes everything from file systems, relational databases, data warehouses and NoSQL databases to big data processing mechanisms such as MapReduce, Pig and Hive. The main focus of this section is to store and preprocess data and make it ready so that it can be used by data analytics tools to derive intelligence from it.
Data Analytics is the section that contains the tools and technologies used to process the data and generate insights in the form of patterns, business rules or predictive models. Data analytics is of three types: Descriptive Analytics, Predictive Analytics and Prescriptive Analytics.
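As a rough illustration of the read, transform and store steps described above, here is a minimal Python sketch that uses only the standard library. The CSV content, the column names and the `sales` table are all made up for the example; a real pipeline would read from actual files or systems and would more likely use the dedicated tools mentioned above.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or an API.
RAW_CSV = """customer,city,amount
Asha, pune ,120.50
Ravi,DELHI,80
Meera,Pune,
"""

def extract(text):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Normalise names and cities, and drop rows with a missing amount."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip incomplete records
        cleaned.append(
            (row["customer"].strip(), row["city"].strip().title(), float(row["amount"]))
        )
    return cleaned

def load(rows, conn):
    """Store the cleaned rows in a (hypothetical) sales table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, city TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
load(transform(extract(RAW_CSV)), conn)
stored = conn.execute(
    "SELECT customer, city, amount FROM sales ORDER BY customer"
).fetchall()
print(stored)
```

Once the data is cleaned and stored like this, analytics tools can query it without worrying about inconsistent city names or missing values.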
Descriptive analytics deals with past data. It performs a postmortem of what happened. This information is then used for future planning and forecasting, guided by human judgment.
Predictive analytics uses past data to train a model. Once the model is trained, it can be used to predict a future outcome from current input data. It tells us what can happen. For example, past records of customer profiles and the outcomes of their bank loan applications can be used to create a predictive model. This model can then be used to screen future loan applicants. Customers selected this way should be less likely to default, provided the model is accurate.
Prescriptive analytics is the next step beyond prediction. Once the predictions are known, it advises the business on possible outcomes and possible actions so that the probability of success increases.
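To make the idea concrete, here is a small sketch of descriptive analytics in Python: it simply adds up past sales by year and by location. The records and field layout are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sales records: (year, location, product, units_sold)
sales = [
    (2021, "Pune", "tea", 120),
    (2021, "Delhi", "tea", 80),
    (2022, "Pune", "tea", 150),
    (2022, "Delhi", "tea", 70),
    (2022, "Pune", "coffee", 40),
]

def total_by(records, key_index):
    """Sum units_sold grouped by the field at position key_index."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key_index]] += record[3]
    return dict(totals)

units_by_year = total_by(sales, 0)
units_by_location = total_by(sales, 1)
print(units_by_year)
print(units_by_location)
```

Summaries like these only describe what has already happened; deciding what to do next is still left to the person reading them.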
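The loan example can be sketched with a very simple technique, a nearest-neighbour vote, written in plain Python. This is only an illustration: the records, the two features (income and existing debt) and the `repaid` label are assumptions, and real credit models use far more data and more careful methods.

```python
import math

# Hypothetical past loan records: (monthly_income, existing_debt, repaid)
# Amounts are in thousands; repaid is True if the loan was paid back.
past_loans = [
    (80, 10, True),
    (60, 5, True),
    (75, 20, True),
    (30, 25, False),
    (25, 15, False),
    (40, 35, False),
]

def predict_repaid(income, debt, history, k=3):
    """Predict by majority vote among the k most similar past customers."""
    by_distance = sorted(
        history,
        key=lambda rec: math.hypot(income - rec[0], debt - rec[1]),
    )
    votes = [rec[2] for rec in by_distance[:k]]
    return votes.count(True) > k / 2

# Score two new, made-up applicants.
print(predict_repaid(70, 12, past_loans))
print(predict_repaid(28, 20, past_loans))
```

Here the "training" is just keeping the past records; the prediction for a new applicant comes from the outcomes of the past customers who look most like them.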
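One possible way to picture this step: take a predicted probability of default and recommend the action with the highest expected profit. The sketch below does exactly that in Python. The offer names, interest figures and loss figures are hypothetical, and expected profit is just one of many rules a business might choose.

```python
def expected_profit(p_default, interest_income, loss_if_default):
    """Expected profit of a loan given the predicted probability of default."""
    return (1 - p_default) * interest_income - p_default * loss_if_default

def recommend(p_default, offers):
    """Return the offer with the highest positive expected profit, else 'decline'."""
    best_name, best_value = "decline", 0.0
    for name, (interest, loss) in offers.items():
        value = expected_profit(p_default, interest, loss)
        if value > best_value:
            best_name, best_value = name, value
    return best_name

# Hypothetical offers: name -> (interest income, loss if the customer defaults)
offers = {
    "small_loan": (10.0, 50.0),
    "large_loan": (30.0, 200.0),
}

for p in (0.05, 0.15, 0.30):
    print(p, recommend(p, offers))
```

With these made-up numbers, a low-risk customer is offered the large loan, a medium-risk customer the small loan, and a high-risk customer is declined.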
Distance MBA in Data Science or AI and ML
A distance MBA program would focus on both of these aspects: data engineering and data analytics. There are different job roles in data engineering and in data analytics. Data engineering, being more focused on data processing and programming, may interest computer science graduates. Data analytics, being more statistically oriented, may attract students from other science disciplines. But the bottom line is that one has to have a fair knowledge of both to do well in this career.