Today data is growing at very high speed because almost everything is being digitized. In the last few years, data generation has grown many times over due to computerization, the rapid development of powerful data collection tools, and ever-cheaper storage. Businesses across the world generate gigantic data sets of sales transactions, stock market data, marketing promotions, personal profiles, customer feedback and comments, and so on. Extracting intelligence from data at this scale requires specialized tools and technologies.
Distance MBA in AI and ML programs are designed to study this emerging field of data mining and machine learning.
Data Mining to extract intelligence
All of this began with file systems for storing information. Since 1970, relational databases, in which information is stored as rows and columns in a table structure, have gained wide acceptance. Users got convenient access to their data through a query mechanism. Relational Database Management Systems (RDBMS) became the most prominent tool for storing transactional data. This kind of processing is also called OLTP, or On-Line Transaction Processing.
Progress in technology created faster and cheaper storage, which fostered database and information processing. Different kinds of application-oriented databases were created, for example object-oriented, object-relational, spatial and multimedia databases.
In the 1980s, a new kind of data repository called the data warehouse was proposed to support managerial decision making. In a data warehouse, data from multiple heterogeneous sources is organized under one schema, and data covering several years of transactions is kept. The warehouse supplies techniques for aggregating data into summaries. Because the information can be analyzed along multiple dimensions, this kind of processing is also called OLAP, or On-Line Analytical Processing.
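To make the idea of analyzing data along multiple dimensions more concrete, here is a minimal sketch of an OLAP-style roll-up in plain Python. The sales records, the dimension names and the `roll_up` helper are all hypothetical and chosen only for illustration; a real data warehouse would do this with its own query tools.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, region, product, amount).
SALES = [
    (2021, "North", "TV", 500.0),
    (2021, "South", "TV", 300.0),
    (2022, "North", "TV", 700.0),
    (2022, "North", "Home theater", 400.0),
    (2022, "South", "Home theater", 200.0),
]
DIMENSIONS = ("year", "region", "product")


def roll_up(facts, by):
    """Sum the amount for every combination of the dimensions named in `by`."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]  # the amount is the last field of each fact
    return dict(totals)


# The same facts summarized along different dimensions.
by_year = roll_up(SALES, ["year"])
by_region_product = roll_up(SALES, ["region", "product"])

for key, total in sorted(by_year.items()):
    print(key, total)
```

The same list of facts answers different managerial questions depending on which dimensions are passed to `roll_up`, which is the essence of the multi-dimensional view described above.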
During the 1990s, huge volumes of data began to be generated as the world was linked through the World Wide Web. Handling this huge volume of data and analyzing it manually was very hard. This led to the development of different tools and systems for turning this data into knowledge or information. This is how the whole field of ‘Data Mining’ was born.
In recent years, the growth of social media, web applications and the Internet of Things has multiplied this data many times over. The amount of data and the speed at which it is generated challenge engineers to invent new technologies to process this ultra-mega data and draw intelligence out of it.
Knowledge Discovery from Data (KDD) and Data Mining
A closely related term is knowledge discovery from data, or KDD.
[Figure: Essential steps involved in KDD or data mining]
This diagram can be divided into three parts centered on data mining. The first four blocks cover data preparation, the data mining block sits in the middle, and the blocks after it cover representation or reporting.
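The three parts can be sketched as a small pipeline in plain Python. The step names used here (cleaning, integration, selection, transformation, mining, evaluation and presentation) follow the usual textbook description of KDD and are an assumption about what the diagram shows; the store records and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical raw records from two sources: (customer, item).
# None marks a missing value.
STORE_A = [("ann", "TV"), ("bob", None), ("ann", "Speaker")]
STORE_B = [("cy", "tv"), ("dee", "TV"), ("cy", "Cable")]


def clean(records):
    """Data cleaning: drop records with missing values."""
    return [r for r in records if None not in r]


def integrate(*sources):
    """Data integration: combine several sources into one list."""
    return [r for source in sources for r in source]


def select(records):
    """Data selection: keep only the field relevant to this analysis."""
    return [item for _customer, item in records]


def transform(items):
    """Data transformation: normalize item names to one consistent form."""
    return [item.strip().lower() for item in items]


def mine(items, min_count=2):
    """Data mining: find items that occur at least `min_count` times."""
    counts = Counter(items)
    return {item: n for item, n in counts.items() if n >= min_count}


def present(patterns):
    """Evaluation and presentation: rank the patterns and format a report."""
    ranked = sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{item}: {n} purchases" for item, n in ranked]


# Part 1: data preparation (the first four steps).
prepared = transform(select(integrate(clean(STORE_A), clean(STORE_B))))
# Part 2: data mining.
patterns = mine(prepared)
# Part 3: representation or reporting.
report = present(patterns)
print(report)
```

Most of the functions belong to preparation and reporting, which reflects how much of the work in practice happens around the mining step itself.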
Data mining is the process of discovering interesting patterns and knowledge from large amounts of data.
Data mining is an interdisciplinary field. It has incorporated many technologies, tools and methods from various domains. The domains that have contributed to data mining include statistics, machine learning, pattern recognition, databases, data warehousing, visualization, algorithms, high-performance computing, and application domains such as business intelligence.
Data Mining Vs Machine Learning in distance MBA
Of these disciplines, machine learning is our main interest and the focus for a distance MBA. Machine learning studies how computers can automatically learn or improve based on data. Past data is provided to the computer so that it can train itself. The computer discovers, or learns, the patterns in that data, builds a model, and applies the model to new data to make predictions. This can aid better decision making. For example, a computer can automatically select the most promising buyers for a product such as a television or a home theater system, so that marketing has the best chance of success. In other words, the selected potential customers are the ones most likely to end up making the purchase.
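The train, model and predict cycle described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the customer data is made up, the two features (monthly income in thousands and number of past electronics purchases) are hypothetical, and a simple nearest-centroid classifier stands in for whatever learning method a real project would use.

```python
from math import dist

# Hypothetical past data:
# (monthly income in thousands, past electronics purchases) -> bought a TV?
TRAINING = [
    ((20.0, 0.0), False),
    ((25.0, 1.0), False),
    ((30.0, 0.0), False),
    ((60.0, 3.0), True),
    ((70.0, 2.0), True),
    ((80.0, 4.0), True),
]


def train(examples):
    """Learn one centroid (mean feature vector) per class from past data."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def predict(model, features):
    """Predict the class whose centroid is closest to the new customer."""
    return min(model, key=lambda label: dist(model[label], features))


model = train(TRAINING)

# Apply the learned model to new, unseen customers.
new_customers = [(65.0, 2.0), (28.0, 1.0)]
likely_buyers = [c for c in new_customers if predict(model, c)]
print(likely_buyers)
```

In this sketch the features are used as they are; with real data they would normally be scaled first so that one feature, such as income, does not dominate the distance.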
Nowadays, machine learning is the term that is gaining more popularity. There are different opinions about whether data mining or machine learning is a subset of the other. The book Data Mining: Concepts and Techniques by Jiawei Han, Micheline Kamber and Jian Pei clarifies this relationship. According to the book, machine learning focuses mostly on the accuracy of the model, while data mining is more comprehensive. Data mining also puts strong emphasis on the efficiency and scalability of mining methods on large data sets, and on the handling of complex data types such as time series data.
Whatever this science of turning ‘raw data into magical intelligence’ is called, machine learning or data mining, the techniques are similar. What is important for us is to know the methods and how to apply them for better decision making. This may be achieved either by forming rules from discovered patterns or by making a computer program learn automatically from data. Distance MBA in AI and ML programs cover various strategies, tools and technologies for extracting intelligence from data.