The Art and Science of Data Mining in the Aviation Industry
by Sarosh Bhatti
Aviation Enthusiast - Strategist - Analyst - Forecaster and Planner
October 3, 2020
What is Data Mining?
In general terms, “Mining” is the process of extracting a valuable material from the earth, as in coal mining or diamond mining. In the context of analytics, “Data Mining” refers to the extraction of useful information from large data sources or data warehouses. The term itself is a little confusing. In coal or diamond mining, the result of the extraction process is coal or diamond. But in the case of Data Mining, the result of the extraction process is not data! Instead, the results of data mining are the patterns and knowledge that we gain at the end of the extraction process. In that sense, Data Mining is also known as Knowledge Discovery or Knowledge Extraction.
Data Mining is the art and science of discovering knowledge, insights, and patterns in data. In practice, it is the act of extracting useful insights and patterns from a collection of data, whether structured or unstructured. Data Mining is a very broad and multidisciplinary field that borrows techniques from a variety of areas. For example, it uses the knowledge of data quality and data organization from the database area, draws modeling and analytical techniques from statistics, and takes the art of decision-making from business management.
Data Mining in the Aviation Industry
We are in the information economy, and more and more data is being generated each second. Every time you buy a plane ticket or hotel stay, rent a car or travel on your dream vacation, purchase goods in-flight or in the airport, or walk through the terminal, valuable data is being generated and stored in a database.
Aviation organizations are storing, processing, and analyzing this data to gain intelligence and improve their decision-making. This is where data mining comes in. Data mining applies quantitative methods such as logistic regression, neural networks, segmentation, classification, and clustering. Generally, wherever you have processes and data, applying these mathematical models together with statistical techniques helps extract insights and patterns for business planning and decision-making.
How Data Mining Happens
The exact process varies, but a good practice within the aviation industry is to follow these six steps:
1. Business Understanding
The first step is establishing the goals of the project and identifying how data mining can help achieve them. A plan should be developed at this stage to include timelines, actions, and role assignments.
2. Data Understanding
Data is collected from all applicable data sources in this step. Relevant data is the key, and secondary data is just as essential. Data visualization tools are often used in this stage to explore the properties of the data and confirm that it will help achieve the business goals.
3. Data Preparation
Data is then cleansed: quality management principles are applied, and missing values are filled in or otherwise handled so that the data is ready to be mined. Depending on the amount of data analyzed and the number of data sources, this can be the most time-consuming step.
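As an illustration of the cleansing described above, here is a minimal sketch in Python. The booking records, the field names (`pnr`, `route`, `fare`), and the choice of median imputation are all assumptions made for the example, not a prescribed method.

```python
from statistics import median

# Hypothetical booking records; field names and values are illustrative only.
bookings = [
    {"pnr": "A1", "route": "KHI-DXB", "fare": 320.0},
    {"pnr": "A2", "route": "khi-dxb ", "fare": None},   # missing fare, untidy route
    {"pnr": "A3", "route": "LHE-JED", "fare": 410.0},
    {"pnr": "A1", "route": "KHI-DXB", "fare": 320.0},   # duplicate of the first record
    {"pnr": "A4", "route": "LHE-JED", "fare": 390.0},
]

def prepare(records):
    """Deduplicate by PNR, normalize route codes, and fill missing fares."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec["pnr"] in seen:
            continue  # drop repeated records
        seen.add(rec["pnr"])
        cleaned.append({**rec, "route": rec["route"].strip().upper()})

    # Impute missing fares with the median of the known fares (an assumption;
    # a per-route median or dropping the row may suit real data better).
    known = [r["fare"] for r in cleaned if r["fare"] is not None]
    fill = median(known)
    for r in cleaned:
        if r["fare"] is None:
            r["fare"] = fill
    return cleaned

clean_bookings = prepare(bookings)
```

After this step, `clean_bookings` holds one record per PNR, with consistent route codes and no missing fares.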
4. Data Modeling
Mathematical models are then applied to the prepared data to find patterns in it.
5. Evaluation
The findings are evaluated and compared to business objectives to determine if they should be deployed across the organization.
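One simple way to compare findings against a business objective is to measure model error on held-out data and check it against an agreed threshold. The sketch below assumes a hypothetical passenger forecast and a hypothetical target of at most 10 passengers of average error; both the figures and the threshold are invented for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and forecast values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical holdout data: daily passengers on one route vs. the model forecast.
actual_pax = [150, 162, 171, 158]
forecast_pax = [148, 170, 165, 160]

# Hypothetical business objective: forecasts within 10 passengers on average.
MAX_ACCEPTABLE_ERROR = 10.0

error = mean_absolute_error(actual_pax, forecast_pax)
ready_to_deploy = error <= MAX_ACCEPTABLE_ERROR
```

In practice the metric and the threshold would come from the goals set in the Business Understanding step.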
6. Deployment
In the final stage, the data mining findings are shared across the organization. An enterprise business intelligence platform can be used to provide a single source of truth for self-service data discovery.
Types of Data Mining in Aviation
Depending on an airline’s or airport’s maturity in analytics, the two primary types of data mining practices used are Supervised Learning and Unsupervised Learning.
1. Supervised Learning
Aviation organizations use supervised learning when their goal is prediction or classification. A process is considered supervised learning if the goal of the model is to predict the value of an observation, using past examples whose outcome is already known. The easiest way to recognize such a process is to look for a single output variable. Some examples are predictive aircraft maintenance and route forecasts. Common analytical models used in supervised data mining approaches in aviation are: Linear Regression, Logistic Regression, Time Series models, Classification and Regression Trees, Neural Networks, and K-Nearest Neighbors.
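To make the idea of a single output variable concrete, here is a minimal K-Nearest Neighbors sketch for a predictive-maintenance style question. The training data, the two features (thousands of hours since overhaul and a vibration index), and the labels `ok` and `inspect` are hypothetical; a real model would use far more data and would scale the features first.

```python
from collections import Counter
from math import dist

# Hypothetical labeled history:
# (hours since overhaul in thousands, vibration index) -> known outcome
training = [
    ((0.5, 1.0), "ok"),
    ((1.0, 1.2), "ok"),
    ((1.5, 1.1), "ok"),
    ((4.0, 3.5), "inspect"),
    ((4.5, 3.0), "inspect"),
    ((5.0, 3.8), "inspect"),
]

def knn_predict(features, labeled, k=3):
    """Label a new observation by majority vote among its k nearest neighbors."""
    nearest = sorted(labeled, key=lambda item: dist(item[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# The single output variable here is the maintenance label.
prediction = knn_predict((4.2, 3.2), training)
```

The new component sits closest to the three past cases labeled `inspect`, so that is the label the sketch assigns to it.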
2. Unsupervised Learning
Unsupervised tasks focus on understanding and describing data to reveal underlying patterns, with no single output variable to predict. Recommendation systems employ unsupervised learning to track user patterns and provide personalized recommendations to enhance the customer experience. In aviation, this is used to understand customers’ buying behaviors or perform association analyses. For example, certain segments of passengers, when buying an airline ticket, will also buy hotel accommodation and rental cars. Common analytical models used in unsupervised data mining approaches are: Clustering, Association Analysis, and Principal Component Analysis.
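The ticket, hotel, and rental car example can be expressed as a small association analysis. The sketch below computes support (how often two products appear together) and confidence (how often B is bought given that A was bought) for every pair of products. The baskets are invented for illustration, and production work would typically use a dedicated algorithm such as Apriori rather than this brute-force count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical bookings: the set of products bought in one transaction.
baskets = [
    {"ticket", "hotel", "car"},
    {"ticket", "hotel"},
    {"ticket"},
    {"ticket", "hotel", "car"},
    {"ticket", "car"},
]

def pair_rules(transactions):
    """Return {(A, B): (support, confidence)} for every rule A -> B."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of all baskets containing both items
        rules[(a, b)] = (support, count / item_counts[a])  # confidence of a -> b
        rules[(b, a)] = (support, count / item_counts[b])  # confidence of b -> a
    return rules

rules = pair_rules(baskets)
support, confidence = rules[("hotel", "car")]
```

Here `support` is the share of bookings containing both a hotel and a car, and `confidence` is the share of hotel buyers who also rented a car.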
In Closing
We have all heard the saying that data is the new oil. Well, I like to say that it’s the new jet fuel. Whatever analogy we use, data just sitting there untouched does not do anyone any good. It needs to be mined and applied to various business operations to improve efficiencies, enhance the customer experience, and increase revenues.