Data Mining: Software Helping Business Run
Group 4
Austin Beam, Brittany Dearien, Warren Irwin, Amanda Medlin, Rob Westerman
Key Words: Data Mining Tools, Knowledge-Discovery in Databases (KDD), Modeling, Scoring, Quantitative View, Qualitative View, Privacy, Ethics, Fraud Detection, Market Segmentation
As the volume of company and consumer data grows, businesses can now afford to fill their data warehouses with valuable information on customers and clients. These databases store large quantities of specific data for easy retrieval and interpretation by users. Data mining makes it possible to search large volumes of data for hidden patterns, which in turn can be used to predict future trends.
Data mining, also known as Knowledge-Discovery in Databases (KDD), is a computer-assisted process of digging through and analyzing enormous sets of data and then extracting the meaning of that data. While data mining is a relatively new topic in computing, it applies many older computational techniques from statistics, machine learning, and pattern recognition.
The goal is to simplify and automate the overall statistical process of building and applying a model. The most critical requirement of the mining process is a set of data that is both large and relevant. Three technologies form the core ingredients of data mining: statistical algorithms, increased computing power, and improved data collection and management. Through modeling, these programs take information that is known and use it to infer information that is not known. To build a predictive model, for example, the program can use demographics such as family size, income, or age to answer a question. Data mining can also draw on various sets of algorithms, such as an ‘if/then’ decision tree. The way the model's results are recorded for each item of data is commonly known as scoring. Scored data can be examined through either a qualitative or a quantitative view, and the two differ in how difficult they are to understand. A qualitative view provides insight into the data you are working with, but requires greater interaction capabilities and good visualization. In contrast, a quantitative view is more of an automated process with a bottom-line orientation. These data mining tools are used to discover the hidden patterns and relationships in the data, leading to better business decisions overall.
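The ‘if/then’ decision tree and the idea of scoring can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the Customer record, the score_customer function, the segment labels, and every threshold are hypothetical and were invented for the example. A real data mining tool would learn its rules from a large, relevant data set rather than have them written by hand.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical demographic record; the field names are illustrative."""
    age: int
    income: float
    family_size: int


def score_customer(c: Customer) -> str:
    """Walk a hand-written if/then tree and return a predicted segment.

    The thresholds are invented for illustration, not learned from data.
    """
    if c.income >= 60000:
        if c.family_size >= 3:
            return "likely buyer"
        return "possible buyer"
    if c.age < 30:
        return "possible buyer"
    return "unlikely buyer"


# Known information (demographics) goes in; a prediction comes out.
customers = [
    Customer(age=42, income=85000, family_size=4),
    Customer(age=25, income=32000, family_size=1),
    Customer(age=58, income=41000, family_size=2),
]

# "Scoring" here means recording the model's answer for each record.
scores = [score_customer(c) for c in customers]
for customer, score in zip(customers, scores):
    print(customer, "->", score)
```

Each branch of the tree is a plain if/then rule, which is why decision trees are comparatively easy for a business user to read and question.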
Companies in a wide range of industries are already using data mining as a means to view and navigate through their data. These industries include, but are not limited to, retail, finance, health care, transportation, and manufacturing. The potential uses of data mining in these businesses are nearly endless. Through statistical and mathematical techniques, it is possible to recognize significant facts, trends, patterns of customer loyalty, and exceptions that might otherwise go unnoticed. For example, by displaying the market segmentation of different types of customers, it is much easier to recognize those who purchase the same product. Moreover, visualizing customer churn enables a business to see which consumers may leave for a competitor in the same market. In its most common use, direct marketing through data mining tools takes place every time a company sends an individual mail that is related to that person’s purchases or interests. Online, interactive marketing allows a web server to predict what each individual visiting a web site is most likely interested in seeing. A prime example of market basket analysis is Wal-Mart placing Rotel and Velveeta on the same shelf because of their similar purchase patterns. As trends begin to fade, the purchase patterns picked up by this software can help a grocery chain stock fewer popsicles as the winter months approach. Data mining can also aid in detecting when fraud is most likely to occur in various transactions.
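The market basket analysis mentioned above comes down to counting how often items are bought together. The sketch below is a rough illustration, not how any retailer actually does it: the baskets are made up, and the helper names support and confidence are hypothetical, though they compute the two measures commonly used in this kind of analysis.

```python
from collections import Counter
from itertools import combinations

# Made-up transactions for illustration only; not real purchase data.
baskets = [
    {"Rotel", "Velveeta", "chips"},
    {"Rotel", "Velveeta"},
    {"Velveeta", "bread"},
    {"Rotel", "Velveeta", "soda"},
    {"popsicles", "soda"},
]

item_counts = Counter()  # how many baskets contain each item
pair_counts = Counter()  # how many baskets contain each pair of items
for basket in baskets:
    item_counts.update(basket)
    pair_counts.update(frozenset(p) for p in combinations(sorted(basket), 2))


def support(a: str, b: str) -> float:
    """Share of all baskets that contain both a and b."""
    return pair_counts[frozenset((a, b))] / len(baskets)


def confidence(a: str, b: str) -> float:
    """Of the baskets that contain a, the share that also contain b."""
    return pair_counts[frozenset((a, b))] / item_counts[a]


print("support(Rotel, Velveeta):", support("Rotel", "Velveeta"))
print("confidence(Rotel -> Velveeta):", confidence("Rotel", "Velveeta"))
print("confidence(Velveeta -> Rotel):", confidence("Velveeta", "Rotel"))
```

A pair with high support and high confidence is a candidate for shared shelf space or a joint promotion.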
The numerous advantages of this process are evident in our daily routines. As discussed in class, if a hunter were to order a turkey call from a magazine, he may also receive an advertisement from a business such as Cabela’s or Bass Pro Shops for a sale on Mossberg 12-gauge pump shotguns. This is possible because of the data mining tools used by companies like Cabela’s. These tools uncover everyday trends that point to goods complementary to a consumer's previous purchase. If a hunter is interested in one product, he would most likely want to know that a related good is on sale or at a reduced price. In addition, government has used data mining to work with taxes, zoning, and defense. Surprisingly enough, data mining has been cited as the means by which the U.S. Army unit Able Danger supposedly identified Mohamed Atta, a leader of the 9/11 attacks, along with three other hijackers in the al Qaeda cell, a year before the attack.
The major concerns with data mining include how relevant the data is, whether the group doing the mining has the right to query that information, and, most importantly, privacy. E-businesses may sometimes use data mining tools in ways that infringe upon consumers' privacy. When a consumer makes a purchase online, a business may collect information in order to send unwanted advertisements and spam about future online deals. Additionally, if an employer has access to medical records, it may sway a decision one way or another during the application process. By considering whether the applicant has had a heart attack or a previous illness, that employer may choose not to hire that person in hopes of keeping insurance costs down. This creates a dilemma in both ethics and legality.
The booming market for data mining software grew from $540 million in 2002 to $1.5 billion in 2005. Several producers have consolidated their efforts, creating the big players in the data mining software market. These players include Oracle, Angoss, and Unica. This mining technology seems to rely more on automated analytics and statistical tools than on the work statisticians have done by hand in the past. Although statisticians are valuable in designing these software programs, in most cases it is more cost-effective and time-efficient to run a data mining program in their place. A wide variety of new and emerging data mining software has proven helpful to both big and small businesses. Through this technology they accomplish their goals and, in the end, serve each consumer on a more personal level.
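To put the market figures in perspective, the growth from $540 million to $1.5 billion can be turned into an overall growth factor and an average yearly growth rate. The calculation below is a quick sketch: it uses only the two figures quoted above and assumes steady compound growth over the three years, which the source does not claim.

```python
# Market size figures quoted in the text (US dollars).
start_value = 540e6   # 2002
end_value = 1.5e9     # 2005
years = 2005 - 2002

# How many times larger the market was in 2005 than in 2002.
growth_factor = end_value / start_value

# Compound annual growth rate, assuming the same rate every year
# (an assumption made only for this illustration).
cagr = growth_factor ** (1 / years) - 1

print(f"growth factor: {growth_factor:.2f}x")
print(f"average yearly growth: {cagr:.1%}")
```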
As databases continue to grow in size and complexity, data mining helps businesses search for and find the criteria that fit the situation. Searching large quantities of information for otherwise hidden trends and patterns makes many tasks in business and science easier and less stressful. This solution can answer the question that the model poses and uncover the trends and patterns that help an organization run day by day.
References
Alexander, Doug. (2005). “Data Mining.”
http://www.eco.utexas.edu/~norman/BUS.FOR/course.mat/Alex/. (April 17, 2006)
Thearling, Kurt. (2006). “Information About Data Mining and Analytic Technologies.”
http://www.thearling.com/. (April 17, 2006)
Thearling, Kurt. (2006). “An Introduction to Data Mining.”
http://www.thearling.com/dmintro/dmintro_frame.htm. (