Applied Data Mining for Business and Industry - Hardcover

9780470058862: Applied Data Mining for Business and Industry

Synopsis

The increasing availability of data in our current, information-overloaded society has led to the need for valid tools for its modelling and analysis. Data mining and applied statistical methods are the appropriate tools to extract knowledge from such data. This book provides an accessible introduction to data mining methods in a consistent and application-oriented statistical framework, using case studies drawn from real industry projects and highlighting the use of data mining methods in a variety of business applications.

  • Introduces data mining methods and applications.
  • Covers classical and Bayesian multivariate statistical methodology as well as machine learning and computational data mining methods.
  • Includes many recent developments such as association and sequence rules, graphical Markov models, lifetime value modelling, credit risk, operational risk and web mining.
  • Features detailed case studies based on applied projects within industry.
  • Incorporates discussion of data mining software, with case studies analysed using R.
  • Is accessible to anyone with a basic knowledge of statistics or data analysis.
  • Includes an extensive bibliography and pointers to further reading within the text.

Applied Data Mining for Business and Industry, 2nd edition is aimed at advanced undergraduate and graduate students of data mining, applied statistics, database management, computer science and economics. The case studies will provide guidance to professionals working in industry on projects involving large volumes of data, such as customer relationship management, web design, risk management, marketing, economics and finance.

The information in the "Summary" section may refer to different editions of this title.

About the author

Paolo Giudici, Department of Economics and Quantitative Methods, University of Pavia. A lecturer in data mining, business statistics, data analysis and risk management, Professor Giudici is also the director of the data mining laboratory. He is the author of around 80 publications, the coordinator of two national research grants on data mining, and the local coordinator of a European integrated project on the topic. He was the sole author of the first edition of this book, which has been translated into both Italian and Chinese. He is also one of the editors of Wiley's Series in Computational Statistics.

Silvia Figini. Ms Figini worked for two years at the Competence centre for data mining analysis and business intelligence at SAS Milan. She is currently completing a PhD in statistics, and already has a collection of publications to her name.

Excerpt. © Reprinted with permission. All rights reserved.

Applied Data Mining for Business and Industry

By Paolo Giudici and Silvia Figini

John Wiley & Sons

Copyright © 2009 John Wiley & Sons, Ltd
All rights reserved.

ISBN: 978-0-470-05886-2

Chapter One

Introduction

From an operational point of view, data mining is an integrated process of data analysis consisting of a series of activities that run from the definition of the objectives of the analysis, through the analysis of the data, to the interpretation and evaluation of the results. The various phases of the process are as follows:

Definition of the objectives for analysis. It is not always easy to define statistically the phenomenon we want to analyse. In fact, while the company objectives that we are aiming for are usually clear, they can be difficult to formalise. A clear statement of the problem and the objectives to be achieved is of the utmost importance in setting up the analysis correctly. This is certainly one of the most difficult parts of the process since it determines the methods to be employed. Therefore the objectives must be clear and there must be no room for doubt or uncertainty.

Selection, organisation and pre-treatment of the data. Once the objectives of the analysis have been identified it is then necessary to collect or select the data needed for the analysis. First of all, it is necessary to identify the data sources. Usually data is taken from internal sources, which are cheaper and more reliable. This data also has the advantage of being the result of the experiences and procedures of the company itself. The ideal data source is the company data warehouse, a 'store room' of historical data that is no longer subject to changes and from which it is easy to extract topic databases (data marts) of interest. If there is no data warehouse then the data marts must be created by combining the different sources of company data.

In general, the creation of data marts to be analysed provides the fundamental input for the subsequent data analysis. It leads to a representation of the data, usually in table form, known as a data matrix that is based on the analytical needs and the previously established aims.
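As a rough illustration of how separate company sources might be combined into a single data matrix, the following Python sketch merges two sources on a shared key. The sources (`billing`, `crm`), the variable names and the function `build_data_matrix` are hypothetical and are not taken from the book; values missing from a source are recorded as `None`.

```python
# Two hypothetical internal sources, each keyed by customer id.
billing = {101: {"total_spend": 250.0}, 102: {"total_spend": 90.5}}
crm = {101: {"segment": "retail"}, 103: {"segment": "business"}}


def build_data_matrix(*sources):
    """Combine sources into one row per customer; absent values become None."""
    ids = sorted(set().union(*sources))
    variables = sorted(
        {var for src in sources for rec in src.values() for var in rec}
    )
    rows = []
    for cid in ids:
        row = {"customer_id": cid}
        for var in variables:
            # Take the value from the first source that records it.
            row[var] = next(
                (src[cid][var] for src in sources if cid in src and var in src[cid]),
                None,
            )
        rows.append(row)
    return rows


data_matrix = build_data_matrix(billing, crm)
for row in data_matrix:
    print(row)
```

Each row of the result corresponds to one statistical unit and each key to one variable, which is the table form the text refers to.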

Once a data matrix is available it is often necessary to carry out a process of preliminary cleaning of the data. In other words, a quality control exercise is carried out on the data available. This is a formal process used to find or select variables that cannot be used, that is, variables that exist but are not suitable for analysis. It is also an important check on the contents of the variables and the possible presence of missing or incorrect data. If any essential information is missing it will then be necessary to supply further data. (See Agresti (1990).)
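The quality control exercise described above can be sketched in Python as a check on the share of missing values per variable. This is only a minimal illustration under stated assumptions: the small data matrix, the variable names and the 50% threshold used to flag a variable as unusable are all hypothetical choices, not a rule given in the text.

```python
# Hypothetical data matrix: one dict per observation, None marks a missing value.
data_matrix = [
    {"age": 34, "income": 42000.0, "region": "north"},
    {"age": None, "income": 38500.0, "region": "north"},
    {"age": 51, "income": None, "region": None},
    {"age": 29, "income": None, "region": "south"},
]


def missing_share(rows, variable):
    """Return the fraction of observations in which `variable` is missing."""
    missing = sum(1 for row in rows if row.get(variable) is None)
    return missing / len(rows)


def unusable_variables(rows, threshold=0.5):
    """List variables whose missing share reaches the (assumed) threshold."""
    return [v for v in rows[0] if missing_share(rows, v) >= threshold]


report = {v: missing_share(data_matrix, v) for v in data_matrix[0]}
flagged = unusable_variables(data_matrix)
print(report)
print(flagged)
```

In practice the threshold, and whether a flagged variable is dropped or completed with further data, would depend on the objectives of the analysis.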

Exploratory analysis of the data and their transformation. This phase involves a preliminary exploratory analysis of the data, very similar to online analytical processing (OLAP) techniques. It involves an initial evaluation of the importance of the collected data. This phase might lead to a transformation of the original variables in order to better understand the phenomenon or which statistical methods to use. An exploratory analysis can highlight any anomalous data, that is, data that is different from the rest. This data will not necessarily be eliminated because it might contain information that is important in achieving the objectives of the analysis. We think that an exploratory analysis of the data is essential because it allows the analyst to select the most appropriate statistical methods for the next phase of the analysis. This choice must consider the quality of the available data. The exploratory analysis might also suggest the need for new data extraction, if the collected data is considered insufficient for the aims of the analysis.
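One common way to highlight anomalous data during exploration is the interquartile range rule (Tukey's fences). The Python sketch below uses it purely as an example; the text does not prescribe a specific technique, and the sample values and the multiplier `k=1.5` are assumptions. In keeping with the paragraph above, anomalous values are flagged for inspection, not removed.

```python
import statistics

# Hypothetical variable from the data matrix (e.g. monthly purchases per customer).
purchases = [12, 14, 15, 15, 16, 17, 18, 95]


def flag_anomalies(values, k=1.5):
    """Return the values lying outside Tukey's fences (quartiles -/+ k * IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


anomalies = flag_anomalies(purchases)
# Flagged observations are kept for inspection rather than deleted.
print(anomalies)
```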

Specification of statistical methods. There are various statistical methods that can be used, and thus many algorithms available, so it is important to have a classification of the existing methods. The choice of which method to use in the analysis depends on the problem being studied or on the type of data available. The data mining process is guided by the application. For this reason, the classification of the statistical methods depends on the aim of the analysis. Therefore, we group the methods into two main classes corresponding to distinct phases of the data analysis.

Descriptive methods. The main objective of this class of methods (also called symmetrical, unsupervised or indirect) is to describe groups of data in a succinct way. This can concern both the observations, which are classified into groups not known beforehand (cluster analysis, Kohonen maps) as well as the variables that are connected among themselves according to links unknown beforehand (association methods, log-linear models, graphical models). In descriptive methods there are no hypotheses of causality among the available variables.
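To make the idea of classifying observations into groups not known beforehand more concrete, here is a small k-means sketch in plain Python. It is one illustrative clustering algorithm among many, not necessarily the one the authors use; the sample points, the choice of two groups and the use of the first observations as starting centroids are all assumptions made for the example.

```python
import math

# Hypothetical observations (two numeric variables per customer).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]


def k_means(data, k, max_iter=100):
    """Plain k-means; the first k observations serve as starting centroids."""
    centroids = list(data[:k])
    labels = [0] * len(data)
    for _ in range(max_iter):
        # Assignment step: attach each observation to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in data
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(data, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return labels, centroids


labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

No target variable is involved: the groups emerge only from the distances between observations, which is what makes the method descriptive.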

Predictive methods. In this class of methods (also called asymmetrical, supervised or direct) the aim is to describe one or more of the variables in relation to all the others. This is done by looking for rules of classification or prediction based on the data. These rules help predict or classify the future result of one or more response or target variables in relation to what happens to the explanatory or input variables. The main methods of this type are those developed in the field of machine learning such as neural networks (multilayer perceptrons) and decision trees, but also classic statistical models such as linear and logistic regression models.
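A classification rule of the kind described above can be sketched with a logistic regression on a single explanatory variable, fitted here by batch gradient descent in plain Python. The data, the learning rate, the number of iterations and the 0.5 cutoff are hypothetical choices for illustration only; a real analysis would normally rely on an established statistical package.

```python
import math

# Hypothetical training data: one explanatory variable x, binary target y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, n_iter=10000):
    """Fit intercept b and slope w by batch gradient descent on the log-loss."""
    b, w = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        errors = [sigmoid(b + w * x) - y for x, y in zip(xs, ys)]
        b -= learning_rate * sum(errors) / n
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return b, w


def classify(x, b, w, cutoff=0.5):
    """Predict class 1 when the estimated probability exceeds the cutoff."""
    return int(sigmoid(b + w * x) > cutoff)


b, w = fit_logistic(xs, ys)
print(classify(2.0, b, w), classify(7.0, b, w))
```

Here `y` plays the role of the response or target variable and `x` that of the explanatory or input variable; the fitted rule is then applied to new values of `x`.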

Analysis of the data based on the chosen methods. Once the statistical methods have been specified they must be translated into appropriate algorithms for computing the results we need from the available data. Given the wide range of specialised and non-specialised software available for data mining, it is not necessary to develop ad hoc calculation algorithms for the most 'standard' applications. However, it is important that those managing the data mining process have a good understanding of the different available methods as well as of the different software solutions, so that they can adapt the process to the specific needs of the company and can correctly interpret the results of the analysis.

Evaluation and comparison of the methods used and choice of the final model for analysis. To produce a final decision it is necessary to choose the best 'model' from the various statistical methods available. The choice of model is based on the comparison of the results obtained. It may be that none of the methods used satisfactorily achieves the analysis aims. In this case it is necessary to specify a more appropriate method for the analysis. When evaluating the performance of a specific method, as well as diagnostic measures of a statistical type, other things must be considered such as the constraints on the business both in terms of time and resources, as well as the quality and the availability of data. In data mining it is not usually a good idea to use just one statistical method to analyse data. Each method has the potential to highlight aspects that may be ignored by other methods.
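The comparison of results can be illustrated by scoring candidate models on data held back for validation. In this Python sketch the misclassification rate is used as the diagnostic measure; the validation set and the two candidate rules, which stand in for fitted models, are hypothetical. The business constraints mentioned above (time, resources, data availability) are not captured by such a measure and would have to be weighed separately.

```python
# Hypothetical validation set: (explanatory value, observed class).
validation = [(2.0, 0), (3.5, 0), (4.2, 0), (4.8, 1), (6.0, 1), (7.5, 1)]

# Two hypothetical candidate classification rules standing in for fitted models.
candidates = {
    "rule_cut_3.0": lambda x: int(x > 3.0),
    "rule_cut_4.5": lambda x: int(x > 4.5),
}


def misclassification_rate(predict, data):
    """Share of validation observations for which the prediction is wrong."""
    wrong = sum(1 for x, y in data if predict(x) != y)
    return wrong / len(data)


rates = {
    name: misclassification_rate(rule, validation)
    for name, rule in candidates.items()
}
best = min(rates, key=rates.get)
print(rates)
print(best)
```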

Interpretation of the chosen model and its use in the decision process. Data mining is not only data analysis, but also the integration of the results into the company decision process. Business knowledge, the extraction of rules and their use in the decision process allow us to move from the analytical phase to the production of a decision engine. Once the model has been chosen and tested with a data set, the classification rule can be generalised. For example, we will be able to distinguish which customers will be more profitable or to calibrate differentiated commercial policies for different target consumer groups, thereby increasing the profits of the company.

Having seen the benefits we can get from data mining, it is crucial to implement the process correctly in order to exploit it to its full potential. The inclusion of the data mining process in the company organisation must be done gradually, setting out realistic aims and looking at the results along the way. The final aim is for data mining to be fully integrated with the other activities that are used to support company decisions. This process of integration can be divided into four phases:

Strategic phase. In this first phase we study the business procedures in order to identify where data mining could be more beneficial. The results at the end of this phase are the definition of the business objectives for a pilot data mining project and the definition of criteria to evaluate the project itself.

Training phase. This phase allows us to evaluate the data mining activity more carefully. A pilot project is set up and the results are assessed using the objectives and the criteria established in the previous phase. A fundamental aspect of the implementation of a data mining procedure is the choice of the pilot project. It must be easy to use but also important enough to create interest.

Creation phase. If the positive evaluation of the pilot project results in implementing a complete data mining system it will then be necessary to establish a detailed plan to reorganise the business procedure in order to include the data mining activity. More specifically, it will be necessary to reorganise the business database with the possible creation of a data warehouse; to develop the previous data mining prototype until we have an initial operational version and to allocate personnel and time to follow the project.

Migration phase. At this stage all we need to do is to prepare the organisation appropriately so that the data mining process can be successfully integrated. This means teaching likely users the potential of the new system and increasing their trust in the benefits that the system will bring to the company. It also means constantly evaluating (and communicating) the results obtained from the data mining process.

(Continues...)


Excerpted from Applied Data Mining for Business and Industry by Paolo Giudici and Silvia Figini. Copyright © 2009 by John Wiley & Sons, Ltd. Excerpted by permission of John Wiley & Sons. All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

The information in the "About this book" section may refer to different editions of this title.

Other known editions of the same title

9780470058879: Applied Data Mining for Business and Industry

Featured edition

ISBN 10:  0470058870 ISBN 13:  9780470058879
Publisher: Wiley, 2009
Paperback

Search results for Applied Data Mining for Business and Industry

P Giudici
Publisher: Wiley-Blackwell, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
New Hardcover

From: PBShop.store UK, Fairford, GLOS, United Kingdom

Seller rating: 5 out of 5 stars

HRD. Condition: New. New Book. Shipped from UK. Established seller since 2000. Item code FW-9780470058862

Buy new

EUR 135,79
Shipping cost: EUR 6,03
From: United Kingdom to: Italy

Quantity: 15 available

Giudici, Paolo; Figini, Silvia
Publisher: Wiley, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
New Hardcover

From: Ria Christie Collections, Uxbridge, United Kingdom

Seller rating: 5 out of 5 stars

Condition: New. In. Item code ria9780470058862_new

Buy new

EUR 139,21
Shipping cost: EUR 10,31
From: United Kingdom to: Italy

Quantity: More than 20 available

Giudici, Paolo; Figini, Silvia
Publisher: Wiley, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
New Hardcover

From: GreatBookPricesUK, Woodford Green, United Kingdom

Seller rating: 5 out of 5 stars

Condition: New. Item code 4239246-n

Buy new

EUR 135,78
Shipping cost: EUR 17,20
From: United Kingdom to: Italy

Quantity: More than 20 available

P Giudici
Publisher: John Wiley & Sons, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
New Hardcover

From: moluna, Greven, Germany

Seller rating: 4 out of 5 stars

Condition: New. Item code 446911495

Buy new

EUR 144,36
Shipping cost: EUR 9,70
From: Germany to: Italy

Quantity: More than 20 available

Giudici, Paolo; Figini, Silvia
Publisher: Wiley, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
New Hardcover

From: GreatBookPrices, Columbia, MD, U.S.A.

Seller rating: 5 out of 5 stars

Condition: New. Item code 4239246-n

Buy new

EUR 139,74
Shipping cost: EUR 17,05
From: U.S.A. to: Italy

Quantity: More than 20 available

Giudici, Paolo; Figini, Silvia
Publisher: Wiley, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
New Hardcover

From: Best Price, Torrance, CA, U.S.A.

Seller rating: 5 out of 5 stars

Condition: New. SUPER FAST SHIPPING. Item code 9780470058862

Buy new

EUR 134,18
Shipping cost: EUR 25,57
From: U.S.A. to: Italy

Quantity: 2 available

Giudici, Paolo; Figini, Silvia
Publisher: Wiley, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
Used Hardcover

From: GreatBookPrices, Columbia, MD, U.S.A.

Seller rating: 5 out of 5 stars

Condition: As New. Unread book in perfect condition. Item code 4239246

Buy used

EUR 146,69
Shipping cost: EUR 17,05
From: U.S.A. to: Italy

Quantity: More than 20 available

Giudici, Paolo; Figini, Silvia
Publisher: Wiley, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
Used Hardcover

From: GreatBookPricesUK, Woodford Green, United Kingdom

Seller rating: 5 out of 5 stars

Condition: As New. Unread book in perfect condition. Item code 4239246

Buy used

EUR 147,65
Shipping cost: EUR 17,20
From: United Kingdom to: Italy

Quantity: More than 20 available

Paolo Giudici
Publisher: John Wiley & Sons Inc, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
New Hardcover

From: THE SAINT BOOKSTORE, Southport, United Kingdom

Seller rating: 5 out of 5 stars

Hardback. Condition: New. New copy - Usually dispatched within 4 working days. 550. Item code B9780470058862

Buy new

EUR 162,88
Shipping cost: EUR 10,20
From: United Kingdom to: Italy

Quantity: More than 20 available

Paolo Giudici
Publisher: John Wiley & Sons Inc, 2009
ISBN 10: 0470058862 ISBN 13: 9780470058862
New Hardcover
Print on Demand

From: THE SAINT BOOKSTORE, Southport, United Kingdom

Seller rating: 5 out of 5 stars

Hardback. Condition: New. This item is printed on demand. New copy - Usually dispatched within 5-9 working days. 550. Item code C9780470058862

Buy new

EUR 167,55
Shipping cost: EUR 10,20
From: United Kingdom to: Italy

Quantity: More than 20 available
