Visual data mining: the VisMiner approach

A visual approach to data mining. Data mining has been defined as the search for useful and previously unknown patterns in large datasets, yet when faced with the task of mining a large dataset, it is not always obvious where to start and how to proceed. This book introduces a visual methodology...

Bibliographic Details
Main author: Anderson, Russell K.
Format: E-book
Language: English
Published: Chichester, West Sussex, U.K. ; Hoboken, N.J. : Wiley 2012.
Edition: 2nd ed
Series: New York Academy of Sciences
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009628562706719
Table of Contents:
  • Visual Data Mining: THE VISMINER APPROACH; Contents; Preface; Acknowledgments; 1. Introduction; Data Mining Objectives; Introduction to VisMiner; The Data Mining Process; Initial Data Exploration; Dataset Preparation; Algorithm Selection and Application; Model Evaluation; Summary; 2. Initial Data Exploration and Dataset Preparation Using VisMiner; The Rationale for Visualizations; Tutorial - Using VisMiner; Initializing VisMiner; Initializing the Slave Computers; Opening a Dataset; Viewing Summary Statistics; Exercise 2.1; The Correlation Matrix; Exercise 2.2; The Histogram; The Scatter Plot
  • Exercise 2.3; The Parallel Coordinate Plot; Exercise 2.4; Extracting Sub-populations Using the Parallel Coordinate Plot; Exercise 2.5; The Table Viewer; The Boundary Data Viewer; Exercise 2.6; The Boundary Data Viewer with Temporal Data; Exercise 2.7; Summary; 3. Advanced Topics in Initial Exploration and Dataset Preparation Using VisMiner; Missing Values; Missing Values - An Example; Exploration Using the Location Plot; Exercise 3.1; Dataset Preparation - Creating Computed Columns; Exercise 3.2; Aggregating Data for Observation Reduction; Exercise 3.3; Combining Datasets; Exercise 3.4
  • Outliers and Data Validation; Range Checks; Fixed Range Outliers; Distribution Based Outliers; Computed Checks; Exercise 3.5; Feasibility and Consistency Checks; Data Correction Outside of VisMiner; Distribution Consistency; Pattern Checks; A Pattern Check of Experimental Data; Exercise 3.6; Summary; 4. Prediction Algorithms for Data Mining; Decision Trees; Stopping the Splitting Process; A Decision Tree Example; Using Decision Trees; Decision Tree Advantages; Limitations; Artificial Neural Networks; Overfitting the Model; Moving Beyond Local Optima; ANN Advantages and Limitations
  • Support Vector Machines; Data Transformations; Moving Beyond Two-dimensional Predictors; SVM Advantages and Limitations; Summary; 5. Classification Models in VisMiner; Dataset Preparation; Tutorial - Building and Evaluating Classification Models; Model Evaluation; Exercise 5.1; Prediction Likelihoods; Classification Model Performance; Interpreting the ROC Curve; Classification Ensembles; Model Application; Summary; Exercise 5.2; Exercise 5.3; 6. Regression Analysis; The Regression Model; Correlation and Causation; Algorithms for Regression Analysis; Assessing Regression Model Performance
  • Model Validity; Looking Beyond R²; Polynomial Regression; Artificial Neural Networks for Regression Analysis; Dataset Preparation; Tutorial; A Regression Model for Home Appraisal; Modeling with the Right Set of Observations; Exercise 6.1; ANN Modeling; The Advantage of ANN Regression; Top-Down Attribute Selection; Issues in Model Interpretation; Model Validation; Model Application; Summary; 7. Cluster Analysis; Introduction; Algorithms for Cluster Analysis; Issues with K-Means Clustering Process; Hierarchical Clustering; Measures of Cluster and Clustering Quality; Silhouette Coefficient
  • Correlation Coefficient