Mastering predictive analytics with Python: exploit the power of data in your business by building advanced predictive modeling applications with Python

Exploit the power of data in your business by building advanced predictive modeling applications with Python. About This Book: Master open source Python tools to build sophisticated predictive models. Learn to identify the right machine learning algorithm for your problem with this forward-thinking gui...


Bibliographic Details
Other Authors:	Babcock, Joseph (author)
Format:	E-book
Language:	English
Published:	Birmingham, England : Packt Publishing 2016.
Edition:	1st edition
Series:	Community experience distilled.
Subjects:	Business planning > Computer programs.
View at Universitat Ramon Llull Library:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630298406719

Table of Contents:

Cover
Copyright
Credits
About the Author
About the Reviewer
www.PacktPub.com
Table of Contents
Preface
Chapter 1: From Data to Decisions - Getting Started with Analytic Applications
Designing an advanced analytic solution
Data layer: warehouses, lakes, and streams
Modeling layer
Deployment layer
Reporting layer
Case study: sentiment analysis of social media feeds
Data input and transformation
Sanity checking
Model development
Scoring
Visualization and reporting
Case study: targeted e-mail campaigns
Data input and transformation
Sanity checking
Model development
Scoring
Visualization and reporting
Summary
Chapter 2: Exploratory Data Analysis and Visualization in Python
Exploring categorical and numerical data in IPython
Installing IPython notebook
The notebook interface
Loading and inspecting data
Basic manipulations - grouping, filtering, mapping, and pivoting
Charting with Matplotlib
Time series analysis
Cleaning and converting
Time series diagnostics
Joining signals and correlation
Working with geospatial data
Loading geospatial data
Working in the cloud
Introduction to PySpark
Creating the SparkContext
Creating an RDD
Creating a Spark DataFrame
Summary
Chapter 3: Finding Patterns in the Noise - Clustering and Unsupervised Learning
Similarity and distance metrics
Numerical distance metrics
Correlation similarity metrics and time series
Similarity metrics for categorical data
K-means clustering
Affinity propagation - automatically choosing cluster numbers
k-medoids
Agglomerative clustering
Where agglomerative clustering fails
Streaming clustering in Spark
Summary
Chapter 4: Connecting the Dots with Models - Regression Methods
Linear regression
Data preparation
Model fitting and evaluation
Statistical significance of regression outputs
Generalized estimating equations
Mixed effects models
Time series data
Generalized linear models
Applying regularization to linear models
Tree methods
Decision trees
Random forest
Scaling out with PySpark - predicting year of song release
Summary
Chapter 5: Putting Data in its Place - Classification Methods and Analysis
Logistic regression
Multiclass logistic classifiers: multinomial regression
Formatting a dataset for classification problems
Learning pointwise updates with stochastic gradient descent
Jointly optimizing all parameters with second-order methods
Fitting the model
Evaluating classification models
Strategies for improving classification models
Separating nonlinear boundaries with support vector machines
Fitting an SVM to the census data
Boosting: combining small models to improve accuracy
Gradient boosted decision trees
Comparing classification methods
Case study: fitting classifier models in PySpark
Summary
Chapter 6: Words and Pixels - Working with Unstructured Data
Working with textual data
Cleaning textual data
Extracting features from textual data
Using dimensionality reduction to simplify datasets
Principal component analysis
Latent Dirichlet Allocation
Using dimensionality reduction in predictive modeling
Images
Cleaning image data
Thresholding images to highlight objects
Dimensionality reduction for image analysis
Case Study: Training a Recommender System in PySpark
Summary
Chapter 7: Learning from the Bottom Up - Deep Networks and Unsupervised Features
Learning patterns with neural networks
A network of one - the perceptron
Combining perceptrons - a single-layer neural network
Parameter fitting with back-propagation
Discriminative versus generative models
Vanishing gradients and explaining away
Pretraining belief networks
Using dropout to regularize networks
Convolutional networks and rectified units
Compressing data with autoencoder networks
Optimizing the learning rate
The TensorFlow library and digit recognition
The MNIST data
Constructing the network
Summary
Chapter 8: Sharing Models with Prediction Services
The architecture of a prediction service
Clients and making requests
The GET requests
The POST request
The HEAD request
The PUT request
The DELETE request
Server - the web traffic controller
Application - the engine of the predictive services
Persisting information with database systems
Case study - logistic regression service
Setting up the database
The web server
The web application
The flow of a prediction service - training a model
On-demand and bulk prediction
Summary
Chapter 9: Reporting and Testing - Iterating on Analytic Systems
Checking the health of models with diagnostics
Evaluating changes in model performance
Changes in feature importance
Changes in unsupervised model performance
Iterating on models through A/B testing
Experimental allocation - assigning customers to experiments
Deciding a sample size
Multiple hypothesis testing
Guidelines for communication
Translate terms to business values
Visualizing results
Case Study: building a reporting service
The report server
The report application
The visualization layer
Summary
Index
