Python for Data Science For Dummies

Python for Data Science For Dummies lets you get your hands dirty with data using one of the top programming languages. This beginner’s guide takes you step by step through getting started, performing data analysis, understanding datasets and example code, working with Google Colab, sampling data, a...

Bibliographic Details
Other Authors: Mueller, John, 1958- (author); Massaron, Luca (author)
Format: E-book
Language: English
Published: Hoboken, New Jersey : John Wiley & Sons, Inc., [2024]
Edition: Third edition
Series: For Dummies
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009784626106719
Table of Contents:
  • Intro
  • Title Page
  • Copyright Page
  • Table of Contents
  • Introduction
  • About This Book
  • Foolish Assumptions
  • Icons Used in This Book
  • Beyond the Book
  • Where to Go from Here
  • Part 1 Getting Started with Data Science and Python
  • Chapter 1 Discovering the Match between Data Science and Python
  • Understanding Python as a Language
  • Viewing Python's various uses as a general-purpose language
  • Interpreting Python
  • Compiling Python
  • Defining Data Science
  • Considering the emergence of data science
  • Outlining the core competencies of a data scientist
  • Linking data science, big data, and AI
  • Creating the Data Science Pipeline
  • Understanding Python's Role in Data Science
  • Considering the shifting profile of data scientists
  • Working with a multipurpose, simple, and efficient language
  • Learning to Use Python Fast
  • Loading data
  • Training a model
  • Viewing a result
  • Chapter 2 Introducing Python's Capabilities and Wonders
  • Working with Python
  • Contributing to data science
  • Getting a taste of the language
  • Understanding the need for indentation
  • Working with Jupyter Notebook and Google Colab
  • Performing Rapid Prototyping and Experimentation
  • Considering Speed of Execution
  • Visualizing Power
  • Using the Python Ecosystem for Data Science
  • Accessing scientific tools using SciPy
  • Performing fundamental scientific computing using NumPy
  • Performing data analysis using pandas
  • Implementing machine learning using Scikit-learn
  • Going for deep learning with Keras and TensorFlow
  • Performing analysis efficiently using XGBoost
  • Plotting the data using Matplotlib
  • Creating graphs with NetworkX
  • Chapter 3 Setting Up Python for Data Science
  • Working with Anaconda
  • Using Jupyter Notebook
  • Accessing the Anaconda Prompt
  • Installing Anaconda on Windows
  • Installing Anaconda on Linux
  • Installing Anaconda on Mac OS X
  • Downloading the Datasets and Example Code
  • Using Jupyter Notebook
  • Starting Jupyter Notebook
  • Stopping the Jupyter Notebook server
  • Defining the code repository
  • Defining a new folder
  • Creating a new notebook
  • Adding notebook content
  • Exporting a notebook
  • Removing a notebook
  • Importing a notebook
  • Understanding the datasets used in this book
  • Chapter 4 Working with Google Colab
  • Defining Google Colab
  • Understanding what Google Colab does
  • Considering the online coding difference
  • Using local runtime support
  • Working with Notebooks
  • Creating a new notebook
  • Opening existing notebooks
  • Using Google Drive for existing notebooks
  • Using GitHub for existing notebooks
  • Using local storage for existing notebooks
  • Saving notebooks
  • Using Drive to save notebooks
  • Using GitHub to save notebooks
  • Using GitHub gists to save notebooks
  • Downloading notebooks
  • Performing Common Tasks
  • Creating code cells
  • Creating text cells
  • Creating special cells
  • Editing cells
  • Moving cells
  • Using Hardware Acceleration
  • Executing the Code
  • Viewing Your Notebook
  • Displaying the table of contents
  • Getting notebook information
  • Checking code execution
  • Sharing Your Notebook
  • Getting Help
  • Part 2 Getting Your Hands Dirty with Data
  • Chapter 5 Working with Jupyter Notebook
  • Using Jupyter Notebook
  • Working with styles
  • Getting Python help
  • Using magic functions
  • Obtaining the magic functions list
  • Working with magic functions
  • Discovering objects
  • Getting object help
  • Obtaining object specifics
  • Using extended Python object help
  • Restarting the kernel
  • Restoring a checkpoint
  • Performing Multimedia and Graphic Integration
  • Embedding plots and other images
  • Loading examples from online sites
  • Obtaining online graphics and multimedia
  • Chapter 6 Working with Real Data
  • Uploading, Streaming, and Sampling Data
  • Uploading small amounts of data into memory
  • Streaming large amounts of data into memory
  • Generating variations on image data
  • Sampling data in different ways
  • Accessing Data in Structured Flat-File Form
  • Reading from a text file
  • Reading CSV delimited format
  • Reading Excel and other Microsoft Office files
  • Sending Data in Unstructured File Form
  • Managing Data from Relational Databases
  • Interacting with Data from NoSQL Databases
  • Accessing Data from the Web
  • Chapter 7 Processing Your Data
  • Juggling between NumPy and pandas
  • Knowing when to use NumPy
  • Knowing when to use pandas
  • Validating Your Data
  • Figuring out what's in your data
  • Removing duplicates
  • Creating a data map and data plan
  • Manipulating Categorical Variables
  • Creating categorical variables
  • Renaming levels
  • Combining levels
  • Dealing with Dates in Your Data
  • Formatting date and time values
  • Using the right time transformation
  • Dealing with Missing Data
  • Finding the missing data
  • Encoding missingness
  • Imputing missing data
  • Slicing and Dicing: Filtering and Selecting Data
  • Slicing rows
  • Slicing columns
  • Dicing
  • Concatenating and Transforming
  • Adding new cases and variables
  • Removing data
  • Sorting and shuffling
  • Aggregating Data at Any Level
  • Chapter 8 Reshaping Data
  • Using the Bag of Words Model to Tokenize Data
  • Understanding the bag of words model
  • Sequencing text items with n-grams
  • Implementing TF-IDF transformations
  • Working with Graph Data
  • Understanding the adjacency matrix
  • Using NetworkX basics
  • Chapter 9 Putting What You Know into Action
  • Contextualizing Problems and Data
  • Evaluating a data science problem
  • Researching solutions
  • Formulating a hypothesis
  • Preparing your data
  • Considering the Art of Feature Creation
  • Defining feature creation
  • Combining variables
  • Understanding binning and discretization
  • Using indicator variables
  • Transforming distributions
  • Performing Operations on Arrays
  • Using vectorization
  • Performing simple arithmetic on vectors and matrices
  • Performing matrix vector multiplication
  • Performing matrix multiplication
  • Part 3 Visualizing Information
  • Chapter 10 Getting a Crash Course in Matplotlib
  • Starting with a Graph
  • Defining the plot
  • Drawing multiple lines and plots
  • Saving your work to disk
  • Setting the Axis, Ticks, and Grids
  • Getting the axes
  • Formatting the axes
  • Adding grids
  • Defining the Line Appearance
  • Working with line styles
  • Using colors
  • Adding markers
  • Using Labels, Annotations, and Legends
  • Adding labels
  • Annotating the chart
  • Creating a legend
  • Chapter 11 Visualizing the Data
  • Choosing the Right Graph
  • Creating comparisons with bar charts
  • Showing distributions using histograms
  • Depicting groups using boxplots
  • Seeing data patterns using scatterplots
  • Creating Advanced Scatterplots
  • Depicting groups
  • Showing correlations
  • Plotting Time Series
  • Representing time on axes
  • Plotting trends over time
  • Plotting Geographical Data
  • Using an environment in Notebook
  • Using Cartopy to plot geographic data
  • Avoiding outdated libraries: The Basemap Toolkit
  • Visualizing Graphs
  • Developing undirected graphs
  • Developing directed graphs
  • Part 4 Wrangling Data
  • Chapter 12 Stretching Python's Capabilities
  • Playing with Scikit-learn
  • Understanding classes in Scikit-learn
  • Defining applications for data science
  • Using Transformative Functions
  • Chaining estimators
  • Transforming targets
  • Composing features
  • Handling heterogeneous data
  • Considering Timing and Performance
  • Benchmarking with timeit
  • Working with the memory profiler
  • Running in Parallel on Multiple Cores
  • Performing multicore parallelism
  • Demonstrating multiprocessing
  • Chapter 13 Exploring Data Analysis
  • The EDA Approach
  • Defining Descriptive Statistics for Numeric Data
  • Measuring central tendency
  • Measuring variance and range
  • Working with percentiles
  • Defining measures of normality
  • Counting for Categorical Data
  • Understanding frequencies
  • Creating contingency tables
  • Creating Applied Visualization for EDA
  • Inspecting boxplots
  • Performing t-tests after boxplots
  • Observing parallel coordinates
  • Graphing distributions
  • Plotting scatterplots
  • Understanding Correlation
  • Using covariance and correlation
  • Using nonparametric correlation
  • Considering chi-square for tables
  • Working with Cramér's V
  • Modifying Data Distributions
  • Using different statistical distributions
  • Creating a Z-score standardization
  • Transforming other notable distributions
  • Chapter 14 Reducing Dimensionality
  • Understanding SVD
  • Looking for dimensionality reduction
  • Using SVD to measure the invisible
  • Performing Factor Analysis and PCA
  • Considering the psychometric model
  • Looking for hidden factors
  • Using components, not factors
  • Achieving dimensionality reduction
  • Squeezing information with t-SNE
  • Understanding Some Applications
  • Recognizing faces with PCA
  • Extracting topics with NMF
  • Recommending movies
  • Chapter 15 Clustering
  • Clustering with K-means
  • Understanding centroid-based algorithms
  • Creating an example with image data
  • Looking for optimal solutions
  • Clustering big data
  • Performing Hierarchical Clustering
  • Using a hierarchical cluster solution
  • Visualizing aggregative clustering solutions
  • Discovering New Groups with DBScan
  • Chapter 16 Detecting Outliers in Data
  • Considering Outlier Detection