Machine Learning Infrastructure and Best Practices for Software Engineers: Take Your Machine Learning Software from a Prototype to a Fully Fledged Software System

Efficiently transform your initial designs into big systems by learning the foundations of infrastructure, algorithms, and ethical considerations for modern software products. Key Features: Learn how to scale up your machine learning software to a professional level. Secure the quality of your machine...


Bibliographic Details
Other Authors: Staron, Miroslaw (author)
Format: E-book
Language: English
Published: Birmingham : Packt Publishing, Limited 2024.
Subjects:
View at Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009799144406719
Table of Contents:
  • Cover
  • Title page
  • Copyright and credits
  • Dedication
  • Contributors
  • Table of contents
  • Preface
  • Part 1: Machine Learning Landscape in Software Engineering
  • Chapter 1: Machine Learning Compared to Traditional Software
  • Machine learning is not traditional software
  • Supervised, unsupervised, and reinforcement learning: it is just the beginning
  • An example of traditional and machine learning software
  • Probability and software: how well they go together
  • Testing and evaluation: the same but different
  • Summary
  • References
  • Chapter 2: Elements of a Machine Learning System
  • Elements of a production machine learning system
  • Data and algorithms
  • Data collection
  • Feature extraction
  • Data validation
  • Configuration and monitoring
  • Configuration
  • Monitoring
  • Infrastructure and resource management
  • Data serving infrastructure
  • Computational infrastructure
  • How this all comes together: machine learning pipelines
  • References
  • Chapter 3: Data in Software Systems
  • Text, Images, Code, and Their Annotations
  • Raw data and features: what are the differences?
  • Images
  • Text
  • Visualization of output from more advanced text processing
  • Structured text: source code of programs
  • Every data has its purpose: annotations and tasks
  • Annotating text for intent recognition
  • Where different types of data can be used together: an outlook on multi-modal data models
  • References
  • Chapter 4: Data Acquisition, Data Quality, and Noise
  • Sources of data and what we can do with them
  • Extracting data from software engineering tools
  • Gerrit and Jira
  • Extracting data from product databases
  • GitHub and Git
  • Data quality
  • Noise
  • Summary
  • References
  • Chapter 5: Quantifying and Improving Data Properties
  • Feature engineering: the basics
  • Clean data
  • Noise in data management
  • Attribute noise
  • Splitting data
  • How ML models handle noise
  • References
  • Part 2: Data Acquisition and Management
  • Chapter 6: Processing Data in Machine Learning Systems
  • Numerical data
  • Summarizing the data
  • Diving deeper into correlations
  • Summarizing individual measures
  • Reducing the number of measures
  • PCA
  • Other types of data: images
  • Text data
  • Toward feature engineering
  • References
  • Chapter 7: Feature Engineering for Numerical and Image Data
  • Feature engineering
  • Feature engineering for numerical data
  • PCA
  • t-SNE
  • ICA
  • Locally linear embedding
  • Linear discriminant analysis
  • Autoencoders
  • Feature engineering for image data
  • Summary
  • References
  • Chapter 8: Feature Engineering for Natural Language Data
  • Natural language data in software engineering and the rise of GitHub Copilot
  • What a tokenizer is and what it does
  • Bag-of-words and simple tokenizers
  • WordPiece tokenizer
  • BPE
  • The SentencePiece tokenizer
  • Word embeddings
  • FastText