Data lakes
Other authors: | , , |
---|---|
Format: | E-book |
Language: | English |
Published: | London : Hoboken : ISTE, Ltd. ; Wiley, 2020. |
Series: | Wiley ebooks. Computer engineering series, databases and big data set ; 2. |
Online access: | Connect to the electronic version |
View at Universidad de Navarra: | https://innopac.unav.es/record=b42152094*spi |
Table of Contents:
- Cover
- Half-Title Page
- Dedication
- Title Page
- Copyright Page
- Contents
- Preface
- 1. Introduction to Data Lakes: Definitions and Discussions
- 1.1. Introduction to data lakes
- 1.2. Literature review and discussion
- 1.3. The data lake challenges
- 1.4. Data lakes versus decision-making systems
- 1.5. Urbanization for data lakes
- 1.6. Data lake functionalities
- 1.7. Summary and concluding remarks
- 2. Architecture of Data Lakes
- 2.1. Introduction
- 2.2. State of the art and practice
- 2.2.1. Definition
- 2.2.2. Architecture
- 2.2.3. Metadata
- 2.2.4. Data quality
- 2.2.5. Schema-on-read
- 2.3. System architecture
- 2.3.1. Ingestion layer
- 2.3.2. Storage layer
- 2.3.3. Transformation layer
- 2.3.4. Interaction layer
- 2.4. Use case: the Constance system
- 2.4.1. System overview
- 2.4.2. Ingestion layer
- 2.4.3. Maintenance layer
- 2.4.4. Query layer
- 2.4.5. Data quality control
- 2.4.6. Extensibility and flexibility
- 2.5. Concluding remarks
- 3. Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures
- 3.1. Our expectations
- 3.2. Modeling data lake functionalities
- 3.3. Building the knowledge base of industrial data lakes
- 3.4. Our formalization approach
- 3.5. Applying our approach
- 3.6. Analysis of our first results
- 3.7. Concluding remarks
- 4. Metadata in Data Lake Ecosystems
- 4.1. Definitions and concepts
- 4.2. Classification of metadata by NISO
- 4.2.1. Metadata schema
- 4.2.2. Knowledge base and catalog
- 4.3. Other categories of metadata
- 4.3.1. Business metadata
- 4.3.2. Navigational integration
- 4.3.3. Operational metadata
- 4.4. Sources of metadata
- 4.5. Metadata classification
- 4.6. Why metadata are needed
- 4.6.1. Selection of information (re)sources
- 4.6.2. Organization of information resources
- 4.6.3. Interoperability and integration
- 4.6.4. Unique digital identification
- 4.6.5. Data archiving and preservation
- 4.7. Business value of metadata
- 4.8. Metadata architecture
- 4.8.1. Architecture scenario 1: point-to-point metadata architecture
- 4.8.2. Architecture scenario 2: hub and spoke metadata architecture
- 4.8.3. Architecture scenario 3: tool of record metadata architecture
- 4.8.4. Architecture scenario 4: hybrid metadata architecture
- 4.8.5. Architecture scenario 5: federated metadata architecture
- 4.9. Metadata management
- 4.10. Metadata and data lakes
- 4.10.1. Application and workload layer
- 4.10.2. Data layer
- 4.10.3. System layer
- 4.10.4. Metadata types
- 4.11. Metadata management in data lakes
- 4.11.1. Metadata directory
- 4.11.2. Metadata storage
- 4.11.3. Metadata discovery
- 4.11.4. Metadata lineage
- 4.11.5. Metadata querying
- 4.11.6. Data source selection
- 4.12. Metadata and master data management
- 4.13. Conclusion
- 5. A Use Case of Data Lake Metadata Management
- 5.1. Context
- 5.1.1. Data lake definition