
Distributed Computing in Big Data Analytics : Concepts, Technologies and Applications.

By: Mazumder, Sourav.
Contributor(s): Singh Bhadoria, Robin | Deka, Ganesh Chandra.
Material type: Text
Series: eBooks on Demand; Scalable Computing and Communications Ser.
Publisher: Cham : Springer, 2017
Copyright date: ©2017
Description: 1 online resource (166 pages)
Content type: text
Media type: computer
Carrier type: online resource
ISBN: 9783319598345
Subject(s): Big data
Genre/Form: Electronic books
Additional physical formats: Print version: Distributed Computing in Big Data Analytics : Concepts, Technologies and Applications
DDC classification: 004
LOC classification: QA75.5-76.95
Contents:
Intro -- Editor's Notes -- Contents
On the Role of Distributed Computing in Big Data Analytics -- 1 Introduction -- 2 History and Key Characteristics of Big Data -- 3 Key Aspects of Big Data Analytics -- 4 Popular Technologies for Big Data Analytics Utilizing Concepts of Distributed Computing -- 4.1 Hadoop -- 4.2 Yarn -- 4.3 Hadoop Map Reduce -- 4.4 Spark -- 5 Conclusion -- References
Fundamental Concepts of Distributed Computing Used in Big Data Analytics -- 1 Introduction -- 2 Multithreading and Multiprocessing -- 2.1 Concept of Multiprocessing -- 2.2 Example of Multiprocessing -- 2.3 Concept of Multithreading -- 2.4 Example of Multithreading -- 2.5 Difference between Multiprocessing and Multithreading -- 3 Computing Architecture in Distributed Computing -- 3.1 SISD -- 3.2 Vector Processor -- 3.3 SIMD -- 3.4 MIMD -- 3.5 SM-MIMD -- 3.6 DM-MIMD -- 4 Scalability in Distributing Computing -- 4.1 Scalability Requirement and Category -- 4.2 Scaling Up -- 4.3 Scaling Out -- 4.4 Prospect of Scale Up and Scale Out -- 5 Queuing Network Model for Distributed Computing -- 5.1 Asynchronous Communication -- 5.2 Queue System -- 5.3 Queue Modeling -- 6 Application of CAP Theorem -- 6.1 Basic Concepts of Consistency, Availability, and Partition Tolerance -- 6.2 Combination of Consistency, Availability, and Partition Tolerance -- 7 Quality of Service (QoS) Requirements in Big Data Analytics -- 7.1 Performance -- 7.2 Interoperability -- 7.3 Fault-Tolerance -- 7.4 Security -- 7.5 Manageability -- 7.6 Load-Balance -- 7.7 High-Availability (HA) -- 7.8 SLA -- 8 Conclusion -- References
Distributed Computing Patterns Useful in Big Data Analytics -- 1 Introduction -- 2 Primitives for Concurrent Programming -- 2.1 Concurrency Expression -- 2.2 Synchronization -- 3 Communication Protocols and Message Exchange -- 3.1 Synchronous Communication -- 3.2 Asynchronous Communication -- 3.3 Pseudo-Synchronous Communication -- 3.4 Client/Server Paradigm -- 3.5 Communication Deployment in Big Data -- 4 Data Distribution in Big Data on Distributed Environments -- 5 Implementation Problems -- 5.1 Race Condition Problems -- 5.2 Message Exchange -- 6 Conclusion -- References
Distributed Computing Technologies in Big Data Analytics -- 1 Introduction -- 2 Distributed Database -- 2.1 NoSQL Database -- 3 Distributed Storage -- 3.1 Hadoop Distributed File System (HDFS) -- 4 Distributed Computation -- 4.1 Map-Reduce in Hadoop -- 4.2 Spark -- 5 Machine Learning Platforms -- 6 Search System -- 6.1 Search Software -- 7 Big Data Messaging Software -- 8 Cache -- 8.1 Distributed Caching Systems -- 9 Data Visualization -- 10 Conclusion -- References
Security Issues and Challenges in Big Data Analytics in Distributed Environment -- 1 Introduction -- 1.1 Security Issues in Big Data in Distributed Environment -- 2 Infrastructure Based Security -- 2.1 Secure Computations -- 2.2 Secure Non-relational Data Stores -- 3 Data Privacy -- 3.1 Privacy Preservation in Data Mining -- 3.2 Cryptography Control Mechanism -- 3.3 Granular Access Control -- 4 Data Integrity and Data Management -- 4.1 Granular Audits -- 4.2 Secure Transactions and Transaction Logs -- 4.3 Data Provenance -- 5 Reactive Security -- 5.1 Input Validation at Distributed Nodes -- 5.2 Real Time Security -- 6 Countermeasures -- 7 Conclusion -- References
Scientific Computing and Big Data Analytics: Application in Climate Science -- 1 Introduction -- 2 Computational Challenges in Solving Scientific Problems -- 3 Climate Change and Big Data Analytics -- 4 Use Case on Climate Analytics -- 4.1 The Scientific Challenge of the Climate System -- 4.2 Computational Challenge of the Climate Modeling -- 4.3 Post-processing Climate Model Output -- 4.4 BigData Climate Analytics Using Spark -- 5 Conclusions -- References
Distributed Computing in Cognitive Analytics -- 1 Introduction -- 2 Building Blocks of Cognitive Analytic System -- 2.1 The Data Repositories -- 2.2 The Data Ingestion Tools -- 2.3 The Analytical Frameworks -- 2.4 The Hardware Components -- 2.5 Key Non-functional Requirements to Consider -- 2.5.1 High Concurrency Throughput -- 2.5.2 Interfaces for Interaction with Systems -- 2.5.3 High Availability and Disaster Recovery -- 2.5.4 Linear Scalability -- 2.5.5 Ability to Prioritize Workload -- 2.6 Cognitive System - Implementation Patterns -- 3 Cognitive System - Use Cases -- 3.1 Cognitive Systems in Health Care -- 3.2 Cognitive Systems in Internet of Things Domain -- 3.3 Cognitive Analytics to Become a Customer Centric Organization -- 3.3.1 Next Best Action -- 3.3.2 Changing Engagement Patterns -- 3.3.3 360° View of Customer -- 3.3.4 Understand Thy Customer -- 4 Conclusion -- References
Distributed Computing in Social Media Analytics -- 1 Introduction -- 2 Open Source Tools for Social Media Analytics -- 3 Influencer Analytics -- 3.1 Understanding the Impact of Influencers -- 3.2 Wimbledon Influencer Case Study -- 4 Social Polling -- 4.1 Sentiment Analysis -- 4.2 Intent Detection -- 4.3 Topic Monitoring -- 4.4 User Segmentation -- 4.5 Some Social Polling Examples -- 4.6 Social Polling for Demand Planning -- 5 Conclusion -- References
Utilizing Big Data Analytics for Automatic Building of Language-agnostic Semantic Knowledge Bases -- 1 Introduction -- 2 Search Engines -- 2.1 Key Technologies -- 2.2 Inverted Index -- 2.3 Sharding of Data -- 2.4 Replication of Data -- 2.5 Denormalized Data Model -- 2.6 Distributed Aggregation and Scoring -- 3 Recommendation Systems -- 4 Semantic Discovery -- 4.1 Problem Description -- 4.2 Semantic Similarity -- 4.3 Probabilistic Semantic Similarity Scoring Using PGMHD -- 4.4 Distributed PGMHD -- 5 Word Sense Ambiguity Detection -- 5.1 Ambiguity Score -- 5.2 Resolving Word Sense Ambiguity -- 6 Semantic Knowledge Graph -- 6.1 Model Structure -- 6.2 Materialization of Nodes and Edges -- 6.3 Discovering Semantic Relationships -- 6.4 Scoring Semantic Relationships -- 6.5 Scaling Characteristics -- 7 Real World Applications -- 8 Conclusion -- References.

Description based on publisher supplied metadata and other sources.
