Kasim, Adetayo.

Applied Biclustering Methods for Big and High-Dimensional Data Using R. - 1 online resource (428 pages) - eBooks on Demand. - Chapman & Hall/CRC Biostatistics Series.

Cover -- Title Page -- Copyright Page -- Contents -- Preface -- Contributors -- R Packages and Products -- 1. Introduction -- 1.1. From Clustering to Biclustering -- 1.2. We R a Community -- 1.3. Biclustering for Cloud Computing -- 1.4. Book Structure -- 1.5. Datasets -- 1.5.1. Dutch Breast Cancer Data -- 1.5.2. Diffuse Large B-Cell Lymphoma (DLBCL) -- 1.5.3. Multiple Tissue Types Data -- 1.5.4. CMap Dataset -- 1.5.5. NCI60 Panel -- 1.5.6. 1000 Genomes Project -- 1.5.7. Tourism Survey Data -- 1.5.8. Toxicogenomics Project -- 1.5.9. Yeast Data -- 1.5.10. mglu2 Project -- 1.5.11. TCGA Data -- 1.5.12. NBA Data -- 1.5.13. Colon Cancer Data -- 2. From Cluster Analysis to Biclustering -- 2.1. Cluster Analysis -- 2.1.1. An Introduction -- 2.1.2. Dissimilarity Measures and Similarity Measures -- 2.1.2.1. Example 1: Clustering Compounds in the CMAP Data Based on Chemical Similarity -- 2.1.2.2. Example 2 -- 2.1.3. Hierarchical Clustering -- 2.1.3.1. Example 1 -- 2.1.3.2. Example 2 -- 2.1.4. ABC Dissimilarity for High-Dimensional Data -- 2.2. Biclustering: A Graphical Tour -- 2.2.1. Global versus Local Patterns -- 2.2.2. Bicluster's Type -- 2.2.3. Bicluster's Configuration -- Part I: Biclustering Methods -- 3. δ-Biclustering and FLOC Algorithm -- 3.1. Introduction -- 3.2. δ-Biclustering -- 3.2.1. Single-Node Deletion Algorithm -- 3.2.2. Multiple-Node Deletion Algorithm -- 3.2.3. Node Addition Algorithm -- 3.2.4. Application to Yeast Data -- 3.3. FLOC -- 3.3.1. FLOC Phase I -- 3.3.2. FLOC Phase II -- 3.3.3. FLOC Application to Yeast Data -- 3.4. Discussion -- 4. The xMotif algorithm -- 4.1. Introduction -- 4.2. xMotif Algorithm -- 4.2.1. Setting -- 4.2.2. Search Algorithm -- 4.3. Biclustering with xMotif -- 4.3.1. Test Data -- 4.3.2. Discretisation and Parameter Settings -- 4.3.2.1. Discretisation -- 4.3.2.2. Parameters Setting -- 4.3.3. Using the biclust Package -- 4.4. Discussion -- 5. Bimax Algorithm -- 5.1. Introduction -- 5.2. Bimax Algorithm -- 5.2.1. Setting -- 5.2.2. Search Algorithm -- 5.3. Biclustering with Bimax -- 5.3.1. Test Data -- 5.3.2. Biclustering Using the Bimax Method -- 5.3.3. Influence of the Parameters Setting -- 5.4. Discussion -- 6. The Plaid Model -- 6.1. Plaid Model -- 6.1.1. Setting -- 6.1.2. Overlapping Biclusters -- 6.1.3. Estimation -- 6.1.4. Search Algorithm -- 6.2. Implementation in R -- 6.2.1. Constant Biclusters -- 6.2.2. Misclassification of the Mean Structure -- 6.3. Plaid Model in BiclustGUI -- 6.4. Mean Structure of a Bicluster -- 6.5. Discussion -- 7. Spectral Biclustering -- 7.1. Introduction -- 7.2. Normalisation -- 7.2.1. Independent Rescaling of Rows and Columns (IRRC) -- 7.2.2. Bistochastisation -- 7.2.3. Log Interactions -- 7.3. Spectral Biclustering -- 7.4. Spectral Biclustering Using the biclust Package -- 7.4.1. Application to DLBCL Dataset -- 7.4.2. Analysis of a Test Data -- 7.5. Discussion -- 8. FABIA -- 8.1. FABIA Model -- 8.1.1. The Idea -- 8.1.2. Model Formulation -- 8.1.3. Parameter Estimation -- 8.1.4. Bicluster Extraction -- 8.2. Implementation in R -- 8.3. Case Studies -- 8.3.1. Breast Cancer Data -- 8.3.2. Multiple Tissues Data -- 8.3.3. Diffuse Large B-Cell Lymphoma (DLBCL) Data -- 8.4. Discussion -- 9. Iterative Signature Algorithm -- 9.1. Introduction: Bicluster Definition -- 9.2. Iterative Signature Algorithm -- 9.3. Biclustering Using ISA -- 9.3.1. isa2 Package -- 9.3.2. Application to Breast Data -- 9.3.3. Application to the DLBCL Data -- 9.4. Discussion -- 10. Ensemble Methods and Robust Solutions -- 10.1. Introduction -- 10.2. Motivating Example (I) -- 10.3. Ensemble Method -- 10.3.1. Initialization Step -- 10.3.2. Combination Step -- 10.3.2.1. Similarity Indices -- 10.3.2.2. Correlation Approach -- 10.3.2.3. Hierarchical Clustering -- 10.3.2.4. Quality Clustering -- 10.3.3. Merging Step -- 10.4. Application of Ensemble Biclustering for the Breast Cancer Data Using superbiclust Package -- 10.4.1. Robust Analysis for the Plaid Model -- 10.4.2. Robust Analysis of ISA -- 10.4.3. FABIA: Overlap between Biclusters -- 10.4.4. Biclustering Analysis Combining Several Methods -- 10.5. Application of Ensemble Biclustering to the TCGA Data Using biclust Implementation -- 10.5.1. Motivating Example (II) -- 10.5.2. Correlation Approach -- 10.5.3. Jaccard Index Approach -- 10.5.4. Comparison between Jaccard Index and the Correlation Approach -- 10.5.5. Implementation in R -- 10.6. Discussion -- Part II: Case Studies and Applications -- 11. Gene Expression Experiments in Drug Discovery -- 11.1. Introduction -- 11.2. Drug Discovery -- 11.2.1. Historic Context -- 11.2.2. Current Context -- 11.2.3. Collaborative Research -- 11.3. Data Properties -- 11.3.1. High-Dimensional Data -- 11.3.2. Complex and Heterogeneous Data -- 11.3.2.1. Patient Segmentation -- 11.3.2.2. Targeted Therapy -- 11.3.2.3. Compound Differentiation -- 11.4. Data Analysis: Exploration versus Confirmation -- 11.5. QSTAR Framework -- 11.5.1. Introduction -- 11.5.2. Typical Data Structure -- 11.5.3. Main Findings -- 11.6. Inferences and Interpretations -- 11.7. Conclusion -- 12. Biclustering Methods in Chemoinformatics and Molecular Modelling in Drug Discovery Experiments: Connecting Gene Expression and Target Prediction Data -- 12.1. Introduction -- 12.1.1. Connecting Target Prediction and Gene Expression Data to Explain the Mechanism of Action -- 12.2. Data -- 12.2.1. CMap Gene Expression Data -- 12.2.2. Target Prediction Data -- 12.3. Integrative Data Analysis Steps -- 12.3.1. Clustering of Compounds -- 12.3.2. Feature Selection -- 12.3.3. Pathway Analysis -- 12.4. Biclustering with FABIA -- 12.5. Data Analysis Using the R Package IntClust -- 12.5.1. Step 1: Calculation of Similarity Scores -- 12.5.2. Step 2: Target Prediction-Based Clustering -- 12.5.3. Step 3: Feature Selection -- 12.5.4. Biclustering Using fabia -- 12.6. Discussion -- 13. Integrative Analysis of miRNA and mRNA Data -- 13.1. Data Preprocessing -- 13.2. Joint Biclustering of miRNA and mRNA Data -- 13.3. Application to NCI-60 Panel Data -- 13.3.1. FABIA miRNA-mRNA Biclustering Solution -- 13.3.2. Further Description of miRNA-mRNA Biclusters -- 13.4. Discussion -- 14. Enrichment of Gene Expression Modules Using Multiple Factor Analysis and Biclustering -- 14.1. Introduction -- 14.2. Data Setting -- 14.3. Gene Module -- 14.3.1. Examples of Gene Module -- 14.3.2. Gene Module Summarization -- 14.3.3. Enrichment of Gene Module -- 14.4. Multiple Factor Analysis -- 14.4.1. Normalization Step -- 14.4.2. Simultaneous Analysis Step -- 14.5. Biclustering and Multiple Factor Analysis to Find Gene Modules -- 14.6. Implementation in R -- 14.6.1. MFA -- 14.6.2. Biclustering Using FABIA -- 14.7. Discussion -- 15. Ranking of Biclusters in Drug Discovery Experiments -- 15.1. Introduction -- 15.2. Information Content of Biclusters -- 15.2.1. Theoretical Background -- 15.2.2. Application to Drug Discovery Data Using the biclustRank R Package -- 15.3. Ranking of Biclusters Based on Their Chemical Structures -- 15.3.1. Incorporating Information about Chemical Structures Similarity -- 15.3.2. Similarity Scores Plot -- 15.3.2.1. Heatmap of Similarity Scores -- 15.3.3. Profiles Plot of Genes and Heatmap of Chemical Structures for a Given Bicluster -- 15.3.4. Loadings and Scores -- 15.4. Discussion -- 16. HapFABIA: Biclustering for Detecting Identity by Descent -- 16.1. Introduction -- 16.2. Identity by Descent -- 16.3. IBD Detection by Biclustering -- 16.4. Implementation in R -- 16.4.1. Adaptation of FABIA for IBD Detection -- 16.4.2. HapFabia Package -- 16.5. Case Study I: A Small DNA Region in 61 ASW Africans -- 16.6. Case Study II: The 1000 Genomes Project -- 16.6.1. IBD Sharing between Human Populations and Phasing Errors -- 16.6.2. Neandertal and Denisova Matching IBD Segments -- 16.7. Discussion -- 17. Overcoming Data Dimensionality Problems in Market Segmentation -- 17.1. Introduction -- 17.1.1. Biclustering on Marketing Data -- 17.2. When to Use Biclustering -- 17.2.1. Automatic Variable Selection -- 17.2.2. Reproducibility -- 17.2.3. Identification of Market Niches -- 17.3. Binary Data -- 17.3.1. Analysis of the Tourism Survey Data -- 17.3.2. Results -- 17.3.3. Comparisons with Popular Segmentation Algorithms -- 17.3.3.1. Bootstrap Samples -- 17.3.3.2. Artificial Data -- 17.4. Analysis of the Tourism Survey Data Using the BCrepBimax Method -- 17.5. Discussion -- 18. Pattern Discovery in High-Dimensional Problems Using Biclustering Methods for Binary Data -- 18.1. Introduction -- 18.2. Identification of in vitro and in vivo Disconnects Using Transcriptomics Data -- 18.2.1. Background -- 18.2.2. Dataset -- 18.3. Disconnects Analysis Using Fractional Polynomials -- 18.3.1. Significant Effect in vitro -- 18.3.2. Disconnect between in vitro and in vivo -- 18.4. Biclustering of Genes and Compounds -- 18.5. Bimax Biclustering for the TGP Data -- 18.6. iBBiG Biclustering of the TGP Data -- 18.7. Discussion -- 19. Identification of Local Patterns in the NBA Performance Indicators -- 19.1. Introduction -- 19.2. NBA Sortable Team Stats -- 19.3. Analysis of the Traditional Performance Indicators: Construction of a Performance Module -- 19.3.1. Traditional Performance Indicators -- 19.3.2. Hierarchical Clustering and Principal Component Analysis -- 19.4. Analysis of Performance Indicators Using Multiple Factor Analysis -- 19.4.1. Data Structure and Notations.

9781482208245


Big data.


Electronic books.

QA76.9.B45.A67 2017

5.7