Parallel Processing and Applied Mathematics : 10th International Conference, PPAM 2013, Warsaw, Poland, September 8-11, 2013, Revised Selected Papers, Part I.

By: Wyrzykowski, Roman
Contributor(s): Dongarra, Jack | Karczewski, Konrad | Wasniewski, Jerzy
Material type: Text
Series: eBooks on Demand
Publisher: Berlin/Heidelberg : Springer Berlin Heidelberg, 2014
Copyright date: ©2014
Description: 1 online resource (817 pages)
Content type: text
Media type: computer
Carrier type: online resource
ISBN: 9783642552243
Subject(s): Parallel processing (Electronic computers)-Congresses
Genre/Form: Electronic books.
Additional physical formats: Print version: Parallel Processing and Applied Mathematics : 10th International Conference, PPAM 2013, Warsaw, Poland, September 8-11, 2013, Revised Selected Papers, Part I
DDC classification: 004.35
LOC classification: QA76.58 .P373 2014
Contents:
Intro -- Preface -- Organization -- Contents - Part I -- Algebra and Geometry Combined Explains How the Mind Does Math -- 1 Introduction -- 2 Dimension Theory and Its Relation to Standard CM and RM Arrays of Fortran and C -- 2.1 Submatrices Aij of A in Fortran and C and Their Generalization -- 2.2 Tutorial on the Essence of Dimension -- 3 Converting Standard Format to RB Format In-Place via Vector Transposition -- 3.1 Dense Linear Algebra Algorithms for MC Use RB or SB Format -- 3.2 The VIPX Vector Transpose Algorithm -- 4 A Comparison of GKK and TT Transpose Algorithms -- 4.1 Storage Amounts for GKK and TT Algorithms -- 4.2 Vector Transpose and n m A Matrices Where n=kmq -- 4.3 A Clarifying Example with n=673 and m=384 -- 5 A GCD Transpose Algorithm -- 6 The Power of Negative Integers -- 6.1 Finding Primitive Roots Using Smaller Integers -- References -- Numerical Algorithms and Parallel Scientific Computing -- Exploiting Data Sparsity in Parallel Matrix Powers Computations -- 1 Introduction -- 1.1 The Blocking Covers Technique -- 1.2 Parallel Matrix Powers Algorithms -- 2 Derivation of Parallel Blocking Covers -- 3 Hierarchical Semiseparable Matrix Example -- 3.1 PA0 for HSS Matrices -- 3.2 PA1 for HSS Matrices -- 3.3 Complexity Analysis -- 4 Performance Model -- 5 Future Work and Conclusions -- References -- Performance of Dense Eigensolvers on BlueGene/Q -- 1 Introduction -- 2 The BlueGene/Q Architecture -- 3 Parallel Libraries Investigated -- 3.1 Routines Tested in the Libraries -- 4 Measurements Done -- 5 Scaling Results up to One Rack of JUQUEEN -- 6 Results for Different Parts of the Spectrum -- 7 Conclusions -- References -- Experiences with a Lanczos Eigensolver in High-Precision Arithmetic -- 1 Introduction -- 2 Background and Methodology -- 3 Convergence on Real-World Matrices -- 4 The Effects of Clusters of Eigenvalues.
5 The Effects of High-Precision Arithmetic on the Lanczos Process -- 6 Conclusions -- References -- Adaptive Load Balancing for Massively Parallel Multi-Level Monte Carlo Solvers -- 1 Introduction -- 2 Scalable Parallel Implementation of MLMC-FVM -- 3 Problem Setting and Estimates for Computational Work -- 3.1 Estimates for the Computational Work of the FVM Solver -- 3.2 Acoustic Wave Equation in Random Medium -- 3.3 Limits of the Static Load Balancing -- 4 Adaptive Load Balancing -- 4.1 Computation and Distribution of Loads -- 4.2 Implementation Remarks -- 5 Efficiency and Linear Scaling in Numerical Simulations -- 6 Conclusion -- References -- Parallel One-Sided Jacobi SVD Algorithm with Variable Blocking Factor -- 1 Introduction -- 2 Parallel Computation in a Parallel Iteration Step -- 3 Computational and Communication Complexity -- 4 Numerical Experiments -- 5 Conclusions -- References -- An Identity Parareal Method for Temporal Parallel Computations -- 1 Introduction -- 2 The Identity Parareal Method -- 2.1 Convergence Analysis -- 2.2 Applications in Scientific Computing -- 3 Bucket-Brigade Implementation of iParareal -- 3.1 Bucket-Brigade Communication Interface -- 3.2 Speed-Up Ratio -- 4 Performance Measurement -- 5 Summary -- References -- Improving Perfect Parallelism -- 1 Introduction -- 2 Test Systems -- 3 Examples -- 3.1 Memory Copy -- 3.2 The Power Method -- 3.3 The Spike Algorithm -- 3.4 Radix Sort -- 4 Guidelines -- 5 Conclusion -- References -- Methods for High-Throughput Computation of Elementary Functions -- 1 Introduction -- 2 Background -- 3 Design Principles -- 4 Building Blocks for Elementary Functions -- 4.1 Polynomial Approximation -- 4.2 Newton-Raphson Iterations -- 5 Implementation of High-Throughput LibM -- 5.1 Log Function -- 5.2 Exp Function -- 5.3 Trigonometric Functions -- 5.4 Architecture-Specific Optimizations.
6 Performance and Accuracy Evaluation -- 7 Conclusion -- References -- Engineering Nonlinear Pseudorandom Number Generators -- 1 Introduction -- 2 Small Nonlinear Generators -- 2.1 Elliptic Curves -- 2.2 The M127 Generator -- 2.3 The M31x4 Generator -- 2.4 Tweaking Tyche with a Counter -- 2.5 Tyche as a Counter-Dependent Generator -- 3 Results and Discussion -- References -- Extending the Generalized Fermat Prime Number Search Beyond One Million Digits Using GPUs -- 1 Background -- 2 Software for GFN Searching -- 2.1 PRP Testing -- 2.2 Sieving -- 3 Distribution of Large GFN Primes -- 4 GFN Mega-Primes -- 5 Continuing the Search -- References -- Iterative Solution of Singular Systems with Applications -- 1 Introduction -- 2 Iterative Solution of Singular Symmetric Semidefinite Systems -- 3 Solution of Neumann Problems -- 4 Application in Upscaling -- 4.1 Benchmarks -- 4.2 Basic Performance Comparison -- 4.3 Cluster Computations -- 4.4 The Largest Benchmark -- 5 Conclusions -- References -- Statistical Estimates for the Conditioning of Linear Least Squares Problems -- 1 Introduction -- 2 Condition Estimation for Linear Least Squares -- 2.1 Conditioning of the Least Squares Solution -- 2.2 Componentwise Condition Estimates -- 3 Numerical Experiments -- 3.1 Accuracy of Statistical Estimates -- 4 Conclusion -- References -- Numerical Treatment of a Cross-Diffusion Model of Biofilm Exposure to Antimicrobials -- 1 Introduction -- 2 Mathematical Model -- 3 Numerical Method -- 4 A Typical Simulation and Grid Refinement -- 5 Conclusion -- References -- Performance Analysis for Stencil-Based 3D MPDATA Algorithm on GPU Architecture -- 1 Introduction -- 2 Related Works -- 3 Kepler NVIDIA Architecture -- 4 3D MPDATA Overview -- 5 Analysis of 3D MPDATA with NVIDIA Visual Profiler -- 6 Performance Analysis Based on GPU Global Memory Transactions.
7 Performance Results -- 8 Conclusions and Future Work -- References -- Elliptic Solver Performance Evaluation on Modern Hardware Architectures -- 1 Introduction -- 2 Model Parallelisation -- 2.1 Current State -- 2.2 Hybrid MPI+OpenMP Improvements -- 2.3 Parallelisation of Stencil Computations -- 2.4 Parallelization of TDMA -- 2.5 Tuning on Xeon Phi -- 2.6 Mapping Application Topology to Cluster Topology -- 3 Results -- 4 Summary -- References -- Parallel Geometric Multigrid Preconditioner for 3D FEM in NuscaS Software Package -- 1 Introduction -- 2 NuscaS Package -- 3 Multigrid Method -- 4 Parallel Implementation of Multigrid in NuscaS -- 4.1 Generation of Mesh Hierarchy in Parallel -- 4.2 Data Structures -- 4.3 Mesh Refinement -- 4.4 Construction of Subsequent Levels of Multigrid -- 4.5 Details of Implementation -- 5 Performance of Conjugate Gradient Method with Parallel Multigrid Preconditioner -- 5.1 Parallel Multigrid Preconditioner -- 5.2 Performance Results -- 6 Conclusions and Further Work -- References -- Scalable Parallel Generation of Very Large Sparse Benchmark Matrices -- 1 Introduction -- 2 Methodology -- 2.1 Algorithm -- 2.2 Enlargement Functions -- 2.3 Mapping Functions -- 3 Experiments -- 4 Discussion -- 5 Conclusions -- References -- Parallel Non-Numerical Algorithms -- Co-operation Schemes for the Parallel Memetic Algorithm -- 1 Introduction -- 2 Problem Formulation -- 3 Parallel Memetic Algorithm and Co-operation Schemes -- 3.1 Parallel Memetic Algorithm Outline -- 3.2 Co-operation Schemes -- 4 Experimental Results -- 4.1 Settings -- 4.2 Analysis and Discussion -- 5 Conclusions and Future Work -- References -- Scalable and Efficient Parallel Selection -- 1 Introduction -- 2 Sequential Selection -- 3 Parallel Selection -- 3.1 Median Approximation -- 3.2 Partitioning -- 3.3 Proceed in the Target Partition.
4 Quality of Our Median Approximation -- 4.1 Simulation -- 4.2 Worst Case -- 5 Performance Evaluation -- 5.1 Different Inputs -- 5.2 Scalability -- 6 Conclusion -- References -- Optimal Diffusion for Load Balancing in Heterogeneous Networks -- 1 Introduction -- 2 The Heterogeneous Extrapolated Diffusion Method -- 3 The 2D-Torus -- 4 Optimum ij -- 5 Optimum Edge Weights and Speeds -- 6 Numerical Experiments -- References -- Parallel Bounded Model Checking of Security Protocols -- 1 Introduction -- 2 The Needham-Schroeder Public Key Protocol -- 3 Idea of the Chains of States -- 3.1 Intruder and Attacks -- 3.2 Correct Chains of States -- 3.3 Verification Algorithm -- 4 Experimental Results and Summary -- References -- Tools and Environments for Parallel/Distributed/Cloud Computing -- Development of Domain-Specific Solutions Within the Polish Infrastructure for Advanced Scientific Research -- 1 Introduction -- 2 Polish Grid Infrastructure -- 3 Domain-Specific Solutions -- 4 Use Cases -- 4.1 Bioinformatics -- Processing Genetic Data -- 4.2 Metallurgy -- Grid-Based Numerical Modeling Dedicated to Simulation of Metallurgical Production Processes -- 4.3 Acoustics -- New Services for Urban Planning, Research and Education -- 4.4 Ecology -- Phenology Observations Automated by IT Platforms -- 5 Summary -- References -- Cost Optimization of Execution of Multi-level Deadline-Constrained Scientific Workflows on Clouds -- 1 Introduction -- 2 Related Work -- 3 Application and Infrastructure Model -- 4 Problem Formulation Using AMPL -- 5 Evaluation -- 6 Conclusions and Future Work -- References -- Parallel Computations in the Volunteer-Based Comcute System -- 1 Introduction -- 2 Related Work -- 3 Proposed Solution -- 3.1 Architecture -- 3.2 Distributed Volunteer Task Execution -- 3.3 Performance Factors -- 3.4 A Versatile Client Template -- 4 Experiments.
4.1 Testbed Application and Configurations.
Item type: Electronic Book
Current location: UT Tyler Online
Call number: QA76.58 .P373 2014
URL: https://ebookcentral.proquest.com/lib/uttyler/detail.action?docID=3096931
Status: Available
Barcode: EBC3096931

Description based on publisher supplied metadata and other sources.
