Normal view MARC view ISBD view

Disk-Based Algorithms for Big Data.

By: Healey, Christopher G.
Material type: TextTextSeries: eBooks on Demand.Publisher: Boca Raton : CRC Press, 2016Copyright date: ©2017Description: 1 online resource (205 pages).Content type: text Media type: computer Carrier type: online resourceISBN: 9781315302867.Subject(s): Big dataGenre/Form: Electronic books.Additional physical formats: Print version:: Disk-Based Algorithms for Big DataDDC classification: 005.7 LOC classification: QA76.9.B45.H43 2017Online resources: Click here to view this ebook.
Contents:
Cover -- Half Title -- Title Page -- Copyright Page -- Dedication -- Table of Contents -- List of Tables -- List of Figures -- Preface -- Chapter 1: Physical Disk Storage -- 1.1 PHYSICAL HARD DISK -- 1.2 CLUSTERS -- 1.2.1 Block Allocation -- 1.3 ACCESS COST -- 1.4 LOGICAL TO PHYSICAL -- 1.5 BUFFER MANAGEMENT -- Chapter 2: File Management -- 2.1 LOGICAL COMPONENTS -- 2.1.1 Positioning Components -- 2.2 IDENTIFYING RECORDS -- 2.2.1 Secondary Keys -- 2.3 SEQUENTIAL ACCESS -- 2.3.1 Improvements -- 2.4 DIRECT ACCESS -- 2.4.1 Binary Search -- 2.5 FILE MANAGEMENT -- 2.5.1 Record Deletion -- 2.5.2 Fixed-Length Deletion -- 2.5.3 Variable-Length Deletion -- 2.6 FILE INDEXING -- 2.6.1 Simple Indices -- 2.6.2 Index Management -- 2.6.3 Large Index Files -- 2.6.4 Secondary Key Index -- 2.6.5 Secondary Key Index Improvements -- Chapter 3: Sorting -- 3.1 HEAPSORT -- 3.2 MERGESORT -- 3.3 TIMSORT -- Chapter 4: Searching -- 4.1 LINEAR SEARCH -- 4.2 BINARY SEARCH -- 4.3 BINARY SEARCH TREE -- 4.4 k-d TREE -- 4.4.1 k-d Tree Index -- 4.4.2 Search -- 4.4.3 Performance -- 4.5 HASHING -- 4.5.1 Collisions -- 4.5.2 Hash Functions -- 4.5.3 Hash Value Distributions -- 4.5.4 Estimating Collisions -- 4.5.5 Managing Collisions -- 4.5.6 Progressive Overflow -- 4.5.7 Multirecord Buckets -- Chapter 5: Disk-Based Sorting -- 5.1 DISK-BASED MERGESORT -- 5.1.1 Basic Mergesort -- 5.1.2 Timing -- 5.1.3 Scalability -- 5.2 INCREASED MEMORY -- 5.3 MORE HARD DRIVES -- 5.4 MULTISTEP MERGE -- 5.5 INCREASED RUN LENGTHS -- 5.5.1 Replacement Selection -- 5.5.2 Average Run Size -- 5.5.3 Cost -- 5.5.4 Dual Hard Drives -- Chapter 6: Disk-Based Searching -- 6.1 IMPROVED BINARY SEARCH -- 6.1.1 Self-Correcting BSTs -- 6.1.2 Paged BSTs -- 6.2 B-TREE -- 6.2.1 Search -- 6.2.2 Insertion -- 6.2.3 Deletion -- 6.3 B* TREE -- 6.4 B+ TREE -- 6.4.1 Prefix Keys -- 6.5 EXTENDIBLE HASHING -- 6.5.1 Trie.
6.5.2 Radix Tree -- 6.6 HASH TRIES -- 6.6.1 Trie Insertion -- 6.6.2 Bucket Insertion -- 6.6.3 Full Trie -- 6.6.4 Trie Size -- 6.6.5 Trie Deletion -- 6.6.6 Trie Performance -- Chapter 7: Storage Technology -- 7.1 OPTICAL DRIVES -- 7.1.1 Compact Disc -- 7.1.2 Digital Versatile Disc -- 7.1.3 Blu-ray Disc -- 7.2 SOLID STATE DRIVES -- 7.2.1 Floating Gate Transistors -- 7.2.2 Read-Write-Erase -- 7.2.3 SSD Controller -- 7.2.4 Advantages -- 7.3 HOLOGRAPHIC STORAGE -- 7.3.1 Holograms -- 7.3.2 Data Holograms -- 7.3.3 Commercialization -- 7.4 MOLECULAR MEMORY -- 7.5 MRAM -- Chapter 8: Distributed Hash Tables -- 8.1 HISTORY -- 8.2 KEYSPACE -- 8.3 KEYSPACE PARTITIONING -- 8.4 OVERLAY NETWORK -- 8.5 CHORD -- 8.5.1 Keyspace -- 8.5.2 Keyspace Partitioning -- 8.5.3 Overlay Network -- 8.5.4 Addition -- 8.5.5 Failure -- Chapter 9: Large File Systems -- 9.1 RAID -- 9.1.1 Parity -- 9.2 ZFS -- 9.2.1 Fault Tolerance -- 9.2.2 Self-Healing -- 9.2.3 Snapshots -- 9.3 GFS -- 9.3.1 Architecture -- 9.3.2 Master Metadata -- 9.3.3 Mutations -- 9.3.4 Fault Tolerance -- 9.4 HADOOP -- 9.4.1 MapReduce -- 9.4.2 MapReduce Implementation -- 9.4.3 HDFS -- 9.4.4 Pig -- 9.4.5 Hive -- 9.5 CASSANDRA -- 9.5.1 Design -- 9.5.2 Improvements -- 9.5.3 Query Language -- 9.6 PRESTO -- Chapter 10: NoSQL Storage -- 10.1 GRAPH DATABASES -- 10.1.1 Neo4j -- 10.1.2 Caching -- 10.1.3 Query Languages -- 10.2 DOCUMENT DATABASES -- 10.2.1 SQL Versus NoSQL -- 10.2.2 MongoDB -- 10.2.3 Indexing -- 10.2.4 Query Languages -- Appendix A: Order Notation -- A.1 Θ-NOTATION -- A.2 Ο-NOTATION -- A.3 Ω-NOTATION -- A.4 INSERTION SORT -- A.5 SHELL SORT -- Appendix B: Assignment 1: Search -- B.1 KEY AND SEEK LISTS -- B.2 PROGRAM EXECUTION -- B.3 IN-MEMORY SEQUENTIAL SEARCH -- B.4 IN-MEMORY BINARY SEARCH -- B.5 ON-DISK SEQUENTIAL SEARCH -- B.6 ON-DISK BINARY SEARCH -- B.7 PROGRAMMING ENVIRONMENT.
B.7.1 Reading Binary Integers -- B.7.2 Measuring Time -- B.7.3 Writing Results -- B.8 SUPPLEMENTAL MATERIAL -- B.9 HAND-IN REQUIREMENTS -- Appendix C: Assignment 2: Indices -- C.1 STUDENT FILE -- C.2 PROGRAM EXECUTION -- C.3 IN-MEMORY PRIMARY KEY INDEX -- C.4 IN-MEMORY AVAILABILITY LIST -- C.4.1 First Fit -- C.4.2 Best Fit -- C.4.3 Worst Fit -- C.5 USER INTERFACE -- C.5.1 Add -- C.5.2 Find -- C.5.3 Del -- C.5.4 End -- C.6 PROGRAMMING ENVIRONMENT -- C.6.1 Writing Results -- C.7 SUPPLEMENTAL MATERIAL -- C.8 HAND-IN REQUIREMENTS -- Appendix D: Assignment 3: Mergesort -- D.1 INDEX FILE -- D.2 PROGRAM EXECUTION -- D.3 AVAILABLE MEMORY -- D.4 BASIC MERGESORT -- D.5 MULTISTEP MERGESORT -- D.6 REPLACEMENT SELECTION MERGESORT -- D.7 PROGRAMMING ENVIRONMENT -- D.7.1 Measuring Time -- D.7.2 Writing Results -- D.8 SUPPLEMENTAL MATERIAL -- D.9 HAND-IN REQUIREMENTS -- Appendix E: Assignment 4: B-Trees -- E.1 INDEX FILE -- E.2 PROGRAM EXECUTION -- E.3 B-TREE NODES -- E.3.1 Root Node Offset -- E.4 USER INTERFACE -- E.4.1 Add -- E.4.2 Find -- E.4.3 Print -- E.4.4 End -- E.5 PROGRAMMING ENVIRONMENT -- E.6 SUPPLEMENTAL MATERIAL -- E.7 HAND-IN REQUIREMENTS -- Index.
Tags from this library: No tags from this library for this title. Log in to add tags.
Item type Current location Call number URL Status Date due Barcode
Electronic Book UT Tyler Online
Online
QA76.9.B45.H43 2017 (Browse shelf) http://ebookcentral.proquest.com/lib/uttyler/detail.action?docID=4745264 Available EBC4745264

Cover -- Half Title -- Title Page -- Copyright Page -- Dedication -- Table of Contents -- List of Tables -- List of Figures -- Preface -- Chapter 1: Physical Disk Storage -- 1.1 PHYSICAL HARD DISK -- 1.2 CLUSTERS -- 1.2.1 Block Allocation -- 1.3 ACCESS COST -- 1.4 LOGICAL TO PHYSICAL -- 1.5 BUFFER MANAGEMENT -- Chapter 2: File Management -- 2.1 LOGICAL COMPONENTS -- 2.1.1 Positioning Components -- 2.2 IDENTIFYING RECORDS -- 2.2.1 Secondary Keys -- 2.3 SEQUENTIAL ACCESS -- 2.3.1 Improvements -- 2.4 DIRECT ACCESS -- 2.4.1 Binary Search -- 2.5 FILE MANAGEMENT -- 2.5.1 Record Deletion -- 2.5.2 Fixed-Length Deletion -- 2.5.3 Variable-Length Deletion -- 2.6 FILE INDEXING -- 2.6.1 Simple Indices -- 2.6.2 Index Management -- 2.6.3 Large Index Files -- 2.6.4 Secondary Key Index -- 2.6.5 Secondary Key Index Improvements -- Chapter 3: Sorting -- 3.1 HEAPSORT -- 3.2 MERGESORT -- 3.3 TIMSORT -- Chapter 4: Searching -- 4.1 LINEAR SEARCH -- 4.2 BINARY SEARCH -- 4.3 BINARY SEARCH TREE -- 4.4 k-d TREE -- 4.4.1 k-d Tree Index -- 4.4.2 Search -- 4.4.3 Performance -- 4.5 HASHING -- 4.5.1 Collisions -- 4.5.2 Hash Functions -- 4.5.3 Hash Value Distributions -- 4.5.4 Estimating Collisions -- 4.5.5 Managing Collisions -- 4.5.6 Progressive Overflow -- 4.5.7 Multirecord Buckets -- Chapter 5: Disk-Based Sorting -- 5.1 DISK-BASED MERGESORT -- 5.1.1 Basic Mergesort -- 5.1.2 Timing -- 5.1.3 Scalability -- 5.2 INCREASED MEMORY -- 5.3 MORE HARD DRIVES -- 5.4 MULTISTEP MERGE -- 5.5 INCREASED RUN LENGTHS -- 5.5.1 Replacement Selection -- 5.5.2 Average Run Size -- 5.5.3 Cost -- 5.5.4 Dual Hard Drives -- Chapter 6: Disk-Based Searching -- 6.1 IMPROVED BINARY SEARCH -- 6.1.1 Self-Correcting BSTs -- 6.1.2 Paged BSTs -- 6.2 B-TREE -- 6.2.1 Search -- 6.2.2 Insertion -- 6.2.3 Deletion -- 6.3 B* TREE -- 6.4 B+ TREE -- 6.4.1 Prefix Keys -- 6.5 EXTENDIBLE HASHING -- 6.5.1 Trie.

6.5.2 Radix Tree -- 6.6 HASH TRIES -- 6.6.1 Trie Insertion -- 6.6.2 Bucket Insertion -- 6.6.3 Full Trie -- 6.6.4 Trie Size -- 6.6.5 Trie Deletion -- 6.6.6 Trie Performance -- Chapter 7: Storage Technology -- 7.1 OPTICAL DRIVES -- 7.1.1 Compact Disc -- 7.1.2 Digital Versatile Disc -- 7.1.3 Blu-ray Disc -- 7.2 SOLID STATE DRIVES -- 7.2.1 Floating Gate Transistors -- 7.2.2 Read-Write-Erase -- 7.2.3 SSD Controller -- 7.2.4 Advantages -- 7.3 HOLOGRAPHIC STORAGE -- 7.3.1 Holograms -- 7.3.2 Data Holograms -- 7.3.3 Commercialization -- 7.4 MOLECULAR MEMORY -- 7.5 MRAM -- Chapter 8: Distributed Hash Tables -- 8.1 HISTORY -- 8.2 KEYSPACE -- 8.3 KEYSPACE PARTITIONING -- 8.4 OVERLAY NETWORK -- 8.5 CHORD -- 8.5.1 Keyspace -- 8.5.2 Keyspace Partitioning -- 8.5.3 Overlay Network -- 8.5.4 Addition -- 8.5.5 Failure -- Chapter 9: Large File Systems -- 9.1 RAID -- 9.1.1 Parity -- 9.2 ZFS -- 9.2.1 Fault Tolerance -- 9.2.2 Self-Healing -- 9.2.3 Snapshots -- 9.3 GFS -- 9.3.1 Architecture -- 9.3.2 Master Metadata -- 9.3.3 Mutations -- 9.3.4 Fault Tolerance -- 9.4 HADOOP -- 9.4.1 MapReduce -- 9.4.2 MapReduce Implementation -- 9.4.3 HDFS -- 9.4.4 Pig -- 9.4.5 Hive -- 9.5 CASSANDRA -- 9.5.1 Design -- 9.5.2 Improvements -- 9.5.3 Query Language -- 9.6 PRESTO -- Chapter 10: NoSQL Storage -- 10.1 GRAPH DATABASES -- 10.1.1 Neo4j -- 10.1.2 Caching -- 10.1.3 Query Languages -- 10.2 DOCUMENT DATABASES -- 10.2.1 SQL Versus NoSQL -- 10.2.2 MongoDB -- 10.2.3 Indexing -- 10.2.4 Query Languages -- Appendix A: Order Notation -- A.1 Θ-NOTATION -- A.2 Ο-NOTATION -- A.3 Ω-NOTATION -- A.4 INSERTION SORT -- A.5 SHELL SORT -- Appendix B: Assignment 1: Search -- B.1 KEY AND SEEK LISTS -- B.2 PROGRAM EXECUTION -- B.3 IN-MEMORY SEQUENTIAL SEARCH -- B.4 IN-MEMORY BINARY SEARCH -- B.5 ON-DISK SEQUENTIAL SEARCH -- B.6 ON-DISK BINARY SEARCH -- B.7 PROGRAMMING ENVIRONMENT.

B.7.1 Reading Binary Integers -- B.7.2 Measuring Time -- B.7.3 Writing Results -- B.8 SUPPLEMENTAL MATERIAL -- B.9 HAND-IN REQUIREMENTS -- Appendix C: Assignment 2: Indices -- C.1 STUDENT FILE -- C.2 PROGRAM EXECUTION -- C.3 IN-MEMORY PRIMARY KEY INDEX -- C.4 IN-MEMORY AVAILABILITY LIST -- C.4.1 First Fit -- C.4.2 Best Fit -- C.4.3 Worst Fit -- C.5 USER INTERFACE -- C.5.1 Add -- C.5.2 Find -- C.5.3 Del -- C.5.4 End -- C.6 PROGRAMMING ENVIRONMENT -- C.6.1 Writing Results -- C.7 SUPPLEMENTAL MATERIAL -- C.8 HAND-IN REQUIREMENTS -- Appendix D: Assignment 3: Mergesort -- D.1 INDEX FILE -- D.2 PROGRAM EXECUTION -- D.3 AVAILABLE MEMORY -- D.4 BASIC MERGESORT -- D.5 MULTISTEP MERGESORT -- D.6 REPLACEMENT SELECTION MERGESORT -- D.7 PROGRAMMING ENVIRONMENT -- D.7.1 Measuring Time -- D.7.2 Writing Results -- D.8 SUPPLEMENTAL MATERIAL -- D.9 HAND-IN REQUIREMENTS -- Appendix E: Assignment 4: B-Trees -- E.1 INDEX FILE -- E.2 PROGRAM EXECUTION -- E.3 B-TREE NODES -- E.3.1 Root Node Offset -- E.4 USER INTERFACE -- E.4.1 Add -- E.4.2 Find -- E.4.3 Print -- E.4.4 End -- E.5 PROGRAMMING ENVIRONMENT -- E.6 SUPPLEMENTAL MATERIAL -- E.7 HAND-IN REQUIREMENTS -- Index.

Description based on publisher supplied metadata and other sources.

Author notes provided by Syndetics

<p>Christopher Healey is the Goodnight Distinguished Professor of Analytics at North Carolina State University.</p>

There are no comments for this item.

Log in to your account to post a comment.