Practical Text Analytics : (Record no. 1052564)

001 - CONTROL NUMBER
control field EBC5560078
006 - FIXED-LENGTH DATA ELEMENTS--ADDITIONAL MATERIAL CHARACTERISTICS
additional material characteristics m o d |
007 - PHYSICAL DESCRIPTION FIXED FIELD--GENERAL INFORMATION
fixed length control field cr cnu||||||||
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 200119s2018 xx o ||||0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number 9783319956633
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
Cancelled/invalid ISBN 9783319956626
035 ## - SYSTEM CONTROL NUMBER
System control number (MiAaPQ)EBC5560078
035 ## - SYSTEM CONTROL NUMBER
System control number (Au-PeEL)EBL5560078
035 ## - SYSTEM CONTROL NUMBER
System control number (OCoLC)1059371456
040 ## - CATALOGING SOURCE
Original cataloging agency MiAaPQ
Language of cataloging eng
Description conventions rda
-- pn
Transcribing agency MiAaPQ
Modifying agency MiAaPQ
050 #4 - LIBRARY OF CONGRESS CALL NUMBER
Classification number HF4999.2-6182
082 0# - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number 005.7
090 ## - LOCALLY ASSIGNED LC-TYPE CALL NUMBER (OCLC); LOCAL CALL NUMBER (OCLC)
Classification number (OCLC) (R) ; Classification number, CALL (RLIN) (NR) HF4999.2-6182
100 1# - MAIN ENTRY--PERSONAL NAME
Personal name Anandarajan, Murugan.
245 10 - TITLE STATEMENT
Title Practical Text Analytics :
Remainder of title Maximizing the Value of Text Data.
264 #1 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
-- Cham :
-- Springer,
-- 2018.
264 #4 - PRODUCTION, PUBLICATION, DISTRIBUTION, MANUFACTURE, AND COPYRIGHT NOTICE
-- ©2019.
300 ## - PHYSICAL DESCRIPTION
Extent 1 online resource (294 pages)
336 ## - Content
Term text
Code txt
Content rdacontent
337 ## - Media
Term computer
Code c
Media rdamedia
338 ## - Carrier
Term online resource
Code cr
Carrier rdacarrier
490 0# - SERIES STATEMENT
Series statement eBooks on Demand
490 1# - SERIES STATEMENT
Series statement Advances in Analytics and Data Science Ser. ;
Volume number/sequential designation v.2
505 0# - FORMATTED CONTENTS NOTE
Formatted contents note Intro -- Dedication -- Preface -- Acknowledgments -- Contents -- About the Authors -- List of Abbreviations -- List of Figures -- List of Tables -- Chapter 1: Introduction to Text Analytics -- 1.1 Introduction -- 1.2 Text Analytics: What Is It? -- 1.3 Origins and Timeline of Text Analytics -- 1.4 Text Analytics in Business and Industry -- 1.5 Text Analytics Skills -- 1.6 Benefits of Text Analytics -- 1.7 Text Analytics Process Road Map -- 1.7.1 Planning -- 1.7.2 Text Preparing and Preprocessing -- 1.7.3 Text Analysis Techniques -- 1.7.4 Communicating the Results -- 1.8 Examples of Text Analytics Software -- References -- Part I: Planning the Text Analytics Project -- Chapter 2: The Fundamentals of Content Analysis -- 2.1 Introduction -- 2.2 Deductive Versus Inductive Approaches -- 2.2.1 Content Analysis for Deductive Inference -- 2.2.2 Content Analysis for Inductive Inference -- 2.3 Unitizing and the Unit of Analysis -- 2.3.1 The Sampling Unit -- 2.3.2 The Recording Unit -- 2.3.3 The Context Unit -- 2.4 Sampling -- 2.5 Coding and Categorization -- 2.6 Examples of Inductive and Deductive Inference Processes -- 2.6.1 Inductive Inference -- 2.6.2 Deductive Inference -- References -- Further Reading -- Chapter 3: Planning for Text Analytics -- 3.1 Introduction -- 3.2 Initial Planning Considerations -- 3.2.1 Drivers -- 3.2.2 Objectives -- 3.2.3 Data -- 3.2.4 Cost -- 3.3 Planning Process -- 3.4 Problem Framing -- 3.4.1 Identifying the Analysis Problem -- 3.4.2 Inductive or Deductive Inference -- 3.5 Data Generation -- 3.5.1 Definition of the Project's Scope and Purpose -- 3.5.2 Text Data Collection -- 3.5.3 Sampling -- 3.5.3.1 Non-probability Sampling -- 3.5.3.2 Probability Sampling -- 3.5.3.3 Sampling for Classification Analysis -- 3.5.3.4 Sample Size -- 3.6 Method and Implementation Selection -- 3.6.1 Analysis Method Selection.
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 3.6.2 The Selection of Implementation Software -- References -- Further Reading -- Part II: Text Preparation -- Chapter 4: Text Preprocessing -- 4.1 Introduction -- 4.2 The Preprocessing Process -- 4.3 Unitize and Tokenize -- 4.3.1 N-Grams -- 4.4 Standardization and Cleaning -- 4.5 Stop Word Removal -- 4.5.1 Custom Stop Word Dictionaries -- 4.6 Stemming and Lemmatization -- 4.6.1 Syntax and Semantics -- 4.6.2 Stemming -- 4.6.3 Lemmatization -- 4.6.4 Part-of-Speech (POS) Tagging -- References -- Further Reading -- Chapter 5: Term-Document Representation -- 5.1 Introduction -- 5.2 The Inverted Index -- 5.3 The Term-Document Matrix -- 5.4 Term-Document Matrix Frequency Weighting -- 5.4.1 Local Weighting -- 5.4.1.1 Logarithmic (Log) Frequency -- 5.4.1.2 Binary/Boolean Frequency -- 5.4.2 Global Weighting -- 5.4.2.1 Document Frequency (df) -- 5.4.2.2 Global Frequency (gf) -- 5.4.2.3 Inverse Document Frequency (idf) -- 5.4.3 Combinatorial Weighting: Local and Global Weighting -- 5.4.3.1 Term Frequency-Inverse Document Frequency (tfidf) -- 5.5 Decision-Making -- References -- Further Reading -- Part III: Text Analysis Techniques -- Chapter 6: Semantic Space Representation and Latent Semantic Analysis -- 6.1 Introduction -- 6.2 Latent Semantic Analysis (LSA) -- 6.2.1 Singular Value Decomposition (SVD) -- 6.2.2 LSA Example -- 6.3 Cosine Similarity -- 6.4 Queries in LSA -- 6.5 Decision-Making: Choosing the Number of Dimensions -- References -- Further Reading -- Chapter 7: Cluster Analysis: Modeling Groups in Text -- 7.1 Introduction -- 7.2 Distance and Similarity -- 7.3 Hierarchical Cluster Analysis -- 7.3.1 Hierarchical Cluster Analysis Algorithm -- 7.3.2 Graph Methods -- 7.3.2.1 Single Linkage -- 7.3.2.2 Complete Linkage -- 7.3.3 Geometric Methods -- 7.3.3.1 Centroid -- 7.3.3.2 Ward's Minimum Variance Method -- 7.3.4 Advantages and Disadvantages of HCA.
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 7.4 k-Means Clustering -- 7.4.1 kMC Algorithm -- 7.4.2 The kMC Process -- 7.4.3 Advantages and Disadvantages of kMC -- 7.5 Cluster Analysis: Model Fit and Decision-Making -- 7.5.1 Choosing the Number of Clusters -- 7.5.1.1 Subjective Methods -- 7.5.1.2 Graphing Methods -- Scree Plot -- Silhouette Plot -- 7.5.2 Naming/Describing Clusters -- 7.5.3 Evaluating Model Fit -- 7.5.4 Choosing the Cluster Analysis Model -- References -- Further Reading -- Chapter 8: Probabilistic Topic Models -- 8.1 Introduction -- 8.2 Latent Dirichlet Allocation (LDA) -- 8.3 Correlated Topic Model (CTM) -- 8.4 Dynamic Topic Model (DT) -- 8.5 Supervised Topic Model (sLDA) -- 8.6 Structural Topic Model (STM) -- 8.7 Decision Making in Topic Models -- 8.7.1 Assessing Model Fit and Number of Topics -- 8.7.2 Model Validation and Topic Identification -- 8.7.3 When to Use Topic Models -- References -- Further Reading -- Chapter 9: Classification Analysis: Machine Learning Applied to Text -- 9.1 Introduction -- 9.2 The General Text Classification Process -- 9.3 Evaluating Model Fit -- 9.3.1 Confusion Matrices/Contingency Tables -- 9.3.2 Overall Model Measures -- 9.3.2.1 Accuracy -- 9.3.2.2 Error Rate -- 9.3.3 Class-Specific Measures -- 9.3.3.1 Precision -- 9.3.3.2 Recall -- 9.3.3.3 F-Measure -- 9.4 Classification Models -- 9.4.1 Naïve Bayes -- 9.4.2 k-Nearest Neighbors (kNN) -- 9.4.3 Support Vector Machines (SVM) -- 9.4.4 Decision Trees -- 9.4.5 Random Forests -- 9.4.6 Neural Networks -- 9.5 Choosing a Classification -- 9.5.1 Model Fit -- References -- Further Reading -- Chapter 10: Modeling Text Sentiment: Learning and Lexicon Models -- 10.1 Lexicon Approach -- 10.2 Machine Learning Approach -- 10.2.1 Naïve Bayes (NB) -- 10.2.2 Support Vector Machines (SVM) -- 10.2.3 Logistic Regression -- 10.3 Sentiment Analysis Performance: Considerations and Evaluation -- References.
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note Further Reading -- Part IV: Communicating the Results -- Chapter 11: Storytelling Using Text Data -- 11.1 Introduction -- 11.2 Telling Stories About the Data -- 11.3 Framing the Story -- 11.3.1 Storytelling Framework -- 11.3.2 Applying the Framework -- 11.4 Organizations as Storytellers -- 11.4.1 United Parcel Service -- 11.4.2 Zillow -- 11.5 Data Storytelling Checklist -- References -- Further Reading -- Chapter 12: Visualizing Analysis Results -- 12.1 Strategies for Effective Visualization -- 12.1.1 Be Purposeful -- 12.1.2 Know the Audience -- 12.1.3 Solidify the Message -- 12.1.4 Plan and Outline -- 12.1.5 Keep It Simple -- 12.1.6 Focus Attention -- 12.2 Visualization Techniques in Text Analytics -- 12.2.1 Corpus/Document Collection-Level Visualizations -- 12.2.2 Theme and Category-Level Visualizations -- 12.2.2.1 LSA Dimensions -- 12.2.2.2 Cluster-Level Visualizations -- 12.2.2.3 Topic-Level Visualizations -- 12.2.2.4 Category or Class-Level Visualizations -- 12.2.2.5 Sentiment-Level Visualizations -- 12.2.3 Document-Level Visualizations -- References -- Further Reading -- Part V: Text Analytics Examples -- Chapter 13: Sentiment Analysis of Movie Reviews Using R -- 13.1 Introduction to R and RStudio -- 13.2 SA Data and Data Import -- 13.3 Objective of the Sentiment Analysis -- 13.4 Data Preparation and Preprocessing -- 13.4.1 Tokenize -- 13.4.2 Remove Stop Words -- 13.5 Sentiment Analysis -- 13.6 Sentiment Analysis Results -- 13.7 Custom Dictionary -- 13.8 Out-of-Sample Comparison -- References -- Further Reading -- Chapter 14: Latent Semantic Analysis (LSA) in Python -- 14.1 Introduction to Python and IDLE -- 14.2 Preliminary Steps -- 14.3 Getting Started -- 14.4 Data and Data Import -- 14.5 Analysis -- Further Reading -- Chapter 15: Learning-Based Sentiment Analysis Using RapidMiner -- 15.1 Introduction -- 15.2 Getting Started in RapidMiner.
505 8# - FORMATTED CONTENTS NOTE
Formatted contents note 15.3 Text Data Import -- 15.4 Text Preparation and Preprocessing -- 15.5 Text Classification Sentiment Analysis -- Reference -- Further Reading -- Chapter 16: SAS Visual Text Analytics -- 16.1 Introduction -- 16.2 Getting Started -- 16.3 Analysis -- Further Reading -- Index.
588 ## - SOURCE OF DESCRIPTION NOTE
-- Description based on publisher supplied metadata and other sources.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element Big data.
655 #4 - INDEX TERM--GENRE/FORM
Genre/form data or focus term Electronic books.
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name Hill, Chelsey.
700 1# - ADDED ENTRY--PERSONAL NAME
Personal name Nolan, Thomas.
776 08 - ADDITIONAL PHYSICAL FORM ENTRY
Display text Print version:
Main entry heading Anandarajan, Murugan
Title Practical Text Analytics : Maximizing the Value of Text Data
Place, publisher, and date of publication Cham : Springer, c2018
International Standard Book Number 9783319956626
797 2# - LOCAL ADDED ENTRY--CORPORATE NAME (RLIN)
Corporate name or jurisdiction name as entry element ProQuest (Firm)
830 #0 - SERIES ADDED ENTRY--UNIFORM TITLE
Uniform title Advances in Analytics and Data Science Ser.
856 40 - ELECTRONIC LOCATION AND ACCESS
Uniform Resource Identifier https://ebookcentral.proquest.com/lib/uttyler/detail.action?docID=5560078
Link text Click here to view this ebook.
901 ## - LOCAL DATA ELEMENT A, LDA (RLIN)
Platform EBC
901 ## - LOCAL DATA ELEMENT A, LDA (RLIN)
Platform EBL
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Koha item type Electronic Book
Source of classification or shelving scheme
Holdings
Withdrawn status
Lost item
Source of classification or shelving scheme
Damaged status
Not for loan
Permanent Location UT Tyler Online
Current Location UT Tyler Online
Shelving location Online
Date acquired 2020-01-23
Full call number HF4999.2-6182
Barcode EBC5560078
Date last seen 2020-01-23
Uniform Resource Identifier https://ebookcentral.proquest.com/lib/uttyler/detail.action?docID=5560078
Price effective from 2020-01-23
Koha item type Electronic Book