Wiley.com

Practical Text Mining with Perl

ISBN: 978-0-470-17643-6
Hardcover
320 pages
August 2008
List Price: US $133.00
Government Price: US $92.12
This is a Print-on-Demand title. It will be printed specifically to fill your order. Please allow an additional 10-15 days delivery time. The book is not returnable.

List of Figures.

List of Tables.

Preface.

Acknowledgments.

1. Introduction.

1.1 Overview of this Book.

1.2 Text Mining and Related Fields.

1.3 Advice for Reading this Book. 

2. Text Patterns.

2.1 Introduction.

2.2 Regular Expressions.

2.3 Finding Words in a Text.

2.4 Decomposing Poe's "The Tell-Tale Heart" into Words.

2.5 A Simple Concordance.

2.6 First Attempt at Extracting Sentences.

2.7 Regex Odds and Ends.

2.8 References. 

3. Quantitative Text Summaries.

3.1 Introduction.

3.2 Scalars, Interpolation, and Context in Perl.

3.3 Arrays and Context in Perl.

3.4 Word Lengths in Poe's "The Tell-Tale Heart".

3.5 Arrays and Functions.

3.6 Hashes.

3.7 Two Text Applications.

3.8 Complex Data Structures.

3.9 References.

3.10 First Transition. 

4. Probability and Text Sampling.

4.1 Introduction.

4.2 Probability.

4.3 Conditional Probability.

4.4 Mean and Variance of Random Variables.

4.5 The Bag-of-Words Model for Poe's "The Black Cat".

4.6 The Effect of Sample Size.

4.7 References. 

5. Applying Information Retrieval to Text Mining.

5.1 Introduction.

5.2 Counting Letters and Words.

5.3 Text Counts and Vectors.

5.4 The Term-Document Matrix Applied to Poe.

5.5 Matrix Multiplication.

5.6 Functions of Counts.

5.7 Document Similarity.

5.8 References. 

6. Concordance Lines and Corpus Linguistics.

6.1 Introduction.

6.2 Sampling.

6.3 Corpus as Baseline.

6.4 Concordancing.

6.5 Collocations and Concordance Lines.

6.6 Applications with References.

6.7 Second Transition. 

7. Multivariate Techniques with Text.

7.1 Introduction.

7.2 Basic Statistics.

7.3 Basic Linear Algebra.

7.4 Principal Component Matrices.

7.5 Text Applications.

7.6 Applications and References. 

8. Text Clustering.

8.1 Introduction.

8.2 Clustering.

8.3 A Note on Classification.

8.4 References.

8.5 Last Transition. 

9. A Sample of Additional Topics.

9.1 Introduction.

9.2 Perl Modules.

9.3 Other Languages: Analyzing Goethe in German.

9.4 Permutation Tests.

9.5 References. 

Appendix A. Overview of Perl for Text Mining.

A.1 Basic Data Structures.

A.2 Operators.

A.3 Branching and Looping.

A.4 A Few Functions.

A.5 Introduction to Regular Expressions. 

Appendix B. Summary of R used in this Book.

B.1 Basics of R.

B.2 This Book's R Code.

References.

Index.
