Updated: 2021-07-16 20:14:41
Cover page
Mastering Python for Data Science
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Chapter 1. Getting Started with Raw Data
The world of arrays with NumPy
Empowering data analysis with pandas
Data cleansing
Data operations
Summary
Chapter 2. Inferential Statistics
Various forms of distribution
A z-score
A p-value
One-tailed and two-tailed tests
Type 1 and Type 2 errors
A confidence interval
Correlation
Z-test vs T-test
The F distribution
The chi-square distribution
The chi-square test of independence
ANOVA
Chapter 3. Finding a Needle in a Haystack
What is data mining?
Presenting an analysis
Studying the Titanic
Chapter 4. Making Sense of Data through Advanced Visualization
Controlling the line properties of a chart
Creating multiple plots
Playing with text
Styling your plots
Box plots
Heatmaps
Scatter plots with histograms
A scatter plot matrix
Area plots
Bubble charts
Hexagon bin plots
Trellis plots
A 3D plot of a surface
Chapter 5. Uncovering Machine Learning
Different types of machine learning
Decision trees
Linear regression
Logistic regression
The naive Bayes classifier
The k-means clustering
Hierarchical clustering
Chapter 6. Performing Predictions with a Linear Regression
Simple linear regression
Multiple regression
Training and testing a model
Chapter 7. Estimating the Likelihood of Events
Chapter 8. Generating Recommendations with Collaborative Filtering
Recommendation data
User-based collaborative filtering
Item-based collaborative filtering
Chapter 9. Pushing Boundaries with Ensemble Models
The census income dataset
Random forests
Chapter 10. Applying Segmentation with k-means Clustering
The k-means algorithm and its working
The k-means clustering with countries
Clustering the countries
Chapter 11. Analyzing Unstructured Data with Text Mining
Preprocessing data
Creating a wordcloud
Word and sentence tokenization
Parts of speech tagging
Stemming and lemmatization
The Stanford Named Entity Recognizer
Performing sentiment analysis on world leaders using Twitter
Chapter 12. Leveraging Python in the World of Big Data
What is Hadoop?
Python MapReduce
File handling with Hadoopy
Pig