Python Data Science Handbook by Jake VanderPlas

Python Data Science Handbook: full text in Jupyter Notebooks

jakevdp/PythonDataScienceHandbook

Python Data Science Handbook

This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks.

  • Read the book in its entirety online at https://jakevdp.github.io/PythonDataScienceHandbook/
  • Run the code using the Jupyter notebooks available in this repository’s notebooks directory.
  • Launch executable versions of these notebooks using Google Colab
  • Launch a live notebook server with these notebooks using Binder
  • Buy the printed book through O’Reilly Media

The book was written and tested with Python 3.5, though other Python versions (including Python 2.7) should work in nearly all cases.

The book introduces the core libraries essential for working with data in Python: particularly IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related packages. Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project, A Whirlwind Tour of Python: it’s a fast-paced introduction to the Python language aimed at researchers and scientists.

See Index.ipynb for an index of the notebooks available to accompany the text.

The code in the book was tested with Python 3.5, though most (but not all) of it will also work correctly with Python 2.7 and other older Python versions.

The packages I used to run the code in the book are listed in requirements.txt. Note that some of these exact version numbers may not be available on your platform; you may have to tweak them for your own use. To install the requirements using conda, run the following at the command line:

$ conda install --file requirements.txt 

To create a stand-alone environment named PDSH with Python 3.5 and all the required package versions, run the following:

$ conda create -n PDSH python=3.5 --file requirements.txt 

You can read more about using conda environments in the Managing Environments section of the conda documentation.
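Because some pinned versions may not be available on every platform, it can help to compare what requirements.txt asks for with what is actually installed. The following is a minimal sketch, not part of the repository: the helper names `parse_pins` and `compare_with_installed` and the sample pins are hypothetical, the parser only recognizes exact pins written as `name==version` or `name=version`, and it assumes Python 3.8 or newer for `importlib.metadata`.

```python
import re
from importlib import metadata

# Matches a package name optionally followed by an exact pin (= or ==).
_PIN = re.compile(r"([A-Za-z0-9_.\-]+)\s*(?:={1,2}\s*([^\s=]+))?")


def parse_pins(text):
    """Return {package_name: pinned_version_or_None} from requirements-style text."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = _PIN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def compare_with_installed(pins):
    """Return {package_name: (pinned, installed)}; installed is None if missing."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


# Hypothetical sample; in practice, read the text of requirements.txt instead.
SAMPLE = """
# illustrative pins only
numpy==1.11.1
pandas=0.18.1
scikit-learn
"""

if __name__ == "__main__":
    for name, (wanted, installed) in compare_with_installed(parse_pins(SAMPLE)).items():
        print(f"{name}: pinned={wanted} installed={installed}")
```

To check a real environment, pass the contents of requirements.txt to `parse_pins` in place of `SAMPLE`. Any package whose two values differ is a candidate for the version tweaking mentioned above.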

The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.

The text content of the book is released under the CC-BY-NC-ND license. Read more at Creative Commons.

Python Data Science Handbook

This website contains the full text of the Python Data Science Handbook by Jake VanderPlas; the content is available on GitHub in the form of Jupyter notebooks.

The text is released under the CC-BY-NC-ND license, and code is released under the MIT license.

If you find this content useful, please consider supporting the work by buying the book!

Table of Contents

Preface

1. IPython: Beyond Normal Python

2. Introduction to NumPy

3. Data Manipulation with Pandas

  • Introducing Pandas Objects
  • Data Indexing and Selection
  • Operating on Data in Pandas
  • Handling Missing Data
  • Hierarchical Indexing
  • Combining Datasets: Concat and Append
  • Combining Datasets: Merge and Join
  • Aggregation and Grouping
  • Pivot Tables
  • Vectorized String Operations
  • Working with Time Series
  • High-Performance Pandas: eval() and query()
  • Further Resources

4. Visualization with Matplotlib

  • Simple Line Plots
  • Simple Scatter Plots
  • Visualizing Errors
  • Density and Contour Plots
  • Histograms, Binnings, and Density
  • Customizing Plot Legends
  • Customizing Colorbars
  • Multiple Subplots
  • Text and Annotation
  • Customizing Ticks
  • Customizing Matplotlib: Configurations and Stylesheets
  • Three-Dimensional Plotting in Matplotlib
  • Geographic Data with Basemap
  • Visualization with Seaborn
  • Further Resources

5. Machine Learning

  • What Is Machine Learning?
  • Introducing Scikit-Learn
  • Hyperparameters and Model Validation
  • Feature Engineering
  • In Depth: Naive Bayes Classification
  • In Depth: Linear Regression
  • In-Depth: Support Vector Machines
  • In-Depth: Decision Trees and Random Forests
  • In Depth: Principal Component Analysis
  • In-Depth: Manifold Learning
  • In Depth: k-Means Clustering
  • In Depth: Gaussian Mixture Models
  • In-Depth: Kernel Density Estimation
  • Application: A Face Detection Pipeline
  • Further Machine Learning Resources

Appendix: Figure Code

© 2012-2017 Jake VanderPlas, license unless otherwise noted. Generated by Pelican.

Python Data Science Handbook

  • Publication date: 2022-07-02
  • Usage: Attribution-NonCommercial-NoDerivs 4.0 International (Creative Commons License)
  • Topics: Python, Data Science, programming, coding, book, python books
  • Collection: folkscanomy_computer_inbox; folkscanomy_computer; folkscanomy; additional_collections
  • Language: English

For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all: IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.

Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.
