The Python Data Analysis Library (pandas) is a data structures and analysis library.
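As a minimal sketch of the two core pandas data structures, the example below builds a `Series` and a `DataFrame`. The fruit names, prices and quantities are made-up illustration data, not taken from any of the resources listed here.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
prices = pd.Series([9.5, 12.0, 7.25], index=["apple", "banana", "cherry"])

# A DataFrame is a two-dimensional table; each column is a Series
# that shares the same row index.
inventory = pd.DataFrame({"price": prices, "quantity": [10, 4, 25]})

# Column arithmetic is vectorized and aligned on the index.
total = (inventory["price"] * inventory["quantity"]).sum()
print(inventory)
print("Total inventory value:", total)
```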
Intro to pandas data structures, Working with pandas data frames and Using pandas on the MovieLens dataset make up a well-written three-part blog series introducing pandas, with each post building on the one before it.
pandas exercises is a GitHub repository with Jupyter Notebooks that let you practice sorting, filtering, visualizing, grouping, merging and more with pandas.
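The operations named above can be sketched in a few lines. This is an illustrative example on invented data (the `orders` and `customers` tables are hypothetical), not an excerpt from the exercises repository.

```python
import pandas as pd

orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3],
    "amount": [20.0, 35.0, 15.0, 50.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ada", "Grace", "Linus"],
})

# Sorting: largest orders first.
sorted_orders = orders.sort_values("amount", ascending=False)

# Filtering: keep rows that satisfy a boolean condition.
large_orders = orders[orders["amount"] > 18]

# Grouping: total amount per customer.
totals = orders.groupby("customer_id", as_index=False)["amount"].sum()

# Merging: attach customer names to the per-customer totals.
report = totals.merge(customers, on="customer_id", how="left")
print(report)
```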
A simple way to anonymize data with Python and Pandas is a good tutorial on removing sensitive data from your unfiltered data sets.
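One possible approach, which is not necessarily the one the tutorial takes, is to replace direct identifiers with salted hashes and drop the original columns. The column names, sample records and `SALT` value below are assumptions made for illustration. Strictly speaking this is pseudonymization, so the remaining columns may still need review before sharing the data.

```python
import hashlib

import pandas as pd

# Hypothetical secret; in practice this would be kept out of source control.
SALT = "replace-with-a-secret-value"


def pseudonymize(value: str) -> str:
    """Return a short, stable, salted hash for an identifier."""
    digest = hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()
    return digest[:12]


records = pd.DataFrame({
    "email": ["ada@example.com", "grace@example.com", "ada@example.com"],
    "phone": ["555-0100", "555-0101", "555-0100"],
    "purchase": [20.0, 35.0, 15.0],
})

# Derive a stable key from the email, then drop the sensitive columns.
anonymized = (
    records
    .assign(user_key=records["email"].map(pseudonymize))
    .drop(columns=["email", "phone"])
)
print(anonymized)
```

Because the same input always produces the same key, rows belonging to one person can still be grouped after the identifying columns are gone.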
Time Series Analysis with Pandas shows you how to combine Python 3.6, pandas, matplotlib and seaborn to analyze and visualize open data from Germany's power grid. This is a great tutorial to learn these tools with a realistic data set.
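To give a feel for the kind of time series operations involved, here is a small sketch using a synthetic daily series rather than the power grid data. The values and the `consumption` name are invented, and plotting is left out to keep the example dependent on pandas only.

```python
import pandas as pd

# Fourteen days of synthetic daily readings, starting on a Monday.
index = pd.date_range("2017-01-02", periods=14, freq="D")
consumption = pd.Series(range(100, 114), index=index, name="consumption")

# Downsample to weekly means (weeks end on Sunday by default).
weekly_mean = consumption.resample("W").mean()

# Smooth the daily values with a 7-day rolling average.
rolling_mean = consumption.rolling(window=7).mean()

print(weekly_mean)
print(rolling_mean.tail())
```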
Analyzing a photographer's flickr stream using pandas explains how the author grabbed a bunch of Flickr data using the flickr-api library then analyzed the EXIF data in the photos using pandas.
Pandas Crosstab Explained shows how to use the crosstab function in pandas so you can summarize and group data.
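A brief sketch of `pd.crosstab` on invented survey data (the `region` and `plan` columns are hypothetical) shows how it counts combinations of two categorical columns:

```python
import pandas as pd

survey = pd.DataFrame({
    "region": ["north", "south", "north", "south", "north"],
    "plan": ["basic", "basic", "premium", "premium", "basic"],
})

# Count how often each (region, plan) pair occurs.
counts = pd.crosstab(survey["region"], survey["plan"])

# margins=True adds row and column totals labeled "All".
counts_with_totals = pd.crosstab(survey["region"], survey["plan"], margins=True)
print(counts_with_totals)
```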
This two-part series on loading data into a pandas DataFrame presents what to do when CSV files do not match your expectations and how to handle missing values so you can start performing your analysis rather than getting frustrated with common issues at the beginning of your workflow.
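The following sketch illustrates one way to deal with missing values while loading a CSV. The inline data and the choice of placeholder strings (`n/a` and `-`) are assumptions for the example and may differ from what the series covers.

```python
import io

import pandas as pd

# Stand-in for a messy file on disk.
raw_csv = io.StringIO(
    "id,city,temperature\n"
    "1,Berlin,21.5\n"
    "2,,n/a\n"
    "3,Oslo,-\n"
)

# Tell read_csv which placeholder strings should be treated as missing.
readings = pd.read_csv(raw_csv, na_values=["n/a", "-"])

# Inspect how many values are missing in each column.
missing_per_column = readings.isna().sum()

# Drop rows without a city, then fill missing temperatures with the mean.
cleaned = (
    readings
    .dropna(subset=["city"])
    .fillna({"temperature": readings["temperature"].mean()})
)
print(missing_per_column)
print(cleaned)
```

Whether to drop or fill missing values depends on the analysis; filling with the mean is only one of several reasonable choices.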
Building a financial model with pandas explains how to create an amortization schedule with a corresponding table and charts that show the payoff period broken down by interest and principal.
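An amortization schedule can be sketched with the standard fixed-payment loan formula, payment = P * r / (1 - (1 + r)^-n). The function name, the monthly compounding and the loan figures below are assumptions for illustration, not the article's own implementation.

```python
import pandas as pd


def amortization_schedule(principal, annual_rate, years):
    """Build a monthly schedule for a fixed-rate loan (annual_rate must be > 0)."""
    periods = years * 12
    rate = annual_rate / 12  # monthly interest rate
    payment = principal * rate / (1 - (1 + rate) ** -periods)

    rows = []
    balance = principal
    for period in range(1, periods + 1):
        interest = balance * rate            # interest owed this month
        principal_paid = payment - interest  # remainder reduces the balance
        balance -= principal_paid
        rows.append({
            "period": period,
            "payment": payment,
            "interest": interest,
            "principal": principal_paid,
            "balance": balance,
        })
    return pd.DataFrame(rows).set_index("period")


schedule = amortization_schedule(principal=10_000, annual_rate=0.06, years=1)
print(schedule.round(2))
```

Early payments are mostly interest and later payments are mostly principal, which is the breakdown the charts in such a model typically show.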
tabula-py: Extract table from PDF into Python DataFrame presents how to use the Python wrapper for the Tabula library that makes it easier to extract table data from PDF files.
Time Series Forecast Case Study with Python: Monthly Armed Robberies in Boston walks through the data wrangling, analysis and visualization steps with a public data set of armed robberies in Boston from 1966 to 1975. This particular data problem may not be your thing, but by going through the process you can learn a lot that can be applied to any data set.
A Gentle Visual Intro to Data Analysis in Python Using Pandas presents spreadsheet-like pictures to show conceptually what pandas is doing with your data as you apply various functions.
Data Manipulation with Pandas: A Brief Tutorial uses some example data sets to show how the most commonly used functions in pandas work.
Analyzing Pronto CycleShare Data with Python and Pandas uses Seattle bikeshare data as a source for wrangling, analysis and visualization.
Stylin' with pandas shows how to add colors and sparklines to your output when using pandas for data visualization.
Python and JSON: Working with large datasets using Pandas is a well-done, detailed tutorial that shows how to munge and analyze JSON data.
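As a small sketch of turning JSON into a DataFrame, the example below flattens nested records with `pd.json_normalize` (available in pandas 1.0 and later). The records are invented and far smaller than the datasets the tutorial deals with, and the tutorial may use a different loading approach.

```python
import json

import pandas as pd

raw = """
[
  {"id": 1, "user": {"name": "Ada", "city": "London"}, "score": 9.5},
  {"id": 2, "user": {"name": "Grace", "city": "New York"}, "score": 8.0},
  {"id": 3, "user": {"name": "Alan", "city": "London"}, "score": 7.5}
]
"""

records = json.loads(raw)

# Nested objects become dotted column names such as "user.city".
frame = pd.json_normalize(records)

# Once flattened, the data can be analyzed like any other DataFrame.
mean_score_by_city = frame.groupby("user.city")["score"].mean()
print(frame)
print(mean_score_by_city)
```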
Fun with NFL Stats, Bokeh, and Pandas uses National (American) Football League data as a source for wrangling and visualization.
Analyzing my Spotify Music Library With Jupyter And a Bit of Pandas shows how to grab all of your user data from the Spotify API then analyze it using pandas in Jupyter Notebook.
Scalable Python Code with Pandas UDFs explains that pandas operations can often be parallelized for better performance using the Pandas UDFs feature in PySpark version 2.3 or greater.