Data analysis

Data analysis involves a broad set of activities to clean, process and transform a data collection in order to learn from it. Python is commonly used for data analysis because many tools, such as Jupyter Notebook, pandas and Bokeh, are written in Python and can be applied quickly, so you do not have to code your own data analysis libraries from scratch.
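
As a rough illustration of the clean, process and transform steps, here is a minimal sketch using pandas. The CSV contents, the column names (`city`, `temp_f`, `temp_c`) and the `load_and_summarize` helper are all made up for this example and are assumptions, not part of any particular data set or library.

```python
import io

import pandas as pd

# Hypothetical raw data: inconsistent casing, stray whitespace, a missing value.
RAW_CSV = """city,temp_f
 Austin ,98
austin,101
Boston,
boston,77
"""


def load_and_summarize(csv_text: str) -> pd.DataFrame:
    """Clean, process and transform raw readings into a per-city summary."""
    df = pd.read_csv(io.StringIO(csv_text))

    # Clean: normalize city names and drop rows that have no reading.
    df = df.assign(city=df["city"].str.strip().str.title())
    df = df.dropna(subset=["temp_f"])

    # Process: derive a Celsius column from the Fahrenheit readings.
    df = df.assign(temp_c=(df["temp_f"] - 32) * 5 / 9)

    # Transform: aggregate to one row per city.
    return df.groupby("city")["temp_c"].agg(["count", "mean"]).round(1)


summary = load_and_summarize(RAW_CSV)
print(summary)
```

Each step is only a few lines because pandas already provides the string handling, missing-value handling and grouping, which is the kind of work you would otherwise write yourself.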

Data analysis resources

  • The following series on data exploration uses Python as the implementation language while walking through various stages of how to analyze a data set.

    • Part 1 gives insight into how to think about data and clarify what you are looking to learn.
    • Part 2 explains how to categorize a data set and transform it into one that is easier to analyze.
    • Part 3 shows how to visualize the results of your data exploration.
  • The Python Data Science Handbook is available to read for free online, although I also recommend buying the book as it is a great resource for learning the topic.

  • PyData TV contains all the videos from the PyData conference series. The conference talks are often given by professional data scientists and the developers who write these analysis libraries, so there is a wealth of information not necessarily captured anywhere else.

  • Python Plotting for Exploratory Data Analysis is a great tutorial on how to use simple data visualizations to bootstrap your understanding of a data set. The walkthrough covers histograms, time series analysis, scatter plots and various forms of bar charts.
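
To give a feel for what a histogram, the simplest of those visualizations, actually computes, here is a small sketch that uses only the standard library. It is not taken from the tutorial: the sample values, the `histogram` helper and the bin width are assumptions chosen for illustration, and a plotting library would normally handle the binning and drawing for you.

```python
from collections import Counter

# Hypothetical sample: response times in milliseconds.
SAMPLE = [12, 15, 21, 22, 25, 31, 33, 34, 38, 47]
BIN_WIDTH = 10


def histogram(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


bins = histogram(SAMPLE, BIN_WIDTH)

# Draw one row of '#' marks per bin as a rough text-mode bar chart.
for start, count in bins.items():
    print(f"{start:>3}-{start + BIN_WIDTH - 1:<3} {'#' * count}")
```

Even this crude view shows where the values cluster, which is the point of starting an exploration with simple plots.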




Matt Makai 2012-2018