Data analysis

Data analysis is a broad set of activities for cleaning, processing and transforming a data collection in order to learn from it. Python is commonly used for data analysis because many tools, such as Jupyter Notebook, pandas and Bokeh, are written in Python and can be applied quickly, so you do not have to code your own data analysis libraries from scratch.
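To make the clean, process and transform steps concrete, here is a minimal sketch using pandas. The CSV contents, the column names and the `summarize` function are made up for illustration and do not come from any of the resources listed on this page.

```python
import io

import pandas as pd

# Hypothetical raw data: inconsistent casing, stray whitespace
# and one row with a missing temperature reading.
RAW_CSV = """city,temperature_c
 Austin ,31.5
austin,29.0
Boston,
boston,18.5
"""


def summarize(csv_text):
    """Clean, transform and aggregate a small CSV of temperature readings."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Clean: normalize the city names and drop rows without a reading.
    df["city"] = df["city"].str.strip().str.title()
    df = df.dropna(subset=["temperature_c"])
    # Transform: derive a Fahrenheit column from the Celsius readings.
    df["temperature_f"] = df["temperature_c"] * 9 / 5 + 32
    # Learn: compute the average Fahrenheit temperature per city.
    return df.groupby("city")["temperature_f"].mean()


if __name__ == "__main__":
    print(summarize(RAW_CSV))
```

Each step is a single line of pandas, which is the kind of shortcut the paragraph above describes. A real data set would likely need more cleaning rules than this toy example.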

Data analysis resources

  • The following series on data exploration uses Python as the implementation language while walking through various stages of how to analyze a data set.

    • Part 1 gives insight into how you should think about data and clarify what you are looking to learn.
    • Part 2 explains categorization and transforming a data set into one that is easier to analyze.
    • Part 3 shows how to visualize the results of your data exploration.
  • The Python Data Science Handbook is available to read for free online, although I also recommend buying the book as it is a great resource for learning the topic.

  • PyData TV contains all the videos from the PyData conference series. The conference talks are often given by professional data scientists and the developers who write these analysis libraries, so there is a wealth of information not necessarily captured anywhere else.

  • Python Plotting for Exploratory Data Analysis is a great tutorial on how to use simple data visualizations to bootstrap your understanding of a data set. The walkthrough covers histograms, time series analysis, scatter plots and various forms of bar charts.
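A histogram, the first chart type named in the plotting tutorial above, comes down to counting how many values fall into each bin. The sketch below computes those counts with pandas and leaves out the drawing step. The sample values, bin edges and `histogram_counts` function are hypothetical and are not taken from the tutorial.

```python
import pandas as pd

# Hypothetical sample: response times in milliseconds.
response_ms = pd.Series([12, 48, 51, 75, 99, 120, 130, 180])


def histogram_counts(values, edges):
    """Count how many values fall into each half-open bin [left, right)."""
    bins = pd.cut(values, bins=edges, right=False)
    # Sort by bin so the counts read from the lowest bin to the highest.
    return bins.value_counts().sort_index()


counts = histogram_counts(response_ms, [0, 50, 100, 150, 200])

if __name__ == "__main__":
    print(counts)
```

Looking at the raw counts before plotting is a quick way to check that the bin edges make sense for the data.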


Full Stack Python

Full Stack Python is an open book that explains concepts in plain language and provides helpful resources for those topics.

Matt Makai 2012-2018