Monitoring

Monitoring tools capture, analyze, and display information about a web application's execution. Every application has issues that arise at all levels of the web stack. Monitoring tools provide transparency so development and operations teams can respond to and fix problems.

Why is monitoring necessary?

Capturing and analyzing data about your production environment is critical for proactively dealing with stability, performance, and error problems in a web application.

Difference between monitoring and logging

Monitoring and logging are very similar in their purpose of helping to diagnose issues with an application and aid the debugging process. One way to think about the difference is that logging happens based on explicit events while monitoring is a passive background collection of data.

For example, when an error occurs, that event is explicitly logged through code in an exception handler. Meanwhile, a monitoring agent instruments the code and gathers data not only about the logged exception but also the performance of the functions.
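The contrast can be sketched in a few lines of Python. This is a minimal illustration, not how any particular monitoring agent works: the `monitored` decorator, the `timings` dictionary, and the `parse_quantity` function are all hypothetical names made up for this example. The `logger.exception` call is the explicit logging event, while the decorator passively records timing for every call.

```python
import functools
import logging
import time

logger = logging.getLogger("app")

# Hypothetical in-memory store; a real agent would ship data to a collector.
timings = {}


def monitored(func):
    """Record how long each call takes, whether or not it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            timings.setdefault(func.__name__, []).append(elapsed)
    return wrapper


@monitored
def parse_quantity(raw):
    """Hypothetical application function used only for illustration."""
    try:
        return int(raw)
    except ValueError:
        # Logging: an explicit event written by the developer.
        logger.exception("could not parse quantity %r", raw)
        return None
```

Calling `parse_quantity("3")` and then `parse_quantity("x")` logs only the second call, but both calls add an entry to `timings["parse_quantity"]`.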

This distinction is fuzzy, and it is not the only way to separate logging from monitoring. Pragmatically, both are useful for maintaining a production web application.

Monitoring layers

There are several important resources to monitor at the operating system and network levels of a web stack.

  1. CPU utilization
  2. Memory utilization
  3. Persistent storage consumed versus free
  4. Network bandwidth and latency
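Some of these readings can be taken with nothing but the Python standard library. The sketch below is an assumption-laden illustration, not a replacement for a monitoring agent: the `system_snapshot` function name and the dictionary keys are made up for this example. It covers storage and a rough CPU load figure only, because the standard library has no portable call for memory or network statistics.

```python
import os
import shutil


def system_snapshot(path="/"):
    """Return a small dict of host-level readings for the given mount point."""
    usage = shutil.disk_usage(path)
    snapshot = {
        "cpu_count": os.cpu_count(),
        "disk_total_bytes": usage.total,
        "disk_free_bytes": usage.free,
        "disk_used_percent": round(100 * usage.used / usage.total, 1),
    }
    # os.getloadavg() exists only on Unix-like systems and may raise OSError.
    if hasattr(os, "getloadavg"):
        try:
            snapshot["load_avg_1min"] = os.getloadavg()[0]
        except OSError:
            pass
    return snapshot


if __name__ == "__main__":
    print(system_snapshot())
```

A monitoring agent would take a snapshot like this on a schedule and send it to a central store so the values can be graphed over time.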

Application-level monitoring encompasses several aspects. The amount of time and resources dedicated to each aspect will vary based on whether an application is read-heavy, write-heavy, or subject to rapid swings in traffic.

  1. Application warnings and errors (500-level HTTP errors)
  2. Application code performance
  3. Template rendering time
  4. Browser rendering time for the application
  5. Database querying performance
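The first two items in the list above can be approximated with a small piece of WSGI middleware. This is a rough sketch under stated assumptions: the `RequestMetrics` class and its attribute names are hypothetical, the counters live in memory only, and the timing covers the call into the application rather than the streaming of the response body.

```python
import time


class RequestMetrics:
    """WSGI middleware that counts requests, 5xx responses, and time spent."""

    def __init__(self, app):
        self.app = app
        self.requests = 0
        self.server_errors = 0
        self.total_seconds = 0.0

    def __call__(self, environ, start_response):
        start = time.perf_counter()

        def recording_start_response(status, headers, exc_info=None):
            # status is a string such as "500 Internal Server Error".
            if status.startswith("5"):
                self.server_errors += 1
            return start_response(status, headers, exc_info)

        try:
            return self.app(environ, recording_start_response)
        finally:
            # Measures time until the app returns its response iterable.
            self.requests += 1
            self.total_seconds += time.perf_counter() - start
```

Wrapping an application is one line, for example `application = RequestMetrics(application)`, after which the counters can be read or exported periodically.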

Open source monitoring projects

  • Sentry started life as a Python-only monitoring project but can now be used for any programming language.

  • Service Canary

  • ping.gg (source code)

  • glances (source code)

  • StatsD is a Node.js network daemon that listens for metrics and aggregates them for transfer into another service such as Graphite.

  • Graphite stores time-series data and displays them in graphs through a Django web application.

  • Sensu is an open source monitoring framework written in Ruby but applicable to web applications written in any programming language.

  • Graph Explorer by Vimeo is a Graphite-based dashboard with added features and a slick design.

  • Munin is a client plugin-based monitoring system that sends monitoring traffic to the Munin node where the data can be analyzed and visualized. Note this project is written in Perl so Perl 5 must be installed on the node collecting the data.

  • Bucky measures the performance of a web application from end users' browsers and sends that data back to the server for collection.
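To make the StatsD entry above more concrete, here is a minimal sketch of sending a metric to a StatsD daemon from Python. StatsD accepts plain-text lines of the form `name:value|type` over UDP, conventionally on port 8125. The function names, the metric names, and the default host are assumptions made for this example; in practice an existing StatsD client library would likely be a better choice than hand-rolled sockets.

```python
import socket


def format_metric(name, value, metric_type):
    """Build one StatsD line: counters use "c", timers "ms", gauges "g"."""
    if metric_type not in ("c", "ms", "g"):
        raise ValueError(f"unsupported metric type: {metric_type}")
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(name, value, metric_type, host="127.0.0.1", port=8125):
    """Fire-and-forget a metric over UDP; no reply is expected."""
    payload = format_metric(name, value, metric_type)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload
```

For example, `send_metric("page.views", 1, "c")` would increment a hypothetical `page.views` counter. Because UDP is connectionless, the application does not wait for a reply from the monitoring daemon.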

Hosted monitoring services

Hosted monitoring software takes away the burden of deploying and operating the software yourself. However, hosted monitoring costs money, often a significant amount, and takes your application's data out of your hands, so these services are not the right fit for every project.

Error Tracking

  • Rollbar instruments both the server side and client side to capture and report exceptions. The pyrollbar code library provides quick integration for Python web applications. There are also specific instructions for common web frameworks such as Django and Pyramid.
  • Sentry's hosted service runs the open source tool for you. The paid service is used to monetize the project and support its further development.

Application Performance Monitoring (APM)

  • New Relic provides application and database monitoring as well as plugins for capturing and analyzing data about other developer tools in your stack, such as Twilio.
  • Opbeat is built for Django. It combines performance metrics, release tracking, and error logging into a single simple service.
  • Scout monitors the performance of Django and Flask apps, auto-instrumenting views, SQL queries, templates, and more.

Status Pages

  • Status.io focuses on uptime and response metrics transparency for web applications.
  • StatusPage.io (yes, there's both a Status.io and a StatusPage.io) provides easy-to-set-up status pages for monitoring application uptime.

Incident Management

  • PagerDuty alerts a designated person or group if there are stability, performance, or uptime issues with an application.

Monitoring resources

Monitoring learning checklist

  1. Review the software-as-a-service and open source monitoring tools listed above. Third party services tend to be easier to set up and host the data for you. Open source projects give you more control but you'll need to have additional servers ready for the monitoring.

  2. My recommendation is to install New Relic's free option with the trial period to see how it works with your app. It'll give you a good idea of the capabilities for application-level monitoring tools.

  3. As your app scales, take a look at setting up one of the open source monitoring projects such as StatsD with Graphite. The combination of those two projects will give you fine-grained control over the system metrics you're collecting and visualizing.


Matt Makai 2012-2022