Debugging API errors in production

Published on 2021-05-12

After developing and deploying an API, most developers track server errors and exceptions with an application monitoring system such as Sentry or AppSignal. These tools are great for finding such errors, but they most likely won’t track so-called client errors. These are the errors whose HTTP status code starts with a 4. Important examples include:

  • 400 Bad Request
  • 401 Unauthorized
  • 403 Forbidden
  • 404 Not Found
  • 422 Unprocessable Entity

The list is much longer; have a look at the Wikipedia list of client errors, for example.

Most of these errors are caused by the client doing something wrong, for example requesting a resource that no longer exists or sending invalid data when trying to modify a resource. The server application handled this correctly and did not throw an exception; it informed the requester of the error with the HTTP status code and possibly a human-readable description in the response body. As you can imagine, tracking all of these in an application performance monitoring system would produce a lot of noise.

Are your users experiencing them in production?

To keep your API users happy, it is important to help them resolve these errors. In API clients built by less experienced developers, client errors might go unnoticed. So how can you help if your users don’t detect or handle these errors themselves? You can use Callcounter to detect client errors with the status code chart. This is a doughnut or pie chart that shows the distribution of HTTP status codes in the selected period. Client errors are shown in yellow, server errors in red, and success status codes in green.
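To make the idea behind such a chart concrete, here is a minimal sketch of how a status code distribution could be computed from a list of response codes. It is not Callcounter's implementation; the function names and the "other" group for 1xx and 3xx codes are assumptions made for illustration.

```python
from collections import Counter


def status_class(code):
    """Map an HTTP status code to one of the groups described above."""
    if 200 <= code < 300:
        return "success"       # shown in green
    if 400 <= code < 500:
        return "client_error"  # shown in yellow
    if 500 <= code < 600:
        return "server_error"  # shown in red
    return "other"             # 1xx/3xx: grouping these separately is an assumption


def status_distribution(codes):
    """Return the percentage share of each status group in `codes`."""
    counts = Counter(status_class(code) for code in codes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * count / total for name, count in counts.items()}


# One successful request and three client errors.
sample = [200, 404, 404, 422]
print(status_distribution(sample))
```

With this sample, one request out of four succeeded, so the success share is 25% and the client error share is 75%.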

Using the chart, one Callcounter user discovered that 75% of all requests to his API ended in an error, without him ever realising it. The doughnut chart was only 25% green! After filtering on some of the status codes above, he discovered that one API client was requesting all kinds of old data again and again, every hour, every day. That is quite a big waste of resources on both sides, and nobody noticed until Callcounter was integrated!
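If you have raw request logs at hand, the same kind of investigation can be sketched in a few lines. The example below assumes each log entry is a hypothetical (client, path, status) tuple; your own logs will have a different shape, and the client names and paths are made up.

```python
from collections import Counter


def top_client_error_sources(requests, limit=3):
    """Count 4xx responses per (client, path) pair and return the most frequent."""
    errors = Counter(
        (client, path)
        for client, path, status in requests
        if 400 <= status < 500
    )
    return errors.most_common(limit)


# Hypothetical log entries: (client identifier, request path, status code).
log = [
    ("client-a", "/orders/1", 200),
    ("client-b", "/orders/17", 404),
    ("client-b", "/orders/17", 404),
    ("client-b", "/orders/18", 404),
    ("client-a", "/orders", 422),
]

for (client, path), count in top_client_error_sources(log):
    print(client, path, count)
```

A client that keeps asking for resources that no longer exist will float to the top of this list.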

Help new API users and keep existing ones happy

As the example demonstrates, it is really useful to keep an eye on these client errors in production. It’s a different kind of debugging than most developers are used to, but it really helps your users. Just have a look once a week and see what percentage of the requests is yellow.

If enough Callcounter customers find it useful, we might implement an automatic email alert for when the percentage of client errors over the last day or week rises above a threshold.
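Until then, a check like this is easy to run yourself. The sketch below is purely hypothetical and not an existing Callcounter feature; the function names and the default threshold of 20% are assumptions you would tune for your own API.

```python
def client_error_share(codes):
    """Return the percentage of responses in `codes` that are 4xx."""
    total = len(codes)
    if total == 0:
        return 0.0
    client_errors = sum(1 for code in codes if 400 <= code < 500)
    return 100.0 * client_errors / total


def should_alert(codes, threshold_percent=20.0):
    """True when the client error share exceeds the (assumed) threshold."""
    return client_error_share(codes) > threshold_percent


# Status codes collected over the last day or week.
last_week = [200, 200, 404, 404, 422]
if should_alert(last_week):
    print("Client error share is %.0f%%" % client_error_share(last_week))
```

Sending the actual email is left out here; the point is only the comparison against a threshold.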