Single-threshold flags on dashboards: Very common and very problematic (book excerpt)

tl;dr: This excerpt from my upcoming book, Beyond Dashboards, is the fourth in an eight-part series on determining which metrics to visually flag on a dashboard (e.g., with alert dots or differently colored text) in order to draw attention to metrics that require it. In this post, I look at the “single-threshold” method of determining which metrics to flag and explain why, despite being extremely common, this method has several major drawbacks that become obvious when pointed out. In a later post in this series, I introduce a more useful approach that I call “four-threshold” visual flags.

One of the most common ways to determine which metrics to visually flag on a dashboard is what I call the “single-threshold” method. On dashboards that use this method, a threshold value is chosen for each metric, and any metric that falls below (or, for some metrics, above) its threshold gets flagged:

[Image: dashboard metrics flagged with red alert dots]
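To make the mechanics concrete, here’s a minimal sketch of single-threshold flagging in Python. The metric names, values, thresholds, and helper functions are hypothetical illustrations of my own, not figures or code from the book:

```python
from dataclasses import dataclass

@dataclass
class Metric:
    name: str
    value: float
    threshold: float
    higher_is_better: bool = True  # set False for metrics like accident counts

def is_flagged(m: Metric) -> bool:
    """Binary flag: a metric is either on the bad side of its threshold
    or it isn't; the flag carries no severity information."""
    return m.value < m.threshold if m.higher_is_better else m.value > m.threshold

metrics = [
    Metric("On-time delivery %", 88.0, threshold=90.0),  # slightly low
    Metric("Revenue ($K)", 310.0, threshold=700.0),      # disastrously low
    Metric("Workplace accidents", 1.0, threshold=5.0, higher_is_better=False),
]

for m in metrics:
    print(f"{m.name}: {'FLAGGED' if is_flagged(m) else 'ok'}")

# Output:
#   On-time delivery %: FLAGGED   <- a minor shortfall...
#   Revenue ($K): FLAGGED         <- ...looks identical to a catastrophe
#   Workplace accidents: ok       <- surprisingly good news isn't surfaced
```

Even this toy version exhibits the problems discussed below.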

While very common, this method has several major drawbacks and limitations that become obvious when pointed out:

  • The thresholds are hard for users to set. When we ask users to set this type of threshold, we’re asking them to pick the point at which a given metric flips from being considered perfectly fine (not flagged) to a problem that needs to be dealt with (flagged), but this is virtually never how things work in reality. Instead, there’s almost always a range over which a metric gradually goes from being considered perfectly fine to vaguely concerning to a major concern to a crisis. Choosing a single value to represent the entire “vaguely concerning to crisis” range isn’t just difficult; it’s impossible.

  • Minor problems look the same as catastrophes. Because metrics are either flagged or not flagged, a metric that’s a little lower than we’d like it to be looks the same on a dashboard as a metric that’s indicating the worst problem that’s occurred in years. Both are simply “flagged.” On a dashboard where multiple metrics are flagged, then, the user has no idea which ones to focus on first or if there are any real emergencies lurking among those flagged metrics.

  • Metrics that are doing unexpectedly well don’t get flagged at all. If the number of workplace accidents suddenly drops to a surprisingly low level, this would certainly merit investigation to determine what could explain this welcome change and what we might do to make it continue. On a dashboard that uses single-threshold flags, though, this important development wouldn’t be flagged at all and would therefore risk going unnoticed.

Sometimes, dashboard creators show a “good” (e.g., green) flag beside metrics that fall on the desirable side of their threshold value:

[Image: dashboard metrics flagged with red and green dots]

This just produces “Christmas tree syndrome,” though, since almost all metrics get flagged almost all the time, and drawing attention to everything obviously isn’t useful. It also means that metrics that fall within a completely normal range look the same as metrics that are doing extraordinarily well (both just have green flags), so users still can’t tell whether a flagged metric requires attention.
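Continuing the hypothetical sketch from earlier (this reuses the Metric class, is_flagged function, and metrics list defined there), the red/green variant is a one-line change, and it decorates every metric:

```python
def flag_color(m: Metric) -> str:
    # Every metric now gets a dot: red on the bad side of its
    # threshold, green otherwise.
    return "red" if is_flagged(m) else "green"

for m in metrics:
    print(f"{m.name}: {flag_color(m)}")

# Output:
#   On-time delivery %: red
#   Revenue ($K): red
#   Workplace accidents: green  <- a normal value and an extraordinarily
#                                  good one would both show green
```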

The problems I’ve discussed here are often compounded by the fact that people in the organization may have different understandings of what single-threshold values are actually supposed to represent. For example, people might variously assume that a threshold is the point at which a metric becomes a minor concern, a major concern, or a crisis; the point at which a user is required to take action; or the point at which a metric violates a limit set in a service level agreement. Without a clear, agreed-upon definition of exactly what threshold values mean, users will find them even more difficult to set, and the thresholds themselves will carry even less meaning.

The next post in this series will discuss a variation on single-threshold flags that I call “% deviation from target” flags, and I’ll explain why they don’t work effectively either. Then, we’ll review the last of the four common-but-ineffective flagging methods that I see on dashboards: Good/Satisfactory/Poor ranges. After that, I’ll introduce the four-threshold flags that I now recommend, since this type of visual flag has none of the drawbacks or limitations of the four common-but-ineffective types. I’ll then conclude the series with a post on useful statistics for setting visual flag thresholds automatically.


To be notified of future posts like this, subscribe to the Practical Reporting email list.