Single-threshold flags on dashboards: Very common and very problematic (book excerpt)

tl;dr: This excerpt from my upcoming book, Beyond Dashboards, is the fourth in an eight-part series on determining which metrics to visually flag on a dashboard (e.g., with alert dots or differently colored text) in order to draw attention to metrics that require it. In this post, I look at the “single-threshold” method of determining which metrics to flag and explain why, despite being extremely common, this method has several major drawbacks that become obvious when pointed out. In a later post in this series, I introduce a more useful approach that I call “four-threshold” visual flags.

One of the most common ways to determine which metrics to visually flag on a dashboard is what I call the “single-threshold” method. On dashboards that use this method, a threshold value is chosen for each metric, and any metric that falls below (or, for some metrics, above) its threshold gets flagged:

[Image: dashboard metrics flagged with red alert dots]
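To make the mechanics concrete, here’s a minimal sketch of single-threshold flagging in Python. The metric names, values, thresholds, and helper functions are hypothetical illustrations of my own, not figures or code from the book:

```python
from dataclasses import dataclass

@dataclass
class Metric:
    name: str
    value: float
    threshold: float
    higher_is_better: bool = True  # set False for metrics like accident counts

def is_flagged(m: Metric) -> bool:
    """Binary flag: a metric is either on the bad side of its threshold
    or it isn't; the flag carries no severity information."""
    return m.value < m.threshold if m.higher_is_better else m.value > m.threshold

metrics = [
    Metric("On-time delivery %", 88.0, threshold=90.0),  # slightly low
    Metric("Revenue ($K)", 310.0, threshold=700.0),      # disastrously low
    Metric("Workplace accidents", 1.0, threshold=5.0, higher_is_better=False),
]

for m in metrics:
    print(f"{m.name}: {'FLAGGED' if is_flagged(m) else 'ok'}")

# Output:
#   On-time delivery %: FLAGGED   <- a minor shortfall...
#   Revenue ($K): FLAGGED         <- ...looks identical to a catastrophe
#   Workplace accidents: ok       <- surprisingly good news isn't surfaced
```

Even this toy version exhibits the problems discussed below.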

While very common, this method has several major drawbacks and limitations that become obvious when pointed out:

  • The thresholds are hard for users to set. When we ask users to set this type of threshold, we’re asking them to pick the point at which a given metric flips from being considered perfectly fine (not flagged) to a problem that needs to be dealt with (flagged), but this is virtually never how things work in reality. Instead, there’s almost always a range over which a metric gradually goes from being considered perfectly fine to vaguely concerning to a major concern to a crisis. Choosing a single value to represent the entire “vaguely concerning to crisis” range isn’t just difficult; it’s impossible.

  • Minor problems look the same as catastrophes. Because metrics are either flagged or not flagged, a metric that’s a little lower than we’d like it to be looks the same on a dashboard as a metric that’s indicating the worst problem that’s occurred in years. Both are simply “flagged.” On a dashboard where multiple metrics are flagged, then, the user has no idea which ones to focus on first or if there are any real emergencies lurking among those flagged metrics.

  • Metrics that are doing unexpectedly well don’t get flagged at all. If the number of workplace accidents suddenly drops to a surprisingly low level, this would certainly merit investigation to determine what could explain this welcome change and what we might do to make it continue. On a dashboard that uses single-threshold flags, though, this important development wouldn’t be flagged at all and would therefore risk going unnoticed.

Sometimes, dashboard creators show a “good” (e.g., green) flag beside metrics that fall on the desirable side of their threshold value:

[Image: dashboard metrics flagged with red and green dots]

This just produces “Christmas tree syndrome,” though, since almost all metrics get flagged almost all the time, and drawing attention to everything obviously isn’t useful. It also means that metrics that fall within a completely normal range look the same as metrics that are doing extraordinarily well (both just have green flags), so users still can’t tell whether a flagged metric requires attention.
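Continuing the hypothetical sketch from earlier (this reuses the Metric class, is_flagged function, and metrics list defined there), the red/green variant is a one-line change, and it decorates every metric:

```python
def flag_color(m: Metric) -> str:
    # Every metric now gets a dot: red on the bad side of its
    # threshold, green otherwise.
    return "red" if is_flagged(m) else "green"

for m in metrics:
    print(f"{m.name}: {flag_color(m)}")

# Output:
#   On-time delivery %: red
#   Revenue ($K): red
#   Workplace accidents: green  <- a normal value and an extraordinarily
#                                  good one would both show green
```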

The problems I’ve discussed here are often compounded by the fact that people in the organization may have different understandings of what single-threshold values are actually supposed to represent. For example, people might variously assume that a threshold is the point at which a metric becomes a minor concern, a major concern, or a crisis; the point at which a user is required to take action; or the point at which a metric violates a limit set in a service level agreement. Without a clear, agreed-upon definition of exactly what threshold values mean, users will find them even more difficult to set, and the thresholds themselves will carry even less meaning.

The next post in this series will discuss a variation on single-threshold flags that I call “% deviation from target” flags, and I’ll explain why they don’t work effectively either. Then, we’ll review the last of the four common-but-ineffective flagging methods that I see on dashboards: Good/Satisfactory/Poor ranges. After that, I’ll introduce the four-threshold flags that I now recommend, since this type of visual flag has none of the drawbacks or limitations of the four common-but-ineffective types. I’ll then conclude the series with a post on useful statistics for setting visual flag thresholds automatically.


To be notified of future posts like this, subscribe to the Practical Reporting email list.