Real-time cost anomaly detection in cloud environments always sits on a tradeoff curve: the faster you detect potential overspend, the more likely you are to raise false positives from data that hasn’t settled yet. This challenge exists across all clouds, but it’s especially visible in Google Cloud because of how billing data is reported and exported.
Google Cloud services report usage and cost to Cloud Billing at varying intervals, and Cloud Billing frequently exports this data to BigQuery throughout the day on a best-effort basis, with no delivery or latency guarantees. Some services report quickly; others lag. That variability creates both powerful early-detection opportunities and subtle failure modes.
Let’s dive into a customer example of phantom anomalies in Google Cloud billing exports.
A phantom spike created by lagging credits
In this customer environment, Google Compute Engine (GCE) E2 Instance usage is fully covered by Committed Use Discounts (CUDs), meaning the net cost on those instance SKUs is expected to be $0—and the anomaly model is trained to expect exactly that.
On July 29, the environment incurred:
- Usage cost: $90
- CUD credits: –$90
- Final net cost: $0
Here’s how the data showed up:
- The cost (red) rose first and plateaued.
- The credits (green) arrived later, eventually reaching the same magnitude.
- The net cost (black) briefly spiked before settling back to zero.
For most of that day, net spend climbed above normal before plateauing at $18.62. Ternary’s anomaly detection flagged the spike. But midway through the next day, once credits had fully landed, the net cost returned to $0—making the anomaly appear to “disappear.”
Not a model error—it was a timing artifact
This phantom anomaly wasn’t the result of an incorrect threshold or a faulty model. It was a timing artifact. From a signal perspective, cost and credits are two related time series that are slightly out of phase. Subtracting them in real time naturally produces short-lived transients—even when the final value is zero.
From a data perspective, the anomaly was real when it occurred. It only became a “phantom” once the full credit amounts were posted, correcting the net cost after the fact.
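The effect is easy to reproduce. Below is a minimal sketch of two out-of-phase series: cost that reports promptly and credits that post a few hours behind. The hourly amounts and the lag are invented for illustration (only the $90 total echoes the example above), not actual billing data.

```python
# Illustrative only: cost reports promptly; credits lag by a few hours.
HOURS = 24
LAG = 6  # hypothetical reporting lag for credits, in hours

# Usage accrues $5/hour for 18 hours ($90 total) -- invented shape.
cost = [90 / 18 if h < 18 else 0.0 for h in range(HOURS)]

# Credits mirror cost exactly, but each hour's credit posts LAG hours later.
credits = [0.0] * HOURS
for h in range(HOURS):
    if h + LAG < HOURS:
        credits[h + LAG] = -cost[h]

# Running net cost, as a real-time monitor would see it.
net_running = []
total = 0.0
for h in range(HOURS):
    total += cost[h] + credits[h]
    net_running.append(round(total, 2))

# Net rises while credits lag, then settles back to $0 once they land.
print(max(net_running))   # transient peak: 30.0
print(net_running[-1])    # settled total: 0.0
```

Both series end at the same magnitude, yet their real-time difference produces a nonzero plateau exactly as long as the lag: the phantom anomaly in miniature.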
Why does this happen more often in Google Cloud?
Because Google Cloud exports billing data to BigQuery frequently throughout the day (rather than in a once-daily batch), anomaly detection can begin much earlier than on clouds where billing data arrives in larger, delayed batches. That’s a real advantage—but it also makes phantom anomalies more likely when offsets like credits arrive later.
In contrast, AWS and Microsoft Azure tend to hide more of this behavior behind slower or more aggregated delivery. Google Cloud surfaces both the upside and the risk of working closer to real time.
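You can observe this settling behavior directly in the export itself by comparing when usage occurred with when it was exported. The query below is a hedged sketch against the standard Cloud Billing detailed export schema (`usage_start_time`, `export_time`, `cost`, and the repeated `credits` record); the table name and date parameter are placeholders, not values from this article.

```python
# Sketch: how far behind usage do export rows (and their credits) arrive?
# Assumes the standard Cloud Billing detailed export schema; table name and
# @usage_date are placeholders to be bound by the caller.
BILLING_TABLE = "my_project.my_dataset.gcp_billing_export_v1_XXXXXX"  # placeholder

LAG_QUERY = f"""
SELECT
  TIMESTAMP_TRUNC(usage_start_time, HOUR) AS usage_hour,
  ROUND(SUM(cost), 2) AS cost,
  ROUND(SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) c), 0)), 2) AS credits,
  TIMESTAMP_DIFF(MAX(export_time), MIN(usage_start_time), HOUR) AS worst_lag_hours
FROM `{BILLING_TABLE}`
WHERE DATE(usage_start_time) = @usage_date  -- bind via a BigQuery query parameter
GROUP BY usage_hour
ORDER BY usage_hour
"""
```

Hours where `cost` is already nonzero but `credits` is still near zero, or where `worst_lag_hours` is large, are exactly where phantom spikes can appear.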
Speed vs. cleanliness: A tradeoff you can tune
This behavior highlights a key FinOps tradeoff:
- Early detection gives you faster insights but may include short-lived false positives.
- Waiting for billing to settle produces cleaner data but delays your response.
With Ternary, you can tune this tradeoff directly. By adjusting the anomaly grace period, you can choose your own balance between faster detection and a lower transient false-positive rate.
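The grace-period idea can be sketched as a simple filter: suppress an alert until the anomaly has persisted longer than the grace window. This is an illustration of the concept only, not Ternary’s actual implementation; the function name and the 12-hour window are assumptions.

```python
from datetime import datetime, timedelta

def should_alert(anomaly_start: datetime, now: datetime,
                 grace_period: timedelta) -> bool:
    """Alert only once an anomaly has outlived the grace period.

    A short grace period favors fast detection; a longer one waits out
    transients such as lagging credits. Illustrative, not Ternary's code.
    """
    return now - anomaly_start >= grace_period

start = datetime(2025, 7, 29, 8, 0)   # example timestamp, chosen for illustration
grace = timedelta(hours=12)           # hypothetical tuning choice

print(should_alert(start, start + timedelta(hours=6), grace))   # still in grace: False
print(should_alert(start, start + timedelta(hours=18), grace))  # persisted: True
```

A 12-hour window would have silently absorbed the phantom spike above, at the cost of a 12-hour delay on any genuine runaway spend.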
The takeaway
“Phantom anomalous spend” isn’t a bug. It’s a natural consequence of real-time detection layered on top of eventually consistent cloud billing. The goal isn’t to eliminate these signals entirely but to provide enough context and control to distinguish real runaway spending from short-lived reconciliation artifacts.
Because in cloud cost monitoring, when you look often matters as much as what you see.
See Ternary Anomaly Detection in action.