What Really Matters in a Benchmark? Ten Questions to Ask Your Vendor

May 22, 2013 by Dave Labuda

As the market for charging and policy systems has heated up thanks to explosive mobile broadband and application growth, so has the rhetoric around performance. Many vendors promote their own benchmarks, which creates confusion for CSP buyers. In reality, the only benchmarks that matter are those executed in conjunction with operators on their premises. To sort through the chaos and extravagance, CSPs need to know the right questions to ask, and factors to consider, when determining whether a proffered benchmark has real meaning.

What are the claims?

We see some eye-popping claims in the charging and policy market that appear tough to justify. For example, we’ve seen vendors cite enormous transactions-per-second (TPS) volumes, claiming they could manage every call in the world in real time on a single system. We have also seen sudden 100 percent improvements in performance, ultra-low latency, and even the ability to support 3 billion users, half the world’s mobile population, on a single platform.

Performance and Scalability Matter

Whether or not any of these claims have merit, it’s clear that performance and scalability are very important today. There’s little or no debate that real-time transaction processing capabilities are necessary to monetize mobile data and LTE services. That said, traditional real-time processing is 10 times as expensive as batch processing. This is a real problem because traffic growth is far outpacing revenue growth. Operators already maintain massive infrastructure and data centers, so there is increasing pressure to reduce operational costs. Simply scaling up existing real-time infrastructure isn’t sustainable; it costs too much. In response, operators are pushing increasing TPS requirements, coupled with stringent SLAs, into their RFIs and RFPs.

10 Factors to Consider When Evaluating Benchmarks

When benchmark numbers come back in response to these RFPs, how can operators make sense of them? Here are 10 things operators should consider as they wade through the measurements.

1. Who executed the benchmark? A benchmark conducted by a vendor within its own lab environment isn’t worth much. It’s important to look for benchmarks that CSPs have executed on their premises with their actual price plans and business models in place. This is the best way to get a reliable benchmark for how a solution will perform in production.

2. Are the measurements clearly defined? There’s no industry standard definition for a real-time transaction, so what’s measured needs to be defined clearly. The best measurement in a mobile charging and policy environment is from “Diameter in, to Diameter out,” including transaction replication. This measure should cover the full path of the transaction: entering the system, passing through processing and synchronous replication, and returning the response to the network. If the measurement only includes one piece of this puzzle, it will be misleading and of less value.

3. Which product was measured? It needs to be 100 percent clear which commercially released product was measured when the benchmark was conducted. The benchmark should measure an available, off-the-shelf solution, not one run on customized versions of hardware or software.

4. What hardware was used? The hardware configuration used in the benchmark must be clearly stated. This way the CSP can compare benchmarks while accounting for the size, or scale, and the cost of the hardware configuration.

5. Was a high availability configuration used for the benchmark? Real-time charging and policy systems must execute in a fully redundant, highly available production environment to ensure they meet operator SLAs. Often, performance claims are overstated because the processing excludes synchronous replication and/or disk logging. Any benchmark based on an unrealistic or non-production configuration is useless in measuring actual performance and scalability.

6. Were subscribers fragmented? Some vendors create parallelism by dividing the subscriber database into many small sets of data that run independently. While this generates ‘big performance numbers,’ it’s not a production-worthy configuration. In reality, breaking subscribers up creates an administrative nightmare from both a system management and a system integration perspective. Additionally, with the popularity of data sharing plans, this approach creates artificial boundaries between subscribers that inhibit the ability to share balances.

7. Are the business scenarios described and published? If, for example, the benchmark is based on 1 billion one-minute phone calls all rated at 10 cents, the test isn’t based on realistic conditions. To get true performance stats, the system has to perform against a mix of actual usage and traffic patterns that includes complex consumer and corporate tariffs and discounts.

8. Is the subscriber-to-traffic ratio realistic? It doesn’t mean much to claim that a system can support 250 million subscribers if the benchmark only reflects 20 percent of what would be a peak traffic load. Be especially wary of mobile data benchmarks that use artificially large quotas to bring down the traffic load.

9. Are authorization and resource reservation included in online charging scenarios? They have to be included for all billable transactions. Omitting them effectively defeats the purpose of measuring real-time performance.

10. Can the benchmark be repeated? Like any scientific finding, the results aren’t fact if they can’t be replicated. The benchmark should be repeatable on the CSP’s site. If it can’t be, or a vendor won’t stand by its published numbers, it’s a good bet there was some manipulation or fuzzy math involved.
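
To make the arithmetic behind factors 4 and 8 concrete, here is a minimal Python sketch of how a CSP might normalize a vendor's headline numbers. Everything in it is an assumption for illustration: the `BenchmarkClaim` fields, the function names, and the sample figures (50,000 TPS, a hardware cost of 2,000,000, a peak load of 1,000 TPS per million subscribers) are hypothetical and not taken from any real benchmark. Only the 250 million subscribers and the 20 percent coverage echo the example in factor 8.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkClaim:
    """Headline figures from a vendor benchmark (all fields hypothetical)."""
    subscribers: int       # subscribers provisioned during the test
    measured_tps: float    # transactions per second actually sustained
    hardware_cost: float   # cost of the stated hardware, in any one currency


def peak_load_coverage(claim: BenchmarkClaim,
                       peak_tps_per_million_subs: float) -> float:
    """Fraction of the CSP's expected peak load that the benchmark exercised.

    peak_tps_per_million_subs is the CSP's own estimate of busy-hour
    traffic, not a number supplied by the vendor.
    """
    required_tps = (claim.subscribers / 1_000_000) * peak_tps_per_million_subs
    return claim.measured_tps / required_tps


def cost_per_tps(claim: BenchmarkClaim) -> float:
    """Hardware cost divided by sustained TPS, for comparing configurations."""
    return claim.hardware_cost / claim.measured_tps


# Illustrative figures only: a "250 million subscriber" claim backed by a
# test that sustained 50,000 TPS on hardware costing 2,000,000.
claim = BenchmarkClaim(subscribers=250_000_000,
                       measured_tps=50_000.0,
                       hardware_cost=2_000_000.0)

coverage = peak_load_coverage(claim, peak_tps_per_million_subs=1_000.0)
print(f"Peak load covered: {coverage:.0%}")
print(f"Hardware cost per TPS: {cost_per_tps(claim):.2f}")
```

With these assumed inputs, the claimed subscriber count would need 250,000 TPS at peak, so a benchmark that sustained 50,000 TPS covers only a fifth of it. Running the same normalization over each vendor's response puts otherwise incomparable claims on a common footing, provided the CSP supplies its own peak traffic estimate rather than accepting the vendor's.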
