

5 Ways to Negotiate Performance SLAs


Date 17 December 2017

Author Dr. Manzoor Mohammed

1) Use more than one measure for SLA response time

A service provider may be able to meet an average response time target or a percentile response time target, but not both. If the SLA specifies only one of these measures, the provider can be compliant while the customer still experiences unacceptable response times.

The classic real-world example that most people can relate to is train punctuality. A train operator may report that 90% of train journeys are on time; however, the 10% that are late tend to occur during the busy hour. The customer's perception of the service is based on those 10% of late journeys.
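As a minimal sketch of why one measure is not enough, the Python below checks a set of response times against both an average target and a percentile target. The sample values, the targets, and the helper names (`percentile`, `check_sla`) are hypothetical and chosen only for illustration; the nearest-rank percentile method is one common convention, and an SLA should state which method it uses.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def check_sla(samples, mean_target, pct, pct_target):
    """Evaluate response times (seconds) against an average and a percentile target."""
    mean = sum(samples) / len(samples)
    pct_value = percentile(samples, pct)
    return {
        "mean": mean,
        "percentile": pct_value,
        "mean_ok": mean <= mean_target,
        "percentile_ok": pct_value <= pct_target,
    }


# Hypothetical data: 90% of requests are fast, 10% are very slow (the "late trains").
late_trains = [0.2] * 90 + [60.0] * 10
# The 90th percentile target is met, but the average target is missed.
case_a = check_sla(late_trains, mean_target=1.0, pct=90, pct_target=2.0)

# Hypothetical data: the slow 10% are only moderately slow.
slow_tail = [0.2] * 90 + [5.0] * 10
# The average target is met, but the 95th percentile target is missed.
case_b = check_sla(slow_tail, mean_target=1.0, pct=95, pct_target=2.0)

print(case_a)
print(case_b)
```

In both cases an SLA that named only the measure that passes would report compliance, which is the argument for stating both.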

2) Use a meaningful sample for measuring

In certain circumstances, the number of measurements taken will be too small to give a meaningful average or percentile response time. The SLA should define the minimum number of measurements required in each reporting period.

This must be balanced against the observer effect (often loosely attributed to Heisenberg's uncertainty principle): the act of measuring a system changes its behaviour. Excessive sampling adds load of its own and may degrade service response time.
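One way to put a number on "meaningful" is sketched below in Python. It is not the author's method. The first function uses the standard sample-size formula for estimating a mean to within a margin of error, assuming a normal approximation and a known standard deviation. The second uses a simple rule of thumb, an assumption made here, that a percentile estimate needs some minimum number of observations in the tail above it. The default values (`z=1.96`, `tail_observations=10`) are illustrative.

```python
import math


def min_samples_for_mean(std_dev, margin, z=1.96):
    """Samples needed to estimate the mean to within +/- margin.

    Uses n = (z * sigma / E)^2; z=1.96 corresponds to roughly 95% confidence
    under a normal approximation (an assumption, not part of any SLA standard).
    """
    return math.ceil((z * std_dev / margin) ** 2)


def min_samples_for_percentile(pct, tail_observations=10):
    """Samples needed so that about tail_observations fall above the pct-th percentile.

    tail_observations=10 is an arbitrary rule-of-thumb value for illustration.
    """
    return math.ceil(tail_observations * 100 / (100 - pct))


# Hypothetical figures: standard deviation 0.5 s, mean wanted to within 0.05 s.
n_mean = min_samples_for_mean(std_dev=0.5, margin=0.05)
n_p95 = min_samples_for_percentile(95)
n_p99 = min_samples_for_percentile(99)
print(n_mean, n_p95, n_p99)
```

Higher percentiles need far more measurements than averages, which is worth knowing before agreeing to report a 99th percentile on a low-volume service.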

3) Determine the arrival rate distribution

If this isn't defined, it's possible for the customer to batch requests and send them all at once to the service. The average arrival rate over the measurement period will be the same; however, the intensity of the arrivals has a significant impact on the service's ability to meet SLA response times.
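The effect of batching can be illustrated with a small simulation. The Python sketch below assumes a deliberately simple model that is not taken from the article: a single server, first-in-first-out queueing, and a fixed service time per request. Both workloads send 60 requests in 60 seconds, so the average arrival rate is identical; only the distribution of arrivals differs.

```python
def response_times(arrivals, service_time):
    """Response time (queueing wait + service) per request for a single-server FIFO queue."""
    free_at = 0.0  # time at which the server next becomes idle
    results = []
    for arrival in sorted(arrivals):
        start = max(arrival, free_at)
        free_at = start + service_time
        results.append(free_at - arrival)
    return results


SERVICE_TIME = 0.5  # seconds per request (hypothetical)

# Same average rate (1 request/second over 60 s), different arrival patterns.
evenly_spaced = [float(second) for second in range(60)]
single_batch = [0.0] * 60

even_rt = response_times(evenly_spaced, SERVICE_TIME)
batch_rt = response_times(single_batch, SERVICE_TIME)

even_mean = sum(even_rt) / len(even_rt)
batch_mean = sum(batch_rt) / len(batch_rt)
print("evenly spaced: mean", even_mean, "max", max(even_rt))
print("single batch:  mean", batch_mean, "max", max(batch_rt))
```

Under these assumptions the evenly spaced requests never queue, while the batched requests wait behind one another, so the same average rate produces very different response times.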

4) Use an appropriate distribution for calculating response time percentiles

If the average response time target is known, a percentile response time target may be derived from it using a probability distribution. Two probability distributions are typically used: exponential and normal.

For a given percentile, the exponential distribution will typically predict a higher response time than the normal distribution, provided the normal distribution does not have a large standard deviation relative to its mean. A higher percentile target is easier to meet, so an SLA derived from an exponential distribution will favour the service provider over the client.
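The comparison can be sketched in Python using the closed-form exponential quantile, -mean × ln(1 − p), and the standard library's `statistics.NormalDist` (Python 3.8+). The average target of 1.0 s and the standard deviations below are hypothetical; the function names are illustrative.

```python
import math
from statistics import NormalDist


def exponential_percentile(mean, pct):
    """pct-th percentile of an exponential distribution with the given mean."""
    return -mean * math.log(1 - pct / 100)


def normal_percentile(mean, std_dev, pct):
    """pct-th percentile of a normal distribution with the given mean and standard deviation."""
    return NormalDist(mu=mean, sigma=std_dev).inv_cdf(pct / 100)


MEAN_TARGET = 1.0  # seconds (hypothetical average response time target)

exp_p95 = exponential_percentile(MEAN_TARGET, 95)
normal_p95_small_sd = normal_percentile(MEAN_TARGET, std_dev=0.5, pct=95)
normal_p95_large_sd = normal_percentile(MEAN_TARGET, std_dev=1.5, pct=95)

print("exponential 95th percentile:      ", round(exp_p95, 3))
print("normal 95th percentile (sd = 0.5):", round(normal_p95_small_sd, 3))
print("normal 95th percentile (sd = 1.5):", round(normal_p95_large_sd, 3))
```

With the smaller standard deviation the exponential model yields the looser 95th percentile target; with the larger one the ordering reverses, which is why the standard deviation assumed for the normal case should be agreed explicitly.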

5) Determine your average and maximum throughput rates

Service performance cannot be measured by response time alone; throughput is also a key performance measure. When defining an SLA, both average and maximum throughput should be stated. An SLA based only on average throughput will hinder the service provider in ensuring that appropriate capacity is in place to meet the SLA response times.
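As a rough illustration of how far the two figures can diverge, the Python sketch below derives average and maximum throughput from per-interval request counts. The hourly counts are invented for the example, and `throughput_stats` is a hypothetical helper name.

```python
def throughput_stats(counts, interval_seconds):
    """Return (average, maximum) throughput in requests per second from per-interval counts."""
    rates = [count / interval_seconds for count in counts]
    return sum(rates) / len(rates), max(rates)


# Hypothetical day: 20 quiet hours at 3,600 requests/hour, 4 busy hours at 36,000 requests/hour.
hourly_counts = [3600] * 20 + [36000] * 4

average_rps, peak_rps = throughput_stats(hourly_counts, interval_seconds=3600)
peak_to_average = peak_rps / average_rps
print("average:", average_rps, "req/s, peak:", peak_rps, "req/s, ratio:", peak_to_average)
```

Capacity sized for the average in this example would be well short of what the busy hours require, so the SLA needs the maximum as well.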

 

About the author

Dr. Manzoor Mohammed

Manzoor co-founded Capacitas and pioneered its core principle and methodology – treating performance and cost as inseparable. His work has delivered quantifiable impact and value for clients, including Tinder, Qualtrics, Ancestry, and Cegid. He now leads the firm’s thinking on AI infrastructure, applying proven optimisation principles to a new generation of computationally demanding workloads.
