
Uncovering the limitations of Amazon EC2 burstable instances

Nowadays, cloud computing is a mature, widely established model to efficiently develop and operate applications at scale. Despite this, the limitations that lurk beneath cloud vendor abstractions can still easily take engineers by surprise. Learn from my experience and avoid the pitfalls I recently encountered working with the Amazon EC2 burstable instance model.

Project background

I joined a product development team to help them increase the capacity and observability of their B2B e-commerce application. The aim was to make the system resilient against an anticipated peak traffic season looming ever closer.

The tech stack comprised two PHP web applications running on a small cluster of Amazon EC2 instances distributed across two environments, namely:

  • Development: dev-1, dev-2 and dev-3
  • Production: prod-1, prod-2 and prod-3

 

We had just added several alerts and dashboards to Datadog, our centralised application monitoring platform. The configuration of the alerts was mirrored across both environments to obtain early warning of concerning trends in Development before they had the chance to emerge in Production.

A busy server

The team set up an alarm to monitor CPU utilisation for all the EC2 instances in the platform. The alarm would trigger if CPU consumption remained higher than 60% on average for a few minutes. Some time later, the alarm fired and a notification appeared in our non-production alerts channel.
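
As a rough illustration of the alarm's logic, the sketch below checks whether the average CPU utilisation over a sliding window stays above a threshold. The 60% threshold comes from the alarm described above; the five-sample window, the function name and the sample data are assumptions made for illustration, not the team's actual Datadog monitor configuration.

```python
from collections import deque
from typing import Iterable, List


def breaches(samples: Iterable[float], threshold: float = 60.0,
             window: int = 5) -> List[bool]:
    """For each sample, report whether the rolling average over the last
    `window` samples exceeds `threshold`. Nothing fires until the window
    is full, so a single short spike does not trigger the alarm."""
    recent = deque(maxlen=window)
    results = []
    for value in samples:
        recent.append(value)
        window_full = len(recent) == window
        results.append(window_full and sum(recent) / window > threshold)
    return results


if __name__ == "__main__":
    # Hypothetical one-minute CPU utilisation samples (%).
    cpu = [20, 95, 30, 70, 80, 90, 85, 75, 40, 10]
    print(breaches(cpu))
```

Averaging over a window is what makes the alarm tolerant of brief spikes: only sustained load pushes the rolling average over the threshold.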

 

There is an array of reasons why available CPU capacity can dip dramatically for a short period on a development server, even if the server is typically quiet:

  • Someone running ad-hoc tests on the box
  • An expensive, scheduled process or web request
  • A large system update

The first time this happened we didn’t put much effort into discerning the underlying cause. We used the above mundane reasons to explain the problem away and moved on to more pressing matters. However, this wasn’t the last we’d see of this alert.

The mysterious spike strikes again

Following this first instance, it wasn’t long before the high CPU utilisation alarms for dev-2 became a regular occurrence.

We could no longer dismiss this as a transient issue, so we decided to look closer at the problem. Sure, testing in the environment could increase the workload on the instance, but not to the extreme levels our alerts were showing. Even weirder, neither dev-1 nor dev-3 was suffering from the same problem! Nonetheless, there were a few obvious explanations we could rule out to narrow things down.

A denial-of-service attack

Our system had recently become the target of a few low-scale DDoS attacks. Crawlers and bots, both benign and malicious, probe our public endpoints routinely for weaknesses so this explanation didn’t sound far-fetched at all. However, neither our quiet traffic gauges nor the uninteresting makeup of our incoming HTTP requests backed up this hypothesis.

The instance was genuinely busy dealing with legitimate traffic

One peculiarity of dev-2 was that it was the only instance in the Development cluster that processed messages from our application queues, which could explain why its resource utilisation graphs looked busier than the other two. Then again, worker thread concurrency was set to 1 and the message processing activities weren’t particularly taxing. We didn’t see an uptick in incoming HTTP requests or any kind of heavy transactions that could drive up compute this much either, so we weren’t satisfied we had found the problem yet.

Once we had ruled out the simplest explanations, we began to consider giving up and launching a new instance to replace the poorly performing dev-2, writing off the issue as server misconfiguration or data corruption. While it is common practice to destroy servers in the cloud to squash bizarre, localised problems, I was slightly concerned the same issue might eventually recur in a different Development instance or, worse, in Production. To assuage my worries, I took one final look at the detailed CPU usage breakdown metrics Datadog exposes about the hosts it monitors.

The metric system.cpu.stolen (in yellow) immediately drew my attention as it was shown as the single biggest contributor to the high CPU figures we were registering. But what is system.cpu.stolen exactly? Who’s stealing our CPUs from us, and why would they do that?

The breakthrough

The Datadog documentation explains that the system.cpu.stolen metric measures the share of CPU time that the hypervisor steals from one of its guest virtual machines (VMs). When a VM hypervisor steals cycles from a guest VM, it suspends or throttles the VM’s CPU resources to reallocate them to a different VM in the cluster it manages. In the vast scale of the public cloud, this rebalancing occurs to mitigate hardware constraints at the data centre; it is the way hypervisors ensure fair distribution of their finite physical resources.

Another important thing to note at this point is that every EC2 instance provisioned in both our Development and Production environments was a t3.micro, a low-cost, 3rd generation burstable instance type. What makes burstable instances different to other general-purpose compute offerings in EC2 is that they’re cheaper and, critically, that they are expected to operate on a low CPU utilisation baseline most of the time. They are capable of automatically increasing (bursting) their base compute capacity in response to a spike in demand, although they will not keep this up for too long (for the same price, that is).

The amount of time a burstable instance can exceed its nominal performance levels is defined by its allotted CPU credits, which AWS documents for each instance size. A t3.micro has a 10% CPU utilisation baseline and earns 12 credits per hour, so its credit balance grows as long as CPU usage remains below that baseline. If the instance exceeds this threshold, it will start consuming burst credits. What happens to the instance when all of its burst credits are depleted depends on its credit specification:
  • Standard: consumes accrued credits to burst beyond its baseline capacity. Once all burst credits are used up, the CPU will be gradually throttled back down to match its baseline.
  • Unlimited: the instance is never throttled, even if all of its credits are drained. Instead, Unlimited burst instances consume surplus credits, an additional type of burst currency specific to this specification which incurs additional fees for every hour it continues to run above its baseline.
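
To make the credit arithmetic concrete, here is a minimal sketch of how quickly a Standard instance can drain its balance. It relies on AWS's definition of one CPU credit as one vCPU running at 100% for one minute, plus the 12 credits per hour quoted above. The 2 vCPUs and the 288-credit starting balance (24 hours of earnings, the t3.micro cap) are figures from the AWS documentation rather than this article, and the function names and steady-load model are illustrative assumptions.

```python
from typing import Optional

MINUTES_PER_HOUR = 60


def baseline_pct(vcpus: int, credits_per_hour: float) -> float:
    """Per-vCPU utilisation (%) at which credits are spent as fast as earned."""
    return credits_per_hour / (MINUTES_PER_HOUR * vcpus) * 100


def hours_until_depleted(avg_cpu_pct: float, balance: float, vcpus: int = 2,
                         credits_per_hour: float = 12) -> Optional[float]:
    """Hours until `balance` reaches zero at a steady average utilisation,
    or None if the instance earns at least as much as it spends."""
    spent_per_hour = vcpus * (avg_cpu_pct / 100) * MINUTES_PER_HOUR
    net_per_hour = credits_per_hour - spent_per_hour
    if net_per_hour >= 0:
        return None
    return balance / -net_per_hour


if __name__ == "__main__":
    print(baseline_pct(2, 12))              # baseline implied by 12 credits/hour
    print(hours_until_depleted(30, 288))    # sustained 30% from a full balance
    print(hours_until_depleted(5, 288))     # below baseline: never depletes
```

Under these assumptions, 12 credits per hour across 2 vCPUs works out to the 10% baseline. A full balance lasts roughly 12 hours at a sustained 30% average utilisation, after which a Standard instance is throttled.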

Now that we knew all of this, I was able to piece together a very strong hypothesis that dev-2 had been routinely exceeding its low compute baseline, depleting all of its burst credits and ending up throttled by AWS.

Here are some CloudWatch metrics detailing the CPU credit balance across all of our burst-capable Development EC2 instances at the time:

The two graphs above depict an EC2 instance on a Standard credit specification running out of credits and being throttled as a result. They established a direct link between dev-2’s depleted credit balance and the CPU resource starvation alerts we had been regularly seeing on it. At this stage, the remaining balance for dev-2 was nearly 0. It didn’t accrue any credits as it rarely remained under its baseline 10% CPU utilisation target.
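
For readers who want to pull the same data programmatically, the sketch below builds a request for the CPUCreditBalance metric that CloudWatch publishes for burstable instances. It assumes the boto3 SDK with credentials already configured. The helper names, the six-hour look-back and any instance ID passed in are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def credit_balance_query(instance_id, end=None, hours=6, period_seconds=300):
    """Build keyword arguments for CloudWatch's GetMetricStatistics call."""
    end = end or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUCreditBalance",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,  # credit metrics are reported every 5 minutes
        "Statistics": ["Average"],
    }


def fetch_credit_balance(instance_id, cloudwatch=None):
    """Return (timestamp, average balance) pairs, oldest first."""
    if cloudwatch is None:
        import boto3  # imported lazily so the query builder works without the SDK
        cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**credit_balance_query(instance_id))
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Average"]) for p in points]
```

A balance that hovers near zero across consecutive datapoints is the pattern described above for dev-2.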

Solving this problem was far easier for us than identifying it, as all we had to do was flip the credit specification of dev-2 to Unlimited. This is something you can do on a running EC2 instance. We might incur slightly higher EC2 costs as a result if the baseline is surpassed regularly, but we found it was worth it as the price difference was negligible in a tiny compute cluster of t3.micro instances.
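
The change can be made from the console or the CLI. As a sketch of the same operation through the boto3 SDK (an assumption, since the article does not say which tool was used), the helper below switches one instance to Unlimited and reads the setting back. The function name and the instance ID in the comment are hypothetical.

```python
def set_unlimited(ec2, instance_id):
    """Switch a burstable instance to the Unlimited credit specification
    and return the value EC2 reports afterwards."""
    ec2.modify_instance_credit_specification(
        InstanceCreditSpecifications=[
            {"InstanceId": instance_id, "CpuCredits": "unlimited"}
        ]
    )
    described = ec2.describe_instance_credit_specifications(
        InstanceIds=[instance_id]
    )
    return described["InstanceCreditSpecifications"][0]["CpuCredits"]


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   set_unlimited(boto3.client("ec2"), "i-0123456789abcdef0")
```

Passing the client in as an argument keeps the helper easy to exercise against a stub before pointing it at a real account.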

 

Even more so than the cost of our Development environment bill, we were concerned about the possibility that our Production EC2s were subject to the same limitation, as they are also t3.micro. Fortunately, this was not the case. The default credit specification for new burstable EC2 instances for our account/region combination had been previously configured to be Unlimited. dev-2 might have been set up during a time when the default account-wide setting was Standard.
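
A check like this can be scripted. The sketch below, again assuming boto3 and using hypothetical helper names, reads the account-level default for an instance family in the current region and lists which of a given set of instances are still on Standard.

```python
def default_credit_spec(ec2, family="t3"):
    """Return the default credit option ("standard" or "unlimited") that the
    account applies to new instances of this family in the client's region."""
    response = ec2.get_default_credit_specification(InstanceFamily=family)
    return response["InstanceFamilyCreditSpecification"]["CpuCredits"]


def standard_instances(ec2, instance_ids):
    """Return the IDs, out of `instance_ids`, that are on the Standard option."""
    response = ec2.describe_instance_credit_specifications(
        InstanceIds=list(instance_ids)
    )
    return [
        spec["InstanceId"]
        for spec in response["InstanceCreditSpecifications"]
        if spec["CpuCredits"] == "standard"
    ]


# Example usage (requires boto3 and AWS credentials; IDs are placeholders):
#   import boto3
#   ec2 = boto3.client("ec2")
#   print(default_credit_spec(ec2), standard_instances(ec2, ["i-0123456789abcdef0"]))
```

Because the default only applies at launch time, older instances can keep a different setting from the account default, which is why both checks are useful.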

Conclusions

One of the most surprising things I learned from this was how conservative the EC2 burstable CPU capacity baselines are and how easy it is to breach them if you’re not careful. Up to this point, this had been pretty far down my list of potential reasons why an EC2 instance could misbehave.

Provisioning burstable EC2 instances on a Standard credit specification is the cheapest way to run EC2 on-demand instances. However, it is important to familiarise ourselves with the intricate rules that underpin its billing mechanism; otherwise we risk wasting time troubleshooting easily preventable resource saturation issues in the cloud.

Regular performance testing and routine monitoring of our services help us develop a mental model of what normal usage looks like in our systems. This allows us to plan for capacity and minimise costs by ensuring our infrastructure can operate smoothly without breaching vendor-imposed restrictions.

 
