Capacity Planning for Backend Applications
In this post, capacity planning primarily means estimating the infrastructure you need so that you have enough resources to handle projected traffic and the computational demands of new features. Much of sizing and capacity planning is a semi-scientific approach.
Capacity Planning: The Three Cases
- Feature Based – When releasing a new feature in an existing application.
- Seasonal – When you are planning for a particular event, like Black Friday. Check in with your marketing and growth teams on what their projections are.
- Intrinsic Growth
Systematic Approach to Capacity Planning: Three Basic Steps
- Analyze Current Capacity – What are your current memory, CPU and I/O utilization, and what are their max, min, average and 98th percentile values? This should give a good view of your current statistics, and you can even extrapolate how much extra load your servers can handle. If there is an API SLA, what is it and how close are you to it?
- Determine New Load Requirements – As mentioned earlier, it’s hard to predict the expected load very accurately.
- Future Capacity Planning – What are the projected estimates for traffic?
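The first step can be sketched in a few lines of Python. This is a minimal illustration, not a monitoring tool: the sample values, the function names `summarize` and `extra_load_factor`, and the assumption that utilization grows linearly with load are all hypothetical.

```python
import math
import statistics


def summarize(samples):
    """Return max, min, average and 98th percentile of utilization samples (percent)."""
    ordered = sorted(samples)
    # Nearest-rank percentile: the smallest sample with at least 98% of samples at or below it.
    rank = math.ceil(0.98 * len(ordered))
    return {
        "max": ordered[-1],
        "min": ordered[0],
        "avg": statistics.mean(ordered),
        "p98": ordered[rank - 1],
    }


def extra_load_factor(p98_utilization, target_utilization=50.0):
    """Rough multiple of today's load that fits before p98 reaches the target.

    Assumption: utilization scales linearly with load, which is only an approximation.
    """
    return target_utilization / p98_utilization


# Hypothetical CPU utilization samples (percent), e.g. one per minute.
cpu_samples = [22, 25, 24, 30, 28, 35, 41, 27, 26, 33]
stats = summarize(cpu_samples)
growth = extra_load_factor(stats["p98"])
print(stats, round(growth, 2))
```

The same summary can be run for memory and I/O samples. A factor below 1 would mean the servers are already past the target utilization.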
Accuracy of Projections
You can strive to be accurate with the projected load, but you can never truly be certain. Think about a big store like Macy’s trying to predict how many customers are going to walk through its doors on Black Friday or Boxing Day. Macy’s, or any retail store, knows for sure that more customers than usual will visit its physical locations. However, predicting how many people will visit depends on many factors, some known and some unknown. And even if you know what the factors are, understanding how they affect your business is still a prediction that may or may not be accurate.
Similarly, imagine releasing a new feature on a mobile application. How many users will exercise that one feature is an educated guess. If the feature catches on and is popular, it will cause a lot more traffic than expected. The answer also depends on whether it is for an existing app with a steady stream of users or for a brand new app, and on whether other things, like seasonal demand, are expected.
Therefore, you always provision more infrastructure than you estimate; the norm is to have 50% headroom. This rule deliberately forces you to over-provision. It is a recommended practice that gives you time to monitor and react if you have an overwhelming increase in traffic. While capacity planning, it is important to know how much time your team would need to increase capacity and the steps that need to be followed to do so. Other reasons for leaving 50% headroom are to accommodate hardware failure, a DoS attack and the unpredictability of software. Also, if your team does deployments where you take a couple of VMs out of the pool and update them with the new build, then your application farm needs to be able to run without the servers you just took out of rotation. As a platform, you are also vulnerable to rogue clients who intentionally or unintentionally bombard your services with requests. All these uncertainties dictate the need for 50% headroom.
If you feel 50% headroom is over-provisioning, you can go down to 40%, but never less than that.
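As a rough illustration of what the headroom rule means for server counts, here is a minimal sketch. It assumes headroom is the fraction of each server's capacity kept unused at projected peak. The function name `servers_needed` and all numbers are hypothetical.

```python
import math


def servers_needed(peak_rps, rps_per_server, headroom=0.5, out_of_rotation=0):
    """Servers required so that projected peak load leaves `headroom` unused.

    out_of_rotation: servers temporarily removed during a rolling deployment;
    the remaining pool must still carry the load with full headroom.
    """
    usable_rps = rps_per_server * (1.0 - headroom)
    return math.ceil(peak_rps / usable_rps) + out_of_rotation


# Hypothetical figures: 3000 requests/s at peak, 400 requests/s per server at saturation.
print(servers_needed(3000, 400))                     # 50% headroom
print(servers_needed(3000, 400, out_of_rotation=2))  # plus 2 servers out for deployment
print(servers_needed(3000, 400, headroom=0.4))       # the 40% lower bound
```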
Rogue Clients & Incorrect Client Implementation
E.g. imagine a mobile app for your public library, where for every search the app needs to check whether you have already read the books in the search results. Considering that an average person reads 30 books a year, is it not better for the app to store those 30 books on the device itself rather than ask the server, every time a search result is returned, whether the user has read that book? Such architectural considerations are important to understand while designing systems. It’s good to have the backend teams review client implementations, but it is not a very practical approach.
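The difference in server load between the two client designs can be shown with a toy simulation. Everything here is hypothetical: `LibraryServer`, `annotate_naive` and `CachingClient` are stand-ins invented for illustration, not a real library API.

```python
class LibraryServer:
    """Stand-in for the backend; counts how many requests it receives."""

    def __init__(self, read_books):
        self._read_books = set(read_books)
        self.request_count = 0

    def has_read(self, book_id):
        self.request_count += 1
        return book_id in self._read_books

    def list_read_books(self):
        self.request_count += 1
        return set(self._read_books)


def annotate_naive(server, search_results):
    """Naive client: one server request per book in every search result."""
    return {book: server.has_read(book) for book in search_results}


class CachingClient:
    """Client that syncs the read list once and answers locally afterwards."""

    def __init__(self, server):
        self._read_locally = server.list_read_books()

    def annotate(self, search_results):
        return {book: book in self._read_locally for book in search_results}


search_results = [f"book-{i}" for i in range(10)]
read = ["book-2", "book-7"]

naive_server = LibraryServer(read)
for _ in range(3):  # three searches
    naive_flags = annotate_naive(naive_server, search_results)

cached_server = LibraryServer(read)
client = CachingClient(cached_server)
for _ in range(3):  # the same three searches
    cached_flags = client.annotate(search_results)

print(naive_server.request_count, cached_server.request_count)
```

A real client would also need to refresh the local list when the user finishes a new book; that is left out of this sketch.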
Things to Consider while Capacity Planning
The most important thing is first to identify all the components of your backend applications. Then you can analyse and understand more about them.
- Memory Utilization
- CPU Utilization
- I/O Utilization
- App performance while batch jobs run
- Consider upstream and downstream systems and how they affect your backend application
- Is your data being read or written?
- Proxy settings
- App connection and timeout settings
It is key to understand the characteristics of your application: is it CPU intensive or memory intensive? It’s important to understand the footprint of your various applications, because you don’t always need to add more VMs; you could simply add exactly what your application needs.
For application servers, it’s fairly easy to increase capacity by adding VMs to your load balancers. However, horizontally scaling your data store is a relatively complex task and also depends on the maturity of your team and their skills. Increasing the number of nodes in your cluster, or any kind of scaling, without careful pre-planning is not advisable. It is also important to look at the pattern in which the DB is going to be used: are there going to be increased reads or writes? What is the current capacity of the database? What are the current queries per second (q/s) and transactions per second (t/s)?
You might also want to check whether you need to increase the capacity of your Logstash deployment.
Also, while planning your capacity, keep in mind the capacity of the following and add enough headroom for each:
- Load Balancers – Most often forgotten.
- Network Gear
- Network Bandwidth of Data Center
- Dependent systems (Upstream and downstream systems and APIs)
$$\text{Headroom} = (\text{Ideal Usage Percentage} \times \text{Maximum Capacity}) - \text{Current Usage} - \sum_{t=1}^{12} \bigl(\text{Growth}(t) - \text{Optimization Projects}(t)\bigr)$$
This equation states that the headroom of a particular component of your system is equal to the ideal usage percentage of the maximum capacity minus the current usage minus the sum over a time period (here it is 12 months) of the growth rate minus the optimization. We will cover the ideal usage percentage in the next section of this chapter; for now, let’s use 50% as the number. If the headroom number is positive, you have enough headroom for the period of time used in the equation. If it is negative, you do not.
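The headroom calculation described above translates directly into code. This is a minimal sketch with made-up numbers: the function name `headroom`, the unit (requests per second) and the monthly growth and optimization figures are assumptions for illustration.

```python
def headroom(max_capacity, ideal_usage_pct, current_usage, growth, optimization):
    """Headroom = ideal share of max capacity - current usage - sum(growth - optimization).

    growth and optimization are per-period sequences (here: 12 monthly values),
    expressed in the same unit as max_capacity and current_usage.
    """
    net_growth = sum(g - o for g, o in zip(growth, optimization))
    return ideal_usage_pct * max_capacity - current_usage - net_growth


# Hypothetical component: 10,000 requests/s maximum, 3,000 requests/s used today,
# growing by 100 requests/s each month, with optimization work saving 25 each month.
monthly_growth = [100] * 12
monthly_optimization = [25] * 12

result = headroom(10_000, 0.5, 3_000, monthly_growth, monthly_optimization)
print(result)  # positive means enough headroom for the 12-month period
```

Here the ideal usage is 5,000 requests/s and net growth over the year is 900, so 1,100 requests/s of headroom remain.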
Ownership / Roles & Responsibilities
There needs to be one person who is primarily responsible for capacity planning, and it is normally a QA engineer. The SDET can simulate the expected traffic pattern while keeping in mind the actual use cases and user flows, and understanding how the backend app and the data store supporting it function. Understanding the user flow is also critical.
Also, is it possible to release the new version of the app to a slice of the user base for a couple of days, to understand the usage patterns and base the capacity plan on an extrapolation of real usage?
Capacity planning also depends on predicting how many users will be using the feature and on the multiple layers of caching. Consider whether the app is returning fairly static content or content personalized for the specific user. All these things affect the traffic projections and the capacity plan.
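Extrapolating from such a slice is simple arithmetic, sketched below. The function name `extrapolate_peak` and the figures are hypothetical, and the sketch assumes the slice behaves like the rest of the user base, which may not hold if early adopters are unusually active.

```python
def extrapolate_peak(observed_peak_rps, slice_fraction):
    """Scale the peak load seen from a user slice up to the full user base.

    Assumption: load grows linearly with the number of users exposed to the feature.
    """
    if not 0 < slice_fraction <= 1:
        raise ValueError("slice_fraction must be in (0, 1]")
    return observed_peak_rps / slice_fraction


# Hypothetical: the feature was enabled for 10% of users and peaked at 120 requests/s.
projected = extrapolate_peak(120, 0.10)
print(projected)
```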
Measuring Current Usage
Have a contingency plan – How much time would it take to increase your capacity by 50%? What is the process, and what are the implications of doing so? Do you only need app servers, which can be deployed in a matter of minutes, or would you need to expand your database to take on more load? If so, do you have a detailed plan for how your team could accomplish that? It would also be good to know the time it would take.
Capacity planning is rapidly evolving with the increasing adoption of cloud services. Cloud services like AWS have simple settings that let the service scale automatically when there is an increase in traffic, and even distribute traffic across various geographies.