Alibaba Cloud - Auto Scaling
What is Auto Scaling?
Taken literally, auto-scaling means automatically increasing or decreasing computing resources according to demand, so that resources are used efficiently and you do not pay for capacity you do not need.
In Alibaba Cloud, auto-scaling means dynamically adding computing resources to, and removing them from, the resource pool at run time as demand for the application rises or falls. During peak hours, when application usage is high, auto-scaling automatically adds resources to the pool; when usage is low, it automatically removes the surplus resources.
Computing resources can be CPU, memory, or other physical resources within a server, or they can be additional ECS instances. Depending on the type of resource being added or removed, scaling is classified as vertical or horizontal.
Horizontal and Vertical Scaling
- Vertical scaling refers to increasing the resources of a single server, e.g. CPU and memory, based on metrics. If an ECS instance shows high CPU or memory consumption at a specific time, we can schedule an increase of CPU and memory on that same ECS instance for that period. Ranges of metric values can be used to drive vertical scaling.
- Horizontal scaling refers to dynamically adding ECS instances instead of adding CPU or memory to an existing instance. Every instance that auto-scaling adds or removes is also automatically attached to or detached from the Server Load Balancer and added to or removed from the RDS whitelist. Nothing needs to be modified manually once auto-scaling is configured.
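The bookkeeping that horizontal scaling automates can be pictured with a small in-memory model. This is only an illustrative sketch: the `Pool` class, its fields, and the IP addresses are hypothetical and do not call any Alibaba Cloud API. It simply shows that each instance added or removed is mirrored in the load balancer backend list and the database whitelist.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Pool:
    """Toy model of a resource pool (hypothetical, not an Alibaba Cloud API)."""

    instances: list[str] = field(default_factory=list)   # private IPs of ECS instances
    slb_backends: set[str] = field(default_factory=set)  # load balancer backend servers
    rds_whitelist: set[str] = field(default_factory=set)  # IPs allowed to reach the database

    def scale_out(self, ip: str) -> None:
        # Adding an instance also registers it with the load balancer
        # and allows it to connect to the database.
        self.instances.append(ip)
        self.slb_backends.add(ip)
        self.rds_whitelist.add(ip)

    def scale_in(self) -> str:
        # Removing an instance also detaches it from the load balancer
        # and revokes its database access.
        ip = self.instances.pop()
        self.slb_backends.discard(ip)
        self.rds_whitelist.discard(ip)
        return ip


pool = Pool()
pool.scale_out("10.0.0.1")
pool.scale_out("10.0.0.2")
removed = pool.scale_in()
```

After these calls, only `10.0.0.1` remains in the pool, in the backend list, and in the whitelist, with no manual step in between.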
Scale-in & Scale-out
Depending on whether resources are added or removed, scaling is described as scale-out or scale-in:
- Scale-out refers to adding computing resources when the load on the application increases.
- Scale-in refers to removing surplus computing resources when application usage is low and fewer resources are needed.
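The two terms can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each instance can serve a fixed number of requests per second; the function name `plan_scaling` and all figures are hypothetical. It compares the number of instances the current load needs with the number currently running and reports whether that calls for scale-out or scale-in.

```python
import math
from typing import Tuple


def plan_scaling(current_instances: int,
                 requests_per_second: float,
                 capacity_per_instance: float) -> Tuple[str, int]:
    """Return the action ("scale-out", "scale-in" or "none") and the instance count involved."""
    # Instances needed to serve the load, keeping at least one running.
    needed = max(1, math.ceil(requests_per_second / capacity_per_instance))
    delta = needed - current_instances
    if delta > 0:
        return ("scale-out", delta)   # load exceeds capacity: add instances
    if delta < 0:
        return ("scale-in", -delta)   # capacity exceeds load: remove instances
    return ("none", 0)


peak = plan_scaling(current_instances=2, requests_per_second=450, capacity_per_instance=100)
quiet = plan_scaling(current_instances=5, requests_per_second=120, capacity_per_instance=100)
```

Here `peak` is `("scale-out", 3)`, because 450 requests per second need five instances and only two are running, while `quiet` is `("scale-in", 3)`, because two instances are enough for 120 requests per second.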
Alibaba Cloud Scaling Modes
- Scheduled scaling mode: Application usage can be predictable or unpredictable. In some use cases we know in advance when usage will be high. A classic example is an online e-commerce application running a 50% sale from 1am to 4am: we can expect high usage during that period, so time can serve as the configuration parameter for auto-scaling. We configure auto-scaling to add computing resources or ECS instances at the start of the period and to remove the additional ECS instances once it ends. This is known as scheduled scaling, because resources are added or removed on a time schedule.
- Dynamic scaling mode: In other cases we cannot predict in advance when demand for the application will be high or low. Consider a website that provides blogs and learning tutorials on students' academic subjects. Its usage cannot be predicted, because it depends on when students are available and on their preferred time of day and day of the week for study. In such cases, auto-scaling can be configured based on the number of requests or on the load on the application. This is called dynamic scaling, because it is driven by actual application usage rather than by a fixed time.
- Fixed scaling mode: In fixed scaling mode, we set a minimum, maximum, or expected number of ECS instances in the configuration, so that whenever the application is running, the scaling group maintains the number of instances defined in the settings.
- Health mode: This mode works in conjunction with the health check feature of the scaling group. The status of the instances in a scaling group is checked at intervals, and unhealthy instances are removed from the scaling group or replaced.
- Manual/Custom mode: In this mode, resources are added or removed manually or according to user-defined scaling rules.
These modes can be combined to meet the requirements of the application.
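How several modes might work together can be sketched in a few lines. The example below is a simplified model, not Alibaba Cloud's actual algorithm or API: the 1am to 4am window comes from the sale example above, while the capacity of 5 instances, the 70% and 30% CPU thresholds, and every function and class name are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import time
from typing import Dict, List

# All names and numbers below are hypothetical, chosen for illustration only.
SALE_START = time(1, 0)   # scheduled window from the sale example: 1am ...
SALE_END = time(4, 0)     # ... to 4am
SALE_CAPACITY = 5         # instances wanted during the sale (assumed)
SCALE_OUT_CPU = 70.0      # dynamic rule: add an instance above this average CPU %
SCALE_IN_CPU = 30.0       # dynamic rule: remove an instance below this average CPU %


@dataclass
class ScalingGroup:
    min_instances: int
    max_instances: int
    current: int


def desired_capacity(group: ScalingGroup, now: time, avg_cpu: float) -> int:
    """Combine scheduled, dynamic and fixed modes into one instance count."""
    desired = group.current
    in_sale = SALE_START <= now < SALE_END

    # Scheduled mode: guarantee the planned capacity inside the time window.
    if in_sale:
        desired = max(desired, SALE_CAPACITY)

    # Dynamic mode: react to measured load; never scale in during the sale.
    if avg_cpu > SCALE_OUT_CPU:
        desired += 1
    elif avg_cpu < SCALE_IN_CPU and not in_sale:
        desired -= 1

    # Fixed mode: stay within the configured minimum and maximum.
    return max(group.min_instances, min(group.max_instances, desired))


def unhealthy_instances(health: Dict[str, bool]) -> List[str]:
    """Health mode: IDs of instances that failed the check and should be replaced."""
    return sorted(i for i, ok in health.items() if not ok)


during_sale = desired_capacity(ScalingGroup(2, 6, 2), time(2, 0), avg_cpu=50.0)
quiet_noon = desired_capacity(ScalingGroup(2, 6, 5), time(12, 0), avg_cpu=10.0)
to_replace = unhealthy_instances({"i-a": True, "i-b": False})
```

With these assumed settings, `during_sale` is 5 (the schedule raises capacity at 2am), `quiet_noon` is 4 (low load outside the window removes one instance), and `to_replace` is `["i-b"]`. The final clamp is what keeps any combination of rules inside the minimum and maximum set for the group.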
Advantages of Auto-scaling
- Cost optimization: Auto-scaling saves power costs and the cost of extra resources that are not in use, which can considerably reduce the operating costs of the infrastructure.
- Flexibility: Resources can also be added or removed manually, which gives the flexibility to control resources in line with company policies.
- The health status of the ECS instances deployed for the application is monitored, and any unhealthy instance is automatically replaced with a healthy one.
- Auto-scaling improves the availability of the servers and the application, thus increasing uptime.
Limits
The table below shows the auto-scaling limits that apply to a single user account in Alibaba Cloud: