Saturday, August 10, 2019

Autoscaling: Azure HDInsight Cluster


Introduction

Scaling an HDInsight Spark cluster out and in is often required because the workload varies and runs at specific intervals. This strategy helps optimize both cost and Azure resource usage. In this article, we discuss the options available for autoscaling.


From Azure Portal

Schedule-based scale-out and scale-in of an HDI cluster is possible from the Azure portal itself: from the HDI cluster settings, select “Cluster Size”.










HDI Spark now has an “Enable Autoscale” feature available in preview mode. There are two options under it: 1) load-based, and 2) schedule-based.

Load-Based


Put simply, load-based autoscaling resizes the cluster based on the number of CPU cores and the amount of memory required to complete the pending jobs. When the required CPU cores and memory exceed what is available, the cluster scales out; when they fall below what is available, the cluster scales in.
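To make the rule concrete, here is a minimal Python sketch of the comparison described above. It is an illustration only, not the logic HDInsight actually runs: the `ClusterLoad` class, its field names, and the `autoscale_decision` function are hypothetical, and the sketch ignores any time window the service may apply before it reacts.

```python
from dataclasses import dataclass


@dataclass
class ClusterLoad:
    """One snapshot of cluster resources (all field names are hypothetical)."""
    pending_cores: int        # CPU cores needed by pending jobs
    pending_memory_gb: float  # memory needed by pending jobs
    free_cores: int           # CPU cores currently available
    free_memory_gb: float     # memory currently available


def autoscale_decision(load: ClusterLoad) -> str:
    """Return 'scale_out', 'scale_in', or 'hold' for a single snapshot."""
    # Scale out when pending jobs need more of either resource than is free.
    if (load.pending_cores > load.free_cores
            or load.pending_memory_gb > load.free_memory_gb):
        return "scale_out"
    # Assumption: scale in only when both resources have spare capacity.
    if (load.pending_cores < load.free_cores
            and load.pending_memory_gb < load.free_memory_gb):
        return "scale_in"
    # Demand matches supply exactly, so leave the cluster as it is.
    return "hold"
```

For example, `autoscale_decision(ClusterLoad(16, 64.0, 8, 32.0))` asks for more cores and memory than are free, so the sketch chooses to scale out.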

Schedule-Based

With the schedule-based option, we need to configure the autoscale schedule as displayed:




Once the schedule is configured, autoscaling happens as specified:
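The idea behind a schedule can be sketched in a few lines of Python. This is a simplified model, not the portal's implementation: the `SCHEDULE` entries, the node counts, `DEFAULT_WORKERS`, and the `target_workers` function are all hypothetical values and names chosen for illustration. Each rule pairs a set of weekdays and a start time with a worker node count, and the latest rule that has already started on the current day wins.

```python
from datetime import datetime, time

# Hypothetical schedule: (weekdays as 0=Mon..6=Sun, start time, worker nodes).
SCHEDULE = [
    ({0, 1, 2, 3, 4}, time(8, 0), 10),   # weekdays: scale out for the day load
    ({0, 1, 2, 3, 4}, time(20, 0), 3),   # weekdays: scale in for the night
    ({5, 6}, time(0, 0), 2),             # weekends: keep the cluster small
]

# Assumption: size to use before the first rule of the day has started.
DEFAULT_WORKERS = 3


def target_workers(now: datetime) -> int:
    """Return the node count of the latest rule that has started today."""
    started = [
        (start, count)
        for days, start, count in SCHEDULE
        if now.weekday() in days and start <= now.time()
    ]
    if not started:
        return DEFAULT_WORKERS
    # Pick the rule with the most recent start time.
    return max(started, key=lambda rule: rule[0])[1]
```

With this sample schedule, a Monday at 09:00 maps to 10 worker nodes and the same Monday at 21:00 maps to 3.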


Conclusion

This example showcases one particular way to autoscale an HDInsight cluster from the Azure portal itself. There are various other custom approaches that achieve similar benefits using the Azure CLI or PowerShell; we will discuss those options in the next post.
