Azure HDInsight is a fully-managed cloud service that makes it easy, fast, and cost-effective to process massive amounts of data. Use the most popular open-source frameworks such as Hadoop, Spark, Hive, LLAP, Kafka, Storm, HBase & more. Azure HDInsight enables a broad range of scenarios such as ETL, Data Warehousing, IoT and more.
Service features
Preconfigured clusters optimized for different big data scenarios
99.9% SLA on the cluster
High Availability
Cost-effective for cloud scale
Network Security: Secure Gateway, Azure VNET Support
Data Security: Encryption + Role-based access control on Storage
Integration: Azure Cosmos DB and other Azure data services
Components
Hadoop
Spark
Interactive Query
Kafka
HBase
Storm
Extend HDInsight to install any Open Source Engine
Enterprise Security Package
Pricing features
Azure HDInsight Clusters
Clusters are billed on a per-minute basis. Each cluster runs a group of nodes that depends on the component. Nodes vary by group (e.g. Worker Node, Head Node, etc.), quantity, and instance type (e.g. D1v2).
Refer to the FAQ below for details on workloads and the required nodes. Customers are billed for each node for the duration of the cluster's life.
Pricing Details
An HDInsight cluster is composed of a group of nodes, and customers pay for these nodes for the lifetime of the cluster. Billing starts when the cluster is created and ends when it is deleted. Charges are prorated by the minute.
Pricing Method
Component | Pricing |
---|---|
Hadoop, Spark, Interactive Query, Kafka*, Storm, HBase | Base price/node-hour + ¥0/core-hour |
Enterprise Security Package | Base price/node-hour + ¥0.06/core-hour |
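As a minimal sketch of how the pricing method above could be read, the following computes the hourly price of a single node as its base price plus the per-core surcharge. The function name, the tier labels, and the choice of a D13 v2 node are illustrative assumptions; only the ¥0 and ¥0.06 per core-hour rates and the base price come from this page.

```python
from decimal import Decimal

# Per-core-hour surcharges from the Pricing Method table (CNY).
# The tier keys "standard" and "esp" are hypothetical labels for this sketch.
CORE_SURCHARGE = {
    "standard": Decimal("0"),
    "esp": Decimal("0.06"),  # Enterprise Security Package
}


def node_hourly_price(base_price, cores, tier="standard"):
    """Hourly price of one node: base price plus cores x per-core surcharge."""
    return Decimal(str(base_price)) + cores * CORE_SURCHARGE[tier]


# Example: a D13 v2 node (8 cores, base price 7.8867 CNY/hour in the Dv2 table).
standard = node_hourly_price("7.8867", 8)
with_esp = node_hourly_price("7.8867", 8, tier="esp")
print(standard, with_esp)
```

Under this reading, the Enterprise Security Package adds 8 x ¥0.06 = ¥0.48 per hour to an 8-core node.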
Memory Optimized nodes for HDInsight
Instances | Number of cores | RAM | Disk size | Pricing |
---|---|---|---|---|
E2 v3 | 2 | 16 GB | 50 GB | ¥1.31/hour (about ¥974.64/month) |
E4 v3 | 4 | 32 GB | 100 GB | ¥2.63/hour (about ¥1,956.72/month) |
E8 v3 | 8 | 64 GB | 200 GB | ¥5.27/hour (about ¥3,920.88/month) |
E16 v3 | 16 | 128 GB | 400 GB | ¥10.54/hour (about ¥7,841.76/month) |
E20 v3 | 20 | 160 GB | 500 GB | ¥16.92/hour (about ¥12,588.48/month) |
E32 v3 | 32 | 256 GB | 800 GB | ¥21.08/hour (about ¥15,683.52/month) |
E64i v3 | 64 | 432 GB | 1,600 GB | ¥42.14/hour (about ¥31,352.16/month) |
E64 v3 | 64 | 432 GB | 1,600 GB | ¥42.14/hour (about ¥31,352.16/month) |
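The approximate monthly figures in the pricing tables appear to be the hourly price multiplied by 744 hours (31 days x 24 hours). The page does not state this; it is inferred from the numbers. The sketch below reproduces two of the table values under that assumption, and the constant and function names are illustrative.

```python
from decimal import Decimal

# Assumption inferred from the tables, not stated by the source:
# one "month" is 31 days x 24 hours = 744 hours.
HOURS_PER_MONTH = 744


def monthly_estimate(hourly_price):
    """Approximate monthly price of a node that runs for a full month."""
    return Decimal(hourly_price) * HOURS_PER_MONTH


print(monthly_estimate("1.31"))   # E2 v3
print(monthly_estimate("42.14"))  # E64 v3
```

Because billing is per minute, a node that runs for only part of a month costs proportionally less than this estimate.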
Compute Optimized nodes for HDInsight
Instances | Number of cores | RAM | Disk size | Pricing |
---|---|---|---|---|
F1 | 1 | 2 GB | 16 GB | ¥0.531/hour (about ¥395.064/month) |
F2 | 2 | 4 GB | 32 GB | ¥1.102/hour (about ¥819.888/month) |
F4 | 4 | 8 GB | 64 GB | ¥2.193/hour (about ¥1,631.592/month) |
F8 | 8 | 16 GB | 128 GB | ¥4.387/hour (about ¥3,263.928/month) |
F16 | 16 | 32 GB | 256 GB | ¥8.763/hour (about ¥6,519.672/month) |
F16s v2 | 16 | 32 GB | 256 GiB | ¥10.597/hour (about ¥7,884.168/month) |
General Purpose nodes for HDInsight
Av2 HDInsight nodes run on Av2 Standard VMs, the latest generation of A-series virtual machines, which offer similar CPU performance and faster disks.
Instances | Number of cores | RAM | Disk size | Pricing |
---|---|---|---|---|
A1 v2 | 1 | 2 GB | 10 GB | ¥0.545/hour (about ¥405.48/month) |
A2 v2 | 2 | 4 GB | 20 GB | ¥1.079/hour (about ¥802.776/month) |
A2m v2 | 2 | 16 GB | 20 GB | ¥2.079/hour (about ¥1,546.776/month) |
A4 v2 | 4 | 8 GB | 40 GB | ¥2.169/hour (about ¥1,613.736/month) |
A4m v2 | 4 | 32 GB | 40 GB | ¥4.166/hour (about ¥3,099.504/month) |
A8 v2 | 8 | 16 GB | 80 GB | ¥4.326/hour (about ¥3,218.544/month) |
A8m v2 | 8 | 64 GB | 80 GB | ¥8.328/hour (about ¥6,196.032/month) |
A Series Universal Nodes
The A Series is an economical option for general-purpose needs. Customers running basic applications and queries on Hadoop will benefit from using the A Series.
The A1 node can only be used as a Zookeeper node for Storm. The A2 node can only be used as a Zookeeper node for HBase and Storm.
A Series nodes cannot be used as data nodes in Linux clusters. They can only be used as head nodes and Zookeeper nodes, and only A1, A2, and A3 are available.
Instances | Number of cores | RAM | Disk size | Pricing Per Node |
---|---|---|---|---|
A1 | 1 | 1.75 GB | 70 GB | ¥0.3981/hour (about ¥296.1864/month) |
A2 | 2 | 3.5 GB | 135 GB | ¥0.9662/hour (about ¥718.8528/month) |
A3 | 4 | 7 GB | 285 GB | ¥1.9425/hour (about ¥1,445.22/month) |
D Series Nodes: 60% faster CPU, more memory, local SSD
D1, D2, and D11 nodes can only be used as Zookeeper nodes for HBase and Storm.
Instances | Number of cores | Memory | Disk size | Pricing |
---|---|---|---|---|
D1 | 1 | 3.5 GB | 50 GB | ¥0.5981/hour (about ¥444.9864/month) |
D2 | 2 | 7 GB | 100 GB | ¥1.2362/hour (about ¥919.7328/month) |
D3 | 4 | 14 GB | 200 GB | ¥2.4725/hour (about ¥1,839.54/month) |
D4 | 8 | 28 GB | 400 GB | ¥4.965/hour (about ¥3,693.96/month) |
D5 | 16 | 56 GB | 800 GB | ¥9.8999/hour (about ¥7,365.5256/month) |
D11 | 2 | 14 GB | 100 GB | ¥1.9717/hour (about ¥1,466.9448/month) |
D12 | 4 | 28 GB | 200 GB | ¥3.9434/hour (about ¥2,933.8896/month) |
D13 | 8 | 56 GB | 400 GB | ¥7.8867/hour (about ¥5,867.7048/month) |
D14 | 16 | 112 GB | 800 GB | ¥11.2234/hour (about ¥8,350.2096/month) |
Dv2 Series Optimized Nodes: A New Generation of CPU
The Dv2 Series is the next generation of D Series instances. It offers a more powerful CPU with the same memory and disk configurations as the D Series. Dv2 Series instances are based on the 2.4 GHz Intel Xeon® E5-2673 v3 (Haswell) processor, which can reach up to 3.2 GHz with Intel Turbo Boost Technology 2.0. The Dv2 Series suits customers whose applications need low latency, local SSD access, or a faster CPU. The larger memory of the D Series and Dv2 Series can improve performance for customers using HDInsight HBase. Customers using HDInsight Storm and Spark can load larger reference data sets into memory and achieve higher throughput from the faster CPU.
The D Series remains available, but the Dv2 Series is recommended.
Instances | Number of cores | Memory | Disk size | Pricing |
---|---|---|---|---|
D1 v2 | 1 | 3.5 GB | 50 GB | ¥0.5981/hour (about ¥444.9864/month) |
D2 v2 | 2 | 7 GB | 100 GB | ¥1.2362/hour (about ¥919.7328/month) |
D3 v2 | 4 | 14 GB | 200 GB | ¥2.4725/hour (about ¥1,839.54/month) |
D4 v2 | 8 | 28 GB | 400 GB | ¥4.965/hour (about ¥3,693.96/month) |
D5 v2 | 16 | 56 GB | 800 GB | ¥9.8999/hour (about ¥7,365.5256/month) |
D11 v2 | 2 | 14 GB | 100 GB | ¥1.9717/hour (about ¥1,466.9448/month) |
D12 v2 | 4 | 28 GB | 200 GB | ¥3.9434/hour (about ¥2,933.8896/month) |
D13 v2 | 8 | 56 GB | 400 GB | ¥7.8867/hour (about ¥5,867.7048/month) |
D14 v2 | 16 | 112 GB | 800 GB | ¥11.2234/hour (about ¥8,350.2096/month) |
FAQ
How are the different HDInsight cluster types billed?
HDInsight deploys a different number of nodes for each cluster type. Within a given cluster type, nodes have different roles, which lets a customer size the nodes in each role to suit their workload. For example, a Hadoop cluster can have its worker nodes provisioned with a large amount of memory if the analytics being performed are memory intensive.
Hadoop clusters for HDInsight are deployed with three roles:
- Head nodes (2 nodes)
- Data nodes (at least 1 node)
- Zookeeper nodes (3 nodes)
HBase clusters for HDInsight are deployed with three roles:
- Head servers (2 nodes)
- Region servers (at least 1 node)
- Master/Zookeeper nodes (3 nodes)
Storm clusters for HDInsight are deployed with three roles:
- Nimbus nodes (2 nodes)
- Supervisor servers (at least 1 node)
- Zookeeper nodes (3 nodes)
-
If my cluster ran for less than an hour, how much would I get billed?
We charge for the number of minutes your cluster is running, rounded to the nearest minute, not hour.
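As a hedged illustration of per-minute billing, the sketch below prorates an hourly node price over a number of billed minutes. The function name is hypothetical, and the example assumes the running time has already been rounded to whole minutes; how the service rounds partial minutes in edge cases is not specified here.

```python
from decimal import Decimal


def prorated_cost(hourly_price, minutes):
    """Cost of running at `hourly_price` for a whole number of billed minutes."""
    return Decimal(hourly_price) * minutes / 60


# Example: one D3 v2 node (2.4725 CNY/hour in the Dv2 table) running for 45 minutes.
print(prorated_cost("2.4725", 45))
```

A 45-minute run is therefore charged three quarters of the hourly price rather than a full hour.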
-
Could you give me an example on how billing works?
Suppose you run a cluster for 100 hours in US East with two D13 v2 head nodes, three D12 v2 data nodes, and three D11 v2 Zookeeper nodes. The bill would be calculated as follows:
On a Standard HDInsight cluster: 100 hours x (2 x ¥7.8867/hour + 3 x ¥3.9434/hour + 3 x ¥1.9717/hour) = ¥3,351.87
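The same arithmetic can be sketched in a few lines of code. The data layout and function name are assumptions made for illustration; the node counts and hourly prices are the ones used in the example above.

```python
from decimal import Decimal

# (role, node count, hourly price per node in CNY from the Dv2 table)
cluster = [
    ("head (D13 v2)", 2, Decimal("7.8867")),
    ("data (D12 v2)", 3, Decimal("3.9434")),
    ("zookeeper (D11 v2)", 3, Decimal("1.9717")),
]


def cluster_cost(nodes, hours):
    """Total cost: sum of (count x hourly price) over all roles, times hours."""
    hourly = sum(count * price for _, count, price in nodes)
    return hourly * hours


total = cluster_cost(cluster, 100)
print(total.quantize(Decimal("0.01")))  # 3351.87
```

The hourly rate for the whole cluster works out to ¥33.5187, so changing the node counts or the running time in the sketch gives a quick estimate for other cluster shapes.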
-
How can I check that I have properly stopped an HDInsight cluster and that I am not being billed for it?
To stop an HDInsight cluster, you must delete it. By default, all data that an HDInsight cluster operates on resides in Azure Blob storage, so deleting the cluster does not affect your data. If you want to preserve your Hive metadata (tables, schemas), provision the cluster with an external metadata store. You can find more details in the documentation.
-
How many data nodes do I need for my HDInsight cluster?
The number of data nodes will vary depending on your needs. With the elasticity available in Azure cloud services, you can try a variety of cluster sizes to determine your own optimal mix of performance and cost, and only pay for what you use at any given time. Clusters can also be scaled on demand to grow and shrink to match the requirements of your workload.
-
What if I need more HDInsight data nodes than my subscription allows?
Each subscription has a default limit on how many HDInsight data nodes can be created. If you need to create a larger HDInsight cluster, or multiple HDInsight clusters that together exceed your current subscription maximum, you can request that your subscription's billing limits be increased. To do so, open a support request under "Support Type". Depending on the maximum nodes per subscription that you request, you may be asked for additional information that will allow us to optimize your deployment(s).
-
How much would a cluster with "x" data nodes cost?
To estimate the cost of clusters of various sizes, try the Azure Calculator.
-
How can I reduce costs on clusters I use infrequently?
There are several ways to reduce costs:
- Drive higher utilization of your existing clusters.
  1. Delete clusters while not in use. For more information about deleting a cluster, see Delete an HDInsight cluster using your browser, PowerShell, or the Azure CLI.
  2. Scale down. For more information about manually scaling clusters, see Scale HDInsight clusters.
- Deploy clusters at lower cost. This includes planning how many nodes to use, which node types to use for head nodes and worker nodes, and which region to launch the cluster in, as HDInsight offers many different node types with a range of pricing options. Review the Base price/node-hour section of this article for pricing. For more information, see Capacity planning for HDInsight clusters.
Support & SLA
If you have any questions or need help, please visit Azure Support and select self-help service or any other method to contact us for support.
For HDInsight, we guarantee that any HDInsight cluster you deploy will have external connectivity at least 99.9% of the time during the monthly billing cycle. To learn more about our Service Level Agreement, please visit the Service Level Agreements page.