Speech Services pricing
Speech Services
INSTANCE | CATEGORY | FEATURES | PRICE | |
---|---|---|---|---|
Free - Web, 1 concurrent request ¹ | Speech to Text | Standard | 5 audio hours free per month | |
Free - Web | Speech to Text | Custom | 5 audio hours free per month; endpoint hosting: 1 model free per month ² | |
Free - Web | Speech to Text | Enhanced add-on features: language identification, batch diarization for 3+ speakers | ¥3.66 per audio hour per feature | |
Free - Web | Text to Speech | Neural | 0.5M characters free per month | |
Free - Web | Speech Translation | Standard | 5 audio hours free per month | |
Standard - Web, 20 concurrent requests ¹ | Speech to Text | | Real-time | Batch (v3.2 API or higher) ³ |
Standard - Web | Speech to Text | Standard | ¥3 per audio hour | ¥1.83 per hour |
Standard - Web | Speech to Text | Custom | ¥4.452 per audio hour; endpoint hosting: ¥0.547 per model per hour | ¥2.3 per hour; endpoint hosting: N/A |
Standard - Web | Speech to Text | Enhanced add-on features | ¥3.05 per hour per feature | Continuous language identification and diarization included ⁴ |
Standard - Web | Text to Speech | Neural | ¥95.4 per 1M characters | |
Standard - Web | Speech Translation | Standard | ¥10.176 per audio hour | |
INSTANCE | CATEGORY | FEATURES | PRICE | |
---|---|---|---|---|
Free - Web, 1 concurrent request ¹ | Speech to Text | Standard | 5 audio hours free per month | |
Free - Web | Speech to Text | Custom | 5 audio hours free per month; endpoint hosting: 1 model free per month ² | |
Free - Web | Speech to Text | Enhanced add-on features: language identification, batch diarization for 3+ speakers | ¥3.66 per audio hour per feature | |
Free - Web | Text to Speech | Neural | 0.5M characters free per month | |
Free - Web | Speech Translation | Standard | 5 audio hours free per month | |
Standard - Web, 20 concurrent requests ¹ | Speech to Text | | Real-time | Batch (v3.2 API or higher) ³ |
Standard - Web | Speech to Text | Standard | ¥3 per audio hour | ¥1.83 per hour |
Standard - Web | Speech to Text | Custom | ¥4.452 per audio hour; endpoint hosting: ¥0.547 per model per hour; Custom Neural Training: ¥529.152 per hour; Custom Neural Long Audio Characters: ¥1017.6 per 1M characters | ¥2.3 per hour; endpoint hosting: N/A |
Standard - Web | Speech to Text | Enhanced add-on features | ¥3.05 per hour per feature | Continuous language identification and diarization included ⁴ |
Standard - Web | Text to Speech | Neural | ¥95.4 per 1M characters | |
Standard - Web | Speech Translation | Standard | ¥10.176 per audio hour | |
Commitment Tiers
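Instance | Category | Features | Price (Per Month) | Overage |
---|---|---|---|---|
Azure-Standard | Text to Speech | Neural ¹ | ¥6,105.6 for 80M characters | ¥76.32 per 1M characters |
Azure-Standard | Text to Speech | Neural ¹ | ¥24,804 for 400M characters | ¥62.01 per 1M characters |
Azure-Standard | Text to Speech | Neural ¹ | ¥95,400 for 2,000M characters | ¥47.7 per 1M characters |
Azure-Standard | Text to Speech | Neural ¹ | ¥152,640 for 4,000M characters | ¥38.16 per 1M characters |
Connected container - Standard | Text to Speech | Neural ¹ | ¥5,800.32 for 80M characters | ¥72.5 per 1M characters |
Connected container - Standard | Text to Speech | Neural ¹ | ¥23,563.8 for 400M characters | ¥58.9 per 1M characters |
Connected container - Standard | Text to Speech | Neural ¹ | ¥90,630 for 2,000M characters | ¥45.32 per 1M characters |
Connected container - Standard | Text to Speech | Neural ¹ | ¥145,008 for 4,000M characters | ¥36.252 per 1M characters |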
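As a rough illustration of how a commitment tier and its overage rate might combine, the sketch below adds the fixed monthly price to a charge for characters beyond the included amount. The function name `commitment_tier_cost` is hypothetical, and the sketch assumes overage is charged pro rata per character rather than rounded up to whole millions, which the table does not state.

```python
from decimal import Decimal


def commitment_tier_cost(chars_used, included_chars, monthly_price, overage_per_million):
    """Fixed monthly price plus overage for characters beyond the included amount.

    Assumption: overage is pro rata per character (not rounded to whole millions).
    """
    extra_chars = max(0, chars_used - included_chars)
    overage = Decimal(extra_chars) / Decimal(1_000_000) * Decimal(overage_per_million)
    return Decimal(monthly_price) + overage


# 100M characters on the Azure-Standard 80M tier (¥6,105.6, overage ¥76.32 per 1M).
cost = commitment_tier_cost(100_000_000, 80_000_000, "6105.6", "76.32")
print(cost)  # 20M characters of overage on top of the fixed price
```

Using `Decimal` with string prices avoids binary floating-point rounding in currency arithmetic.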
Computer Vision
This state-of-the-art, cloud-based API gives developers access to advanced algorithms that extract rich information from images to categorize and process visual data. Capabilities include image analysis, tagging, celebrity recognition, text extraction, and smart thumbnail generation.
Image Analysis
Instance | Features | Price |
---|---|---|
Free (F0) - Web/Container | All | 5,000 free transactions per month; 20 transactions per minute |
Standard (S1) - Web/Container | Group 1: Tag, Face, GetThumbnail, Color, Image Type, GetAreaOfInterest, People Detection (preview), Smart Crops, OCR, Adult, Celebrity, Landmark, Object Detection, Brand | 0-1M transactions: ¥6.36 per 1,000 transactions; 1-10M transactions: ¥5.088 per 1,000 transactions; 10-100M transactions: ¥4.134 per 1,000 transactions |
Standard (S1) - Web/Container | Group 2: Describe, Read, Caption, Dense Captions | 0-1M transactions: ¥9.54 per 1,000 transactions; 1M+ transactions: ¥3.82 per 1,000 transactions |
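The volume bands for Group 1 can be turned into a monthly estimate along the following lines. This is a minimal sketch that assumes graduated pricing, where each band of transactions is charged at its own rate; the page does not say whether pricing is graduated or whether the whole volume is charged at the rate of the band it falls in. `GROUP_1_BANDS` and `graduated_cost` are illustrative names, not part of any Azure SDK.

```python
from decimal import Decimal

# (upper bound of the band in transactions, price per 1,000 transactions)
GROUP_1_BANDS = [
    (1_000_000, Decimal("6.36")),
    (10_000_000, Decimal("5.088")),
    (100_000_000, Decimal("4.134")),
]


def graduated_cost(transactions, bands):
    """Charge each band of transactions at that band's own rate (assumed graduated pricing)."""
    if transactions > bands[-1][0]:
        raise ValueError("volume exceeds the highest listed band")
    total = Decimal(0)
    lower = 0
    for upper, price_per_thousand in bands:
        # Number of transactions that fall inside this band.
        in_band = max(0, min(transactions, upper) - lower)
        total += Decimal(in_band) / 1000 * price_per_thousand
        lower = upper
    return total


# 3M transactions: the first 1M at ¥6.36 and the next 2M at ¥5.088 per 1,000.
print(graduated_cost(3_000_000, GROUP_1_BANDS))
```

The same function would apply to any other banded table on this page by swapping in a different list of bands, under the same graduated-pricing assumption.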
Spatial Analysis
Instance | Features | Price |
---|---|---|
Free (F0) - Web/Container | Spatial Analysis on Edge | 1 free camera/month |
Standard (S1) - Web/Container | Spatial Analysis on Edge | ¥0.07314 per hour |
Content Moderator
Content Moderator enhances your ability to detect potentially offensive or unwanted images through machine-learning-based classifiers, custom blacklists, and optical character recognition (OCR). It helps you detect potential profanity in more than 100 languages and automatically match text against your custom lists. Content Moderator also checks for possible personally identifiable information (PII). Each Text API call can contain up to 1,024 characters. You can scan images (minimum 128 pixels, maximum 4 MB) for adult and racy content, run optical character recognition (OCR) on them, and match them against custom image lists. Each API call is a transaction.
INSTANCE | TRANSACTIONS PER SECOND (TPS) | FEATURES | PRICE |
---|---|---|---|
Free | 1 TPS | Moderate | 5,000 transactions free per month |
Free | 1 TPS | Review | N/A |
Standard | 10 TPS | Moderate | 0-1M transactions: ¥10.18 per 1,000 transactions; 1M-5M transactions: ¥7.63 per 1,000 transactions; 5M-10M transactions: ¥6.11 per 1,000 transactions; 10M+ transactions: ¥4.07 per 1,000 transactions |
Language Service
Language Service API is a cloud-based service that provides advanced natural language processing over raw text, and includes three main functions—sentiment analysis, key phrase extraction, and language detection.
INSTANCE | FEATURES | INFERENCING (PER 1,000 TEXT RECORDS) |
---|---|---|
Free - Web | Sentiment Analysis, Key Phrase Extraction, Language Detection, Entity Extraction, Document summarization (Extractive), Conversational language understanding | 5,000 transactions free per month |
Standard (up to 100 requests per second and 1,000 requests per minute) | Sentiment Analysis, Key Phrase Extraction, Language Detection, Entity Extraction, Document summarization (Extractive) | 0-500,000 text records: ¥10.176 per 1,000 text records; 0.5M-2.5M text records: ¥7.632 per 1,000 text records; 2.5M-10.0M text records: ¥3.053 per 1,000 text records; 10M+ text records: ¥2.54 per 1,000 text records; ¥20.352 per 1,000 text records |
Standard | Conversational language understanding | ¥21.56 |
Translator Text
The Translator Text API is a cloud-based machine translation service supporting multiple languages, reaching more than 95% of the world's gross domestic product (GDP). Use Translator to build applications, websites, tools, or any solution requiring multi-language support.
INSTANCE | FEATURES | PRICE |
---|---|---|
Free | Text Translation, Language Detection, Bilingual Dictionary, Transliteration | 2M chars free per month |
S1 | Text Translation, Language Detection, Bilingual Dictionary, Transliteration | ¥102 per million chars |
S1 | Document Translation | ¥152.6 per million chars |
S2 | Text Translation, Language Detection, Bilingual Dictionary, Transliteration | ¥20,925 per month for up to 250M chars per month; overage: ¥84 per million chars |
S3 | Text Translation, Language Detection, Bilingual Dictionary, Transliteration | ¥61,070 per month for up to 1B chars per month; overage: ¥61 per million chars |
S4 | Text Translation, Language Detection, Bilingual Dictionary, Transliteration | ¥457,932 per month for up to 10B chars per month; overage: ¥46 per million chars |
D3 (variable cost plus fixed plus overage) | Document Translation | ¥61,817 per month with 675M chars per month included; overage: ¥10.1124 per million chars |
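To compare the text translation tiers for a given monthly volume, a sketch like the one below may help. It covers only S1 through S4, treats S1 as pure pay-as-you-go, and assumes overage is charged pro rata on characters beyond the included amount. `PLANS`, `monthly_cost`, and `cheapest_plan` are hypothetical names used for illustration.

```python
from decimal import Decimal

MILLION = 1_000_000

# tier: (fixed monthly fee, included chars, price per 1M chars beyond the included amount)
PLANS = {
    "S1": (Decimal("0"), 0, Decimal("102")),
    "S2": (Decimal("20925"), 250 * MILLION, Decimal("84")),
    "S3": (Decimal("61070"), 1_000 * MILLION, Decimal("61")),
    "S4": (Decimal("457932"), 10_000 * MILLION, Decimal("46")),
}


def monthly_cost(plan, chars):
    """Fixed fee plus pro rata overage (assumed) for one month of text translation."""
    fee, included, per_million = PLANS[plan]
    extra = max(0, chars - included)
    return fee + Decimal(extra) / MILLION * per_million


def cheapest_plan(chars):
    """Return the tier name with the lowest estimated monthly cost."""
    return min(PLANS, key=lambda plan: monthly_cost(plan, chars))


for volume in (100 * MILLION, 300 * MILLION):
    plan = cheapest_plan(volume)
    print(volume, plan, monthly_cost(plan, volume))
```

With these figures, S1 is the cheaper option at 100M characters, while at 300M characters the S2 fee plus 50M characters of overage comes out below the S1 pay-as-you-go total.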
Language Understanding
Language Understanding (LUIS) offers a fast and effective way of adding language understanding to applications. With LUIS, you can use pre-existing, world-class, pre-built models whenever they suit your purposes. When you need specialized models, LUIS guides you through the process of quickly building them.
INSTANCE | TRANSACTIONS PER SECOND (TPS) ¹ | FEATURES | PRICE |
---|---|---|---|
Free ² - Web | 5 TPS | Text Requests | 10,000 transactions free per month * |
Standard - Web | 50 TPS | Text Requests | ¥15.26 per 1,000 transactions per month * |
Training
Instance | Feature | Training |
---|---|---|
Free - Web | Conversational language understanding | Standard training: free; advanced training: up to 1 hour free |
Standard (S) - Web | Conversational language understanding | Standard training: free; advanced training: ¥32.3 per hour |
FAQ
Common
- How are the Azure AI Services APIs billed?
The Face API and Computer Vision API are billed per 1,000 API transaction calls when a production API call is being actively executed.
- What will happen if I exceed the transaction limit at the Standard tier?
If the usage on a standard tier is exceeded, the account starts to accrue overages. These overages are billed on a monthly basis, and are calculated at the rate specified for each tier.
- Can I change the service tier I subscribed to?
You can upgrade to a higher tier at any time. The billing rate corresponding to the higher tier and the amounts included will take effect immediately.
Computer Vision
- What operations can be completed with the Computer Vision API?
Tag - The Computer Vision API returns tags based on more than 2,000 recognizable objects, living beings, types of scenery, and actions. If tags are ambiguous or unusual, the API response will provide 'hints' to clarify the meaning of the tag.
GetThumbnail - GetThumbnail generates high quality thumbnails after images are uploaded. The Computer Vision API algorithm analyzes objects within images, then crops images according to the requirements for the region of interest (ROI).
Color - The Computer Vision algorithm extracts colors from an image. The colors are analyzed in three different contexts (foreground, background, and whole). They are grouped into 12 dominant accent colors.
Image Type - The Computer Vision API can set a Boolean flag to indicate whether an image is black and white or color. It can use the same method to indicate whether an image is a line drawing. It can also indicate whether an image is clip art, along with its quality.
OCR - Optical Character Recognition (OCR) technology detects text content in an image and extracts the identified text into a machine-readable character stream. You can use the results for searches and numerous other purposes, from medical records to security and banking. It automatically detects the language. OCR saves time and provides convenience for users by allowing them to simply take photos of text instead of transcribing it. Please refer to the Computer Vision Documentation page for supported languages.
Adult - Apply adult/racy settings to automatically restrict adult content in images.
Celebrities - Azure’s celebrity recognition model can recognize 200,000 celebrities from business, politics, sports, and entertainment around the world.
Content Moderator
- What are the limits/restrictions of the content that can be moderated by using the API?
When using the API, images need to have a minimum of 128 pixels and a maximum file size of 4MB. Text can be at most 1024 characters long.
- What happens if the content passed to the text API or the image API exceeds the size limits?
The text API will return an error code that informs that the text is longer than permitted. The image API will also return an error code that informs that the image does not meet the size requirements.
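A client might check these limits before calling the service so that oversized input never triggers the error codes described above. The sketch below is illustrative only: `split_text` and `image_within_size_limit` are hypothetical helpers, "4MB" is assumed to mean 4 × 1024 × 1024 bytes, and characters are counted as Python code points, which may differ from how the service counts them. The 128-pixel minimum is not checked because the page does not say which dimension it applies to.

```python
MAX_TEXT_CHARS = 1024
MAX_IMAGE_BYTES = 4 * 1024 * 1024  # assumption: "4MB" means 4 MiB


def split_text(text, limit=MAX_TEXT_CHARS):
    """Split text into consecutive chunks of at most `limit` characters.

    Note: this naive split can cut a word in half at a chunk boundary.
    """
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def image_within_size_limit(data: bytes) -> bool:
    """Return True if the raw image payload is no larger than the assumed 4 MB limit."""
    return len(data) <= MAX_IMAGE_BYTES


chunks = split_text("x" * 2500)
print([len(chunk) for chunk in chunks])
```

Each chunk would then be sent as its own Text API call, and so would count as its own transaction.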
- Is there an extra cost to using the human review tool?
The human review tool is included in your subscription.
Text Analytics
- How does billing for the Text Analytics API work?
The Text Analytics API can be purchased in units of the S0-S4 tier at a fixed price. Each unit of a tier comes with included quantities of API transactions. If the user exceeds the included quantities, overages are charged at the rate specified in the pricing table above. These overages are prorated and the service is billed on a monthly basis. The included quantities in a tier are reset each month. In the S tier, the service is billed for only the amount of Text Records submitted to the service.
- What happens if I exceed the transaction limit on my free tier for Text Analytics?
Usage is throttled if the transaction limit is reached on the Free tier. Customers cannot accrue overages on the free tier.
- What constitutes a transaction in the S0-S4 tiers of the Text Analytics API?
Any annotation to a document counts as a transaction. Batch scoring calls also take into account the number of documents that need to be scored in that transaction. For instance, if 1,000 documents are sent for sentiment analysis in a single API call, that counts as 1,000 transactions. If an API supports more than one annotation operation, that is also considered. For example, if an API call performs both sentiment analysis and key-phrase extraction on 1,000 documents, it counts as 2,000 transactions (2 annotations × 1,000 documents).
- What happens if I exceed the transaction limit on the S0-S4 tier?
If the usage on the S0-S4 tier is exceeded, the account starts to accrue overages. These overages are billed on a monthly basis and are calculated at the rate specified for each tier.
- Can I change the tier of service I subscribed to?
You may upgrade to a higher tier at any time. Billing rate and included quantities corresponding to the higher tier will begin immediately.
- What constitutes a Text Record in the S Tier?
A text record in the S tier contains up to 1,000 characters as measured by String.Length. If an input document into the Text Analytics API is more than 1,000 characters, it counts as one text record for each unit of 1,000 characters. For instance, if an input document sent to the API contains 7,500 characters, it would count as 8 text records. If an input document sent to the API contains 500 characters, it would count as 1 text record. If two documents are submitted, one document of 500 characters and one document of 1,200 characters, then the service would be billed for three text records in total: one record for the 500 character document and two text records for the 1,200 character document.
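The counting rule above can be sketched as follows. Because .NET's String.Length counts UTF-16 code units rather than Unicode code points, the sketch measures length the same way instead of using Python's `len()` directly. `text_records` is a hypothetical helper, and the handling of an empty document (zero records here) is an assumption the page does not address.

```python
RECORD_SIZE = 1000


def utf16_length(document: str) -> int:
    """Length in UTF-16 code units, matching what .NET String.Length reports."""
    return len(document.encode("utf-16-le")) // 2


def text_records(document: str) -> int:
    """Number of billable text records: one per started unit of 1,000 characters."""
    units = utf16_length(document)
    return (units + RECORD_SIZE - 1) // RECORD_SIZE  # integer ceiling division


documents = ["a" * 500, "a" * 1200]
print(sum(text_records(doc) for doc in documents))  # the two-document example above
print(text_records("a" * 7500))
```

Characters outside the Basic Multilingual Plane, such as many emoji, occupy two UTF-16 code units each, so they count twice under this measure.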
Translator Text
- How do I calculate monthly volume?
For the Microsoft Translator Text API, the volume you are billed for is the number of characters in the input. Every Unicode code point counts as a character. Every character of the input counts. Each translation of a text to a new language counts as a separate translation. The number of queries, words, bytes, or sentences is irrelevant.
To estimate your monthly volume, take the total characters to translate, multiply it by the number of languages you want to have it translated into, then take the number and spread it over the maximum number of hours or days you are able to wait for completion.
More information on how we count characters for the Translator Text API can be found in our documentation.
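The volume rule above (every code point of the input counts, once per target language) could be estimated as in the sketch below. Python's `len()` on a `str` counts Unicode code points, which matches the stated rule; `billable_characters` and `required_hourly_rate` are hypothetical names, and the official character-counting documentation remains the authority on edge cases.

```python
def billable_characters(text: str, target_languages: int) -> int:
    """Input code points multiplied by the number of target languages."""
    return len(text) * target_languages


def required_hourly_rate(total_characters: int, hours_available: float) -> float:
    """Characters per hour needed to finish within the time you can wait."""
    return total_characters / hours_available


total = billable_characters("Hello, world!", 3)
print(total)                            # 13 code points x 3 languages
print(required_hourly_rate(total, 2))   # spread over 2 hours
```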
- What happens if I reach the limit of the free subscription plan?
If you subscribe to the free subscription plan, the Microsoft Translator service will stop if you reach 2 million characters during a subscription month for the Text Translation API. The Microsoft Translator service will start again at the beginning of your next subscription month or when you change your subscription to a paid plan.
- What languages does Microsoft Translator support?
See the language list for text translation using the Microsoft Translator Text API.
Developer-oriented language lists, including language codes, can be found in our documentation.
- Can I customize my translations?
Customization currently is not available with subscriptions on Azure.cn.
Language Understanding
- What is a transaction?
For text requests, a transaction is an API call with a query length of up to 500 characters.
For speech requests, a transaction is an utterance up to 15 seconds long.
- Are speech requests included in the free tier?
No, the free tier only includes text requests with a maximum length of 500 characters.
- What is a Dispatch?
Dispatch is a feature that enables processing two models/applications with one API call.
Speech Services
- How does billing work?
For Speech Translation and Speech to Text: usage is billed in one-second increments.
For Text to Speech: usage is billed per character.
See the pricing note for how SSML tags and Chinese, Japanese, and Korean (CJK) characters are charged.
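Per-second billing against an hourly price can be worked out as in this sketch. `speech_cost` is a hypothetical helper, and rounding a partial second up to the next whole second is an assumption; the page only says usage is billed in one-second increments.

```python
import math
from decimal import Decimal


def speech_cost(duration_seconds: float, price_per_audio_hour: str) -> Decimal:
    """Cost of audio billed per second at a price quoted per audio hour."""
    # Assumption: a partial second is rounded up to the next whole second.
    billed_seconds = math.ceil(duration_seconds)
    return Decimal(billed_seconds) * Decimal(price_per_audio_hour) / Decimal(3600)


# 89.2 seconds of Standard real-time Speech to Text at ¥3 per audio hour.
print(speech_cost(89.2, "3"))
```

Passing the price as a string keeps the `Decimal` arithmetic exact for values such as ¥10.176.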
Support & SLA
If you have any questions or need help, please visit Azure Support and select self-help service or any other method to contact us for support.
We guarantee that Azure AI Services running at the Standard tier will be available at least 99.9% of the time. No SLA is provided for the Free tier. To learn more about the details of our service level agreement, please visit the Service Level Agreement page.