Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to analyze and find relationships in text.
A user doesn't need to provision any servers or have machine learning experience to use the managed cloud service.
Potential Amazon Comprehend use cases
Amazon Comprehend scans documents to identify patterns within text. The service can be applied to a range of use cases, such as customer sentiment analysis or organizing documents based on topic.
For example, Comprehend could analyze text from the transcript of a customer service call to identify key phrases that suggest whether the customer had a positive or negative experience.
How Amazon Comprehend works
A user accesses Amazon Comprehend from the AWS Management Console, the AWS Command Line Interface or an AWS SDK. The service can read social media posts, emails and other text documents stored in Amazon S3. Once the user calls a Comprehend API, the service analyzes the text for key phrases and relationships. Comprehend returns a confidence score with each result that indicates how confident the service is that the result is accurate; the higher the score, the more confident the service is.
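As a rough illustration of the request and confidence-score flow described above, the following Python sketch assumes the boto3 SDK and its detect_sentiment call. The helper names make_client, top_sentiment and analyze, the default region and the sample response are illustrative assumptions, not part of the service.

```python
def make_client(region_name="us-east-1"):
    """Create a Comprehend client. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers below run without the SDK
    return boto3.client("comprehend", region_name=region_name)


def top_sentiment(response):
    """Return (label, confidence) from a DetectSentiment-style response."""
    label = response["Sentiment"]        # e.g. "POSITIVE"
    scores = response["SentimentScore"]  # keys: Positive, Negative, Neutral, Mixed
    return label, scores[label.capitalize()]


def analyze(client, text, language_code="en"):
    """Call the sentiment API and summarize the result."""
    response = client.detect_sentiment(Text=text, LanguageCode=language_code)
    return top_sentiment(response)


# Hand-written sample shaped like a DetectSentiment response (not real output).
sample = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {
        "Positive": 0.97,
        "Negative": 0.01,
        "Neutral": 0.01,
        "Mixed": 0.01,
    },
}
label, confidence = top_sentiment(sample)
```

With real credentials, analyze(make_client(), "The agent resolved my issue quickly.") would send the text to the service; the confidence value is the score attached to whichever sentiment label the service chose.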
Comprehend can process single or batch text analysis requests. The service will automatically organize documents based on relevant terms and topics. A user retains ownership of any documents that Comprehend accesses, but Amazon may store and use the text inputs to train and develop future models. Personally identifiable information is excluded from that use.
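A batch request might look like the sketch below, which assumes the boto3 batch_detect_sentiment call and a limit of 25 documents per batch request (an assumption worth checking against current service limits). The helper names chunk and batch_sentiments are hypothetical.

```python
MAX_BATCH_SIZE = 25  # assumed per-request document limit; verify before relying on it


def chunk(documents, size=MAX_BATCH_SIZE):
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


def batch_sentiments(client, documents, language_code="en"):
    """Return one sentiment label per document, in input order.

    Documents the service reports in ErrorList keep the value None.
    """
    labels = [None] * len(documents)
    offset = 0
    for group in chunk(documents):
        response = client.batch_detect_sentiment(
            TextList=group, LanguageCode=language_code
        )
        for item in response["ResultList"]:
            # Index is the position of the document within this batch.
            labels[offset + item["Index"]] = item["Sentiment"]
        offset += len(group)
    return labels
```

Here client is expected to be a boto3 Comprehend client. Splitting the input before calling the service keeps each request within the assumed limit while still returning results in the original order.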
Amazon Comprehend APIs
The service uses six different APIs to discover insights from text. They are:
- Keyphrase Extraction API: Identifies key phrases and terms;
- Sentiment Analysis API: Returns the overall sentiment of the text as positive, negative, neutral or mixed;
- Syntax API: Allows a user to tokenize text to define word boundaries and to label words with their parts of speech, such as nouns and verbs;
- Entity Recognition API: Identifies and labels different entities in the text, such as people, places or organizations;
- Language Detection API: Identifies the primary language in which a text is written. The service can identify more than 100 languages; and
- Custom Classification API: Enables a user to build custom text classification models.
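To make the list above concrete, the sketch below maps five of these APIs to the boto3 method names that, to the best of our knowledge, correspond to them. The short keys in OPERATIONS and the helper names are hypothetical, and custom classification is omitted because it requires a trained model first.

```python
# Hypothetical short names mapped to boto3 Comprehend client methods.
OPERATIONS = {
    "key_phrases": "detect_key_phrases",
    "sentiment": "detect_sentiment",
    "syntax": "detect_syntax",
    "entities": "detect_entities",
}


def run_operation(client, name, text, language_code="en"):
    """Call one of the text analysis APIs by its short name."""
    method = getattr(client, OPERATIONS[name])
    return method(Text=text, LanguageCode=language_code)


def detect_language(client, text):
    """Return (language_code, score) for the most likely language.

    Language detection takes no LanguageCode argument, since the
    language is what the call is trying to find out.
    """
    response = client.detect_dominant_language(Text=text)
    best = max(response["Languages"], key=lambda lang: lang["Score"])
    return best["LanguageCode"], best["Score"]
```

One possible pattern is to call detect_language first and pass the returned code as language_code to run_operation, provided the detected language is one the other APIs support.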
Amazon Comprehend Medical, released at AWS re:Invent 2018, is built specifically for the medical field and can identify industry-specific terms and jargon. Comprehend also offers a specific Medical Named Entity and Relationship Extraction API. AWS does not store or use any text inputs from Amazon Comprehend Medical for future machine learning training.
Amazon Comprehend pricing and limitations
As of December 2018, Amazon Comprehend is available in three U.S. regions, two European regions and one Asia Pacific region. The service can perform textual analysis on documents in six languages: English, French, Spanish, German, Italian and Portuguese.
Amazon Comprehend is available in the AWS Free Tier; beyond that, a user is charged based on the amount of text processed per month. Requests to five of the APIs (Entity Recognition, Keyphrase Extraction, Sentiment Analysis, Syntax and Language Detection) are measured in 100-character units, with a 300-character (three-unit) minimum per request.
Pricing is further broken down into three tiers by the number of units per month: up to 10 million units, between 10 million and 50 million units and over 50 million units. AWS charges different prices per API request based on the total number of units requested per billing cycle.
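The unit and tier rules above can be sketched in a few lines of Python. This is a minimal sketch that assumes partial units are rounded up and that a total sitting exactly on a boundary falls in the lower tier; neither detail is stated in the text, and the function names are hypothetical. Per-unit prices are left out because they are not given here.

```python
import math

UNIT_CHARS = 100  # one unit is 100 characters
MIN_UNITS = 3     # 300-character minimum per request

TIER_1_LIMIT = 10_000_000  # up to 10 million units
TIER_2_LIMIT = 50_000_000  # up to 50 million units


def billable_units(text):
    """Units charged for one request (assumes partial units round up)."""
    return max(MIN_UNITS, math.ceil(len(text) / UNIT_CHARS))


def pricing_tier(monthly_units):
    """Return 1, 2 or 3 for the tier a monthly unit total falls into."""
    if monthly_units <= TIER_1_LIMIT:
        return 1
    if monthly_units <= TIER_2_LIMIT:
        return 2
    return 3
```

Under these assumptions, a 50-character request is billed as three units because of the minimum, and a 1,000-character request is billed as 10 units.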
Inference requests to the Custom Classification API are measured in the same units and billed at $0.0005 per unit. In addition, a user is charged $3 per hour for model training and $0.50 per month for custom model management.
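As a worked example of the Custom Classification charges, the sketch below adds up one month's cost using the rates quoted above. The function name is hypothetical, and the sketch assumes a single flat management fee and ignores any free tier.

```python
TRAINING_RATE_PER_HOUR = 3.00     # dollars per hour of model training
INFERENCE_RATE_PER_UNIT = 0.0005  # dollars per 100-character unit
MANAGEMENT_FEE_PER_MONTH = 0.50   # dollars per month, assumed flat


def custom_classification_cost(training_hours, inference_units):
    """Estimate one month's Custom Classification cost in dollars."""
    training = training_hours * TRAINING_RATE_PER_HOUR
    inference = inference_units * INFERENCE_RATE_PER_UNIT
    return round(training + inference + MANAGEMENT_FEE_PER_MONTH, 2)
```

For instance, two hours of training plus 10,000 inference units in a month comes to $6.00 + $5.00 + $0.50 = $11.50 under these rates.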