Amazon Comprehend

  • A managed Natural Language Processing (NLP) service that extracts meaningful information, such as entities, sentiment, and key phrases, from unstructured text.
  • It is an off-the-shelf solution that does not require deep machine learning expertise to get started.
  • Works with social media feeds, web pages, comments, product reviews, articles, or emails.
  • Can analyze text in real time by using built-in and custom models.

Common Use Cases

  • Sentiment analysis for social media posts
  • Organize documents by topics
  • Knowledge management and discovery
  • Classify support tickets for better issue handling
  • Medical cohort analysis

Amazon Comprehend generates insights in six (6) categories:

  • Entities
    • Detects and categorizes real-world objects like date, organization, person, quantity, brands, or even a title given to a song or movie.
    • Custom Entity Recognition 
      • Allows you to identify new entities that are not supported by the preset entities. 
      • This is useful if you want to extract entities that are specific only to your business, such as product codes.
  • Sentiment
    • Classifies the overall sentiment of a text as positive, negative, neutral, or mixed.
  • Language
    • Detects the language used in a text by using identifiers from RFC 5646. 
    • Useful for multilingual companies or applications.
  • Key Phrases
    • A key phrase refers to a noun or a noun phrase that describes a particular thing.
  • Personally Identifiable Information (PII)
    • Detects sensitive information that could be used to identify a person, such as full name, birth date, bank account number, phone number, or email.
  • Syntax
    • Determines the parts of speech used in the document, such as noun, pronoun, verb, adjective, and adverb.
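
As a rough illustration of how several of these insights might be requested together, here is a minimal sketch. It assumes a client object that behaves like the one returned by boto3.client("comprehend"); the helper name summarize_insights and the returned dictionary layout are hypothetical, chosen only for this example.

```python
from typing import Any, Dict


def summarize_insights(client: Any, text: str, language_code: str = "en") -> Dict[str, Any]:
    """Call several Comprehend detection APIs and keep only the main fields.

    `client` is assumed to behave like boto3.client("comprehend").
    """
    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)
    entities = client.detect_entities(Text=text, LanguageCode=language_code)
    phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)
    languages = client.detect_dominant_language(Text=text)
    return {
        "sentiment": sentiment["Sentiment"],
        "entities": [(e["Text"], e["Type"]) for e in entities["Entities"]],
        "key_phrases": [p["Text"] for p in phrases["KeyPhrases"]],
        "languages": [lang["LanguageCode"] for lang in languages["Languages"]],
    }


# Usage with real AWS credentials (not executed here):
#   import boto3
#   client = boto3.client("comprehend", region_name="us-east-1")
#   print(summarize_insights(client, "I love my new tablet!"))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub, so it can be tried without AWS credentials.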

Concepts

  • Each insight is associated with a confidence score.
  • A confidence score is between 0 and 1, indicating the probability that a given prediction is correct.
  • For example, a product review with a positive sentiment and a 0.99 confidence score strongly suggests positive feedback from a customer.
  • Topic Modeling
    • Groups a collection of documents according to their common subjects.
    • For example, you can use Topic Modeling to categorize news articles into politics, sports, business, entertainment, etc. 
  • Comprehend Custom
    • It helps non-experts in machine learning build and train their own NLP models suited to their specific needs.
    • Amazon Comprehend uses a machine learning method called transfer learning to train custom models.
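
To make the confidence score concept above concrete, the sketch below filters a response by score. It assumes response dictionaries shaped like those returned by the DetectEntities and DetectSentiment APIs, where scores are floats between 0 and 1. The sample data, the 0.9 threshold, and the helper names are hypothetical.

```python
from typing import Dict, List, Tuple


def confident_entities(response: Dict, min_score: float = 0.9) -> List[Dict]:
    """Keep only entities whose confidence score meets the threshold."""
    return [e for e in response.get("Entities", []) if e["Score"] >= min_score]


def top_sentiment(response: Dict) -> Tuple[str, float]:
    """Return the sentiment label with the highest score, and that score."""
    scores = response["SentimentScore"]
    label = max(scores, key=scores.get)
    return label, scores[label]


# Hypothetical sample responses, for illustration only.
sample_entities = {
    "Entities": [
        {"Text": "Acme", "Type": "ORGANIZATION", "Score": 0.97},
        {"Text": "blue", "Type": "OTHER", "Score": 0.41},
    ]
}
sample_sentiment = {
    "Sentiment": "POSITIVE",
    "SentimentScore": {"Positive": 0.99, "Negative": 0.0, "Neutral": 0.01, "Mixed": 0.0},
}

print(confident_entities(sample_entities))
print(top_sentiment(sample_sentiment))
```

The right threshold depends on the application: a higher value trades missed detections for fewer false positives.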

Pricing

  • Charges are based on units, where a single unit is equal to 100 characters.
  • Each request has a minimum charge of 3 units (300 characters).
  • All insights except Syntax analysis are charged at $0.0001 per unit for the first 10M units. Syntax analysis is charged at $0.00005 per unit for the first 10M units.
  • Topic Modeling has a flat rate of $1.00 per job.
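
The unit arithmetic above can be sketched as follows. This is an estimate only: it assumes the first-tier rate of $0.0001 per unit and assumes that a partial unit is rounded up to a whole unit. The function names are hypothetical, and current rates should be checked on the pricing page.

```python
import math
from typing import Iterable, Tuple

UNIT_CHARS = 100  # one unit = 100 characters
MIN_UNITS = 3     # minimum charge per request


def billable_units(char_count: int) -> int:
    """Units billed for one request (assumes partial units round up)."""
    return max(MIN_UNITS, math.ceil(char_count / UNIT_CHARS))


def estimate_cost(char_counts: Iterable[int], price_per_unit: float = 0.0001) -> Tuple[int, float]:
    """Return (total units, estimated cost in USD) for a batch of requests."""
    total_units = sum(billable_units(n) for n in char_counts)
    return total_units, total_units * price_per_unit


# Example: 10,000 reviews of 550 characters each.
units, cost = estimate_cost([550] * 10_000)
print(f"{units} units, about ${cost:.2f}")
```

Note that a 50-character request is still billed as 3 units under the minimum charge, so very short texts cost the same as 300-character ones.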

Validate Your Knowledge

Question 1

A Machine Learning Specialist working for an e-commerce company is creating an application using Amazon Comprehend. The application will analyze sentiments for reviews about various electronic products. During development, the Specialist notices that all device model names are labeled as COMMERCIAL_ITEM. The Specialist wants to identify the model names under a more specific category.

Which approach will produce the MOST appropriate result?

  1. Use regular expressions to determine the entities.
  2. Use Topic Modelling to determine entities.
  3. Create a Custom Entity Recognition model.
  4. Create a list for each product and use string matching to determine their entities.

Correct Answer: 3

Custom entity recognition extends the capability of Amazon Comprehend by enabling you to identify new entity types not supported as one of the preset generic entity types. This means that in addition to identifying entity types such as LOCATION, DATE, PERSON, and so on, you can analyze documents and extract entities like product codes or business-specific entities that fit your particular needs.

Creating a custom entity recognition model is a more effective approach, compared to using string matching or regular expressions to identify entities. For example, to extract product codes, it would be difficult to enumerate all possible patterns to apply string matching. But a custom entity recognition model can learn the context where those product codes are most likely to appear and then make such inferences even though it has never previously seen the exact product codes. As well, typos in product codes and the addition of new product codes can still be expected to be caught by Amazon Comprehend’s custom entity recognition model but would be missed when using string matches against a static list.

Hence, the correct answer is: Create a Custom Entity Recognition model.

The option that says: Use regular expressions to determine the entities is incorrect. Although this is possible, it isn’t as effective as creating a Custom Entity Recognition model.

The option that says: Use Topic Modelling to determine entities is incorrect because Topic Modeling is used for determining themes/topics across a collection of documents. The task here is to identify specific entities within the review text, not to group documents by subject.

The option that says: Create a list for each product and use string matching to determine their entities is incorrect. Like regular expressions, it would be difficult to match all possible patterns with string matching. This would produce less accurate results than when using a Custom Entity Recognition model.
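
The limitation of a static list can be seen in a small sketch. This only illustrates why exact string matching misses typos and unseen codes; it is not how Custom Entity Recognition works internally. The model names and the helper are hypothetical.

```python
from typing import List

# Hypothetical static list of known device model names.
KNOWN_MODELS = {"TX-9000", "TX-9100"}


def match_known_models(review: str) -> List[str]:
    """Return tokens in the review that exactly match the static list."""
    tokens = [token.strip(".,!?") for token in review.split()]
    return [token for token in tokens if token in KNOWN_MODELS]


print(match_known_models("The TX-9000 is great."))     # known model is found
print(match_known_models("The TX-9200 is great."))     # new model is missed
print(match_known_models("My TX-900O stopped working."))  # typo (letter O) is missed
```

A trained custom entity recognizer can use the surrounding context to flag the second and third cases, whereas the list would need a manual update for every new or misspelled code.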

References:
https://docs.aws.amazon.com/comprehend/latest/dg/custom-entity-recognition.html
https://aws.amazon.com/blogs/machine-learning/build-a-custom-entity-recognizer-using-amazon-comprehend/

References:
https://aws.amazon.com/comprehend/
https://docs.aws.amazon.com/comprehend/latest/dg/how-it-works.html
https://aws.amazon.com/comprehend/pricing/
