Amazon CloudSearch Cheat Sheet

Last updated on November 14, 2024

Bookmarks

Features
Scaling
Fault Tolerance
Monitoring
Pricing

Amazon CloudSearch Cheat Sheet

A fully-managed service in the AWS Cloud that makes it easy to set up, manage, and scale a search solution for your website or application.

Features

You can use CloudSearch to index and search both structured data and plain text.
Full text search with language-specific text processing
Boolean search

Prefix searches
Range searches
Term boosting
Faceting
Highlighting
Autocomplete Suggestions
You can get search results in JSON or XML, sort and filter results based on field values, and sort results alphabetically, numerically, or according to custom expressions.
CloudSearch can scale to accommodate the amount of data uploaded to the domain and the volume and complexity of search requests.
You can integrate CloudSearch with API Gateway.

To access the CloudSearch search and document services, you use separate domain-specific endpoints:

http://doc-domainname–domainid.us-east-1.cloudsearch.amazonaws.com—a domain’s document service endpoint is used to upload documents.
http://search-domainname–domainid.us-east-1.cloudsearch.amazonaws.com—a domain’s search endpoint is used to submit search requests.
CloudSearch supports authentication using AWS Signature Version 4.
To search your data with CloudSearch, the first thing you need to do is create a search domain. If you have multiple collections of data that you want to make searchable, you can create multiple search domains.
A search partition is the portion of your data which fits on a single search instance. A search domain can have one or more search partitions, and the number of search partitions can change as your documents are indexed.
A facet is an index field that represents a category that you want to use to refine and filter search results. When you submit search requests to CloudSearch, you can request facet information to find out how many hits share the same value in a facet.
During indexing, CloudSearch processes the contents of text and text-array fields according to the language-specific analysis scheme configured for the field. An analysis scheme controls how the text is normalized, tokenized, and stemmed, and specifies any stopwords or synonyms to take into account during indexing. CloudSearch provides default analysis schemes for each supported language.
You can customize how search results are ranked by defining expressions that calculate custom values for every document that matches your search criteria.
To make your data searchable, you need to format your data in JSON or XML. Each item that you want to be able to receive as a search result is represented as a document. Every document has a unique document ID and one or more fields that contain the data that you want to search and return in results.
You can specify a variety of options to constrain your search, request facet information, control ranking, and specify what you want to be returned in the results. You can get search results in either JSON or XML. By default, CloudSearch returns results in JSON.
By default, CloudSearch returns search results ranked according to the hits’ relevance _scores. A document’s relevance _score indicates how relevant a particular search hit is to the search request.

Scaling

Scaling for Traffic (domain depth) – When a search instance nears its maximum load, CloudSearch deploys a duplicate search instance to provide additional processing power. When traffic drops, it removes search instances to minimize costs.
Scaling for Data (domain width) – When the amount of data you add to your domain exceeds the capacity of the initial search instance type, CloudSearch scales your search domain to a larger search instance type. After a domain exceeds the capacity of the largest search instance type, CloudSearch partitions the search index across multiple search instances. When the volume of data in your domain shrinks, it scales down your domain to fewer search instances or a smaller search instance type to minimize costs.

Fault Tolerance

You can expand a CloudSearch domain to an additional AZ in the same region to increase fault tolerance in the event of a service disruption.
When you turn on the Multi-AZ option, CloudSearch provisions and maintains extra instances for your search domain in a second AZ to ensure high availability. The maximum number of AZs a domain can be deployed in is two.

Amazon CloudSearch Monitoring

You can retrieve information about each of your search domains through CloudSearch Console, AWS CLI or SDK.
Monitor a domain using CloudWatch.
Log API calls made to CloudSearch using CloudTrail.

Amazon CloudSearch Pricing

Customers are billed according to their monthly usage across the following dimensions:
- Search instances
- Document batch uploads
- IndexDocuments requests
- Data transfer

Note: If you are studying for the AWS Certified Data Engineer Associate exam, we highly recommend that you take our AWS Certified Data Engineer Associate Practice Exams and read our Data Engineer Associate exam study guide.

Amazon CloudSearch Cheat Sheet References:

https://docs.aws.amazon.com/cloudsearch/latest/developerguide/
https://aws.amazon.com/cloudsearch/pricing/
https://aws.amazon.com/cloudsearch/faqs/

Written by: Jon Bonso

Jon Bonso is the co-founder of Tutorials Dojo, an EdTech startup and an AWS Digital Training Partner that provides high-quality educational materials in the cloud computing space. He graduated from Mapúa Institute of Technology in 2007 with a bachelor's degree in Information Technology. Jon holds 10 AWS Certifications and is also an active AWS Community Builder since 2020.

AWS, Azure, and GCP Certifications are consistently among the top-paying IT certifications in the world, considering that most companies have now shifted to the cloud. Earn over $150,000 per year with an AWS, Azure, or GCP certification!

Follow us on LinkedIn, YouTube, Facebook, or join our Slack study group. More importantly, answer as many practice exams as you can to help increase your chances of passing your certification exams on your first try!

View Our AWS, Azure, and GCP Exam Reviewers Check out our FREE courses