Amazon CloudSearch - Developer Guide

Developer Guide

Amazon CloudSearch

API Version 2013-01-01

Amazon CloudSearch Developer Guide

Amazon CloudSearch: Developer Guide

Amazon's trademarks and trade dress may not be used in connection with any product or service

that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any

manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are

the property of their respective owners, who may or may not be aﬃliated with, connected to, or

sponsored by Amazon.

Amazon CloudSearch Developer Guide

Table of Contents

What Is Amazon CloudSearch? ....................................................................................................... 1

Are You New to Amazon CloudSearch? ................................................................................................... 2

How Search Works ....................................................................................................................................... 3

Indexing ..................................................................................................................................................... 3

Facets ......................................................................................................................................................... 4

Text Processing ........................................................................................................................................ 5

Sorting Results ......................................................................................................................................... 5

Search Requests ....................................................................................................................................... 5

Automatic Scaling ........................................................................................................................................ 6

Scaling for Data ....................................................................................................................................... 7

Scaling for Traﬃc .................................................................................................................................... 8

Accessing Amazon CloudSearch ................................................................................................................ 8

Regions and Endpoints .......................................................................................................................... 9

Signing Requests ..................................................................................................................................... 9

Frequently asked questions ..................................................................................................................... 10

Getting Started .............................................................................................................................. 11

Before You Begin ........................................................................................................................................ 11

Step 1: Create a Search Domain ............................................................................................................. 12

Step 2: Upload Data for Indexing .......................................................................................................... 14

Step 3: Search Your Domain .................................................................................................................... 15

Searching with the Search Tester ...................................................................................................... 15

Submitting Search Requests from a Web Browser ........................................................................ 17

Searching Numeric Fields .................................................................................................................... 17

Sorting the Search Results ................................................................................................................. 19

Getting Facet Information .................................................................................................................. 20

Getting Search Highlights ................................................................................................................... 21

Step 4: Delete Your Movies Domain ...................................................................................................... 22

Migrating to the 2013-01-01 API ................................................................................................. 24

Creating 2013-01-01 Domains ................................................................................................................ 24

Conﬁguring 2013-01-01 Domains .......................................................................................................... 24

New Conﬁguration Service Actions and Options ........................................................................... 27

Obsolete Conﬁguration Service Actions and Options ................................................................... 27

Uploading Data to 2013-01-01 Domains ............................................................................................. 28

Searching 2013-01-01 Domains ............................................................................................................. 29

API Version 2013-01-01 iii

Amazon CloudSearch Developer Guide

New Search Parameters and Options ............................................................................................... 31

Obsolete Search Parameters and Options ...................................................................................... 32

Updated Limits ........................................................................................................................................... 33

Creating and Managing Search Domains ..................................................................................... 34

Creating a Search Domain ....................................................................................................................... 34

Creating a Domain Using the Console ............................................................................................. 36

Creating a Domain Using the AWS CLI ............................................................................................ 37

Creating a Domain Using the AWS SDKs ......................................................................................... 38

Conﬁguring Access ..................................................................................................................................... 38

Writing Access Policies for Amazon CloudSearch .......................................................................... 39

Amazon CloudSearch Policy Examples ............................................................................................. 42

Conﬁguring Access for Amazon CloudSearch Using the Console ............................................... 48

Conﬁguring Access for Amazon CloudSearch with the AWS CLI ................................................. 49

Conﬁguring Access to a Domain's Endpoints Using the AWS SDKs ............................................ 50

Conﬁguring Scaling Options .................................................................................................................... 51

Choosing Scaling Options ................................................................................................................... 52

Conﬁguring Scaling Options through the Console ........................................................................ 53

Conﬁguring Scaling Options through the AWS CLI ....................................................................... 54

Conﬁguring Scaling Options through the AWS SDK ..................................................................... 55

Conﬁguring Availability Options ............................................................................................................. 55

Conﬁguring Availability Options through the Console ................................................................. 56

Conﬁguring Availability Options Using AWS CLI ............................................................................ 57

Conﬁguring Availability Options through the AWS SDK .............................................................. 58

Conﬁguring Domain Endpoint Options ................................................................................................. 58

Conﬁguring Domain Endpoint Options Using the Amazon CloudSearch Console ................... 58

Conﬁguring Domain Endpoint Options Using the AWS CLI ......................................................... 58

Conﬁguring Domain Endpoint Options Using the AWS SDKs ..................................................... 59

Monitoring Search Domains ..................................................................................................................... 59

Getting Domain Information .............................................................................................................. 59

Monitoring a Domain with Amazon CloudWatch ........................................................................... 65

Logging Conﬁguration API Calls ........................................................................................................ 68

Tracking your Amazon CloudSearch Usage and Charges ............................................................. 71

Deleting a Domain ..................................................................................................................................... 71

Deleting a Domain Using the Console ............................................................................................. 72

Deleting a Domain Using the AWS CLI ............................................................................................ 72

Deleting Domains Using the AWS SDKs .......................................................................................... 72

API Version 2013-01-01 iv

Amazon CloudSearch Developer Guide

Tagging Amazon CloudSearch Domains ................................................................................................ 73

Working with Tags (Console) .............................................................................................................. 74

Controlling How Data is Indexed ................................................................................................. 75

Preparing Your Data .................................................................................................................................. 75

Mapping Document Data to Index Fields ........................................................................................ 76

Creating Document Batches ............................................................................................................... 76

Conﬁguring Index Fields ........................................................................................................................... 82

Conﬁguring Individual Index Fields ................................................................................................... 84

Conﬁguring Index Fields Using the Console ................................................................................... 85

Conﬁguring Index Fields Using the AWS SDK ................................................................................ 86

Using Dynamic Fields ................................................................................................................................ 86

Conﬁguring Dynamic Fields ................................................................................................................ 87

Ignoring Unrecognized Document Fields ......................................................................................... 88

Searching Dynamic Fields ................................................................................................................... 89

Conﬁguring Analysis Schemes ................................................................................................................. 90

Stemming ............................................................................................................................................... 91

Stopwords ............................................................................................................................................... 93

Synonyms .............................................................................................................................................. 94

Conﬁguring Analysis Schemes Using the Console ......................................................................... 95

Conﬁguring Analysis Schemes Using the AWS CLI ........................................................................ 96

Conﬁguring Analysis Schemes Using the AWS SDKs ..................................................................... 98

Indexing Bigrams for Chinese, Japanese, and Korean ................................................................... 98

Customizing Japanese Tokenization ................................................................................................. 99

Text Processing ......................................................................................................................................... 104

Supported Languages ........................................................................................................................ 105

Language Speciﬁc Settings .............................................................................................................. 105

Uploading and Indexing Data ..................................................................................................... 123

Uploading Data ........................................................................................................................................ 123

Submitting Document Upload Requests ....................................................................................... 125

Bulk Uploads ....................................................................................................................................... 126

Uploading Data Using the Console ................................................................................................ 127

Uploading Data Using the AWS CLI ............................................................................................... 129

Posting Documents to a Document Service Endpoint via HTTP ............................................... 129

Indexing Document Data ....................................................................................................................... 129

Indexing Documents Using the Console ........................................................................................ 131

Indexing Documents Using the AWS CLI ....................................................................................... 131

API Version 2013-01-01 v

Amazon CloudSearch Developer Guide

Indexing Documents with the AWS SDK ....................................................................................... 131

Searching Your Data .................................................................................................................... 132

Submitting Search Requests .................................................................................................................. 133

Searching with the Search Tester ................................................................................................... 135

Constructing Compound Queries ......................................................................................................... 137

Searching for Text in Amazon CloudSearch ....................................................................................... 139

Searching for Individual Terms ........................................................................................................ 140

Searching for Phrases ........................................................................................................................ 142

Searching for Literal Strings ............................................................................................................ 143

Searching for Preﬁxes ........................................................................................................................ 144

Searching for Numbers ........................................................................................................................... 146

Searching for Dates and Times ............................................................................................................. 146

Searching for a Range of Values .......................................................................................................... 147

Searching for a Date Range ............................................................................................................. 147

Searching for a Location Range ...................................................................................................... 148

Searching for a Text Range .............................................................................................................. 148

Searching and Ranking Results by Geographic Location ................................................................. 148

Searching Within an Area ................................................................................................................. 149

Sorting Results by Distance ............................................................................................................. 149

Searching DynamoDB Data .................................................................................................................... 151

Conﬁguring a Domain to Search DynamoDB Data ...................................................................... 151

Uploading Data from DynamoDB ................................................................................................... 153

Synchronizing a Search Domain with a DynamoDB Table ......................................................... 154

Filtering Matching Documents .............................................................................................................. 156

Tuning Search Requests ......................................................................................................................... 156

Analyzing Query Latency .................................................................................................................. 156

Querying For More Information ................................................................................................. 159

Retrieving Data from Index Fields ........................................................................................................ 159

Getting Statistics for Numeric Fields ................................................................................................... 160

Getting and Using Facet Information .................................................................................................. 161

Getting Facet Information ................................................................................................................ 162

Using Facet Information .................................................................................................................... 164

Highlighting Search Hits ........................................................................................................................ 171

Getting Suggestions ................................................................................................................................ 172

Conﬁguring Suggesters ..................................................................................................................... 173

Retrieving Suggestions ...................................................................................................................... 176

API Version 2013-01-01 vi

Amazon CloudSearch Developer Guide

Controlling Search Results .......................................................................................................... 179

Sorting Results ......................................................................................................................................... 179

Using Relative Field Weighting to Customize Text Relevance ........................................................ 181

Conﬁguring Expressions ......................................................................................................................... 181

Writing Expressions ............................................................................................................................ 182

Deﬁning Expressions in Search Requests ...................................................................................... 183

Conﬁguring Reusable Expressions .................................................................................................. 184

Comparing Expressions ..................................................................................................................... 186

Getting Results as XML .......................................................................................................................... 188

Paginating Results ................................................................................................................................... 189

Deep Paging Beyond 10,000 Hits ................................................................................................... 189

Integrating with API Gateway .................................................................................................... 192

Prerequisites .............................................................................................................................................. 192

Creating and Conﬁguring an API (Console) ........................................................................................ 195

Testing the API (Console) ....................................................................................................................... 196

Handling Errors ........................................................................................................................... 198

Error Types in Amazon CloudSearch ................................................................................................... 198

Retrying Requests in Amazon CloudSearch ....................................................................................... 199

Amazon CloudSearch API Reference .......................................................................................... 200

Conﬁguration API Reference ................................................................................................................. 200

Submitting Conﬁguration Requests ............................................................................................... 200

Actions .................................................................................................................................................. 203

Data Types ........................................................................................................................................... 256

Common Parameters ......................................................................................................................... 302

Common Errors ................................................................................................................................... 304

Document Service API Reference ......................................................................................................... 306

documents/batch ................................................................................................................................ 307

Search API Reference .............................................................................................................................. 319

Search .................................................................................................................................................... 319

Submitting Suggest Requests .......................................................................................................... 342

Suggest ................................................................................................................................................. 343

Search Service Errors ......................................................................................................................... 346

Troubleshooting ........................................................................................................................... 348

Uploading Documents ............................................................................................................................ 348

Deleting All Documents .......................................................................................................................... 349

Domain Not Scaling Down After Deleting Documents .................................................................... 350

API Version 2013-01-01 vii

Amazon CloudSearch Developer Guide

Document Update Latency .................................................................................................................... 350

Large Number of 5xx Errors When Uploading Documents ............................................................. 350

Search Latency and Timeouts ............................................................................................................... 350

Search Latency and Timeouts ............................................................................................................... 351

Sudden Increase in 5xx Errors when Searching ................................................................................. 351

Indexing Failures after Updating Indexing Options .......................................................................... 351

Domain Not Found Error ........................................................................................................................ 352

Number of Searchable Documents Not Returned ............................................................................ 352

Conﬁguration Service Access Policies Not Working .......................................................................... 352

Search and Document Service Access Policies Not Working ........................................................... 353

Amazon CloudSearch Console Permissions Errors ............................................................................ 354

Using Wildcards to Search Text Fields Doesn't Produce Expected Results ................................... 354

Inconsistent Results When Using Cursors for Deep Paging ............................................................ 355

Certiﬁcate Errors When Using an SDK ................................................................................................ 355

Limits ............................................................................................................................................ 357

Resources ...................................................................................................................................... 362

Document History ........................................................................................................................ 363

AWS Glossary ............................................................................................................................... 369

API Version 2013-01-01 viii

Amazon CloudSearch Developer Guide

What Is Amazon CloudSearch?

Important

Amazon CloudSearch is no longer available to new customers. Existing customers of

Amazon CloudSearch can continue to use the service as usual. Learn more.

Amazon CloudSearch is a fully managed service in the cloud that makes it easy to set up, manage,

and scale a search solution for your website or application.

With Amazon CloudSearch you can search large collections of data such as web pages, document

ﬁles, forum posts, or product information. You can quickly add search capabilities without having

to become a search expert or worry about hardware provisioning, setup, and maintenance. As your

volume of data and traﬃc ﬂuctuates, Amazon CloudSearch scales to meet your needs.

Note

This document describes the Amazon CloudSearch 2013-01-01 API. If you have

2011-02-01 search domains and need to reference the old documentation, you can

download a PDF of the 2011-02-01 Developer Guide.

You can use Amazon CloudSearch to index and search both structured data and plain text. Amazon

CloudSearch features:

• Full text search with language-speciﬁc text processing

• Boolean search

• Preﬁx searches

• Range searches

• Term boosting

• Faceting

• Highlighting

• Autocomplete suggestions

API Version 2013-01-01 1

Amazon CloudSearch Developer Guide

You can get search results in JSON or XML, sort and ﬁlter results based on ﬁeld values, and sort

results alphabetically, numerically, or according to custom expressions.

To build a search solution with Amazon CloudSearch, you take the following steps:

• Create and conﬁgure a search domain. A search domain includes your searchable data and the

search instances that handle your search requests. If you have multiple collections of data that

you want to make searchable, you can create multiple search domains.

• Upload the data you want to search to your domain. Amazon CloudSearch indexes your data

and deploys the search index to one or more search instances.

• Search your domain. You send a search request to your domain's search endpoint as an HTTP/

HTTPS GET request.

Topics

• Are You New to Amazon CloudSearch?

• How Search Works

• Automatic Scaling in Amazon CloudSearch

• Accessing Amazon CloudSearch

• Frequently asked questions

Are You New to Amazon CloudSearch?

For a high-level overview of Amazon CloudSearch, service highlights, and pricing information, see

the Amazon CloudSearch detail page. If you are ready to start using Amazon CloudSearch, you

should begin with Getting Started with Amazon CloudSearch.

You can interact with Amazon CloudSearch through the AWS Management Console, AWS SDKs, or

AWS CLI. While you can also submit API requests directly to Amazon CloudSearch, the SDKs and

AWS CLI automatically sign your requests as needed and provide centralized tools for interacting

with Amazon CloudSearch domains in conjunction with other AWS services. For information about

the AWS SDKs, see Tools for Amazon Web Services. For information about installing and using the

AWS CLI, see the AWS Command Line Interface User Guide.

For more information about conﬁguring and managing your search domains, getting your data into

Amazon CloudSearch, submitting search requests, and processing the responses, see:

Are You New to Amazon CloudSearch? API Version 2013-01-01 2

Amazon CloudSearch Developer Guide

• Preparing Your Data—how to format your data so you can upload it to an Amazon CloudSearch

domain for indexing

• conﬁgure indexing options—how to conﬁgure indexing options for an Amazon CloudSearch

domain

• Searching Your Data with Amazon CloudSearch—how to use the Amazon CloudSearch query

language

• Controlling Search Results—how to sort, ﬁlter, and paginate search results

How Search Works

The collection of data that you want to search (sometimes referred to as your corpus) can consist

of unstructured full-text documents, semi-structured documents such as those formatted in mark-

up languages like XML, or structured data that conforms to a strict data model. Each item that you

want to be able to search (such as a forum post or web page) is represented as a document. Every

document has a unique ID and one or more ﬁelds that contain the data that you want to search

and include in results.

To make your data searchable, you represent it as a batch of documents in either JSON or XML and

upload the batch to your search domain. Amazon CloudSearch then generates a search index from

your document data according to your domain's conﬁguration options. You submit queries against

this index to ﬁnd the documents that meet speciﬁc search criteria.

As your data changes, you submit updates to add, change, or delete documents from your index.

Updates are applied continuously in the order they are received.

For information about how to format your data, see Preparing Your Data.

Indexing in Amazon CloudSearch

To build a search index from your data, Amazon CloudSearch needs the following information:

• Which document ﬁelds do you want to search?

• Which document ﬁeld values do you want to retrieve with the search results?

• Which document ﬁelds represent categories that you want to use to reﬁne and ﬁlter search

results?

• How should the text within a particular ﬁeld be processed?

How Search Works API Version 2013-01-01 3

Amazon CloudSearch Developer Guide

You deﬁne this metadata in your domain conﬁguration by conﬁguring indexing options. You use

indexing options to specify the ﬁelds included in the search index and control how you can use

those ﬁelds.

You must conﬁgure a corresponding index ﬁeld for each document ﬁeld that occurs in your data—

there's a one-to-one mapping between document ﬁelds and the ﬁelds in your Amazon CloudSearch

index. In addition to the index ﬁeld name, you specify the following:

• The index ﬁeld type

•

Whether the ﬁeld is searchable (text and text-array ﬁelds are always searchable)

• Whether the ﬁeld can be used as a category (facet)

• Whether the ﬁeld value can be returned with the search results

• Whether the ﬁeld can be used to sort the results

• Whether highlights can be returned for the ﬁeld

• A default value to use if no value is speciﬁed in the document data.

For information about how to conﬁgure index ﬁelds for Amazon CloudSearch, see conﬁgure

indexing options.

Facets in Amazon CloudSearch

A facet is an index ﬁeld that represents a category that you want to use to reﬁne and ﬁlter

search results. When you submit search requests to Amazon CloudSearch, you can request facet

information to ﬁnd out how many hits share the same value in a facet. You can display this

information along with the search results and use it to enable users to interactively reﬁne their

searches. (This is often referred to as faceted navigation or faceted search.)

A facet can be any date, literal, or numeric ﬁeld that has faceting enabled in your domain

conﬁguration. For each facet, Amazon CloudSearch calculates the number of hits that share the

same value. You can deﬁne buckets to calculate facet counts for particular subsets of the facet

values. Only buckets that have matches are included in the facet results.

For information about conﬁguring facets, see conﬁgure indexing options. For information about

using facet information to support faceted navigation, see Getting and Using Facet Information in

Amazon CloudSearch.

Facets API Version 2013-01-01 4

Amazon CloudSearch Developer Guide

Text Processing in Amazon CloudSearch

During indexing, Amazon CloudSearch processes the contents of text and text-array ﬁelds

according to the language-speciﬁc analysis scheme conﬁgured for the ﬁeld. An analysis scheme

controls how the text is normalized, tokenized, and stemmed, and speciﬁes any stopwords or

synonyms to take into account during indexing. Amazon CloudSearch provides default analysis

schemes for each supported language. For information about conﬁguring custom analysis schemes,

see Conﬁguring Analysis Schemes. For information about how Amazon CloudSearch normalizes

and tokenizes text and applies conﬁgured text options when indexing text ﬁelds and processing

search requests, see Text Processing in Amazon CloudSearch.

Sorting Results in Amazon CloudSearch

You can customize how search results are ranked by deﬁning expressions that calculate custom

values for every document that matches your search criteria. For example, you might deﬁne

an expression that takes into account the value in a document's popularity ﬁeld as well as

the default relevance score calculated by Amazon CloudSearch Expressions are simply numeric

expressions that use standard numeric operators and functions. Expressions can reference int and

double ﬁelds, other expressions, a document's relevance score (_score), as well as the epoch time

(_time). When you submit search requests, you specify the expression(s) you want to use to sort the

search results. You can also reference expressions within your search criteria.

A document's relevance _score indicates how relevant a particular search hit is to the search

request. To calculate the relevance score, Amazon CloudSearch takes into account how many times

the search terms appear in a document relative to the other documents in the index.

For information about how to conﬁgure expressions for your domain, see Conﬁguring Expressions.

Search Requests in Amazon CloudSearch

You submit search requests to your domain's search endpoint as HTTP/HTTPS GET requests. You

can specify a variety of options to constrain your search, request facet information, control ranking,

and specify what you want to be returned in the results. You can get search results in either JSON

or XML. By default, Amazon CloudSearch returns results in JSON.

When you submit a search request, Amazon CloudSearch performs text processing on the search

string. The search string is processed to:

• Convert all characters to lowercase

Text Processing API Version 2013-01-01 5

Amazon CloudSearch Developer Guide

• Split the string into separate terms on whitespace and punctuation boundaries

• Remove terms that are on the stopword list for the ﬁeld being searched.

• Map stems and synonyms according to the stemming and synonym options conﬁgured for the

ﬁeld being searched.

After this preprocessing is complete, Amazon CloudSearch looks up the search terms in the index

and identiﬁes all of the documents that match the request. To generate a response, Amazon

CloudSearch processes this list of search hits to ﬁlter and sort the matching documents and

compute facets. Amazon CloudSearch then returns the response in JSON or XML.

By default, Amazon CloudSearch returns search results ranked according to the hits' relevance

_scores. Alternatively, your request can specify the index ﬁeld or expression that you want to use to

sort the hits. For example, you might want to sort hits by an index ﬁeld that contains the price or

an expression that calculates popularity.

For more information about searching, ranking, and paginating results, see Searching Your Data

with Amazon CloudSearch.

Automatic Scaling in Amazon CloudSearch

A search domain has one or more search instances, each with a ﬁnite amount of RAM and CPU

resources for indexing data and processing requests. How many search instances a domain needs

depends on the documents in your collection and the volume and complexity of your search

requests.

Amazon CloudSearch can determine the size and number of search instances required to deliver

low latency, high throughput search performance. When you upload your data and conﬁgure your

index, Amazon CloudSearch builds an index and picks the appropriate initial search instance type.

As you use your search domain, Amazon CloudSearch can scale to accommodate the amount of

data uploaded to the domain and the volume and complexity of search requests.

When you create a search domain, a single instance is deployed for the domain. As the following

illustration shows, you always have at least one instance for your domain. Amazon CloudSearch

automatically scales the domain by adding instances as the volume of data or traﬃc increases.

Automatic Scaling API Version 2013-01-01 6

Amazon CloudSearch Developer Guide

Scaling for Data

When the amount of data you add to your domain exceeds the capacity of the initial search

instance type, Amazon CloudSearch scales your search domain to a larger search instance type.

After a domain exceeds the capacity of the largest search instance type, Amazon CloudSearch

partitions the search index across multiple search instances. (The number of search instances

required to hold the index partitions is sometimes referred to as the domain's width.)

When the volume of data in your domain shrinks, Amazon CloudSearch scales down your domain

to fewer search instances or a smaller search instance type to minimize costs.

Scaling for Data API Version 2013-01-01 7

Amazon CloudSearch Developer Guide

Note

If your domain has scaled up to accommodate your index size and you delete a large

number of documents, the domain scales down the next time the full index is rebuilt.

Although the index is automatically rebuilt periodically, to scale down as quickly as possible

you can explicitly run indexing when you are done deleting documents.

Scaling for Traﬃc

As your search request volume or complexity increases, it takes more processing power to handle

the load. A high volume of document uploads also increases the load on a domain's search

instances. When a search instance nears its maximum load, Amazon CloudSearch deploys a

duplicate search instance to provide additional processing power. (The number of duplicate search

instances is sometimes referred to as the domain's depth.)

When traﬃc drops, Amazon CloudSearch removes search instances to minimize costs. For example,

a new domain might scale up to handle the initial inﬂux of documents, and scale back down after

you have ﬁnished uploading your data and are only submitting updates.

If your domain experiences a sudden surge in traﬃc, Amazon CloudSearch deploys additional

search instances. It takes a few minutes to set up the new instances, however, so you might see an

increase in 5xx errors until the new instances can start processing requests. For more information

about handling 5xx errors, see Handling Errors.

Keep in mind that the type and complexity of your search requests aﬀect overall search

performance and in some cases increase the number of search instances required to operate your

domain. Submitting a high volume of small or single-document batches can aﬀect your search

domain's performance. For more information, see Tuning Search Request Performance in Amazon

CloudSearch.

Accessing Amazon CloudSearch

You can access Amazon CloudSearch through the Amazon CloudSearch console, the AWS SDKs, or

the AWS CLI.

• The Amazon CloudSearch console enables you to easily create, conﬁgure, and monitor your

search domains, upload documents, and run test searches. Using the console is the easiest way

Scaling for Traﬃc API Version 2013-01-01 8

Amazon CloudSearch Developer Guide

to get started with Amazon CloudSearch and provides a central command center for the ongoing

management of your search domains.

• The AWS SDKs support all of the Amazon CloudSearch API operations, making it easy to manage

and interact with your search domains using your preferred technology. The SDKs automatically

sign requests as needed using your AWS credentials.

• The AWS CLI wraps all of the Amazon CloudSearch API operations to provide a simple way to

create and conﬁgure search domains, upload the data you want to search, and submit search

requests. The AWS CLI automatically signs requests as needed using your AWS credentials.

Regions and Endpoints for Amazon CloudSearch

Amazon CloudSearch provides regional endpoints for accessing the conﬁguration service and

domain-speciﬁc endpoints for accessing the search and document services.

You use the conﬁguration service to create and manage your search domains. The region-speciﬁc

conﬁguration service endpoints are of the form: cloudsearch.region.amazonaws.com. For

example, cloudsearch.us-east-1.amazonaws.com. For a current list of supported regions,

see Regions and Endpoints in the AWS General Reference.

To access the Amazon CloudSearch search and document services, you use separate domain-

speciﬁc endpoints:

•

http://doc-domainname-domainid.us-east-1.cloudsearch.amazonaws.com—a

domain's document service endpoint is used to upload documents

•

http://search-domainname-domainid.us-east-1.cloudsearch.amazonaws.com—a

domain's search endpoint is used to submit search requests

Signing Amazon CloudSearch Requests

If you're using a language for which AWS provides an SDK, we recommend that you use the SDK

to submit Amazon CloudSearch requests. All of the AWS SDKs greatly simplify the process of

signing requests and save you a signiﬁcant amount of time when compared to using the Amazon

CloudSearch APIs directly. The SDKs integrate easily with your development environment and

provide easy access to related commands. You can also use the Amazon CloudSearch console and

AWS CLI to submit signed requests with no additional eﬀort.

Regions and Endpoints API Version 2013-01-01 9

Amazon CloudSearch Developer Guide

If you choose to call the Amazon CloudSearch APIs directly, you must sign your own requests.

Conﬁguration service requests must always be signed. Upload, search, and suggest requests

must be signed, unless you conﬁgure anonymous access for those services. To sign a request,

you calculate a digital signature using a cryptographic hash function, which returns a hash

value based on the input. The input includes the text of your request and your secret access

key. The hash function returns a hash value that you include in the request as your signature.

The signature is part of the Authorization header of your request. After receiving your request,

Amazon CloudSearch recalculates the signature using the same hash function and input that you

used to sign the request. If the resulting signature matches the signature in the request, Amazon

CloudSearch processes the request. Otherwise, the request is rejected.

Amazon CloudSearch supports authentication using AWS Signature Version 4. For more

information, see Signature Version 4 Signing Process.

Frequently asked questions

What is the cutoﬀ point for “current customers”?

We created an allowlist of account IDs that are already using Amazon CloudSearch. However,

we will allowlist any new account of customers previously using Amazon CloudSearch. If you are

having diﬃculties, please submit a support ticket.

What do we mean by “access” to the service?

Current customers can do anything they could previously. The only change is that non-current

customers cannot access Amazon CloudSearch.

Can existing Amazon CloudSearch customers create new repositories if they were alreadyAmazon

CloudSearch?

Yes. If you are having diﬃculties, please submit a support ticket

Frequently asked questions API Version 2013-01-01 10

Amazon CloudSearch Developer Guide

Getting Started with Amazon CloudSearch

To start searching your data with Amazon CloudSearch, you simply take the following steps:

• Create and conﬁgure a search domain

• Upload and index the data you want to search

• Send search requests to your domain

This tutorial shows you how to get up and running using the AWS Management Console for

Amazon CloudSearch. To make it even easier to get started, we've generated a sample data set

of 5,000 popular movie titles that you can download and examine, upload to your own search

domain, and submit search queries against to see how Amazon CloudSearch works.

Using the AWS Management Console and the sample movie data, you'll have your own search

domain up and running in about half an hour.

To begin, Get Signed Up.

Topics

• Before You Begin with Amazon CloudSearch

• Step 1: Create an Amazon CloudSearch Domain

• Step 2: Upload Data to Amazon CloudSearch for Indexing

• Step 3: Search Your Amazon CloudSearch Domain

• Step 4: Delete Your Amazon CloudSearch Movies Domain

Before You Begin with Amazon CloudSearch

To use Amazon CloudSearch, you need an Amazon Web Services (AWS) account. Your AWS account

enables you to access Amazon CloudSearch and other AWS services, such as Amazon Simple

Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2). As with other AWS

services, you pay only for the Amazon CloudSearch resources you use. There are no sign up fees

and charges are not incurred until you create a search domain.

If you already have an AWS account, you are automatically signed up for Amazon CloudSearch.

Before You Begin API Version 2013-01-01 11

Amazon CloudSearch Developer Guide

To create an AWS account

1. Go to https://aws.amazon.com and click Sign Up Now.

2. Follow the instructions to sign up. You will need to enter payment information before you can

begin using Amazon CloudSearch.

Step 1: Create an Amazon CloudSearch Domain

An Amazon CloudSearch domain encapsulates a collection of data you want to search, the search

instances that process your search requests, and a conﬁguration that controls how your data is

indexed and searched. You create a separate search domain for each collection of data you want

to make searchable. For each domain, you conﬁgure indexing options that describe the ﬁelds

you want to include in your index and how you want to use them, analysis schemes that specify

language-speciﬁc text processing options for individual ﬁelds, expressions that you can use to

customize how search results are ranked, and access policies that control access to the domain's

document and search endpoints.

You interact with a search domain to:

• Conﬁgure index and search options

• Submit data for indexing

• Perform searches

Each domain has a unique endpoint through which you submit search requests to the domain. For

example, the endpoint for a domain called movies created in the US East (N. Virginia) region might

be:

Example

search-movies-mtshfsu2rje7ywr66uit3dei4m.us-east-1.cloudsearch.amazonaws.com

When creating a search domain, you specify a unique name for the domain. Domain names must

start with a letter or number and be at least 3 and no more than 28 characters long. The allowed

characters are: a-z, 0-9, and hyphen (-). By default, new domains are created in the US East (N.

Virginia) region. To create a domain in another region, you must explicitly specify the region when

creating the domain.

Step 1: Create a Search Domain API Version 2013-01-01 12

Amazon CloudSearch Developer Guide

To conﬁgure the new domain, you must specify:

• Indexing options for the data you want to search.

• Access policies for the domain's document service and search service endpoints.

This tutorial shows you how to create and interact with a domain using the Amazon CloudSearch

console. To learn more, see Creating a Search Domain.

Important

The domain you're about to create will be live and you will incur the standard Amazon

CloudSearch usage fees for the domain until you delete it. For more information about

Amazon CloudSearch usage rates, go to the Amazon CloudSearch detail page.

To create your movies domain

1. Go to the Amazon CloudSearch console at https://console.aws.amazon.com/cloudsearch/

home.

2. Choose Create domain.

3. Enter a name for your new domain. Domain names must start with a letter or number and

be at least 3 and no more than 28 characters. Domain names can contain the following

characters: a-z (lower case), 0-9, and - (hyphen). Upper case letters and underscores are not

allowed.

4. Leave the other settings as their defaults and choose Next.

5. Select Sample data and choose IMDb movies (demo) from the dropdown. You can also

automatically conﬁgure a search domain by analyzing a sample of your data.

6. Choose Next.

7. Review the index ﬁelds being conﬁgured. Eleven ﬁelds are conﬁgured automatically for

the imdb-movie data: actors, directors, genres, image_url, plot, rank, rating, release_date,

running_time_secs, title, and year.

Note

By default, all options are enabled for each ﬁeld. While this is convenient for

development and testing, ﬁne-tuning the options conﬁgured for each ﬁeld according

Step 1: Create a Search Domain API Version 2013-01-01 13

Amazon CloudSearch Developer Guide

to how you use those ﬁelds can reduce the size of your index. If your domain uses more

than a single small search instance, tuning can help minimize the cost of running your

domain.

When you ﬁnish reviewing the indexing options, choose Next.

8. For simplicity in this tutorial, use an open access domain. Choose Allow open access to the

domain and choose Next.

9. Review the domain conﬁguration and click Create to create your domain.

Amazon CloudSearch initializes resources for the domain, which can take about ten minutes.

During this initialization process, the status of the domain is Processing. Once the status changes

to Active, you can upload your data and start searching.

Step 2: Upload Data to Amazon CloudSearch for Indexing

You upload the data you want to search to your domain so that Amazon CloudSearch can build and

deploy a searchable index. To be indexed by Amazon CloudSearch, the data must be formatted in

either JSON or XML. The Amazon CloudSearch console can automatically convert the following ﬁle

types to the required format:

• Document batches formatted in JSON or XML (.json, .xml)

• Comma Separated Value (.csv)

• Text Documents (.txt)

When you upload a CSV ﬁle, Amazon CloudSearch parses each row separately. The ﬁrst row deﬁnes

the document ﬁelds, and each subsequent row becomes a separate document. For all other ﬁle

types Amazon CloudSearch creates a single document and the contents of the ﬁle are mapped to

a single text ﬁeld. If metadata is available for the ﬁle, the metadata is mapped to corresponding

document ﬁelds—the ﬁelds generated from the document metadata vary depending on the ﬁle

type.

The sample IMDb movies data is already formatted in JSON.

This tutorial shows how to submit data through the Amazon CloudSearch console, but you can

also convert and upload documents with the command line tools, and upload documents using the

Step 2: Upload Data for Indexing API Version 2013-01-01 14

Amazon CloudSearch Developer Guide

documents/batch resource. (To upload more than 5 MB of data, you must use the command line

tools or API.)

To upload the sample data to your movies domain

1. Go to the Amazon CloudSearch console at https://console.aws.amazon.com/cloudsearch/

home.

2. In the left navigation pane, choose Domains. Choose the name of your movies domain to view

the domain dashboard.

3. Choose Actions, Upload documents.

4. Select Sample data and choose IMDb movies (demo) from the dropown.

5. Choose Next.

6. Review the upload summary and choose Upload documents to send the data to your domain

for indexing.

Note

To see how the data is formatted, choose Download the generated document batch.

For more information about preparing your own data, see Preparing Your Data.

You now have a fully functional Amazon CloudSearch domain that you can start searching. Updates

are applied continuously in the order they are received, so you can start searching your domain

right away.

Step 3: Search Your Amazon CloudSearch Domain

You can use the search tester in the Amazon CloudSearch console to submit sample search

requests and view the results. You can also submit sample search requests through a Web browser

or using cURL. In your application, you can use any HTTP library to send search traﬃc to your

Amazon CloudSearch domain.

Searching with the Search Tester

The search tester in the Amazon CloudSearch console enables you to submit sample search

requests using any of the supported query parsers: simple, structured, lucene, or dismax. By

default, requests are processed with the simple query parser. You can specify options for the

Step 3: Search Your Domain API Version 2013-01-01 15

Amazon CloudSearch Developer Guide

selected parser, ﬁlter and sort the results, and browse the conﬁgured facets. The search hits

are automatically highlighted in the search results. For information about how this is done,

see Highlighting Search Hits in Amazon CloudSearch. You can also select a suggester to get

suggestions as you enter terms in the Search ﬁeld. (You must conﬁgure a suggester before you

can get suggestions. For more information see Getting Autocomplete Suggestions in Amazon

CloudSearch.)

By default, results are sorted according to an automatically-generated relevance score, _score. For

information about customizing how results are ranked, see Sorting Results in Amazon CloudSearch.

To search your domain

1. Go to the Amazon CloudSearch console at https://console.aws.amazon.com/cloudsearch/

home.

2. In the left navigation panel, choose your movies domain to open its conﬁguration.

3. Choose Run a test search.

To perform a simple text search, enter a search query and choose Run. By default, all text and

text-array ﬁelds are searched.

To search particular ﬁelds, expand Options and enter a comma-separated list of the ﬁelds you

want to search for in the Search ﬁelds ﬁeld. You can append a weight to each ﬁeld with a caret

(^) to control the relative importance of each ﬁeld in the search results. For example, specifying

title^5, description weights hits in the title ﬁeld ﬁve times more than hits in the

description ﬁeld when calculating relevance scores for each matching document.

To use the structured query syntax, select Structured from the Query parser menu. Once you've

selected the structured query parser, enter your structured query in the Search ﬁeld and choose

Run. For example, to ﬁnd all of the movies with star in the title that were released in the year 2000

or earlier, you could enter: (and title:'star' year:{,2000]). For more information, see

Constructing Compound Queries. To submit Lucene or DisMax queries, select the appropriate query

parser.

You can specify additional options for the selected query parser to conﬁgure the default operator

and control which operators can be used in a query. For more information, see Search Request

Parameters.

You can copy and paste the request URL to submit the request and view the response from a Web

browser. Requests can be sent via HTTP or HTTPS.

Searching with the Search Tester API Version 2013-01-01 16

Amazon CloudSearch Developer Guide

Submitting Search Requests from a Web Browser

You can submit search requests directly to your search endpoint from any Web browser. You

can use any of the query parsers (simple, structured, lucene, or dismax) and specify a variety of

options to constrain your search, request facet information, customize ranking, and control what

information is returned in the results.

For example, to search your movies domain and get the titles of all of the available Star Wars

movies, append the following search string to your search endpoint. (2013-01-01 is the API version

and must be speciﬁed.)

Example

/2013-01-01/search?q=star+wars&return=title

Note

Your domain's search endpoint is shown on the domain dashboard. You can also perform a

search from the AWS Management Console, view the raw request and response, and copy

the request URL from the Search Request ﬁeld. A domain's search and document service

endpoints remain the same for the life of the domain.

By default, Amazon CloudSearch returns the response in JSON. You can also get the search results

formatted in XML by specifying the format parameter, format=xml. (Note that errors can be

returned in either JSON or XML, depending on where the error originated.)

Searching Numeric Fields

You can use the structured query syntax, q.parser=structured, to ﬁnd documents that have

particular numeric attributes. You can search for an exact value or a range of values within any

numeric ﬁeld (double, double-array, int, int-array). To search for a range, you specify the

upper and lower bounds, separated by a comma, and enclose the range in brackets or braces. Use

square brackets ([,]) when you want to include the bounds, and curly braces ({,}) to exclude the

bounds. For example:

•

year:2000 matches documents whose year ﬁeld contains the value 2000.

•

year:[2000,} matches documents whose year ﬁeld contains a value greater than or equal to

2000

Submitting Search Requests from a Web Browser API Version 2013-01-01 17

Amazon CloudSearch Developer Guide

•

year:{,2000] matches documents whose year ﬁeld contains a value less than or equal to 2000

•

year:[2000,2011] matches documents whose year ﬁeld contains a value between 2000 and

2011, inclusive.

•

year:{2000,2011} matches documents whose year ﬁeld contains a value between 2000 and

2011, exclusive.

You can also search date ﬁelds for a speciﬁc date or date range, but

you must enclose each date string in single quotes: release_date:

['2000-01-01T00:00:00Z','2011-01-01T00:00:00Z'].

For example, the following structured query searches for "star" in the title ﬁeld, ﬁnds all of the

matching movies that were released before 2000, and returns the title, year, and relevance score

for each one:

Example

q=(and title:'star' year:{,2000])&q.parser=structured&return=title,year,_score

The response shows the status of the request, the number of matching documents, and the

requested ﬁelds for each hit.

{

"status": {

"rid": "hLPckLsoEQoELQo=",

"time-ms": 2

"hits": {

"found": 15,

"start": 0,

"hit": [

{

"id": "tt0076759",

"fields": {

"title": "Star Wars",

"year": "1977",

"_score": "5.7601414"

}

Searching Numeric Fields API Version 2013-01-01 18

Amazon CloudSearch Developer Guide

{

"id": "tt0088170",

"fields": {

"title": "Star Trek III: The Search for Spock",

"year": "1984",

"_score": "4.2371693"

}

]

}

For more information about constructing search queries, see Searching Your Data with Amazon

CloudSearch.

Sorting the Search Results

By default, Amazon CloudSearch sorts the search results according to an automatically generated

relevance _score. You can change how results are ranked by using the sort parameter in your

search request to specify the ﬁeld or expression you want to use for ranking. (An expression is

a custom numeric expression that can be evaluated for each document in the set of matching

documents. For information about deﬁning your own expressions, see Conﬁguring Expressions.)

If you specify a text ﬁeld with the sort parameter, the results are sorted alphabetically according

to that ﬁeld. For example, to sort results from your movies domain alphabetically by title, add

&sort=title asc to your query string:

Example

2013-01-01/search?q=(and genres:'Sci-Fi' year:

{,2000])&q.parser=structured&return=title,year&sort=title asc

Note that you must explicitly specify the sort direction, asc (ascending) or desc (descending).

When you sort alphabetically, Amazon CloudSearch sorts by Unicode codepoint. This means

numbers come before letters and uppercase letters come before lowercase letters. Numbers are

sorted as strings; for example, 10 will come before 2.

Similarly, you can specify an integer ﬁeld with the sort parameter to sort the results numerically.

Sorting the Search Results API Version 2013-01-01 19

Amazon CloudSearch Developer Guide

If you specify a comma separated list of ﬁelds or expressions, the ﬁrst ﬁeld or expression is used as

the primary sort criteria, the second is used as the secondary sort criteria, and so on.

For more information about ranking results, see Sorting Results in Amazon CloudSearch

Getting Facet Information