Search#
Concourse automatically indexes every String value for full-text
search. This means you can search for records containing specific
words or phrases without any additional configuration or index
management.
Full-Text Search#
The search method performs a full-text search against a specific
key and returns the IDs of all records that contain a matching
value.
The search query is matched against the indexed text of every
String value stored for the given key. Matching is
case-insensitive and supports partial word matches (substring
matching).
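To make these matching rules concrete, here is a minimal in-memory model in Python. It is a sketch, not the Concourse driver API: the `search` function, the `records` dictionary, and its layout are hypothetical names used only to illustrate case-insensitive substring matching against a single key.

```python
def search(records, key, query):
    """Return the IDs of records whose string values for `key` contain `query`.

    `records` maps a record ID to a dict of key -> list of values
    (an assumed layout for this sketch only).
    """
    needle = query.lower()
    matches = set()
    for record_id, data in records.items():
        for value in data.get(key, []):
            # Only string values take part in full-text search.
            if isinstance(value, str) and needle in value.lower():
                matches.add(record_id)
    return matches


records = {
    1: {"name": ["Jeff Nelson"], "age": [30]},
    2: {"name": ["Ashleah Gilmore"]},
    3: {"nickname": ["jeff"]},
}

# Record 3 stores "jeff" under a different key, so it is not returned.
print(search(records, "name", "JEFF"))
```

The query is scoped to one key, so a matching value stored under another key does not count.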
How Indexing Works#
When a String value is written to Concourse, the storage engine
automatically breaks it into substrings and indexes each one. This
enables efficient substring searches without requiring the query to
match the entire value.
For example, writing the value "Jeff Nelson" generates index
entries for substrings like "jeff", "jef", "je", "nelson",
"nel", etc. A search for "eff" or "nels" would match this
value.
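The idea can be sketched in a few lines of Python. This is a simplified illustration, assuming the value is lowercased, split on whitespace, and every substring of each word is indexed; the storage engine's actual tokenization and index layout are not described here and may differ. The names `substrings` and `index_value` are hypothetical.

```python
from collections import defaultdict


def substrings(token):
    """Yield every non-empty substring of `token`."""
    for start in range(len(token)):
        for end in range(start + 1, len(token) + 1):
            yield token[start:end]


def index_value(index, record_id, value):
    """Add every substring of every whitespace-separated word to `index`."""
    for token in value.lower().split():
        for sub in substrings(token):
            index[sub].add(record_id)


index = defaultdict(set)
index_value(index, 1, "Jeff Nelson")

print(sorted(index["eff"]))   # record 1 is reachable through "eff"
print(sorted(index["nels"]))  # and through "nels"
print("jeff n" in index)      # this simplified index does not span words
```

A lookup is then a single dictionary access, which is why a query does not need to match the entire value. The trade-off is index size: a word of length n contributes up to n(n+1)/2 substrings.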
Tags Are Not Indexed#
Values stored as Tags are not indexed for full-text search. Use Tags when you want to store string data without the overhead of search indexing (e.g., identifiers, codes, or structured data that will only be queried with exact-match operators).
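The distinction can be modeled as a type check at write time. The sketch below is an assumption-laden illustration, not the Concourse API: the `Tag` class here is a hypothetical stand-in, and `write`, `store`, and `search_index` are invented for the example.

```python
from collections import defaultdict


class Tag:
    """Hypothetical stand-in for a Tag: string data that skips search indexing."""

    def __init__(self, text):
        self.text = text


def write(store, search_index, record_id, key, value):
    """Store `value`; create search index entries only for plain strings."""
    store[(record_id, key)] = value
    if isinstance(value, str):
        for word in value.lower().split():
            for start in range(len(word)):
                for end in range(start + 1, len(word) + 1):
                    search_index[(key, word[start:end])].add(record_id)


store, search_index = {}, defaultdict(set)
write(store, search_index, 1, "sku", "ABC-123")
write(store, search_index, 2, "sku", Tag("ABC-456"))

print(search_index.get(("sku", "abc"), set()))  # only record 1 was indexed
print(store[(2, "sku")].text)                   # the Tag value is still stored
```

The Tag value remains stored and readable; it simply never produces search index entries.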
Search in Queries#
You can embed full-text search directly within
CCL queries using the CONTAINS and NOT_CONTAINS
operators. This allows you to combine search with other conditions
in a single query.
CONTAINS#
The CONTAINS operator finds records where a key’s value matches
a full-text search query.
NOT_CONTAINS#
The NOT_CONTAINS operator finds records where a key’s value does
not match a full-text search query.
Combining Search with Other Conditions#
Because CONTAINS and NOT_CONTAINS are standard query operators,
you can combine them with any other operators in a CCL expression.
This eliminates the need to perform a separate search call and
then intersect the results with a find call.
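The equivalence between a combined query and a manual intersection can be shown with a small in-memory model. This is a sketch under stated assumptions: `contains`, `greater_than`, and `find` are hypothetical helpers that model query operators as predicates, not CCL syntax or the Concourse driver API.

```python
records = {
    1: {"name": "Jeff Nelson", "age": 32},
    2: {"name": "Jeffery Smith", "age": 19},
    3: {"name": "Ashleah Gilmore", "age": 40},
}


def contains(key, query):
    """Predicate modeling CONTAINS: case-insensitive substring match."""
    def predicate(rec):
        value = rec.get(key)
        return isinstance(value, str) and query.lower() in value.lower()
    return predicate


def greater_than(key, threshold):
    """Predicate modeling a numeric comparison operator."""
    return lambda rec: key in rec and rec[key] > threshold


def find(*predicates):
    """Return the IDs of records that satisfy every predicate (logical AND)."""
    return {rid for rid, rec in records.items() if all(p(rec) for p in predicates)}


# One combined query ...
combined = find(contains("name", "jeff"), greater_than("age", 21))

# ... gives the same result as two separate calls intersected by hand.
two_step = find(contains("name", "jeff")) & find(greater_than("age", 21))

print(combined, two_step)
```

Both approaches select the same records; the combined form just expresses it in one query.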
Compiled Search Queries#
Concourse internally compiles search queries for optimal performance. When a search query consists of a single token, Concourse uses an optimized algorithm (Boyer-Moore) for direct string matching rather than the general substring index. This compilation happens transparently and requires no configuration.
For repeated searches with the same query pattern, the compiled form is cached, further reducing overhead.
Search Configuration#
Maximum Substring Length#
The max_search_substring_length configuration option controls the
maximum length of substrings that are indexed for full-text search.
The default is 40 characters.
Reducing this value decreases the storage overhead of search indexes but limits the length of search queries that can match. Increasing it allows longer search queries at the cost of larger indexes.
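The trade-off can be seen in a small Python sketch. It assumes the cap simply bounds the length of each generated substring; `capped_substrings` is a hypothetical helper, and the values 5 and 40 are chosen only for illustration.

```python
def capped_substrings(word, max_length):
    """Return every substring of `word` that is at most `max_length` characters."""
    return {
        word[start:end]
        for start in range(len(word))
        for end in range(start + 1, min(start + max_length, len(word)) + 1)
    }


word = "internationalization"  # 20 characters

small = capped_substrings(word, 5)
large = capped_substrings(word, 40)

# A lower cap produces fewer index entries ...
print(len(small) < len(large))

# ... but a 6-character query such as "nation" is no longer indexed.
print("nation" in small, "nation" in large)
```

With the lower cap the index is smaller, and any query longer than the cap has no entry to match.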
Stopwords#
Concourse indexes all words, including common stopwords (e.g., “the”, “is”, “and”). This ensures that searches for phrases containing stopwords return accurate results.
Stopword Policy Change
Prior to version 0.12, Concourse excluded common English stopwords from the search index. Starting in version 0.12, all words are indexed to ensure search accuracy. Searches involving stopwords therefore work correctly for newly written data, but values written before the upgrade keep the index entries they were created with, so the old stopword-excluding behavior still applies to those values.
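The effect of the two policies on phrase searches can be sketched as follows. This is a simplified word-level model, not Concourse's indexing code: the `STOPWORDS` set is an illustrative list, the helper names are hypothetical, and the model filters only the indexed value (how older versions treated stopwords in the query itself is not covered here).

```python
STOPWORDS = {"the", "is", "and", "of", "a"}  # illustrative list only


def indexed_words(value, skip_stopwords):
    """Return the lowercase words that would be indexed for `value`."""
    words = value.lower().split()
    if skip_stopwords:
        words = [w for w in words if w not in STOPWORDS]
    return words


def phrase_matches(query, value, skip_stopwords):
    """True if the query's words appear consecutively in the indexed words."""
    words = indexed_words(value, skip_stopwords)
    wanted = query.lower().split()
    return any(
        words[i:i + len(wanted)] == wanted
        for i in range(len(words) - len(wanted) + 1)
    )


value = "The Lord of the Rings"

# Old policy: stopwords are dropped, so a query containing them cannot match.
print(phrase_matches("lord of the rings", value, skip_stopwords=True))

# Current policy: every word is indexed, so the same query matches.
print(phrase_matches("lord of the rings", value, skip_stopwords=False))
```

In this model, a value indexed under the old policy has no entries for its stopwords, so a phrase query containing them finds nothing.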