Glossary#

Captain#

The node within a partition that is responsible for coordinating replication and ensuring consistency for the data owned by that partition.

Cluster#

A group of Concourse nodes across which data is partitioned and replicated, and which together form a single logical database.

Cohort#

A set of nodes that each store a copy of the same data. Cohorts are formed by the replication strategy to ensure fault tolerance.

Coordinator#

The node to which a client connects to perform an operation. The coordinator is responsible for routing the request to the appropriate node(s) within the cluster and returning the result to the client.

In Concourse, coordinators are chosen per operation and any node may serve as a coordinator.

Ensemble Protocol#

The consensus protocol used by Concourse to coordinate operations across nodes in a cluster. The ensemble protocol ensures that distributed transactions maintain strong consistency and serializable isolation.

Gossip Protocol#

A peer-to-peer communication protocol used by nodes in a cluster to discover other nodes, share state information, and detect failures. Each node periodically exchanges information with a random subset of other nodes, causing state to propagate through the cluster.
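
As a rough illustration of how periodic exchanges with random peers spread state through a cluster, the Python sketch below simulates gossip rounds. It is a toy model, not Concourse's implementation: the function name `gossip_rounds`, the `fanout` of 2, and the fixed random seed are all assumptions made for the example.

```python
import random


def gossip_rounds(node_count, fanout=2, seed=0):
    """Count the rounds until every node has heard an update (toy model)."""
    rng = random.Random(seed)
    informed = {0}  # node 0 is the only node that starts with the new state
    rounds = 0
    while len(informed) < node_count:
        newly_informed = set()
        for node in informed:
            # Each informed node contacts a random subset of its peers.
            peers = [n for n in range(node_count) if n != node]
            newly_informed.update(rng.sample(peers, min(fanout, len(peers))))
        informed |= newly_informed
        rounds += 1
    return rounds
```

Because each informed node can reach at most `fanout` new peers per round, the informed set can at most triple per round with the default fanout, which is why the number of rounds tends to grow slowly relative to cluster size.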

Key#

A string that names a field within a record. A key maps to one or more values. Keys are analogous to column names in a relational database, but unlike columns, keys are not predefined and can vary between records.

Leader#

In the context of a partition, the node responsible for coordinating writes and ensuring that all replicas receive the same data. Concourse uses a peer-to-peer model where leadership is distributed across partitions rather than centralized in a single node.

Node#

A Concourse instance that is a member of a distributed cluster.

Optimistic Availability#

A property of a distributed system that tolerates arbitrary node failures while keeping an operation available as long as the coordinator and at least one relevant process agree on the state of the system. With sufficient partitioning and replication, the system is expected to remain available despite failures or latency.

Partition#

A subset of nodes in a distributed cluster that each contain data for certain token ranges. Data is partitioned within Concourse to distribute load across the cluster.

Record#

A schemaless group of fields mapping keys to values. A single record should map to a single person, place, or thing in the real world. Each record has a unique 64-bit identifier.

Replica#

A node that stores a copy of a piece of data. Multiple replicas of the same data exist across different nodes to ensure fault tolerance. The number of replicas is controlled by the replication_factor configuration.

Schemaless#

A Concourse feature that allows users to store data without first specifying the data format or data types with the database, and allows records within the database to contain different formats and data types. Keys and values can be added to any record at any time without migration.
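
To make the idea concrete, here is a minimal Python sketch of schemaless records, in which each record holds its own set of keys and a key may map to several values of different types. The `Record` class and its `add` and `get` methods are hypothetical names chosen for the illustration and do not represent the Concourse client API.

```python
from collections import defaultdict


class Record:
    """Toy schemaless record: keys are created on first use, with no schema."""

    def __init__(self, record_id):
        self.id = record_id
        self._fields = defaultdict(list)  # key -> list of values

    def add(self, key, value):
        # No declaration or migration step: any key accepts any value type.
        self._fields[key].append(value)

    def get(self, key):
        # A key that was never written simply has no values.
        return list(self._fields.get(key, []))


person = Record(1)
person.add("name", "Ada")
person.add("name", "Augusta")  # one key, several values
person.add("age", 36)          # a different value type in the same record

place = Record(2)
place.add("city", "London")    # a different set of keys than record 1
```

Here `person` and `place` live side by side even though they share no keys, which is the property the definition describes.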

Strong Consistency#

A property of distributed systems where all processes observe state changes in the same order. In the terms of the CAP theorem, a distributed database is strongly consistent if every read returns the most recent write, or returns an error when the most recent write cannot be determined because of network failure or latency.

Three-Phase Commit#

A distributed commit protocol that extends the two-phase commit protocol with an additional pre-commit phase between the voting and commit phases. This extra phase ensures that the system can recover from coordinator failure without blocking. Concourse uses a variant of this protocol for distributed transactions.
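
The phase ordering can be sketched in a few lines of Python. This follows the textbook form of the protocol, where the extra phase is usually called pre-commit, and it omits timeouts and failure recovery entirely. It is not Concourse's variant, and the `Participant` class and `three_phase_commit` function are hypothetical names.

```python
class Participant:
    """Toy participant that records which protocol state it has reached."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "init"

    def can_commit(self):
        self.state = "ready" if self.will_vote_yes else "aborted"
        return self.will_vote_yes

    def pre_commit(self):
        self.state = "pre-committed"

    def do_commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def three_phase_commit(participants):
    """Run the three phases in order; return True if the transaction commits."""
    # Phase 1: ask every participant whether it is able to commit.
    votes = [p.can_commit() for p in participants]
    if not all(votes):
        for p in participants:
            p.abort()
        return False
    # Phase 2: tell everyone the vote was unanimous, before anything commits.
    for p in participants:
        p.pre_commit()
    # Phase 3: finalize the commit on every participant.
    for p in participants:
        p.do_commit()
    return True
```

The point of the middle phase is that no participant commits until all of them know the vote was unanimous, so the surviving nodes can work out the outcome if the coordinator fails partway through.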

Token#

A hash value derived from a record’s identifier that determines which partition owns the record’s data. Tokens are distributed across a consistent hash ring to ensure even data distribution across cluster nodes.
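
A minimal Python sketch of a token lookup on a consistent hash ring follows. The hash function (SHA-256 truncated to 64 bits), the `token_for` helper, and the `HashRing` class are assumptions made for illustration; the hash Concourse actually uses is not described here.

```python
import bisect
import hashlib


def token_for(identifier):
    """Derive a 64-bit token from an identifier (hash choice is an assumption)."""
    digest = hashlib.sha256(str(identifier).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy ring that places each partition at a single position."""

    def __init__(self, partitions):
        # Sort partitions by the token of their name to form the ring.
        self._ring = sorted((token_for(name), name) for name in partitions)
        self._tokens = [token for token, _ in self._ring]

    def owner(self, record_id):
        # The owner is the first partition at or after the record's token,
        # wrapping around to the start of the ring when needed.
        index = bisect.bisect_left(self._tokens, token_for(record_id))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["partition-a", "partition-b", "partition-c"])
owner = ring.owner(42)
```

The same record identifier always resolves to the same partition. This simplified ring gives each partition only one position, so it does not by itself guarantee an even spread; real systems typically assign many tokens or token ranges per node to smooth out the distribution.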

Value#

A dynamically typed quantity stored for a key in a record. Concourse supports several value types including Boolean, Double, Float, Integer, Long, String, Tag, Link, and Timestamp. See Data Types for details.