Importing Data#

Concourse supports importing data from JSON and CSV formats, both programmatically through the API and via command-line tools.

JSON Import#

The insert method is the primary way to import structured data into Concourse. It accepts JSON strings containing objects or arrays of objects.

Single Object#

// Java
Set<Long> records = concourse.insert(
    "{\"name\": \"Jeff Nelson\", "
    + "\"company\": \"Cinchapi\", "
    + "\"age\": 30}");
// CaSH
insert({"name": "Jeff Nelson", "company": "Cinchapi", "age": 30})

Array of Objects#

Pass a JSON array to insert multiple records at once:

// Java
Set<Long> records = concourse.insert(
    "[{\"name\": \"Alice\", \"dept\": \"Engineering\"}, "
    + "{\"name\": \"Bob\", \"dept\": \"Design\"}]");

JSON Structure#

JSON objects in the import data must follow these rules:

  • Each key maps to a JSON primitive or an array of JSON primitives
  • Nested objects are not supported
  • Arrays of primitives create multi-valued fields
{
  "name": "Jeff Nelson",
  "age": 30,
  "active": true,
  "tags": ["founder", "engineer"]
}

The tags key in this example would contain two values: "founder" and "engineer".

JSON import data can include resolvable link instructions, which automatically create links to every record matching the specified criteria. Use the @criteria@ syntax:

{
  "name": "Engineering",
  "members": "@department = Engineering@"
}

See Writing Data for details.

CSV Import#

Concourse provides a CLI tool for importing CSV files.

Basic Usage#

concourse import /path/to/data.csv

The importer reads the CSV file, uses the first row as key names, and inserts each subsequent row as a new record.
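Conceptually, the header-to-key mapping works as sketched below. This is an illustration of the behavior just described, not the importer's actual code; the real CLI also handles quoting, type inference, and batching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CsvRowMapper {

    // Map each data row to a record, using the header row for key names.
    static List<Map<String, String>> toRecords(List<String> csvLines) {
        String[] keys = csvLines.get(0).split(",");
        List<Map<String, String>> records = new ArrayList<>();
        for (int i = 1; i < csvLines.size(); i++) {
            String[] values = csvLines.get(i).split(",");
            Map<String, String> record = new LinkedHashMap<>();
            for (int j = 0; j < keys.length; j++) {
                record.put(keys[j].trim(), values[j].trim());
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        List<Map<String, String>> records = toRecords(Arrays.asList(
                "name,dept",
                "Alice,Engineering",
                "Bob,Design"));
        System.out.println(records);
    }
}
```

Each resulting map corresponds to one new record inserted by the importer.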

Data Source Annotation#

Use the --annotate-data-source flag to automatically add a field to each imported record indicating the source file:

concourse import --annotate-data-source /path/to/data.csv

Import Options#

The CSV importer supports:

  • Header detection: The first row is automatically treated as the header containing key names
  • Type inference: Values are automatically typed (numbers, booleans, strings)
  • Batch processing: Records are inserted in batches for performance
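The kind of type inference described above can be sketched as follows. This is a hedged approximation, not the importer's actual rules (which may also recognize other types, such as links):

```java
public class TypeInference {

    // Try progressively narrower interpretations of a raw CSV value:
    // boolean, then integer, then floating point, falling back to string.
    static Object infer(String raw) {
        String s = raw.trim();
        if (s.equalsIgnoreCase("true") || s.equalsIgnoreCase("false")) {
            return Boolean.parseBoolean(s);
        }
        try {
            return Long.parseLong(s);
        }
        catch (NumberFormatException ignore) {}
        try {
            return Double.parseDouble(s);
        }
        catch (NumberFormatException ignore) {}
        return s;
    }

    public static void main(String[] args) {
        System.out.println(infer("30"));   // stored as a number
        System.out.println(infer("true")); // stored as a boolean
        System.out.println(infer("Jeff")); // stored as a string
    }
}
```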

Map and Multimap Import#

The Java API supports importing data from Map and Multimap objects:

Map Import#

// Java
Map<String, Object> data = Maps.newLinkedHashMap();
data.put("name", "Jeff Nelson");
data.put("age", 30);
long record = concourse.insert(data);

Multimap Import#

Use a Multimap when keys should have multiple values:

// Java
Multimap<String, Object> data =
    LinkedHashMultimap.create();
data.put("name", "Jeff Nelson");
data.put("tag", "founder");
data.put("tag", "engineer");
long record = concourse.insert(data);

Batch Import#

Import multiple maps at once:

// Java
List<Multimap<String, Object>> batch =
    Lists.newArrayList();
batch.add(createRecord("Alice"));
batch.add(createRecord("Bob"));
Set<Long> records = concourse.insert(batch);
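The createRecord helper in the snippet above is not defined by the example; one plausible definition (hypothetical, using the same Guava LinkedHashMultimap as the earlier snippets) might look like:

```java
import com.google.common.collect.LinkedHashMultimap;
import com.google.common.collect.Multimap;

public class BatchHelpers {

    // Hypothetical helper assumed by the batch example above.
    static Multimap<String, Object> createRecord(String name) {
        Multimap<String, Object> record = LinkedHashMultimap.create();
        record.put("name", name);
        return record;
    }
}
```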

Import into Existing Records#

All insert methods have variants that write into specific existing records. For a given record, the insert fails if any of the key/value associations being inserted already exist in that record.

// Java
boolean success = concourse.insert(
    "{\"department\": \"Engineering\"}", 1);

Map<Long, Boolean> results = concourse.insert(
    "{\"department\": \"Engineering\"}",
    Lists.newArrayList(1L, 2L));

Resolution#

When importing data, Concourse resolves each key/value pair by attempting to add it to the target record. For a brand-new record there is nothing to conflict with, so the import always succeeds. When inserting into an existing record, a key/value pair that is already present causes the insert to fail for that record. As a result, imports into new records are always safe, while imports into existing records enforce the uniqueness of every key/value association.
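The resolution rule can be sketched as a toy model in plain Java. This illustrates the semantics described above, not Concourse's actual implementation:

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class ResolutionModel {

    // A record is modeled as a map from keys to sets of values. The insert
    // fails outright if ANY key/value pair is already present; otherwise
    // every pair is added.
    static boolean tryInsert(Map<String, Set<Object>> record,
            Map<String, Object> data) {
        for (Map.Entry<String, Object> e : data.entrySet()) {
            Set<Object> values = record.get(e.getKey());
            if (values != null && values.contains(e.getValue())) {
                return false; // duplicate pair: the insert fails for this record
            }
        }
        for (Map.Entry<String, Object> e : data.entrySet()) {
            record.computeIfAbsent(e.getKey(), k -> new LinkedHashSet<>())
                    .add(e.getValue());
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Set<Object>> record = new HashMap<>();
        Map<String, Object> data = new HashMap<>();
        data.put("department", "Engineering");
        System.out.println(tryInsert(record, data)); // first insert succeeds
        System.out.println(tryInsert(record, data)); // duplicate pair: fails
    }
}
```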