The repository includes a data_dictionary.json file that documents all ArchivesSpace record types, their fields, and field metadata derived from the JSONModel schema definitions in common/schemas/.
The file has three top-level keys:

- `metadata`: provenance, timestamps, and documentation for the file itself
- `enumerations`: allowed values for dynamic enumerations referenced by fields
- `record_types`: a map of record type name → record type object

Each record type object contains:

- `description`: what the record type represents
- `ui_display`: which interfaces render this type (`staff`, `public`, or `abstract`)
- `fields`: array of field objects
- `subrecords`: array of embedded sub-record definitions (optional)
- `relationships`: array of relationship definitions (optional)

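
As a rough illustration of this layout, the following Ruby sketch walks the `record_types` map and builds one summary line per record type. The inline sample document, including the `resource` entry and its values, is invented for the example; in practice you would parse the real file instead, for instance with `JSON.parse(File.read("data_dictionary.json"))`.

```ruby
require "json"

# Hypothetical, heavily trimmed stand-in for data_dictionary.json.
SAMPLE = <<~JSON
  {
    "metadata": {},
    "enumerations": {},
    "record_types": {
      "resource": {
        "description": "A sample record type.",
        "ui_display": ["staff", "public"],
        "fields": [
          { "name": "title", "type": "string" },
          { "name": "id_0", "type": "string" }
        ]
      }
    }
  }
JSON

# Returns one summary string per record type, sorted by type name.
def summarize_record_types(dictionary)
  dictionary.fetch("record_types").sort_by { |name, _| name }.map do |name, record_type|
    interfaces = Array(record_type["ui_display"]).join(", ")
    field_count = record_type.fetch("fields", []).length
    # subrecords is optional, so fall back to an empty array.
    subrecord_count = record_type.fetch("subrecords", []).length
    "#{name} [#{interfaces}]: #{field_count} fields, #{subrecord_count} subrecords"
  end
end

SUMMARY = summarize_record_types(JSON.parse(SAMPLE))
puts SUMMARY
```
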
Each field object uses these canonical properties:

| Key | Description |
|---|---|
| `name` | Field name as used in the API/schema |
| `type` | Data type (`string`, `integer`, `boolean`, `object`, `array`, `datetime`, etc.) |
| `description` | What the field stores |
| `staff_label` | Label shown in the Staff Interface (`null` if not displayed) |
| `public_label` | Label shown in the Public Interface (`null` if not displayed) |
| `required` | `true` if the system will reject records missing this field |
| `readonly` | `true` if the field is system-generated and cannot be set by clients |
| `example` | An example value (optional) |
| `default` | Default value applied when the field is omitted (optional) |
| `enum` | Array of allowed literal values for fields with a fixed controlled vocabulary (optional) |
| `dynamic_enum` | Name of the enumeration list whose allowed values are managed at runtime via the Manage Controlled Value Lists page (optional) |
| `conditions` | Conditional validation rules, e.g. making a field required only when another field has a specific value (optional) |
| `pattern` | Regex or human-readable pattern the value must match (optional) |
| `max_length` | Maximum character length for string fields (optional) |
| `min_items` | Minimum number of items required for array fields (optional) |
| `items` | For array-typed fields, the record type name of each array element (optional) |
| `refs` | For relationship/link fields, the list of record types this field may reference (optional) |
| `required_permission` | Permission code a user must hold to read or set this field (optional) |
| `export_mappings` | How this field maps to export formats (`ead`, `marc`, `dc`); `null` means not exported in that format (optional) |
| `solr_field` | Solr field name this value is indexed into (optional) |
| `solr_index` | Indexing strategy: `indexed` (exact match), `full_text` (tokenized search), etc. (optional) |
| `solr_note` | Human-readable note describing the Solr indexing behavior (optional) |
| `version_note` | ArchivesSpace version when this field was added or changed, with a short note (optional) |

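
To show how a consumer of the dictionary might interpret a few of these properties, here is a minimal Ruby sketch that checks a record hash against `required`, `readonly`, `max_length`, and `enum`. It is not how ArchivesSpace itself validates records. The field definitions and the `check_record` helper are hypothetical, and `pattern` and `conditions` are skipped because their exact format is not specified here.

```ruby
# Field objects shaped like entries in a record type's "fields" array.
# The field names and limits are hypothetical.
FIELDS = [
  { "name" => "title", "type" => "string", "required" => true, "max_length" => 20 },
  { "name" => "level", "type" => "string", "required" => true,
    "enum" => %w[collection series file] },
  { "name" => "created_by", "type" => "string", "required" => false, "readonly" => true }
].freeze

# Returns a list of human-readable problems; an empty list means none were found.
def check_record(record, fields)
  fields.flat_map do |field|
    name = field["name"]
    value = record[name]
    problems = []
    if value.nil?
      problems << "#{name}: missing required field" if field["required"]
      next problems
    end
    problems << "#{name}: readonly, cannot be set by clients" if field["readonly"]
    if field["max_length"] && value.is_a?(String) && value.length > field["max_length"]
      problems << "#{name}: longer than #{field['max_length']} characters"
    end
    if field["enum"] && !field["enum"].include?(value)
      problems << "#{name}: #{value.inspect} is not an allowed value"
    end
    problems
  end
end

OK_RECORD = { "title" => "Smith family papers", "level" => "collection" }.freeze
BAD_RECORD = { "title" => "x" * 25, "level" => "box", "created_by" => "admin" }.freeze

puts check_record(OK_RECORD, FIELDS).inspect
puts check_record(BAD_RECORD, FIELDS)
```
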
To run the scripts, first copy `generate_data_dictionary.rb` and `update_data_dictionary.rb` into the `archivesspace/scripts` directory.
`scripts/generate_data_dictionary.rb` reads the ArchivesSpace source files directly and rebuilds `data_dictionary.json` from scratch:

    # Preview: prints a summary without writing anything
    ruby --disable-gems scripts/generate_data_dictionary.rb --dry-run

    # Generate and write to data_dictionary.json (prompts before overwriting)
    ruby --disable-gems scripts/generate_data_dictionary.rb

    # Write to a different path
    ruby --disable-gems scripts/generate_data_dictionary.rb --output path/to/output.json

    # Overwrite without prompting
    ruby --disable-gems scripts/generate_data_dictionary.rb --force

    # Use a different demo database fixture for examples
    ruby --disable-gems scripts/generate_data_dictionary.rb --demo-db path/to/demo.sql.gz

    # Skip example extraction entirely
    ruby --disable-gems scripts/generate_data_dictionary.rb --no-examples

The script reads from:

| Source | What it provides |
|---|---|
| `common/schemas/*.rb` | Field names, types, constraints (`maxLength`, `pattern`), `required`/`readonly` flags, `dynamic_enum`, `default`, `items`, `refs` |
| `common/locales/en.yml` | Staff interface field labels (shared) |
| `frontend/config/locales/en.yml` | Staff interface field labels (frontend-specific) |
| `public/config/locales/en.yml` | Public interface field labels |
| `common/db/migrations/*.rb` | Enumeration value lists |
| `build/mysql_db_fixtures/demo.sql.gz` | Example values for fields (populated automatically where available) |

The script automatically populates example values by reading build/mysql_db_fixtures/demo.sql.gz. For each record type whose name matches a database table, the first non-null value found in the demo data is used as the example for each matching field. Fields that store opaque foreign-key IDs (*_id columns) are excluded. Pass --no-examples to skip this step, or --demo-db PATH to use a different fixture file.
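
The selection rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: it assumes the fixture has already been parsed into an array of row hashes for one table, and the `pick_examples` helper and the sample rows are invented.

```ruby
# Rows as they might look after parsing one table from the demo fixture.
# The contents are invented for illustration.
ROWS = [
  { "title" => nil, "repo_id" => 2, "publish" => nil },
  { "title" => "Sample papers", "repo_id" => 2, "publish" => 1 },
  { "title" => "Other papers", "repo_id" => 3, "publish" => 0 }
].freeze

# For each requested field, pick the first non-null value in row order,
# skipping opaque foreign-key columns (names ending in "_id").
# Fields with no matching column or no non-null value get no example.
def pick_examples(rows, field_names)
  field_names.each_with_object({}) do |field, examples|
    next if field.end_with?("_id")

    row = rows.find { |r| !r[field].nil? }
    examples[field] = row[field] if row
  end
end

EXAMPLES = pick_examples(ROWS, %w[title repo_id publish extent])
puts EXAMPLES.inspect
```

Here `repo_id` is skipped because it is a foreign key, and `extent` receives no example because no column supplies a value for it.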
The following fields still require manual enrichment and are written as null when the demo database cannot supply them: description, example (for fields without a matching DB column), export_mappings, solr_field, solr_index, solr_note, conditions, required_permission, version_note.
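
One way to find out what still needs manual enrichment after a regeneration is to scan for those properties with a null value. The sketch below assumes the generator emits the property keys with `null` values, as the paragraph above suggests. The `enrichment_gaps` helper and the sample data are hypothetical.

```ruby
# Properties the generator may leave as null (taken from the list above).
ENRICHABLE = %w[description example export_mappings solr_field solr_index
                solr_note conditions required_permission version_note].freeze

# Hypothetical slice of a freshly generated dictionary.
GENERATED = {
  "record_types" => {
    "resource" => {
      "fields" => [
        { "name" => "title", "description" => nil, "example" => "Sample papers" },
        { "name" => "ead_id", "description" => "EAD identifier.", "example" => "abc" }
      ]
    }
  }
}.freeze

# Maps "record_type.field" to the enrichable properties that are present but null.
def enrichment_gaps(dictionary, keys = ENRICHABLE)
  dictionary.fetch("record_types").each_with_object({}) do |(type_name, record_type), gaps|
    record_type.fetch("fields", []).each do |field|
      missing = keys.select { |key| field.key?(key) && field[key].nil? }
      gaps["#{type_name}.#{field['name']}"] = missing unless missing.empty?
    end
  end
end

GAPS = enrichment_gaps(GENERATED)
puts GAPS.inspect
```
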
After regenerating, use scripts/update_data_dictionary.rb to merge in a file containing your manual enrichments (see below).
To add or update record types, create a new JSON file with the same structure as `data_dictionary.json` containing only the record types you want to add or change:

    {
      "record_types": {
        "my_record_type": {
          "description": "Description of this record type.",
          "ui_display": ["staff"],
          "fields": [
            {
              "name": "title",
              "type": "string",
              "description": "The title of the record.",
              "staff_label": "Title",
              "public_label": "Title",
              "required": true,
              "readonly": false,
              "example": "My Collection"
            }
          ]
        }
      }
    }

You do not need to include a `metadata` key; the update script manages timestamps automatically.

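
Before merging, it can help to sanity-check such a file. The sketch below flags field objects that lack any of the seven properties the field table does not mark as optional. Treating those seven as mandatory in an enrichment file is an assumption made for this example, and the `missing_core_keys` helper is hypothetical; the update script may accept partial field objects.

```ruby
require "json"

# Properties the field table does not mark as optional.
CORE_KEYS = %w[name type description staff_label public_label required readonly].freeze

# Returns messages for field objects that lack one of the core keys.
# A key whose value is null still counts as present.
def missing_core_keys(enrichment)
  enrichment.fetch("record_types", {}).flat_map do |type_name, record_type|
    record_type.fetch("fields", []).each_with_index.flat_map do |field, index|
      label = field["name"] || "fields[#{index}]"
      (CORE_KEYS - field.keys).map { |key| "#{type_name}.#{label}: missing #{key}" }
    end
  end
end

# Hypothetical enrichment file with one incomplete field object.
ENRICHMENT = JSON.parse(<<~JSON)
  {
    "record_types": {
      "my_record_type": {
        "description": "Description of this record type.",
        "ui_display": ["staff"],
        "fields": [
          { "name": "title", "type": "string", "description": "The title.",
            "staff_label": "Title", "public_label": null,
            "required": true, "readonly": false },
          { "name": "notes", "type": "string" }
        ]
      }
    }
  }
JSON

PROBLEMS = missing_core_keys(ENRICHMENT)
puts PROBLEMS
```
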
Use `scripts/update_data_dictionary.rb` to merge your new file into the canonical dictionary:

    # Preview changes without writing anything
    ruby --disable-gems scripts/update_data_dictionary.rb my_new_entries.json --dry-run

    # Apply changes (updates data_dictionary.json in place)
    ruby --disable-gems scripts/update_data_dictionary.rb my_new_entries.json

    # Write to a different output file instead
    ruby --disable-gems scripts/update_data_dictionary.rb my_new_entries.json --output path/to/output.json

The script prints a color-coded changelog showing every added, removed, and modified field, then updates `data_dictionary.json` and appends an entry to its `changelog` array. Existing record types not present in your new file are preserved unchanged.
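
The merge and changelog behavior can be approximated with the following sketch. It assumes that a record type in the new file replaces the same-named entry as a whole and that fields are matched by `name`; the real script may merge at a finer grain. The helper names `diff_fields` and `merge_record_types` and the sample data are invented for illustration.

```ruby
# Classifies the fields of one record type as added, removed, or modified
# by comparing field objects that share the same "name".
def diff_fields(old_fields, new_fields)
  old_by_name = old_fields.to_h { |f| [f["name"], f] }
  new_by_name = new_fields.to_h { |f| [f["name"], f] }
  shared = old_by_name.keys & new_by_name.keys
  {
    added: new_by_name.keys - old_by_name.keys,
    removed: old_by_name.keys - new_by_name.keys,
    modified: shared.reject { |name| old_by_name[name] == new_by_name[name] }
  }
end

# Record types in `updates` replace same-named entries (an assumption);
# record types absent from `updates` are kept unchanged.
def merge_record_types(current, updates)
  merged_types = current.fetch("record_types", {}).merge(updates.fetch("record_types", {}))
  current.merge("record_types" => merged_types)
end

CANONICAL = {
  "record_types" => {
    "resource" => { "fields" => [{ "name" => "title", "required" => false },
                                 { "name" => "old_note" }] },
    "accession" => { "fields" => [{ "name" => "title" }] }
  }
}.freeze

UPDATES = {
  "record_types" => {
    "resource" => { "fields" => [{ "name" => "title", "required" => true },
                                 { "name" => "ead_id" }] }
  }
}.freeze

CHANGES = diff_fields(CANONICAL["record_types"]["resource"]["fields"],
                      UPDATES["record_types"]["resource"]["fields"])
MERGED = merge_record_types(CANONICAL, UPDATES)

puts CHANGES.inspect
puts MERGED["record_types"].keys.inspect
```
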