Facets & Aggregations
Analyze search results through aggregations to understand data distribution and patterns.
See significant terms aggregations as well.
Basic usage
# Single facet
openaleph-search search query-string "corruption" --args "facet=dataset"
# Multiple facets
openaleph-search search query-string "investigation" \
--args "facet=dataset&facet=schema&facet=countries"
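When these calls are scripted, the repeated facet keys can be assembled programmatically. The following is a minimal Python sketch, not part of openaleph-search: the helper names build_args and build_command are hypothetical, and it assumes the --args value is an ordinary URL-style query string, as the commands above suggest. It only builds the command and does not run it.

```python
import shlex
from urllib.parse import urlencode


def build_args(pairs):
    """Encode (key, value) pairs as a query string, keeping ':' and '*' readable."""
    return urlencode(pairs, safe=":*")


def build_command(query, pairs):
    """Return the argv list for one search call, mirroring the commands above."""
    return ["openaleph-search", "search", "query-string", query,
            "--args", build_args(pairs)]


# Repeating the "facet" key requests several facets in one call.
facets = [("facet", "dataset"), ("facet", "schema"), ("facet", "countries")]
command = build_command("investigation", facets)

# shlex.join quotes the argument string so it can be pasted into a shell.
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run as is, which avoids shell quoting problems with the & characters.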
Types
Terms aggregations
Count documents per distinct value in keyword fields.
Cardinality aggregations
Get the total number of distinct values in a field.
openaleph-search search query-string "investigation" \
--args "facet=countries&facet_total:countries=true"
Date histogram aggregations
Group results by time intervals.
openaleph-search search query-string "transaction" \
--args "facet=created_at&facet_interval:created_at=month"
Parameters
facet
Field name to facet on.
facet_size:FIELD
Number of values to return (default: 20).
facet_total:FIELD
Include total distinct count.
facet_values:FIELD
Return actual values (default: true).
# Only counts, no values
--args "facet=entities&facet_values:entities=false&facet_total:entities=true"
facet_interval:FIELD
Time interval for date fields.
Intervals: year, quarter, month, week, day, hour, minute
facet_type:FIELD
Aggregation type for special fields.
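The per-field options all follow the same OPTION:FIELD=VALUE pattern, so they can be generated from one place. This is a sketch under assumptions: facet_params is a hypothetical helper, booleans are rendered as the lowercase true/false strings used in the examples above, and facet_type is left out because its accepted values are not listed here.

```python
from urllib.parse import urlencode


def facet_params(field, size=None, values=None, total=None, interval=None):
    """Return (key, value) pairs for one facet and its per-field options."""
    pairs = [("facet", field)]
    if size is not None:
        pairs.append((f"facet_size:{field}", str(size)))
    if values is not None:
        pairs.append((f"facet_values:{field}", "true" if values else "false"))
    if total is not None:
        pairs.append((f"facet_total:{field}", "true" if total else "false"))
    if interval is not None:
        pairs.append((f"facet_interval:{field}", interval))
    return pairs


# Reproduces the "only counts, no values" argument string shown above.
pairs = facet_params("entities", values=False, total=True)
args = urlencode(pairs, safe=":")
print(args)
```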
metric:TYPE
Numeric metric aggregation. TYPE is one of sum, avg, min, max. The value is an FtM property name (the "numeric." ES field prefix is resolved internally).
# Single metric
--args "metric:sum=amount"
# Multiple metrics on same or different fields
--args "metric:sum=amount&metric:avg=amount&metric:min=registrationArea"
Response key format: {field}.{type} (e.g. amount.sum)
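To make the key format concrete, the sketch below derives the expected response keys from an argument string. It is illustrative only: expected_metric_keys is a hypothetical helper, and it assumes the arguments parse as a standard query string.

```python
from urllib.parse import parse_qsl

METRIC_TYPES = {"sum", "avg", "min", "max"}


def expected_metric_keys(args):
    """Map each metric:TYPE=PROP parameter to its '{field}.{type}' response key."""
    keys = []
    for name, prop in parse_qsl(args):
        prefix, _, metric_type = name.partition(":")
        if prefix != "metric":
            continue  # not a metric parameter (e.g. facet or filter)
        if metric_type not in METRIC_TYPES:
            raise ValueError(f"unsupported metric type: {metric_type}")
        keys.append(f"{prop}.{metric_type}")
    return keys


keys = expected_metric_keys(
    "metric:sum=amount&metric:avg=amount&metric:min=registrationArea"
)
print(keys)
```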
Metric aggregations
Compute numeric metrics (sum, avg, min, max) on numeric fields. Use the metric: prefix with FtM property names; the "numeric." ES field prefix is resolved internally.
# Sum of payment amounts
openaleph-search search query-string "*" \
--args "filter:schema=Payment&metric:sum=amount"
# Multiple metrics
openaleph-search search query-string "*" \
--args "filter:schema=Payment&metric:sum=amount&metric:avg=amount&metric:min=amount&metric:max=amount"
Supported types: sum, avg, min, max
Response format
Aggregations appear in the aggregations section:
{
  "hits": {...},
  "aggregations": {
    "dataset.values": {
      "buckets": [
        {"key": "panama_papers", "doc_count": 1250},
        {"key": "paradise_papers", "doc_count": 890}
      ]
    },
    "schema.cardinality": {
      "value": 12
    },
    "created_at.intervals": {
      "buckets": [
        {
          "key": 1609459200000,
          "key_as_string": "2021-01-01",
          "doc_count": 145
        }
      ]
    },
    "names.significant_terms": {
      "buckets": [
        {
          "key": "mossack fonseca",
          "doc_count": 25,
          "score": 0.8745,
          "bg_count": 100
        }
      ]
    },
    "amount.sum": {
      "value": 125000.0
    },
    "amount.avg": {
      "value": 2500.0
    }
  }
}
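A client can flatten this structure with a few lines of code. The sketch below uses a trimmed copy of the sample response above; summarize and bucket_start are hypothetical helper names, and it assumes every aggregation carries either a buckets list or a single value, as in the sample.

```python
from datetime import datetime, timezone

# Trimmed copy of the sample response shown above.
response = {
    "aggregations": {
        "dataset.values": {"buckets": [
            {"key": "panama_papers", "doc_count": 1250},
            {"key": "paradise_papers", "doc_count": 890},
        ]},
        "schema.cardinality": {"value": 12},
        "created_at.intervals": {"buckets": [
            {"key": 1609459200000, "key_as_string": "2021-01-01", "doc_count": 145},
        ]},
        "amount.sum": {"value": 125000.0},
    }
}


def summarize(aggregations):
    """Turn bucket aggregations into {label: doc_count} and keep single values."""
    summary = {}
    for name, body in aggregations.items():
        if "buckets" in body:
            summary[name] = {
                bucket.get("key_as_string", bucket["key"]): bucket["doc_count"]
                for bucket in body["buckets"]
            }
        else:
            summary[name] = body["value"]
    return summary


def bucket_start(bucket):
    """Convert a date histogram key (milliseconds since the epoch) to UTC."""
    return datetime.fromtimestamp(bucket["key"] / 1000, tz=timezone.utc)


summary = summarize(response["aggregations"])
first = response["aggregations"]["created_at.intervals"]["buckets"][0]
print(summary["dataset.values"], bucket_start(first).date())
```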
Common fields
Apart from the common group fields, individual FollowTheMoney properties can also be used via properties.<prop>.
Entity fields
- schema - Entity schema type
- schemata - Schema inheritance (e.g. schemata=LegalEntity includes all its descendants)
- dataset - Dataset identifier
Group fields
These groups are part of the index as keyword fields:
- addresses
- checksums
- countries
- dates
- emails
- entities
- genders
- identifiers
- ips
- languages
- mimetypes
- names
- phones
- topics
- urls
Name fields
- names - Normalized entity names (includes the NER mentions from Analyzable entities)
- name_symbols - Name symbols (extracted from names)
Date histograms
Calendar intervals
openaleph-search search query-string "transaction" \
--args "facet=dates&facet_interval:dates=month"
Example values: year, quarter, month, week, day
Fixed intervals
Examples: 1h, 15m, 7d
Date range with histogram
openaleph-search search query-string "event" \
--args "filter:gte:properties.startDate=2020-01-01&filter:lte:properties.startDate=2023-12-31&facet=properties.startDate&facet_interval:properties.startDate=quarter"
The histogram includes empty buckets within the filtered range.
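To check which buckets to expect for the range above, a client can enumerate the quarter boundaries itself. This is a sketch, assuming buckets start on the first day of each calendar quarter; quarter_starts and fill_empty are hypothetical helpers, and the counts are made-up sample data.

```python
from datetime import date


def quarter_starts(start, end):
    """List the first day of every calendar quarter that overlaps [start, end]."""
    year, quarter = start.year, (start.month - 1) // 3
    starts = []
    while True:
        first_day = date(year, quarter * 3 + 1, 1)
        if first_day > end:
            break
        starts.append(first_day)
        quarter += 1
        if quarter == 4:
            year, quarter = year + 1, 0
    return starts


def fill_empty(starts, counts):
    """Pair every expected bucket with its count, defaulting to 0."""
    return [(day.isoformat(), counts.get(day.isoformat(), 0)) for day in starts]


starts = quarter_starts(date(2020, 1, 1), date(2023, 12, 31))
# Made-up counts for two quarters; all other quarters stay at 0.
series = fill_empty(starts, {"2020-01-01": 3, "2021-07-01": 8})
print(len(series), series[0], series[-1])
```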
Post-filters
Each facet is computed without the filters on its own field, so it still shows the values a user could switch to:
# Dataset facet shows ALL datasets, not just filtered ones
openaleph-search search query-string "company" \
--args "filter:dataset=collection1&filter:dataset=collection2&facet=dataset"
This allows users to see alternative filter options.
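The behaviour can be illustrated with a small in-memory model. This is not how openaleph-search implements it; it is a toy sketch that assumes filters on other fields still apply to each facet, using made-up documents and the hypothetical helpers matches and facet_counts.

```python
from collections import Counter

# Made-up documents for illustration.
docs = [
    {"dataset": "collection1", "schema": "Company"},
    {"dataset": "collection2", "schema": "Company"},
    {"dataset": "collection3", "schema": "Company"},
    {"dataset": "collection3", "schema": "Person"},
]


def matches(doc, filters):
    """A document matches when every filtered field has an allowed value."""
    return all(doc[field] in allowed for field, allowed in filters.items())


def facet_counts(docs, filters, field):
    """Count values of `field`, ignoring the filters placed on `field` itself."""
    other_filters = {
        name: allowed for name, allowed in filters.items() if name != field
    }
    return Counter(doc[field] for doc in docs if matches(doc, other_filters))


filters = {"dataset": {"collection1", "collection2"}, "schema": {"Company"}}
hits = [doc for doc in docs if matches(doc, filters)]
dataset_facet = facet_counts(docs, filters, "dataset")

# Hits are restricted to the two selected datasets, but the facet
# still lists collection3 as an alternative.
print(len(hits), dict(dataset_facet))
```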
Performance
Execution strategy
All facets use execution_hint: map for keyword fields.
High cardinality
Fields with many unique values:
- Use facet size limits
- Monitor query performance
- Consider sampling for large datasets
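One way to notice that a size limit is hiding values is to request facet_total alongside the facet and compare the two numbers. The sketch below assumes the response keys follow the {field}.values and {field}.cardinality pattern shown under Response format; is_truncated is a hypothetical helper and the data is made up. If the total comes from an Elasticsearch cardinality aggregation it is approximate, so treat the result as a hint.

```python
def is_truncated(aggregations, field):
    """Return True when more distinct values exist than buckets were returned."""
    buckets = aggregations[f"{field}.values"]["buckets"]
    total = aggregations[f"{field}.cardinality"]["value"]
    return total > len(buckets)


# Made-up response fragment: 2 buckets returned, 57 distinct values reported.
aggregations = {
    "countries.values": {"buckets": [
        {"key": "pa", "doc_count": 40},
        {"key": "vg", "doc_count": 31},
    ]},
    "countries.cardinality": {"value": 57},
}

print(is_truncated(aggregations, "countries"))
```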
Examples
Multi-facet analysis
openaleph-search search query-string "investigation" \
--args "facet=dataset&facet=schema&facet=countries&facet=created_at&facet_interval:created_at=month"
Document classification
openaleph-search search query-string "*" \
--args "filter:schemata=Document&facet=properties.mimeType&facet=languages&facet_size:properties.mimeType=100"
Entity network
openaleph-search search query-string "person" \
--args "filter:schema=Person&facet=dataset&facet=countries"
Temporal trends
openaleph-search search query-string "company" \
--args "facet=schema&facet=created_at&facet_interval:created_at=year&facet_size:schema=50"
Payment totals
openaleph-search search query-string "*" \
--args "filter:schema=Payment&filter:beneficiary=entity-id&metric:sum=amount&metric:avg=amount"
Error handling
Invalid fields
Non-existent fields return empty results.
Type mismatches
Requesting a histogram on a non-date field falls back to a terms aggregation.
Authorization failures
Restricted fields return empty results while maintaining query functionality.