Highlighting
Show where search terms appear in results with highlighted text snippets.
Elasticsearch documentation highlighters
Basic usage
Enable highlighting via query parameters:
Parameters
highlight
Enable highlighting.
- Type:
bool - Default:
false
highlight_count
Number of snippets per document.
- Type:
int - Default:
3 - Use
0to return full highlighted text
max_highlight_analyzed_offset
Maximum characters to analyze per document.
- Type:
int - Default:
999999
Reduce this value for better performance on large documents:
Highlighter types
The system uses different Elasticsearch highlighters optimized for each field type:
Fast Vector Highlighter (FVH)
Used for: content field (full-text of source documents)
Best for long text with accurate phrase highlighting. Requires term vectors to be stored in the index.
Configuration (via environment):
OPENALEPH_SEARCH_HIGHLIGHTER_FVH_ENABLED=true(default)- Requires
OPENALEPH_SEARCH_CONTENT_TERM_VECTORS=true(default)
If fast vector highlighting is disabled, the unified highlighter is used for the content field.
Unified Highlighter
Used for: name field
Balanced performance for entity names and titles.
Plain Highlighter
Used for: names (keywords), text, and other fields
Fast highlighting for simple matches.
Configuration
Control highlighting behavior via environment variables:
highlighter_fvh_enabled
Use Fast Vector Highlighter for content field.
When false, uses Unified Highlighter instead.
highlighter_fragment_size
Characters per snippet.
Default: 200
highlighter_number_of_fragments
Snippets per document.
Default: 3
highlighter_phrase_limit
Maximum phrases to analyze per document.
Default: 64
Lower values improve performance but may miss some matches.
highlighter_boundary_max_scan
Characters to scan for sentence boundaries.
Default: 100
highlighter_no_match_size
Fragment size when no match found.
Default: 300
highlighter_max_analyzed_offset
Maximum characters to analyze.
Default: 999999
Response format
Highlighted results appear in the highlight field:
{
"hits": {
"hits": [
{
"_id": "doc-123",
"_source": {...},
"highlight": {
"content": [
"Evidence of <em>corruption</em> was found...",
"The <em>investigation</em> revealed..."
],
"name": [
"<em>John Smith</em>"
]
}
}
]
}
}
Matched terms are wrapped in <em> tags.
Fields highlighted
Multiple fields are highlighted automatically:
content- Main document textname- Entity namesnames- Name keywordstext- Secondary text content
Examples
Basic highlighting
More snippets
Full text highlighting
Limited document size
openaleph-search search query-string "report" \
--args "highlight=true&max_highlight_analyzed_offset=100000"
With filters
openaleph-search search query-string "corruption" \
--args "filter:schema=Document&filter:countries=us&highlight=true"
Performance considerations
Index size
Fast Vector Highlighter requires term vectors, which increase index size by approximately 20-30%.
Disable term vectors if index storage size is a serious concern:
Requires reindexing to take effect.
Query performance
- More snippets (
highlight_count) = slower queries - Larger documents = slower highlighting
- Lower
phrase_limit= faster but less accurate - Reduce
max_highlight_analyzed_offsetfor large documents
Optimization tips
For better performance:
# Reduce snippets
--args "highlight=true&highlight_count=2"
# Limit analyzed text
--args "highlight=true&max_highlight_analyzed_offset=500000"
# Use dehydration with highlighting
--args "highlight=true&dehydrate=true"
Troubleshooting
No highlights returned
Check that:
- highlight=true parameter is set
- Query matches terms in highlightable fields
- Terms exist in analyzed fields (not just keyword fields)
Incomplete highlights
- Document may exceed
max_highlight_analyzed_offset - Query may exceed
phrase_limit - Field may not have term vectors enabled
Slow highlighting
- Reduce
highlight_count - Lower
max_highlight_analyzed_offset - Decrease
phrase_limit - (Other than that the name suggests, the FVH seems to be slower than the unified highlighter): Consider disabling FVH:
OPENALEPH_SEARCH_HIGHLIGHTER_FVH_ENABLED=false