# MCP Search Best Practices

Markdown guidance for LLMs using GovTribe MCP search tools.

The following guide is given to LLMs when they call the Search Instructions tool on the GovTribe MCP. It serves as useful context for understanding how searching works in GovTribe.
## Search Guidance

### General
When building GovTribe search payloads, every field marked as required must always be included in the request, even when the user wants an "empty" or unfiltered search. If a required field has no user-provided value, you (the assistant) must supply a valid empty value for that field. Use the following defaults:

- String fields: `""`
- Arrays: `[]`
- Date ranges: `{ "from": null, "to": null }`
- Numeric ranges: `{ "min": null, "max": null }`
- Sort objects: `{ "key": "_score", "direction": "desc" }`
- Query: always a string, never null; use `""` when no free-text search is needed.

You (the assistant) must never omit a required property and must never use `null` where the field expects a string or array.
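For example, a completely unfiltered request still carries every required field with its default value. A minimal sketch, assuming a tool whose required fields match the defaults above (the `naics_category_ids`, `posted_date_range`, and `dollars_obligated_range` names are illustrative, not real schema):

```json
{
  "query": "",
  "naics_category_ids": [],
  "posted_date_range": { "from": null, "to": null },
  "dollars_obligated_range": { "min": null, "max": null },
  "sort": { "key": "_score", "direction": "desc" }
}
```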
### Query

A string or empty string. Uses Elasticsearch (not Lucene) logic.

- Keyword mode: use quotes for exact phrases, `|` for OR, `-` to exclude.
- Semantic mode: natural language with synonyms.
- Use an empty string for aggregation-only queries.

Only keyword mode supports operators (quotes, `|`, and `-`); semantic mode only accepts queries without operators. If you want to search by, or exclude results by, their GovTribe IDs, use the dedicated `ids` filter.
### Pagination & Sizing

Choose the smallest value that answers the question; paginate only when needed.
### Sorting

Only set `sort` when:

- a user or prompt explicitly asks for a specific ordering, or
- the task is analytical (e.g., time series, leaderboards, stats).
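For example, a "largest awards first" request might sort on a value field; `dollars_obligated` is an illustrative key here, since the valid sort keys are defined by each tool's schema:

```json
{ "sort": { "key": "dollars_obligated", "direction": "desc" } }
```

When no ordering is requested, leave the default `{ "key": "_score", "direction": "desc" }` in place.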
### Similarity Search

Use `similar_filter` when the user wants "items like this other item." Always provide both:

- `govtribe_type` (the referenced item's type), and
- `govtribe_id` (the referenced item's ID).

Example: `{"govtribe_type":"federal_contract_award","govtribe_id":"<ID>"}`.

### Date Input Grammar

All `..._date_range` fields accept:

- Plain dates (`YYYY-MM-DD`), or
- Elasticsearch date math (e.g., `now-7d`, `now-1h/d`, `now+12M/d`).

If the user gives only one bound, infer the other sensibly (e.g., "last 90 days" → `from: now-90d/d`, `to: now/d`). Using only one bound at a time is acceptable: providing only `to` will match anything before that date. When you include a range object, always include both the `from` and `to` keys, setting the open bound to `null`.
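For example, "everything posted in the last 90 days" becomes the range below; `posted_date_range` is an illustrative field name, so substitute whichever `..._date_range` field the tool actually exposes:

```json
{ "posted_date_range": { "from": "now-90d/d", "to": "now/d" } }
```

An open-ended "before 2024" request would keep both keys but null the open bound: `{ "from": null, "to": "2024-01-01" }`.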
### Agency & ID Fields

- Contracting (Awarding) Agency ≈ "who signed it."
- Funding Agency ≈ "who owns the money."

If the user provides names instead of IDs, first resolve them with `research_federal_agencies`, then pass the resulting IDs to the target search.
### Location Filters

Fields like `place_of_performance` / `vendor_location` accept countries, states, counties, cities, and postal codes. Match the user's specificity.
### Aggregations (Leaderboards & Roll-ups)

Use `aggregations` when the user wants counts, sums, "top N …", or overall stats. Keep `aggregations` values within the tool's allowed enum; return a compact sample of raw items if it clarifies the aggregates.
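As a sketch, a "top agencies by dollars" request might combine an empty query with aggregation values; the array shape and the `top_agencies` / `dollars_obligated` values are assumptions drawn from the examples later in this guide, so always check the tool's declared enum:

```json
{
  "query": "",
  "aggregations": ["top_agencies", "dollars_obligated"],
  "sort": { "key": "_score", "direction": "desc" }
}
```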
### ID Lookups (Cross-Tool Pattern)

When the API expects IDs (agencies, vendors, vehicles, categories) and the user gives names, first call the appropriate `research_*` resolver tool to get the IDs, then perform the main query. We are in beta and are still creating research tools, so if no research tool is available, inform the user "This tool has not been migrated to MCP yet" and ask for the ID directly.
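A sketch of the two-step pattern, with assumed request shapes (the exact parameters of `research_federal_agencies` may differ). First, resolve the name:

```json
{ "query": "Defense Information Systems Agency" }
```

Then pass the returned ID into the target search:

```json
{
  "query": "",
  "contracting_federal_agency_ids": ["<AGENCY_ID_FROM_STEP_1>"]
}
```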
### Search Mode

This guidance explains how to set the two parameters used by GovTribe research tools: `search_mode` (mode selector) and `query` (the final, transformed query string to execute).

**Parameter contract**

- `search_mode`: choose `"keyword"` or `"semantic"` per the decision checklist below.
- `query`: set this to the transformed query produced by the chosen mode. This is the exact string sent to the search tool.
- Exception (structured / aggs-only): if the user request is strictly about aggregations or filtering on structured fields (e.g., NAICS/PSC/UEI/CAGE/contract IDs), do not send a free-text query. Set `query` to `""` (never `null`, per the defaults above) and rely entirely on structured parameters.
#### Aggregations (aggs) support

"Aggs" (rollups, distributions, and leaderboards such as `dollars_obligated` or `top_agencies`) are reliably available only in `keyword` search mode. `semantic` search mode focuses on semantic document retrieval and does not guarantee or optimize aggs. If the user requests totals, counts, "top N", "breakdown by …", "distribution of …", or time-series rollups, choose `keyword` search mode.
#### Search mode `keyword`: "fuzzy keyword search with optional exact matching"

**What it does**

Runs an Elasticsearch `simple_query_string`-style search with:

- Default operator: AND across all tokens.
- Fuzzy matching enabled (e.g., `fuzziness`, `fuzzy_max_expansions`).
- Priority on direct term/phrase overlap: good recall on misspellings and near-matches; precision improves when you add quotes around phrases.
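For intuition only, keyword mode behaves roughly like the Elasticsearch query below; the option values and field routing are internal to GovTribe, so treat this as an assumption-laden sketch rather than the actual implementation:

```json
{
  "query": {
    "simple_query_string": {
      "query": "\"enterprise cyber range\" | \"enterprise cyber training\" -expired",
      "default_operator": "and",
      "fuzzy_max_expansions": 50
    }
  }
}
```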
**What the AI may do to the user's query**

Keep the query verbatim, or:

- Fix obvious spelling errors outside of quoted strings and outside identifier-looking tokens.
- Add double quotes around words/phrases the user clearly intends as exact strings (names, IDs, titles, multi-word entities).
- Build OR lists using the `|` operator (simple-query-string OR). Example: `"enterprise cyber range" | "enterprise cyber training"`.
- Exclude terms using `-term`.
- Do NOT use other SQS operators or field scoping (`+`, `(`, `)`, `~`, `*`, `^`, `field:`, etc.). Do not alter text inside quotes.
**Strengths**

- Supports aggregations.
- Best for lookup and literal intent:
  - Exact identifiers, codes, file names, titles, part numbers, solicitation/notice IDs (e.g., W912HQ-24-R-0123), UEI/CAGE codes, NAICS/PSC codes.
  - Proper nouns or quoted strings where the user expects near-exact hits.
  - Short queries with 1–3 salient tokens.
- Tolerant of minor typos and near-spellings.
**Limitations**

- Weaker for conceptual or open-ended asks where meaning matters more than words.
- Can miss semantic equivalents not present as index synonyms.
**When to choose search mode `keyword`**

- The user's intent is "fetch this exact thing" (lookup, navigational).
- The user requests aggregations (e.g., "top vendors", "count by agency/NAICS", "trend by month").
- The query contains quoted text or looks like an ID/code (hyphenated, all-caps alphanumerics, digit patterns).
- The user mentions an exact title/name ("RFP 'Enterprise Cyber Range'").
- The query is very short and specific ("CMMC L3 RFI").
**How to construct `query` (safe transformations)**

- Normalize whitespace and fix obvious typos, except inside quoted strings or identifier-like tokens.
- Wrap clear entities in double quotes: organization names, multi-word titles, IDs.
- If the user enumerates alternatives, build an OR list with `|` between the (possibly quoted) terms/phrases.
- If the user wants to exclude a term, prefix it with `-`.
- Do not add operators beyond quotes, `|`, or `-`. Boolean operators like `OR` or `AND` are not supported.
**Examples (→ assign to `query` for search mode `keyword`)**

- `W912HQ-24-R-0123` → `"W912HQ-24-R-0123"`
- `cisa endpoint detection rfi` → `"CISA" "endpoint detection" RFI`
- `uei v1abcde345f6` → `"V1ABCDE345F6"`
- `"enterprise cyber range" training` → `"enterprise cyber range" training`
#### Search mode `semantic`: "dense vector semantic search (meaning over words)"

**What it does**

Embeds the user query into a dense vector and retrieves semantically similar content, independent of exact lexical overlap.
**What the AI may do to the user's query**

Send it verbatim, or apply light reformulation:

- Synonym/paraphrase expansion (2–6 items): e.g., RFP ⇢ request for proposal; solicitation; notice · set-aside ⇢ small business; 8(a); WOSB; SDVOSB; HUBZone · IT services ⇢ information technology; software development.
- Query relaxation (only if results are likely sparse): drop or soften narrow constraints (numbers, exact dates, long conjunctive tails) while keeping the core intent.
- Keep the reformulation in plain natural language.
- No boolean operators: `OR`, `AND`, etc. are not supported.
- Cap length: keep the final string concise (≈ 20–25 words).
**Strengths**

Best for conceptual, exploratory, or intent-heavy queries:

- "What's similar to …", "alternatives", "how/why/best ways".
- Broad topical searches where terminology varies (synonyms, abbreviations).
- Multi-clause queries that read like a question or task.
**Limitations**

- Does not reliably support aggregations; if aggregates are required, use search mode `keyword`.
- May underweight exact identifiers and strict literal intent.
- Relaxation can introduce drift if overused; apply it conservatively.
**When to choose search mode `semantic` (signals)**

- The query is a question or seeks guidance/ideas ("how to", "ways to", "similar to", "alternatives").
- The topic is broad or ambiguous, or relies on synonyms.
- The query mixes multiple related notions where meaning matters more than literal overlap.
**How to construct `query` (reformulation recipe)**

- Keep the core intent in plain language.
- Append 2–6 high-value domain synonyms/paraphrases.
- If results are sparse, relax the narrowest numeric/date constraints last.
- Do not inject contradictions or change scope. Keep it ≤ ~25 words.
**Examples (→ assign to `query` for search mode `semantic`)**

- `ways to find recompete contracts` → `ways to find recompete contracts; identify expiring awards; renewal opportunities; follow-on opportunities`
- `small business set-aside cloud modernization rfp examples` → `examples of small business set-aside cloud modernization solicitations; RFPs; RFIs; sources sought`
- `find similar notices to FAA SWIM support` → `notices similar to FAA SWIM support; aviation data integration; system wide information management; enterprise integration support`
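The corresponding semantic-mode request carries the reformulated natural-language string, again with the default sort from the General section:

```json
{
  "search_mode": "semantic",
  "query": "ways to find recompete contracts; identify expiring awards; renewal opportunities; follow-on opportunities",
  "sort": { "key": "_score", "direction": "desc" }
}
```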
#### How the AI should choose a mode

**One-screen decision checklist**

After choosing, set `search_mode` accordingly, then build and assign the resulting string to `query` per the relevant construction rules.

Pick search mode `keyword` if any are true:

- The query includes quotes that signal exact matching.
- The query contains an ID/code/numbered token (solicitation/notice ID, UEI, CAGE, NAICS/PSC, contract number).
- The user intent is lookup / navigate to a specific document.
- The query is ≤ 3 tokens and appears specific rather than conceptual.
- The user asks for aggregations (counts/tops/distributions/time series).
Pick search mode `semantic` if any are true:

- The query is a question or seeks conceptual/semantic matches.
- The topic is broad, ambiguous, or relies on synonyms.
- The user asks for "similar to / related to / alternatives".
- The query mixes multiple related ideas where meaning matters more than literal overlap.
**Tie-breakers**

- If the query mixes a unique identifier with conceptual context (e.g., `"W912HQ-24-R-0123 recompete history"`), choose search mode `keyword` and keep the ID quoted; include the extra context unquoted.
- If the query has no unique tokens and would benefit from synonyms, choose search mode `semantic`.
- If constraints conflict (very long exact phrase + "similar to"), prefer search mode `semantic` unless there's a quoted ID; then use search mode `keyword`.
#### Compact pseudocode

Minimal, mode-specific query construction rules:

If search mode is `keyword`, set `query` to a string that:

- Preserves user quotes; adds quotes to clear entities/IDs.
- Fixes obvious typos outside quotes and outside IDs.
- Outputs a space-separated list of (possibly quoted) terms/phrases, with optional `|` ORs and `-` excludes. No other operators.

If search mode is `semantic`, set `query` to a string that:

- Starts with the user's natural-language query.
- Appends 2–6 domain-aware synonyms/paraphrases.
- If results are likely sparse, removes the narrowest numeric/date constraints last.
- Keeps the whole string concise (≈ 20–25 words).
#### Quick domain-flavored examples

| User query | `search_mode` | `query` |
| --- | --- | --- |
| W912HQ-24-R-0123 | keyword | `"W912HQ-24-R-0123"` |
| "cyber incident response" BPA | keyword | `"cyber incident response" BPA` |
| uei v1abcde345f6 | keyword | `"V1ABCDE345F6"` |
| cisa endpoint detection rfi | keyword | `"CISA" "endpoint detection" RFI` |
| how to find recompetes in DoD | semantic | ways to find recompete contracts in DoD; identify expiring awards; follow-on opportunities; contract renewals |
| similar notices to FAA SWIM support | semantic | notices similar to FAA SWIM support; aviation data integration; System Wide Information Management; enterprise integration support |
| small business set-aside cloud modernization rfp examples | semantic | examples of small business set-aside cloud modernization solicitations; RFPs; RFIs; sources sought |
### How to resolve any field ending in `_ids`

All `*_ids` fields require valid GovTribe IDs. Use the search tools to resolve names/identifiers to IDs. Most fields ending in `_ids` are self-explanatory (e.g., `pipeline_ids` filters by pipelines). The following have extra context:

**Agency distinctions:**

- `contracting_federal_agency_ids`: Who signed the contract
- `funding_federal_agency_ids`: Who owns the money
- `federal_agency_ids`: Any agency involved (contracting or funding)

**Location granularity:**

- `place_of_performance_ids`: Accepts countries, states, counties, cities, postal codes
- `vendor_location_ids`: Accepts countries, states, counties, cities, postal codes

**Vendor relationships:**

- `vendor_ids`: Prime contractors/awardees
- `sub_vendor_ids`: Subcontractors

**Other:**

- `federal_meta_opportunity_ids`: The originating solicitation/notice
- `vendor_primary_registered_naics_category_ids`: Vendor's primary NAICS
- `vendor_registered_psc_category_ids`: Any PSC the vendor is registered for
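For example, "awards funded by a given agency, performed in a given state, won by a given prime" might combine several `_ids` filters once the IDs have been resolved; the `<...>` values are placeholders for real GovTribe IDs:

```json
{
  "query": "",
  "funding_federal_agency_ids": ["<AGENCY_ID>"],
  "place_of_performance_ids": ["<STATE_ID>"],
  "vendor_ids": ["<VENDOR_ID>"]
}
```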