AI Assistant (Toti)

Toti is a multilingual AI assistant for the INOV-Norte Digital HUB, designed to answer questions about the platform’s catalog while enforcing strict source and routing rules. Its core principle is: internal catalog first, with external search allowed only as a controlled supplement for some entity types, and never as a substitute for missing catalog content.


Scope

Toti is designed to answer questions about seven catalog entity types indexed in Qdrant: course, tool, resource, best_practice, case_study, pedagogical_method, and community. Qdrant supports payload metadata and filtering, which is what enables type-aware retrieval using fields like type and slug.

Because of business rules, course and community are special cases:

  • They must be answered only from internal catalog data.
  • No external augmentation or search is allowed.

The remaining entity types (tool, resource, best_practice, case_study, pedagogical_method) may use external search only to complement an entity already found in the catalog, never to introduce an entity that does not exist in Qdrant.
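
The scope rules above can be sketched as a small policy table. This is only an illustration under stated assumptions: the helper name mayUseExternalSearch and the catalogHits parameter are hypothetical and do not come from the Toti codebase.

```typescript
// The seven entity types named in the Scope section.
const ENTITY_TYPES = [
  "course", "tool", "resource", "best_practice",
  "case_study", "pedagogical_method", "community",
] as const;

type EntityType = (typeof ENTITY_TYPES)[number];

// Types that must be answered from internal catalog data only.
const INTERNAL_ONLY: ReadonlySet<EntityType> = new Set<EntityType>([
  "course",
  "community",
]);

// Hypothetical helper: external search is permitted only for a
// non-restricted type AND only when the catalog already returned
// at least one entity to complement.
function mayUseExternalSearch(type: EntityType, catalogHits: number): boolean {
  return !INTERNAL_ONLY.has(type) && catalogHits > 0;
}
```

With this shape, a tool query that finds nothing in the catalog still gets no external search, which matches the rule that the web never introduces a missing entity.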


Architecture

The catalog source of truth is Directus, and Qdrant acts as the vector retrieval layer. At runtime, Toti follows a routed Retrieval-Augmented Generation (RAG) architecture orchestrated by Langflow:

Routed RAG Flow

  1. User message enters the flow.
  2. Langflow acts as the agent orchestration layer, routing inputs and coordinating pipeline execution.
  3. A lightweight intent router classifies the request.
  4. The route determines whether the request is:
    • Clarification (confidence is low or intent is weak)
    • Restricted retrieval (internal catalog only)
    • Hybrid retrieval (internal catalog with optional external enrichment)
  5. Retrieval is executed against Qdrant with a metadata filter on type.
  6. The final Large Language Model (LLM), served by Azure OpenAI or locally by Ollama and invoked through Langflow nodes, generates the answer under strict response rules.

Routing Logic

The router outputs structured JSON with at least:

  • entity_type
  • web_search_allowed
  • confidence
  • needs_clarification
  • clarification_question
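
As an illustration, the router output could be typed and validated as follows. The field names come from the list above; the value ranges (confidence in [0, 1], a nullable clarification_question) and the function name parseRouterOutput are assumptions made for this sketch.

```typescript
type RoutedEntityType =
  | "course" | "tool" | "resource" | "best_practice"
  | "case_study" | "pedagogical_method" | "community" | "unknown";

interface RouterOutput {
  entity_type: RoutedEntityType;
  web_search_allowed: boolean;
  confidence: number; // assumed to be a score in [0, 1]
  needs_clarification: boolean;
  clarification_question: string | null; // assumed to be nullable
}

const KNOWN_TYPES: readonly string[] = [
  "course", "tool", "resource", "best_practice",
  "case_study", "pedagogical_method", "community", "unknown",
];

// Hypothetical parser: returns null instead of throwing so the caller can
// treat malformed router output like a low-confidence result.
function parseRouterOutput(raw: string): RouterOutput | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const {
    entity_type, web_search_allowed, confidence,
    needs_clarification, clarification_question,
  } = data as Record<string, unknown>;
  if (typeof entity_type !== "string" || !KNOWN_TYPES.includes(entity_type)) return null;
  if (typeof web_search_allowed !== "boolean") return null;
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) return null;
  if (typeof needs_clarification !== "boolean") return null;
  if (clarification_question != null && typeof clarification_question !== "string") return null;
  return {
    entity_type: entity_type as RoutedEntityType,
    web_search_allowed,
    confidence,
    needs_clarification,
    clarification_question:
      typeof clarification_question === "string" ? clarification_question : null,
  };
}
```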

The routing policy is as follows:

| Condition | Route | Description |
|---|---|---|
| entity_type == unknown, needs_clarification == true, or low confidence | Clarification | Returns the clarification_question directly without executing retrieval. |
| entity_type in ["course", "community"] | Restricted | Query passes forward but only exposes internal catalog retrieval. |
| entity_type in ["tool", "resource", "best_practice", "case_study", "pedagogical_method"] and web_search_allowed == true | Hybrid | Query passes forward, allowing a second optional enrichment step after catalog retrieval. |

For maximum stability, the Langflow implementation uses simple custom components to parse and emit routing values, while native flow control handles branching.
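
The routing policy can be sketched as a single pure function. Two points here are assumptions, not documented behavior: the MIN_CONFIDENCE threshold of 0.6 (the source only says "low confidence"), and the fallback that sends a hybrid-eligible type to the catalog-only route when web search is not allowed.

```typescript
type RouteName = "clarification" | "restricted" | "hybrid";

interface RouterDecision {
  entity_type: string;
  web_search_allowed: boolean;
  confidence: number;
  needs_clarification: boolean;
}

const RESTRICTED_TYPES: readonly string[] = ["course", "community"];
const HYBRID_TYPES: readonly string[] = [
  "tool", "resource", "best_practice", "case_study", "pedagogical_method",
];

// ASSUMPTION: the threshold value is illustrative only.
const MIN_CONFIDENCE = 0.6;

function selectRoute(d: RouterDecision): RouteName {
  // Clarification wins over everything else, so no retrieval runs.
  if (
    d.entity_type === "unknown" ||
    d.needs_clarification ||
    d.confidence < MIN_CONFIDENCE
  ) {
    return "clarification";
  }
  if (RESTRICTED_TYPES.includes(d.entity_type)) return "restricted";
  if (HYBRID_TYPES.includes(d.entity_type)) {
    // ASSUMPTION: without web permission, fall back to catalog-only retrieval.
    return d.web_search_allowed ? "hybrid" : "restricted";
  }
  // Any label outside the known types is treated like "unknown".
  return "clarification";
}
```

Keeping the decision in one function makes the branch taken easy to log and to unit test independently of the flow that consumes it.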


Retrieval Layer

Qdrant is the key enforcement point. Instead of relying only on semantic similarity, Toti always injects a metadata filter based on the router output so retrieval happens inside the correct entity family (e.g. matching type == "course").

Retrieval behaves in two distinct modes:

  • Semantic retrieval mode: Used for specific questions (e.g., "courses about AI in education").
  • Catalog listing mode: Used for vague or exploratory requests (e.g., "tell me a course" or "show me communities"). Pure vector similarity often performs poorly for generic queries like "course" because they do not resemble the semantic body of any single item. Toti intercepts these requests and uses a filter-based listing strategy to return a small set of items from the requested type.

The Qdrant component:

  1. Accepts entity_type as dynamic input.
  2. Builds a filter on type.
  3. Uses vector search for meaningful queries.
  4. Uses a filter-only listing fallback for generic requests.
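
A minimal sketch of the two retrieval modes, expressed as request bodies for Qdrant's REST search and scroll endpoints. The collection name, the English-only word list, and the isGenericQuery heuristic are assumptions made for illustration; the source does not describe how Toti detects generic requests.

```typescript
interface QdrantCall {
  path: string;
  body: Record<string, unknown>;
}

// ASSUMPTION: a tiny English-only list of filler words and type names.
const GENERIC_WORDS = new Set([
  "a", "an", "the", "me", "tell", "show", "give", "some", "any", "list",
  "course", "courses", "tool", "tools", "resource", "resources",
  "best", "practice", "practices", "case", "study", "studies",
  "pedagogical", "method", "methods", "community", "communities",
]);

// A query is "generic" when every token is a filler word or a type name.
function isGenericQuery(query: string): boolean {
  const tokens = query.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
  return tokens.every((t) => GENERIC_WORDS.has(t));
}

function buildQdrantCall(
  collection: string,
  entityType: string,
  query: string,
  queryVector: number[],
  limit = 3,
): QdrantCall {
  // The type filter is applied in both modes.
  const filter = { must: [{ key: "type", match: { value: entityType } }] };
  if (isGenericQuery(query)) {
    // Catalog listing mode: filter-only scroll, no vector involved.
    return {
      path: `/collections/${collection}/points/scroll`,
      body: { filter, limit, with_payload: true },
    };
  }
  // Semantic retrieval mode: vector search inside the filtered type.
  return {
    path: `/collections/${collection}/points/search`,
    body: { vector: queryVector, filter, limit, with_payload: true },
  };
}
```

For example, "tell me a course" produces a scroll call, while "courses about AI in education" produces a vector search, both restricted to type == "course".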

User-facing links are generated using the indexed slug payload in the form: https://hub.inov-norte.pt/[type]/[slug]
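
The link pattern can be captured in a one-line helper. The function name catalogLink and the sample slug are hypothetical; percent-encoding the segments is a defensive choice of this sketch, not something the source specifies.

```typescript
// Builds a user-facing catalog link from the indexed type and slug payload.
function catalogLink(type: string, slug: string): string {
  return `https://hub.inov-norte.pt/${encodeURIComponent(type)}/${encodeURIComponent(slug)}`;
}

// Example with a made-up slug:
// catalogLink("course", "ai-in-education")
//   returns "https://hub.inov-norte.pt/course/ai-in-education"
```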


Agent Behavior

Toti exposes two different answer-generation profiles based on the selected route.

Restricted Agent

Used for course and community.

  • Must use only the internal catalog source.
  • Must not use external search.
  • Must not invent entities, dates, schedules, or availability.
  • If nothing is found, it must say so clearly and stop.
  • Prefers concise output: title first, optional short explanation, then a direct link.
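
The output shape described above (title first, optional short explanation, then a direct link) could be rendered along these lines. This is a sketch only: in Toti the LLM produces the final wording, and the CatalogItem shape, the function name, and the two-item cap are assumptions.

```typescript
interface CatalogItem {
  type: string;
  slug: string;
  title: string;
  summary?: string;
}

// Renders retrieved items as: title, optional summary, direct link.
// When nothing was retrieved, returns the caller-supplied "not found"
// message (already in the user's language) and adds nothing else.
function formatRestrictedAnswer(
  items: CatalogItem[],
  notFoundMessage: string,
  maxItems = 2,
): string {
  if (items.length === 0) return notFoundMessage;
  return items
    .slice(0, maxItems)
    .map((item) => {
      const lines = [item.title];
      if (item.summary) lines.push(item.summary);
      lines.push(`https://hub.inov-norte.pt/${item.type}/${item.slug}`);
      return lines.join("\n");
    })
    .join("\n\n");
}
```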

Hybrid Agent

Used for tool, resource, best_practice, case_study, and pedagogical_method.

  • Must query the internal catalog first.
  • May use external search only after catalog retrieval, and only to enrich an already retrieved entity.
  • Must not answer with web-only entities that are absent from the catalog.
  • Must clearly prioritize Digital HUB catalog content in the final answer.
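
One way to enforce the "enrich only what the catalog returned" rule in code is to attach external snippets to catalog hits by slug and drop everything else. The type names, the forSlug field, and the function name are hypothetical; the source states the rule but not its implementation.

```typescript
interface CatalogHit {
  slug: string;
  title: string;
}

// An external search result, tagged with the catalog slug it was fetched for.
interface ExternalSnippet {
  forSlug: string;
  text: string;
}

interface EnrichedHit extends CatalogHit {
  externalNotes: string[];
}

// Catalog hits stay first-class; external text is only ever attached to them.
function mergeEnrichment(
  hits: CatalogHit[],
  external: ExternalSnippet[],
): EnrichedHit[] {
  // No catalog result means no answer material at all: web-only
  // entities are never promoted into the response.
  if (hits.length === 0) return [];
  return hits.map((hit) => ({
    ...hit,
    externalNotes: external
      .filter((snippet) => snippet.forSlug === hit.slug)
      .map((snippet) => snippet.text),
  }));
}
```

Because the output is keyed by catalog hits, a snippet about an entity that is absent from the catalog has nowhere to go and is silently discarded.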

Output Policy

Toti’s final answer policy is optimized for teaching staff rather than for raw retrieval transparency:

  • Language: Answer in the same language as the user.
  • Actionability: Prefer short, actionable responses.
  • Vague Requests: Suggest 1–2 relevant items instead of dumping full record contents.
  • Links: Always include direct catalog links when referencing retrieved entities.
  • Fallback: Use the router’s clarification question when intent is weak instead of guessing.

Production Implementation Guidelines

A solid production implementation of Toti must include the following specifications:

  • Directus to Qdrant Sync: Automation of vector synchronization via database triggers or webhook/operation flows (specifically create, update, and delete hooks).
  • Normalized Payload Schema: The Qdrant points must share a uniform payload schema with keys: type, slug, title, summary, tags, and optional language localization fields.
  • Router Model: A lightweight, fast, and deterministic model tasked with parsing intents and returning structured JSON.
  • Qdrant Component: Dynamic routing filter support coupled with an automated fallback to catalog-listing for generic vector queries.
  • Agent Separation: Separate execution paths for Restricted and Hybrid, each with its own tool set and its own system prompt.
  • Clarification-First Fallback: To save computation and cost, never invoke retrieval or vector search when the router confidence score is low.
  • Observability: Logging of the router's JSON payload, selected route path, translated search query, applied metadata filters, and returned content slugs for traceability.
  • Prompt Safety: Concrete system-level guidelines instructing the hybrid agent that external web searching is permitted exclusively for enrichment, and must never introduce non-catalog entities.
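
The sync and payload items above can be sketched as a mapping from a catalog change event to a Qdrant write request. The SyncEvent shape is hypothetical (it is not the Directus hook payload), the embedding vector is assumed to be computed elsewhere, and the paths follow Qdrant's REST endpoints for upserting and deleting points.

```typescript
// Normalized payload stored on every Qdrant point.
interface CatalogPayload {
  type: string;
  slug: string;
  title: string;
  summary: string;
  tags: string[];
  language?: string; // optional localization field
}

// ASSUMPTION: a simplified event emitted by the Directus-side hook or flow.
type SyncEvent =
  | { action: "create" | "update"; id: string; payload: CatalogPayload; vector: number[] }
  | { action: "delete"; id: string };

interface QdrantWrite {
  method: "PUT" | "POST";
  path: string;
  body: Record<string, unknown>;
}

function toQdrantWrite(collection: string, ev: SyncEvent): QdrantWrite {
  if (ev.action === "delete") {
    return {
      method: "POST",
      path: `/collections/${collection}/points/delete`,
      body: { points: [ev.id] },
    };
  }
  // Create and update both map to an upsert, which overwrites by point id.
  // Note: Qdrant point ids must be unsigned integers or UUIDs.
  return {
    method: "PUT",
    path: `/collections/${collection}/points`,
    body: { points: [{ id: ev.id, vector: ev.vector, payload: ev.payload }] },
  };
}
```

Treating create and update identically keeps the sync idempotent: replaying an event leaves the index in the same state.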

Frontend Integration

The chat UI is available globally through a floating action button (FAB) and context-specific "Open in Chat" triggers.

State Management

The stores/chat.ts Pinia store manages all chat state, handling session switching, fetching message history, and sending messages to the Directus /ai-chat endpoint.

Composable

The useChat composable (modules/shared/composables/useChat.ts) provides component-level interaction:

const { 
  displayMessages, 
  title, 
  sendMessage, 
  startNewSession 
} = useChat()