User Guide
Overview
A2RCHI supports various data sources as easy ways to ingest your data into the vector store database used for document retrieval. These include:
- Link lists (even behind SSO): automatically scrape and ingest documents from a list of URLs
- Git scraping: ingest documentation from MkDocs-based Git repositories
- Ticketing systems: JIRA, Redmine, Piazza
- Local documents
Additionally, A2RCHI supports various interfaces/services, which are applications that interact with the RAG system. These include:
- Chat interface: a web-based chat application
- Piazza integration: read posts from Piazza and post draft responses to a Slack channel
- Cleo/Redmine integration: read emails and create tickets in Redmine
- Mattermost integration: read posts from Mattermost and post draft responses to a Mattermost channel
- Grafana monitoring dashboard: monitor system and LLM performance metrics
- Document uploader: web interface for uploading and managing documents
- Grader: automated grading service for assignments with web interface
Both data sources and interfaces/services are enabled via flags to the a2rchi create command,
a2rchi create [...] --services=chatbot,piazza,... --sources jira,redmine,...
The parameters of the services and sources are configured via the configuration file. See below for more details.
We support various pipelines, which are pre-defined sequences of operations that process user inputs and generate responses.
Each service supports a specific set of pipelines.
See the Services and Pipelines sections below for more details.
For each pipeline, you can use different models, retrievers, and prompts for different steps of the pipeline.
We support various models for both embeddings and LLMs, which can be run locally or accessed via APIs.
See the Models section below for more details.
Both pipelines and models are configured via the configuration file.
Finally, we support various retrievers and embedding techniques for document retrieval.
These are configured via the configuration file.
See the Vector Store section below for more details.
Optional command line options
In addition to the required --name, --config/--config-dir, --env-file, and --services arguments, the a2rchi create command accepts several useful flags:
- --podman: Run the deployment with Podman instead of Docker.
- --sources/-src: Enable additional ingestion sources (git, sso, jira, redmine, ...). Provide a comma-separated list.
- --gpu-ids: Mount specific GPUs (--gpu-ids all or --gpu-ids 0,1). The legacy --gpu flag still works but maps to all.
- --tag: Override the local image tag (defaults to 2000). Handy when building multiple configurations side-by-side.
- --hostmode: Use host networking for all services.
- --verbosity/-v: Control CLI logging level (0 = quiet, 4 = debug).
- --force/--dry-run: Force recreation of an existing deployment and/or show what would happen without actually deploying.
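For example, a typical invocation combining several of these flags might look like the following sketch (the deployment name, config file, env file, and GPU index are placeholders to replace with your own):
a2rchi create --name my-deployment --config config.yaml --env-file .env --services chatbot --podman --gpu-ids 0 -v 3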
You can inspect the available services and sources, together with descriptions, using a2rchi list-services.
GPU helpers
GPU access requires the NVIDIA drivers plus the NVIDIA Container Toolkit. After installing the toolkit, generate CDI entries (for Podman) with
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
and confirm with nvidia-ctk cdi list. Docker users should run sudo nvidia-ctk runtime configure --runtime=docker.
Data Sources
These are the different ways to ingest data into the vector store used for document retrieval.
Web Link Lists
A web link list is a simple text file containing a list of URLs, one per line.
A2RCHI will fetch the content from each URL and add it to the vector store, using the Scraper class.
Configuration
You can define which lists of links A2RCHI will ingest in the configuration file as follows:
data_manager:
sources:
links:
input_lists: # REQUIRED
- miscellanea.list # list of websites with relevant info
- [...other lists...]
Each list should be a simple text file containing one URL per line, e.g.,
https://example.com/page1
https://example.com/page2
[...]
In the case that some of the links are behind a Single Sign-On (SSO) system, enable the SSO source in your configuration and specify the collector class:
data_manager:
sources:
sso:
enabled: true
sso_class: CERNSSOScraper # or whichever class is appropriate
sso_class_map:
CERNSSOScraper:
kwargs:
headless: true
max_depth: 2
Then, run a2rchi create ... --sources sso to activate the SSO collector.
You can customise the HTTP scraper behaviour (for example, to avoid SSL verification warnings):
data_manager:
sources:
links:
scraper:
reset_data: true
verify_urls: false
enable_warnings: false
Secrets
If you are using SSO, depending on the class, you may need to provide your login credentials in a secrets file as follows:
SSO_USERNAME=username
SSO_PASSWORD=password
Then, make sure that the links you provide in the .list file(s) start with sso-, e.g.,
sso-https://example.com/protected/page
Running
Link scraping is enabled by default in A2RCHI; you don't need to add any arguments to the create command unless the links are SSO-protected.
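If some of your links are SSO-protected, enable the source explicitly at deploy time, following the same pattern used for the other sources ([...] stands for your required --name, --config, and --env-file arguments):
a2rchi create [...] --services=chatbot --sources sso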
Git scraping
In some cases, the RAG input may be documentation built with MkDocs and hosted in Git repositories.
Instead of scraping these sites as regular HTML sites, you can obtain the relevant content using the GitScraper class.
Configuration
To configure it, enable the git source in the configuration file:
data_manager:
sources:
git:
enabled: true
In the input lists, make sure to prepend git- to the URL of the repositories you are interested in scraping.
git-https://github.com/example/mkdocs/documentation.git
Secrets
You will need to provide a git username and token in the secrets file,
GIT_USERNAME=your_username
GIT_TOKEN=your_token
Running
Enable the git source during deployment with --sources git.
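For example ([...] again stands for the required name, config, and env-file arguments):
a2rchi create [...] --services=chatbot --sources git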
JIRA
The JIRA integration allows A2RCHI to fetch issues and comments from specified JIRA projects and add them to the vector store, using the JiraClient class.
Configuration
Select which projects to scrape in the configuration file:
data_manager:
sources:
jira:
url: https://jira.example.com
projects:
- PROJECT_KEY
anonymize_data: true
You can further customise anonymisation via the global anonymiser settings.
data_manager:
utils:
anonymizer:
nlp_model: en_core_web_sm
excluded_words:
- Example
greeting_patterns:
- '^(hi|hello|hey|greetings|dear)\b'
signoff_patterns:
- '\b(regards|sincerely|best regards|cheers|thank you)\b'
email_pattern: '[\w\.-]+@[\w\.-]+\.\w+'
username_pattern: '\[~[^\]]+\]'
The anonymizer will remove names, emails, usernames, greetings, signoffs, and any other words you specify from the fetched data. This is useful if you want to avoid having personal information in the vector store.
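As a purely illustrative sketch (the exact redaction behaviour may differ), a fetched comment such as
Hi Tom, the job crashed on node 42. You can reach me at jane.doe@example.com. Best regards, [~jdoe]
would be ingested with the greeting, name, email address, sign-off, and username mention stripped, leaving roughly:
the job crashed on node 42.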
Secrets
A personal access token (PAT) is required to authenticate and authorize with JIRA.
Add JIRA_PAT=<token> to your .env file before deploying with --sources jira.
Running
Enable the source at deploy time with:
a2rchi create [...] --services=chatbot --sources jira
Adding Documents and the Uploader Interface
Adding Documents
There are two main ways to add documents to A2RCHI's vector database. They are:
- Manually adding files while the service is running via the uploader GUI
- Directly copying files into the container
These methods are outlined below.
Manual Uploader
In order to upload documents while A2RCHI is running via an easily accessible GUI, enable the uploader service when creating the deployment:
a2rchi create [...] --services=chatbot,uploader
The exact port may vary based on configuration (default external port is 5003).
A quick podman ps or docker ps will show which port is exposed.
In order to access the manager, you must first create an admin account. Grab the container ID with podman ps/docker ps and then enter the container:
docker exec -it <CONTAINER-ID> bash
Run the bundled helper:
python -u src/bin/service_create_account.py
from the /root/A2RCHI directory inside the container. This script will guide you through creating an account; never reuse sensitive passwords here.
Once you have created an account, visit the exposed port of the data manager (uploader) service and then log in. The GUI will then allow you to upload documents while A2RCHI is still running. Note that it may take a few minutes for all the documents to upload.
Directly copying files to the container
The documents used for RAG live in the chat container at /root/data/<directory>/<files>. Thus, in a pinch, you can docker/podman cp a file at this directory level, e.g., podman/docker cp myfile.pdf <container name or ID>:/root/data/<new_dir>/. If you need to make a new directory in the container, you can do podman exec -it <container name or ID> mkdir /root/data/<new_dir>.
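For example, a minimal sketch (the container name is whatever podman ps/docker ps reports, and manual_uploads is an arbitrary new directory name):
podman exec -it <container name or ID> mkdir -p /root/data/manual_uploads
podman cp myfile.pdf <container name or ID>:/root/data/manual_uploads/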
Redmine
Use the Redmine source to ingest solved tickets (question/answer pairs) into the vector store.
Configuration
data_manager:
sources:
redmine:
url: https://redmine.example.com
project: my-project
anonymize_data: true
Secrets
Add the following to your .env file:
REDMINE_USER=...
REDMINE_PW=...
Running
Enable the source at deploy time with:
a2rchi create [...] --services=chatbot --sources redmine
To automate email replies, also enable the redmine-mailer service (see the Services section below).
Interfaces/Services
These are the different apps that A2RCHI supports, which allow you to interact with the AI pipelines.
Piazza Interface
Set up A2RCHI to read posts from your Piazza forum and post draft responses to a specified Slack channel. To do this, you need a Piazza login (email and password), the network ID of your Piazza channel, and a webhook for the Slack channel A2RCHI will post to. See below for a step-by-step description.
- Go to https://api.slack.com/apps and sign in to the workspace where you will eventually want A2RCHI to post (note that doing this in a business workspace like the MIT one will require approval of the app/bot).
- Click 'Create New App', and then 'From scratch'. Name your app and again select the correct workspace. Then hit 'Create App'.
- Now you have your app, and there are a few things to configure before you can launch A2RCHI:
- Go to Incoming Webhooks under Features, and toggle it on.
- Click 'Add New Webhook', and select the channel you want A2RCHI to post to.
- Now, copy the 'Webhook URL' and paste it into the secrets file, and handle it like any other secret!
Configuration
Beyond standard required configuration fields, the network ID of the Piazza channel is required (see below for an example config). You can get the network ID by simply navigating to the class homepage, and grabbing the sequence that follows 'https://piazza.com/class/'. For example, the 8.01 Fall 2024 homepage is: 'https://piazza.com/class/m0g3v0ahsqm2lg'. The network ID is thus 'm0g3v0ahsqm2lg'.
Example minimal config for the Piazza interface:
name: bare_minimum_configuration #REQUIRED
data_manager:
sources:
links:
input_lists:
- class_info.list # class info links
a2rchi:
[... a2rchi config ...]
services:
piazza:
network_id: <your Piazza network ID here> # REQUIRED
chat_app:
trained_on: "Your class materials" # REQUIRED
Secrets
The necessary secrets for deploying the Piazza service are the following:
PIAZZA_EMAIL=...
PIAZZA_PASSWORD=...
SLACK_WEBHOOK=...
The Slack webhook secret is described above. The Piazza email and password should be those of one of the class instructors. Remember to store these values under the exact names shown above.
Running
To run the Piazza service, simply include piazza in the list of services. For example:
a2rchi create [...] --services=chatbot,piazza
Redmine/Mailbox Interface
A2RCHI will read all new tickets in a Redmine project, and draft a response as a comment to the ticket. Once the ticket is updated to the "Resolved" status by an admin, A2RCHI will send the response as an email to the user who opened the ticket. The admin can modify A2RCHI's response before sending it out.
Configuration
services:
redmine_mailbox:
url: https://redmine.example.com
project: my-project
redmine_update_time: 10
mailbox_update_time: 10
answer_tag: "-- A2RCHI -- Resolving email was sent"
Secrets
Add the following secrets to your .env file:
IMAP_USER=...
IMAP_PW=...
REDMINE_USER=...
REDMINE_PW=...
SENDER_SERVER=...
SENDER_PORT=587
SENDER_REPLYTO=...
SENDER_USER=...
SENDER_PW=...
Running
a2rchi create [...] --services=chatbot,redmine-mailer
Mattermost Interface
Set up A2RCHI to read posts from a specified Mattermost channel and post draft responses to another specified Mattermost channel.
Configuration
services:
mattermost:
update_time: 60
Secrets
You need to specify a webhook, access token, and channel identifiers:
MATTERMOST_WEBHOOK=...
MATTERMOST_PAK=...
MATTERMOST_CHANNEL_ID_READ=...
MATTERMOST_CHANNEL_ID_WRITE=...
Running
To run the Mattermost service, include it when selecting services. For example:
a2rchi create [...] --services=chatbot,mattermost
Grafana Interface
Monitor the performance of your A2RCHI instance with the Grafana interface. This service provides a web-based dashboard to visualize various metrics related to system performance, LLM usage, and more.
Note: if you are deploying a version of A2RCHI you have already used (i.e., you haven't removed the images/volumes for a given
--name), the Postgres database will already have been created without the Grafana user, so Grafana will not work; make sure to deploy a fresh instance.
Configuration
services:
grafana:
external_port: 3000
Secrets
Grafana shares the Postgres database with other services, so you need both the database password and a Grafana-specific password:
PG_PASSWORD=<your_database_password>
GRAFANA_PG_PASSWORD=<grafana_db_password>
Running
Deploy Grafana alongside your other services:
a2rchi create [...] --services=chatbot,grafana
and you should see something like this
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d27482864238 localhost/chromadb-gtesting2:2000 uvicorn chromadb.... 9 minutes ago Up 9 minutes (healthy) 0.0.0.0:8000->8000/tcp, 8000/tcp chromadb-gtesting2
87f1c7289d29 docker.io/library/postgres:16 postgres 9 minutes ago Up 9 minutes (healthy) 5432/tcp postgres-gtesting2
40130e8e23de docker.io/library/grafana-gtesting2:2000 9 minutes ago Up 9 minutes 0.0.0.0:3000->3000/tcp, 3000/tcp grafana-gtesting2
d6ce8a149439 localhost/chat-gtesting2:2000 python -u a2rchi/... 9 minutes ago Up 9 minutes 0.0.0.0:7861->7861/tcp chat-gtesting2
where the Grafana interface is accessible at your-hostname:3000. To change the external port from 3000, set services.grafana.external_port in the config. The default login and password are both "admin", which you will be prompted to change (should you want to) after first logging in. Navigate to the A2RCHI dashboard from the home page by going to menu > Dashboards > A2RCHI > A2RCHI Usage. Note that your-hostname here is just the name of the machine. Grafana uses its default configuration, which is localhost; unlike the chat interface, there are no APIs that we template with a selected hostname, so the container networking handles this nicely.
Pro tip: once at the web interface, for the "Recent Conversation Messages (Clean Text + Link)" panel, click the three little dots in the top right-hand corner of the panel, click "Edit", and on the right go to, e.g., "Override 4" (it should have Fields with name: clean text; likewise Override 7 for the context column) and override the property "Cell options > Cell value inspect". This allows you to expand text boxes whose messages are longer than fit in the cell. Make sure you click Apply to keep the changes.
Pro tip 2: If you want to download all of the information from any panel as a CSV, go to the same three dots and click "Inspect", and you should see the option.
Grader Interface
The grader interface launches a website which, for a provided solution and rubric (and a couple of other things detailed below), will grade scanned images of handwritten solutions for the specified problem(s).
Nota bene: this is not yet fully generalized and "service" ready; it is intended for testing grading pipelines and as a base off of which to build a potential grading app.
Requirements
To launch the service the following files are required:
users.csv. This is a .csv file that contains two columns: "MIT email" and "Unique code", e.g.:
MIT email,Unique code
username@mit.edu,222
For now, the system requires the emails to be in the MIT domain, namely, contain "@mit.edu". TODO: make this an argument that is passed (e.g., school/email domain)
solution_with_rubric_*.txt. These are .txt files that contain the problem solution followed by the rubric. The naming of the files should follow this pattern exactly, where the * is the problem number. There should be one of these files for every problem you want the app to be able to grade. The top of the file should be the problem name with a line of dashes ("-") below, e.g.:
Anti-Helmholtz Coils
---------------------------------------------------
These files should live in a directory which you will pass to the config, and A2RCHI will handle the rest.
admin_password.txt. This file will be passed as a secret and be the admin code used to log in to the page where you can reset attempts for students.
Secrets
The only grading-specific secret is the admin password, which, as shown above, should be provided as follows:
ADMIN_PASSWORD=your_password
Then it behaves like any other secret.
Configuration
The required fields in the configuration file are different from the rest of the A2RCHI services. Below is an example:
name: grading_test # REQUIRED
a2rchi:
pipelines:
- GradingPipeline
pipeline_map:
GradingPipeline:
prompts:
required:
final_grade_prompt: final_grade.prompt
models:
required:
final_grade_model: OllamaInterface
ImageProcessingPipeline:
prompts:
required:
image_processing_prompt: image_processing.prompt
models:
required:
image_processing_model: OllamaInterface
services:
chat_app:
trained_on: "rubrics, class info, etc." # REQUIRED
grader_app:
num_problems: 1 # REQUIRED
local_rubric_dir: ~/grading/my_rubrics # REQUIRED
local_users_csv_dir: ~/grading/logins # REQUIRED
data_manager:
[...]
- name -- The name of your configuration (required).
- a2rchi.pipelines -- List of pipelines to use (e.g., GradingPipeline, ImageProcessingPipeline).
- a2rchi.pipeline_map -- Mapping of pipelines to their required prompts and models.
- a2rchi.pipeline_map.GradingPipeline.prompts.required.final_grade_prompt -- Path to the grading prompt file for evaluating student solutions.
- a2rchi.pipeline_map.GradingPipeline.models.required.final_grade_model -- Model class for grading (e.g., OllamaInterface, HuggingFaceOpenLLM).
- a2rchi.pipeline_map.ImageProcessingPipeline.prompts.required.image_processing_prompt -- Path to the prompt file for image processing.
- a2rchi.pipeline_map.ImageProcessingPipeline.models.required.image_processing_model -- Model class for image processing (e.g., OllamaInterface, HuggingFaceImageLLM).
- services.chat_app.trained_on -- A brief description of the data or materials A2RCHI is trained on (required).
- services.grader_app.num_problems -- Number of problems the grading service should expect (must match the number of rubric files).
- services.grader_app.local_rubric_dir -- Directory containing the solution_with_rubric_*.txt files.
- services.grader_app.local_users_csv_dir -- Directory containing the users.csv file.
Running
a2rchi create [...] --services=grader
Models
Models are either:
- Hosted locally, either via VLLM or HuggingFace transformers.
- Accessed via an API, e.g., OpenAI, Anthropic, etc.
- Accessed via an Ollama server instance.
Local Models
To use a local model, specify one of the local model classes in models.py:
- HuggingFaceOpenLLM
- HuggingFaceImageLLM
- VLLM
Models via APIs
We support the following model classes in models.py for models accessed via APIs:
- OpenAILLM
- AnthropicLLM
Ollama
In order to use an Ollama server instance for the chatbot, specify OllamaInterface as the model name. To then correctly use models on the Ollama server, specify both the URL of the server and the name of a model hosted on the server in the keyword args:
a2rchi:
model_class_map:
OllamaInterface:
kwargs:
base_model: "gemma3" # example
url: "url-for-server"
In this case, the gemma3 model is hosted on the Ollama server at url-for-server. You can check which models are hosted on your server by going to url-for-server/models.
Vector Store
The vector store is a database that stores document embeddings, enabling semantic and/or lexical search over your knowledge base. A2RCHI uses ChromaDB as the vector store backend to index and retrieve relevant documents based on similarity to user queries.
Configuration
Vector store settings are configured under the data_manager section:
data_manager:
collection_name: default_collection
embedding_name: OpenAIEmbeddings
chunk_size: 1000
chunk_overlap: 0
reset_collection: true
num_documents_to_retrieve: 5
distance_metric: cosine
Core Settings
- collection_name: Name of the ChromaDB collection. Default: default_collection
- chunk_size: Maximum size of text chunks (in characters) when splitting documents. Default: 1000
- chunk_overlap: Number of overlapping characters between consecutive chunks. Default: 0
- reset_collection: If true, deletes and recreates the collection on startup. Default: true
- num_documents_to_retrieve: Number of relevant document chunks to retrieve for each query. Default: 5
Distance Metrics
The distance_metric determines how similarity is calculated between embeddings:
- cosine: Cosine similarity (default) - measures the angle between vectors
- l2: Euclidean distance - measures straight-line distance
- ip: Inner product - measures dot product similarity
data_manager:
distance_metric: cosine # Options: cosine, l2, ip
Embedding Models
Embeddings convert text into numerical vectors. A2RCHI supports multiple embedding providers:
OpenAI Embeddings
data_manager:
embedding_name: OpenAIEmbeddings
embedding_class_map:
OpenAIEmbeddings:
class: OpenAIEmbeddings
kwargs:
model: text-embedding-3-small
similarity_score_reference: 10
HuggingFace Embeddings
data_manager:
embedding_name: HuggingFaceEmbeddings
embedding_class_map:
HuggingFaceEmbeddings:
class: HuggingFaceEmbeddings
kwargs:
model_name: sentence-transformers/all-MiniLM-L6-v2
model_kwargs:
device: cpu
encode_kwargs:
normalize_embeddings: true
similarity_score_reference: 10
query_embedding_instructions: null
Supported Document Formats
The vector store can process the following file types:
- Text files: .txt, .C
- Markdown: .md
- Python: .py
- HTML: .html
- PDF: .pdf
Documents are automatically loaded with the appropriate parser based on file extension.
Document Synchronization
A2RCHI automatically synchronizes your data directory with the vector store:
- Adding documents: New files in the data directory are automatically chunked, embedded, and added to the collection
- Removing documents: Files deleted from the data directory are removed from the collection
- Source tracking: Each ingested artifact is recorded in the unified index.yaml file as <resource-hash>: <relative file path> inside the data directory
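As a purely illustrative sketch (the hashes and paths below are made up), an index.yaml might contain entries such as:
3f2a9c1e7b: websites/example_com_page1.txt
8d41b0aa52: git/example_mkdocs_documentation/index.md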
Hybrid Search
Combine semantic search with keyword-based BM25 search for improved retrieval:
data_manager:
use_hybrid_search: true
bm25_weight: 0.6
semantic_weight: 0.4
bm25:
k1: 0.5
b: 0.75
- use_hybrid_search: Enable hybrid search combining BM25 and semantic similarity. Default: false
- bm25_weight: Weight for BM25 keyword scores. Default: 0.6
- semantic_weight: Weight for semantic similarity scores. Default: 0.4
- bm25.k1: BM25 term frequency saturation parameter. Default: 0.5
- bm25.b: BM25 document length normalization parameter. Default: 0.75
Stemming
Enabling the stemming option in your configuration turns on stemming for the documents in A2RCHI. With stemming enabled, documents inserted into the retrieval pipeline, as well as the query that is matched against them, are stemmed and simplified for faster and more accurate lookup.
data_manager:
stemming:
enabled: true
When enabled, both documents and queries are processed using the Porter Stemmer algorithm to reduce words to their root forms (e.g., "running" → "run"), improving matching accuracy.
ChromaDB Backend
A2RCHI supports both local and remote ChromaDB instances:
Local (Persistent)
services:
chromadb:
local_vstore_path: /path/to/vectorstore
Remote (HTTP Client)
services:
chromadb:
use_HTTP_chromadb_client: true
chromadb_host: localhost
chromadb_port: 8000
Benchmarking
A2RCHI has benchmarking functionality provided by the evaluate CLI command. We currently support two modes:
- SOURCES: given a user question and a list of correct sources, check if the retrieved documents contain any of the correct sources.
- RAGAS: use the Ragas RAG evaluator module to return numerical scores for the quality of the answer according to four of its metrics: answer_relevancy, faithfulness, context precision, and context relevancy.
Preparing the queries file
Provide your list of questions, answers, and relevant sources in JSON format as follows:
[
{
"question": "",
"sources": [...],
"answer": ""
// (optional)
"sources_match_field": [...]
},
...
]
Explanation of fields:
- question: The question to be answered by the A2RCHI instance.
- sources: A list of sources (e.g., URLs, ticket IDs) that contain the answer. They are identified via the sources_match_field, which must be one of the metadata fields of the documents in your vector store.
- answer: The expected answer to the question, used for evaluation.
- sources_match_field (optional): A list of metadata fields to match the sources against (e.g., url, ticket_id). If not provided, defaults to what is in the configuration file under services:benchmarking:mode_settings:sources:default_match_field.
Example: (see also examples/benchmarking/queries.json)
[
{
"question": "Does Jorian Benke work with the PPC and what topic will she work on?",
"sources": ["https://ppc.mit.edu/blog/2025/07/14/welcome-our-first-ever-in-house-masters-student/", "CMSPROD-42"],
"answer": "Yes, Jorian works with the PPC and her topic is the study of Lorentz invariance.",
"source_match_field": ["url", "ticket_id"]
},
...
]
N.B.: one could also provide the URL for the JIRA ticket; it is just a choice that you must make and detail in source_match_field. That is, the following will evaluate equivalently to the example above:
[
{
"question": "Does Jorian Benke work with the PPC and what topic will she work on?",
"sources": ["https://ppc.mit.edu/blog/2025/07/14/welcome-our-first-ever-in-house-masters-student/", "https://its.cern.ch/jira/browse/CMSPROD-42"],
"answer": "Yes, Jorian works with the PPC and her topic is the study of Lorentz invariance.",
"source_match_field": ["url", "url"]
},
...
]
Configuration
You can evaluate one or more configurations by specifying the evaluate command with the -cd flag pointing to the directory containing your configuration file(s). You can also specify individual files with the -c flag. This can be useful if you're interested in comparing different hyperparameter settings.
We support two modes, which you can specify in the configuration file under services:benchmarking:modes. You can choose either or both of RAGAS and SOURCES.
The RAGAS mode will use the Ragas RAG evaluator module to return numerical scores for the quality of the answer according to four of its metrics: answer_relevancy, faithfulness, context precision, and context relevancy. More information about these metrics can be found on the Ragas website.
The SOURCES mode will check if the retrieved documents contain any of the correct sources. The matching is done by comparing a given metadata field for any source. The default is display_name, as per the configuration file (services:benchmarking:mode_settings:sources:default_match_field). You can override this on a per-query basis by specifying the sources_match_field field in the queries file, as described above.
The configuration file should look like the following:
services:
benchmarking:
queries_path: examples/benchmarking/queries.json
out_dir: bench_out
modes:
- "RAGAS"
- "LINKS"
mode_settings:
sources:
default_match_field: ["display_name"] # default field to match sources against, can be overridden in the queries file
ragas_settings:
provider: <provider name> # can be one of OpenAI, HuggingFace, Ollama, and Anthropic
evaluation_model_settings:
model_name: <model name> # ensure this lines up with the langchain API name for your chosen model and provider
base_url: <url> # address to your running Ollama server should you have chosen the Ollama provider
embedding_model: <embedding provider> # OpenAI or HuggingFace
Finally, before you run the command, ensure that out_dir, the output directory, exists on your system and that the path is correctly specified so that results can be written into it.
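For example, with the configuration above and assuming you run from the directory containing it, a simple
mkdir -p bench_out
is enough to create the output directory.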
Running
To run the benchmarking script simply run the following:
a2rchi evaluate -n <name> -e <env_file> -cd <configs_directory> <optionally use -c <file1>,<file2>, ...> <OPTIONS>
Example:
a2rchi evaluate -n benchmark -c examples/benchmarking/benchmark_configs/example_conf.yaml --gpu-ids all
Additional options
You might also want to adjust the timeout setting, which is the upper limit on how long the Ragas evaluation may take on a single QA pair, or the batch_size, which determines how many QA pairs are evaluated at once; you may want to tune the latter based on hardware constraints, since Ragas doesn't pay great attention to them. The corresponding configuration options are likewise set for the benchmarking service, as follows:
services:
benchmarking:
timeout: <time in seconds> # default is 180
batch_size: <desired batch size> # no default setting, set by Ragas...
Results
The output of the benchmarking will be saved in the out_dir specified in the configuration file. The results will be saved in a timestamped subdirectory, e.g., bench_out/2042-10-01_12-00-00/.
To later examine your data, check out scripts/benchmarking/, which contains some plotting functions and an IPython notebook with basic usage examples. This is useful for exploring the results of the benchmarking; we will soon also have dedicated scripts to produce the plots of interest.
Other
Some useful additional features supported by the framework.
ChromaDB Document Management API Endpoints
Debugging ChromaDB endpoints
Debugging REST API endpoints, which give programmatic access to the ChromaDB vector database, can be added to the A2RCHI chat application and exposed with the following configuration change.
To enable the ChromaDB endpoints, add the following to your config file under services.chat_app:
services:
chat_app:
# ... other config options ...
enable_debug_chroma_endpoints: true # Default: false
ChromaDB Endpoints Info
# /api/list_docs (GET)
Lists all documents indexed in ChromaDB with pagination support.
Query Parameters:
- page: Page number (1-based, default: 1)
- per_page: Documents per page (default: 50, max: 500)
- content_length: Content preview length (default: -1 for full content)
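Example request (a sketch; it assumes the chat app is reachable on localhost at the default chat port 7861 shown earlier and that the debug endpoints are enabled):
curl "http://localhost:7861/api/list_docs?page=1&per_page=10&content_length=200"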
Response:
{
"pagination": {
"page": 1,
"per_page": 50,
"total_documents": 1250,
"total_pages": 25,
"has_next": true,
"has_prev": false
},
"documents": [...]
}
# /api/search_docs (POST)
Performs semantic search on the document collection using vector similarity.
Request Body:
- query: Search query string (required)
- n_results: Number of results (default: 5, max: 100)
- content_length: Max content length (default: -1, max: 5000)
- include_full_content: Include complete document content (default: false)
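Example request (a sketch under the same host/port assumptions as above):
curl -X POST "http://localhost:7861/api/search_docs" \
  -H "Content-Type: application/json" \
  -d '{"query": "machine learning", "n_results": 5}'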
Response:
{
"query": "machine learning",
"search_params": {...},
"documents": [
{
"content": "Document content...",
"content_length": 1200,
"metadata": {...},
"similarity_score": 0.85
}
]
}