API Reference
CLI
The A2rchi CLI provides commands to create, manage, and delete A2rchi deployments and services.
Commands
1. create
Create a new A2rchi deployment.
Usage:
a2rchi create --name <deployment_name> --config <config.yaml> --env-file <secrets.env> [OPTIONS]
Options:
- --name, -n (str, required): Name of the deployment.
- --config, -c (str, required): Path to the YAML configuration file.
- --env-file, -e (str, required): Path to the secrets.env file.
- --services, -s (comma-separated): List of services to enable (e.g., chat_app,uploader_app).
- --sources, -src (comma-separated): Data sources to enable (e.g., jira,redmine).
- --podman, -p: Use Podman instead of Docker.
- --gpu-ids: GPU configuration (all or comma-separated IDs).
- --tag, -t (str): Image tag for built containers (default: 2000).
- --hostmode: Use host network mode.
- --verbosity, -v (int): Logging verbosity (0-4, default: 3).
- --force, -f: Overwrite existing deployment if it exists.
- --dry, --dry-run: Validate and show what would be created, but do not deploy.
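When deployments are created from a script, it can help to assemble the `a2rchi create` arguments programmatically and start with a dry run. The following is a minimal sketch that uses only flags documented above; the helper name `build_create_command` is hypothetical and not part of A2rchi.

```python
import shlex


def build_create_command(name, config, env_file,
                         services=None, sources=None, dry_run=False):
    """Assemble an `a2rchi create` argument list (hypothetical helper)."""
    cmd = ["a2rchi", "create",
           "--name", name,
           "--config", config,
           "--env-file", env_file]
    if services:
        # --services takes a comma-separated list
        cmd += ["--services", ",".join(services)]
    if sources:
        # --sources takes a comma-separated list
        cmd += ["--sources", ",".join(sources)]
    if dry_run:
        # validate and show what would be created, without deploying
        cmd.append("--dry-run")
    return cmd


cmd = build_create_command(
    "mybot", "configs/my.yaml", "secrets.env",
    services=["chat_app", "uploader_app"], dry_run=True,
)
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the dry run looks right.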
2. delete
Delete an existing A2rchi deployment.
Usage:
a2rchi delete --name <deployment_name> [OPTIONS]
Options:
- --name, -n (str): Name of the deployment to delete.
- --rmi: Remove container images.
- --rmv: Remove volumes.
- --keep-files: Keep deployment files (do not remove directory).
- --list: List all available deployments.
3. list_services
List all available A2rchi services and data sources.
Usage:
a2rchi list_services
4. list_deployments
List all existing A2rchi deployments.
Usage:
a2rchi list_deployments
Examples
Create a deployment:
a2rchi create --name mybot --config configs/my.yaml --env-file secrets.env --services chat_app,uploader_app
Delete a deployment and remove images/volumes:
a2rchi delete --name mybot --rmi --rmv
List all deployments:
a2rchi list_deployments
List all services:
a2rchi list_services
Configuration YAML API Reference
The A2rchi configuration YAML file defines the deployment, services, data sources, pipelines, models, and interface settings for your A2rchi instance.
Top-Level Fields
name
- Type: string
- Description: Name of the deployment.
global
- TRAINED_ON: string
  Description of the data or corpus the system was trained on.
- DATA_PATH: string
  Path to data storage.
- ACCOUNTS_PATH: string
  Path to user accounts.
- ACCEPTED_FILES: list
  Allowed file extensions for uploads.
- ROLES: list
  User roles available in the system.
- LOGGING.input_output_filename: string
  Log file for input/output.
- verbosity: int
  Logging verbosity (0-4).
interfaces
Settings for each web interface or service.
chat_app, uploader_app, grader_app, grafana
- port: int
  Internal port that the Flask application binds to inside the container, i.e., the port the Flask server listens on within the container's network namespace. You usually do not need to change this unless you have port conflicts within the container. Default is 7861.
- external_port: int
  External port that maps to the container's internal port, making the chat application accessible from outside the container. This is the port users connect to in their browser (e.g., your-hostname:7861). When running multiple deployments on the same machine, each deployment must use a different external port to avoid conflicts. Default is 7861.
- host: string
  Network interface address that the Flask application binds to inside the container. Setting this to 0.0.0.0 allows the application to accept connections from any network interface, which is necessary for it to be accessible from outside the container. Should remain unchanged unless you have specific networking requirements. Default is 0.0.0.0.
- hostname: string
  The hostname or IP address that client browsers use to make API requests to the Flask server. It is embedded into the JavaScript code and determines where the frontend sends its API calls. Must be set to the actual hostname/IP of the machine running the container. Using localhost only works when accessing the application from the same machine. Default is localhost.
- template_folder: string
  Path to HTML templates.
- static_folder: string
  Path to static files (if applicable).
- num_responses_until_feedback: int
  Number of responses before the user is encouraged to provide feedback.
- include_copy_button: bool
  Show copy-to-clipboard button.
- enable_debug_chroma_endpoints: bool
  Enable debug endpoints (chat_app).
- flask_debug_mode: bool
  Enable Flask debug mode.
- num_problems: int
  Number of problems (grader_app).
- local_rubric_dir: string
  Path to rubric files (grader_app).
- local_users_csv_dir: string
  Path to users CSV (grader_app).
- verify_urls: bool
  Verify URLs on upload (uploader_app).
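Because each deployment on a machine needs its own external_port, a quick check before deploying can catch clashes early. The sketch below assumes the configuration files have already been parsed into dictionaries; the function name `find_external_port_conflicts` and the deployment names are hypothetical.

```python
from collections import defaultdict


def find_external_port_conflicts(deployments):
    """Return {port: ["deployment/interface", ...]} for ports used more than once.

    `deployments` maps a deployment name to its parsed configuration dict.
    """
    users = defaultdict(list)
    for name, config in deployments.items():
        for iface, settings in config.get("interfaces", {}).items():
            port = settings.get("external_port")
            if port is not None:
                users[port].append(f"{name}/{iface}")
    # Keep only ports claimed by more than one interface.
    return {port: who for port, who in users.items() if len(who) > 1}


deployments = {
    "course_a": {"interfaces": {"chat_app": {"port": 7861, "external_port": 7861}}},
    "course_b": {"interfaces": {"chat_app": {"port": 7861, "external_port": 7861}}},
    "course_c": {"interfaces": {"chat_app": {"port": 7861, "external_port": 7862}}},
}
conflicts = find_external_port_conflicts(deployments)
print(conflicts)
```

Note that the internal port may be identical across deployments, since each container has its own network namespace; only the external ports need to differ.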
data_manager
Controls vector store, chunking, and embedding settings.
- collection_name: string
  Name of the vector collection.
- input_lists: list
  List of files with initial context URLs.
- local_vstore_path: string
  Path to local vector store.
- embedding_name: string
  Embedding backend (OpenAIEmbeddings, HuggingFaceEmbeddings).
- embedding_class_map: dict
  Embedding backend configuration (see below).
- chunk_size: int
  Number of characters per chunk, where a chunk is a string that gets embedded and stored in the vector database. Default is 1000.
- chunk_overlap: int
  Number of characters by which consecutive chunks overlap when documents are split. Default is 0.
- use_HTTP_chromadb_client: bool
  Use HTTP client for ChromaDB.
- chromadb_host: string
  Hostname for ChromaDB.
- chromadb_port: int
  Internal port for ChromaDB.
- chromadb_external_port: int
  Host port for ChromaDB.
- reset_collection: bool
  Reset vector collection on startup.
- num_documents_to_retrieve: int
  How many chunks to retrieve, in order of decreasing similarity (1 returns only the most similar chunk, 2 returns the two most similar, and so on).
- stemming.enabled: bool
  Enable stemming for search.
- distance_metric: string
  Distance metric to use for similarity search in ChromaDB. Options are cosine, l2, and ip. Read more at https://docs.trychroma.com/docs/collections/configure. Default for A2rchi is cosine.
- use_hybrid_search: bool
  Enables hybrid search, i.e., performing lexical search as well as semantic search. Documents retrieved from both searches are combined. Default is False.
- bm25_weight: float
  Weight for BM25 in hybrid search.
- semantic_weight: float
  Weight for semantic search in hybrid search.
- bm25.k1: float
  BM25 term frequency saturation. Controls how much the score increases with additional occurrences of a term in a document. Range: [1.2, 2.0]
- bm25.b: float
  BM25 length normalization. Controls how much the document length influences the score. BM25 normalizes term frequency by document length compared to the average document length in the corpus. Range: [0, 1]
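To make the roles of bm25.k1 and bm25.b concrete, here is a minimal sketch of the standard Okapi BM25 per-term score. It illustrates the textbook formula only and is not taken from A2rchi's implementation; the function name and the default values k1=1.2 and b=0.75 are assumptions chosen for the example.

```python
def bm25_term_score(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75):
    """Textbook Okapi BM25 contribution of one query term to one document.

    tf          -- occurrences of the term in the document
    doc_len     -- length of the document
    avg_doc_len -- average document length in the corpus
    idf         -- inverse document frequency of the term
    k1          -- term frequency saturation
    b           -- length normalization strength (0 disables it)
    """
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * length_norm)


# Repeated occurrences help, but with diminishing returns (controlled by k1).
for tf in (1, 2, 5, 50):
    print(tf, round(bm25_term_score(tf, 100, 100, idf=2.0), 3))

# With b > 0, the same term count in a longer document scores lower.
print(bm25_term_score(3, 400, 100, idf=2.0) < bm25_term_score(3, 100, 100, idf=2.0))
```

A larger k1 lets additional occurrences keep raising the score for longer before it saturates, while b closer to 1 penalizes long documents more strongly.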
embedding_class_map
- OpenAIEmbeddings:
  - class: string
  - kwargs.model: string
  - similarity_score_reference: float
- HuggingFaceEmbeddings:
  - class: string
  - kwargs.model_name: string
    The HuggingFace embedding model you want to use. Default is sentence-transformers/all-MiniLM-L6-v2. TODO: fix logic to require token if private model is requested.
  - kwargs.model_kwargs.device: string (cpu or cuda)
    Argument passed to embedding model initialization, to load onto cpu (default) or cuda (GPU), which you can select if you are deploying A2rchi onto a GPU.
  - kwargs.encode_kwargs.normalize_embeddings: bool
    Whether to normalize the embedded vectors. Default is true. Note that ChromaDB's own default distance metric is l2, which measures the absolute geometric distance between vectors, so whether the vectors are normalized affects the search.
  - similarity_score_reference: float
    The threshold for whether to include the link to the most relevant context in the chat response. It is an approximate distance (ChromaDB uses an HNSW index, whose default distance function is l2), so smaller values represent higher similarity. The link is included if the score is below the chosen value. Default is 10 (scores are usually of order 1, so the default is to always include the link).
  - query_embedding_instructions: string or null
    Instructions to accompany the embedding of the query and the subsequent document search. Only certain embedding models support this; see INSTRUCTION_AWARE_MODELS in a2rchi/chains/retrievers.py to add models that support it. For example, the Qwen/Qwen3-Embedding-XB embedding models support this and are listed. Default is None. Write the string directly into the config. An example instruction might look like: "Given a query, retrieve relevant information to answer the query". Tuning it to be more specific to your use case might improve performance.
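The interaction between normalize_embeddings, the l2 distance, and similarity_score_reference can be illustrated with a small pure-Python sketch. The vectors and the helper names (`l2_distance`, `normalize`, `include_link`) are made up for illustration and do not reflect A2rchi's internals.

```python
import math


def l2_distance(u, v):
    """Euclidean (l2) distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def include_link(score, similarity_score_reference=10.0):
    """The link is included when the distance is below the reference value."""
    return score < similarity_score_reference


# Two toy embeddings pointing in the same direction but with different lengths.
query = [1.0, 0.0]
doc = [3.0, 0.0]

raw = l2_distance(query, doc)                         # magnitude matters
unit = l2_distance(normalize(query), normalize(doc))  # only direction matters
print(raw, unit)
print(include_link(raw), include_link(raw, similarity_score_reference=1.0))
```

Without normalization the two vectors look far apart under l2 even though they point the same way, which is why a stricter similarity_score_reference behaves differently depending on normalize_embeddings.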
a2rchi
Pipeline and model configuration.
- pipelines: list
  List of enabled pipelines (e.g., QAPipeline, GradingPipeline).
- pipeline_map: dict
  Configuration for each pipeline:
  - max_tokens: int
  - prompts.required: dict
    Required prompt files for the pipeline.
  - prompts.optional: dict
    Optional prompt files.
  - models.required: dict
    Required models for the pipeline.
  - models.optional: dict
    Optional models.
- model_class_map: dict
  Model backend configuration (see below).
- chain_update_time: int
  Time (seconds) between chain updates.
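One consistency check that follows from this layout is that every model named under a pipeline's models section has an entry in model_class_map. The sketch below assumes that the values under models.required and models.optional are keys of model_class_map, which is how the example configuration in this reference is laid out; the function name `find_undefined_models` is hypothetical.

```python
def find_undefined_models(a2rchi_cfg):
    """Return {pipeline: [(role, model_name), ...]} for models missing
    from model_class_map. `a2rchi_cfg` is the parsed `a2rchi` section."""
    defined = set(a2rchi_cfg.get("model_class_map", {}))
    pipeline_map = a2rchi_cfg.get("pipeline_map", {})
    missing = {}
    for pipeline in a2rchi_cfg.get("pipelines", []):
        models = pipeline_map.get(pipeline, {}).get("models", {})
        for group in ("required", "optional"):
            for role, model_name in (models.get(group) or {}).items():
                if model_name not in defined:
                    missing.setdefault(pipeline, []).append((role, model_name))
    return missing


a2rchi_cfg = {
    "pipelines": ["QAPipeline"],
    "pipeline_map": {
        "QAPipeline": {
            "max_tokens": 10000,
            "models": {"required": {"condense_model": "DumbLLM",
                                    "chat_model": "OpenAIGPT4"}},
        }
    },
    "model_class_map": {"DumbLLM": {"class": "DumbLLM", "kwargs": {}}},
}
undefined = find_undefined_models(a2rchi_cfg)
print(undefined)
```

Here chat_model refers to OpenAIGPT4, which has no model_class_map entry, so the check reports it.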
model_class_map
Each model (e.g., AnthropicLLM, OpenAIGPT4, LlamaLLM, etc.) has:
- class: string
- kwargs: dict
  Model-specific parameters (see template for details).
utils
- postgres:
  - port: int
  - user: string
  - database: string
  - host: string
- sso:
  - enabled: bool
  - sso_class: string
  - sso_class_map: dict
    - class: string
    - kwargs: dict
- git:
  - enabled: bool
- scraper:
  - reset_data: bool
  - verify_urls: bool
  - enable_warnings: bool
- piazza:
  - network_id: string
  - update_time: int
- mattermost:
  - update_time: int
- redmine:
  - redmine_update_time: int
  - answer_tag: string
- mailbox:
  - imap4_port: int
  - mailbox_update_time: int
- jira:
  - url: string
    The URL of the JIRA instance from which A2rchi will fetch data.
  - projects: list
    List of JIRA project names that A2rchi will fetch data from.
  - anonymize_data: bool
    Boolean flag indicating whether the fetched data from JIRA should be anonymized. Default is True.
- anonymizer:
  - nlp_model: string
    The NLP model that the spacy library will use to perform Named Entity Recognition (NER).
  - excluded_words: list
    The list of words that the anonymizer should remove.
  - greeting_patterns: list
    The regex patterns to match and remove greeting patterns.
  - signoff_patterns: list
    The regex patterns to match and remove signoff patterns.
  - email_pattern: string
    The regex pattern to match and remove email addresses.
  - username_pattern: string
    The regex pattern to match and remove JIRA usernames.
Required Fields
Some fields are required depending on enabled services and pipelines.
For example:
- name
- global.TRAINED_ON
- a2rchi.pipelines
- Service-specific fields (e.g., utils.piazza.network_id, interfaces.grader_app.num_problems)
See the User Guide for more configuration examples and explanations.
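A pre-flight check for required fields can be sketched with dotted-path lookups on the parsed configuration. This is an illustrative helper, not A2rchi's own validation logic; the function names are hypothetical and the default field list simply mirrors the examples given above.

```python
def get_dotted(config, path):
    """Look up a dotted path such as "global.TRAINED_ON" in nested dicts.

    Returns None when any component of the path is missing.
    """
    node = config
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def missing_required(config,
                     required=("name", "global.TRAINED_ON", "a2rchi.pipelines")):
    """Return the required dotted paths that are absent or empty."""
    return [path for path in required
            if get_dotted(config, path) in (None, "", [])]


config = {
    "name": "my_deployment",
    "global": {"TRAINED_ON": "MIT course data"},
    "a2rchi": {"pipelines": []},
}
print(missing_required(config))

# Service-specific fields can be added when the matching service is enabled.
print(missing_required(config, required=("utils.piazza.network_id",)))
```

Running a check like this before `a2rchi create` gives a clearer message than a failure partway through deployment.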
Example
name: my_deployment
global:
  TRAINED_ON: "MIT course data"
  DATA_PATH: "/root/data/"
  ACCOUNTS_PATH: "/root/.accounts/"
  ACCEPTED_FILES: [".txt", ".pdf"]
  ROLES: ["User", "A2rchi", "Expert"]
  LOGGING:
    input_output_filename: "chain_input_output.log"
  verbosity: 3
interfaces:
  chat_app:
    port: 7861
    external_port: 7861
    host: "0.0.0.0"
    hostname: "localhost"
    num_responses_until_feedback: 3
    flask_debug_mode: true
data_manager:
  collection_name: "default_collection"
  input_lists: ["configs/miscellanea.list"]
  embedding_name: "OpenAIEmbeddings"
  chunk_size: 1000
  chunk_overlap: 0
  distance_metric: "cosine"
  num_documents_to_retrieve: 5
a2rchi:
  pipelines: ["QAPipeline"]
  pipeline_map:
    QAPipeline:
      max_tokens: 10000
      prompts:
        required:
          condense_prompt: "condense.prompt"
          chat_prompt: "chat.prompt"
      models:
        required:
          condense_model: "DumbLLM"
          chat_model: "DumbLLM"
  model_class_map:
    DumbLLM:
      class: DumbLLM
      kwargs:
        sleep_time_mean: 3
        filler: null
utils:
  postgres:
    port: 5432
    user: "a2rchi"
    database: "a2rchi-db"
    host: "postgres"
Tip:
For a full template, see a2rchi/cli/templates/base-config.yaml
in