Understanding RAGflow’s api_token Table
Understanding RAGflow’s api_token Table
RAGflow uses a Peewee ORM layer on top of MySQL/PostgreSQL/OceanBase. Every API key
issued to a tenant is stored in the api_token table inside the rag_flow database.
This post walks through the table schema, the composite primary key design, and how
to query it — from both SQL and Python.
Schema — What the ORM Model Says
The canonical definition lives in
api/db/db_models.py:980-989:
class APIToken(DataBaseModel):
tenant_id = CharField(max_length=32, null=False, index=True)
token = CharField(max_length=255, null=False, index=True)
dialog_id = CharField(max_length=32, null=True, index=True)
source = CharField(max_length=16, null=True,
help_text="none|agent|dialog", index=True)
beta = CharField(max_length=255, null=True, index=True)
class Meta:
db_table = "api_token"
primary_key = CompositeKey("tenant_id", "token")
Translated to DDL this becomes:
CREATE TABLE api_token (
tenant_id VARCHAR(32) NOT NULL, -- owning tenant
token VARCHAR(255) NOT NULL, -- the API key string
dialog_id VARCHAR(32) DEFAULT NULL, -- optional dialog scope
source VARCHAR(16) DEFAULT NULL, -- none | agent | dialog
beta VARCHAR(255) DEFAULT NULL, -- beta-channel token
PRIMARY KEY (tenant_id, token) -- composite PK
);
Composite Primary Key Design
The primary key spans (tenant_id, token) rather than using a surrogate ID. This
choice means:
- A single tenant can own many tokens, but the same token string cannot appear twice under the same tenant.
- Lookups by token alone are fast because
tokenhas its own index in addition to being the second column in the composite PK. - Deletion of all tokens for a tenant is a simple
WHERE tenant_id = ?scan.
The source Column — Schema Evolution via Migration
source was not part of the original schema. It was added later through RAGflow’s
migrate_db() routine:
alter_db_add_column(
migrator, "api_token", "source",
CharField(max_length=16, null=True,
help_text="none|agent|dialog", index=True)
)
Likewise beta and the type change on dialog_id were applied through the same
migration pipeline, which means the live table may have been created without these
columns and had them applied in-place.
Querying from Python
RAGflow’s BaseModel.query() helper translates keyword arguments into WHERE
clauses:
from api.db.db_models import APIToken
# Validate an incoming Authorization header
def get_token_info(authorization_header: str):
raw_token = authorization_header.split()[1] # "Bearer <token>"
results = APIToken.query(token=raw_token)
if not results:
raise PermissionError("Invalid API token")
return results[0]
Querying from SQL
-- Use the correct database
USE rag_flow;
-- Inspect the live schema (MySQL / OceanBase)
DESCRIBE api_token;
-- List all tokens for a tenant
SELECT token, dialog_id, source, beta
FROM api_token
WHERE tenant_id = '<your-tenant-id>';
-- Find which tenants have dialog-scoped tokens
SELECT DISTINCT tenant_id
FROM api_token
WHERE source = 'dialog';
-- Revoke a specific token
DELETE FROM api_token
WHERE tenant_id = '<tenant-id>' AND token = '<token-value>';