Providers API
LLM provider implementations for different services.
OpenAIProvider
Bases: BaseLLMProvider
OpenAI LLM provider implementation.
Supports GPT-4, GPT-4 Turbo, and GPT-3.5 Turbo models. Uses the official OpenAI Python SDK (v1.0+).
Example
```python
provider = OpenAIProvider(
    api_key="sk-...",
    model="gpt-4-turbo"
)
response = await provider.complete("Translate: Hello")
```
__init__(api_key, model='gpt-4-turbo', timeout=30.0)
Initialize OpenAI provider.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `api_key` | `str` | OpenAI API key | *required* |
| `model` | `str` | Model name (e.g., `"gpt-4-turbo"`, `"gpt-4"`, `"gpt-3.5-turbo"`) | `'gpt-4-turbo'` |
| `timeout` | `float` | Request timeout in seconds | `30.0` |
complete(prompt, temperature=0.1, max_tokens=2000, **kwargs)
async
Generate a single completion from OpenAI.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send | *required* |
| `temperature` | `float` | Sampling temperature (0.0-2.0) | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate | `2000` |
| `**kwargs` | `Any` | Additional OpenAI parameters (`top_p`, `frequency_penalty`, etc.) | `{}` |
Returns:

| Type | Description |
|---|---|
| `str` | The generated text response |
Raises:

| Type | Description |
|---|---|
| `LLMAuthenticationError` | If API key is invalid |
| `LLMRateLimitError` | If rate limit is exceeded |
| `LLMTimeoutError` | If request times out |
| `LLMError` | For other API errors |
stream(prompt, temperature=0.1, max_tokens=2000, **kwargs)
async
Generate a streaming completion from OpenAI.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send | *required* |
| `temperature` | `float` | Sampling temperature (0.0-2.0) | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate | `2000` |
| `**kwargs` | `Any` | Additional OpenAI parameters | `{}` |
Yields:

| Type | Description |
|---|---|
| `AsyncGenerator[str, None]` | Text chunks as they are generated |
Raises:

| Type | Description |
|---|---|
| `LLMAuthenticationError` | If API key is invalid |
| `LLMRateLimitError` | If rate limit is exceeded |
| `LLMTimeoutError` | If request times out |
| `LLMError` | For other API errors |
Example
```python
from kttc.llm import OpenAIProvider
from kttc.agents import AgentOrchestrator

# __init__ accepts api_key, model, and timeout;
# temperature is passed per call to complete()/stream()
provider = OpenAIProvider(
    api_key="sk-...",
    model="gpt-4",
    timeout=60,
)

# Use with orchestrator
orchestrator = AgentOrchestrator(provider)
```
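Streaming is consumed with `async for`, as in the `stream` docs above. A minimal runnable sketch (the key, model, and prompt are placeholders):

```python
import asyncio

from kttc.llm import OpenAIProvider


async def main() -> None:
    provider = OpenAIProvider(api_key="sk-...", model="gpt-4-turbo")
    # Chunks are printed as the model generates them
    async for chunk in provider.stream("Translate: Hello"):
        print(chunk, end="", flush=True)


asyncio.run(main())
```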
AnthropicProvider
Bases: BaseLLMProvider
Anthropic Claude LLM provider implementation.
Supports Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku. Uses the official Anthropic Python SDK.
Example
```python
provider = AnthropicProvider(
    api_key="sk-ant-...",
    model="claude-3-5-sonnet-20241022"
)
response = await provider.complete("Translate: Hello")
```
__init__(api_key, model='claude-3-5-sonnet-20241022', timeout=30.0)
Initialize Anthropic provider.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `api_key` | `str` | Anthropic API key | *required* |
| `model` | `str` | Model name (e.g., `"claude-3-5-sonnet-20241022"`, `"claude-3-opus-20240229"`) | `'claude-3-5-sonnet-20241022'` |
| `timeout` | `float` | Request timeout in seconds | `30.0` |
complete(prompt, temperature=0.1, max_tokens=2000, **kwargs)
async
Generate a single completion from Claude.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send | *required* |
| `temperature` | `float` | Sampling temperature (0.0-1.0) | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate | `2000` |
| `**kwargs` | `Any` | Additional Anthropic parameters (`top_p`, `top_k`, etc.) | `{}` |
Returns:

| Type | Description |
|---|---|
| `str` | The generated text response |
Raises:

| Type | Description |
|---|---|
| `LLMAuthenticationError` | If API key is invalid |
| `LLMRateLimitError` | If rate limit is exceeded |
| `LLMTimeoutError` | If request times out |
| `LLMError` | For other API errors |
stream(prompt, temperature=0.1, max_tokens=2000, **kwargs)
async
Generate a streaming completion from Claude.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send | *required* |
| `temperature` | `float` | Sampling temperature (0.0-1.0) | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate | `2000` |
| `**kwargs` | `Any` | Additional Anthropic parameters | `{}` |
Yields:

| Type | Description |
|---|---|
| `AsyncGenerator[str, None]` | Text chunks as they are generated |
Raises:

| Type | Description |
|---|---|
| `LLMAuthenticationError` | If API key is invalid |
| `LLMRateLimitError` | If rate limit is exceeded |
| `LLMTimeoutError` | If request times out |
| `LLMError` | For other API errors |
Example
```python
from kttc.llm import AnthropicProvider

# temperature is a per-call argument to complete()/stream(), not an init parameter
provider = AnthropicProvider(
    api_key="sk-ant-...",
    model="claude-3-5-sonnet-20241022",
)
```
GigaChatProvider
Bases: BaseLLMProvider
Sber GigaChat LLM provider implementation.
Uses OAuth 2.0 authentication (access token valid for 30 minutes). Supports different API access levels (PERS, B2B, CORP).
Example
```python
provider = GigaChatProvider(
    client_id="your-client-id",
    client_secret="your-client-secret",
    scope="GIGACHAT_API_PERS"  # or B2B, CORP
)
response = await provider.complete("Write a short greeting")
```
__init__(client_id, client_secret, scope='GIGACHAT_API_PERS', model='GigaChat', timeout=30.0)
Initialize GigaChat provider.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `client_id` | `str` | Client ID from Sber Developer portal | *required* |
| `client_secret` | `str` | Client secret from Sber Developer portal | *required* |
| `scope` | `str` | API scope (`GIGACHAT_API_PERS`, `GIGACHAT_API_B2B`, `GIGACHAT_API_CORP`) | `'GIGACHAT_API_PERS'` |
| `model` | `str` | Model name (GigaChat, GigaChat-Pro, etc.) | `'GigaChat'` |
| `timeout` | `float` | Request timeout in seconds | `30.0` |
complete(prompt, temperature=0.1, max_tokens=2000, **kwargs)
async
Generate a single completion from GigaChat.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send | *required* |
| `temperature` | `float` | Sampling temperature (0.0-1.0) | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate | `2000` |
| `**kwargs` | `Any` | Additional GigaChat parameters | `{}` |
Returns:

| Type | Description |
|---|---|
| `str` | The generated text response |
Raises:

| Type | Description |
|---|---|
| `LLMAuthenticationError` | If authentication fails |
| `LLMRateLimitError` | If rate limit is exceeded |
| `LLMTimeoutError` | If request times out |
| `LLMError` | For other API errors |
stream(prompt, temperature=0.1, max_tokens=2000, **kwargs)
async
Generate a streaming completion from GigaChat.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send | *required* |
| `temperature` | `float` | Sampling temperature (0.0-1.0) | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate | `2000` |
| `**kwargs` | `Any` | Additional GigaChat parameters | `{}` |
|
Yields:

| Type | Description |
|---|---|
| `AsyncGenerator[str, None]` | Text chunks as they are generated |
Raises:

| Type | Description |
|---|---|
| `LLMAuthenticationError` | If authentication fails |
| `LLMRateLimitError` | If rate limit is exceeded |
| `LLMTimeoutError` | If request times out |
| `LLMError` | For other API errors |
Example
```python
from kttc.llm import GigaChatProvider

provider = GigaChatProvider(
    client_id="your-client-id",
    client_secret="your-client-secret",
    model="GigaChat-Pro",
)
```
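A sketch of a full `complete` call using the parameters documented above; the prompt and values are illustrative:

```python
response = await provider.complete(
    "Write a short greeting",
    temperature=0.5,   # GigaChat accepts 0.0-1.0
    max_tokens=500,
)
```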
YandexGPTProvider
Bases: BaseLLMProvider
Yandex GPT LLM provider implementation.
Supports YandexGPT Pro (complex tasks, up to 32K tokens) and YandexGPT Lite (fast responses, up to 7.4K tokens).
Example
```python
provider = YandexGPTProvider(
    api_key="your-api-key",
    folder_id="your-folder-id",
    model="yandexgpt/latest"  # or "yandexgpt-lite/latest"
)
response = await provider.complete("Translate: Hello")
```
__init__(api_key, folder_id, model='yandexgpt/latest', timeout=30.0)
Initialize Yandex GPT provider.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `api_key` | `str` | Yandex Cloud API key (set `YC_API_KEY` env var) | *required* |
| `folder_id` | `str` | Yandex Cloud folder ID | *required* |
| `model` | `str` | Model URI (`yandexgpt/latest` or `yandexgpt-lite/latest`) | `'yandexgpt/latest'` |
| `timeout` | `float` | Request timeout in seconds | `30.0` |
complete(prompt, temperature=0.1, max_tokens=2000, **kwargs)
async
Generate a single completion from Yandex GPT.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send | *required* |
| `temperature` | `float` | Sampling temperature (0.0-1.0) | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate (must be > 0 and <= 7400) | `2000` |
| `**kwargs` | `Any` | Additional Yandex parameters | `{}` |
Returns:

| Type | Description |
|---|---|
| `str` | The generated text response |
Raises:

| Type | Description |
|---|---|
| `LLMAuthenticationError` | If API key is invalid |
| `LLMRateLimitError` | If rate limit is exceeded |
| `LLMTimeoutError` | If request times out |
| `LLMError` | For other API errors |
stream(prompt, temperature=0.1, max_tokens=2000, **kwargs)
async
Generate a streaming completion from Yandex GPT.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send | *required* |
| `temperature` | `float` | Sampling temperature (0.0-1.0) | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate | `2000` |
| `**kwargs` | `Any` | Additional Yandex parameters | `{}` |
|
Yields:

| Type | Description |
|---|---|
| `AsyncGenerator[str, None]` | Text chunks as they are generated |
Raises:

| Type | Description |
|---|---|
| `LLMAuthenticationError` | If API key is invalid |
| `LLMRateLimitError` | If rate limit is exceeded |
| `LLMTimeoutError` | If request times out |
| `LLMError` | For other API errors |
Example
```python
from kttc.llm import YandexGPTProvider

provider = YandexGPTProvider(
    api_key="your-api-key",
    folder_id="your-folder-id",
    model="yandexgpt/latest",
)
```
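Since the `api_key` docstring points to the `YC_API_KEY` environment variable, credentials can be read from the environment rather than hardcoded. A sketch; `YC_FOLDER_ID` is an assumed variable name for illustration, not part of the documented API:

```python
import os

from kttc.llm import YandexGPTProvider

provider = YandexGPTProvider(
    api_key=os.environ["YC_API_KEY"],      # documented env var
    folder_id=os.environ["YC_FOLDER_ID"],  # assumed env var name, for illustration
    model="yandexgpt-lite/latest",         # the faster, smaller model
)
```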
Base Provider
BaseLLMProvider
Bases: ABC
Abstract base class for LLM providers.
All LLM providers (OpenAI, Anthropic, etc.) must implement this interface. Supports both synchronous completion and streaming.
complete(prompt, temperature=0.1, max_tokens=2000, **kwargs)
abstractmethod
async
Generate a single completion from the LLM.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send to the LLM | *required* |
| `temperature` | `float` | Sampling temperature (0.0 = deterministic, 1.0 = creative) | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate | `2000` |
| `**kwargs` | `Any` | Provider-specific parameters | `{}` |
Returns:

| Type | Description |
|---|---|
| `str` | The generated text response |
Raises:

| Type | Description |
|---|---|
| `LLMError` | If the API call fails |
| `TimeoutError` | If the request times out |
Example
```python
provider = OpenAIProvider(api_key="...")
response = await provider.complete("Translate: Hello")
print(response)  # Hola
```
stream(prompt, temperature=0.1, max_tokens=2000, **kwargs)
abstractmethod
Generate a streaming completion from the LLM.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | The prompt to send to the LLM | *required* |
| `temperature` | `float` | Sampling temperature | `0.1` |
| `max_tokens` | `int` | Maximum tokens to generate | `2000` |
| `**kwargs` | `Any` | Provider-specific parameters | `{}` |
|
Yields:

| Type | Description |
|---|---|
| `AsyncGenerator[str, None]` | Text chunks as they are generated |
Example
```python
provider = OpenAIProvider(api_key="...")
async for chunk in provider.stream("Translate: Hello"):
    print(chunk, end="", flush=True)
# Output: Hola
```
All providers inherit from this base class.
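To add a new service, subclass `BaseLLMProvider` and implement both abstract methods. A minimal sketch, assuming `BaseLLMProvider` is importable from `kttc.llm` alongside the providers; the echo logic stands in for a real API call:

```python
from collections.abc import AsyncGenerator
from typing import Any

from kttc.llm import BaseLLMProvider


class EchoProvider(BaseLLMProvider):
    """Illustrative skeleton: echoes the prompt instead of calling a real API."""

    async def complete(
        self, prompt: str, temperature: float = 0.1, max_tokens: int = 2000, **kwargs: Any
    ) -> str:
        # A real provider would call its service's SDK here
        return f"echo: {prompt}"

    async def stream(
        self, prompt: str, temperature: float = 0.1, max_tokens: int = 2000, **kwargs: Any
    ) -> AsyncGenerator[str, None]:
        # A real provider would yield chunks as the service produces them
        for word in (await self.complete(prompt)).split():
            yield word + " "
```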
Complexity Routing
ComplexityRouter
Route to optimal model based on text complexity.
Combines complexity estimation with ModelSelector for intelligent model routing.
Example
```python
router = ComplexityRouter()
model, score = router.route(
    text="The API endpoint returns JSON data.",
    source_lang="en",
    target_lang="es"
)
print(model)          # "gpt-3.5-turbo"
print(score.overall)  # 0.25
```
__init__()
Initialize complexity router.
route(text, source_lang, target_lang, domain=None, force_model=None, available_providers=None)
Route to optimal model based on complexity and available providers.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `text` | `str` | Source text to analyze | *required* |
| `source_lang` | `str` | Source language code | *required* |
| `target_lang` | `str` | Target language code | *required* |
| `domain` | `str \| None` | Optional domain hint | `None` |
| `force_model` | `str \| None` | Force specific model (override routing) | `None` |
| `available_providers` | `list[str] \| None` | List of available provider names (with API keys) | `None` |
Returns:

| Type | Description |
|---|---|
| `tuple[str, ComplexityScore]` | Tuple of (model_name, complexity_score) |
Example
```python
model, score = router.route(
    "Hello world",
    "en",
    "es",
    available_providers=["anthropic"]
)
```
Automatically routes requests to the optimal model based on text complexity and the available providers.
Example:
```python
from kttc.llm import ComplexityRouter

# The router combines complexity estimation with model selection
router = ComplexityRouter()

# Simple text routes to a lighter model; complex text to a stronger one
model, score = router.route(
    text="The API endpoint returns JSON data.",
    source_lang="en",
    target_lang="es",
    available_providers=["openai", "anthropic"],
)
```
ComplexityEstimator
Estimate text complexity for smart routing.
Uses multiple heuristics to estimate translation difficulty:

1. Average sentence length
2. Rare word frequency
3. Syntactic complexity (clause nesting)
4. Domain-specific terminology density
Example
```python
estimator = ComplexityEstimator()
score = estimator.estimate("The API endpoint returns JSON data.")
print(score.recommendation)  # "gpt-3.5-turbo"
```
estimate(text, domain=None, available_providers=None)
Estimate text complexity.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `text` | `str` | Text to analyze | *required* |
| `domain` | `str \| None` | Optional domain hint | `None` |
| `available_providers` | `list[str] \| None` | List of available provider names (with API keys) | `None` |
Returns:

| Type | Description |
|---|---|
| `ComplexityScore` | ComplexityScore with breakdown and recommendation |
Estimates text complexity for smart routing.
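The heuristics listed above can be pictured with a toy score. This is an illustrative sketch of the idea only, not the library's actual scoring code:

```python
def toy_complexity(text: str, common_words: set[str]) -> float:
    """Illustrative only: blend sentence length and rare-word ratio into [0, 1]."""
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    words = [w.strip(".,;:") for w in text.lower().split()]
    if not words or not sentences:
        return 0.0
    avg_len = len(words) / len(sentences)                          # heuristic 1: sentence length
    rare = sum(w not in common_words for w in words) / len(words)  # heuristic 2: rare words
    # Real scoring would also weigh clause nesting and terminology density
    return min(1.0, 0.5 * min(avg_len / 40.0, 1.0) + 0.5 * rare)
```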
Error Handling
All providers can raise these exceptions:
LLMError
Bases: Exception
Base exception for all LLM-related errors.
LLMAuthenticationError
Bases: LLMError
Raised when authentication fails (invalid API key).
LLMRateLimitError
Bases: LLMError
Raised when rate limits are exceeded.
LLMTimeoutError
Bases: LLMError
Raised when an LLM request times out.
Example: Error Handling
```python
from kttc.llm import (
    OpenAIProvider,
    LLMAuthenticationError,
    LLMRateLimitError,
    LLMTimeoutError,
)

provider = OpenAIProvider(api_key="sk-...")

try:
    # complete() takes the prompt string directly
    response = await provider.complete("Translate: Hello")
except LLMAuthenticationError:
    print("Invalid API key")
except LLMRateLimitError:
    print("Rate limit exceeded, please retry later")
except LLMTimeoutError:
    print("Request timed out")
```
Prompt Templates
PromptTemplate
Manages prompt templates for QA agents.
Templates are stored as .txt files and support variable substitution.
Example
```python
template = PromptTemplate.load("accuracy")
prompt = template.format(
    source_text="Hello",
    translation="Hola",
    source_lang="en",
    target_lang="es"
)
```
__init__(template_text)
Initialize with template text.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `template_text` | `str` | The raw template text with placeholders | *required* |
format(**kwargs)
Format the template with provided variables.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `**kwargs` | `Any` | Template variables (source_text, translation, etc.) | `{}` |
Returns:

| Type | Description |
|---|---|
| `str` | Formatted prompt string |
Raises:

| Type | Description |
|---|---|
| `PromptTemplateError` | If required variables are missing |
Example
```python
template = PromptTemplate.load("accuracy")
prompt = template.format(
    source_text="Hello",
    translation="Hola",
    source_lang="en",
    target_lang="es"
)
```
load(agent_name)
classmethod
Load a prompt template for a specific agent.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `agent_name` | `str` | Name of the agent (e.g., `"accuracy"`, `"fluency"`, `"terminology"`) | *required* |
Returns:

| Type | Description |
|---|---|
| `PromptTemplate` | PromptTemplate instance |
Raises:

| Type | Description |
|---|---|
| `PromptTemplateError` | If template file doesn't exist |
Example
```python
template = PromptTemplate.load("accuracy")
```
Create reusable prompt templates.
Example:
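A minimal end-to-end sketch combining the documented `load` and `format` calls; the import path assumes `PromptTemplate` is exported from `kttc.llm` like the providers:

```python
from kttc.llm import PromptTemplate

# Load the template bundled for the accuracy agent
template = PromptTemplate.load("accuracy")

# Fill in the documented variables
prompt = template.format(
    source_text="Hello",
    translation="Hola",
    source_lang="en",
    target_lang="es",
)
print(prompt)
```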