Rate Limiting
When processing many requests with execute_many(), you may need to control the request rate to avoid overwhelming the StackSpot AI API or hitting server-side rate limits.
Terminology: Rate Limiting vs Throttling
The SDK uses the term "rate limiting", but the actual behavior is throttling: requests that exceed the configured rate are delayed (queued) rather than immediately rejected. Because nothing is dropped on the client side, this proactive approach maximizes the number of requests that succeed. See HTTP Client > Terminology for details.
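The difference between delaying and rejecting can be illustrated with a small token bucket. This is a minimal single-threaded sketch, not the SDK's implementation: the class name `TokenBucketThrottle`, the `acquire()` method, and the injectable `clock`/`sleep` parameters are hypothetical and exist only for illustration. It assumes the bucket starts full and refills continuously at `max_requests / time_window` tokens per second.

```python
import time


class TokenBucketThrottle:
    """Illustrative token bucket that delays callers instead of rejecting them.

    Hypothetical sketch; not the SDK's implementation and not thread-safe.
    """

    def __init__(self, max_requests, time_window,
                 clock=time.monotonic, sleep=time.sleep):
        self.capacity = float(max_requests)
        self.refill_rate = max_requests / time_window  # tokens per second
        self.tokens = float(max_requests)              # assumption: bucket starts full
        self.clock = clock
        self.sleep = sleep
        self.last_refill = clock()

    def _refill(self):
        # Add the tokens earned since the last refill, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def acquire(self):
        """Block until one token is available; return the seconds spent waiting."""
        waited = 0.0
        while True:
            self._refill()
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return waited
            # Throttling: sleep until the missing fraction of a token has refilled.
            # A rejecting rate limiter would raise an error here instead.
            delay = (1.0 - self.tokens) / self.refill_rate
            self.sleep(delay)
            waited += delay
```

A caller would invoke `acquire()` before each request. With `max_requests=30` and `time_window=60.0`, the first 30 calls return immediately and later calls are spaced roughly two seconds apart.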
Global Configuration (Recommended)
The easiest way to enable rate limiting is via STKAI.configure():
from stkai import STKAI, RemoteQuickCommand, RqcRequest

# Enable rate limiting globally
STKAI.configure(
    rate_limit={
        "enabled": True,
        "strategy": "token_bucket",
        "max_requests": 30,
        "time_window": 60.0,
    }
)

# Rate limiting is automatically applied
# (large_dataset is your own iterable of payloads, defined elsewhere)
rqc = RemoteQuickCommand(slug_name="my-quick-command")
responses = rqc.execute_many(
    request_list=[RqcRequest(payload=data) for data in large_dataset]
)
Or via environment variables:
export STKAI_RATE_LIMIT_ENABLED=true
export STKAI_RATE_LIMIT_STRATEGY=token_bucket
export STKAI_RATE_LIMIT_MAX_REQUESTS=30
export STKAI_RATE_LIMIT_TIME_WINDOW=60.0
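The SDK reads these variables itself, so no extra code is needed. Purely to show how each variable lines up with a key of the `rate_limit` dictionary, here is a sketch of a hypothetical helper, `rate_limit_from_env`, which is not part of the SDK. The accepted boolean spellings are an assumption, and the SDK's own parsing rules may differ.

```python
import os


def rate_limit_from_env(environ=None):
    """Hypothetical helper: build a rate_limit dict from STKAI_RATE_LIMIT_* variables."""
    env = os.environ if environ is None else environ
    config = {}
    if "STKAI_RATE_LIMIT_ENABLED" in env:
        # Assumption: "1", "true" and "yes" (case-insensitive) mean enabled.
        value = env["STKAI_RATE_LIMIT_ENABLED"].strip().lower()
        config["enabled"] = value in {"1", "true", "yes"}
    if "STKAI_RATE_LIMIT_STRATEGY" in env:
        config["strategy"] = env["STKAI_RATE_LIMIT_STRATEGY"]
    if "STKAI_RATE_LIMIT_MAX_REQUESTS" in env:
        config["max_requests"] = int(env["STKAI_RATE_LIMIT_MAX_REQUESTS"])
    if "STKAI_RATE_LIMIT_TIME_WINDOW" in env:
        config["time_window"] = float(env["STKAI_RATE_LIMIT_TIME_WINDOW"])
    return config
```

With the four exports above in place, the helper returns the same dictionary that the `STKAI.configure()` example passes as `rate_limit`.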
Full Configuration Reference
See HTTP Client > Rate Limiting for all configuration options, strategies comparison, algorithms explanation, and environment variables.
Manual Configuration
For more control, you can manually create rate-limited HTTP clients:
from stkai import RemoteQuickCommand, RqcRequest
from stkai import TokenBucketRateLimitedHttpClient, StkCLIHttpClient

# Limit to 30 requests per minute
http_client = TokenBucketRateLimitedHttpClient(
    delegate=StkCLIHttpClient(),
    max_requests=30,
    time_window=60.0,
)

rqc = RemoteQuickCommand(
    slug_name="my-quick-command",
    http_client=http_client,
)

responses = rqc.execute_many(
    request_list=[RqcRequest(payload=data) for data in large_dataset]
)
For adaptive rate limiting (handles HTTP 429 automatically):
from stkai import AdaptiveRateLimitedHttpClient, RemoteQuickCommand, StkCLIHttpClient

http_client = AdaptiveRateLimitedHttpClient(
    delegate=StkCLIHttpClient(),
    max_requests=100,
    time_window=60.0,
    min_rate_floor=0.1,    # Never drop below 10% of max_requests
    max_retries_on_429=3,  # Retry up to 3 times on HTTP 429
)

rqc = RemoteQuickCommand(
    slug_name="my-quick-command",
    http_client=http_client,
)
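To give a feel for what "adaptive" can mean, the sketch below lowers the effective rate when the server answers HTTP 429 and raises it again gradually on success, never going below the floor. It uses a generic additive-increase/multiplicative-decrease (AIMD) scheme as a stand-in. The SDK's actual algorithm is described in HTTP Client > Rate Limiting and may differ. The class `AdaptiveRate` and the `penalty` and `recovery` parameters are hypothetical, and retries (`max_retries_on_429`) are not modeled.

```python
class AdaptiveRate:
    """Illustrative AIMD controller: cut the rate on HTTP 429, recover slowly on success.

    Hypothetical sketch; not the SDK's implementation.
    """

    def __init__(self, max_requests, min_rate_floor=0.1, penalty=0.5, recovery=0.05):
        self.max_requests = float(max_requests)
        # The floor is a fraction of the configured maximum (0.1 means 10%).
        self.floor = self.max_requests * min_rate_floor
        self.penalty = penalty    # assumed multiplicative decrease factor
        self.recovery = recovery  # assumed additive increase, as a fraction of the maximum
        self.effective = self.max_requests

    def on_response(self, status_code):
        """Update and return the effective requests-per-window budget."""
        if status_code == 429:
            # Back off quickly, but never below the floor.
            self.effective = max(self.floor, self.effective * self.penalty)
        else:
            # Recover slowly, but never above the configured maximum.
            step = self.max_requests * self.recovery
            self.effective = min(self.max_requests, self.effective + step)
        return self.effective
```

With `max_requests=100` and `min_rate_floor=0.1`, repeated 429 responses shrink the budget until it reaches 10 requests per window and stays there. Successful responses then raise it step by step back toward 100.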
Batch Processing with Rate Limiting
Rate limiting is especially useful with execute_many() for batch processing:
from stkai import STKAI, RemoteQuickCommand, RqcRequest

# Configure rate limiting
STKAI.configure(
    rate_limit={
        "enabled": True,
        "strategy": "adaptive",
        "max_requests": 50,
    }
)

rqc = RemoteQuickCommand(
    slug_name="code-review",
    max_workers=16,  # 16 concurrent workers, still rate-limited
)

# Process large dataset
# (load_files_to_review() stands in for your own loading code)
files = load_files_to_review()
responses = rqc.execute_many(
    request_list=[RqcRequest(payload={"code": f}) for f in files]
)

# Check results
completed = [r for r in responses if r.is_completed()]
failed = [r for r in responses if r.is_failure()]
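When sizing a batch, it helps to estimate how long throttling alone will stretch it. The sketch below is back-of-the-envelope arithmetic rather than SDK behavior: the function `estimate_dispatch_seconds` is hypothetical, and it assumes a bucket that starts full with a burst capacity equal to `max_requests`, a steady refill over `time_window`, and no rate reduction from 429 responses. Request latency is ignored, so the result is a lower bound on when the last request is dispatched.

```python
def estimate_dispatch_seconds(total_requests, max_requests, time_window):
    """Rough lower bound on the time until the last request is dispatched.

    Hypothetical helper. Assumes the first max_requests requests go out
    immediately (full bucket) and the rest are released at a steady
    max_requests / time_window requests per second.
    """
    queued = max(0, total_requests - max_requests)
    # Each queued request waits for one token, i.e. time_window / max_requests seconds.
    return queued * time_window / max_requests


# 500 files at 50 requests per window, assuming a 60-second window
print(estimate_dispatch_seconds(500, 50, 60.0))  # 540.0 seconds, about 9 minutes
```

Under these assumptions the worker count does not change the estimate: with `max_workers=16` the workers mostly wait for tokens once the initial burst is spent, so raising concurrency further would not shorten the batch.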
Next Steps
- HTTP Client > Rate Limiting - Detailed guide with algorithms, strategies, and configuration
- Configuration - Global SDK configuration
- API Reference - Complete API documentation