LLM Providers¶
GraphBit supports multiple Large Language Model providers through a unified client interface. This guide covers configuration, usage, and optimization for each supported provider.
Supported Providers¶
GraphBit supports these LLM providers:

- OpenAI - GPT models including GPT-4o and GPT-4o-mini
- Azure OpenAI - GPT models hosted on Microsoft Azure with enterprise features
- Anthropic - Claude models including Claude-4-Sonnet
- OpenRouter - Unified access to 400+ models from multiple providers (GPT, Claude, Mistral, etc.)
- Perplexity - Real-time search-enabled models, including the Sonar family
- DeepSeek - High-performance models including DeepSeek-Chat, DeepSeek-Coder, and DeepSeek-Reasoner
- TogetherAI - Access to open-source models including GPT-OSS, Kimi, and Qwen with competitive pricing
- Fireworks AI - Fast inference for open-source models including Llama, Mixtral, and Qwen
- Replicate - Access to open-source models with function calling support, including Glaive, Hermes, and Granite models
- xAI - Grok models with real-time information and advanced reasoning capabilities
- AI21 Labs - Jamba models with long-context support
- Ollama - Local model execution with various open-source models
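Because every provider is exposed through the same LlmConfig and LlmClient interface, switching providers usually means changing only the configuration line. A minimal sketch (assuming OPENAI_API_KEY is set; any of the provider configurations described below can be substituted):

import os

from graphbit import LlmClient, LlmConfig

# Swap this for any other provider config (azure_openai, anthropic, ollama, ...)
config = LlmConfig.openai(api_key=os.getenv("OPENAI_API_KEY"), model="gpt-4o-mini")

client = LlmClient(config)
print(client.complete(prompt="Say hello in one sentence", max_tokens=50))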
Configuration¶
OpenAI Configuration¶
Configure OpenAI provider with API key and model selection:
import os
from graphbit import LlmConfig
# Basic OpenAI configuration
config = LlmConfig.openai(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4o-mini" # Optional - defaults to gpt-4o-mini
)
# Access configuration details
print(f"Provider: {config.provider()}") # "OpenAI"
print(f"Model: {config.model()}") # "gpt-4o-mini"
Available OpenAI Models¶
Model | Best For | Context Length | Performance |
---|---|---|---|
gpt-4o | Complex reasoning, latest features | 128K | High quality, slower |
gpt-4o-mini | Balanced performance and cost | 128K | Good quality, faster |
# Model selection examples
creative_config = LlmConfig.openai(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4o" # For creative and complex tasks
)
production_config = LlmConfig.openai(
api_key=os.getenv("OPENAI_API_KEY"),
model="gpt-4o-mini" # Balanced for production
)
Azure OpenAI Configuration¶
Azure OpenAI provides enterprise-grade access to OpenAI models through Microsoft Azure, offering enhanced security, compliance, and regional availability.
import os
from graphbit import LlmConfig
# Basic Azure OpenAI configuration
config = LlmConfig.azure_openai(
api_key=os.getenv("AZURE_OPENAI_API_KEY"),
deployment_name="gpt-4o-mini", # Your Azure deployment name
endpoint="https://your-resource.openai.azure.com" # Your Azure OpenAI endpoint
)
print(f"Provider: {config.provider()}") # "azure_openai"
print(f"Model: {config.model()}") # "gpt-4o-mini"
Azure OpenAI with Custom API Version¶
# Configuration with specific API version
config = LlmConfig.azure_openai(
api_key=os.getenv("AZURE_OPENAI_API_KEY"),
deployment_name="gpt-4o",
endpoint="https://your-resource.openai.azure.com",
api_version="2024-10-21" # Optional - defaults to "2024-10-21"
)
Azure OpenAI Setup Requirements¶
To use Azure OpenAI, you need:
- Azure OpenAI Resource: Create an Azure OpenAI resource in the Azure portal
- Model Deployment: Deploy a model (e.g., GPT-4o, GPT-4o-mini) in your resource
- API Key: Get your API key from the Azure portal
- Endpoint URL: Your resource endpoint (format: https://{resource-name}.openai.azure.com)
Environment Variables¶
Set these environment variables for Azure OpenAI:
export AZURE_OPENAI_API_KEY="your-azure-openai-api-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"
export AZURE_OPENAI_DEPLOYMENT="your-deployment-name"
export AZURE_OPENAI_API_VERSION="2024-10-21" # Optional
# Using environment variables
config = LlmConfig.azure_openai(
api_key=os.getenv("AZURE_OPENAI_API_KEY"),
deployment_name=os.getenv("AZURE_OPENAI_DEPLOYMENT"),
endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
api_version=os.getenv("AZURE_OPENAI_API_VERSION", "2024-10-21")
)
Available Azure OpenAI Models¶
Model | Best For | Context Length | Performance |
---|---|---|---|
gpt-4o | Complex reasoning, latest features | 128K | High quality, slower |
gpt-4o-mini | Balanced performance and cost | 128K | Good quality, faster |
gpt-4-turbo | Advanced tasks, function calling | 128K | High quality, moderate speed |
gpt-4 | Complex analysis, creative tasks | 8K | High quality, slower |
gpt-3.5-turbo | General tasks, cost-effective | 16K | Good quality, fast |
# Model selection examples for Azure OpenAI
premium_config = LlmConfig.azure_openai(
api_key=os.getenv("AZURE_OPENAI_API_KEY"),
deployment_name="gpt-4o", # For complex reasoning and latest features
endpoint=os.getenv("AZURE_OPENAI_ENDPOINT")
)
balanced_config = LlmConfig.azure_openai(
api_key=os.getenv("AZURE_OPENAI_API_KEY"),
deployment_name="gpt-4o-mini", # Balanced performance and cost
endpoint=os.getenv("AZURE_OPENAI_ENDPOINT")
)
cost_effective_config = LlmConfig.azure_openai(
api_key=os.getenv("AZURE_OPENAI_API_KEY"),
deployment_name="gpt-3.5-turbo", # Cost-effective for general tasks
endpoint=os.getenv("AZURE_OPENAI_ENDPOINT")
)
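A complete client example for Azure OpenAI, following the same pattern used for the other providers in this guide (the deployment name and endpoint are placeholders for your own resource):

import os

from graphbit import LlmClient, LlmConfig

config = LlmConfig.azure_openai(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    deployment_name=os.getenv("AZURE_OPENAI_DEPLOYMENT", "gpt-4o-mini"),
    endpoint=os.getenv("AZURE_OPENAI_ENDPOINT")
)

client = LlmClient(config)
response = client.complete(
    prompt="Summarize the benefits of retrieval-augmented generation",
    max_tokens=200,
    temperature=0.7
)
print(response)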
Anthropic Configuration¶
Configure Anthropic provider for Claude models:
# Basic Anthropic configuration
config = LlmConfig.anthropic(
api_key=os.getenv("ANTHROPIC_API_KEY"),
model="claude-sonnet-4-20250514" # Optional - defaults to claude-sonnet-4-20250514
)
print(f"Provider: {config.provider()}") # "Anthropic"
print(f"Model: {config.model()}") # "claude-sonnet-4-20250514"
Available Anthropic Models¶
Model | Best For | Context Length | Speed |
---|---|---|---|
claude-opus-4-1-20250805 | Most capable, complex analysis | 200K | Medium speed, highest quality |
claude-sonnet-4-20250514 | Balanced performance | 200K/1M | Medium speed, good quality
claude-3-haiku-20240307 | Fast, cost-effective | 200K | Fastest, good quality |
# Model selection for different use cases
complex_config = LlmConfig.anthropic(
api_key=os.getenv("ANTHROPIC_API_KEY"),
model="claude-opus-4-1-20250805" # For complex analysis
)
balanced_config = LlmConfig.anthropic(
api_key=os.getenv("ANTHROPIC_API_KEY"),
model="claude-sonnet-4-20250514" # For balanced workloads
)
fast_config = LlmConfig.anthropic(
api_key=os.getenv("ANTHROPIC_API_KEY"),
model="claude-3-haiku-20240307" # For speed and efficiency
)
OpenRouter Configuration¶
OpenRouter provides unified access to 400+ AI models through a single API, including models from OpenAI, Anthropic, Google, Meta, Mistral, and many others. This allows you to easily switch between different models and providers without changing your code.
import os
from graphbit import LlmConfig
# Basic OpenRouter configuration
config = LlmConfig.openrouter(
api_key=os.getenv("OPENROUTER_API_KEY"),
model="openai/gpt-4o-mini" # Optional - defaults to openai/gpt-4o-mini
)
print(f"Provider: {config.provider()}") # "openrouter"
print(f"Model: {config.model()}") # "openai/gpt-4o-mini"
Popular OpenRouter Models¶
Model | Provider | Best For | Context Length |
---|---|---|---|
openai/gpt-4o | OpenAI | Complex reasoning, latest features | 128K |
openai/gpt-4o-mini | OpenAI | Balanced performance and cost | 128K |
anthropic/claude-3-5-sonnet | Anthropic | Advanced reasoning, coding | 200K |
anthropic/claude-3-5-haiku | Anthropic | Fast responses, simple tasks | 200K |
google/gemini-pro-1.5 | Google | Large context, multimodal | 1M
meta-llama/llama-3.1-405b-instruct | Meta | Open source, high performance | 131K |
mistralai/mistral-large | Mistral | Multilingual, reasoning | 128K |
# Model selection examples
openai_config = LlmConfig.openrouter(
api_key=os.getenv("OPENROUTER_API_KEY"),
model="openai/gpt-4o" # Access OpenAI models through OpenRouter
)
claude_config = LlmConfig.openrouter(
api_key=os.getenv("OPENROUTER_API_KEY"),
model="anthropic/claude-3-5-sonnet" # Access Claude models
)
llama_config = LlmConfig.openrouter(
api_key=os.getenv("OPENROUTER_API_KEY"),
model="meta-llama/llama-3.1-405b-instruct" # Access open source models
)
OpenRouter with Site Information¶
For better rankings and analytics on OpenRouter, you can provide your site information:
# Configuration with site information for OpenRouter rankings
config = LlmConfig.openrouter_with_site(
api_key=os.getenv("OPENROUTER_API_KEY"),
model="openai/gpt-4o-mini",
site_url="https://graphbit.ai", # Optional - your site URL
site_name="GraphBit AI Framework" # Optional - your site name
)
Perplexity Configuration¶
Configure Perplexity provider to access real-time search-enabled models:
# Basic Perplexity configuration
config = LlmConfig.perplexity(
api_key=os.getenv("PERPLEXITY_API_KEY"),
model="sonar" # Optional - defaults to sonar
)
print(f"Provider: {config.provider()}") # "perplexity"
print(f"Model: {config.model()}") # "sonar"
Available Perplexity Models¶
Model | Best For | Context Length | Special Features |
---|---|---|---|
sonar | General purpose with search | 128K | Real-time web search, citations |
sonar-reasoning | Complex reasoning with search | 128K | Multi-step reasoning, web research |
sonar-deep-research | Comprehensive research | 128K | Exhaustive research, detailed analysis |
# Model selection for different use cases
research_config = LlmConfig.perplexity(
api_key=os.getenv("PERPLEXITY_API_KEY"),
model="sonar-deep-research" # For comprehensive research
)
reasoning_config = LlmConfig.perplexity(
api_key=os.getenv("PERPLEXITY_API_KEY"),
model="sonar-reasoning" # For complex problem solving
)
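A complete client example for Perplexity, following the same pattern used for the other providers in this guide:

import os

from graphbit import LlmClient, LlmConfig

config = LlmConfig.perplexity(
    api_key=os.getenv("PERPLEXITY_API_KEY"),
    model="sonar"
)

client = LlmClient(config)
response = client.complete(
    prompt="What are the latest developments in battery technology?",
    max_tokens=300,
    temperature=0.5
)
print(response)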
DeepSeek Configuration¶
Configure DeepSeek provider for high-performance, cost-effective AI models:
# Basic DeepSeek configuration
config = LlmConfig.deepseek(
api_key=os.getenv("DEEPSEEK_API_KEY"),
model="deepseek-chat" # Optional - defaults to deepseek-chat
)
print(f"Provider: {config.provider()}") # "deepseek"
print(f"Model: {config.model()}") # "deepseek-chat"
Available DeepSeek Models¶
Model | Best For | Context Length | Performance |
---|---|---|---|
deepseek-chat | General conversation, instruction following | 128K | High quality, fast
deepseek-coder | Code generation, programming tasks | 128K | Specialized for code
deepseek-reasoner | Complex reasoning, mathematics | 128K | Advanced reasoning
# Model selection for different use cases
general_config = LlmConfig.deepseek(
api_key=os.getenv("DEEPSEEK_API_KEY"),
model="deepseek-chat" # For general tasks and conversation
)
coding_config = LlmConfig.deepseek(
api_key=os.getenv("DEEPSEEK_API_KEY"),
model="deepseek-coder" # For code generation and programming
)
reasoning_config = LlmConfig.deepseek(
api_key=os.getenv("DEEPSEEK_API_KEY"),
model="deepseek-reasoner" # For complex reasoning tasks
)
TogetherAI Configuration¶
Configure TogetherAI provider for access to open-source models with competitive pricing:
# Basic TogetherAI configuration
config = LlmConfig.togetherai(
api_key=os.getenv("TOGETHER_API_KEY"),
model="openai/gpt-oss-20b" # Optional - defaults to openai/gpt-oss-20b
)
# Access configuration details
print(f"Provider: {config.provider()}") # "togetherai"
print(f"Model: {config.model()}") # "openai/gpt-oss-20b"
Available TogetherAI Models¶
Model | Best For | Context Length | Cost (per 1M tokens) |
---|---|---|---|
openai/gpt-oss-20b | General tasks, fast inference | 8K | $0.50 / $0.50 |
moonshotai/Kimi-K2-Instruct-0905 | Long documents, high context | 200K | $1.00 / $1.00 |
Qwen/Qwen3-Next-80B-A3B-Instruct | Complex reasoning, most capable | 32K | $2.00 / $2.00 |
# Model selection examples
fast_config = LlmConfig.togetherai(
api_key=os.getenv("TOGETHER_API_KEY"),
model="openai/gpt-oss-20b" # Fast and cost-effective
)
long_context_config = LlmConfig.togetherai(
api_key=os.getenv("TOGETHER_API_KEY"),
model="moonshotai/Kimi-K2-Instruct-0905" # For long documents
)
capable_config = LlmConfig.togetherai(
api_key=os.getenv("TOGETHER_API_KEY"),
model="Qwen/Qwen3-Next-80B-A3B-Instruct" # Most capable
)
TogetherAI Features¶
- ✅ Function Calling: All models support function/tool calling
- ✅ Streaming: Real-time response streaming (see the sketch after this list)
- ✅ Cost Estimation: Built-in cost tracking per token
- ✅ Context Detection: Automatic context length detection
- ✅ Async Support: Full async/await compatibility
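The streaming support listed above can be combined with the async client methods described later in this guide. A minimal sketch, assuming TOGETHER_API_KEY is set:

import asyncio
import os

from graphbit import LlmClient, LlmConfig

async def stream_together():
    config = LlmConfig.togetherai(
        api_key=os.getenv("TOGETHER_API_KEY"),
        model="openai/gpt-oss-20b"
    )
    client = LlmClient(config)

    # Print chunks as they arrive instead of waiting for the full response
    async for chunk in client.complete_stream(
        prompt="List three uses of open-source LLMs",
        max_tokens=150,
        temperature=0.7
    ):
        print(chunk, end="", flush=True)

asyncio.run(stream_together())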
Getting Started with TogetherAI¶
- Sign up: Create an account at TogetherAI
- Get API Key: Generate your API key from the dashboard
- Set Environment Variable:
export TOGETHER_API_KEY="your-key"
- Start Building: Use in your GraphBit workflows
# Complete example
import os
from graphbit import LlmClient, LlmConfig
# Configure TogetherAI
config = LlmConfig.togetherai(
api_key=os.getenv("TOGETHER_API_KEY"),
model="openai/gpt-oss-20b"
)
# Create client and generate text
client = LlmClient(config)
response = client.complete(
prompt="Explain quantum computing in simple terms",
max_tokens=200,
temperature=0.7
)
print(response)
Fireworks AI Configuration¶
Configure Fireworks AI for fast inference with open-source models:
import os
from graphbit import LlmConfig
# Basic Fireworks AI configuration
config = LlmConfig.fireworks(
api_key=os.getenv("FIREWORKS_API_KEY"),
model="accounts/fireworks/models/llama-v3p1-8b-instruct" # Optional - defaults to llama-v3p1-8b-instruct
)
print(f"Provider: {config.provider()}") # "fireworks"
print(f"Model: {config.model()}") # "accounts/fireworks/models/llama-v3p1-8b-instruct"
Popular Fireworks AI Models¶
Model | Best For | Context Length | Performance | Cost |
---|---|---|---|---|
accounts/fireworks/models/llama-v3p1-8b-instruct | General tasks, fast inference | 131K | Fast, efficient | Very low |
accounts/fireworks/models/llama-v3p1-70b-instruct | Complex reasoning, high quality | 131K | High quality | Low |
accounts/fireworks/models/deepseek-v3p1 | Most complex tasks | 131K | Highest quality | Medium |
accounts/fireworks/models/kimi-k2-instruct-0905 | Multilingual, code generation | 32K | Balanced | Low |
accounts/fireworks/models/qwen3-coder-480b-a35b-instruct | Reasoning, mathematics | 32K | High quality | Low |
# Model selection for different use cases
fast_config = LlmConfig.fireworks(
api_key=os.getenv("FIREWORKS_API_KEY"),
model="accounts/fireworks/models/llama-v3p1-8b-instruct" # For fast, efficient tasks
)
quality_config = LlmConfig.fireworks(
api_key=os.getenv("FIREWORKS_API_KEY"),
model="accounts/fireworks/models/llama-v3p1-70b-instruct" # For high-quality responses
)
coding_config = LlmConfig.fireworks(
api_key=os.getenv("FIREWORKS_API_KEY"),
model="accounts/fireworks/models/mixtral-8x7b-instruct" # For code generation
)
Getting Started with Fireworks AI¶
- Sign up at fireworks.ai
- Get your API key from the dashboard
- Set environment variable:
export FIREWORKS_API_KEY="your-api-key"
- Start using with GraphBit
import os
from graphbit import LlmClient, LlmConfig
# Create configuration
config = LlmConfig.fireworks(
api_key=os.getenv("FIREWORKS_API_KEY"),
model="accounts/fireworks/models/llama-v3p1-8b-instruct"
)
# Create client and generate text
client = LlmClient(config)
response = client.complete(
prompt="Explain quantum computing in simple terms",
max_tokens=200,
temperature=0.7
)
print(response)
Replicate Configuration¶
Configure Replicate for access to open-source models with function calling capabilities:
import os
from graphbit import LlmConfig
# Basic Replicate configuration
config = LlmConfig.replicate(
api_key=os.getenv("REPLICATE_API_KEY"),
model="anthropic/claude-4-sonnet" # openai/gpt-5
)
print(f"Provider: {config.provider()}") # "replicate"
print(f"Model: {config.model()}") # "anthropic/claude-4-sonnet"
Provide Model Version¶
You can also pin a specific model version separately:
# Configuration with specific model version
config = LlmConfig.replicate(
api_key=os.getenv("REPLICATE_API_KEY"),
model="lucataco/dolphin-2.9-llama3-8b",
version="ee173688d3b8d9e05a5b910f10fb9bab1e9348963ab224579bb90d9fce3fb00b"
)
Getting Started with Replicate¶
- Sign up at replicate.com
- Get your API token from your account settings
- Set environment variable:
export REPLICATE_API_KEY="your-api-token"
- Start using with GraphBit
import os
from graphbit import LlmClient, LlmConfig
# Create configuration
config = LlmConfig.replicate(
api_key=os.getenv("REPLICATE_API_KEY"),
model="lucataco/glaive-function-calling-v1"
)
# Create client and generate text
client = LlmClient(config)
response = client.complete(
prompt="Explain the benefits of open-source AI models",
max_tokens=300,
temperature=0.7
)
print(response)
xAI Configuration¶
Configure xAI for Grok models with real-time information and advanced reasoning:
import os
from graphbit import LlmConfig
# Basic xAI configuration
config = LlmConfig.xai(
api_key=os.getenv("XAI_API_KEY"),
model="grok-4" # Optional - defaults to grok-4
)
print(f"Provider: {config.provider()}") # "xai"
print(f"Model: {config.model()}") # "grok-4"
Popular xAI Grok Models¶
Model | Best For | Context Length | Performance | Cost |
---|---|---|---|---|
grok-4 | Complex reasoning, latest features | 256K | Highest quality | Medium |
grok-4-0709 | Stable version of Grok-4 | 256K | High quality | Medium |
grok-code-fast-1 | Code generation, fast inference | 256K | Fast, efficient | Very low |
grok-3 | General tasks, balanced performance | 131K | Good quality | Medium |
grok-3-mini | Quick tasks, cost-effective | 131K | Fast, efficient | Very low |
# Model selection for different use cases
reasoning_config = LlmConfig.xai(
api_key=os.getenv("XAI_API_KEY"),
model="grok-4" # For complex reasoning and latest features
)
coding_config = LlmConfig.xai(
api_key=os.getenv("XAI_API_KEY"),
model="grok-code-fast-1" # For fast code generation
)
efficient_config = LlmConfig.xai(
api_key=os.getenv("XAI_API_KEY"),
model="grok-3-mini" # For cost-effective tasks
)
Getting Started with xAI¶
- Sign up at x.ai
- Get your API key from the developer console
- Set environment variable:
export XAI_API_KEY="your-api-key"
- Start using with GraphBit
import os
from graphbit import LlmClient, LlmConfig
# Create configuration
config = LlmConfig.xai(
api_key=os.getenv("XAI_API_KEY"),
model="grok-4"
)
# Create client and generate text
client = LlmClient(config)
response = client.complete(
prompt="Explain quantum computing with real-time examples",
max_tokens=200,
temperature=0.7
)
print(response)
AI21 Labs Configuration¶
Configure AI21 Labs provider for Jamba models with long-context support:
import os
from graphbit import LlmConfig
# Basic AI21 configuration
config = LlmConfig.ai21(
api_key=os.getenv("AI21_API_KEY"),
model="jamba-mini" # Optional - defaults to jamba-mini
)
print(f"Provider: {config.provider()}") # "ai21"
print(f"Model: {config.model()}") # "jamba-mini"
AI21 Jamba Models¶
Model | Best For | Context Length | Performance | Cost |
---|---|---|---|---|
jamba-mini | General tasks, cost-effective | 256K | Good quality, fast | Low
jamba-large | General tasks, balanced performance | 256K | Highest quality | Medium
# Model selection for different use cases
mini_config = LlmConfig.ai21(
    api_key=os.getenv("AI21_API_KEY"),
    model="jamba-mini"  # Cost-effective for general tasks
)

large_config = LlmConfig.ai21(
    api_key=os.getenv("AI21_API_KEY"),
    model="jamba-large"  # Higher quality for complex tasks
)
Getting Started with AI21 Labs¶
- Sign up at AI21 Labs
- Get your API key from the developer console
- Set environment variable:
export AI21_API_KEY="your-api-key"
- Start using with GraphBit
import os
from graphbit import LlmClient, LlmConfig
# Create configuration
config = LlmConfig.ai21(
api_key=os.getenv("AI21_API_KEY"),
model="jamba-mini"
)
# Create client and generate text
client = LlmClient(config)
response = client.complete(
prompt="Explain quantum computing with real-time examples",
max_tokens=200,
temperature=0.7
)
print(response)
Ollama Configuration¶
Configure Ollama for local model execution:
# Basic Ollama configuration (no API key required)
config = LlmConfig.ollama(
model="llama3.2" # Optional - defaults to llama3.2
)
print(f"Provider: {config.provider()}") # "Ollama"
print(f"Model: {config.model()}") # "llama3.2"
# Other popular models
mistral_config = LlmConfig.ollama(model="mistral")
codellama_config = LlmConfig.ollama(model="codellama")
phi_config = LlmConfig.ollama(model="phi")
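Ollama runs models locally, so the Ollama server must be running and the model pulled (for example, ollama pull llama3.2) before use. A minimal client sketch following the same pattern as the cloud providers:

from graphbit import LlmClient, LlmConfig

# No API key required for local execution
config = LlmConfig.ollama(model="llama3.2")

client = LlmClient(config)
response = client.complete(
    prompt="Explain the difference between processes and threads",
    max_tokens=200,
    temperature=0.7
)
print(response)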
LLM Client Usage¶
Creating and Using Clients¶
from graphbit import LlmClient
# Create client with configuration
client = LlmClient(config, debug=False)
# Basic text completion
response = client.complete(
prompt="Explain the concept of machine learning",
max_tokens=500, # Optional - controls response length
temperature=0.7 # Optional - controls randomness (0.0-1.0)
)
print(f"Response: {response}")
Asynchronous Operations¶
GraphBit provides async methods for non-blocking operations:
import asyncio
async def async_completion():
    # Async completion
    response = await client.complete_async(
        prompt="Write a short story about AI",
        max_tokens=300,
        temperature=0.8
    )
    return response
# Run async operation
response = asyncio.run(async_completion())
Batch Processing¶
Process multiple prompts efficiently:
async def batch_processing():
    prompts = [
        "Summarize quantum computing",
        "Explain blockchain technology",
        "Describe neural networks",
        "What is machine learning?"
    ]

    responses = await client.complete_batch(
        prompts=prompts,
        max_tokens=200,
        temperature=0.5,
        max_concurrency=3  # Process 3 at a time
    )

    for i, response in enumerate(responses):
        print(f"Response {i+1}: {response}")
asyncio.run(batch_processing())
Chat-Style Interactions¶
Use chat-optimized methods for conversational interactions:
async def chat_example():
    # Chat with message history
    response = await client.chat_optimized(
        messages=[
            ("user", "Hello, how are you?"),
            ("assistant", "I'm doing well, thank you!"),
            ("user", "Can you help me with Python programming?"),
            ("user", "Specifically, how do I handle exceptions?")
        ],
        max_tokens=400,
        temperature=0.3
    )
    print(f"Chat response: {response}")
asyncio.run(chat_example())
Streaming Responses¶
Get real-time streaming responses:
async def streaming_example():
    print("Streaming response:")

    async for chunk in client.complete_stream(
        prompt="Tell me a detailed story about space exploration",
        max_tokens=1000,
        temperature=0.7
    ):
        print(chunk, end="", flush=True)

    print("\n--- Stream complete ---")
asyncio.run(streaming_example())
Client Management and Monitoring¶
Client Statistics¶
Monitor client performance and usage:
# Get comprehensive statistics
stats = client.get_stats()
print(f"Total requests: {stats['total_requests']}")
print(f"Successful requests: {stats['successful_requests']}")
print(f"Failed requests: {stats['failed_requests']}")
print(f"Average response time: {stats['average_response_time_ms']}ms")
print(f"Circuit breaker state: {stats['circuit_breaker_state']}")
print(f"Client uptime: {stats['uptime']}")
# Calculate success rate
if stats['total_requests'] > 0:
    success_rate = stats['successful_requests'] / stats['total_requests']
    print(f"Success rate: {success_rate:.2%}")
Client Warmup¶
Pre-initialize connections for better performance:
async def warmup_client():
    # Warmup client to reduce cold start latency
    await client.warmup()
    print("Client warmed up and ready")
# Warmup before production use
asyncio.run(warmup_client())
Reset Statistics¶
Reset client statistics for monitoring periods:
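A minimal sketch, assuming the client exposes a reset_stats() method (the method name is an assumption; check the GraphBit API reference for the exact call):

# Capture a baseline, then clear counters before a new monitoring window
print(client.get_stats())
client.reset_stats()  # assumed method name
print(client.get_stats())  # counters should now start from zero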
Provider-Specific Examples¶
OpenAI Workflow Example¶
import os
from graphbit import LlmConfig, Workflow, Node, Executor
def create_openai_workflow():
    """Create workflow using OpenAI"""
    # Configure OpenAI
    config = LlmConfig.openai(
        api_key=os.getenv("OPENAI_API_KEY"),
        model="gpt-4o-mini"
    )

    # Create workflow
    workflow = Workflow("OpenAI Analysis Pipeline")

    # Create analyzer node ({input} stays a literal placeholder, so no f-string)
    analyzer = Node.agent(
        name="GPT Content Analyzer",
        prompt="Analyze the following content for sentiment, key themes, and quality:\n\n{input}",
        agent_id="gpt_analyzer"
    )

    # Add to workflow
    analyzer_id = workflow.add_node(analyzer)
    workflow.validate()

    # Create executor and run
    executor = Executor(config, timeout_seconds=60)
    return workflow, executor
# Usage
workflow, executor = create_openai_workflow()
result = executor.execute(workflow)
Anthropic Workflow Example¶
import os
from graphbit import LlmConfig, Workflow, Node, Executor
def create_anthropic_workflow():
    """Create workflow using Anthropic Claude"""
    # Configure Anthropic
    config = LlmConfig.anthropic(
        api_key=os.getenv("ANTHROPIC_API_KEY"),
        model="claude-sonnet-4-20250514"
    )

    # Create workflow
    workflow = Workflow("Claude Analysis Pipeline")

    # Create analyzer with detailed prompt ({input} stays a literal placeholder)
    analyzer = Node.agent(
        name="Claude Content Analyzer",
        prompt="""
        Analyze the following content with attention to:
        - Factual accuracy and logical consistency
        - Potential biases or assumptions
        - Clarity and structure
        - Key insights and recommendations

        Content: {input}

        Provide your analysis in a structured format.
        """,
        agent_id="claude_analyzer"
    )

    workflow.add_node(analyzer)
    workflow.validate()

    # Create executor with longer timeout for Claude
    executor = Executor(config, timeout_seconds=120)
    return workflow, executor
# Usage
workflow, executor = create_anthropic_workflow()
DeepSeek Workflow Example¶
import os
from graphbit import LlmConfig, Workflow, Node, Executor
def create_deepseek_workflow():
    """Create workflow using DeepSeek models"""
    # Configure DeepSeek
    config = LlmConfig.deepseek(
        api_key=os.getenv("DEEPSEEK_API_KEY"),
        model="deepseek-chat"
    )

    # Create workflow
    workflow = Workflow("DeepSeek Analysis Pipeline")

    # Create analyzer optimized for DeepSeek's capabilities ({input} stays a literal placeholder)
    analyzer = Node.agent(
        name="DeepSeek Content Analyzer",
        prompt="""
        Analyze the following content efficiently and accurately:
        - Main topics and themes
        - Key insights and takeaways
        - Actionable recommendations
        - Potential concerns or limitations

        Content: {input}

        Provide a clear, structured analysis.
        """,
        agent_id="deepseek_analyzer"
    )

    workflow.add_node(analyzer)
    workflow.validate()

    # Create executor optimized for DeepSeek's fast inference
    executor = Executor(config, timeout_seconds=90)
    return workflow, executor

# Usage for different DeepSeek models
def create_deepseek_coding_workflow():
    """Create workflow for code analysis using DeepSeek Coder"""
    config = LlmConfig.deepseek(
        api_key=os.getenv("DEEPSEEK_API_KEY"),
        model="deepseek-coder"
    )

    workflow = Workflow("DeepSeek Code Analysis")

    code_analyzer = Node.agent(
        name="DeepSeek Code Reviewer",
        prompt="""
        Review this code for:
        - Code quality and best practices
        - Potential bugs or issues
        - Performance improvements
        - Security considerations

        Code: {input}
        """,
        agent_id="deepseek_code_analyzer"
    )

    workflow.add_node(code_analyzer)
    workflow.validate()

    executor = Executor(config, timeout_seconds=90)
    return workflow, executor
# Usage
workflow, executor = create_deepseek_workflow()
OpenRouter Workflow Example¶
from graphbit import LlmConfig, Workflow, Node, Executor
import os
def create_openrouter_workflow():
    """Create workflow using OpenRouter with multiple models"""
    # Configure OpenRouter with a high-performance model
    config = LlmConfig.openrouter(
        api_key=os.getenv("OPENROUTER_API_KEY"),
        model="anthropic/claude-3-5-sonnet"  # Use Claude through OpenRouter
    )

    workflow = Workflow("OpenRouter Multi-Model Pipeline")

    # Create analyzer using Claude for complex reasoning ({input} stays a literal placeholder)
    analyzer = Node.agent(
        name="Claude Content Analyzer",
        prompt="""
        Analyze this content comprehensively:
        - Main themes and topics
        - Sentiment and tone
        - Key insights and takeaways
        - Potential improvements

        Content: {input}
        """,
        agent_id="claude_analyzer"
    )

    # Create summarizer using a different model for comparison
    summarizer = Node.agent(
        name="GPT Summarizer",
        prompt="Create a concise summary of this analysis: {input}",
        agent_id="gpt_summarizer",
        llm_config=LlmConfig.openrouter(
            api_key=os.getenv("OPENROUTER_API_KEY"),
            model="openai/gpt-4o-mini"  # Use GPT for summarization
        )
    )

    workflow.add_node(analyzer)
    workflow.add_node(summarizer)
    workflow.add_edge(analyzer, summarizer)
    workflow.validate()

    executor = Executor(config, timeout_seconds=120)
    return workflow, executor
# Usage
workflow, executor = create_openrouter_workflow()
Ollama Workflow Example¶
from graphbit import LlmConfig, Workflow, Node, Executor
def create_ollama_workflow():
    """Create workflow using local Ollama models"""
    # Configure Ollama (no API key needed)
    config = LlmConfig.ollama(model="llama3.2")

    # Create workflow
    workflow = Workflow("Local LLM Pipeline")

    # Create analyzer optimized for local models ({input} stays a literal placeholder)
    analyzer = Node.agent(
        name="Local Model Analyzer",
        prompt="Analyze this text briefly: {input}",
        agent_id="local_analyzer"
    )

    workflow.add_node(analyzer)
    workflow.validate()

    # Create executor with longer timeout for local processing
    executor = Executor(config, timeout_seconds=180)
    return workflow, executor
# Usage
workflow, executor = create_ollama_workflow()
Replicate Workflow Example¶
from graphbit import LlmConfig, Workflow, Node, Executor
import os
def create_replicate_workflow():
    """Create workflow using Replicate models with function calling"""
    # Configure Replicate with function calling model
    # (":version" is a placeholder for a pinned version hash; see above)
    config = LlmConfig.replicate(
        api_key=os.getenv("REPLICATE_API_KEY"),
        model="lucataco/dolphin-2.9-llama3-8b:version"
    )

    # Create workflow
    workflow = Workflow("Replicate Function Calling Pipeline")

    # Create analyzer with function calling capabilities ({input} stays a literal placeholder)
    analyzer = Node.agent(
        name="Replicate Function Analyzer",
        prompt="""
        Analyze the following content and use available tools when needed:
        - Identify key topics and themes
        - Extract actionable insights
        - Suggest relevant tools or functions to call
        - Provide structured recommendations

        Content: {input}

        Use function calling when appropriate to enhance your analysis.
        """,
        agent_id="replicate_analyzer"
    )

    # Create summarizer using a different Replicate model
    summarizer = Node.agent(
        name="Hermes Summarizer",
        prompt="Create a concise summary of this analysis: {input}",
        agent_id="hermes_summarizer",
        llm_config=LlmConfig.replicate(
            api_key=os.getenv("REPLICATE_API_KEY"),
            model="lucataco/dolphin-2.9-llama3-8b:version"
        )
    )

    workflow.add_node(analyzer)
    workflow.add_node(summarizer)
    workflow.add_edge(analyzer, summarizer)
    workflow.validate()

    # Create executor with appropriate timeout for Replicate
    executor = Executor(config, timeout_seconds=300)  # Longer timeout for prediction-based API
    return workflow, executor

def create_replicate_enterprise_workflow():
    """Create workflow using Replicate's enterprise-focused models"""
    # Configure with Granite model for enterprise use
    config = LlmConfig.replicate(
        api_key=os.getenv("REPLICATE_API_KEY"),
        model="lucataco/dolphin-2.9-llama3-8b:version"
    )

    workflow = Workflow("Enterprise Replicate Pipeline")

    # Enterprise-focused analyzer
    analyzer = Node.agent(
        name="Granite Enterprise Analyzer",
        prompt="""
        Perform enterprise-grade analysis of the following content:
        - Business impact assessment
        - Risk analysis and mitigation strategies
        - Compliance considerations
        - Strategic recommendations

        Content: {input}

        Provide analysis suitable for enterprise decision-making.
        """,
        agent_id="granite_analyzer"
    )

    workflow.add_node(analyzer)
    workflow.validate()

    executor = Executor(config, timeout_seconds=300)
    return workflow, executor
# Usage
workflow, executor = create_replicate_workflow()
enterprise_workflow, enterprise_executor = create_replicate_enterprise_workflow()
Performance Optimization¶
Timeout Configuration¶
Configure appropriate timeouts for different providers:
# OpenAI - typically faster
openai_executor = Executor(
    openai_config,
    timeout_seconds=60
)

# Anthropic - allow more time for longer Claude responses
anthropic_executor = Executor(
    anthropic_config,
    timeout_seconds=120
)

# DeepSeek - fast inference, moderate timeout
deepseek_executor = Executor(
    deepseek_config,
    timeout_seconds=90
)

# Replicate - longer timeout for prediction-based API
replicate_executor = Executor(
    replicate_config,
    timeout_seconds=300
)

# Ollama - local processing
ollama_executor = Executor(
    ollama_config,
    timeout_seconds=180
)
Executor Types for Different Providers¶
Choose appropriate executor types based on provider characteristics:
# High-throughput for cloud providers
cloud_executor = Executor(
    llm_config=openai_config,
    timeout_seconds=60
)

# Low-latency for fast providers
realtime_executor = Executor(
    llm_config=anthropic_config,
    lightweight_mode=True,
    timeout_seconds=30
)
Error Handling¶
Provider-Specific Error Handling¶
def robust_llm_usage():
    try:
        # Configure
        config = LlmConfig.openai(
            api_key=os.getenv("OPENAI_API_KEY")
        )

        client = LlmClient(config)

        # Execute with error handling
        response = client.complete(
            prompt="Test prompt",
            max_tokens=100
        )

        return response

    except Exception as e:
        print(f"LLM operation failed: {e}")
        return None
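For transient provider errors such as rate limits or timeouts, a simple retry wrapper can be layered on top of the client. A minimal sketch; the backoff values are illustrative, and GraphBit's built-in resilience (the circuit breaker state visible in get_stats()) may already cover part of this:

import time

def complete_with_retry(client, prompt, max_attempts=3, base_delay=1.0, **kwargs):
    """Retry a completion with exponential backoff on any exception."""
    for attempt in range(1, max_attempts + 1):
        try:
            return client.complete(prompt=prompt, **kwargs)
        except Exception as e:
            if attempt == max_attempts:
                raise
            delay = base_delay * (2 ** (attempt - 1))
            print(f"Attempt {attempt} failed ({e}); retrying in {delay:.1f}s")
            time.sleep(delay)

# Usage (assuming a configured client as above)
response = complete_with_retry(client, "Test prompt", max_tokens=100)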
Workflow Error Handling¶
def execute_with_error_handling(workflow, executor):
    try:
        result = executor.execute(workflow)

        if result.is_completed():
            return result.output()
        elif result.is_failed():
            error_msg = result.error()
            print(f"Workflow failed: {error_msg}")
            return None

    except Exception as e:
        print(f"Execution error: {e}")
        return None
Best Practices¶
1. Provider Selection¶
Choose providers based on your requirements:
def get_optimal_config(use_case):
    """Select optimal provider for use case"""
    if use_case == "creative":
        return LlmConfig.openai(
            api_key=os.getenv("OPENAI_API_KEY"),
            model="gpt-4o"
        )
    elif use_case == "analytical":
        return LlmConfig.anthropic(
            api_key=os.getenv("ANTHROPIC_API_KEY"),
            model="claude-sonnet-4-20250514"
        )
    elif use_case == "cost_effective":
        return LlmConfig.deepseek(
            api_key=os.getenv("DEEPSEEK_API_KEY"),
            model="deepseek-chat"
        )
    elif use_case == "coding":
        return LlmConfig.deepseek(
            api_key=os.getenv("DEEPSEEK_API_KEY"),
            model="deepseek-coder"
        )
    elif use_case == "reasoning":
        return LlmConfig.deepseek(
            api_key=os.getenv("DEEPSEEK_API_KEY"),
            model="deepseek-reasoner"
        )
    elif use_case == "multi_model":
        return LlmConfig.openrouter(
            api_key=os.getenv("OPENROUTER_API_KEY"),
            model="anthropic/claude-3-5-sonnet"
        )
    elif use_case in ("function_calling", "open_source"):
        return LlmConfig.replicate(
            api_key=os.getenv("REPLICATE_API_KEY"),
            model="lucataco/dolphin-2.9-llama3-8b:version"
        )
    elif use_case == "local":
        return LlmConfig.ollama(model="llama3.2")
    else:
        return LlmConfig.openai(
            api_key=os.getenv("OPENAI_API_KEY"),
            model="gpt-4o-mini"
        )
2. API Key Management¶
Securely manage API keys:
import os
def get_api_key(provider):
    """Securely retrieve API keys"""
    key_mapping = {
        "openai": "OPENAI_API_KEY",
        "azure_openai": "AZURE_OPENAI_API_KEY",
        "anthropic": "ANTHROPIC_API_KEY",
        "openrouter": "OPENROUTER_API_KEY",
        "perplexity": "PERPLEXITY_API_KEY",
        "deepseek": "DEEPSEEK_API_KEY",
        "togetherai": "TOGETHER_API_KEY",
        "fireworks": "FIREWORKS_API_KEY",
        "replicate": "REPLICATE_API_KEY",
        "xai": "XAI_API_KEY",
        "ai21": "AI21_API_KEY"
    }

    env_var = key_mapping.get(provider)
    if not env_var:
        raise ValueError(f"Unknown provider: {provider}")

    api_key = os.getenv(env_var)
    if not api_key:
        raise ValueError(f"Missing {env_var} environment variable")

    return api_key
# Usage
try:
    openai_config = LlmConfig.openai(
        api_key=get_api_key("openai")
    )
except ValueError as e:
    print(f"Configuration error: {e}")
3. Client Reuse¶
Reuse clients for better performance:
class LLMManager:
    def __init__(self):
        self.clients = {}

    def get_client(self, provider, model=None):
        """Get or create client for provider"""
        key = f"{provider}_{model or 'default'}"

        if key not in self.clients:
            if provider == "openai":
                config = LlmConfig.openai(
                    api_key=get_api_key("openai"),
                    model=model
                )
            elif provider == "anthropic":
                config = LlmConfig.anthropic(
                    api_key=get_api_key("anthropic"),
                    model=model
                )
            elif provider == "deepseek":
                config = LlmConfig.deepseek(
                    api_key=get_api_key("deepseek"),
                    model=model
                )
            elif provider == "replicate":
                config = LlmConfig.replicate(
                    api_key=get_api_key("replicate"),
                    model=model
                )
            elif provider == "ollama":
                config = LlmConfig.ollama(model=model)
            else:
                raise ValueError(f"Unknown provider: {provider}")

            self.clients[key] = LlmClient(config)

        return self.clients[key]
# Usage
llm_manager = LLMManager()
openai_client = llm_manager.get_client("openai", "gpt-4o-mini")
4. Monitoring and Logging¶
Monitor LLM usage and performance:
def monitor_llm_usage(client, operation_name, operation):
    """Monitor LLM client usage around a callable operation"""
    stats_before = client.get_stats()

    # Perform the operation (any zero-argument callable that uses the client)
    result = operation()

    stats_after = client.get_stats()

    requests_made = stats_after['total_requests'] - stats_before['total_requests']
    print(f"{operation_name}: {requests_made} requests made")

    if stats_after['total_requests'] > 0:
        success_rate = stats_after['successful_requests'] / stats_after['total_requests']
        print(f"Overall success rate: {success_rate:.2%}")

    return result
What's Next¶
- Learn about Embeddings for vector operations
- Explore Workflow Builder for complex workflows
- Check Performance for optimization techniques
- See Monitoring for production monitoring