GitLab Search API Integration with Graphbit¶
Overview¶
This guide explains how to connect the GitLab project search API to Graphbit, so that Graphbit can orchestrate the retrieval, processing, and use of GitLab search results in your AI workflows. With this integration you can automate repository research, enrich LLM prompts, and build pipelines that draw on live GitLab project data.
Prerequisites¶
- GitLab token (optional): Obtain one from GitLab Personal Access Tokens.
- OpenAI API Key: For LLM summarization (or another supported LLM provider).
- Graphbit installed and configured (see installation guide).
- Python environment with requests, python-dotenv, and graphbit installed.
- .env file in your project root with the following variables: GITLAB_TOKEN (optional) and OPENAI_API_KEY.
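A quick startup check can catch a missing key before any request is made. The sketch below assumes the variable names GITLAB_TOKEN and OPENAI_API_KEY that the code in this guide reads from the environment; the helper name `missing_env_vars` is hypothetical and not part of Graphbit or python-dotenv.

```python
import os

# Names taken from the code in this guide. GITLAB_TOKEN is optional,
# so only the LLM key is treated as required here (an assumption).
REQUIRED_VARS = ("OPENAI_API_KEY",)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Try the check against a hand-written mapping instead of the real environment.
print(missing_env_vars({"OPENAI_API_KEY": "sk-example"}))
```

Calling `missing_env_vars()` with no arguments, after `load_dotenv()`, checks the real process environment.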
Step 1: Implement the GitLab Projects Search Connector¶
Define a function that queries the GitLab projects endpoint with a search term, loading credentials from environment variables:
import requests
import os
from dotenv import load_dotenv
load_dotenv()
GITLAB_TOKEN = os.getenv("GITLAB_TOKEN")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
def gitlab_projects_search(query, sort="stars", order="desc", per_page=50):
    """
    Search GitLab projects (repositories).

    Args:
        query: Search query (e.g., "machine learning")
        sort: One of 'stars', 'name', 'created', 'updated', 'last_activity'
        order: 'asc' or 'desc'
        per_page: Number of results per page (max 100)

    Returns:
        list: JSON-decoded list of project objects
    """
    url = "https://gitlab.com/api/v4/projects"
    # The token is optional; send the header only when one is configured.
    if GITLAB_TOKEN:
        headers = {
            "PRIVATE-TOKEN": GITLAB_TOKEN,
        }
    else:
        headers = {}
    # Map the friendly sort names to the API's order_by values.
    sort_mapping = {
        "stars": "star_count",
        "updated": "updated_at",
        "created": "created_at",
        "name": "name",
        "last_activity": "last_activity_at"
    }
    order_by = sort_mapping.get(sort, "star_count")
    params = {
        "search": query,
        "order_by": order_by,
        "sort": order,
        "per_page": per_page
    }
    # A timeout keeps the call from hanging on a stalled connection.
    response = requests.get(url, headers=headers, params=params, timeout=30)
    response.raise_for_status()
    return response.json()
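A single call returns at most one page of results. GitLab's REST API accepts a `page` parameter alongside `per_page`, so more projects can be gathered by requesting successive pages until one comes back empty. The sketch below keeps the HTTP call out of the loop by taking a `fetch_page` callable; both `collect_projects` and `fetch_page` are hypothetical names introduced for illustration.

```python
def collect_projects(fetch_page, max_pages=3):
    """Gather projects from successive pages.

    fetch_page(page) must return a list of project dicts for the given
    1-based page number, or an empty list when no results remain.
    """
    projects = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            break  # an empty page means there is nothing left to fetch
        projects.extend(batch)
    return projects


# Stand-in for a real request, so the loop can be tried offline.
fake_pages = {1: [{"name": "a"}, {"name": "b"}], 2: [{"name": "c"}]}
print(len(collect_projects(lambda page: fake_pages.get(page, []))))  # prints 3
```

In practice, `fetch_page` could wrap the same `requests.get` call as the connector, with `"page": page` added to `params`.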
Step 2: Process the Search Results¶
Extract the tag list of each project from the search results for downstream use. By default, the first 10 results are included; you can override this by specifying the max_snippets parameter:
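def process_search_results(results, max_snippets=10):
    """
    Joins the tags of up to max_snippets projects (default: 10) into one string.
    """
    # The projects endpoint returns a JSON list, so slice it directly.
    items = results[:max_snippets]
    snippets = [
        # 'tag_list' is deprecated in favor of 'topics'; accept either.
        ','.join(item.get('topics') or item.get('tag_list') or [])
        for item in items
    ]
    return "\n\n".join(snippets)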
- If you call process_search_results(results), it uses the default of 10 results.
- To use a different number, call process_search_results(results, max_snippets=20) (for example).
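To see what the summarizer will receive, it helps to run the formatting step on a hand-written payload. The sketch below is a variant of the function above rather than a copy: `format_project_tags` is a hypothetical helper that prefixes each line with the project name and skips projects without tags. The sample data is invented, and the code assumes each project object carries `name` and `topics` (or the older `tag_list`) fields.

```python
def format_project_tags(projects, max_snippets=10):
    """Format up to max_snippets projects as 'name: tag1, tag2' lines."""
    lines = []
    for project in projects[:max_snippets]:
        tags = project.get("topics") or project.get("tag_list") or []
        if tags:  # skip projects that have no tags at all
            lines.append(f"{project.get('name', 'unknown')}: {', '.join(tags)}")
    return "\n".join(lines)


# Invented sample data shaped like a GitLab projects response.
sample = [
    {"name": "alpha", "topics": ["python", "cli"]},
    {"name": "beta", "tag_list": ["ml"]},
    {"name": "gamma", "topics": []},
]
print(format_project_tags(sample))
# alpha: python, cli
# beta: ml
```

Including the project name gives the LLM some context for each tag group, at the cost of a slightly longer prompt.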
Step 3: Build the Graphbit Workflow¶
- Run the GitLab projects search and process the results:
search_results = gitlab_projects_search("python")
snippets_text = process_search_results(search_results, max_snippets=10)
- Create a Graphbit agent node for summarization:
from graphbit import Node, Workflow
agent = Node.agent(
    name="Summarizer",
    prompt=f"Summarize projects tags: {snippets_text}"
)
workflow = Workflow("Gitlab Projects Tags Summarizer Workflow")
workflow.add_node(agent)
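Because the tag text is interpolated straight into the prompt, a large result set produces a long prompt. One option is to cap the text before building the node. The sketch below is plain Python and does not touch the Graphbit API; `build_summary_prompt` and the 2000-character default are assumptions chosen for illustration.

```python
def build_summary_prompt(snippets_text, max_chars=2000):
    """Build the summarization prompt, truncating overly long tag text."""
    text = snippets_text.strip()
    if not text:
        raise ValueError("No project tags to summarize")
    if len(text) > max_chars:
        # Cut at the last newline before the limit so no tag line is split;
        # fall back to a hard cut when the text has no newline in range.
        cut = text.rfind("\n", 0, max_chars)
        text = text[: cut if cut > 0 else max_chars]
    return f"Summarize projects tags: {text}"


prompt = build_summary_prompt("python,cli\n\nml,data")
print(prompt)
```

The returned string could then be passed as the `prompt` argument of `Node.agent` in place of the f-string.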
Step 4: Orchestrate and Execute with Graphbit¶
- Initialize Graphbit and configure your LLM:
from graphbit import LlmConfig, Executor
from dotenv import load_dotenv
import os
load_dotenv()
llm_config = LlmConfig.openai(os.getenv("OPENAI_API_KEY"))
executor = Executor(llm_config)
- Run the workflow and retrieve the summary:
result = executor.execute(workflow)
if result.is_success():
    print("Summary:", result.get_node_output("Summarizer"))
else:
    print("Workflow failed:", result.state())
Full Example¶
import os

import requests
from dotenv import load_dotenv
from graphbit import Node, Workflow, LlmConfig, Executor
# Load environment variables from .env file
load_dotenv()
GITLAB_TOKEN = os.getenv("GITLAB_TOKEN")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
def gitlab_projects_search(query, sort="stars", order="desc", per_page=50):
    """
    Search GitLab projects (repositories).

    Args:
        query: Search query (e.g., "machine learning")
        sort: One of 'stars', 'name', 'created', 'updated', 'last_activity'
        order: 'asc' or 'desc'
        per_page: Number of results per page (max 100)

    Returns:
        list: JSON-decoded list of project objects
    """
    url = "https://gitlab.com/api/v4/projects"
    # The token is optional; send the header only when one is configured.
    if GITLAB_TOKEN:
        headers = {
            "PRIVATE-TOKEN": GITLAB_TOKEN,
        }
    else:
        headers = {}
    # Map the friendly sort names to the API's order_by values.
    sort_mapping = {
        "stars": "star_count",
        "updated": "updated_at",
        "created": "created_at",
        "name": "name",
        "last_activity": "last_activity_at"
    }
    order_by = sort_mapping.get(sort, "star_count")
    params = {
        "search": query,
        "order_by": order_by,
        "sort": order,
        "per_page": per_page
    }
    # A timeout keeps the call from hanging on a stalled connection.
    response = requests.get(url, headers=headers, params=params, timeout=30)
    response.raise_for_status()
    return response.json()
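def process_search_results(results, max_snippets=10):
    """
    Joins the tags of up to max_snippets projects (default: 10) into one string.
    """
    # The projects endpoint returns a JSON list, so slice it directly.
    items = results[:max_snippets]
    snippets = [
        # 'tag_list' is deprecated in favor of 'topics'; accept either.
        ','.join(item.get('topics') or item.get('tag_list') or [])
        for item in items
    ]
    return "\n\n".join(snippets)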
search_results = gitlab_projects_search("python")
snippets_text = process_search_results(search_results, max_snippets=10)
agent = Node.agent(
    name="Summarizer",
    prompt=f"Summarize projects tags: {snippets_text}"
)
workflow = Workflow("Gitlab Projects Tags Summarizer Workflow")
workflow.add_node(agent)
# Configure the LLM and executor as in Step 4.
llm_config = LlmConfig.openai(OPENAI_API_KEY)
executor = Executor(llm_config)
result = executor.execute(workflow)
if result.is_success():
    print("Summary:", result.get_node_output("Summarizer"))
else:
    print("Workflow failed:", result.state())
This connector pattern lets you bring external data, here GitLab project metadata, into your AI workflows, orchestrated by Graphbit.