Confluence
Connect QDrant Loader to Confluence to index team documentation, knowledge bases, and collaborative content. This guide covers setup for both Confluence Cloud and Confluence Data Center.
What Gets Processed
When you connect to Confluence, QDrant Loader can process:
- Page content - All text content from Confluence pages
- Page hierarchy - Parent/child relationships between pages
- Attachments - Files attached to pages (PDFs, Office docs, images)
- Comments - Page comments and discussions
- Page metadata - Authors, creation dates, labels, versions
- Space information - Space descriptions and metadata
Authentication Setup
Confluence Cloud
API Token (Recommended)
- Create an API token:
  - Go to Atlassian Account Settings
  - Click "Create API token"
  - Give it a descriptive name like "QDrant Loader"
  - Copy the token
- Set environment variables:
bash
export CONFLUENCE_URL=https://your-domain.atlassian.net/wiki
export CONFLUENCE_EMAIL=your-email@company.com
export CONFLUENCE_TOKEN=your_api_token_here
Confluence Data Center
Personal Access Token
- Create a Personal Access Token:
  - Go to Confluence → Settings → Personal Access Tokens
  - Click "Create token"
  - Set appropriate permissions: READ for spaces and pages
  - Copy the token
- Set environment variables:
bash
export CONFLUENCE_URL=https://confluence.your-company.com
export CONFLUENCE_TOKEN=your_personal_access_token
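The two schemes differ mainly in how the credentials are sent: Cloud uses HTTP Basic auth with the account email and API token, while Data Center sends the Personal Access Token as a Bearer header. The sketch below picks the matching curl arguments for a manual check. The function name `confluence_curl_auth` is hypothetical and not part of QDrant Loader, and the `server` deployment type is not covered.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of QDrant Loader): print the curl
# authentication arguments for a deployment type, one per line.
confluence_curl_auth() {
  local deployment_type="$1"
  case "$deployment_type" in
    cloud)
      # Cloud: HTTP Basic auth with account email and API token
      printf '%s\n' "-u" "${CONFLUENCE_EMAIL}:${CONFLUENCE_TOKEN}"
      ;;
    datacenter)
      # Data Center: Personal Access Token sent as a Bearer token
      printf '%s\n' "-H" "Authorization: Bearer ${CONFLUENCE_TOKEN}"
      ;;
    *)
      echo "unknown deployment type: ${deployment_type}" >&2
      return 1
      ;;
  esac
}

# Usage: confluence_curl_auth cloud
#   prints "-u" and "<email>:<token>" on separate lines
```

Keeping the choice in one place avoids sending a Cloud API token as a Bearer header, which is an easy mistake when switching between instances.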
Configuration
QDrant Loader uses a project-based configuration structure. Each project can have multiple Confluence sources.
Basic Configuration
projects:
  my-project:
    display_name: "My Documentation Project"
    description: "Company documentation and knowledge base"
    collection_name: "my-docs"
    sources:
      confluence:
        company-wiki:
          base_url: "${CONFLUENCE_URL}"
          deployment_type: "cloud" # or "datacenter"
          space_key: "DOCS"
          email: "${CONFLUENCE_EMAIL}" # Required for Cloud
          token: "${CONFLUENCE_TOKEN}"
          content_types:
            - "page"
            - "blogpost"
          include_labels: []
          exclude_labels: []
          enable_file_conversion: true
          download_attachments: true
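The configuration refers to credentials through `${VAR}` placeholders, so an unset variable is an easy source of failures. The sketch below lists every placeholder in a config file and reports the ones missing from the environment. The function name `check_placeholders` and the file name `config.yaml` are assumptions for illustration. The check relies on `printenv`, so it only sees exported variables.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list every ${VAR} placeholder in a config file
# and report the ones that are unset or empty in the environment.
check_placeholders() {
  local config_file="$1" name missing=0
  # Extract unique placeholder names such as CONFLUENCE_URL
  for name in $(grep -oE '\$\{[A-Za-z_][A-Za-z0-9_]*\}' "$config_file" \
                | tr -d '${}' | sort -u); do
    # printenv prints nothing for variables that are not exported
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: ${name}"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: check_placeholders config.yaml && echo "all placeholders are set"
```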
Advanced Configuration
projects:
  documentation:
    display_name: "Documentation Hub"
    description: "All company documentation sources"
    collection_name: "docs-hub"
    sources:
      confluence:
        # Main documentation space
        main-docs:
          base_url: "${CONFLUENCE_URL}"
          deployment_type: "cloud"
          space_key: "DOCS"
          email: "${CONFLUENCE_EMAIL}"
          token: "${CONFLUENCE_TOKEN}"
          content_types:
            - "page"
            - "blogpost"
          include_labels: []
          exclude_labels:
            - "draft"
            - "obsolete"
          enable_file_conversion: true
          download_attachments: true
        # Technical documentation space
        tech-docs:
          base_url: "${CONFLUENCE_URL}"
          deployment_type: "cloud"
          space_key: "TECH"
          email: "${CONFLUENCE_EMAIL}"
          token: "${CONFLUENCE_TOKEN}"
          content_types:
            - "page"
          include_labels:
            - "api"
            - "architecture"
          exclude_labels:
            - "deprecated"
          enable_file_conversion: true
          download_attachments: true
Multiple Confluence Instances
projects:
  multi-confluence:
    display_name: "Multi-Instance Documentation"
    description: "Documentation from multiple Confluence instances"
    collection_name: "multi-docs"
    sources:
      confluence:
        # Cloud instance
        cloud-wiki:
          base_url: "https://company.atlassian.net/wiki"
          deployment_type: "cloud"
          space_key: "DOCS"
          email: "${CONFLUENCE_EMAIL}"
          token: "${CONFLUENCE_TOKEN}"
          content_types: ["page", "blogpost"]
          include_labels: []
          exclude_labels: []
          enable_file_conversion: true
          download_attachments: true
        # Data Center instance
        datacenter-wiki:
          base_url: "https://internal-confluence.company.com"
          deployment_type: "datacenter"
          space_key: "INTERNAL"
          token: "${CONFLUENCE_PAT}"
          content_types: ["page"]
          include_labels: []
          exclude_labels: []
          enable_file_conversion: true
          download_attachments: true
Configuration Options
Required Settings
Option | Type | Description | Example |
---|---|---|---|
base_url | string | Confluence base URL | https://company.atlassian.net/wiki |
deployment_type | string | Deployment type: cloud, datacenter, or server | cloud |
space_key | string | Confluence space key to process | DOCS |
token | string | API token or Personal Access Token | ${CONFLUENCE_TOKEN} |
Cloud-Specific Settings
Option | Type | Description | Required for Cloud |
---|---|---|---|
email | string | Email associated with Confluence account | Yes |
Content Filtering
Option | Type | Description | Default |
---|---|---|---|
content_types | list | Content types to process | ["page", "blogpost"] |
include_labels | list | Only process content with these labels | [] (all) |
exclude_labels | list | Skip content with these labels | [] |
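To make the interaction between `include_labels` and `exclude_labels` concrete, here is a sketch of one plausible reading of the table above. It is not the loader's actual code. It assumes that any excluded label skips the page, that exclusion wins over inclusion, and that one matching include label is enough. The function name `label_allowed` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch of the label filter under assumed semantics (see lead-in).
# Arguments are comma-separated lists: page labels, include, exclude.
# Returns 0 if the page would be processed, 1 if it would be skipped.
label_allowed() {
  local page_labels=",$1," include="$2" exclude="$3" label
  local IFS=','
  # Assumption: any excluded label skips the page, and exclusion wins
  for label in $exclude; do
    case "$page_labels" in *",$label,"*) return 1 ;; esac
  done
  # An empty include list means "all content"
  [ -z "$include" ] && return 0
  # Assumption: one matching include label is enough
  for label in $include; do
    case "$page_labels" in *",$label,"*) return 0 ;; esac
  done
  return 1
}

# Usage: label_allowed "api,draft" "api" "draft" || echo "skipped"
```

If the loader requires all include labels to match instead, the second loop would need to return 1 on the first missing label.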
File Processing
Option | Type | Description | Default |
---|---|---|---|
enable_file_conversion | bool | Enable file conversion for attachments | true |
download_attachments | bool | Download and process attachments | true |
Usage Examples
Documentation Team
projects:
  docs-team:
    display_name: "Documentation Team"
    description: "All documentation spaces"
    collection_name: "documentation"
    sources:
      confluence:
        user-guides:
          base_url: "${CONFLUENCE_URL}"
          deployment_type: "cloud"
          space_key: "GUIDES"
          email: "${CONFLUENCE_EMAIL}"
          token: "${CONFLUENCE_TOKEN}"
          content_types: ["page"]
          include_labels: ["published"]
          exclude_labels: ["draft", "archive"]
          enable_file_conversion: true
          download_attachments: true
        api-docs:
          base_url: "${CONFLUENCE_URL}"
          deployment_type: "cloud"
          space_key: "API"
          email: "${CONFLUENCE_EMAIL}"
          token: "${CONFLUENCE_TOKEN}"
          content_types: ["page", "blogpost"]
          include_labels: ["api", "reference"]
          exclude_labels: ["deprecated"]
          enable_file_conversion: true
          download_attachments: true
Software Development Team
projects:
  dev-team:
    display_name: "Development Team"
    description: "Technical documentation and architecture"
    collection_name: "dev-docs"
    sources:
      confluence:
        architecture:
          base_url: "${CONFLUENCE_URL}"
          deployment_type: "cloud"
          space_key: "ARCH"
          email: "${CONFLUENCE_EMAIL}"
          token: "${CONFLUENCE_TOKEN}"
          content_types: ["page"]
          include_labels: ["architecture", "design"]
          exclude_labels: ["obsolete"]
          enable_file_conversion: true
          download_attachments: true
        development:
          base_url: "${CONFLUENCE_URL}"
          deployment_type: "cloud"
          space_key: "DEV"
          email: "${CONFLUENCE_EMAIL}"
          token: "${CONFLUENCE_TOKEN}"
          content_types: ["page", "blogpost"]
          include_labels: ["development", "guidelines"]
          exclude_labels: ["draft"]
          enable_file_conversion: true
          download_attachments: true
Testing and Validation
Initialize and Test Configuration
# Initialize the project (creates collection if needed)
qdrant-loader --workspace . init
# Test ingestion with your Confluence configuration
qdrant-loader --workspace . ingest --project my-project
# Check project status
qdrant-loader --workspace . project status --project-id my-project
# List all configured projects
qdrant-loader --workspace . project list
# Validate project configuration
qdrant-loader --workspace . project validate --project-id my-project
Debug Confluence Processing
# Enable debug logging
qdrant-loader --workspace . --log-level DEBUG ingest --project my-project
# Process specific project only
qdrant-loader --workspace . ingest --project my-project
# Process specific source within a project
qdrant-loader --workspace . ingest --project my-project --source-type confluence --source company-wiki
Troubleshooting
Common Issues
Authentication Failures
Problem: 401 Unauthorized or 403 Forbidden
Solutions:
# Test API token manually for Cloud
curl -u "your-email@company.com:your-api-token" \
"https://your-domain.atlassian.net/wiki/rest/api/space"
# Test Personal Access Token for Data Center
curl -H "Authorization: Bearer your-personal-access-token" \
"https://confluence.company.com/rest/api/space"
Check your configuration:
- Ensure deployment_type matches your Confluence instance
- For Cloud: verify both email and token are set
- For Data Center: verify token (Personal Access Token) is set
- Ensure the token has appropriate permissions
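When running the manual checks above, it can help to look only at the HTTP status code. In standard HTTP semantics, 401 means the credentials were not accepted and 403 means they were accepted but lack permission. The sketch below maps a status code to a hint drawn from this checklist. The function name `explain_status` is hypothetical, and the usage comment assumes the Cloud environment variables from the setup section are exported.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn the HTTP status from a manual API check
# into a hint based on the checklist above.
explain_status() {
  case "$1" in
    200) echo "OK: credentials accepted" ;;
    401) echo "401 Unauthorized: check email and token (Cloud) or the Personal Access Token (Data Center)" ;;
    403) echo "403 Forbidden: credentials accepted but permissions are insufficient" ;;
    *)   echo "Unexpected status: $1" ;;
  esac
}

# Usage (requires network access):
#   status="$(curl -s -o /dev/null -w '%{http_code}' \
#     -u "${CONFLUENCE_EMAIL}:${CONFLUENCE_TOKEN}" \
#     "${CONFLUENCE_URL}/rest/api/space")"
#   explain_status "$status"
```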
Space Access Issues
Problem: Space not found or No permission to access space
Solutions:
# List accessible spaces for Cloud
curl -u "your-email:your-token" \
"https://your-domain.atlassian.net/wiki/rest/api/space" | jq '.results[].key'
# List accessible spaces for Data Center
curl -H "Authorization: Bearer your-token" \
"https://confluence.company.com/rest/api/space" | jq '.results[].key'
Check your configuration:
- Verify the space_key exists and is accessible
- Ensure your account has read permissions for the space
- Check that the space key is correct (case-sensitive)
Configuration Issues
Problem: Configuration validation errors
Solutions:
- Verify project structure:
yaml
projects:
  your-project: # Project ID
    sources:
      confluence:
        source-name: # Source name
          base_url: "..."
          # ... other settings
- Check required fields:
  - base_url: Must include /wiki for Cloud instances
  - deployment_type: Must be cloud, datacenter, or server
  - space_key: Must be a valid space key
  - token: Must be set via environment variable or directly
- Validate environment variables:
bash
echo $CONFLUENCE_URL
echo $CONFLUENCE_EMAIL
echo $CONFLUENCE_TOKEN
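Echoing the token prints the secret to the terminal, where it can end up in scrollback or screen shares. As an alternative, the sketch below reports whether each variable is set and how long its value is, without printing the value. The function name `report_env` is hypothetical. It uses `printenv`, so it only sees exported variables.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report whether each named variable is set
# without printing its value. Returns 1 if any variable is missing.
report_env() {
  local name value status=0
  for name in "$@"; do
    # printenv prints nothing for variables that are not exported
    value="$(printenv "$name" || true)"
    if [ -n "$value" ]; then
      echo "${name}: set (${#value} characters)"
    else
      echo "${name}: NOT SET"
      status=1
    fi
  done
  return "$status"
}

# Usage: report_env CONFLUENCE_URL CONFLUENCE_EMAIL CONFLUENCE_TOKEN
```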
Rate Limiting
Problem: 429 Too Many Requests
Solutions:
The Confluence connector automatically handles rate limiting, but you can:
- Check your API usage in Atlassian Admin Console
- Reduce concurrent processing by processing fewer projects simultaneously
- Contact your Confluence administrator if limits are too restrictive
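The connector handles rate limits during ingestion, but the manual curl checks in this guide do not. A minimal sketch of retrying a command with exponential backoff is shown below. The function name `retry_with_backoff` is hypothetical. It retries on any non-zero exit, not only on 429, so keep the attempt count small.

```bash
#!/usr/bin/env bash
# Hypothetical helper for manual API checks: run a command and retry
# with a doubling delay while it keeps failing.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Example: --fail makes curl exit non-zero on HTTP errors such as 429
#   retry_with_backoff 5 2 curl --fail -s -u "email:token" \
#     "https://domain.atlassian.net/wiki/rest/api/space"
```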
Large Space Performance
Problem: Processing takes too long or times out
Solutions:
- Filter content with labels:
yaml
confluence:
  large-space:
    space_key: "LARGE"
    include_labels: ["important", "current"]
    exclude_labels: ["archive", "deprecated"]
- Process specific content types:
yaml
confluence:
  pages-only:
    space_key: "DOCS"
    content_types: ["page"] # Skip blogposts
- Disable attachment processing temporarily:
yaml
confluence:
  no-attachments:
    space_key: "DOCS"
    download_attachments: false
Debugging Commands
# Check Confluence API connectivity
curl -u "email:token" "https://domain.atlassian.net/wiki/rest/api/space" | jq '.size'
# List pages in a space
curl -u "email:token" \
"https://domain.atlassian.net/wiki/rest/api/space/DOCS/content/page" | \
jq '.results[].title'
# Check specific page content
curl -u "email:token" \
"https://domain.atlassian.net/wiki/rest/api/content/PAGE_ID?expand=body.storage"
Monitoring and Processing
Check Processing Status
# View project status
qdrant-loader --workspace . project status
# Check specific project
qdrant-loader --workspace . project status --project-id my-project
# List all projects
qdrant-loader --workspace . project list
Configuration Management
# View current configuration
qdrant-loader --workspace . config
# Validate all projects
qdrant-loader --workspace . project validate
Best Practices
Content Organization
- Use descriptive space keys - Make spaces easy to identify
- Apply consistent labeling - Use labels for categorization and filtering
- Organize with page hierarchy - Use parent/child relationships
- Archive old content - Move outdated content to archive spaces
Configuration Management
- Use environment variables - Keep sensitive data out of config files
- Organize by teams/purposes - Create separate projects for different use cases
- Filter content appropriately - Use labels to include/exclude content
- Test configurations - Validate before running full ingestion
Security Considerations
- Use API tokens - Prefer tokens over passwords
- Limit token scope - Grant minimal necessary permissions
- Rotate tokens regularly - Update tokens periodically
- Monitor access - Track which content is being accessed
- Use environment variables - Never commit tokens to version control
Performance Optimization
- Filter aggressively - Only process content you need
- Use appropriate labels - Filter by labels to reduce processing
- Process incrementally - Run regular updates rather than full reprocessing
- Monitor resource usage - Watch memory and network usage during processing
Related Documentation
- Configuration Reference - Complete configuration options
- File Conversion - Processing Confluence attachments
- Troubleshooting - Common issues and solutions
- MCP Server - Using processed Confluence content with AI tools
- Project Management - Managing multiple projects
Ready to connect your Confluence instance? Start with the basic configuration above and customize based on your space structure and content needs.