L10n system
AI-Powered Localization (l10n) System¶
Overview¶
Kitsune's AI-powered localization system automates the translation of knowledge base articles from English to multiple target languages using Large Language Models (LLMs). The system supports multiple translation strategies and workflows to accommodate different locales and quality requirements.
Architecture¶
Core Components¶
1. Translation Strategies (kitsune/wiki/strategies.py
)¶
The l10n system implements a strategy pattern with three main translation approaches:
ManualTranslationStrategy - Traditional human-only translation workflow - Handles marking content as "ready for localization" - Sends notifications to translators via email - Used for locales without AI support
AITranslationStrategy
- Fully automated AI translation
- Translates content automatically using LLMs
- Auto-publishes translations without human review
- Used for locales in AI_ENABLED_LOCALES
setting
HybridTranslationStrategy
- AI translation with human review
- Creates AI translations but requires human approval
- Includes grace period for review before auto-approval
- Used for locales in HYBRID_ENABLED_LOCALES
setting
2. LLM Translation Engine (kitsune/llm/l10n/translator.py
)¶
The core translation function uses LangChain to orchestrate LLM calls:
def translate(doc: "Document", target_locale: str) -> dict[str, dict[str, Any]]:
"""
Translates summary, keywords, content, and conditionally title
"""
Key features:
- Uses Gemini 2.5 Pro model by default (L10N_LLM_MODEL
)
- Supports incremental translation by comparing with prior versions
- Preserves special wiki markup and protected terms
- Returns both translation and explanation for each field
3. Translation Prompts (kitsune/llm/l10n/prompt.py
)¶
Sophisticated prompt engineering ensures high-quality translations:
- Protected Terms: Preserves Mozilla product names and special markup (
L10N_PROTECTED_TERMS
) - Wiki Syntax Preservation: Maintains links, images, templates, and formatting
- Incremental Translation: Reuses previous translations for unchanged content
- Context Awareness: Understands technical documentation context
4. Content Managers (kitsune/wiki/content_managers.py
)¶
Different content managers handle the publishing workflow:
ManualContentManager
: Human-driven workflowAIContentManager
: Automated publishingHybridContentManager
: Mixed human/AI workflow
5. Services (kitsune/wiki/services.py
)¶
TranslationQueryBuilder
- Centralized repository for complex translation discovery queries
- Provides methods to find stale and missing translations
- Implements document-aware routing logic for AI vs HYBRID flows
- Key methods:
- get_stale_docs_hybrid()
: Finds non-archived stale translations in HYBRID locales
- get_stale_docs_ai()
: Finds all stale translations for AI flow (all AI docs + archived HYBRID docs)
- get_missing_docs_hybrid()
: Finds non-archived documents without translations in HYBRID locales
- get_missing_docs_ai()
: Finds documents without translations for AI flow
- get_pending_translations()
: Finds unreviewed translations exceeding grace period
- get_obsolete_translations()
: Finds translations that are no longer useful
BaseTranslationService - Shared processing logic for translation services - Handles metadata building and strategy execution - Uses TranslationQueryBuilder for document discovery
StaleTranslationService
- Identifies outdated translations that need updates
- Processes batch updates using appropriate strategies
- Configurable via STALE_TRANSLATION_THRESHOLD_DAYS
and STALE_TRANSLATION_BATCH_SIZE
- Method: process_stale(limit, strategy)
- optionally filters by AI or HYBRID strategy
MissingTranslationService
- Creates initial translations for documents without existing translations
- Uses document archive status for smart routing
- Method: process_missing(limit, strategy)
- creates translations for missing docs
HybridTranslationService
- Manages hybrid workflow automation
- Auto-approves translations after grace period (HYBRID_REVIEW_GRACE_PERIOD
)
- Rejects obsolete machine translations
Translation Triggers¶
The system responds to several trigger events (TranslationTrigger
):
REVIEW_REVISION
: When a revision is approved and marked as ready for translationMARK_READY_FOR_L10N
: Manual request to mark content for translationTRANSLATE
: Direct translation requestSTALE_TRANSLATION_UPDATE
: Periodic updates of outdated contentINITIAL_TRANSLATION
: Creating new translations for documents without existing translations
Document-Aware Routing¶
The system intelligently routes translations based on document status and target locale:
Archived Documents in HYBRID Locales
- Archived knowledge base articles in HYBRID locales automatically use the AI flow
- This reduces review burden for content that's no longer actively maintained
- Implemented via TranslationStrategyFactory.get_method_for_document()
Routing Logic:
if locale in HYBRID_ENABLED_LOCALES and document.is_archived:
return TranslationMethod.AI # Skip human review for archived docs
else:
return get_method_for_locale(locale) # Standard locale-based routing
Configuration¶
Settings¶
Key configuration settings control l10n behavior:
# Locales using different strategies
AI_ENABLED_LOCALES = ['de', 'fr', 'es'] # Example
HYBRID_ENABLED_LOCALES = ['it', 'pt-BR'] # Example
# Stale translation detection
STALE_TRANSLATION_THRESHOLD_DAYS = 30
STALE_TRANSLATION_BATCH_SIZE = 50
# Hybrid workflow settings
HYBRID_REVIEW_GRACE_PERIOD = 72 # hours
Protected Terms (kitsune/llm/l10n/config.py
)¶
The system preserves Mozilla-specific terms and technical markup:
L10N_PROTECTED_TERMS = [
"Firefox", "Mozilla", "Thunderbird", "ESR",
"Firefox Beta", "Firefox Nightly",
# ... many more
]
Workflows¶
1. AI Translation Workflow¶
graph TD
A[English Article Updated] --> B[Mark Ready for L10N]
B --> C[Auto-detect AI Locales]
C --> D[Generate AI Translation]
D --> E[Auto-publish Translation]
E --> F[Send Notifications]
2. Hybrid Translation Workflow¶
graph TD
A[English Article Updated] --> B[Mark Ready for L10N]
B --> C[Auto-detect Hybrid Locales]
C --> D[Generate AI Translation]
D --> E[Create Pending Revision]
E --> F[Wait for Human Review]
F --> G{Reviewed?}
G -->|Yes| H[Publish/Reject]
G -->|No - Grace Period Expired| I[Auto-approve]
3. Manual Translation Workflow¶
graph TD
A[English Article Updated] --> B[Mark Ready for L10N]
B --> C[Send Email Notifications]
C --> D[Human Translator Works]
D --> E[Submit Translation]
E --> F[Human Review]
F --> G[Publish Translation]
Management Commands¶
Process Stale Translations¶
python manage.py process_stale_translations --limit 100
Updates outdated translations across all AI/Hybrid enabled locales. The service automatically routes documents based on their archive status and locale configuration.
Create Missing Translations¶
python manage.py create_missing_translations --limit 50
Creates initial translations for English documents that don't have translations yet. Uses document-aware routing: - AI locales: translates all documents (archived and non-archived) - HYBRID locales (non-archived docs): creates translations for human review - HYBRID locales (archived docs): auto-publishes via AI flow
Publish Pending Translations¶
python manage.py publish_pending_translations
Auto-approves hybrid translations that have exceeded the review grace period.
Update L10N Metrics¶
python manage.py update_l10n_metric
python manage.py update_l10n_coverage_metrics
python manage.py update_l10n_contributor_metrics
Updates various localization metrics for dashboards and reporting.
Report L10N Metrics¶
python manage.py report_l10_metrics
Generates a comprehensive report showing localization coverage across core locales, most-requested locales, and individual locales. Shows both availability percentages and up-to-date status for all localizable knowledge base articles.
Key Data Structures¶
TranslationRequest¶
@dataclass
class TranslationRequest:
revision: Revision
trigger: TranslationTrigger
target_locale: str = ""
method: TranslationMethod = TranslationMethod.MANUAL
asynchronous: bool = False
Usage Example¶
from kitsune.wiki.strategies import TranslationStrategyFactory
factory = TranslationStrategyFactory()
request = TranslationRequest(revision=rev, target_locale='de', method=TranslationMethod.AI)
result = factory.execute(request)