ISO 5060: The New International Standard for Translation Quality Evaluation
For decades, the translation industry had no standard way to measure quality. Every agency and enterprise picked its own approach: LISA QA, SAE J2450, MQM, or a custom scorecard. Comparing quality across vendors was nearly impossible.
ISO 5060:2024 changes that. It's the first international standard specifically focused on evaluating translation quality, and it consolidates the best ideas from MQM into an official ISO framework that organizations worldwide can adopt.
What is ISO 5060?
Officially titled "Translation services — Evaluation of translation output — General guidance," ISO 5060 provides a framework for assessing translation quality. It's different from ISO 17100 (which covers translation service requirements) or ISO 18587 (post-editing requirements). Those standards tell you how to produce translations. ISO 5060 tells you how to measure them.
That distinction matters more than it sounds.
Key Features of ISO 5060
| Feature | Description |
|---|---|
| Error typology | Harmonized with MQM first-level categories |
| Severity levels | Critical, Major, Minor classifications |
| Evaluation phases | Pre-evaluation, evaluation, post-evaluation |
| Scoring models | Guidance on quality scoring calculations |
| Evaluator requirements | Qualifications and training standards |
Why It Matters
Before ISO 5060, quality fragmentation created real problems:
- Comparing quality across vendors? Apples and oranges.
- Establishing industry benchmarks? No common baseline.
- Training evaluators consistently? Every company had different criteria.
- Demonstrating compliance to clients? "We have a QA process" isn't convincing.
ISO 5060 gives the entire industry a shared reference point. That alone is worth the price of the standard.
ISO 5060 and MQM: The Connection
MQM (Multidimensional Quality Metrics) served as the primary foundation for ISO 5060. The standard's error typology is explicitly harmonized with MQM's first-level categories.
Error Categories in ISO 5060
| Category | Subcategories | Description |
|---|---|---|
| Accuracy | Mistranslation, Omission, Addition, Untranslated | Meaning transfer errors |
| Fluency | Grammar, Spelling, Typography, Punctuation | Target language errors |
| Terminology | Wrong term, Inconsistent, Unapproved | Technical vocabulary errors |
| Style | Register, Unidiomatic, Inconsistent style | Stylistic issues |
| Locale | Date/Time, Number, Currency, Measurement | Locale convention errors |
If you've used MQM, this will look familiar. That's the point.
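As a rough sketch, the category table above can be encoded as a lookup structure so that annotations are checked against the agreed typology before they are counted. The names `ERROR_TYPOLOGY` and `is_valid_error_type` are illustrative assumptions, not identifiers from ISO 5060 or MQM.

```python
# Illustrative encoding of the category table above.
# The structure and names are assumptions made for this example.
ERROR_TYPOLOGY = {
    "Accuracy": {"Mistranslation", "Omission", "Addition", "Untranslated"},
    "Fluency": {"Grammar", "Spelling", "Typography", "Punctuation"},
    "Terminology": {"Wrong term", "Inconsistent", "Unapproved"},
    "Style": {"Register", "Unidiomatic", "Inconsistent style"},
    "Locale": {"Date/Time", "Number", "Currency", "Measurement"},
}


def is_valid_error_type(category: str, subcategory: str) -> bool:
    """Return True if the category/subcategory pair appears in the table."""
    return subcategory in ERROR_TYPOLOGY.get(category, set())


print(is_valid_error_type("Accuracy", "Omission"))  # True
print(is_valid_error_type("Style", "Omission"))     # False
```

Rejecting unknown pairs at annotation time keeps evaluators from inventing ad hoc labels that can't be compared later.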
Severity Levels
ISO 5060 uses a three-tier severity model:
Critical - Errors that could cause legal liability, safety risks, or severe misunderstanding. Think: incorrect medication dosage in a medical translation.
Major - Errors that significantly impact comprehension or user experience. Think: mistranslated key product feature.
Minor - Errors with minimal impact on understanding. Think: minor punctuation inconsistency.
The gap between these levels isn't just semantic. Under the penalty weights typically used with this model, a Critical error costs 25 points, a Major error 5, and a Minor error 1. One Critical error does as much damage to your score as 25 Minor ones.
How ISO 5060 Differs from Other Standards
ISO 5060 vs. ISO 17100
| Aspect | ISO 17100 | ISO 5060 |
|---|---|---|
| Focus | Translation service requirements | Quality evaluation |
| Scope | Full translation process | Evaluation methodology |
| Certification | LSP certification available | No certification scheme (guidance only) |
| Purpose | Service quality assurance | Output quality measurement |
Short version: ISO 17100 = how to produce. ISO 5060 = how to measure.
ISO 5060 vs. ISO 11669
ISO 11669:2024, released alongside ISO 5060, focuses on translation specifications — how to define requirements before translation begins. They're designed as a pair:
- ISO 11669 → Define quality requirements upfront
- ISO 5060 → Evaluate whether requirements were met
ISO 5060 vs. ISO 18587
ISO 18587 covers post-editing of machine translation output. ISO 5060 can be applied to evaluate the quality of any translation — human, machine, or post-edited. It doesn't care how the translation was produced.
The Three Phases of ISO 5060 Evaluation
Phase 1: Pre-Evaluation
Before evaluation begins, you need to establish three things:
Quality specifications
- Error categories to assess
- Severity weights for each category
- Passing threshold (e.g., MQM score of 95 or higher)
- Sample size and selection method
Evaluator selection
- Native target language speakers
- Subject matter expertise for specialized content
- Trained in the evaluation methodology
- Independent from original translators
Evaluation setup
- Tools and templates
- Reference materials (glossaries, style guides)
- Calibration process
Skipping this phase is the most common mistake. Organizations jump straight to evaluation without defining what "good" means for their content. Then they wonder why evaluators disagree.
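One way to make the pre-evaluation decisions concrete is to write them down as a single specification object and draw the evaluation sample from it. The sketch below is an assumption-laden illustration: `QualitySpec`, `select_sample`, the 20% sample rate, and simple random sampling are all choices made for this example, not requirements of the standard.

```python
import random
from dataclasses import dataclass, field


@dataclass(frozen=True)
class QualitySpec:
    """Hypothetical container for the pre-evaluation decisions."""
    categories: tuple = ("Accuracy", "Fluency", "Terminology", "Style", "Locale")
    severity_weights: dict = field(
        default_factory=lambda: {"Critical": 25, "Major": 5, "Minor": 1}
    )
    pass_threshold: float = 95.0
    sample_rate: float = 0.20  # fraction of segments to evaluate


def select_sample(segment_ids, spec, seed=0):
    """Pick a reproducible simple random sample of segments to evaluate."""
    ids = list(segment_ids)
    if not ids:
        return []
    size = max(1, round(len(ids) * spec.sample_rate))
    rng = random.Random(seed)  # fixed seed so the sample can be audited later
    return sorted(rng.sample(ids, size))


spec = QualitySpec()
sample = select_sample(range(100), spec)
print(len(sample))  # 20
```

Fixing the seed means the same segments can be re-drawn when someone questions a result.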
Phase 2: Evaluation
Evaluators work through these steps:
- Compare source and target segments
- Identify potential errors
- Classify errors by type and severity
- Document errors with annotations
- Assign penalty points based on severity
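The steps above map naturally onto a small record per error. This is a minimal sketch that assumes the typical 25/5/1 weights quoted in this article; `ErrorAnnotation` and its fields are hypothetical names, not a format defined by ISO 5060.

```python
from dataclasses import dataclass

# Typical weights quoted in this article; adjust to your own specification.
SEVERITY_WEIGHTS = {"Critical": 25, "Major": 5, "Minor": 1}


@dataclass
class ErrorAnnotation:
    """One documented error found while comparing source and target."""
    segment_id: int
    category: str
    subcategory: str
    severity: str
    note: str = ""

    @property
    def penalty(self) -> int:
        """Penalty points assigned from the severity level."""
        return SEVERITY_WEIGHTS[self.severity]


annotations = [
    ErrorAnnotation(3, "Accuracy", "Mistranslation", "Critical", "dosage changed"),
    ErrorAnnotation(7, "Terminology", "Wrong term", "Major"),
    ErrorAnnotation(9, "Fluency", "Punctuation", "Minor"),
]
total_penalty = sum(a.penalty for a in annotations)
print(total_penalty)  # 31
```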
Calibration matters here. ISO 5060 emphasizes it for good reason. Before production evaluation:
- Have multiple evaluators assess the same content
- Compare results and discuss discrepancies
- Update guidelines based on alignment
- Document calibration decisions
Two evaluators who disagree on everything aren't producing useful data. Calibration is what turns individual opinions into reliable measurement.
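To put a number on "compare results," one common option is an agreement statistic such as Cohen's kappa. This is a technique chosen for this example, not something the article attributes to ISO 5060. The sketch below assumes two evaluators each assigned one severity label (or "None") to the same list of segments.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two evaluators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both evaluators must label the same non-empty item list")
    n = len(labels_a)
    # Share of items where both evaluators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Agreement expected by chance, from each evaluator's label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both used a single identical label throughout
    return (observed - expected) / (1 - expected)


evaluator_a = ["Minor", "Major", "Minor", "Critical", "None", "Minor"]
evaluator_b = ["Minor", "Minor", "Minor", "Critical", "None", "Major"]
print(round(cohens_kappa(evaluator_a, evaluator_b), 3))  # 0.5
```

A value near 1 means the evaluators agree well beyond chance. A low value is a signal to revisit the guidelines before production evaluation starts.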
Phase 3: Post-Evaluation
Score calculation
Quality Score = 100 - (Total Penalty Points / Word Count × 100)
Typical penalty weights:
- Critical: 25 points
- Major: 5 points
- Minor: 1 point
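Here is the scoring formula as a short worked example, assuming the typical weights listed above. The function name `quality_score` and the sample counts are illustrative.

```python
# Typical weights from this article; your specification may differ.
SEVERITY_WEIGHTS = {"Critical": 25, "Major": 5, "Minor": 1}


def quality_score(error_counts, word_count, weights=SEVERITY_WEIGHTS):
    """Quality Score = 100 - (total penalty points / word count * 100)."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    penalty = sum(weights[severity] * n for severity, n in error_counts.items())
    return 100 - (penalty / word_count * 100)


# 1,000 evaluated words with 1 Critical, 2 Major, and 5 Minor errors:
# penalty = 25 + 10 + 5 = 40 points
score = quality_score({"Critical": 1, "Major": 2, "Minor": 5}, word_count=1000)
print(score)  # 96.0
```

Note that the formula as written has no lower bound, so a short text with several Critical errors can score below zero. Whether to clamp at zero is a choice to settle in your own specification.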
Reporting
- Overall quality score
- Error breakdown by category
- Error breakdown by severity
- Specific annotations with examples
- Trend analysis (if historical data exists)
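The two breakdowns in the report can be produced with simple counting. This sketch assumes each annotation is a dictionary with `category` and `severity` keys, which is a format invented for this example.

```python
from collections import Counter


def error_breakdown(annotations):
    """Count errors by category and by severity for the report."""
    by_category = Counter(a["category"] for a in annotations)
    by_severity = Counter(a["severity"] for a in annotations)
    return {"by_category": dict(by_category), "by_severity": dict(by_severity)}


sample_annotations = [
    {"category": "Accuracy", "severity": "Critical"},
    {"category": "Accuracy", "severity": "Minor"},
    {"category": "Fluency", "severity": "Minor"},
]
report = error_breakdown(sample_annotations)
print(report["by_category"])  # {'Accuracy': 2, 'Fluency': 1}
print(report["by_severity"])  # {'Critical': 1, 'Minor': 2}
```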
Feedback loop
- Share findings with translators
- Identify systematic issues
- Update training materials
- Adjust workflows as needed
The feedback loop is where most organizations drop the ball. They generate scores, file reports, and nothing changes. If your evaluation data doesn't lead to concrete improvements, you're doing expensive busywork.
Implementing ISO 5060 in Your Organization
Step 1: Assess Current State
Evaluate your existing quality processes. What methodology do you use? How well does it align with ISO 5060? What gaps need addressing? Do you have trained evaluators?
Step 2: Define Quality Tiers
Not all content requires the same rigor.
| Tier | Content Type | Evaluation Approach | Pass Threshold |
|---|---|---|---|
| Premium | Legal, medical, marketing | 100% human evaluation | 98+ |
| Standard | Technical documentation | 20% sample evaluation | 95+ |
| Basic | Internal communications | AI-assisted evaluation | 90+ |
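The thresholds in the table translate directly into a pass/fail check. The names `TIER_THRESHOLDS` and `passes` are illustrative, and the values are the example thresholds from this article rather than anything ISO 5060 mandates.

```python
# Example thresholds from the tier table above; set your own per content risk.
TIER_THRESHOLDS = {"Premium": 98.0, "Standard": 95.0, "Basic": 90.0}


def passes(tier: str, score: float) -> bool:
    """Return True if the quality score meets the tier's pass threshold."""
    return score >= TIER_THRESHOLDS[tier]


print(passes("Standard", 96.0))  # True
print(passes("Premium", 96.0))   # False
```

The same score can pass one tier and fail another, which is why the tier has to be fixed before evaluation rather than after.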
Step 3: Train Your Team
Evaluators need to understand ISO 5060 error categories and definitions, severity criteria with real examples, scoring methodology, calibration processes, and the tools they'll use.
According to ISO 5060, qualified evaluators should be able to "compare paired source and target language segments and judge translation quality based on MQM Error Typology criteria."
Step 4: Select Tools
Choose evaluation tools that support error annotation and categorization, severity assignment, score calculation, report generation, and trend tracking.
AI-powered tools like KTTC can automate much of the evaluation process while maintaining ISO 5060 compliance.
Step 5: Establish Calibration Processes
Regular calibration keeps things consistent:
- Weekly: Quick alignment checks
- Monthly: Full calibration sessions
- Quarterly: Methodology review and updates
Step 6: Document Everything
Create and maintain evaluation guidelines, an error examples database, calibration records, evaluator certifications, and a quality reports archive. ISO audits love documentation. More importantly, documentation is how institutional knowledge survives turnover.
ISO 5060 and AI-Powered LQA
AI in translation quality evaluation creates both opportunities and questions around ISO 5060 compliance.
Can AI Perform ISO 5060 Evaluation?
AI LQA tools can:
- Identify potential errors using NLP analysis
- Classify errors according to MQM categories
- Calculate quality scores automatically
- Generate reports at scale
But AI currently can't:
- Make the kind of cultural judgment calls humans make instinctively
- Evaluate creative or marketing content reliably
- Serve as sole evaluator for content where errors have legal consequences
- Replace human calibration oversight
Best Practice: Hybrid Approach
The practical answer is to combine AI speed with human judgment:
- AI first pass - Automated evaluation at scale
- Human verification - Review AI findings, especially critical errors
- Random sampling - Human spot-checks on AI-evaluated content
- Continuous calibration - Compare AI and human results, adjust AI models
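The routing logic behind this hybrid workflow can be sketched in a few lines. Everything here is an assumption for illustration: the function name, the 10% spot-check rate, and the rule that any AI-flagged Critical error goes to a human.

```python
import random


def needs_human_review(ai_severities, rng, spot_check_rate=0.10):
    """Decide whether a segment evaluated by AI should go to a human.

    ai_severities: severity labels the AI pass assigned to this segment.
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    # Human verification: always review anything the AI flagged as Critical.
    if "Critical" in ai_severities:
        return True
    # Random sampling: spot-check a fraction of everything else.
    return rng.random() < spot_check_rate


rng = random.Random(42)
ai_findings = {1: ["Minor"], 2: ["Critical", "Minor"], 3: []}
review_queue = [seg for seg, sev in ai_findings.items() if needs_human_review(sev, rng)]
print(2 in review_queue)  # True: the Critical finding is always reviewed
```

Comparing the human verdicts on the spot-checked segments against the AI's labels gives you the data for the continuous calibration step.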
This hybrid approach maintains ISO 5060 compliance while cutting evaluation time and cost dramatically. Most organizations we've seen report 60-80% time reduction without sacrificing accuracy.
Demonstrating ISO 5060 Compliance
For Language Service Providers
If you're an LSP, ISO 5060 compliance is a real competitive advantage — not just a checkbox. Document your evaluation methodology alignment, train and certify evaluators, maintain calibration records, and provide ISO 5060-compliant quality reports to clients. When clients compare two LSPs and one provides structured quality data, that bid usually wins.
For Enterprises
If you're buying translation services: include ISO 5060 requirements in RFPs, require quality reports in ISO 5060 format, audit vendor compliance periodically, and benchmark vendors using consistent metrics.
FAQ
What is ISO 5060 in translation?
ISO 5060:2024 is the first international standard specifically focused on evaluating translation quality. It provides a framework for assessing translated content, including error categorization (aligned with MQM), severity levels, evaluation phases, and scoring methodologies. Organizations use it to standardize how they measure and report translation quality.
How is ISO 5060 different from MQM?
ISO 5060 is based on MQM (Multidimensional Quality Metrics) but is an official ISO standard. MQM is a flexible, open framework that organizations can customize. ISO 5060 formalizes key MQM concepts into an international standard, providing official guidance on implementation, evaluator qualifications, and evaluation processes. They're complementary — you can use MQM tools while following ISO 5060 methodology.
Is ISO 5060 certification available?
As of 2025, ISO 5060 itself doesn't have a certification program like ISO 17100. Organizations can document their ISO 5060 compliance and include it in quality management systems. Some certification bodies may develop ISO 5060-aligned evaluator certification programs in the future.
What's the minimum score to pass ISO 5060 evaluation?
ISO 5060 doesn't mandate a specific passing score — that's up to your organization based on content type and risk. Common thresholds are 98+ for critical content (legal, medical), 95+ for standard business content, and 90+ for low-risk internal content. The standard provides guidance on setting appropriate thresholds.
Can AI tools be used for ISO 5060 evaluation?
Yes, AI tools can assist with ISO 5060 evaluation, particularly for initial screening and scale. The standard emphasizes qualified human evaluators for final quality decisions, especially for critical content. A hybrid approach (AI-assisted evaluation with human oversight) is the current best practice as of 2025: it maintains compliance while improving efficiency.
ISO 5060 isn't a silver bullet, but it solves a problem that's plagued the translation industry for years: how do you talk about quality in a way that everyone agrees on? Now there's an answer, and it's backed by ISO.
The practical advice: start with your highest-priority content, train your team on the fundamentals, and gradually expand coverage. Don't try to boil the ocean on day one.
Ready to implement ISO 5060-compliant quality evaluation? Try KTTC for AI-powered LQA with MQM-based error categorization and automated quality scoring.
