___ _ _
/ __| __ _ _ __ __ _| | |_ _ __ ___ _ __
\__ \/ _` | ' \/ _` | | _| ' \/ -_) ' \
|___/\__,_|_|_|_\__,_|_|\__|_|_|_\___|_|_|_|
Extract alternative translations (alt-trans) from XLIFF files and convert them to TMX format. Build comprehensive translation memories from fuzzy matches, machine translation suggestions, and other alternative translations stored in your CAT tool exports.
Perfect for: Translators • Translation Agencies • LSPs • Localization Managers
Works with: memoQ • SDL Trados • Wordfast • All TMX 1.4 tools
- Extract all alt-trans elements from XLIFF files
- Preserve all attributes (origin, match-quality, tool, tool-id, etc.)
- Inline formatting support (bpt, ept, ph elements preserved)
- Automatic language detection from XLIFF headers
- Optional match-quality export (can be toggled on/off)
- Two operation modes:
  - Individual mode (default): Each XLIFF → separate TMX file
  - Merge mode (--merge): All XLIFFs → one combined TMX
- Recursive directory processing with the -r flag
- Multiple file extensions supported automatically: .xlf, .xliff, .mxliff, .mqxliff, .sdlxliff
- Custom extension filtering with the --ext option
- Smart filename generation in merge mode:
  - Includes file count and TU count (e.g., combined_5files_234tus.tmx)
- Detailed statistics:
  - Per-file contribution breakdown
  - Percentage calculations
  - Success/failure reporting
- ✅ UTF-8 (with or without BOM)
- ✅ UTF-16 (LE/BE, with or without BOM)
- ✅ UTF-32 (LE/BE)
- ✅ Robust BOM detection
- ✅ Automatic encoding conversion
- ✅ memoQ (fixed namespace issues)
- ✅ SDL Trados
- ✅ Wordfast
- ✅ memoSource
- ✅ All TMX 1.4 compliant tools
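To make the alt-trans extraction listed above more concrete, here is a minimal sketch using only the Python standard library. It is not the tool's own implementation (which uses an XSLT stylesheet with lxml); the function name `extract_alt_trans` and the sample document are illustrative assumptions, and the sketch assumes XLIFF 1.2 input. Unlike the real tool, it flattens inline elements (bpt, ept, ph) to plain text.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace (assumption: the input is an XLIFF 1.2 document)
XLIFF_URI = "urn:oasis:names:tc:xliff:document:1.2"
NS = {"x": XLIFF_URI}

# Hypothetical sample document, for illustration only
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="en-US" target-language="de-DE"
        datatype="plaintext" original="demo.txt">
    <body>
      <trans-unit id="1">
        <source>Save file</source>
        <target>Datei speichern</target>
        <alt-trans match-quality="85" origin="TM">
          <source>Save files</source>
          <target>Dateien speichern</target>
        </alt-trans>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def _text(element):
    """Return the flattened text of an element, or '' if it is missing."""
    return "" if element is None else "".join(element.itertext()).strip()


def extract_alt_trans(xliff_text):
    """Collect every alt-trans entry together with its attributes."""
    root = ET.fromstring(xliff_text)
    rows = []
    for file_el in root.iter(f"{{{XLIFF_URI}}}file"):
        for unit in file_el.iter(f"{{{XLIFF_URI}}}trans-unit"):
            unit_source = unit.find("x:source", NS)
            for alt in unit.findall("x:alt-trans", NS):
                target = alt.find("x:target", NS)
                if target is None:
                    continue  # nothing to store without a target
                alt_source = alt.find("x:source", NS)
                rows.append({
                    "source_lang": file_el.get("source-language"),
                    "target_lang": file_el.get("target-language"),
                    # Fall back to the trans-unit source if alt-trans has none
                    "source": _text(alt_source if alt_source is not None else unit_source),
                    "target": _text(target),
                    # Keep origin, match-quality, tool-id, etc. as-is
                    "attributes": dict(alt.attrib),
                })
    return rows


for row in extract_alt_trans(SAMPLE):
    print(row)
```

Each returned dictionary corresponds to one candidate translation unit; the languages come from the enclosing file element, which mirrors the automatic language detection described above.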
# Clone the repository
git clone https://github.com/jiangweiatgithub/samaltmem.git
cd samaltmem
# Install dependencies
pip install -r requirements.txt

# Individual mode (default) - each XLIFF → separate TMX
python xliff_to_tmx_hybrid.py /path/to/xliff/files/ -r -v
# Merge mode - all XLIFFs → one combined TMX
python xliff_to_tmx_hybrid.py /path/to/xliff/files/ -r --merge -v
# With match-quality
python xliff_to_tmx_hybrid.py /path/to/xliff/files/ -r -m --merge

positional arguments:
  INPUT                    Input XLIFF file(s), directory(ies), or patterns

optional arguments:
  -h, --help               Show help message
  -o FILE, --output FILE   Output TMX file (merge mode only)
  --merge                  Merge all into one TMX (default: individual)
  -r, --recursive          Search directories recursively
  --ext EXT                Filter by file extension(s)
  -m, --export-match-quality
                           Include match-quality property
  -v, --verbose            Show detailed statistics
  --no-countdown           Skip countdown (for automation)
  --version                Show version number
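
To illustrate what the -m option changes in the output, the following sketch builds a small TMX 1.4 document with and without a match-quality property. It is an approximation under stated assumptions, not the tool's actual output: the function name `build_tmx`, the header values, and the property type `x-match-quality` are all hypothetical; check a real export for the exact property name the tool writes.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(units, src_lang, tgt_lang, export_match_quality=False):
    """Return a TMX 1.4 string for a list of source/target dictionaries."""
    tmx = ET.Element("tmx", version="1.4")
    # Header attributes required by TMX 1.4; the values are placeholders
    ET.SubElement(tmx, "header", {
        "creationtool": "example-sketch",
        "creationtoolversion": "0.0",
        "segtype": "sentence",
        "o-tmf": "xliff",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for unit in units:
        tu = ET.SubElement(body, "tu")
        quality = unit.get("match_quality")
        if export_match_quality and quality is not None:
            # Hypothetical property name; prop elements must precede tuv
            prop = ET.SubElement(tu, "prop", type="x-match-quality")
            prop.text = str(quality)
        for lang, text in ((src_lang, unit["source"]), (tgt_lang, unit["target"])):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


units = [{"source": "Save files", "target": "Dateien speichern", "match_quality": "85"}]
print(build_tmx(units, "en-US", "de-DE", export_match_quality=True))
print(build_tmx(units, "en-US", "de-DE"))
```

Leaving the property out keeps the TMX smaller and avoids importing scores that a target CAT tool may not interpret the same way; including it preserves the information for later filtering.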
python xliff_to_tmx_hybrid.py document.xlf -o output.tmx -m
# Each XLIFF → separate TMX
python xliff_to_tmx_hybrid.py /project/translations/ -r -v
# All XLIFFs → one combined TMX
python xliff_to_tmx_hybrid.py /project/translations/ -r --merge -v
# Process only memoQ files
python xliff_to_tmx_hybrid.py /project/ -r --ext .mxliff --ext .mqxliff --merge
# Combine multiple client projects
python xliff_to_tmx_hybrid.py /clients/*/translations/ -r -m --merge -o master_tm.tmx

- Extract fuzzy matches from XLIFF files
- Build personal translation memories
- Organize TMs by project
- Build master TMs from multiple projects
- Track contribution per project/translator
- Automate TM maintenance workflows
- Daily batch processing of incoming files
- Create reference TMs for specific domains
- Maintain separate TMs per client
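
For the batch-processing and automation use cases above, a scheduled job could assemble the command line from the documented options. The sketch below only builds the argument list; the helper name `build_merge_command`, the dated output filename, and the example paths are assumptions made for illustration.

```python
import sys
from datetime import date


def build_merge_command(input_dir, output_dir=".",
                        script="xliff_to_tmx_hybrid.py", today=None):
    """Build an argument list for an unattended merge run."""
    today = today or date.today()
    # Hypothetical naming scheme: one dated master TM per run
    output = f"{output_dir}/master_tm_{today:%Y%m%d}.tmx"
    return [
        sys.executable, script, input_dir,
        "-r",              # search directories recursively
        "--merge",         # combine everything into one TMX
        "-m",              # include the match-quality property
        "-o", output,      # output file (merge mode only)
        "--no-countdown",  # skip the countdown for automation
    ]


command = build_merge_command("/incoming", "/tms")
print(" ".join(command))
# To actually run it: subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.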
======================================================================
INDIVIDUAL MODE: Converting 3 file(s)
======================================================================
[1/3] project/file1.xlf
  ✅ Output: file1.tmx (12 TUs)
[2/3] project/file2.xliff
  ✅ Output: file2.tmx (23 TUs)
[3/3] project/subfolder/file3.xlf
  ✅ Output: file3.tmx (18 TUs)
======================================================================
SUMMARY - INDIVIDUAL MODE
======================================================================
Files processed: 3/3
Total TUs: 53
Detailed Statistics:
   Input File                               Output File                                 TUs
   ---------------------------------------- ---------------------------------------- ------
✅ file1.xlf                                file1.tmx                                    12
✅ file2.xliff                              file2.tmx                                    23
✅ file3.xlf                                file3.tmx                                    18
======================================================================
MERGE MODE: Combining 3 file(s) into one TMX
======================================================================
[1/3] project/file1.xlf
  ✅ Added 12 TUs
[2/3] project/file2.xliff
  ✅ Added 23 TUs
[3/3] project/subfolder/file3.xlf
  ✅ Added 18 TUs
======================================================================
SUMMARY - MERGE MODE
======================================================================
Files combined: 3
Total TUs: 53
Output file: combined_3files_53tus.tmx
Contribution per file:
Status File Name                                               TUs      %
------ -------------------------------------------------- -------- ------
✅     file1.xlf                                                12  22.6%
✅     file2.xliff                                              23  43.4%
✅     file3.xlf                                                18  34.0%
✅ Combined TMX saved to: combined_3files_53tus.tmx
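
The merged filename and the percentage column in the summary above follow from simple arithmetic on the per-file TU counts. Here is a small sketch that reproduces the numbers from the sample; the function names are hypothetical and the rounding to one decimal place is inferred from the sample output, not taken from the tool's source.

```python
def merged_filename(tu_counts, prefix="combined"):
    """Build a name like combined_3files_53tus.tmx from per-file TU counts."""
    total = sum(tu_counts.values())
    return f"{prefix}_{len(tu_counts)}files_{total}tus.tmx"


def contribution_percentages(tu_counts):
    """Return each file's share of the total TU count, in percent."""
    total = sum(tu_counts.values())
    if total == 0:
        # Avoid division by zero when no file contributed any TU
        return {name: 0.0 for name in tu_counts}
    return {name: round(100 * count / total, 1) for name, count in tu_counts.items()}


counts = {"file1.xlf": 12, "file2.xliff": 23, "file3.xlf": 18}
print(merged_filename(counts))
for name, pct in contribution_percentages(counts).items():
    print(f"{name:<20}{counts[name]:>6}{pct:>7.1f}%")
```

With the counts from the sample run (12, 23, and 18 TUs), this yields the filename combined_3files_53tus.tmx and shares of 22.6%, 43.4%, and 34.0%.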
- Core engine: XSLT transformation with lxml
- Encoding detection: BOM-based with fallback
- Output format: TMX 1.4 compliant
- Namespace handling: Clean output (memoQ compatible)
- Python 3.6 or higher
- lxml library
- XSLT stylesheet (included)
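
The BOM-based encoding detection with fallback mentioned above can be sketched with the constants from Python's `codecs` module. This is an illustration of the general technique, not the tool's code: the names `detect_encoding` and `decode_xliff_bytes` are hypothetical, and the fallback here is plain UTF-8, so it does not cover UTF-16 files without a BOM.

```python
import codecs

# Order matters: the UTF-32 LE BOM starts with the same bytes as the
# UTF-16 LE BOM, so the longer UTF-32 marks must be checked first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_encoding(data, fallback="utf-8"):
    """Return (encoding, bom_length) for raw bytes, using the BOM if present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return fallback, 0


def decode_xliff_bytes(data):
    """Decode raw file bytes to text, dropping any leading BOM."""
    encoding, skip = detect_encoding(data)
    return data[skip:].decode(encoding)


raw = codecs.BOM_UTF16_LE + "<xliff/>".encode("utf-16-le")
print(detect_encoding(raw))
print(decode_xliff_bytes(raw))
```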
samaltmem/
├── samaltmem.py                              # Main script (branded name)
├── xliff_to_tmx_hybrid.py                    # Same script (technical name)
├── xliff_alttrans_to_tmx_parameterized.xsl   # XSLT transformer
├── requirements.txt                          # Python dependencies
├── README.md                                 # This file
├── LICENSE                                   # MIT License
├── CHANGELOG.md                              # Version history
├── CONTRIBUTING.md                           # Contribution guidelines
└── BRANDING.md                               # Brand identity guide
The script automatically detects and handles UTF-16 files. If you encounter encoding errors:
# Use verbose mode to see encoding detection
python xliff_to_tmx_hybrid.py file.xlf -v

# Check extensions being searched
python xliff_to_tmx_hybrid.py /folder/ -v
# Use recursive search
python xliff_to_tmx_hybrid.py /folder/ -r
# Specify custom extension
python xliff_to_tmx_hybrid.py /folder/ --ext .yourext

- ✅ Namespace issues are fixed in current version
- ✅ Output is TMX 1.4 compliant
- ✅ Inline elements properly formatted
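
When no files are picked up from a folder, it can help to reproduce the extension search outside the tool and see what matches. The sketch below is an assumption about how such a search could work (case-insensitive suffix matching with `pathlib`); the helper `find_xliff_files` is hypothetical and the tool's own matching rules may differ.

```python
from pathlib import Path

# Extensions the README lists as supported by default
DEFAULT_EXTENSIONS = (".xlf", ".xliff", ".mxliff", ".mqxliff", ".sdlxliff")


def find_xliff_files(folder, recursive=False, extensions=DEFAULT_EXTENSIONS):
    """List files under folder whose suffix matches one of the extensions."""
    wanted = {ext.lower() for ext in extensions}
    root = Path(folder)
    # rglob descends into subfolders, glob stays at the top level
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        path for path in candidates
        if path.is_file() and path.suffix.lower() in wanted
    )


for path in find_xliff_files(".", recursive=True):
    print(path)
```

If this lists files only with `recursive=True`, the -r flag is the likely fix; if it lists nothing, the files probably use an extension that needs --ext.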
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
# Clone repository
git clone https://github.com/jiangweiatgithub/samaltmem.git
cd samaltmem
# Install dependencies
pip install -r requirements.txt
# Run tests
python -m pytest tests/

This project is licensed under the MIT License. See the LICENSE file for details.
Your Name
- GitHub: @jiangweiatgithub
- Email: polytrans@gmail.com
- Thanks to all contributors and users
- Inspired by the needs of the translation industry
- Built with Python and lxml
- Individual and merge modes
- Smart filename with statistics
- Detailed contribution reporting
- Enhanced UTF-16 support
- PyInstaller compatibility
- Directory processing
- Recursive search
- Multiple extension support
- Custom extension filtering
- Initial public release
- Basic XLIFF to TMX conversion
- Match-quality control
If you find this tool useful, please consider giving it a star!
Made with ❤️ for the translation community