PDF scientific paper translation and bilingual comparison.
📊 Retain formulas and charts.
📄 Preserve table of contents.
🌐 Support multiple translation services.
Requires Python >=3.8, <=3.11.
pip install -U "pdf2zh>=1.5.3"
Run the translation command from the command line to generate the translated document example-zh.pdf and the bilingual document example-dual.pdf in the current directory.
pdf2zh example.pdf
pdf2zh example.pdf -p 1-3,5
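To make the page-range syntax concrete, here is a minimal sketch of how a spec such as 1-3,5 could be expanded into individual page numbers. The function name parse_pages is hypothetical and this is not pdf2zh's own parser; it only illustrates the notation, assuming ranges are inclusive.

```python
from typing import List


def parse_pages(spec: str) -> List[int]:
    """Expand a page spec such as "1-3,5" into a list of page numbers.

    Hypothetical helper for illustration; ranges are assumed inclusive.
    """
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages


print(parse_pages("1-3,5"))  # [1, 2, 3, 5]
```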
See Language Codes.
pdf2zh example.pdf -li en -lo ja
See Ollama.
pdf2zh example.pdf -s gemma2
See DeepLX.
Set ENVs to construct an endpoint like {DEEPLX_URL}/{DEEPLX_TOKEN}/translate:
DEEPLX_URL, e.g., export DEEPLX_URL=https://api.deeplx.org
DEEPLX_TOKEN, e.g., export DEEPLX_TOKEN=ABCDEFG
Run:
pdf2zh example.pdf -s deeplx
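As a rough illustration of the {DEEPLX_URL}/{DEEPLX_TOKEN}/translate pattern described above, the sketch below assembles the endpoint from the two environment variables. The helper deeplx_endpoint is hypothetical and only shows how the pieces fit together; it is not taken from pdf2zh's source.

```python
import os
from typing import Mapping


def deeplx_endpoint(env: Mapping[str, str] = os.environ) -> str:
    """Build {DEEPLX_URL}/{DEEPLX_TOKEN}/translate from the environment.

    Hypothetical helper; raises KeyError if either variable is unset.
    """
    url = env["DEEPLX_URL"].rstrip("/")  # avoid a double slash
    token = env["DEEPLX_TOKEN"]
    return f"{url}/{token}/translate"


# Using the example values from the text instead of the real environment:
example_env = {
    "DEEPLX_URL": "https://api.deeplx.org",
    "DEEPLX_TOKEN": "ABCDEFG",
}
print(deeplx_endpoint(example_env))  # https://api.deeplx.org/ABCDEFG/translate
```

Checking the printed URL before running pdf2zh can help confirm that both variables are set as intended.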
pdf2zh BDA3.pdf -f "(CM[^RT].*|MS.*|XY.*|MT.*|BL.*|.*0700|.*0500|.*Italic)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
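The two regular expressions in the command above can be tried out in Python before running a long translation. The sketch below assumes that the -f pattern is matched against font names and the -c pattern against single characters, anchored at the start of the string; that is an assumption about how the options are applied, not something stated here. The font names used (CMMI10, CMR10, Times-Italic) are illustrative examples only.

```python
import re

# Patterns copied verbatim from the command line above.
FONT_PATTERN = re.compile(
    r"(CM[^RT].*|MS.*|XY.*|MT.*|BL.*|.*0700|.*0500|.*Italic)"
)
CHAR_PATTERN = re.compile(r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])")


def font_matches(name: str) -> bool:
    """True if the font name matches the -f pattern (assumed usage)."""
    return FONT_PATTERN.match(name) is not None


def char_matches(ch: str) -> bool:
    """True if the character matches the -c pattern (assumed usage)."""
    return CHAR_PATTERN.match(ch) is not None


for font in ["CMMI10", "CMR10", "Times-Italic"]:
    print(font, font_matches(font))

for ch in ["+", "3", "α", "a"]:
    print(repr(ch), char_matches(ch))
```

Under these assumptions, CMMI10 and Times-Italic match while CMR10 does not, and the characters +, 3, and α match while a plain letter a does not.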
Document merging: PyMuPDF
Document parsing: Pdfminer.six
Document extraction: MinerU
Multi-threaded translation: MathTranslate
Layout parsing: DocLayout-YOLO