Tidak Ada Deskripsi

Byaidu 3fedd47cdb doc: fix zhipu 1 tahun lalu
.github 5ac8526f08 Update issue templates 1 tahun lalu
pdf2zh 94d5559b11 fix: use new xref 1 tahun lalu
.gitignore a0d87c73aa feat (translator, convertor): add support for DeepLX 1 tahun lalu
LICENSE 04e1dedd8e Update LICENSE 1 tahun lalu
README.md a8e6043811 doc: set envs 1 tahun lalu
README_zh-CN.md 3fedd47cdb doc: fix zhipu 1 tahun lalu
setup.py 162ec34355 fix: dep 1 tahun lalu

README.md

English | [简体中文](README_zh-CN.md) # PDFMathTranslate

PDF scientific paper translation and bilingual comparison.

  • 📊 Retain formulas and charts.

  • 📄 Preserve table of contents.

  • 🌐 Support multiple translation services.

Feel free to provide feedback in issues or user group.

Installation

Require Python version >=3.8, <=3.12

pip install pdf2zh

Usage

Execute the translation command in the command line to generate the translated document example-zh.pdf and the bilingual document example-dual.pdf in the current directory. Use Google as the default translation service.

Please refer to ChatGPT for how to set environment variables.

Translate the entire document

pdf2zh example.pdf

Translate part of the document

pdf2zh example.pdf -p 1-3,5

Translate with the specified language

See Google Languages Codes, DeepL Languages Codes

pdf2zh example.pdf -li en -lo ja

Translate with DeepL/DeepLX

See DeepLX

Set ENVs to construct an endpoint like: {DEEPL_SERVER_URL}/{DEEPL_AUTH_KEY}/translate

  • DEEPL_SERVER_URL (Optional), e.g., export DEEPL_SERVER_URL=https://api.deepl.com
  • DEEPL_AUTH_KEY, e.g., export DEEPL_AUTH_KEY=xxx

    pdf2zh example.pdf -s deepl
    

Translate with Ollama

See Ollama

Set ENVs to construct an endpoint like: {OLLAMA_HOST}/api/chat

  • OLLAMA_HOST (Optional), e.g., export OLLAMA_HOST=https://localhost:11434

    pdf2zh example.pdf -s ollama:gemma2
    

Translate with OpenAI/SiliconCloud/Zhipu

See SiliconCloud, Zhipu

Set ENVs to construct an endpoint like: {OPENAI_BASE_URL}/chat/completions

  • OPENAI_BASE_URL (Optional), e.g., export OPENAI_BASE_URL=https://api.openai.com/v1
  • OPENAI_API_KEY, e.g., export OPENAI_API_KEY=xxx

    pdf2zh example.pdf -s openai:gpt-4o
    

Use regex to specify formula fonts and characters that need to be preserved

pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"

Preview

image

image

image

Acknowledgement

Document merging: PyMuPDF

Document parsing: Pdfminer.six

Document extraction: MinerU

Multi-threaded translation: MathTranslate

Layout parsing: DocLayout-YOLO

Document standard: PDF Explained, PDF Cheat Sheets

Contributors

Star History

Star History Chart