
English | [简体中文](README_zh-CN.md)

PDFMathTranslate

PDF scientific paper translation and bilingual comparison.

Feel free to provide feedback in GitHub Issues or Telegram Group.

Updates

Preview

Demo 🌟

You can try our demo on HuggingFace without installation.
Note that the computing resources of the demo are limited, so please avoid abusing them.

Installation and Usage

We provide three methods for using this project: command line, GUI, and Docker.

Method I. Command line

  1. Python installed (3.8 <= version <= 3.12)
  2. Install our package

      pip install pdf2zh
    
  3. Use:

      pdf2zh document.pdf
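
Before installing, it can help to confirm that the interpreter falls inside the supported range given in step 1. The following is a minimal sketch and not part of pdf2zh: `python_version_ok` is a hypothetical helper name, and the check assumes that `python3` is the interpreter `pip` installs into.

```shell
# Hypothetical helper (not part of pdf2zh): succeed only when the given
# "major.minor" version string lies in the supported range 3.8 to 3.12.
python_version_ok() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # drop a patch component such as ".4" if present
    [ "$major" -eq 3 ] && [ "$minor" -ge 8 ] && [ "$minor" -le 12 ]
}

# Check the local interpreter, if one is on PATH.
if command -v python3 >/dev/null 2>&1; then
    version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    if python_version_ok "$version"; then
        echo "Python $version is supported"
    else
        echo "Python $version is outside 3.8 to 3.12" >&2
    fi
fi
```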
    

Method II. GUI

  1. Python installed (3.8 <= version <= 3.12)
  2. Install our package

      pip install pdf2zh
    
  3. Start using in browser:

      pdf2zh -i
    
  4. If your browser has not been started automatically, go to

    http://localhost:7860/
    

See documentation for GUI for more details.

Method III. Docker

  1. Pull and run:

    docker pull byaidu/pdf2zh
    docker run -p 7860:7860 byaidu/pdf2zh
    
  2. Open in browser:

    http://localhost:7860/
    

For Docker deployment on a cloud service:

Deploy

Deploy to Koyeb

Deploy on Zeabur

Advanced Options

Run the translation command from the command line to generate the translated document example-zh.pdf and the bilingual document example-dual.pdf in the current directory. Google is the default translation service.

In the following table, we list all advanced options for reference:

| Option | Function | Example |
| ------ | -------- | ------- |
| -i | Enter GUI | pdf2zh -i |
| -p | Partial document translation | pdf2zh example.pdf -p 1 |
| -li | Source language | pdf2zh example.pdf -li en |
| -lo | Target language | pdf2zh example.pdf -lo zh |
| -s | Translation service | pdf2zh example.pdf -s deepl |
| -t | Multi-threads | pdf2zh example.pdf -t 1 |
| -f, -c | Exceptions | pdf2zh example.pdf -f "(MS.*)" |

Some services require environment variables to be set. Please refer to ChatGPT for how to set environment variables.

Full / partial document translation

  • Entire document

    pdf2zh example.pdf
    
  • Part of the document

    pdf2zh example.pdf -p 1-3,5
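
The -p value combines single pages and ranges, separated by commas. As an illustration of how such a specification reads, the POSIX shell sketch below expands it into the page numbers it denotes. This is not pdf2zh's own parser: `expand_pages` is a hypothetical name, ranges are assumed to be inclusive, and the sketch says nothing about whether pdf2zh counts pages from 0 or 1.

```shell
# Hypothetical helper: expand a spec such as "1-3,5" into "1 2 3 5".
# Assumes well-formed input (digits, hyphens, commas) and inclusive ranges.
expand_pages() {
    spec=$1
    out=""
    old_ifs=$IFS
    IFS=,                          # split the spec on commas
    for part in $spec; do
        case $part in
            *-*) start=${part%-*}; end=${part#*-} ;;   # a range "a-b"
            *)   start=$part; end=$part ;;             # a single page
        esac
        i=$start
        while [ "$i" -le "$end" ]; do
            out="$out${out:+ }$i"
            i=$((i + 1))
        done
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}

expand_pages 1-3,5   # prints: 1 2 3 5
```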
    

Specify source and target languages

See Google Language Codes and DeepL Language Codes.

pdf2zh example.pdf -li en -lo ja

Translate with Different Services

  • DeepL

See DeepL

Set ENVs to construct an endpoint like: {DEEPL_SERVER_URL}/translate

  • DEEPL_SERVER_URL (Optional), e.g., export DEEPL_SERVER_URL=https://api.deepl.com
  • DEEPL_AUTH_KEY, e.g., export DEEPL_AUTH_KEY=xxx

    pdf2zh example.pdf -s deepl
    
  • DeepLX

See DeepLX

Set ENVs to construct an endpoint like: {DEEPLX_SERVER_URL}/translate

  • DEEPLX_SERVER_URL (Optional), e.g., export DEEPLX_SERVER_URL=https://api.deeplx.org
  • DEEPLX_AUTH_KEY, e.g., export DEEPLX_AUTH_KEY=xxx

    pdf2zh example.pdf -s deeplx
    
  • Ollama

See Ollama

Set ENVs to construct an endpoint like: {OLLAMA_HOST}/api/chat

  • OLLAMA_HOST (Optional), e.g., export OLLAMA_HOST=http://localhost:11434

    pdf2zh example.pdf -s ollama:gemma2
    
  • LLM with OpenAI compatible schemas (OpenAI / SiliconCloud / Zhipu)

See SiliconCloud, Zhipu

Set ENVs to construct an endpoint like: {OPENAI_BASE_URL}/chat/completions

  • OPENAI_BASE_URL (Optional), e.g., export OPENAI_BASE_URL=https://api.openai.com/v1
  • OPENAI_API_KEY, e.g., export OPENAI_API_KEY=xxx

    pdf2zh example.pdf -s openai:gpt-4o
    
  • Azure

See Azure Text Translation

The following ENVs are required:

  • AZURE_APIKEY, e.g., export AZURE_APIKEY=xxx
  • AZURE_ENDPOINT, e.g., export AZURE_ENDPOINT=https://api.translator.azure.cn/
  • AZURE_REGION, e.g., export AZURE_REGION=chinaeast2

    pdf2zh example.pdf -s azure
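
Because each service reads its settings from the environment, a missing `export` only shows up once the translation request fails. A small pre-flight check can catch that earlier. The sketch below is not part of pdf2zh: `missing_env` is a hypothetical helper, its service-to-variable mapping simply copies the lists above (variables marked Optional are treated as not required), and it relies on `printenv`, which only sees exported variables.

```shell
# Hypothetical helper: print the required variables that are unset or empty
# for a given -s value. A model suffix such as ":gpt-4o" is ignored.
missing_env() {
    service=${1%%:*}
    case $service in
        deepl)  required="DEEPL_AUTH_KEY" ;;
        deeplx) required="DEEPLX_AUTH_KEY" ;;
        openai) required="OPENAI_API_KEY" ;;
        azure)  required="AZURE_APIKEY AZURE_ENDPOINT AZURE_REGION" ;;
        *)      required="" ;;   # e.g. ollama: only an optional variable is listed
    esac
    missing=""
    for name in $required; do
        # printenv prints nothing for unset or non-exported variables
        if [ -z "$(printenv "$name")" ]; then
            missing="$missing${missing:+ }$name"
        fi
    done
    printf '%s\n' "$missing"
}

# Report what still has to be exported before using the Azure service.
unset_vars=$(missing_env azure)
if [ -n "$unset_vars" ]; then
    echo "azure: export $unset_vars first" >&2
fi
```

The same call works for the other services, for example `missing_env openai:gpt-4o` or `missing_env deepl`.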
    

Translate with exceptions

Use regex to specify formula fonts and characters that need to be preserved.

pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
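
To see which font names a -f pattern would pick up, the expression can be tried against sample names before running a translation. The sketch below uses `grep -E` and assumes the pattern is matched from the start of the font name, which this document does not state. `font_is_preserved` and the sample font names are illustrative only. The -c pattern is left out because it uses `\d` and `\u` escapes, which POSIX extended regular expressions do not support.

```shell
# The -f pattern from the example above.
FONT_PATTERN='(CM[^RT].*|MS.*|.*Ital)'

# Hypothetical helper: succeed when the font name matches the pattern.
# Anchoring with ^ is an assumption about how the pattern is applied.
font_is_preserved() {
    printf '%s\n' "$1" | grep -Eq "^${FONT_PATTERN}"
}

for font in CMMI10 CMSY10 CMR10 MSBM10 Times-Roman; do
    if font_is_preserved "$font"; then
        echo "$font: matches"
    else
        echo "$font: does not match"
    fi
done
```

With this pattern, CMMI10, CMSY10, and MSBM10 match, while CMR10 is excluded by `[^RT]` and Times-Roman matches none of the alternatives.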

Specify threads

Use -t to specify how many threads to use in translation:

pdf2zh example.pdf -t 1
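
The -t option applies to a single run. To process a whole directory, a plain shell loop around the command is enough. This is a sketch under stated assumptions: `translate_all` and the `PDF2ZH` override variable are hypothetical names introduced here (the override exists only so the loop can be dry-run with `echo`), and outputs are assumed to follow the example-zh.pdf / example-dual.pdf naming described under Advanced Options, so they are skipped if they sit in the same directory.

```shell
# Hypothetical helper: run pdf2zh on every PDF in a directory.
# Usage: translate_all DIR [pdf2zh options...]
translate_all() {
    dir=$1
    shift
    for pdf in "$dir"/*.pdf; do
        case $pdf in
            *-zh.pdf|*-dual.pdf) continue ;;   # skip earlier outputs
        esac
        [ -f "$pdf" ] || continue              # no PDFs: glob stayed literal
        ${PDF2ZH:-pdf2zh} "$pdf" "$@"
    done
}
```

For example, `translate_all . -li en -lo zh -t 2` translates every PDF in the current directory with two threads each. Setting `PDF2ZH=echo` beforehand prints the commands instead of running them.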

Acknowledgements

Contributors

Star History