English | [简体中文](docs/README_zh-CN.md) | [日本語](docs/README_ja-JP.md)

PDFMathTranslate

PDF scientific paper translation and bilingual comparison.

Feel free to provide feedback in GitHub Issues, Telegram Group or QQ Group.

Updates

Preview

Online Service 🌟

You can try our application out using either of the following demos:

Note that the computing resources of the demo are limited, so please avoid abusing them.

Installation and Usage

We provide four methods for using this project: command line, portable, GUI, and Docker.

pdf2zh needs an extra model (wybxc/DocLayout-YOLO-DocStructBench-onnx), which is also available on ModelScope. If you have trouble downloading this model, set the following environment variable:

USE_MODELSCOPE=1 pdf2zh
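
When pdf2zh is launched from another Python script, the same variable can be set on the child process only. The following is a minimal sketch, not part of pdf2zh itself: the helper names `modelscope_env` and `run_pdf2zh` are hypothetical, and it assumes the `pdf2zh` executable is on `PATH`.

```python
import os
import subprocess
from typing import Dict, List, Optional


def modelscope_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with USE_MODELSCOPE=1 set.

    The caller's mapping (or os.environ) is left unmodified.
    """
    env = dict(os.environ if base is None else base)
    env["USE_MODELSCOPE"] = "1"
    return env


def run_pdf2zh(args: List[str]) -> int:
    """Run the pdf2zh CLI (assumed to be on PATH) and return its exit code."""
    return subprocess.run(["pdf2zh", *args], env=modelscope_env()).returncode
```

Calling `run_pdf2zh(["document.pdf"])` would then be equivalent to `USE_MODELSCOPE=1 pdf2zh document.pdf`.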

Method I. Command line

  1. Install Python (3.8 <= version <= 3.12)
  2. Install our package:

    pip install pdf2zh
    
  3. Run the translation; the output files are generated in the current working directory:

    pdf2zh document.pdf
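
If the installation fails, it may be worth confirming that the interpreter falls inside the supported range stated in step 1. The check below is an illustrative sketch; `is_supported` is a hypothetical helper, not something pdf2zh ships.

```python
import sys
from typing import Tuple

# Supported range taken from the requirement above: 3.8 <= version <= 3.12.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 12)


def is_supported(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True if (major, minor) lies within the supported range."""
    return MIN_VERSION <= tuple(version[:2]) <= MAX_VERSION
```

Called without arguments, `is_supported()` checks the interpreter that is currently running.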
    

Method II. Portable

No Python environment needs to be installed beforehand.

Download setup.bat and double-click it to run.

Method III. GUI

  1. Install Python (3.8 <= version <= 3.12)
  2. Install our package:

    pip install pdf2zh
    
  3. Start the GUI, which opens in your browser:

    pdf2zh -i
    
  4. If your browser does not open automatically, go to

    http://localhost:7860/
    

See the GUI documentation for more details.

Method IV. Docker

  1. Pull and run:

    docker pull byaidu/pdf2zh
    docker run -d -p 7860:7860 byaidu/pdf2zh
    
  2. Open in browser:

    http://localhost:7860/
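
If the page does not load, a quick way to tell whether anything is listening on port 7860 is a plain TCP connection attempt. This is a generic sketch using only the standard library; `gui_is_reachable` is a hypothetical helper name, and a successful connection only shows that the port is open, not that pdf2zh is the process serving it.

```python
import socket


def gui_is_reachable(host: str = "localhost", port: int = 7860,
                     timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

With the container started as above, `gui_is_reachable()` should return True once the service is up.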
    

For docker deployment on cloud service:

Advanced Options

Run the translation command from the command line to generate the translated document example-mono.pdf and the bilingual document example-dual.pdf in the current working directory. Google is used as the default translation service.
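
For scripts that post-process the results, the output paths can be derived from the input name. The sketch below assumes the `-mono` / `-dual` suffix pattern shown for example.pdf applies to any input file name, and that `-o` changes only the directory; both are assumptions, and `expected_outputs` is a hypothetical helper.

```python
from pathlib import Path
from typing import Tuple


def expected_outputs(pdf: str, output_dir: str = ".") -> Tuple[Path, Path]:
    """Return the assumed (mono, dual) output paths for an input PDF."""
    stem = Path(pdf).stem  # "example.pdf" -> "example"
    out = Path(output_dir)
    return out / f"{stem}-mono.pdf", out / f"{stem}-dual.pdf"
```

For instance, `expected_outputs("example.pdf")` yields paths named example-mono.pdf and example-dual.pdf in the current directory.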

In the following table, we list all advanced options for reference:

| Option | Function | Example |
| --- | --- | --- |
| files | Local files | `pdf2zh ~/local.pdf` |
| links | Online files | `pdf2zh http://arxiv.org/paper.pdf` |
| `-i` | Enter GUI | `pdf2zh -i` |
| `-p` | Partial document translation | `pdf2zh example.pdf -p 1` |
| `-li` | Source language | `pdf2zh example.pdf -li en` |
| `-lo` | Target language | `pdf2zh example.pdf -lo zh` |
| `-s` | Translation service | `pdf2zh example.pdf -s deepl` |
| `-t` | Multi-threads | `pdf2zh example.pdf -t 1` |
| `-o` | Output dir | `pdf2zh example.pdf -o output` |
| `-f`, `-c` | Exceptions | `pdf2zh example.pdf -f "(MS.*)"` |
| `--share` | Public link | `pdf2zh -i --share` |
| `--authorized` | Authorization | `pdf2zh -i --authorized users.txt [auth.html]` |
| `--prompt` | Custom prompt | `pdf2zh --prompt [prompt.txt]` |
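
When driving the CLI from a script, the options can be assembled into an argument list instead of a hand-written string. The sketch below uses only flags listed in the table; `build_command` and its parameter names are hypothetical, and it does not validate the values it is given.

```python
from typing import List, Optional


def build_command(
    source: str,
    pages: Optional[str] = None,      # -p, e.g. "1"
    lang_in: Optional[str] = None,    # -li, e.g. "en"
    lang_out: Optional[str] = None,   # -lo, e.g. "zh"
    service: Optional[str] = None,    # -s, e.g. "deepl"
    threads: Optional[int] = None,    # -t, e.g. 1
    output: Optional[str] = None,     # -o, e.g. "output"
) -> List[str]:
    """Build a pdf2zh argument list; options left as None are omitted."""
    cmd = ["pdf2zh", source]
    options = (
        ("-p", pages),
        ("-li", lang_in),
        ("-lo", lang_out),
        ("-s", service),
        ("-t", threads),
        ("-o", output),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd
```

The resulting list can be passed directly to `subprocess.run`, which avoids shell quoting problems with paths that contain spaces.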

For a detailed explanation of each option, please refer to our Advanced Usage document.

Secondary Development (APIs)

For downstream applications, please refer to our API Details document for further information about:

  • Python API: how to use the program from other Python programs
  • HTTP API: how to communicate with a server that has the program installed

TODOs

  • [ ] Parse layout with DocLayNet based models, PaddleX, PaperMage, SAM2

  • [ ] Fix page rotation, table of contents, format of lists

  • [ ] Fix pixel formula in old papers

  • [ ] Async retry except KeyboardInterrupt

  • [ ] Knuth–Plass algorithm for western languages

  • [ ] Support non-PDF/A files

  • [ ] Plugins of Zotero and Obsidian

Acknowledgements

Contributors


Star History
