English | [简体中文](README_zh-CN.md)

PDFMathTranslate

PDF scientific paper translation and bilingual comparison.

Updates

  • [Nov. 20 2024] Support Docker
  • [Nov. 20 2024] Support multi-threading
  • [Nov. 19 2024] Provide a graphical user interface
  • [Nov. 18 2024] Support DeepL, DeepLX, and Azure

Installation

Requires Python version >=3.8, <=3.12

pip install pdf2zh
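
Before installing, it can help to confirm that the active interpreter falls in the range stated above. The following is a minimal sketch, not part of pdf2zh: the helper name `is_supported` is hypothetical, and it assumes that `<=3.12` covers every 3.12.x patch release.

```python
import sys

# Supported range as stated in this README: >=3.8, <=3.12.
# Assumption: "<=3.12" includes all 3.12.x patch releases.
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 12)


def is_supported(version=None):
    """Return True if the (major, minor) pair lies within the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = (version[0], version[1])
    return MIN_VERSION <= major_minor <= MAX_VERSION


current = "%d.%d" % (sys.version_info[0], sys.version_info[1])
status = "supported" if is_supported() else "outside the supported range"
print(f"Python {current} is {status}")
```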

Usage

Run the translation command from the command line to generate the translated document `example-zh.pdf` and the bilingual document `example-dual.pdf` in the current directory. Google is the default translation service.
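
The output names follow from the input name. As a rough sketch, assuming the `-zh` and `-dual` suffixes shown for `example.pdf` apply to any input file (the helper `expected_outputs` is hypothetical and not part of pdf2zh), the expected paths could be derived like this:

```python
from pathlib import Path


def expected_outputs(input_pdf, out_dir="."):
    """Return the (translated, bilingual) paths expected for an input PDF.

    Assumption: outputs are named <stem>-zh.pdf and <stem>-dual.pdf and
    are written to out_dir (the current directory by default).
    """
    stem = Path(input_pdf).stem
    out = Path(out_dir)
    return out / f"{stem}-zh.pdf", out / f"{stem}-dual.pdf"


translated, bilingual = expected_outputs("papers/example.pdf")
print(translated.name, bilingual.name)
```

This is handy in scripts that need to check whether a translation has already been produced before running the command again.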

Please refer to ChatGPT for how to set environment variables.

Full / partial document translation

- Entire document

  ```bash
  pdf2zh example.pdf
  ```

- Part of the document

  ```bash
  pdf2zh example.pdf -p 1-3,5
  ```
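
To illustrate how a page specification such as `1-3,5` can be read, here is a small sketch that expands it into individual page numbers. It is not pdf2zh's own parser: the function `parse_pages` is hypothetical, and it assumes pages are 1-based and ranges are inclusive.

```python
def parse_pages(spec):
    """Expand a page spec like "1-3,5" into a sorted list of page numbers.

    Assumptions: pages are 1-based, ranges are inclusive, and parts are
    separated by commas.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


print(parse_pages("1-3,5"))  # [1, 2, 3, 5]
```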

Specify source and target languages

See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v1/languages), [DeepL Languages Codes](https://developers.deepl.com/docs/resources/supported-languages)

```bash
pdf2zh example.pdf -li en -lo ja
```

Translate with Different Services

- **DeepL**

  See [DeepL](https://support.deepl.com/hc/en-us/articles/360020695820-API-Key-for-DeepL-s-API)

  Set ENVs to construct an endpoint like: `{DEEPL_SERVER_URL}/translate`

  - `DEEPL_SERVER_URL` (Optional), e.g., `export DEEPL_SERVER_URL=https://api.deepl.com`
  - `DEEPL_AUTH_KEY`, e.g., `export DEEPL_AUTH_KEY=xxx`

  ```bash
  pdf2zh example.pdf -s deepl
  ```

- **DeepLX**

  See [DeepLX](https://github.com/OwO-Network/DeepLX)

  Set ENVs to construct an endpoint like: `{DEEPLX_SERVER_URL}/translate`

  - `DEEPLX_SERVER_URL` (Optional), e.g., `export DEEPLX_SERVER_URL=https://api.deeplx.org`
  - `DEEPLX_AUTH_KEY`, e.g., `export DEEPLX_AUTH_KEY=xxx`

  ```bash
  pdf2zh example.pdf -s deeplx
  ```

- **Ollama**

  See [Ollama](https://github.com/ollama/ollama)

  Set ENVs to construct an endpoint like: `{OLLAMA_HOST}/api/chat`

  - `OLLAMA_HOST` (Optional), e.g., `export OLLAMA_HOST=http://localhost:11434`

  ```bash
  pdf2zh example.pdf -s ollama:gemma2
  ```

- **LLM with OpenAI compatible schemas (OpenAI / SiliconCloud / Zhipu)**

  See [SiliconCloud](https://docs.siliconflow.cn/quickstart), [Zhipu](https://open.bigmodel.cn/dev/api/thirdparty-frame/openai-sdk)

  Set ENVs to construct an endpoint like: `{OPENAI_BASE_URL}/chat/completions`

  - `OPENAI_BASE_URL` (Optional), e.g., `export OPENAI_BASE_URL=https://api.openai.com/v1`
  - `OPENAI_API_KEY`, e.g., `export OPENAI_API_KEY=xxx`

  ```bash
  pdf2zh example.pdf -s openai:gpt-4o
  ```

- **Azure**

  See [Azure Text Translation](https://docs.azure.cn/en-us/ai-services/translator/text-translation-overview)

  The following ENVs are required:

  - `AZURE_APIKEY`, e.g., `export AZURE_APIKEY=xxx`
  - `AZURE_ENDPOINT`, e.g., `export AZURE_ENDPOINT=https://api.translator.azure.cn/`
  - `AZURE_REGION`, e.g., `export AZURE_REGION=chinaeast2`

  ```bash
  pdf2zh example.pdf -s azure
  ```
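
The endpoint patterns above all combine a base URL from the environment with a fixed path. A minimal sketch of that idea follows; it is not pdf2zh's actual code. The helper `build_endpoint` is hypothetical, and using the example URLs from this section as fallbacks for the optional variables is an assumption.

```python
import os


def build_endpoint(env_name, fallback_base, path, environ=None):
    """Join a base URL taken from an environment variable with an API path.

    Assumption: when the (optional) variable is unset, the example URL
    from this README is used as the fallback base.
    """
    environ = os.environ if environ is None else environ
    base = environ.get(env_name, fallback_base).rstrip("/")
    return f"{base}/{path.lstrip('/')}"


# Variable unset: the fallback base is used.
print(build_endpoint("OPENAI_BASE_URL", "https://api.openai.com/v1",
                     "chat/completions", environ={}))

# Variable set: the user-supplied base wins, trailing slash tolerated.
print(build_endpoint("OLLAMA_HOST", "http://localhost:11434", "api/chat",
                     environ={"OLLAMA_HOST": "http://192.168.1.10:11434/"}))
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.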

Translate with exceptions

Use regex to specify formula fonts and characters that need to be preserved.

```bash
pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
```
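
To see what the two patterns in the command above select, they can be tried with Python's `re` module. This is only a sketch: whether pdf2zh anchors the font pattern at the start of the name (as `re.match` does here) is an assumption, the helper names are hypothetical, and the font names are illustrative samples.

```python
import re

# Patterns copied from the example command.
FONT_PATTERN = re.compile(r"(CM[^RT].*|MS.*|.*Ital)")
CHAR_PATTERN = re.compile(r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])")


def is_formula_font(font_name):
    """True if the font name matches the formula-font pattern from its start."""
    return FONT_PATTERN.match(font_name) is not None


def is_preserved_char(char):
    """True if a single character matches the preserved-character pattern."""
    return CHAR_PATTERN.fullmatch(char) is not None


# Computer Modern math fonts match; roman (CMR) and typewriter (CMTT) do not.
for name in ["CMMI10", "CMSY10", "CMR10", "CMTT10", "MSBM10", "Times-Ital"]:
    print(name, is_formula_font(name))

# Brackets, operators, digits, and non-ASCII symbols are preserved.
for ch in ["(", "=", "7", "α", "a"]:
    print(repr(ch), is_preserved_char(ch))
```

The `[^RT]` class is what excludes the `CMR` and `CMTT` text fonts while still treating the other `CM` families as formula fonts.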

Interact with GUI

```bash
pdf2zh -i
```

See [documentation for GUI](./docs/README_GUI.md) for more details.

Docker

1. Pull and run:

   ```bash
   docker pull byaidu/pdf2zh
   docker run -p 7860:7860 byaidu/pdf2zh
   ```

2. Open in browser:

   ```
   http://localhost:7860/
   ```

Requests and Reports

Feel free to provide feedback via GitHub Issues or the Telegram group.
