@@ -4,7 +4,7 @@ English | [简体中文](README_zh-CN.md)

<img src="./docs/images/banner.png" width="320px" alt="PDF2ZH"/>

-## PDFMathTranslate
+<h2 id="title">PDFMathTranslate</h2>

<p>
<!-- PyPI -->
@@ -15,6 +15,8 @@ English | [简体中文](README_zh-CN.md)
<!-- License -->
<a href="./LICENSE">
<img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a>
+ <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
+ <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-yellow"/></a>
<a href="https://t.me/+Z9_SgnxmsmA5NzBl">
<img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"/></a>
</p>
@@ -23,29 +25,105 @@ English | [简体中文](README_zh-CN.md)

PDF scientific paper translation and bilingual comparison.

-- 📊 Retain formulas and charts.
+- 📊 Preserve formulas, charts, table of contents, and annotations *([preview](#preview))*.
+- 🌐 Support [multiple languages](#language) and diverse [translation services](#services).
+- 🤖 Provide a [command-line tool](#usage), an [interactive user interface](#gui), and [Docker](#docker) support.

-- 📄 Preserve table of contents.
+Feel free to provide feedback in [GitHub Issues](https://github.com/Byaidu/PDFMathTranslate/issues) or [Telegram Group](https://t.me/+Z9_SgnxmsmA5NzBl).

-- 🌐 Support multiple translation services.
+<h2 id="updates">Updates</h2>

-Feel free to provide feedback in [issues](https://github.com/Byaidu/PDFMathTranslate/issues) or [user group](https://t.me/+Z9_SgnxmsmA5NzBl).
+- [Nov. 20 2024] 🌟 [Demo](#demo) online!
+- [Nov. 20 2024] Supports [Docker](#docker)
+- [Nov. 20 2024] Supports [multi-threaded translation](#threads)
+- [Nov. 19 2024] Provides an [interactive graphical user interface](#gui)
+- [Nov. 18 2024] Supports [more services: DeepL, DeepLX, and Azure](#services)

-## Installation
+<h2 id="demo">Demo 🌟</h2>

-Require Python version >=3.8, <=3.12
+You can try [our demo on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) without installation.
+Note that the computing resources of the demo are limited, so please avoid abusing them.

-```bash
-pip install pdf2zh
-```
+<h2 id="install">Installation and Usage</h2>
+
+We provide three methods for using this project: [command line](#cmd), [GUI](#gui), and [Docker](#docker).
+
+<h3 id="cmd">Method I. Command line</h3>
+
+ 1. Python installed (3.8 <= version <= 3.12)
+ 2. Install our package
+
+ ```bash
+ pip install pdf2zh
+ ```
+
+ 3. Use:
+
+ ```bash
+ pdf2zh document.pdf
+ ```
+
+<h3 id="gui">Method II. GUI</h3>
+
+1. Python installed (3.8 <= version <= 3.12)
+2. Install our package
+
+ ```bash
+ pip install pdf2zh
+ ```
+
+3. Start using in browser:
+
+ ```bash
+ pdf2zh -i
+ ```
+
+4. If your browser does not open automatically, go to
+
+ ```bash
+ http://localhost:7860/
+ ```
+
+<img src="./docs/images/before.png" width="500"/>
+
+See [documentation for GUI](./docs/README_GUI.md) for more details.

-## Usage
+<h3 id="docker">Method III. Docker</h3>
+
+1. Pull and run:
+
+ ```bash
+ docker pull byaidu/pdf2zh
+ docker run -p 7860:7860 byaidu/pdf2zh
+ ```
+
+2. Open in browser:
+
+ ```
+ http://localhost:7860/
+ ```
+
+<h2 id="usage">Advanced Options</h2>

Execute the translation command in the command line to generate the translated document `example-zh.pdf` and the bilingual document `example-dual.pdf` in the current directory. Use Google as the default translation service.

-Please refer to [ChatGPT](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) for how to set environment variables.
+<img src="./docs/images/cmd.explained.png" width="580px" alt="cmd"/>

-### Full / partial document translation
+In the following table, we list all advanced options for reference:
+
+| Option | Function | Example |
+| -------- | ------- |------- |
+| `-i` | [Enter GUI](#gui) | `pdf2zh -i` |
+| `-p` | [Partial document translation](#partial) | `pdf2zh example.pdf -p 1` |
+| `-li` | [Source language](#language) | `pdf2zh example.pdf -li en` |
+| `-lo` | [Target language](#language) | `pdf2zh example.pdf -lo zh` |
+| `-s` | [Translation service](#services) | `pdf2zh example.pdf -s deepl` |
+| `-t` | [Multi-threading](#threads) | `pdf2zh example.pdf -t 1` |
+| `-f`, `-c` | [Exceptions](#exceptions) | `pdf2zh example.pdf -f "(MS.*)"` |
+
+Some services require environment variables to be set. Please refer to [ChatGPT](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) for how to set environment variables.
+
+<h3 id="partial">Full / partial document translation</h3>

- Entire document

@@ -59,7 +137,7 @@ Please refer to [ChatGPT](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57c
pdf2zh example.pdf -p 1-3,5
```
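
The `-p 1-3,5` argument above mixes a page range with a single page. As a rough illustration of how such a specification expands into individual pages, here is a minimal Python sketch. `parse_pages` is a hypothetical helper written for this example, not part of pdf2zh, and treating ranges as inclusive is an assumption.

```python
from typing import List, Set


def parse_pages(spec: str) -> List[int]:
    """Expand a page spec such as "1-3,5" into a sorted list of page numbers.

    Hypothetical helper for illustration; pdf2zh's own parsing may differ.
    """
    pages: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            # "1-3" is assumed to mean pages 1, 2 and 3 (inclusive range)
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


print(parse_pages("1-3,5"))  # [1, 2, 3, 5]
```

Under these assumptions, `-p 1-3,5` would select pages 1, 2, 3, and 5.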

-### Specify source and target languages
+<h3 id="language">Specify source and target languages</h3>

See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v1/languages), [DeepL Languages Codes](https://developers.deepl.com/docs/resources/supported-languages)

@@ -67,7 +145,7 @@ See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v
pdf2zh example.pdf -li en -lo ja
```

-### Translate with Different Services
+<h3 id="services">Translate with Different Services</h3>

- **DeepL**

@@ -129,7 +207,7 @@ pdf2zh example.pdf -li en -lo ja
pdf2zh example.pdf -s azure
```

-### Translate wih exceptions
+<h3 id="exceptions">Translate with exceptions</h3>

Use regex to specify formula fonts and characters that need to be preserved.

@@ -137,30 +215,15 @@ Use regex to specify formula fonts and characters that need to be preserved.
pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
```
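
To see what the two patterns in the command above select, the following Python sketch applies them to a few sample font names and characters. It assumes the patterns are matched from the start of the string (as `re.match` does); how pdf2zh applies them internally is not stated here, and the helper names and sample inputs are illustrative only.

```python
import re

# Patterns copied from the `-f` and `-c` arguments above.
FONT_PATTERN = re.compile(r"(CM[^RT].*|MS.*|.*Ital)")
CHAR_PATTERN = re.compile(r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])")


def is_formula_font(font_name: str) -> bool:
    """Return True if the font name matches the formula-font pattern."""
    return FONT_PATTERN.match(font_name) is not None


def is_formula_char(char: str) -> bool:
    """Return True if the character matches the formula-character pattern."""
    return CHAR_PATTERN.match(char) is not None


# Sample font names chosen for illustration.
for name in ["CMMI10", "CMR10", "MSBM10", "Helvetica"]:
    print(name, is_formula_font(name))

# Sample characters chosen for illustration.
for ch in ["=", "7", "α", "x"]:
    print(ch, is_formula_char(ch))
```

With this anchoring, `CM[^RT].*` matches `CMMI10` but not `CMR10`, so Computer Modern math fonts are treated as formula fonts while the roman text font is not.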

-### Interact with GUI
+<h3 id="threads">Specify threads</h3>

-<img src="./docs/images/before.png" width="500"/>
+Use `-t` to specify how many threads to use in translation:

```bash
-pdf2zh -i
-```
-
-See [documentation for GUI](./docs/README_GUI.md) for more details.
-
-### Docker
-
-```bash
-docker pull byaidu/pdf2zh
-docker run -p 7860:7860 byaidu/pdf2zh
-```
-
-Open in browser:
-
-```
-http://localhost:7860/
+pdf2zh example.pdf -t 1
```
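
For readers unfamiliar with the idea behind a thread-count option, the sketch below shows one generic way to translate several text segments concurrently with a fixed number of worker threads. It is not pdf2zh's implementation: `translate_all` and `fake_translate` are hypothetical names, and the stand-in translator only upper-cases its input.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def translate_all(
    segments: List[str],
    translate: Callable[[str], str],
    threads: int = 1,
) -> List[str]:
    """Translate segments using up to `threads` worker threads.

    `pool.map` returns results in input order, so the output lines up
    with `segments` even if individual calls finish out of order.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(translate, segments))


def fake_translate(text: str) -> str:
    """Stand-in for a network call to a translation service."""
    return text.upper()


print(translate_all(["alpha", "beta", "gamma"], fake_translate, threads=2))
```

Threads suit this kind of workload because each request spends most of its time waiting on the network; a higher count may also hit rate limits on some services.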

-## Preview
+<h2 id="preview">Preview</h2>

![image](https://github.com/user-attachments/assets/57e1cde6-c647-4af8-8f8f-587a40050dde)

@@ -168,27 +231,27 @@ http://localhost:7860/

![image](https://github.com/user-attachments/assets/0e6d7e44-18cd-443a-8a84-db99edf2c268)

-## Acknowledgement
+<h2 id="acknowledgement">Acknowledgements</h2>

-Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)
+- Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)

-Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)
+- Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)

-Document extraction: [MinerU](https://github.com/opendatalab/MinerU)
+- Document extraction: [MinerU](https://github.com/opendatalab/MinerU)

-Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)
+- Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)

-Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
+- Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)

-Document standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)
+- Document standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)

-## Contributors
+<h2 id="contrib">Contributors</h2>

<a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors">
<img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" />
</a>

-## Star History
+<h2 id="star_hist">Star History</h2>

<a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date">
<picture>