Merge branch 'Byaidu:main' into main

IuvenisSapiens 1 year ago
parent
commit
047aa12b2f

+ 2 - 1
.github/ISSUE_TEMPLATE/问题反馈.md

@@ -9,7 +9,8 @@ assignees: ''
 
 ## 问题描述
 请对问题进行描述,并提供日志或截图
-**本项目不处理网络环境引发的问题**(例如 Empty translation result/Connection reset)
+请确认 issues 中没有相同问题且完整阅读 wiki
+**本项目不处理网络环境引发的问题**(例如 empty translation result/connection reset/check_hostname requires server_hostname/certificate verify failed)
 
 ## 测试文档
 > [!IMPORTANT]

+ 20 - 0
Dockerfile.Demo

@@ -0,0 +1,20 @@
+FROM python:3.12
+
+WORKDIR /app
+
+ENV PYTHONUNBUFFERED=1
+
+RUN apt-get update && apt-get install -y libgl1 \
+    && rm -rf /var/lib/apt/lists/*
+
+RUN pip install pdf2zh
+RUN mkdir -p /data
+RUN chmod 777 /data
+RUN mkdir -p /app
+RUN chmod 777 /app
+RUN mkdir -p /.cache
+RUN chmod 777 /.cache
+RUN mkdir -p ./gradio_files
+RUN chmod 777 ./gradio_files
+
+CMD ["pdf2zh", "-i"]
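
Note on `Dockerfile.Demo`: the image starts `pdf2zh -i`, and the README in this same commit publishes the GUI on port 7860 via `docker run -p 7860:7860`. The snippet below is a minimal sketch, not part of the commit, of how one might wait for that GUI to come up before opening a browser or running a smoke test. The helper name `wait_for_http` and its parameters are hypothetical, and the address `http://localhost:7860/` is assumed from the README rather than from the Dockerfile, which has no `EXPOSE` line.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll `url` until it answers with any HTTP response or `timeout` elapses.

    Hypothetical helper for illustration; it is not part of pdf2zh.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status.
            return True
        except (urllib.error.URLError, OSError):
            # Nothing is listening yet, or the connection was dropped.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With the container running, `wait_for_http("http://localhost:7860/")` would return `True` once the server responds and `False` if nothing answers within the timeout.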

+ 107 - 44
README.md

@@ -4,7 +4,7 @@ English | [简体中文](README_zh-CN.md)
 
 <img src="./docs/images/banner.png" width="320px"  alt="PDF2ZH"/>  
 
-## PDFMathTranslate
+<h2 id="title">PDFMathTranslate</h2>
 
 <p>
   <!-- PyPI -->
@@ -15,6 +15,8 @@ English | [简体中文](README_zh-CN.md)
   <!-- License -->
   <a href="./LICENSE">
     <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a>
+  <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
+    <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-yellow"/></a>
   <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
     <img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"/></a>
 </p>
@@ -23,29 +25,105 @@ English | [简体中文](README_zh-CN.md)
 
 PDF scientific paper translation and bilingual comparison.
 
-- 📊 Retain formulas and charts.
+- 📊 Preserve formulas, charts, table of contents, and annotations *([preview](#preview))*.
+- 🌐 Support [multiple languages](#language) and diverse [translation services](#services).
+- 🤖 Provide a [command-line tool](#usage), an [interactive user interface](#gui), and [Docker](#docker) deployment.
 
-- 📄 Preserve table of contents.
+Feel free to provide feedback in [GitHub Issues](https://github.com/Byaidu/PDFMathTranslate/issues) or [Telegram Group](https://t.me/+Z9_SgnxmsmA5NzBl).
 
-- 🌐 Support multiple translation services.
+<h2 id="updates">Updates</h2>
 
-Feel free to provide feedback in [issues](https://github.com/Byaidu/PDFMathTranslate/issues) or [user group](https://t.me/+Z9_SgnxmsmA5NzBl).
+- [Nov. 20 2024] 🌟 [Demo](#demo) is online!
+- [Nov. 20 2024] Supports [Docker](#docker)
+- [Nov. 20 2024] Supports [multi-threaded translation](#threads)
+- [Nov. 19 2024] Provides an [interactive graphical user interface](#gui)
+- [Nov. 18 2024] Supports [more services: DeepL, DeepLX, and Azure](#services)
 
-## Installation
+<h2 id="demo">Demo 🌟</h2>
 
-Require Python version >=3.8, <=3.12
+You can try [our demo on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) without installation.  
+Note that the computing resources of the demo are limited, so please avoid abusing them.
 
-```bash
-pip install pdf2zh
-```
+<h2 id="install">Installation and Usage</h2>
+
+We provide three methods for using this project: [command line](#cmd), [GUI](#gui), and [Docker](#docker).
+
+<h3 id="cmd">Method I. Commandline</h3>
+
+  1. Python installed (3.8 <= version <= 3.12)
+  2. Install our package
+
+      ```bash
+      pip install pdf2zh
+      ```
+
+  3. Use:
+
+      ```bash
+      pdf2zh document.pdf
+      ```
+
+<h3 id="gui">Method II. GUI</h3>
+
+1. Python installed (3.8 <= version <= 3.12)
+2. Install our package
+
+      ```bash
+      pip install pdf2zh
+      ```
+
+3. Start using in browser:
+
+      ```bash
+      pdf2zh -i
+      ```
+
+4. If your browser does not open automatically, go to:
+
+    ```bash
+    http://localhost:7860/
+    ```
+
+<img src="./docs/images/before.png" width="500"/>
+
+See [documentation for GUI](./docs/README_GUI.md) for more details.
 
-## Usage
+<h3 id="docker">Method III. Docker</h3>
+
+1. Pull and run:
+
+    ```bash
+    docker pull byaidu/pdf2zh
+    docker run -p 7860:7860 byaidu/pdf2zh
+    ```
+
+2. Open in browser:
+
+    ```
+    http://localhost:7860/
+    ```
+
+<h2 id="usage">Advanced Options</h2>
 
 Execute the translation command in the command line to generate the translated document `example-zh.pdf` and the bilingual document `example-dual.pdf` in the current directory. Use Google as the default translation service.
 
-Please refer to [ChatGPT](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) for how to set environment variables.
+<img src="./docs/images/cmd.explained.png" width="580px"  alt="cmd"/>  
 
-### Full / partial document translation
+In the following table, we list all advanced options for reference:
+
+| Option    | Function | Example |
+| -------- | ------- |------- |
+| `-i`  | [Enter GUI](#gui) |  `pdf2zh -i` |
+| `-p`  | [Partial document translation](#partial) |  `pdf2zh example.pdf -p 1` |
+| `-li` | [Source language](#language) |  `pdf2zh example.pdf -li en` |
+| `-lo` | [Target language](#language) |  `pdf2zh example.pdf -lo zh` |
+| `-s`  | [Translation service](#services) |  `pdf2zh example.pdf -s deepl` |
+| `-t`  | [Multi-threads](#threads) | `pdf2zh example.pdf -t 1` |
+| `-f`, `-c` | [Exceptions](#exceptions) | `pdf2zh example.pdf -f "(MS.*)"` |
+
+Some services require environment variables to be set. Please refer to [ChatGPT](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) for how to set them.
+
+<h3 id="partial">Full / partial document translation</h3>
 
 - Entire document
 
@@ -59,7 +137,7 @@ Please refer to [ChatGPT](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57c
   pdf2zh example.pdf -p 1-3,5
   ```
 
-### Specify source and target languages
+<h3 id="language">Specify source and target languages</h3>
 
 See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v1/languages), [DeepL Languages Codes](https://developers.deepl.com/docs/resources/supported-languages)
 
@@ -67,7 +145,7 @@ See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v
 pdf2zh example.pdf -li en -lo ja
 ```
 
-### Translate with Different Services
+<h3 id="services">Translate with Different Services</h3>
 
 - **DeepL**
 
@@ -129,7 +207,7 @@ pdf2zh example.pdf -li en -lo ja
   pdf2zh example.pdf -s azure
   ```
 
-### Translate wih exceptions
+<h3 id="exceptions">Translate with exceptions</h3>
 
 Use regex to specify formula fonts and characters that need to be preserved.
 
@@ -137,30 +215,15 @@ Use regex to specify formula fonts and characters that need to be preserved.
 pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
 ```
 
-### Interact with GUI
+<h3 id="threads">Specify threads</h3>
 
-<img src="./docs/images/before.png" width="500"/>
+Use `-t` to specify how many threads to use in translation:
 
 ```bash
-pdf2zh -i
-```
-
-See [documentation for GUI](./docs/README_GUI.md) for more details.
-
-### Docker
-
-```bash
-docker pull byaidu/pdf2zh
-docker run -p 7860:7860 byaidu/pdf2zh
-```
-
-Open in browser:
-
-```
-http://localhost:7860/
+pdf2zh example.pdf -t 1
 ```
 
-## Preview
+<h2 id="preview">Preview</h2>
 
 ![image](https://github.com/user-attachments/assets/57e1cde6-c647-4af8-8f8f-587a40050dde)
 
@@ -168,27 +231,27 @@ http://localhost:7860/
 
 ![image](https://github.com/user-attachments/assets/5fe6af83-2f5b-47b1-9dd1-4aee6bc409de)
 
-## Acknowledgement
+<h2 id="acknowledgement">Acknowledgements</h2>
 
-Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)
+- Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF)
 
-Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)
+- Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six)
 
-Document extraction: [MinerU](https://github.com/opendatalab/MinerU)
+- Document extraction: [MinerU](https://github.com/opendatalab/MinerU)
 
-Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)
+- Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate)
 
-Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
+- Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
 
-Document standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)
+- Document standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)
 
-## Contributors
+<h2 id="contrib">Contributors</h2>
 
 <a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors">
   <img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" />
 </a>
 
-## Star History
+<h2 id="star_hist">Star History</h2>
 
 <a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date">
  <picture>
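
Note on the `-p` option documented above (`pdf2zh example.pdf -p 1-3,5`): the page spec combines single pages and inclusive ranges separated by commas. The snippet below is a minimal sketch of one way to read such a spec. It is not pdf2zh's own parser; the function name `parse_page_ranges` is hypothetical, and page numbers are kept 1-based exactly as typed, which is an assumption about how the tool interprets them.

```python
from typing import List


def parse_page_ranges(spec: str) -> List[int]:
    """Expand a spec such as "1-3,5" into [1, 2, 3, 5].

    Illustrative only: numbers are returned as written (assumed 1-based).
    """
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,2"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(start, end + 1))  # ranges are inclusive
        else:
            pages.append(int(part))
    return pages
```

Under this reading, `parse_page_ranges("1-3,5")` yields pages 1, 2, 3 and 5, and a reversed range such as `3-1` is rejected rather than silently reordered.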

+ 108 - 45
README_zh-CN.md

@@ -4,7 +4,7 @@
 
 <img src="./docs/images/banner.png" width="320px"  alt="PDF2ZH"/>  
 
-## PDFMathTranslate
+<h2 id="title">PDFMathTranslate</h2>
 
 <p>
   <!-- PyPI -->
@@ -15,37 +15,115 @@
   <!-- License -->
   <a href="./LICENSE">
     <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a>
+  <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
+    <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-yellow"/></a>
   <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
     <img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"/></a>
 </p>
 
 </div>
 
-PDF 文档翻译及双语对照工具
+科学 PDF 文档翻译及双语对照工具
 
-- 📊 保留公式和图表
+- 📊 保留公式、图表、目录和注释 *([预览效果](#preview))*
+- 🌐 支持 [多种语言](#language) 和 [诸多翻译服务](#services)
+- 🤖 提供 [命令行工具](#usage),[图形交互界面](#gui),以及 [容器化部署](#docker)
 
-- 📄 保留可索引目录
+欢迎在 [GitHub Issues](https://github.com/Byaidu/PDFMathTranslate/issues) 或 [Telegram 用户群](https://t.me/+Z9_SgnxmsmA5NzBl) 中提供反馈。
 
-- 🌐 支持多种翻译服务
+<h2 id="updates">近期更新</h2>
 
-欢迎在 [issues](https://github.com/Byaidu/PDFMathTranslate/issues) 或 [用户群](https://t.me/+Z9_SgnxmsmA5NzBl) 中提供反馈
+- [Nov. 20 2024] 🌟 提供了 [在线演示](#demo)!
+- [Nov. 20 2024] 支持 [容器化部署](#docker)
+- [Nov. 20 2024] 支持速度更快的 [多线程翻译](#threads)
+- [Nov. 19 2024] 提供了[图形用户界面](#gui)
+- [Nov. 18 2024] 支持更多翻译服务,包含 [DeepL, DeepLX, 和 Azure](#services)
 
-## 安装
+<h2 id="demo">在线演示 🌟</h2>
 
-要求 Python 版本 >=3.8, <=3.12
+你可以立即尝试 [在 HuggingFace 上的在线演示](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) 而无需安装.  
+请注意,演示的计算资源有限,因此请避免滥用。
 
-```bash
-pip install pdf2zh
-```
+<h2 id="install">安装和使用</h2>
+
+我们提供了三种使用该项目的方法:[命令行工具](#cmd)、[图形交互界面](#gui) 和 [容器化部署](#docker).
+
+<h3 id="cmd">方法一、命令行工具</h3>
+
+  1. 确保已安装 Python,版本不低于 3.8 且不高于 3.12
+  2. 安装此程序:
+
+      ```bash
+      pip install pdf2zh
+      ```
+
+  3. 开始使用:
+
+      ```bash
+      pdf2zh document.pdf
+      ```
+
+<h3 id="gui">方法二、图形交互界面</h3>
+
+1. 确保已安装 Python,版本不低于 3.8 且不高于 3.12
+2. 安装此程序:
+
+      ```bash
+      pip install pdf2zh
+      ```
+
+3. 开始在浏览器中使用:
+
+      ```bash
+      pdf2zh -i
+      ```
+
+4. 如果您的浏览器没有自动启动并跳转,请用浏览器打开:
+
+    ```bash
+    http://localhost:7860/
+    ```
+
+<img src="./docs/images/before.png" width="500"/>
+
+查看 [documentation for GUI](./docs/README_GUI.md) 获取细节说明.
 
-## 使用
+<h3 id="docker">方法三、容器化部署</h3>
+
+1. 拉取 Docker 镜像并运行:
+
+    ```bash
+    docker pull byaidu/pdf2zh
+    docker run -p 7860:7860 byaidu/pdf2zh
+    ```
+
+2. 通过浏览器打开:
+
+    ```
+    http://localhost:7860/
+    ```
+
+<h2 id="usage">高级选项</h2>
 
 在命令行中执行翻译命令,生成译文文档 `example-zh.pdf` 和双语对照文档 `example-dual.pdf`,默认使用 Google 翻译服务
 
-关于设置环境变量的详细说明,请参考 [ChatGPT](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4)
+<img src="./docs/images/cmd.explained.png" width="580px"  alt="cmd"/>  
 
-### 全文或部分文档翻译
+我们在下表中列出了所有高级选项,以供参考:
+
+| Option    | Function | Example |
+| -------- | ------- |------- |
+| `-i`  | [进入图形界面](#gui) |  `pdf2zh -i` |
+| `-p`  | [仅翻译部分文档](#partial) |  `pdf2zh example.pdf -p 1` |
+| `-li` | [源语言](#language) |  `pdf2zh example.pdf -li en` |
+| `-lo` | [目标语言](#language) |  `pdf2zh example.pdf -lo zh` |
+| `-s`  | [指定翻译服务](#services) |  `pdf2zh example.pdf -s deepl` |
+| `-t`  | [多线程](#threads) | `pdf2zh example.pdf -t 1` |
+| `-f`, `-c` | [例外规则](#exceptions) | `pdf2zh example.pdf -f "(MS.*)"` |
+
+某些服务需要设置环境变量。关于设置环境变量的详细说明,请参考 [ChatGPT](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4)
+
+<h3 id="partial">全文或部分文档翻译</h3>
 
 - **全文翻译**
 
@@ -59,7 +137,7 @@ pdf2zh example.pdf
 pdf2zh example.pdf -p 1-3,5
 ```
 
-### 指定源语言和目标语言
+<h3 id="language">指定源语言和目标语言</h3>
 
 参考 [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v1/languages), [DeepL Languages Codes](https://developers.deepl.com/docs/resources/supported-languages)
 
@@ -67,7 +145,7 @@ pdf2zh example.pdf -p 1-3,5
 pdf2zh example.pdf -li en -lo ja
 ```
 
-### 使用不同的翻译服务
+<h3 id="services">使用不同的翻译服务</h3>
 
 - **DeepL**
 
@@ -134,7 +212,7 @@ pdf2zh example.pdf -s openai:gpt-4o
 pdf2zh example.pdf -s azure
 ```
 
-### 指定例外规则
+<h3 id="exceptions">指定例外规则</h3>
 
 使用正则表达式指定需保留的公式字体与字符
 
@@ -142,30 +220,15 @@ pdf2zh example.pdf -s azure
 pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
 ```
 
-### 图形化交互界面
-
-<img src="./docs/images/before.png" width="500"/>
-
-```bash
-pdf2zh -i
-```
-
-详见 [GUI 文档](./docs/README_GUI.md)
+<h3 id="threads">指定线程数量</h3>
 
-### Docker
+使用 `-t` 指定翻译时使用的线程数量:
 
 ```bash
-docker pull byaidu/pdf2zh
-docker run -p 7860:7860 byaidu/pdf2zh
-```
-
-在浏览器中打开:
-
-```
-http://localhost:7860/
+pdf2zh example.pdf -t 1
 ```
 
-## 预览
+<h2 id="preview">预览</h2>
 
 ![image](https://github.com/user-attachments/assets/57e1cde6-c647-4af8-8f8f-587a40050dde)
 
@@ -173,27 +236,27 @@ http://localhost:7860/
 
 ![image](https://github.com/user-attachments/assets/5fe6af83-2f5b-47b1-9dd1-4aee6bc409de)
 
-## 致谢
+<h2 id="acknowledgement">致谢</h2>
 
-文档合并:[PyMuPDF](https://github.com/pymupdf/PyMuPDF)
+- 文档合并:[PyMuPDF](https://github.com/pymupdf/PyMuPDF)
 
-文档解析:[Pdfminer.six](https://github.com/pdfminer/pdfminer.six)
+- 文档解析:[Pdfminer.six](https://github.com/pdfminer/pdfminer.six)
 
-文档提取:[MinerU](https://github.com/opendatalab/MinerU)
+- 文档提取:[MinerU](https://github.com/opendatalab/MinerU)
 
-多线程翻译:[MathTranslate](https://github.com/SUSYUSTC/MathTranslate)
+- 多线程翻译:[MathTranslate](https://github.com/SUSYUSTC/MathTranslate)
 
-布局解析:[DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
+- 布局解析:[DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
 
-文档标准:[PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)
+- 文档标准:[PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/)
 
-## 贡献者
+<h2 id="contrib">贡献者</h2>
 
 <a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors">
   <img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" />
 </a>
 
-## Star History
+<h2 id="star_hist">星标历史</h2>
 
 <a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date">
  <picture>
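
Note on the `-f` / `-c` exception patterns shown in both READMEs: the two regexes are easier to review when run against a few sample inputs. The snippet below is a minimal sketch using Python's `re` module. It assumes the patterns are applied with `re.match` (anchored at the start of the string), which is an assumption about pdf2zh's behavior, not something this diff confirms. The helper names and the sample font names are illustrative.

```python
import re

# Patterns copied verbatim from the README example.
FONT_PATTERN = re.compile(r"(CM[^RT].*|MS.*|.*Ital)")
CHAR_PATTERN = re.compile(r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])")


def is_formula_font(font_name: str) -> bool:
    """True if the font name matches the -f pattern from its first character."""
    return FONT_PATTERN.match(font_name) is not None


def is_formula_char(ch: str) -> bool:
    """True if the character matches the -c pattern."""
    return CHAR_PATTERN.match(ch) is not None


# Sample inputs (illustrative; not taken from the project).
for name in ["CMMI10", "CMSY10", "CMR10", "MSBM10", "Times-Italic", "Times-Roman"]:
    print(f"{name:13} formula font: {is_formula_font(name)}")

for ch in ["(", "=", "5", "α", "x"]:
    print(f"{ch!r:5} formula char: {is_formula_char(ch)}")
```

Under that assumption, `CM[^RT].*` keeps Computer Modern math fonts such as `CMMI10` while excluding the roman and typewriter text fonts (`CMR…`, `CMT…`), and the character class `[\u0080-\ufaff]` treats any code point in that range, Greek letters included, as formula content.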

BIN
docs/images/after.png


BIN
docs/images/before.png


BIN
docs/images/cmd.explained.png


BIN
docs/images/cmd.explained.zh.png