Merge branch 'main' into main

imClumsyPanda 11 months ago
parent
commit
c3b5a099a8
12 files changed, with 200 additions and 17 deletions
  1. README.md (+3 -1)
  2. docs/ADVANCED.md (+42 -1)
  3. docs/APIS.md (+1 -1)
  4. docs/README_ja-JP.md (+7 -1)
  5. docs/README_zh-CN.md (+6 -1)
  6. pdf2zh/backend.py (+2 -0)
  7. pdf2zh/converter.py (+2 -1)
  8. pdf2zh/gui.py (+23 -4)
  9. pdf2zh/high_level.py (+5 -5)
  10. pdf2zh/pdf2zh.py (+66 -2)
  11. pdf2zh/translator.py (+42 -0)
  12. pyproject.toml (+1 -0)

+ 3 - 1
README.md

@@ -187,8 +187,10 @@ In the following table, we list all advanced options for reference:
 | `-f`, `-c`     | [Exceptions](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#exceptions)                | `pdf2zh example.pdf -f "(MS.*)"`               |
 | `-cp`          | Compatibility Mode                                                                                            | `pdf2zh example.pdf --compatible`              |
 | `--share`      | Public link                                                                                                   | `pdf2zh -i --share`                            |
-| `--authorized` | Authorization                                                                                                 | `pdf2zh -i --authorized users.txt [auth.html]` |
+| `--authorized` | [Authorization](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#auth)                                                                                                 | `pdf2zh -i --authorized users.txt [auth.html]` |
 | `--prompt`     | [Custom Prompt](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#prompt)                 | `pdf2zh --prompt [prompt.txt]`                 |
+| `--onnx` | [Use Custom DocLayout-YOLO ONNX model] | `pdf2zh --onnx [onnx/model/path]` |
+| `--serverport` | [Use Custom WebUI port] | `pdf2zh --serverport 7860` |
 
 For detailed explanations, please refer to our document about [Advanced Usage](./docs/ADVANCED.md) for a full list of each option.
 

+ 42 - 1
docs/ADVANCED.md

@@ -65,6 +65,9 @@ We've provided a detailed table on the required [environment variables](https://
 | **Tencent**          | `tencent`      | `TENCENTCLOUD_SECRET_ID`, `TENCENTCLOUD_SECRET_KEY`                   | `[Your ID]`, `[Your Key]`                                | See [Tencent](https://www.tencentcloud.com/products/tmt?from_qcintl=122110104)                                                                                                                            |
 | **Dify**             | `dify`         | `DIFY_API_URL`, `DIFY_API_KEY`                                        | `[Your DIFY URL]`, `[Your Key]`                          | See [Dify](https://github.com/langgenius/dify),Three variables, lang_out, lang_in, and text, need to be defined in Dify's workflow input.                                                                 |
 | **AnythingLLM**      | `anythingllm`  | `AnythingLLM_URL`, `AnythingLLM_APIKEY`                               | `[Your AnythingLLM URL]`, `[Your Key]`                   | See [anything-llm](https://github.com/Mintplex-Labs/anything-llm)                                                                                                                                         |
+|**Argos Translate**|`argos`| | |See [argos-translate](https://github.com/argosopentech/argos-translate)|
+
+For large language models that are compatible with the OpenAI API but not listed in the table above, you can set environment variables using the same method outlined for OpenAI in the table.
 
 Use `-s service` or `-s service:model` to specify service:
 
@@ -118,7 +121,7 @@ pdf2zh example.pdf -t 1
 Use `--prompt` to specify which prompt to use in llm:
 
 ```bash
-pdf2zh example.pdf -pr prompt.txt
+pdf2zh example.pdf --prompt prompt.txt
 ```
 
 example prompt.txt
@@ -146,3 +149,41 @@ In custom prompt file, there are three variables can be used.
 [⬆️ Back to top](#toc)
 
 ---
+
+<h3 id="auth">Authorization</h3>
+
+Use `--authorized` to specify which users may access the Web UI and to customize the login page:
+
+```bash
+pdf2zh example.pdf --authorized users.txt auth.html
+```
+
+example users.txt
+Each line contains a username and a password, separated by a comma.
+
+```
+admin,123456
+user1,password1
+user2,abc123
+guest,guest123
+test,test123
+```
+
+example auth.html
+
+```html
+<!DOCTYPE html>
+<html>
+<head>
+    <title>Simple HTML</title>
+</head>
+<body>
+    <h1>Hello, World!</h1>
+    <p>Welcome to my simple HTML page.</p>
+</body>
+</html>
+```
+
+[⬆️ Back to top](#toc)
+
+---
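
The `users.txt` format described in this hunk (one `username,password` pair per line) can be read into the list of tuples that the `auth=user_list` argument in `pdf2zh/gui.py` appears to expect. The following is a minimal sketch under that assumption; `parse_users` is a hypothetical helper written for illustration, not the project's `parse_user_passwd`, whose body is not shown in this diff.

```python
def parse_users(text: str) -> list[tuple[str, str]]:
    """Parse 'username,password' lines into (username, password) tuples."""
    users = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first comma only, so a password may itself contain commas.
        username, sep, password = line.partition(",")
        if not sep:
            raise ValueError(f"Missing comma in line: {line!r}")
        users.append((username.strip(), password.strip()))
    return users


sample = "admin,123456\nuser1,password1\n"
print(parse_users(sample))
```

Splitting on the first comma only is a design choice of this sketch; whether the real parser allows commas in passwords is not stated in the commit.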

+ 1 - 1
docs/APIS.md

@@ -1,7 +1,7 @@
 [**Documentation**](https://github.com/Byaidu/PDFMathTranslate) > **API Details** _(current)_
 
 <h2 id="toc">Table of Content</h2>
-The present project supports two types of APIs;
+The present project supports two types of APIs. All methods require Redis:
 
 - [Functional calls in Python](#api-python)
 - [HTTP protocols](#api-http)

+ 7 - 1
docs/README_ja-JP.md

@@ -175,8 +175,10 @@ Python環境を事前にインストールする必要はありません
 | `-o`  | 出力ディレクトリ | `pdf2zh example.pdf -o output` |
 | `-f`, `-c` | [例外](#exceptions) | `pdf2zh example.pdf -f "(MS.*)"` |
 | `--share` | [gradio公開リンクを取得] | `pdf2zh -i --share` |
-| `--authorized` | [ウェブ認証とカスタム認証ページの追加] | `pdf2zh -i --authorized users.txt [auth.html]` |
+| `--authorized` | [[ウェブ認証とカスタム認証ページの追加](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.)] | `pdf2zh -i --authorized users.txt [auth.html]` |
 | `--prompt` | [カスタムビッグモデルのプロンプトを使用する] | `pdf2zh --prompt [prompt.txt]` |
+| `--onnx` | [カスタムDocLayout-YOLO ONNXモデルの使用] | `pdf2zh --onnx [onnx/model/path]` |
+| `--serverport` | [カスタムWebUIポートを使用する] | `pdf2zh --serverport 7860` |
 
 <h3 id="partial">全文または部分的なドキュメント翻訳</h3>
 
@@ -221,6 +223,10 @@ pdf2zh example.pdf -li en -lo ja
 |**Tencent**|`tencent`|`TENCENTCLOUD_SECRET_ID`, `TENCENTCLOUD_SECRET_KEY`|`[Your ID]`, `[Your Key]`|See [Tencent](https://www.tencentcloud.com/products/tmt?from_qcintl=122110104)|
 |**Dify**|`dify`|`DIFY_API_URL`, `DIFY_API_KEY`|`[Your DIFY URL]`, `[Your Key]`|See [Dify](https://github.com/langgenius/dify),Three variables, lang_out, lang_in, and text, need to be defined in Dify's workflow input.|
 |**AnythingLLM**|`anythingllm`|`AnythingLLM_URL`, `AnythingLLM_APIKEY`|`[Your AnythingLLM URL]`, `[Your Key]`|See [anything-llm](https://github.com/Mintplex-Labs/anything-llm)|
+|**Argos Translate**|`argos`| | |See [argos-translate](https://github.com/argosopentech/argos-translate)|
+
+(needs Japanese translation)
+For large language models that are compatible with the OpenAI API but not listed in the table above, you can set environment variables using the same method outlined for OpenAI in the table.
 
 `-s service` または `-s service:model` を使用してサービスを指定します:
 

+ 6 - 1
docs/README_zh-CN.md

@@ -175,8 +175,10 @@ set HF_ENDPOINT=https://hf-mirror.com
 | `-o`  | 输出目录 | `pdf2zh example.pdf -o output` |
 | `-f`, `-c` | [例外规则](#exceptions) | `pdf2zh example.pdf -f "(MS.*)"` |
 | `--share` | [获取 gradio 公开链接] | `pdf2zh -i --share` |
-| `--authorized` | [添加网页认证和自定义认证页] | `pdf2zh -i --authorized users.txt [auth.html]` |
+| `--authorized` | [[添加网页认证和自定义认证页](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.)] | `pdf2zh -i --authorized users.txt [auth.html]` |
 | `--prompt` | [使用自定义的大模型prompt] | `pdf2zh --prompt [prompt.txt]` |
+| `--onnx` | [使用自定义的 DocLayout-YOLO ONNX 模型] | `pdf2zh --onnx [onnx/model/path]` |
+| `--serverport` | [使用自定义的 WebUI 端口] | `pdf2zh --serverport 7860` |
 
 <h3 id="partial">全文或部分文档翻译</h3>
 
@@ -221,6 +223,9 @@ pdf2zh example.pdf -li en -lo ja
 |**Tencent**|`tencent`|`TENCENTCLOUD_SECRET_ID`, `TENCENTCLOUD_SECRET_KEY`|`[Your ID]`, `[Your Key]`|See [Tencent](https://www.tencentcloud.com/products/tmt?from_qcintl=122110104)|
 |**Dify**|`dify`|`DIFY_API_URL`, `DIFY_API_KEY`|`[Your DIFY URL]`, `[Your Key]`|See [Dify](https://github.com/langgenius/dify),Three variables, lang_out, lang_in, and text, need to be defined in Dify's workflow input.|
 |**AnythingLLM**|`anythingllm`|`AnythingLLM_URL`, `AnythingLLM_APIKEY`|`[Your AnythingLLM URL]`, `[Your Key]`|See [anything-llm](https://github.com/Mintplex-Labs/anything-llm)|
+|**Argos Translate**|`argos`| | |See [argos-translate](https://github.com/argosopentech/argos-translate)|
+
+对于未在上述表格中的,并且兼容 OpenAI api 的大语言模型,可使用表格中的 OpenAI 的方式进行环境变量的设置。
 
 使用 `-s service` 或 `-s service:model` 指定翻译服务:
 

+ 2 - 0
pdf2zh/backend.py

@@ -6,6 +6,7 @@ from pdf2zh import translate_stream
 import tqdm
 import json
 import io
+from pdf2zh.pdf2zh import model
 
 flask_app = Flask("pdf2zh")
 flask_app.config.from_mapping(
@@ -47,6 +48,7 @@ def translate_task(
     doc_mono, doc_dual = translate_stream(
         stream,
         callback=progress_bar,
+        model=model,
         **args,
     )
     return doc_mono, doc_dual

+ 2 - 1
pdf2zh/converter.py

@@ -35,6 +35,7 @@ from pdf2zh.translator import (
     DifyTranslator,
     AnythingLLMTranslator,
     XinferenceTranslator,
+    ArgosTranslator,
 )
 from pymupdf import Font
 
@@ -150,7 +151,7 @@ class TranslateConverter(PDFConverterEx):
         service_name = param[0]
         service_model = param[1] if len(param) > 1 else None
         for translator in [GoogleTranslator, BingTranslator, DeepLTranslator, DeepLXTranslator, OllamaTranslator, XinferenceTranslator, AzureOpenAITranslator,
-                           OpenAITranslator, ZhipuTranslator, ModelScopeTranslator, SiliconTranslator, GeminiTranslator, AzureTranslator, TencentTranslator, DifyTranslator, AnythingLLMTranslator]:
+                           OpenAITranslator, ZhipuTranslator, ModelScopeTranslator, SiliconTranslator, GeminiTranslator, AzureTranslator, TencentTranslator, DifyTranslator, AnythingLLMTranslator, ArgosTranslator]:
             if service_name == translator.name:
                 self.translator = translator(lang_in, lang_out, service_model, envs=envs, prompt=prompt)
         if not self.translator:

+ 23 - 4
pdf2zh/gui.py

@@ -13,6 +13,7 @@ from gradio_pdf import PDF
 
 from pdf2zh import __version__
 from pdf2zh.high_level import translate
+from pdf2zh.pdf2zh import model
 from pdf2zh.translator import (
     AnythingLLMTranslator,
     AzureOpenAITranslator,
@@ -22,6 +23,7 @@ from pdf2zh.translator import (
     DeepLTranslator,
     DeepLXTranslator,
     DifyTranslator,
+    ArgosTranslator,
     GeminiTranslator,
     GoogleTranslator,
     ModelScopeTranslator,
@@ -51,6 +53,7 @@ service_map: dict[str, BaseTranslator] = {
     "Tencent": TencentTranslator,
     "Dify": DifyTranslator,
     "AnythingLLM": AnythingLLMTranslator,
+    "Argos Translate": ArgosTranslator,
 }
 
 # The following variables associate strings with specific languages
@@ -265,6 +268,7 @@ def translate_file(
         "cancellation_event": cancellation_event_map[session_id],
         "envs": _envs,
         "prompt": prompt,
+        "model": model,
     }
     try:
         translate(**param)
@@ -587,7 +591,9 @@ def parse_user_passwd(file_path: str) -> tuple:
     return tuple_list, content
 
 
-def setup_gui(share: bool = False, auth_file: list = ["", ""]) -> None:
+def setup_gui(
+    share: bool = False, auth_file: list = ["", ""], server_port=7860
+) -> None:
     """
     Setup the GUI with the given parameters.
 
@@ -605,7 +611,11 @@ def setup_gui(share: bool = False, auth_file: list = ["", ""]) -> None:
         if len(user_list) == 0:
             try:
                 demo.launch(
-                    server_name="0.0.0.0", debug=True, inbrowser=True, share=share
+                    server_name="0.0.0.0",
+                    debug=True,
+                    inbrowser=True,
+                    share=share,
+                    server_port=server_port,
                 )
             except Exception:
                 print(
@@ -613,13 +623,19 @@ def setup_gui(share: bool = False, auth_file: list = ["", ""]) -> None:
                 )
                 try:
                     demo.launch(
-                        server_name="127.0.0.1", debug=True, inbrowser=True, share=share
+                        server_name="127.0.0.1",
+                        debug=True,
+                        inbrowser=True,
+                        share=share,
+                        server_port=server_port,
                     )
                 except Exception:
                     print(
                         "Error launching GUI using 127.0.0.1.\nThis may be caused by global mode of proxy software."
                     )
-                    demo.launch(debug=True, inbrowser=True, share=True)
+                    demo.launch(
+                        debug=True, inbrowser=True, share=True, server_port=server_port
+                    )
         else:
             try:
                 demo.launch(
@@ -629,6 +645,7 @@ def setup_gui(share: bool = False, auth_file: list = ["", ""]) -> None:
                     share=share,
                     auth=user_list,
                     auth_message=html,
+                    server_port=server_port,
                 )
             except Exception:
                 print(
@@ -642,6 +659,7 @@ def setup_gui(share: bool = False, auth_file: list = ["", ""]) -> None:
                         share=share,
                         auth=user_list,
                         auth_message=html,
+                        server_port=server_port,
                     )
                 except Exception:
                     print(
@@ -653,6 +671,7 @@ def setup_gui(share: bool = False, auth_file: list = ["", ""]) -> None:
                         share=True,
                         auth=user_list,
                         auth_message=html,
+                        server_port=server_port,
                     )
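
The nested `try`/`except` blocks in `setup_gui` all follow one pattern: try `0.0.0.0`, then `127.0.0.1`, then a shared link, passing the same `server_port` each time. A rough sketch of that pattern as a loop is shown below. `launch_with_fallback` and `fake_launch` are hypothetical names used only for illustration; the real code calls `demo.launch(...)` directly, and its final attempt is not wrapped in a `try`.

```python
def launch_with_fallback(launch, attempts):
    """Try each keyword-argument set in order; return the one that worked."""
    last_error = None
    for kwargs in attempts:
        try:
            launch(**kwargs)
            return kwargs
        except Exception as exc:
            last_error = exc
            print(f"Launch failed with {kwargs}: {exc}")
    raise RuntimeError("all launch attempts failed") from last_error


def fake_launch(server_name=None, **_):
    # Hypothetical launcher: pretend that binding to 0.0.0.0 is blocked.
    if server_name == "0.0.0.0":
        raise OSError("cannot bind")


port = 7860
attempts = [
    {"server_name": "0.0.0.0", "share": False, "server_port": port},
    {"server_name": "127.0.0.1", "share": False, "server_port": port},
    {"share": True, "server_port": port},
]
used = launch_with_fallback(fake_launch, attempts)
```

With this shape, a new launch option such as `server_port` is added in one place instead of six.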
 
 

+ 5 - 5
pdf2zh/high_level.py

@@ -21,11 +21,9 @@ from pdfminer.pdfparser import PDFParser
 from pymupdf import Document, Font
 
 from pdf2zh.converter import TranslateConverter
-from pdf2zh.doclayout import DocLayoutModel
+from pdf2zh.doclayout import OnnxModel
 from pdf2zh.pdfinterp import PDFPageInterpreterEx
 
-model = DocLayoutModel.load_available()
-
 resfont_map = {
     "zh-cn": "china-ss",
     "zh-tw": "china-ts",
@@ -88,6 +86,7 @@ def translate_patch(
     noto: Font = None,
     callback: object = None,
     cancellation_event: asyncio.Event = None,
+    model: OnnxModel = None,
     **kwarg: Any,
 ) -> None:
     rsrcmgr = PDFResourceManager()
@@ -179,6 +178,7 @@ def translate_stream(
     vchar: str = "",
     callback: object = None,
     cancellation_event: asyncio.Event = None,
+    model: OnnxModel = None,
     **kwarg: Any,
 ):
     font_list = [("tiro", None)]
@@ -234,7 +234,7 @@ def translate_stream(
 
     fp = io.BytesIO()
     doc_zh.save(fp)
-    obj_patch: dict = translate_patch(fp, prompt=kwarg["prompt"], **locals())
+    obj_patch: dict = translate_patch(fp, **locals())
 
     for obj_id, ops_new in obj_patch.items():
         # ops_old=doc_en.xref_stream(obj_id)
@@ -312,6 +312,7 @@ def translate(
     callback: object = None,
     compatible: bool = False,
     cancellation_event: asyncio.Event = None,
+    model: OnnxModel = None,
     **kwarg: Any,
 ):
     if not files:
@@ -364,7 +365,6 @@ def translate(
 
         if file.startswith(tempfile.gettempdir()):
             os.unlink(file)
-
         s_mono, s_dual = translate_stream(
             s_raw,
             envs=kwarg.get("envs", {}),

+ 66 - 2
pdf2zh/pdf2zh.py

@@ -13,6 +13,8 @@ from typing import List, Optional
 
 from pdf2zh import __version__, log
 from pdf2zh.high_level import translate
+from pdf2zh.doclayout import OnnxModel
+import os
 
 
 def create_parser() -> argparse.ArgumentParser:
@@ -136,6 +138,24 @@ def create_parser() -> argparse.ArgumentParser:
         help="Convert the PDF file into PDF/A format to improve compatibility.",
     )
 
+    parse_params.add_argument(
+        "--onnx",
+        type=str,
+        help="custom onnx model path.",
+    )
+
+    parse_params.add_argument(
+        "--serverport",
+        type=int,
+        help="custom WebUI port.",
+    )
+
+    parse_params.add_argument(
+        "--dir",
+        action="store_true",
+        help="translate directory.",
+    )
+
     return parser
 
 
@@ -155,6 +175,33 @@ def parse_args(args: Optional[List[str]]) -> argparse.Namespace:
     return parsed_args
 
 
+def find_all_files_in_directory(directory_path):
+    """
+    Recursively search all PDF files in the given directory and return their paths as a list.
+
+    :param directory_path: str, the path to the directory to search
+    :return: list of PDF file paths
+    """
+    # Check if the provided path is a directory
+    if not os.path.isdir(directory_path):
+        raise ValueError(f"The provided path '{directory_path}' is not a directory.")
+
+    file_paths = []
+
+    # Walk through the directory recursively
+    for root, _, files in os.walk(directory_path):
+        for file in files:
+            # Check if the file is a PDF
+            if file.lower().endswith(".pdf"):
+                # Append the full file path to the list
+                file_paths.append(os.path.join(root, file))
+
+    return file_paths
+
+
+model = None
+
+
 def main(args: Optional[List[str]] = None) -> int:
     logging.basicConfig()
 
@@ -162,11 +209,21 @@ def main(args: Optional[List[str]] = None) -> int:
 
     if parsed_args.debug:
         log.setLevel(logging.DEBUG)
+    global model
+    if parsed_args.onnx:
+        model = OnnxModel(parsed_args.onnx)
+    else:
+        model = OnnxModel.load_available()
 
     if parsed_args.interactive:
         from pdf2zh.gui import setup_gui
 
-        setup_gui(parsed_args.share, parsed_args.authorized)
+        if parsed_args.serverport:
+            setup_gui(
+                parsed_args.share, parsed_args.authorized, int(parsed_args.serverport)
+            )
+        else:
+            setup_gui(parsed_args.share, parsed_args.authorized)
         return 0
 
     if parsed_args.flask:
@@ -189,7 +246,14 @@ def main(args: Optional[List[str]] = None) -> int:
         except Exception:
             raise ValueError("prompt error.")
 
-    translate(**vars(parsed_args))
+    if parsed_args.dir:
+        untranslated_files = find_all_files_in_directory(parsed_args.files[0])
+        parsed_args.files = untranslated_files
+        print(parsed_args)
+        translate(model=model, **vars(parsed_args))
+        return 0
+    # print(parsed_args)
+    translate(model=model, **vars(parsed_args))
     return 0
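
To see how the three new flags behave, here is a small stand-alone parser that mirrors the `add_argument` calls from this hunk. It is only a sketch: `build_parser` and the `pdf2zh-sketch` program name are made up for illustration, and the `default=7860` on `--serverport` is an assumption of this example (the commit leaves the default unset and branches in `main` instead).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pdf2zh-sketch")
    parser.add_argument("files", nargs="*", help="input paths")
    parser.add_argument("--onnx", type=str, help="custom onnx model path.")
    # Assumption: a default here would remove the need for an if/else in main().
    parser.add_argument(
        "--serverport", type=int, default=7860, help="custom WebUI port."
    )
    parser.add_argument("--dir", action="store_true", help="translate directory.")
    return parser


args = build_parser().parse_args(["papers", "--dir"])
print(args.onnx, args.serverport, args.dir)  # prints: None 7860 True
```

Because `type=int` already converts the value, the extra `int(parsed_args.serverport)` call in `main` would not be needed in either variant.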
 
 

+ 42 - 0
pdf2zh/translator.py

@@ -16,6 +16,8 @@ from tencentcloud.common import credential
 from tencentcloud.tmt.v20180321.tmt_client import TmtClient
 from tencentcloud.tmt.v20180321.models import TextTranslateRequest
 from tencentcloud.tmt.v20180321.models import TextTranslateResponse
+import argostranslate.package
+import argostranslate.translate
 
 import json
 
@@ -647,3 +649,43 @@ class DifyTranslator(BaseTranslator):
 
         # Parse the response
         return response_data.get("data", {}).get("outputs", {}).get("text", [])
+
+
+class ArgosTranslator(BaseTranslator):
+    name = "argos"
+
+    def __init__(self, lang_in, lang_out, model, **kwargs):
+        super().__init__(lang_in, lang_out, model)
+        lang_in = self.lang_map.get(lang_in.lower(), lang_in)
+        lang_out = self.lang_map.get(lang_out.lower(), lang_out)
+        self.lang_in = lang_in
+        self.lang_out = lang_out
+        argostranslate.package.update_package_index()
+        available_packages = argostranslate.package.get_available_packages()
+        try:
+            available_package = list(
+                filter(
+                    lambda x: x.from_code == self.lang_in
+                    and x.to_code == self.lang_out,
+                    available_packages,
+                )
+            )[0]
+        except Exception:
+            raise ValueError(
+                "lang_in and lang_out pair not supported by Argos Translate."
+            )
+        download_path = available_package.download()
+        argostranslate.package.install_from_path(download_path)
+
+    def translate(self, text):
+        # Translate
+        installed_languages = argostranslate.translate.get_installed_languages()
+        from_lang = list(filter(lambda x: x.code == self.lang_in, installed_languages))[
+            0
+        ]
+        to_lang = list(filter(lambda x: x.code == self.lang_out, installed_languages))[
+            0
+        ]
+        translation = from_lang.get_translation(to_lang)
+        translatedText = translation.translate(text)
+        return translatedText
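
`ArgosTranslator.__init__` selects a language package with `list(filter(...))[0]` and relies on the resulting exception when no pair matches. The same lookup can be written with `next()` and an explicit check, as sketched below. `FakePackage` and `find_package` are hypothetical stand-ins so the example runs without the `argostranslate` dependency or network access; only the `from_code` and `to_code` attribute names are taken from the diff.

```python
from dataclasses import dataclass


@dataclass
class FakePackage:
    """Hypothetical stand-in for an Argos package entry (not the real class)."""

    from_code: str
    to_code: str


def find_package(packages, lang_in: str, lang_out: str):
    """Return the first package matching the language pair, or raise ValueError."""
    match = next(
        (p for p in packages if p.from_code == lang_in and p.to_code == lang_out),
        None,
    )
    if match is None:
        raise ValueError("lang_in and lang_out pair not supported by Argos Translate.")
    return match


packages = [FakePackage("en", "zh"), FakePackage("en", "ja")]
print(find_package(packages, "en", "ja"))
```

An explicit `None` check keeps unrelated errors (for example, a failure while fetching the package index) from being reported as an unsupported language pair.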

+ 1 - 0
pyproject.toml

@@ -30,6 +30,7 @@ dependencies = [
     "gradio_pdf>=0.0.21",
     "pikepdf",
     "peewee>=3.17.8",
+    "argostranslate",
 ]
 
 [project.optional-dependencies]