Merge pull request #258 from tastelikefeet/feat/fix-token-leak

Fix token leakage
Byaidu, 1 year ago
commit d21009a643
8 files changed, with 126 additions and 49 deletions
  1. README.md (+9 -2)
  2. README_ja-JP.md (+9 -1)
  3. README_zh-CN.md (+8 -1)
  4. pdf2zh/converter.py (+4 -1)
  5. pdf2zh/doclayout.py (+13 -1)
  6. pdf2zh/gui.py (+3 -2)
  7. pdf2zh/high_level.py (+13 -3)
  8. pdf2zh/translator.py (+67 -38)

+ 9 - 2
README.md

@@ -19,6 +19,8 @@ English | [简体中文](README_zh-CN.md) | [日本語](README_ja-JP.md)
     <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"></a>
   <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
     <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"></a>
+  <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate">
+    <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a>
   <a href="https://github.com/Byaidu/PDFMathTranslate/pulls">
     <img src="https://img.shields.io/badge/contributions-welcome-green"></a>
   <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
@@ -61,15 +63,20 @@ Feel free to provide feedback in [GitHub Issues](https://github.com/Byaidu/PDFMa
 
 You can try our [public service](https://pdf2zh.com/) online without installation.  
 
-### Hugging Face Demo
+### Demos
 
-You can try [our demo on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) without installation.
+You can try [our demo on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) or [our demo on ModelScope](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate) without installation.
 Note that the computing resources of the demo are limited, so please avoid abusing them.
 
 <h2 id="install">Installation and Usage</h2>
 
 We provide four methods for using this project: [Commandline](#cmd), [Portable](#portable), [GUI](#gui), and [Docker](#docker).
 
+pdf2zh needs an extra model (`wybxc/DocLayout-YOLO-DocStructBench-onnx`), which is also available on ModelScope. If you have problems downloading this model, try setting this environment variable:
+```shell
+USE_MODELSCOPE=1 pdf2zh
+```
+
 <h3 id="cmd">Method I. Commandline</h3>
 
   1. Python installed (3.8 <= version <= 3.12)

+ 9 - 1
README_ja-JP.md

@@ -19,6 +19,8 @@
     <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a>
   <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
     <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"/></a>
+  <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate">
+    <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a>
   <a href="https://github.com/Byaidu/PDFMathTranslate/pulls">
     <img src="https://img.shields.io/badge/contributions-welcome-green"/></a>
   <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
@@ -63,13 +65,19 @@
 
 ### Hugging Face デモ
 
-インストールなしで [HuggingFace上のデモ](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) を試すことができます。
+インストールなしで [HuggingFace上のデモ](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker), [ModelScope上のデモ](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate) を試すことができます。
 デモの計算リソースは限られているため、乱用しないようにしてください。
 
 <h2 id="install">インストールと使用方法</h2>
 
 このプロジェクトを使用するための4つの方法を提供しています:[コマンドライン](#cmd)、[ポータブル](#portable)、[GUI](#gui)、および [Docker](#docker)。
 
+pdf2zhの実行には追加モデル(`wybxc/DocLayout-YOLO-DocStructBench-onnx`)が必要です。このモデルはModelScopeでも見つけることができます。起動時にこのモデルのダウンロードに問題がある場合は、以下の環境変数を使用してください:
+
+```shell
+USE_MODELSCOPE=1 pdf2zh
+```
+
 <h3 id="cmd">方法1. コマンドライン</h3>
 
   1. Pythonがインストールされていること (バージョン3.8 <= バージョン <= 3.12)

+ 8 - 1
README_zh-CN.md

@@ -19,6 +19,8 @@
     <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a>
   <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker">
     <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"/></a>
+  <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate">
+    <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a>
   <a href="https://github.com/Byaidu/PDFMathTranslate/pulls">
     <img src="https://img.shields.io/badge/contributions-welcome-green"/></a>
   <a href="https://t.me/+Z9_SgnxmsmA5NzBl">
@@ -63,13 +65,18 @@
 
 ### Hugging Face 在线演示
 
-你可以立即尝试 [在 HuggingFace 上的在线演示](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) 而无需安装
+你可以立即尝试 [在 HuggingFace 上的在线演示](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker)和[魔搭的在线演示](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate)而无需安装
 请注意,演示的计算资源有限,因此请避免滥用
 
 <h2 id="install">安装和使用</h2>
 
 我们提供了四种使用该项目的方法:[命令行工具](#cmd)、[便携式安装](#portable)、[图形交互界面](#gui) 和 [容器化部署](#docker).
 
+pdf2zh的运行依赖于额外模型(`wybxc/DocLayout-YOLO-DocStructBench-onnx`),该模型在魔搭上也可以找到。如果你在启动时下载该模型遇到问题,请使用如下环境变量:
+```shell
+USE_MODELSCOPE=1 pdf2zh
+```
+
 <h3 id="cmd">方法一、命令行工具</h3>
 
   1. 确保安装了版本大于 3.8 且小于 3.12 的 Python

+ 4 - 1
pdf2zh/converter.py

@@ -1,3 +1,5 @@
+from typing import Dict
+
 from pdfminer.pdfinterp import PDFGraphicState, PDFResourceManager
 from pdfminer.pdffont import PDFCIDFont
 from pdfminer.converter import PDFConverter
@@ -133,6 +135,7 @@ class TranslateConverter(PDFConverterEx):
         service: str = "",
         resfont: str = "",
         noto: Font = None,
+        envs: Dict = None,
     ) -> None:
         super().__init__(rsrcmgr)
         self.vfont = vfont
@@ -148,7 +151,7 @@ class TranslateConverter(PDFConverterEx):
         for translator in [GoogleTranslator, BingTranslator, DeepLTranslator, DeepLXTranslator, OllamaTranslator, AzureOpenAITranslator,
                            OpenAITranslator, ZhipuTranslator, ModelScopeTranslator, SiliconTranslator, GeminiTranslator, AzureTranslator, TencentTranslator, DifyTranslator, AnythingLLMTranslator]:
             if service_name == translator.name:
-                self.translator = translator(lang_in, lang_out, service_model)
+                self.translator = translator(lang_in, lang_out, service_model, envs=envs)
         if not self.translator:
             raise ValueError("Unsupported translation service")
 

+ 13 - 1
pdf2zh/doclayout.py

@@ -1,4 +1,6 @@
 import abc
+import os.path
+
 import cv2
 import numpy as np
 import ast
@@ -70,7 +72,17 @@ class OnnxModel(DocLayoutModel):
 
     @staticmethod
     def from_pretrained(repo_id: str, filename: str):
-        pth = hf_hub_download(repo_id=repo_id, filename=filename, etag_timeout=1)
+        if os.environ.get("USE_MODELSCOPE", "0") == "1":
+            repo_mapping = {
+                # Edit here to add more models
+                "wybxc/DocLayout-YOLO-DocStructBench-onnx": "AI-ModelScope/DocLayout-YOLO-DocStructBench-onnx"
+            }
+            from modelscope import snapshot_download
+
+            model_dir = snapshot_download(repo_mapping[repo_id])
+            pth = os.path.join(model_dir, filename)
+        else:
+            pth = hf_hub_download(repo_id=repo_id, filename=filename, etag_timeout=1)
         return OnnxModel(pth)
 
     @property
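
The source-selection branch in `from_pretrained` can be sketched in isolation, without touching the network. This is a minimal sketch rather than the project's code: `resolve_model_source` and `REPO_MAPPING` are hypothetical names introduced for illustration, and the function only decides which hub and repository ID would be used. The actual downloads still go through `hf_hub_download` or `snapshot_download` as shown in the diff.

```python
import os
from typing import Mapping, Optional, Tuple

# Hypothetical mapping from Hugging Face repo IDs to ModelScope repo IDs,
# mirroring the single entry added in the diff.
REPO_MAPPING = {
    "wybxc/DocLayout-YOLO-DocStructBench-onnx": "AI-ModelScope/DocLayout-YOLO-DocStructBench-onnx",
}


def resolve_model_source(
    repo_id: str, environ: Optional[Mapping[str, str]] = None
) -> Tuple[str, str]:
    """Return (hub_name, repo_id_on_that_hub) without downloading anything."""
    environ = os.environ if environ is None else environ
    if environ.get("USE_MODELSCOPE", "0") == "1":
        if repo_id not in REPO_MAPPING:
            # Assumption: failing loudly is preferable to a bare KeyError.
            raise ValueError(f"No ModelScope mirror configured for {repo_id!r}")
        return "modelscope", REPO_MAPPING[repo_id]
    return "huggingface", repo_id
```

Passing the environment as an argument keeps the decision testable. With the diff's own code, an unmapped `repo_id` would surface as a `KeyError` from `repo_mapping[repo_id]`.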

+ 3 - 2
pdf2zh/gui.py

@@ -165,8 +165,9 @@ def translate_file(
     lang_from = lang_map[lang_from]
     lang_to = lang_map[lang_to]
 
+    _envs = {}
     for i, env in enumerate(translator.envs.items()):
-        os.environ[env[0]] = envs[i]
+        _envs[env[0]] = envs[i]
 
     print(f"Files before translation: {os.listdir(output)}")
 
@@ -183,8 +184,8 @@ def translate_file(
         "thread": 4,
         "callback": progress_bar,
         "cancellation_event": cancellation_event_map[session_id],
+        "envs": _envs,
     }
-    print(param)
     try:
         translate(**param)
     except CancelledError:
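
The change in `translate_file` replaces writes to `os.environ` with a per-call dict. The sketch below illustrates one plausible reason this matters in a long-running GUI process. It is an illustration under stated assumptions, not the project's code: `DEMO_API_KEY` and both helper functions are hypothetical.

```python
import os

DEMO_KEY = "DEMO_API_KEY"  # hypothetical variable name, for illustration only


def run_with_global_env(api_key: str) -> None:
    """Old pattern: the caller's key is written into the process-wide environment."""
    os.environ[DEMO_KEY] = api_key


def run_with_scoped_envs(api_key: str) -> dict:
    """New pattern: the key lives in a dict owned by this call only."""
    return {DEMO_KEY: api_key}


os.environ.pop(DEMO_KEY, None)

run_with_global_env("alice-secret")
still_in_process = os.environ.get(DEMO_KEY)  # the key outlives Alice's request

os.environ.pop(DEMO_KEY, None)
alice_envs = run_with_scoped_envs("alice-secret")
visible_globally = os.environ.get(DEMO_KEY)  # nothing was written to os.environ
```

With the scoped variant, the key is reachable only through the dict that is handed to `translate(**param)`.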

+ 13 - 3
pdf2zh/high_level.py

@@ -92,7 +92,17 @@ def translate_patch(
     rsrcmgr = PDFResourceManager()
     layout = {}
     device = TranslateConverter(
-        rsrcmgr, vfont, vchar, thread, layout, lang_in, lang_out, service, resfont, noto
+        rsrcmgr,
+        vfont,
+        vchar,
+        thread,
+        layout,
+        lang_in,
+        lang_out,
+        service,
+        resfont,
+        noto,
+        kwarg.get("envs", {}),
     )
 
     assert device is not None
@@ -216,7 +226,7 @@ def translate_stream(
 
     fp = io.BytesIO()
     doc_zh.save(fp)
-    obj_patch: dict = translate_patch(fp, **locals())
+    obj_patch: dict = translate_patch(fp, envs=kwarg.get("envs"), **locals())
 
     for obj_id, ops_new in obj_patch.items():
         # ops_old=doc_en.xref_stream(obj_id)
@@ -282,7 +292,7 @@ def translate(
 
         doc_raw = open(file, "rb")
         s_raw = doc_raw.read()
-        s_mono, s_dual = translate_stream(s_raw, **locals())
+        s_mono, s_dual = translate_stream(s_raw, envs=kwarg.get("envs"), **locals())
         file_mono = Path(output) / f"{filename}-mono.pdf"
         file_dual = Path(output) / f"{filename}-dual.pdf"
         doc_mono = open(file_mono, "wb")

+ 67 - 38
pdf2zh/translator.py

@@ -3,7 +3,7 @@ import logging
 import os
 import re
 import unicodedata
-
+from copy import copy
 import deepl
 import ollama
 import openai
@@ -34,6 +34,18 @@ class BaseTranslator:
         self.lang_out = lang_out
         self.model = model
 
+    def set_envs(self, envs):
+        # Detach from self.__class__.envs
+        # Cannot use self.envs = copy(self.__class__.envs)
+        # because if set_envs called twice, the second call will override the first call
+        self.envs = copy(self.envs)
+        for key in self.envs:
+            if key in os.environ:
+                self.envs[key] = os.environ[key]
+        if envs is not None:
+            for key in envs:
+                self.envs[key] = envs[key]
+
     def translate(self, text):
         pass
 
@@ -57,7 +69,7 @@ class GoogleTranslator(BaseTranslator):
     name = "google"
     lang_map = {"zh": "zh-CN"}
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, **kwargs):
         super().__init__(lang_in, lang_out, model)
         self.session = requests.Session()
         self.endpoint = "http://translate.google.com/m"
@@ -88,7 +100,7 @@ class BingTranslator(BaseTranslator):
     name = "bing"
     lang_map = {"zh": "zh-Hans"}
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, **kwargs):
         super().__init__(lang_in, lang_out, model)
         self.session = requests.Session()
         self.endpoint = "https://www.bing.com/translator"
@@ -133,9 +145,10 @@ class DeepLTranslator(BaseTranslator):
     }
     lang_map = {"zh": "zh-Hans"}
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, envs=None):
+        self.set_envs(envs)
         super().__init__(lang_in, lang_out, model)
-        auth_key = os.getenv("DEEPL_AUTH_KEY")
+        auth_key = self.envs["DEEPL_AUTH_KEY"]
         self.client = deepl.Translator(auth_key)
 
     def translate(self, text):
@@ -153,9 +166,10 @@ class DeepLXTranslator(BaseTranslator):
     }
     lang_map = {"zh": "zh-Hans"}
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, envs=None):
+        self.set_envs(envs)
         super().__init__(lang_in, lang_out, model)
-        self.endpoint = os.getenv("DEEPLX_ENDPOINT", self.envs["DEEPLX_ENDPOINT"])
+        self.endpoint = self.envs["DEEPLX_ENDPOINT"]
         self.session = requests.Session()
 
     def translate(self, text):
@@ -179,9 +193,10 @@ class OllamaTranslator(BaseTranslator):
         "OLLAMA_MODEL": "gemma2",
     }
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, envs=None):
+        self.set_envs(envs)
         if not model:
-            model = os.getenv("OLLAMA_MODEL", self.envs["OLLAMA_MODEL"])
+            model = self.envs["OLLAMA_MODEL"]
         super().__init__(lang_in, lang_out, model)
         self.options = {"temperature": 0}  # random sampling may break formula markers
         self.client = ollama.Client()
@@ -204,9 +219,12 @@ class OpenAITranslator(BaseTranslator):
         "OPENAI_MODEL": "gpt-4o-mini",
     }
 
-    def __init__(self, lang_in, lang_out, model, base_url=None, api_key=None):
+    def __init__(
+        self, lang_in, lang_out, model, base_url=None, api_key=None, envs=None
+    ):
+        self.set_envs(envs)
         if not model:
-            model = os.getenv("OPENAI_MODEL", self.envs["OPENAI_MODEL"])
+            model = self.envs["OPENAI_MODEL"]
         super().__init__(lang_in, lang_out, model)
         self.options = {"temperature": 0}  # random sampling may break formula markers
         self.client = openai.OpenAI(base_url=base_url, api_key=api_key)
@@ -228,12 +246,13 @@ class AzureOpenAITranslator(BaseTranslator):
         "AZURE_OPENAI_MODEL": "gpt-4o-mini",
     }
 
-    def __init__(self, lang_in, lang_out, model, base_url=None, api_key=None):
-        base_url = os.getenv(
-            "AZURE_OPENAI_BASE_URL", self.envs["AZURE_OPENAI_BASE_URL"]
-        )
+    def __init__(
+        self, lang_in, lang_out, model, base_url=None, api_key=None, envs=None
+    ):
+        self.set_envs(envs)
+        base_url = self.envs["AZURE_OPENAI_BASE_URL"]
         if not model:
-            model = os.getenv("AZURE_OPENAI_MODEL", self.envs["AZURE_OPENAI_MODEL"])
+            model = self.envs["AZURE_OPENAI_MODEL"]
         super().__init__(lang_in, lang_out, model)
         self.options = {"temperature": 0}
         self.client = openai.AzureOpenAI(
@@ -257,14 +276,17 @@ class ModelScopeTranslator(OpenAITranslator):
     envs = {
         "MODELSCOPE_BASE_URL": "https://api-inference.modelscope.cn/v1",
         "MODELSCOPE_API_KEY": None,
-        "MODELSCOPE_MODEL": "Qwen/Qwen2.5-Coder-32B-Instruct",
+        "MODELSCOPE_MODEL": "Qwen/Qwen2.5-32B-Instruct",
     }
 
-    def __init__(self, lang_in, lang_out, model, base_url=None, api_key=None):
+    def __init__(
+        self, lang_in, lang_out, model, base_url=None, api_key=None, envs=None
+    ):
+        self.set_envs(envs)
         base_url = "https://api-inference.modelscope.cn/v1"
-        api_key = os.getenv("MODELSCOPE_API_KEY")
+        api_key = self.envs["MODELSCOPE_API_KEY"]
         if not model:
-            model = os.getenv("MODELSCOPE_MODEL", self.envs["MODELSCOPE_MODEL"])
+            model = self.envs["MODELSCOPE_MODEL"]
         super().__init__(lang_in, lang_out, model, base_url=base_url, api_key=api_key)
 
 
@@ -276,11 +298,12 @@ class ZhipuTranslator(OpenAITranslator):
         "ZHIPU_MODEL": "glm-4-flash",
     }
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, envs=None):
+        self.set_envs(envs)
         base_url = "https://open.bigmodel.cn/api/paas/v4"
-        api_key = os.getenv("ZHIPU_API_KEY")
+        api_key = self.envs["ZHIPU_API_KEY"]
         if not model:
-            model = os.getenv("ZHIPU_MODEL", self.envs["ZHIPU_MODEL"])
+            model = self.envs["ZHIPU_MODEL"]
         super().__init__(lang_in, lang_out, model, base_url=base_url, api_key=api_key)
 
     def translate(self, text) -> str:
@@ -308,11 +331,12 @@ class SiliconTranslator(OpenAITranslator):
         "SILICON_MODEL": "Qwen/Qwen2.5-7B-Instruct",
     }
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, envs=None):
+        self.set_envs(envs)
         base_url = "https://api.siliconflow.cn/v1"
-        api_key = os.getenv("SILICON_API_KEY")
+        api_key = self.envs["SILICON_API_KEY"]
         if not model:
-            model = os.getenv("SILICON_MODEL", self.envs["SILICON_MODEL"])
+            model = self.envs["SILICON_MODEL"]
         super().__init__(lang_in, lang_out, model, base_url=base_url, api_key=api_key)
 
 
@@ -324,11 +348,12 @@ class GeminiTranslator(OpenAITranslator):
         "GEMINI_MODEL": "gemini-1.5-flash",
     }
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, envs=None):
+        self.set_envs(envs)
         base_url = "https://generativelanguage.googleapis.com/v1beta/openai/"
-        api_key = os.getenv("GEMINI_API_KEY")
+        api_key = self.envs["GEMINI_API_KEY"]
         if not model:
-            model = os.getenv("GEMINI_MODEL", self.envs["GEMINI_MODEL"])
+            model = self.envs["GEMINI_MODEL"]
         super().__init__(lang_in, lang_out, model, base_url=base_url, api_key=api_key)
 
 
@@ -341,9 +366,10 @@ class AzureTranslator(BaseTranslator):
     }
     lang_map = {"zh": "zh-Hans"}
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, envs=None):
+        self.set_envs(envs)
         super().__init__(lang_in, lang_out, model)
-        endpoint = os.getenv("AZURE_ENDPOINT", self.envs["AZURE_ENDPOINT"])
+        endpoint = self.envs["AZURE_ENDPOINT"]
         api_key = os.getenv("AZURE_API_KEY")
         credential = AzureKeyCredential(api_key)
         self.client = TextTranslationClient(
@@ -371,7 +397,8 @@ class TencentTranslator(BaseTranslator):
         "TENCENTCLOUD_SECRET_KEY": None,
     }
 
-    def __init__(self, lang_in, lang_out, model):
+    def __init__(self, lang_in, lang_out, model, envs=None):
+        self.set_envs(envs)
         super().__init__(lang_in, lang_out, model)
         cred = credential.DefaultCredentialProvider().get_credential()
         self.client = TmtClient(cred, "ap-beijing")
@@ -393,10 +420,11 @@ class AnythingLLMTranslator(BaseTranslator):
         "AnythingLLM_APIKEY": "api_key",
     }
 
-    def __init__(self, lang_out, lang_in, model):
+    def __init__(self, lang_out, lang_in, model, envs=None):
+        self.set_envs(envs)
         super().__init__(lang_out, lang_in, model)
-        self.api_url = os.getenv("AnythingLLM_URL", self.envs["AnythingLLM_URL"])
-        self.api_key = os.getenv("AnythingLLM_APIKEY", self.envs["AnythingLLM_APIKEY"])
+        self.api_url = self.envs["AnythingLLM_URL"]
+        self.api_key = self.envs["AnythingLLM_APIKEY"]
         self.headers = {
             "accept": "application/json",
             "Authorization": f"Bearer {self.api_key}",
@@ -428,10 +456,11 @@ class DifyTranslator(BaseTranslator):
         "DIFY_API_KEY": "api_key",  # replace with the actual API key
     }
 
-    def __init__(self, lang_out, lang_in, model):
+    def __init__(self, lang_out, lang_in, model, envs=None):
+        self.set_envs(envs)
         super().__init__(lang_out, lang_in, model)
-        self.api_url = os.getenv("DIFY_API_URL", self.envs["DIFY_API_URL"])
-        self.api_key = os.getenv("DIFY_API_KEY", self.envs["DIFY_API_KEY"])
+        self.api_url = self.envs["DIFY_API_URL"]
+        self.api_key = self.envs["DIFY_API_KEY"]
 
     def translate(self, text):
         headers = {
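
The precedence that `set_envs` establishes (class defaults, then `os.environ`, then the explicit `envs` argument) can be checked with a small stand-alone sketch. `DemoTranslator` and its `DEMO_*` keys are hypothetical; only the body of `set_envs` is copied from the diff.

```python
import os
from copy import copy
from typing import Dict, Optional


class DemoTranslator:
    # Class-level defaults, shared by all instances until set_envs detaches them.
    envs: Dict[str, Optional[str]] = {"DEMO_API_KEY": None, "DEMO_MODEL": "demo-small"}

    def __init__(self, envs: Optional[Dict[str, str]] = None) -> None:
        self.set_envs(envs)

    def set_envs(self, envs: Optional[Dict[str, str]]) -> None:
        # Shallow-copy so the instance no longer shares the class-level dict.
        self.envs = copy(self.envs)
        # Process environment overrides the class defaults for known keys.
        for key in self.envs:
            if key in os.environ:
                self.envs[key] = os.environ[key]
        # Explicitly passed values win over everything else.
        if envs is not None:
            for key in envs:
                self.envs[key] = envs[key]


os.environ["DEMO_MODEL"] = "demo-from-env"
os.environ.pop("DEMO_API_KEY", None)

a = DemoTranslator({"DEMO_API_KEY": "key-a"})
b = DemoTranslator()

os.environ.pop("DEMO_MODEL", None)  # clean up the demo variable
```

Here `a` carries its own key, `b` has none, and the class-level `DemoTranslator.envs` still holds the original defaults, so one caller's key does not reach another instance.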