Merge pull request #434 from hellofinch/main

feat (main) : add config file
Byaidu 1 year ago
parent
commit
3d9924fd37

+ 1 - 0
.gitignore

@@ -135,6 +135,7 @@ venv/
 ENV/
 env.bak/
 venv.bak/
+pdf2zh-dev/
 
 # Spyder project settings
 .spyderproject

+ 8 - 1
README.md

@@ -165,11 +165,16 @@ The present program needs an AI model(`wybxc/DocLayout-YOLO-DocStructBench-onnx`
 set HF_ENDPOINT=https://hf-mirror.com
 ```
 
+For PowerShell users:
+```shell
+$env:HF_ENDPOINT = "https://hf-mirror.com"
+```
+
 If the solution does not work to you / you encountered other issues, please refer to [frequently asked questions](https://github.com/Byaidu/PDFMathTranslate/wiki#-faq--%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98).
 
 <h2 id="usage">Advanced Options</h2>
 
-Execute the translation command in the command line to generate the translated document `example-mono.pdf` and the bilingual document `example-dual.pdf` in the current working directory. Use Google as the default translation service.
+Execute the translation command in the command line to generate the translated document `example-mono.pdf` and the bilingual document `example-dual.pdf` in the current working directory. Google is used as the default translation service. Other supported translation services are listed [here](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services).
 
 <img src="./docs/images/cmd.explained.png" width="580px"  alt="cmd"/>
 
@@ -194,6 +199,8 @@ In the following table, we list all advanced options for reference:
 | `--onnx` | [Use Custom DocLayout-YOLO ONNX model] | `pdf2zh --onnx [onnx/model/path]` |
 | `--serverport` | [Use Custom WebUI port] | `pdf2zh --serverport 7860` |
 | `--dir` | [batch translate] | `pdf2zh --dir /path/to/translate/` |
+| `--config` | [configuration file](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cofig) | `pdf2zh --config /path/to/config/config.json` |
+| `--serverport` | [custom gradio server port] | `pdf2zh --serverport 7860` |
 
 For detailed explanations, please refer to our document about [Advanced Usage](./docs/ADVANCED.md) for a full list of each option.
 

+ 49 - 0
docs/ADVANCED.md

@@ -86,6 +86,12 @@ set OPENAI_MODEL=gpt-4o-mini
 pdf2zh example.pdf -s openai
 ```
 
+For PowerShell users:
+```shell
+$env:OPENAI_MODEL = "gpt-4o-mini"
+pdf2zh example.pdf -s openai
+```
+
 [⬆️ Back to top](#toc)
 
 ---
@@ -191,3 +197,46 @@ example auth.html
 [⬆️ Back to top](#toc)
 
 ---
+
+<h3 id="cofig">Custom configuration file</h3>
+
+Use `--config` to specify the configuration file for PDFMathTranslate:
+
+```bash
+pdf2zh example.pdf --config config.json
+```
+
+```bash
+pdf2zh -i --config config.json
+```
+
+Example `config.json`:
+```json
+{
+    "USE_MODELSCOPE": "0",
+    "PDF2ZH_LANG_FROM": "English",
+    "PDF2ZH_LANG_TO": "Simplified Chinese",
+    "NOTO_FONT_PATH": "/app/SourceHanSerifCN-Regular.ttf",
+    "translators": [
+        {
+            "name": "deeplx",
+            "envs": {
+                "DEEPLX_ENDPOINT": "http://localhost:1188/translate/",
+                "DEEPLX_ACCESS_TOKEN": null
+            }
+        },
+        {
+            "name": "ollama",
+            "envs": {
+                "OLLAMA_HOST": "http://127.0.0.1:11434",
+                "OLLAMA_MODEL": "gemma2"
+            }
+        }
+    ]
+}
+```
+By default, the configuration is stored in `~/.config/PDFMathTranslate/config.json`. The program reads `config.json` first and then the environment variables. For translator settings (the `envs` entries), an environment variable that is set takes precedence and its value is written back to the file; for other keys, a value already present in the file is used and the environment variable is consulted only when the key is missing.
+
+[⬆️ Back to top](#toc)
+
+---
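
Review note: a minimal sketch of how a file with the layout documented above can be read, assuming only the structure shown in the example `config.json`. `load_translator_envs` is a hypothetical helper written for this note, not a function of pdf2zh.

```python
import json

# Same layout as the example config.json in this section (trimmed).
EXAMPLE_CONFIG = """
{
    "PDF2ZH_LANG_FROM": "English",
    "translators": [
        {
            "name": "deeplx",
            "envs": {
                "DEEPLX_ENDPOINT": "http://localhost:1188/translate/",
                "DEEPLX_ACCESS_TOKEN": null
            }
        },
        {
            "name": "ollama",
            "envs": {
                "OLLAMA_HOST": "http://127.0.0.1:11434",
                "OLLAMA_MODEL": "gemma2"
            }
        }
    ]
}
"""


def load_translator_envs(config_text, name):
    """Return the `envs` mapping of the translator called `name`, or None."""
    config = json.loads(config_text)
    for translator in config.get("translators", []):
        if translator.get("name") == name:
            return translator.get("envs", {})
    return None


ollama_envs = load_translator_envs(EXAMPLE_CONFIG, "ollama")
print(ollama_envs["OLLAMA_MODEL"])
```

A JSON `null`, as used for `DEEPLX_ACCESS_TOKEN`, arrives in Python as `None`, so callers should test such values for truthiness before using them.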

+ 15 - 1
docs/README_ja-JP.md

@@ -80,6 +80,11 @@ pdf2zhの実行には追加モデル(`wybxc/DocLayout-YOLO-DocStructBench-onnx
 set HF_ENDPOINT=https://hf-mirror.com
 ```
 
+For PowerShell users:
+```shell
+$env:HF_ENDPOINT = "https://hf-mirror.com"
+```
+
 <h3 id="cmd">方法1. コマンドライン</h3>
 
   1. Pythonがインストールされていること (バージョン3.8 <= バージョン <= 3.12)
@@ -156,7 +161,8 @@ Python環境を事前にインストールする必要はありません
 
 <h2 id="usage">高度なオプション</h2>
 
-コマンドラインで翻訳コマンドを実行し、現在の作業ディレクトリに翻訳されたドキュメント `example-mono.pdf` とバイリンガルドキュメント `example-dual.pdf` を生成します。デフォルトではGoogle翻訳サービスを使用します。
+コマンドラインで翻訳コマンドを実行し、現在の作業ディレクトリに翻訳されたドキュメント `example-mono.pdf` とバイリンガルドキュメント `example-dual.pdf` を生成します。デフォルトではGoogle翻訳サービスを使用します。Other supported translation services are listed [here](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services).
+
 
 <img src="./images/cmd.explained.png" width="580px"  alt="cmd"/>  
 
@@ -180,6 +186,8 @@ Python環境を事前にインストールする必要はありません
 | `--onnx` | [カスタムDocLayout-YOLO ONNXモデルの使用] | `pdf2zh --onnx [onnx/model/path]` |
 | `--serverport` | [カスタムWebUIポートを使用する] | `pdf2zh --serverport 7860` |
 | `--dir` | [batch translate] | `pdf2zh --dir /path/to/translate/` |
+| `--config` | [configuration file](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cofig) | `pdf2zh --config /path/to/config/config.json` |
+| `--serverport` | [custom gradio server port] | `pdf2zh --serverport 7860` |
 
 <h3 id="partial">全文または部分的なドキュメント翻訳</h3>
 
@@ -245,6 +253,12 @@ set OPENAI_MODEL=gpt-4o-mini
 pdf2zh example.pdf -s openai
 ```
 
+For PowerShell users:
+```shell
+$env:OPENAI_MODEL = "gpt-4o-mini"
+pdf2zh example.pdf -s openai
+```
+
 <h3 id="exceptions">例外を指定して翻訳</h3>
 
 正規表現を使用して保持する必要がある数式フォントと文字を指定します:

+ 15 - 1
docs/README_zh-CN.md

@@ -82,6 +82,11 @@ pdf2zh的运行依赖于额外模型(`wybxc/DocLayout-YOLO-DocStructBench-onnx`)
 set HF_ENDPOINT=https://hf-mirror.com
 ```
 
+如使用 PowerShell,请使用如下方法设置环境变量:
+```shell
+$env:HF_ENDPOINT = "https://hf-mirror.com"
+```
+
 <h3 id="cmd">方法一、命令行工具</h3>
 
   1. 确保安装了版本大于 3.8 且小于 3.12 的 Python
@@ -158,7 +163,7 @@ set HF_ENDPOINT=https://hf-mirror.com
 
 <h2 id="usage">高级选项</h2>
 
-在命令行中执行翻译命令,在当前工作目录下生成译文文档 `example-mono.pdf` 和双语对照文档 `example-dual.pdf`,默认使用 Google 翻译服务
+在命令行中执行翻译命令,在当前工作目录下生成译文文档 `example-mono.pdf` 和双语对照文档 `example-dual.pdf`,默认使用 Google 翻译服务,更多支持的服务在[这里](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services)。
 
 <img src="./images/cmd.explained.png" width="580px"  alt="cmd"/>  
 
@@ -182,6 +187,9 @@ set HF_ENDPOINT=https://hf-mirror.com
 | `--onnx` | [使用自定义的 DocLayout-YOLO ONNX 模型] | `pdf2zh --onnx [onnx/model/path]` |
 | `--serverport` | [使用自定义的 WebUI 端口] | `pdf2zh --serverport 7860` |
 | `--dir` | [文件夹翻译] | `pdf2zh --dir /path/to/translate/` |
+| `--serverport` | [自定义端口号] | `pdf2zh --serverport 7860` |
+| `--config` | [持久化定义配置文件](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cofig) | `pdf2zh --config /path/to/config/config.json` |
+
 
 <h3 id="partial">全文或部分文档翻译</h3>
 
@@ -246,6 +254,12 @@ set OPENAI_MODEL=gpt-4o-mini
 pdf2zh example.pdf -s openai
 ```
 
+对于 PowerShell 用户,请使用如下方式设置环境变量指定模型:
+```shell
+$env:OPENAI_MODEL = "gpt-4o-mini"
+pdf2zh example.pdf -s openai
+```
+
 <h3 id="exceptions">指定例外规则</h3>
 
 使用正则表达式指定需保留的公式字体与字符:

+ 3 - 3
pdf2zh/backend.py

@@ -1,4 +1,3 @@
-import os
 from flask import Flask, request, send_file
 from celery import Celery, Task
 from celery.result import AsyncResult
@@ -7,12 +6,13 @@ import tqdm
 import json
 import io
 from pdf2zh.doclayout import ModelInstance
+from pdf2zh.config import ConfigManager
 
 flask_app = Flask("pdf2zh")
 flask_app.config.from_mapping(
     CELERY=dict(
-        broker_url=os.environ.get("CELERY_BROKER", "redis://127.0.0.1:6379/0"),
-        result_backend=os.environ.get("CELERY_RESULT", "redis://127.0.0.1:6379/0"),
+        broker_url=ConfigManager.get("CELERY_BROKER", "redis://127.0.0.1:6379/0"),
+        result_backend=ConfigManager.get("CELERY_RESULT", "redis://127.0.0.1:6379/0"),
     )
 )
 

+ 214 - 0
pdf2zh/config.py

@@ -0,0 +1,214 @@
+import json
+from pathlib import Path
+from threading import RLock  # reentrant lock
+import os
+import copy
+
+
+class ConfigManager:
+    _instance = None
+    _lock = RLock()  # RLock instead of Lock, so one thread can re-acquire it
+
+    @classmethod
+    def get_instance(cls):
+        """Return the singleton instance."""
+        # Check for an instance first; take the lock only when one must be created
+        if cls._instance is None:
+            with cls._lock:
+                if cls._instance is None:
+                    cls._instance = cls()
+        return cls._instance
+
+    def __init__(self):
+        # Guard against repeated initialization
+        if hasattr(self, "_initialized") and self._initialized:
+            return
+        self._initialized = True
+
+        self._config_path = Path.home() / ".config" / "PDFMathTranslate" / "config.json"
+        self._config_data = {}
+
+        # No extra lock here: the caller (get_instance) may already hold it; RLock would tolerate it anyway
+        self._ensure_config_exists()
+
+    def _ensure_config_exists(self, isInit=True):
+        """Ensure the config file exists; create a default one if it does not."""
+        # No explicit lock here either, for the same reason: this method calls
+        # _load_config(), which takes the lock itself. RLock is reentrant, so it does not block.
+        if not self._config_path.exists():
+            if isInit:
+                self._config_path.parent.mkdir(parents=True, exist_ok=True)
+                self._config_data = {}  # default configuration
+                self._save_config()
+            else:
+                raise ValueError(f"config file {self._config_path} not found!")
+        else:
+            self._load_config()
+
+    def _load_config(self):
+        """Load the configuration from config.json."""
+        with self._lock:  # lock for thread safety
+            with self._config_path.open("r", encoding="utf-8") as f:
+                self._config_data = json.load(f)
+
+    def _save_config(self):
+        """Save the configuration to config.json."""
+        with self._lock:  # lock for thread safety
+            # Strip circular references, then write
+            cleaned_data = self._remove_circular_references(self._config_data)
+            with self._config_path.open("w", encoding="utf-8") as f:
+                json.dump(cleaned_data, f, indent=4, ensure_ascii=False)
+
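+    def _remove_circular_references(self, obj, seen=None):
+        """Recursively replace circular references with None."""
+        # Only containers can form a cycle; leaf values are returned unchanged.
+        if not isinstance(obj, (dict, list)):
+            return obj
+        # `seen` holds the ids of the containers on the current path only, so a
+        # value shared by sibling keys is kept and only a true cycle becomes None.
+        seen = set() if seen is None else seen
+        if id(obj) in seen:
+            return None
+        path = seen | {id(obj)}
+        if isinstance(obj, dict):
+            return {
+                k: self._remove_circular_references(v, path) for k, v in obj.items()
+            }
+        return [self._remove_circular_references(i, path) for i in obj]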
+
+    @classmethod
+    def custome_config(cls, file_path):
+        """Load the configuration from a custom path."""
+        custom_path = Path(file_path)
+        if not custom_path.exists():
+            raise ValueError(f"Config file {custom_path} not found!")
+        # Take the lock
+        with cls._lock:
+            instance = cls()
+            instance._config_path = custom_path
+            # isInit=False: raise if the file is missing, otherwise load it via _load_config()
+            instance._ensure_config_exists(isInit=False)
+            cls._instance = instance
+
+    @classmethod
+    def get(cls, key, default=None):
+        """Return a configuration value."""
+        instance = cls.get_instance()
+        # Reads work with or without the lock; for consistency, modifications are made under the lock.
+        # Whenever get ends up saving, _save_config() takes the lock.
+        if key in instance._config_data:
+            return instance._config_data[key]
+
+        # If the key exists in the environment, use that value and write it back to the config
+        if key in os.environ:
+            value = os.environ[key]
+            instance._config_data[key] = value
+            instance._save_config()
+            return value
+
+        # If default is not None, store it and save
+        if default is not None:
+            instance._config_data[key] = default
+            instance._save_config()
+            return default
+
+        # Not found: raising is currently disabled, so fall through and return default
+        # raise KeyError(f"{key} is not found in config file or environment variables.")
+        return default
+
+    @classmethod
+    def set(cls, key, value):
+        """Set a configuration value and save."""
+        instance = cls.get_instance()
+        with instance._lock:
+            instance._config_data[key] = value
+            instance._save_config()
+
+    @classmethod
+    def get_translator_by_name(cls, name):
+        """Return the `envs` of the translator matching `name`, or None."""
+        instance = cls.get_instance()
+        translators = instance._config_data.get("translators", [])
+        for translator in translators:
+            if translator.get("name") == name:
+                return translator["envs"]
+        return None
+
+    @classmethod
+    def set_translator_by_name(cls, name, new_translator_envs):
+        """Set or update the config of the translator matching `name`."""
+        instance = cls.get_instance()
+        with instance._lock:
+            translators = instance._config_data.get("translators", [])
+            for translator in translators:
+                if translator.get("name") == name:
+                    translator["envs"] = copy.deepcopy(new_translator_envs)
+                    instance._save_config()
+                    return
+            translators.append(
+                {"name": name, "envs": copy.deepcopy(new_translator_envs)}
+            )
+            instance._config_data["translators"] = translators
+            instance._save_config()
+
+    @classmethod
+    def get_env_by_translatername(cls, translater_name, name, default=None):
+        """Return env `name` of the given translator, storing `default` when unset."""
+        instance = cls.get_instance()
+        translators = instance._config_data.get("translators", [])
+        for translator in translators:
+            if translator.get("name") == translater_name.name:
+                if translator["envs"].get(name):
+                    return translator["envs"][name]
+                else:
+                    with instance._lock:
+                        translator["envs"][name] = default
+                        instance._save_config()
+                        return default
+
+        with instance._lock:
+            translators = instance._config_data.get("translators", [])
+            for translator in translators:
+                if translator.get("name") == translater_name.name:
+                    translator["envs"][name] = default
+                    instance._save_config()
+                    return default
+            translators.append(
+                {
+                    "name": translater_name.name,
+                    "envs": copy.deepcopy(translater_name.envs),
+                }
+            )
+            instance._config_data["translators"] = translators
+            instance._save_config()
+            return default
+
+    @classmethod
+    def delete(cls, key):
+        """Delete a configuration value and save."""
+        instance = cls.get_instance()
+        with instance._lock:
+            if key in instance._config_data:
+                del instance._config_data[key]
+                instance._save_config()
+
+    @classmethod
+    def clear(cls):
+        """Clear all configuration values and save."""
+        instance = cls.get_instance()
+        with instance._lock:
+            instance._config_data = {}
+            instance._save_config()
+
+    @classmethod
+    def all(cls):
+        """Return all configuration items."""
+        instance = cls.get_instance()
+        # Read-only, so locking is usually unnecessary; it could be added for safety.
+        return instance._config_data
+
+    @classmethod
+    def remove(cls):
+        instance = cls.get_instance()
+        with instance._lock:
+            os.remove(instance._config_path)
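
Review note: the lookup order implemented by `ConfigManager.get` above (config file, then environment, then default, with write-back) can be sketched in isolation as follows. `lookup` and its `environ` parameter are hypothetical names used only for this note; the sketch works on a temporary file rather than the real `~/.config` location.

```python
import json
import os
import tempfile
from pathlib import Path


def lookup(config_path, key, default=None, environ=os.environ):
    """Resolve `key` in the order: config file, environment, default.

    A value taken from the environment or from `default` is written back
    to the file, mirroring the write-back behaviour of `get` in this diff.
    """
    data = json.loads(config_path.read_text(encoding="utf-8"))
    if key in data:
        return data[key]
    value = environ.get(key, default)
    if value is not None:
        data[key] = value
        config_path.write_text(json.dumps(data, indent=4), encoding="utf-8")
    return value


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text("{}", encoding="utf-8")
    first = lookup(path, "PDF2ZH_LANG_TO", "Simplified Chinese", environ={})
    # The default was persisted by the first call, so a later environment
    # value for the same key is no longer consulted.
    second = lookup(
        path, "PDF2ZH_LANG_TO", "English", environ={"PDF2ZH_LANG_TO": "Japanese"}
    )
    print(first, "|", second)
```

One consequence worth noting: once a default has been persisted, changing the environment variable alone does not change the result for that key; the file entry has to be edited or removed.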

+ 3 - 2
pdf2zh/doclayout.py

@@ -8,6 +8,8 @@ import onnx
 import onnxruntime
 from huggingface_hub import hf_hub_download
 
+from pdf2zh.config import ConfigManager
+
 
 class DocLayoutModel(abc.ABC):
     @staticmethod
@@ -60,7 +62,6 @@ class YoloBox:
 
 
 class OnnxModel(DocLayoutModel):
-
     def __init__(self, model_path: str):
         self.model_path = model_path
 
@@ -73,7 +74,7 @@ class OnnxModel(DocLayoutModel):
 
     @staticmethod
     def from_pretrained(repo_id: str, filename: str):
-        if os.environ.get("USE_MODELSCOPE", "0") == "1":
+        if ConfigManager.get("USE_MODELSCOPE", "0") == "1":
             repo_mapping = {
                 # Edit here to add more models
                 "wybxc/DocLayout-YOLO-DocStructBench-onnx": "AI-ModelScope/DocLayout-YOLO-DocStructBench-onnx"

+ 13 - 7
pdf2zh/gui.py

@@ -10,10 +10,12 @@ import gradio as gr
 import requests
 import tqdm
 from gradio_pdf import PDF
+from string import Template
 
 from pdf2zh import __version__
 from pdf2zh.high_level import translate
 from pdf2zh.doclayout import ModelInstance
+from pdf2zh.config import ConfigManager
 from pdf2zh.translator import (
     AnythingLLMTranslator,
     AzureOpenAITranslator,
@@ -90,7 +92,7 @@ page_map = {
 flag_demo = False
 
 # Limit resources
-if os.getenv("PDF2ZH_DEMO"):
+if ConfigManager.get("PDF2ZH_DEMO"):
     flag_demo = True
     service_map = {
         "Google": GoogleTranslator,
@@ -99,8 +101,8 @@ if os.getenv("PDF2ZH_DEMO"):
         "First": [0],
         "First 20 pages": list(range(0, 20)),
     }
-    client_key = os.getenv("PDF2ZH_CLIENT_KEY")
-    server_key = os.getenv("PDF2ZH_SERVER_KEY")
+    client_key = ConfigManager.get("PDF2ZH_CLIENT_KEY")
+    server_key = ConfigManager.get("PDF2ZH_SERVER_KEY")
 
 
 # Public demo control
@@ -275,7 +277,7 @@ def translate_file(
         "callback": progress_bar,
         "cancellation_event": cancellation_event_map[session_id],
         "envs": _envs,
-        "prompt": prompt,
+        "prompt": Template(prompt),
         "model": ModelInstance.value,
     }
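
Review note: the change above wraps the GUI prompt in `string.Template`. A minimal sketch of what that class provides is shown below; the placeholder names `lang_in`, `lang_out` and `text` are assumptions made for illustration and should be checked against the prompt format the project documents.

```python
from string import Template

# Placeholder names are assumed for illustration only.
prompt = Template("Translate from ${lang_in} to ${lang_out}: ${text}")

filled = prompt.safe_substitute(lang_in="en", lang_out="zh", text="Hello")
print(filled)

# safe_substitute leaves unknown placeholders in place instead of raising.
partial = Template("${text} ${missing}").safe_substitute(text="Hello")
print(partial)

# The raw string stays available through the .template attribute.
print(prompt.template)
```

Passing a `Template` rather than a plain string matters for callers that read `prompt.template`, as the translator classes in this diff do when building cache parameters.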
     try:
@@ -411,12 +413,12 @@ with gr.Blocks(
                 lang_from = gr.Dropdown(
                     label="Translate from",
                     choices=lang_map.keys(),
-                    value=os.getenv("PDF2ZH_LANG_FROM", "English"),
+                    value=ConfigManager.get("PDF2ZH_LANG_FROM", "English"),
                 )
                 lang_to = gr.Dropdown(
                     label="Translate to",
                     choices=lang_map.keys(),
-                    value=os.getenv("PDF2ZH_LANG_TO", "Simplified Chinese"),
+                    value=ConfigManager.get("PDF2ZH_LANG_TO", "Simplified Chinese"),
                 )
             page_range = gr.Radio(
                 choices=page_map.keys(),
@@ -447,7 +449,11 @@ with gr.Blocks(
                     _envs.append(gr.update(visible=False, value=""))
                 for i, env in enumerate(translator.envs.items()):
                     _envs[i] = gr.update(
-                        visible=True, label=env[0], value=os.getenv(env[0], env[1])
+                        visible=True,
+                        label=env[0],
+                        value=ConfigManager.get_env_by_translatername(
+                            translator, env[0], env[1]
+                        ),
                     )
                 _envs[-1] = gr.update(visible=translator.CustomPrompt)
                 return _envs

+ 3 - 1
pdf2zh/high_level.py

@@ -24,6 +24,8 @@ from pdf2zh.converter import TranslateConverter
 from pdf2zh.doclayout import OnnxModel
 from pdf2zh.pdfinterp import PDFPageInterpreterEx
 
+from pdf2zh.config import ConfigManager
+
 NOTO_NAME = "noto"
 
 noto_list = [
@@ -383,7 +385,7 @@ def download_remote_fonts(lang: str):
     font_name = LANG_NAME_MAP.get(lang, "GoNotoKurrent-Regular.ttf")
 
     # docker
-    font_path = os.environ.get("NOTO_FONT_PATH", Path("/app", font_name).as_posix())
+    font_path = ConfigManager.get("NOTO_FONT_PATH", Path("/app", font_name).as_posix())
     if not Path(font_path).exists():
         font_path = Path(tempfile.gettempdir(), font_name).as_posix()
     if not Path(font_path).exists():

+ 13 - 2
pdf2zh/pdf2zh.py

@@ -16,6 +16,8 @@ from pdf2zh.high_level import translate
 from pdf2zh.doclayout import OnnxModel, ModelInstance
 import os
 
+from pdf2zh.config import ConfigManager
+
 
 def create_parser() -> argparse.ArgumentParser:
     parser = argparse.ArgumentParser(description=__doc__, add_help=True)
@@ -156,6 +158,12 @@ def create_parser() -> argparse.ArgumentParser:
         help="translate directory.",
     )
 
+    parse_params.add_argument(
+        "--config",
+        type=str,
+        help="config file.",
+    )
+
     return parser
 
 
@@ -204,6 +212,9 @@ def main(args: Optional[List[str]] = None) -> int:
 
     parsed_args = parse_args(args)
 
+    if parsed_args.config:
+        ConfigManager.custome_config(parsed_args.config)
+
     if parsed_args.debug:
         log.setLevel(logging.DEBUG)
 
@@ -243,13 +254,13 @@ def main(args: Optional[List[str]] = None) -> int:
         except Exception:
             raise ValueError("prompt error.")
 
+    print(parsed_args)
     if parsed_args.dir:
         untranlate_file = find_all_files_in_directory(parsed_args.files[0])
         parsed_args.files = untranlate_file
-        print(parsed_args)
         translate(model=ModelInstance.value, **vars(parsed_args))
         return 0
-    # print(parsed_args)
+
     translate(model=ModelInstance.value, **vars(parsed_args))
     return 0
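
Review note: a standalone sketch of the `--config` flow added above, assuming a much smaller parser than the real one. `resolve_config` is a hypothetical helper; in the diff the path is handed to `ConfigManager.custome_config`, which raises the same `ValueError` for a missing file.

```python
import argparse
from pathlib import Path


def create_parser():
    """Build a reduced parser with only the arguments this sketch needs."""
    parser = argparse.ArgumentParser(description="--config sketch")
    parser.add_argument("files", nargs="*", help="PDF files to translate.")
    parser.add_argument("--config", type=str, help="config file.")
    return parser


def resolve_config(argv):
    """Return the config path as a Path, or None when --config is absent."""
    args = create_parser().parse_args(argv)
    if not args.config:
        return None
    path = Path(args.config)
    if not path.exists():
        # Same failure mode as custome_config in this diff.
        raise ValueError(f"Config file {path} not found!")
    return path


print(resolve_config(["example.pdf"]))
```

Validating the path right after parsing keeps the error close to the command line that caused it, before any model loading starts.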
 

+ 1 - 1
pdf2zh/pdfinterp.py

@@ -361,4 +361,4 @@ class PDFPageInterpreterEx(PDFPageInterpreter):
             else:
                 self.push(obj)
         # print('REV DATA',ops)
-        return ops
+        return ops

+ 15 - 1
pdf2zh/translator.py

@@ -20,6 +20,7 @@ import argostranslate.package
 import argostranslate.translate
 
 import json
+from pdf2zh.config import ConfigManager
 
 
 def remove_control_characters(s):
@@ -54,12 +55,19 @@ class BaseTranslator:
         # Cannot use self.envs = copy(self.__class__.envs)
         # because if set_envs called twice, the second call will override the first call
         self.envs = copy(self.envs)
+        if ConfigManager.get_translator_by_name(self.name):
+            self.envs = ConfigManager.get_translator_by_name(self.name)
+        needUpdate = False
         for key in self.envs:
             if key in os.environ:
                 self.envs[key] = os.environ[key]
+                needUpdate = True
+        if needUpdate:
+            ConfigManager.set_translator_by_name(self.name, self.envs)
         if envs is not None:
             for key in envs:
                 self.envs[key] = envs[key]
+            ConfigManager.set_translator_by_name(self.name, self.envs)
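
Review note: after this change `set_envs` layers four sources. The sketch below restates that order with plain dictionaries; `resolve_envs` is a hypothetical function, and the model names `llama3` and `qwen2` are illustrative values only.

```python
from copy import copy


def resolve_envs(class_defaults, file_envs, environ, explicit=None):
    """Merge translator settings; later layers override earlier ones."""
    envs = copy(class_defaults)
    if file_envs:
        # As in the diff, a config-file entry replaces the defaults wholesale.
        envs = copy(file_envs)
    for key in envs:
        if key in environ:
            envs[key] = environ[key]
    if explicit is not None:
        envs.update(explicit)
    return envs


merged = resolve_envs(
    class_defaults={"OLLAMA_HOST": "http://127.0.0.1:11434", "OLLAMA_MODEL": "gemma2"},
    file_envs={"OLLAMA_HOST": "http://127.0.0.1:11434", "OLLAMA_MODEL": "llama3"},
    environ={"OLLAMA_MODEL": "qwen2"},
    explicit={"OLLAMA_HOST": "http://10.0.0.5:11434"},
)
print(merged)
```

The sketch copies the file entry before mutating it, which is a choice made here; the diff assigns the mapping returned by `get_translator_by_name` directly, so later edits to `self.envs` also modify the stored configuration object.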
 
     def add_cache_impact_parameters(self, k: str, v):
         """
@@ -214,6 +222,7 @@ class DeepLXTranslator(BaseTranslator):
     name = "deeplx"
     envs = {
         "DEEPLX_ENDPOINT": "https://api.deepl.com/translate",
+        "DEEPLX_ACCESS_TOKEN": None,
     }
     lang_map = {"zh": "zh-Hans"}
 
@@ -222,6 +231,9 @@ class DeepLXTranslator(BaseTranslator):
         super().__init__(lang_in, lang_out, model)
         self.endpoint = self.envs["DEEPLX_ENDPOINT"]
         self.session = requests.Session()
+        auth_key = self.envs["DEEPLX_ACCESS_TOKEN"]
+        if auth_key:
+            self.endpoint = f"{self.endpoint}?token={auth_key}"
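
Review note: the token is appended with an f-string, which assumes the endpoint has no query string and the token needs no escaping. A possible alternative using the standard library is sketched below; `with_token` is a hypothetical helper, not part of this PR.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_token(endpoint, token):
    """Return `endpoint` with `token` added as a query parameter."""
    if not token:
        return endpoint
    parts = urlsplit(endpoint)
    # Keep any query parameters the endpoint already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_token("http://localhost:1188/translate", "a&b"))
print(with_token("http://localhost:1188/translate?v=2", "abc"))
print(with_token("http://localhost:1188/translate", None))
```

Whether the extra handling is worth it depends on the endpoints users actually configure; for a plain URL and an alphanumeric token both approaches give the same result.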
 
     def do_translate(self, text):
         response = self.session.post(
@@ -540,7 +552,7 @@ class AzureTranslator(BaseTranslator):
         self.set_envs(envs)
         super().__init__(lang_in, lang_out, model)
         endpoint = self.envs["AZURE_ENDPOINT"]
-        api_key = os.getenv("AZURE_API_KEY")
+        api_key = self.envs["AZURE_API_KEY"]
         credential = AzureKeyCredential(api_key)
         self.client = TextTranslationClient(
             endpoint=endpoint, credential=credential, region="chinaeast2"
@@ -723,6 +735,7 @@ class GorkTranslator(OpenAITranslator):
         if prompt:
             self.add_cache_impact_parameters("prompt", prompt.template)
 
+
 class GroqTranslator(OpenAITranslator):
     name = "groq"
     envs = {
@@ -742,6 +755,7 @@ class GroqTranslator(OpenAITranslator):
         if prompt:
             self.add_cache_impact_parameters("prompt", prompt.template)
 
+
 class DeepseekTranslator(OpenAITranslator):
     name = "deepseek"
     envs = {

+ 5 - 0
test/test_translator.py

@@ -2,6 +2,7 @@ import unittest
 from pdf2zh.translator import BaseTranslator
 from pdf2zh.translator import OpenAIlikedTranslator
 from pdf2zh import cache
+from pdf2zh.config import ConfigManager
 
 
 class AutoIncreaseTranslator(BaseTranslator):
@@ -84,6 +85,7 @@ class TestOpenAIlikedTranslator(unittest.TestCase):
 
     def test_missing_base_url_raises_error(self):
         """Test that a missing OPENAILIKED_BASE_URL raises an error."""
+        ConfigManager.clear()
         with self.assertRaises(ValueError) as context:
             OpenAIlikedTranslator(
                 lang_in="en", lang_out="zh", model="test_model", envs={}
@@ -96,6 +98,7 @@ class TestOpenAIlikedTranslator(unittest.TestCase):
             "OPENAILIKED_BASE_URL": "https://api.openailiked.com",
             "OPENAILIKED_API_KEY": "test_api_key",
         }
+        ConfigManager.clear()
         with self.assertRaises(ValueError) as context:
             OpenAIlikedTranslator(
                 lang_in="en", lang_out="zh", model=None, envs=envs_without_model
@@ -104,6 +107,7 @@ class TestOpenAIlikedTranslator(unittest.TestCase):
 
     def test_initialization_with_valid_envs(self):
         """Test initialization with valid environment variables."""
+        ConfigManager.clear()
         translator = OpenAIlikedTranslator(
             lang_in="en",
             lang_out="zh",
@@ -126,6 +130,7 @@ class TestOpenAIlikedTranslator(unittest.TestCase):
             "OPENAILIKED_BASE_URL": "https://api.openailiked.com",
             "OPENAILIKED_MODEL": "test_model",
         }
+        ConfigManager.clear()
         translator = OpenAIlikedTranslator(
             lang_in="en",
             lang_out="zh",