Merge pull request #60 from reycn/main

Byaidu 1 year ago
parent
commit
40919a9dee
8 changed files with 330 additions and 49 deletions
  1. .gitignore (+6 -1)
  2. README.md (+64 -42)
  3. gui/README.MD (+26 -0)
  4. gui/img/after.png (binary)
  5. gui/img/before.png (binary)
  6. gui/main.py (+167 -0)
  7. pdf2zh/converter.py (+5 -0)
  8. pdf2zh/translator.py (+62 -6)

+ 6 - 1
.gitignore

@@ -1,3 +1,7 @@
+gradio_files
+tmp
+gui/gradio_files
+gui/tmp
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
@@ -160,4 +164,5 @@ cython_debug/
 #  and can be added to the global gitignore or merged into this file.  For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
-.vscode
+.vscode
+.DS_Store

+ 64 - 42
README.md

@@ -43,19 +43,20 @@ Execute the translation command in the command line to generate the translated d
 
 Please refer to [ChatGPT](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) for how to set environment variables.
 
-### Translate the entire document
+### Full / partial document translation
+ - Entire document
 
-```bash
-pdf2zh example.pdf
-```
+  ```bash
+  pdf2zh example.pdf
+  ```
 
-### Translate part of the document
+ - Part of the document
 
-```bash
-pdf2zh example.pdf -p 1-3,5
-```
+  ```bash
+  pdf2zh example.pdf -p 1-3,5
+  ```
 
-### Translate with the specified language
+### Specify source and target languages
 
 See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v1/languages), [DeepL Languages Codes](https://developers.deepl.com/docs/resources/supported-languages)
 
@@ -63,61 +64,82 @@ See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v
 pdf2zh example.pdf -li en -lo ja
 ```
 
-### Translate with DeepL/DeepLX
+### Translate with different services
 
-See [DeepLX](https://github.com/OwO-Network/DeepLX)
+- **DeepL**
 
-Set ENVs to construct an endpoint like: `{DEEPL_SERVER_URL}/translate`
-- `DEEPL_SERVER_URL` (Optional), e.g., `export DEEPL_SERVER_URL=https://api.deepl.com`
-- `DEEPL_AUTH_KEY`, e.g., `export DEEPL_AUTH_KEY=xxx`
+  See [DeepL](https://support.deepl.com/hc/en-us/articles/360020695820-API-Key-for-DeepL-s-API)
 
-```bash
-pdf2zh example.pdf -s deepl
-```
+  Set ENVs to construct an endpoint like: `{DEEPL_SERVER_URL}/translate`
+  - `DEEPL_SERVER_URL` (Optional), e.g., `export DEEPL_SERVER_URL=https://api.deepl.com`
+  - `DEEPL_AUTH_KEY`, e.g., `export DEEPL_AUTH_KEY=xxx`
 
-### Translate with Ollama
+  ```bash
+  pdf2zh example.pdf -s deepl
+  ```
 
-See [Ollama](https://github.com/ollama/ollama)
 
-Set ENVs to construct an endpoint like: `{OLLAMA_HOST}/api/chat`
-- `OLLAMA_HOST` (Optional), e.g., `export OLLAMA_HOST=https://localhost:11434`
+- **DeepLX**
 
-```bash
-pdf2zh example.pdf -s ollama:gemma2
-```
+  See [DeepLX](https://github.com/OwO-Network/DeepLX)
 
-### Translate with OpenAI/SiliconCloud/Zhipu
+  Set ENVs to construct an endpoint like: `{DEEPLX_SERVER_URL}/{DEEPLX_AUTH_KEY}/translate`
+  - `DEEPLX_SERVER_URL` (Optional), e.g., `export DEEPLX_SERVER_URL=https://api.deeplx.org`
+  - `DEEPLX_AUTH_KEY`, e.g., `export DEEPLX_AUTH_KEY=xxx`
 
-See [SiliconCloud](https://docs.siliconflow.cn/quickstart), [Zhipu](https://open.bigmodel.cn/dev/api/thirdparty-frame/openai-sdk)
+  ```bash
+  pdf2zh example.pdf -s deeplx
+  ```
 
-Set ENVs to construct an endpoint like: `{OPENAI_BASE_URL}/chat/completions`
-- `OPENAI_BASE_URL` (Optional), e.g., `export OPENAI_BASE_URL=https://api.openai.com/v1`
-- `OPENAI_API_KEY`, e.g., `export OPENAI_API_KEY=xxx`
+- **Ollama**
 
-```bash
-pdf2zh example.pdf -s openai:gpt-4o
-```
+  See [Ollama](https://github.com/ollama/ollama)
 
-### Translate with Azure Text Translation
+  Set ENVs to construct an endpoint like: `{OLLAMA_HOST}/api/chat`
+  - `OLLAMA_HOST` (Optional), e.g., `export OLLAMA_HOST=http://localhost:11434`
 
-See [What is Azure Text Translation?](https://docs.azure.cn/en-us/ai-services/translator/text-translation-overview)
+  ```bash
+  pdf2zh example.pdf -s ollama:gemma2
+  ```
 
-Following ENVs are required.
-- `AZURE_APIKEY`, e.g., `export AZURE_APIKEY=xxx`
-- `AZURE_ENDPOINT`, e.g, `export AZURE_ENDPOINT=https://api.translator.azure.cn/`
-- `AZURE_REGION`, e.g., `export AZURE_REGION=chinaeast2`
+- **LLMs with OpenAI-compatible schemas (OpenAI / SiliconCloud / Zhipu)**
 
+  See [SiliconCloud](https://docs.siliconflow.cn/quickstart), [Zhipu](https://open.bigmodel.cn/dev/api/thirdparty-frame/openai-sdk)
 
-```bash
-pdf2zh example.pdf -s azure
-```
+  Set ENVs to construct an endpoint like: `{OPENAI_BASE_URL}/chat/completions`
+  - `OPENAI_BASE_URL` (Optional), e.g., `export OPENAI_BASE_URL=https://api.openai.com/v1`
+  - `OPENAI_API_KEY`, e.g., `export OPENAI_API_KEY=xxx`
+
+  ```bash
+  pdf2zh example.pdf -s openai:gpt-4o
+  ```
+
+- **Azure**
+
+  See [What is Azure Text Translation?](https://docs.azure.cn/en-us/ai-services/translator/text-translation-overview)
 
-### Use regex to specify formula fonts and characters that need to be preserved
+  The following ENVs are required:
+  - `AZURE_APIKEY`, e.g., `export AZURE_APIKEY=xxx`
+  - `AZURE_ENDPOINT`, e.g., `export AZURE_ENDPOINT=https://api.translator.azure.cn/`
+  - `AZURE_REGION`, e.g., `export AZURE_REGION=chinaeast2`
+
+
+  ```bash
+  pdf2zh example.pdf -s azure
+  ```
+
+### Translation with exceptions
+Use regex to specify formula fonts and characters that need to be preserved:
 
 ```bash
 pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"
 ```
 
+### Using GUI
+
+<img src="./gui/img/before.png" width="650" alt="Original">
+See [the GUI documentation](./gui/README.MD) for more details.
+
 ## Preview
 
 ![image](https://github.com/user-attachments/assets/57e1cde6-c647-4af8-8f8f-587a40050dde)
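The `-f` and `-c` patterns in the README example above can be tried against sample font names and characters before running a full translation. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the patterns are applied with `re.match` (anchored at the start of the string), which this diff does not confirm for `pdf2zh` itself.

```python
import re

# Patterns copied verbatim from the README example.
FONT_PATTERN = r"(CM[^RT].*|MS.*|.*Ital)"
CHAR_PATTERN = r"(\(|\||\)|\+|=|\d|[\u0080-\ufaff])"


def is_preserved_font(font_name: str) -> bool:
    """Return True if the font name matches the formula-font pattern.

    Assumption: matching is anchored at the start of the name (re.match).
    """
    return re.match(FONT_PATTERN, font_name) is not None


def is_preserved_char(char: str) -> bool:
    """Return True if a single character matches the preserved-character pattern."""
    return re.match(CHAR_PATTERN, char) is not None


if __name__ == "__main__":
    for name in ["CMMI10", "CMR10", "MSBM10", "Times-Italic", "Helvetica"]:
        print(name, is_preserved_font(name))
    for ch in ["=", "7", "\u03b1", "a"]:
        print(repr(ch), is_preserved_char(ch))
```

Under that assumption, Computer Modern math fonts such as `CMMI10` are kept while the roman text font `CMR10` is translated, and digits, operators, and non-ASCII symbols count as formula characters.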

+ 26 - 0
gui/README.MD

@@ -0,0 +1,26 @@
+# GUI (early test version)
+
+This subfolder provides the GUI mode of `pdf2zh`.
+
+## Usage:
+1. Make sure that the main tool and additional dependencies are installed. Extra modules required by the GUI are:
+
+- GUI framework: `pip install gradio`
+- PDF preview: `pip install pdf2image`
+- PDF converter:
+  - macOS: `brew install poppler`
+  - Other platforms: see [the pdf2image page](https://pypi.org/project/pdf2image/) for details
+
+2. Run
+- `cd gui`
+- `python main.py`
+
+3. Drop the PDF file into the window and click `Translate`.
+
+## Sample PDFs:
+
+  <img src="./img/before.png" width="650" alt="Original">
+  <img src="./img/after.png" width="650" alt="Translated">
+
+## Maintenance
+GUI maintained by [Rongxin](https://github.com/reycn)
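The extra dependencies listed in the usage steps can be checked before launching the GUI. This is a minimal sketch, not part of the pull request: `missing_gui_dependencies` is a hypothetical helper, and it assumes that poppler is discoverable on `PATH` through its `pdftoppm` executable, which is what `pdf2image` calls.

```python
import importlib.util
import shutil


def missing_gui_dependencies(
    modules=("gradio", "pdf2image"),
    executables=("pdftoppm",),
):
    """Return the names of Python modules and executables that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which returns None when the executable is not on PATH.
    missing += [e for e in executables if shutil.which(e) is None]
    return missing


if __name__ == "__main__":
    problems = missing_gui_dependencies()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All GUI dependencies found.")
```

Running it inside the environment used for `python main.py` reports anything that still needs `pip install` or `brew install`.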

BIN
gui/img/after.png


BIN
gui/img/before.png


+ 167 - 0
gui/main.py

@@ -0,0 +1,167 @@
+import os
+import re
+import subprocess
+import tempfile
+from pathlib import Path
+
+import gradio as gr
+from pdf2image import convert_from_path
+
+
+def upload_file(file, service, progress=gr.Progress()):
+    """Handle file upload, validation, and initial preview."""
+    if not file or not os.path.exists(file):
+        return None, None, gr.update(visible=False)
+
+    progress(0.3, desc="Converting PDF for preview...")
+    try:
+        # Convert first page for preview
+        images = convert_from_path(file, first_page=1, last_page=1)
+        preview_image = images[0] if images else None
+
+        return file, preview_image, gr.update(visible=True)
+    except Exception as e:
+        print(f"Error converting PDF: {e}")
+        return None, None, gr.update(visible=False)
+
+
+def translate(file_path, service, progress=gr.Progress()):
+    """Translate PDF content using selected service."""
+    if not file_path:
+        return None, None, gr.update(visible=False)
+
+    progress(0, desc="Starting translation...")
+
+    # Create a temporary working directory (removed automatically on exit)
+    with tempfile.TemporaryDirectory() as temp_dir:
+        # Create safe paths using pathlib
+        temp_path = Path(temp_dir)
+        input_pdf = temp_path / "input.pdf"
+
+        # Copy input file to temp directory
+        progress(0.2, desc="Preparing files...")
+        with open(file_path, "rb") as src, open(input_pdf, "wb") as dst:
+            dst.write(src.read())
+
+        # Map service names to pdf2zh service options
+        service_map = {
+            "Google": "google",
+            "DeepL": "deepl",
+            "DeepLX": "deeplx",
+            "Ollama": "ollama:gemma2",
+        }
+        selected_service = service_map.get(service, "google")
+        lang_to = "zh"
+
+        # Execute translation in temp directory with real-time progress
+        progress(0.3, desc=f"Starting translation with {selected_service}...")
+
+        # Create output directory for translated files
+        output_dir = Path("gradio_files") / "outputs"
+        output_dir.mkdir(parents=True, exist_ok=True)
+        final_output = output_dir / f"translated_{os.path.basename(file_path)}"
+
+        # Execute translation command
+        command = ["pdf2zh", str(input_pdf), "-s", selected_service]
+        print(f"Executing command: {command}")
+        print(f"Files in temp directory: {os.listdir(temp_path)}")
+
+        process = subprocess.Popen(
+            command,
+            cwd=temp_path,
+            stdout=subprocess.PIPE,
+            stderr=subprocess.STDOUT,
+            universal_newlines=True,
+        )
+
+        # Monitor progress from command output
+        while True:
+            output = process.stdout.readline()
+            if output == "" and process.poll() is not None:
+                break
+            if output:
+                print(f"Command output: {output.strip()}")
+                # Look for percentage in output
+                match = re.search(r"(\d+)%", output.strip())
+                if match:
+                    percent = int(match.group(1))
+                    # Map command progress (0-100%) to our progress range (30-80%)
+                    progress_val = 0.3 + (percent * 0.5 / 100)
+                    progress(progress_val, desc=f"Translating content: {percent}%")
+
+        # Get the return code
+        return_code = process.poll()
+        print(f"Command completed with return code: {return_code}")
+
+        # Check if translation was successful
+        translated_file = temp_path / f"input-{lang_to}.pdf"
+        print(f"Files after translation: {os.listdir(temp_path)}")
+
+        if not translated_file.exists():
+            print(f"Translation failed: Output file not found at {translated_file}")
+            return None, None, gr.update(visible=False)
+
+        # Copy the translated file to a permanent location
+        progress(0.8, desc="Saving translated file...")
+        with open(translated_file, "rb") as src, open(final_output, "wb") as dst:
+            dst.write(src.read())
+
+        # Generate preview of translated PDF
+        progress(0.9, desc="Generating preview...")
+        try:
+            translated_preview = convert_from_path(
+                str(final_output), first_page=1, last_page=1
+            )[0]
+        except Exception as e:
+            print(f"Error generating preview: {e}")
+            translated_preview = None
+
+    progress(1.0, desc="Translation complete!")
+    return str(final_output), translated_preview, gr.update(visible=True)
+
+
+with gr.Blocks(title="PDF Translation") as app:
+    gr.Markdown("# PDF Translation")
+
+    with gr.Row():
+        with gr.Column(scale=1):
+            service = gr.Dropdown(
+                label="Service",
+                choices=["Google", "DeepL", "DeepLX", "Ollama"],
+                value="Google",
+            )
+
+            file_input = gr.File(
+                label="Upload",
+                file_count="single",
+                file_types=[".pdf"],
+                type="filepath",
+            )
+
+            output_file = gr.File(label="Download Translation", visible=False)
+            translate_btn = gr.Button("Translate", variant="primary", visible=False)
+            # add a text description
+            gr.Markdown(
+                """*Note: Please make sure that [pdf2zh](https://github.com/Byaidu/PDFMathTranslate) is correctly configured.*
+                GUI implemented by: [Rongxin](https://github.com/reycn)
+                [Early Version]
+                """
+            )
+
+        with gr.Column(scale=2):
+            preview = gr.Image(label="Preview", visible=True)
+
+    # Event handlers
+    file_input.upload(
+        upload_file,
+        inputs=[file_input, service],
+        outputs=[file_input, preview, translate_btn],
+    )
+
+    translate_btn.click(
+        translate,
+        inputs=[file_input, service],
+        outputs=[output_file, preview, output_file],
+    )
+
+app.launch(debug=True, inbrowser=True, share=False)
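The progress handling in `translate` maps a percentage found in the command output onto the 30-80% band of the GUI progress bar. That step can be isolated as a small pure function, which makes it easy to test without running a translation. The name `scale_progress` and the clamping to 100 are assumptions added for this sketch; the pull request does the arithmetic inline.

```python
import re

PERCENT_RE = re.compile(r"(\d+)%")


def scale_progress(line, start=0.3, end=0.8):
    """Map an 'NN%' figure found in `line` onto the [start, end] range.

    Returns None when the line carries no percentage.
    """
    match = PERCENT_RE.search(line)
    if match is None:
        return None
    # Clamp unexpected values above 100 so the bar never overshoots `end`.
    percent = min(int(match.group(1)), 100)
    return start + (end - start) * percent / 100


if __name__ == "__main__":
    for sample in ["  0%|          | 0/12", " 50%|#####     | 6/12", "loading model"]:
        print(repr(sample), scale_progress(sample))
```

Inside the read loop, a `None` result would simply skip the progress update for that line.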

+ 5 - 0
pdf2zh/converter.py

@@ -23,6 +23,7 @@ from pdf2zh.translator import (
     BaseTranslator,
     GoogleTranslator,
     DeepLTranslator,
+    DeepLXTranslator,
     OllamaTranslator,
     OpenAITranslator,
     AzureTranslator,
@@ -375,6 +376,10 @@ class TextConverter(PDFConverter[AnyIO]):
             self.translator: BaseTranslator = DeepLTranslator(
                 service, lang_out, lang_in, None
             )
+        elif param[0] == "deeplx":
+            self.translator: BaseTranslator = DeepLXTranslator(
+                service, lang_out, lang_in, None
+            )
         elif param[0] == "ollama":
             self.translator: BaseTranslator = OllamaTranslator(
                 service, lang_out, lang_in, param[1]
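The dispatch above selects a translator from the first element of `param` and passes an optional model name as the second. One way to picture that mapping is a registry keyed by service name. The sketch below is illustrative only: the stub classes stand in for the real ones in `pdf2zh.translator`, and it assumes `param` comes from splitting the `-s` value on `:` (as in `ollama:gemma2`), which this hunk does not show.

```python
class StubTranslator:
    """Hypothetical stand-in; the real classes live in pdf2zh.translator."""

    def __init__(self, service, lang_out, lang_in, model):
        self.service = service
        self.lang_out = lang_out
        self.lang_in = lang_in
        self.model = model


class StubDeepLX(StubTranslator):
    pass


class StubOllama(StubTranslator):
    pass


# Service name -> translator class (stubs here, for illustration only).
REGISTRY = {"deeplx": StubDeepLX, "ollama": StubOllama}


def parse_service(service):
    """Split 'name[:model]' into (name, model); model is None when absent."""
    name, _, model = service.partition(":")
    return name, (model or None)


def make_translator(service, lang_out, lang_in):
    """Instantiate the translator registered for the given service string."""
    name, model = parse_service(service)
    try:
        cls = REGISTRY[name]
    except KeyError:
        raise ValueError(f"Unsupported translation service: {name!r}") from None
    return cls(service, lang_out, lang_in, model)


if __name__ == "__main__":
    translator = make_translator("ollama:gemma2", "zh", "en")
    print(type(translator).__name__, translator.model)
```

With a registry, adding a service such as `deeplx` becomes a one-line change, at the cost of an extra level of indirection compared with the explicit `elif` chain.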

+ 62 - 6
pdf2zh/translator.py

@@ -1,11 +1,13 @@
 import html
-import re
+import logging
 import os
+import re
+from json import dumps, loads
+
+import deepl
 import ollama
-import logging
-import requests
 import openai
-import deepl
+import requests
 from azure.ai.translation.text import TextTranslationClient
 from azure.core.credentials import AzureKeyCredential
 
@@ -29,8 +31,8 @@ class BaseTranslator:
 
 class GoogleTranslator(BaseTranslator):
     def __init__(self, service, lang_out, lang_in, model):
-        lang_out='zh-CN' if lang_out=='auto' else lang_out
-        lang_in='en' if lang_in=='auto' else lang_in
+        lang_out = "zh-CN" if lang_out == "auto" else lang_out
+        lang_in = "en" if lang_in == "auto" else lang_in
         super().__init__(service, lang_out, lang_in, model)
         self.session = requests.Session()
         self.base_link = "http://translate.google.com/m"
@@ -55,6 +57,60 @@ class GoogleTranslator(BaseTranslator):
         return result
 
 
+class DeepLXTranslator(BaseTranslator):
+    def __init__(self, service, lang_out, lang_in, model):
+        lang_out = "zh" if lang_out == "auto" else lang_out
+        lang_in = "en" if lang_in == "auto" else lang_in
+        super().__init__(service, lang_out, lang_in, model)
+        # os.getenv returns None for an unset variable (it never raises
+        # KeyError), so the required key is checked explicitly.
+        auth_key = os.getenv("DEEPLX_AUTH_KEY")
+        if not auth_key:
+            raise ValueError(
+                "The environment variable 'DEEPLX_AUTH_KEY' is required but not set."
+            )
+        server_url = (
+            "https://api.deeplx.org"
+            if not os.getenv("DEEPLX_SERVER_URL")
+            else os.getenv("DEEPLX_SERVER_URL")
+        )
+
+        self.session = requests.Session()
+        self.base_link = f"{server_url}/{auth_key}/translate"
+        self.headers = {
+            "User-Agent": "Mozilla/4.0 (compatible;MSIE 6.0;Windows NT 5.1;SV1;.NET CLR 1.1.4322;.NET CLR 2.0.50727;.NET CLR 3.0.04506.30)"
+        }
+
+    def translate(self, text):
+        text = text[:5000]  # truncate to 5000 characters per request
+        response = self.session.post(
+            self.base_link,
+            dumps(
+                {
+                    "target_lang": self.lang_out,
+                    "text": text,
+                }
+            ),
+            headers=self.headers,
+        )
+        # 1. Status code test
+        if response.status_code == 200:
+            result = loads(response.text)
+        else:
+            raise ValueError("HTTP error: " + str(response.status_code))
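+        # 2. Result test: the translated text is expected under "data"
+        try:
+            result = result["data"]
+        except KeyError as e:
+            raise ValueError("No valid key in DeepLX's response") from e
+        # 3. Result length check: an empty string means nothing was
+        #    translated, so report it instead of returning it
+        if len(result) == 0:
+            raise ValueError("Empty translation result")
+        # All checks passed
+        return result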
+
+
 class DeepLTranslator(BaseTranslator):
     def __init__(self, service, lang_out, lang_in, model):
         lang_out='ZH' if lang_out=='auto' else lang_out