|
|
@@ -18,10 +18,12 @@ itn_inference_pipline = pipeline(
|
|
|
|
|
|
itn_result = itn_inference_pipline(text_in='百二十三')
|
|
|
print(itn_result)
|
|
|
+# 123
|
|
|
```
|
|
|
- read text data directly.
|
|
|
```python
|
|
|
rec_result = inference_pipeline(text_in='一九九九年に誕生した同商品にちなみ、約三十年前、二十四歳の頃の幸四郎の写真を公開。')
|
|
|
+# 1999年に誕生した同商品にちなみ、約30年前、24歳の頃の幸四郎の写真を公開。
|
|
|
```
|
|
|
- text stored via URL, example: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt
|
|
|
```python
|
|
|
@@ -30,22 +32,6 @@ rec_result = inference_pipeline(text_in='https://isv-data.oss-cn-hangzhou.aliyun
|
|
|
|
|
|
For the full demo code, please refer to the [demo](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing/inverse_text_normalization).
|
|
|
|
|
|
-### Modify Your Own ITN Model
|
|
|
-The rule-based ITN code is open-sourced in [FunTextProcessing](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing), users can modify by their own grammar rules. After modify the rules, the users can export their own ITN models in local directory.
|
|
|
-
|
|
|
-### Export ITN Model
|
|
|
-Use the code in FunASR to export ITN model. An example to export ITN model to local folder is shown as below.
|
|
|
-```shell
|
|
|
-cd fun_text_processing/inverse_text_normalization/
|
|
|
-python export_models.py --language ja --export_dir ./itn_models/
|
|
|
-```
|
|
|
-
|
|
|
-### Evaluate ITN Model
|
|
|
-Users can evaluate their own ITN model in local directory. Here is an example:
|
|
|
-```shell
|
|
|
-python fun_text_processing/inverse_text_normalization/inverse_normalize.py --input_file ja_itn_example.txt --cache_dir ./itn_models/ --output_file output.txt --language=ja
|
|
|
-```
|
|
|
-
|
|
|
### API-reference
|
|
|
#### Define pipeline
|
|
|
- `task`: `Tasks.inverse_text_processing`
|
|
|
@@ -58,4 +44,19 @@ python fun_text_processing/inverse_text_normalization/inverse_normalize.py --inp
|
|
|
- text bytes, `e.g.`: "一九九九年に誕生した同商品にちなみ、約三十年前、二十四歳の頃の幸四郎の写真を公開。"
|
|
|
- text file, `e.g.`: https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_text/ja_itn_example.txt
|
|
|
In the case of `text file` input, `output_dir` must be set to save the output results.
|
|
|
-
|
|
|
+
|
|
|
+## Modify Your Own ITN Model
|
|
|
+The rule-based ITN code is open-sourced in [FunTextProcessing](https://github.com/alibaba-damo-academy/FunASR/tree/main/fun_text_processing), so users can modify the grammar rules for different languages. Taking Japanese as an example, users can add their own whitelist entries in `fun_text_processing/inverse_text_normalization/ja/data/whitelist.tsv`. After modifying the rules, users can export their own ITN models to a local directory.
|
|
|
+
|
|
|
+### Export ITN Model
|
|
|
+Use the code in FunASR to export the ITN model. An example of exporting the ITN model to a local folder is shown below.
|
|
|
+```shell
|
|
|
+cd fun_text_processing/inverse_text_normalization/
|
|
|
+python export_models.py --language ja --export_dir ./itn_models/
|
|
|
+```
|
|
|
+
|
|
|
+### Evaluate ITN Model
|
|
|
+Users can evaluate their own ITN model in a local directory. Here is an example:
|
|
|
+```shell
|
|
|
+python fun_text_processing/inverse_text_normalization/inverse_normalize.py --input_file ja_itn_example.txt --cache_dir ./itn_models/ --output_file output.txt --language=ja
|
|
|
+```
|
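The evaluation command above writes the normalized text to `output.txt`. As a rough follow-up check, that file could be compared line by line against hand-written expected results. The sketch below is not part of FunASR: the helper `exact_match_rate` and the reference file `reference.txt` are hypothetical names introduced for illustration, and it assumes both files hold one sentence per line in the same order.

```python
from pathlib import Path


def exact_match_rate(output_file: str, reference_file: str) -> float:
    """Return the fraction of output lines that exactly match the reference.

    Hypothetical helper, not a FunASR API. Assumes one sentence per line
    and the same line order in both files.
    """
    hyp = Path(output_file).read_text(encoding="utf-8").splitlines()
    ref = Path(reference_file).read_text(encoding="utf-8").splitlines()
    if len(hyp) != len(ref):
        raise ValueError(
            f"line count mismatch: {len(hyp)} output vs {len(ref)} reference"
        )
    if not ref:
        return 0.0
    # Ignore leading/trailing whitespace when comparing each pair of lines.
    matches = sum(h.strip() == r.strip() for h, r in zip(hyp, ref))
    return matches / len(ref)
```

For example, `exact_match_rate("output.txt", "reference.txt")` would return the share of sentences whose ITN output equals the expected text. Exact match is a strict criterion, so a single differing character counts the whole line as a miss.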