2 years ago · a1bf1f4d30
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -31,12 +31,7 @@ Overview
    ./academic_recipe/sd_recipe.md
 
 
-.. toctree::
-   :maxdepth: 1
-   :caption: Model Zoo
 
-   ./modelscope_models.md
-   ./huggingface_models.md
 
 .. toctree::
    :maxdepth: 1
@@ -56,11 +51,13 @@ Overview
 
    Undo
 
+
 .. toctree::
    :maxdepth: 1
-   :caption: Funasr Library
+   :caption: Model Zoo
 
-   ./build_task.md
+   ./modelscope_models.md
+   ./huggingface_models.md
 
 .. toctree::
    :maxdepth: 1
@@ -82,6 +79,13 @@ Overview
    ./benchmark/benchmark_onnx_cpp.md
    ./benchmark/benchmark_libtorch.md
 
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Funasr Library
+
+   ./build_task.md
+
 .. toctree::
    :maxdepth: 1
    :caption: Papers
--- a/docs/modelscope_models.md
+++ b/docs/modelscope_models.md
@@ -13,7 +13,7 @@ Here we provided several pretrained models on different datasets. The details of
 |:--------------------------------------------------------------------------------------------------------------------------------------------------:|:--------:|:--------------------------------:|:----------:|:---------:|:--------------:|:--------------------------------------------------------------------------------------------------------------------------------|
 |        [Paraformer-large](https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary)        | CN & EN  | Alibaba Speech Data (60000hours) |    8404    |   220M    |    Offline     | Duration of input wav <= 20s                                                                                                    |
 | [Paraformer-large-long](https://www.modelscope.cn/models/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary) | CN & EN  | Alibaba Speech Data (60000hours) |    8404    |   220M    |    Offline     | Which ould deal with arbitrary length input wav                                                                                 |
-| [paraformer-large-contextual](https://www.modelscope.cn/models/damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404/summary) | CN & EN  | Alibaba Speech Data (60000hours) |    8404    |   220M    |    Offline     | Which supports the hotword customization based on the incentive enhancement, and improves the recall and precision of hotwords. |
+| [Paraformer-large-contextual](https://www.modelscope.cn/models/damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404/summary) | CN & EN  | Alibaba Speech Data (60000hours) |    8404    |   220M    |    Offline     | Which supports the hotword customization based on the incentive enhancement, and improves the recall and precision of hotwords. |
 |              [Paraformer](https://modelscope.cn/models/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/summary)              | CN & EN  | Alibaba Speech Data (50000hours) |    8358    |    68M    |    Offline     | Duration of input wav <= 20s                                                                                                    |
 |          [Paraformer-online](https://modelscope.cn/models/damo/speech_paraformer_asr_nat-zh-cn-16k-common-vocab8358-tensorflow1/summary)           | CN & EN  | Alibaba Speech Data (50000hours) |    8404    |    68M    |     Online     | Which could deal with streaming input                                                                                           |
 |       [Paraformer-tiny](https://www.modelscope.cn/models/damo/speech_paraformer-tiny-commandword_asr_nat-zh-cn-16k-vocab544-pytorch/summary)       |    CN    |  Alibaba Speech Data (200hours)  |    544     |   5.2M    |    Offline     | Lightweight Paraformer model which supports Mandarin command words recognition                                                  |
--- a/egs_modelscope/asr/TEMPLATE/README.md
+++ b/egs_modelscope/asr/TEMPLATE/README.md
@@ -1,7 +1,7 @@
 # Speech Recognition
 
 > **Note**: 
-> The modelscope pipeline supports all the models in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope) to inference and finetine. Here we take typic model as example to demonstrate the usage.
+> The modelscope pipeline supports all the models in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope) to inference and finetine. Here we take the typic models as examples to demonstrate the usage.
 
 ## Inference
 
--- a/egs_modelscope/vad/TEMPLATE/README.md
+++ b/egs_modelscope/vad/TEMPLATE/README.md
@@ -1,7 +1,7 @@
 # Voice Activity Detection
 
 > **Note**: 
-> The modelscope pipeline supports all the models in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope) to inference and finetine. Here we take model of FSMN-VAD as example to demonstrate the usage.
+> The modelscope pipeline supports all the models in [model zoo](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_models.html#pretrained-models-on-modelscope) to inference and finetine. Here we take the model of FSMN-VAD as example to demonstrate the usage.
 
 ## Inference