(简体中文|English)
FunASR aims to build a bridge between academic research and industrial applications of speech recognition. By supporting the training and fine-tuning of the industrial-grade speech recognition models released on ModelScope, it lets researchers and developers conduct research on and production of speech recognition models more conveniently, and promotes the development of the speech recognition ecosystem. ASR for Fun!
Highlights | News | Installation | Quick Start | Runtime | Model Zoo | Contact
Please refer to the installation docs. A quick sanity check after installing is sketched below.
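As a minimal sketch (assuming FunASR was installed from PyPI under the package name `funasr`, e.g. via `pip install -U funasr`), you can confirm the installation from Python:

```python
# Verify that FunASR is installed and report its version.
# Assumes the PyPI package name is `funasr`; raises PackageNotFoundError otherwise.
from importlib.metadata import version

print(version("funasr"))
```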
FunASR supports deploying pre-trained or further fine-tuned models as a service. The CPU version of the Chinese offline file transcription service has been released; details can be found in the docs. More detailed information about service deployment can be found in the deployment roadmap.
Quick start for new users (tutorial)
FunASR supports inference and fine-tuning of models trained on industrial datasets of tens of thousands of hours; for details, please refer to modelscope_egs. It also supports training and fine-tuning of models on academic benchmark datasets; for details, please refer to egs. The models cover speech recognition (ASR), voice activity detection (VAD), punctuation restoration, language modeling, speaker verification, speaker separation, and multi-party conversation speech recognition. For a detailed list of models, please refer to the Model Zoo. A short inference sketch follows.
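For illustration, a minimal inference sketch through the ModelScope pipeline (a sketch, not the definitive API: it assumes `funasr` and `modelscope` are installed, the model ID is one illustrative Paraformer model from the model zoo, and the audio path is a placeholder; exact arguments may differ across versions):

```python
# A minimal sketch of running ASR inference through the ModelScope pipeline.
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    # Illustrative model ID from the FunASR model zoo on ModelScope.
    model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
)

# `audio_in` accepts a local file path or a URL to a wav file.
rec_result = inference_pipeline(audio_in="example.wav")
print(rec_result)
```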
If you encounter problems in use, you can open an issue directly on the GitHub page.
You can also scan the DingTalk or WeChat group QR code below to join the community for communication and discussion.
| DingTalk group | WeChat group |
|---|---|
| (QR code) | (QR code) |
The contributors can be found in the contributors list.
This project is licensed under the MIT License. FunASR also contains various third-party components and some code modified from other repos under other open-source licenses. The use of the pretrained models is subject to the model license.
@inproceedings{gao2023funasr,
author={Zhifu Gao and Zerui Li and Jiaming Wang and Haoneng Luo and Xian Shi and Mengzhe Chen and Yabin Li and Lingyun Zuo and Zhihao Du and Zhangyu Xiao and Shiliang Zhang},
title={FunASR: A Fundamental End-to-End Speech Recognition Toolkit},
year={2023},
booktitle={INTERSPEECH},
}
@inproceedings{An2023bat,
author={Keyu An and Xian Shi and Shiliang Zhang},
title={BAT: Boundary aware transducer for memory-efficient and low-latency ASR},
year={2023},
booktitle={INTERSPEECH},
}
@inproceedings{wang2023told,
author={Jiaming Wang and Zhihao Du and Shiliang Zhang},
title={{TOLD:} {A} Novel Two-Stage Overlap-Aware Framework for Speaker Diarization},
year={2023},
booktitle={ICASSP},
}
@inproceedings{gao22b_interspeech,
author={Zhifu Gao and ShiLiang Zhang and Ian McLoughlin and Zhijie Yan},
title={{Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition}},
year={2022},
booktitle={Proc. Interspeech 2022},
pages={2063--2067},
doi={10.21437/Interspeech.2022-9996}
}