Update wav_utils.py

The dictionary contains no uppercase letters, so any uppercase letters in the annotated text come out as "unk" in the finetune result. Convert the annotated text to lowercase uniformly when it is read.
zhuzizyf 3 years ago
Parent
Commit
1a39b6f981
1 changed file with 1 addition and 1 deletion
funasr/utils/wav_utils.py (+1 -1)

@@ -309,7 +309,7 @@ def filter_wav_text(data_dir, dataset):
         if len(parts) < 2:
             continue
         sample_name = parts[0]
-        text_dict[sample_name] = " ".join(parts[1:])
+        text_dict[sample_name] = " ".join(parts[1:]).lower()
     filter_count = 0
     with open(wav_file, "w") as f_wav, open(text_file, "w") as f_text:
         for sample_name, wav_path in wav_dict.items():
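
The effect described in the commit message can be illustrated with a minimal sketch. It is not the FunASR implementation: the vocabulary `VOCAB`, the `<unk>` spelling, the word-level lookup in `tokens_to_ids`, and the helper `read_text_lines` are all hypothetical names invented for illustration, and the way `parts` is produced (whitespace split of a stripped line) is an assumption, since the hunk does not show it.

```python
# Hypothetical lowercase-only vocabulary; the real dictionary is not part of this commit.
VOCAB = {"<unk>": 0, "hello": 1, "world": 2}


def tokens_to_ids(text, vocab=VOCAB):
    """Map whitespace-separated tokens to ids, falling back to <unk> for misses."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in text.split()]


def read_text_lines(lines, lowercase=True):
    """Build {sample_name: transcript}, mirroring the loop touched by the patch.

    Assumption: each line is "<sample_name> <transcript words...>".
    """
    text_dict = {}
    for line in lines:
        parts = line.strip().split()
        if len(parts) < 2:
            continue  # no transcript on this line, skip it
        transcript = " ".join(parts[1:])
        text_dict[parts[0]] = transcript.lower() if lowercase else transcript
    return text_dict


lines = ["utt1 Hello World", "utt2"]
before = read_text_lines(lines, lowercase=False)  # behaviour before the patch
after = read_text_lines(lines, lowercase=True)    # behaviour after the patch

print(tokens_to_ids(before["utt1"]))  # [0, 0]: both tokens miss the vocabulary
print(tokens_to_ids(after["utt1"]))   # [1, 2]: both tokens are found
```

Lowercasing at read time keeps the fix in one place, but it only makes sense when the dictionary really is lowercase-only; with a case-sensitive dictionary the same call would discard information.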