The First Multimodal Information Based Speech Processing (MISP) Challenge

Data, Tasks, Baselines and Results

Conference paper (2022)

Authors

Hang Chen University of Science and Technology of China

Hengshun Zhou University of Science and Technology of China

Jun Du University of Science and Technology of China

Chin-Hui Lee Georgia Institute of Technology

Jingdong Chen Northwestern Polytechnical University

Shinji Watanabe Carnegie Mellon University

Sabato Marco Siniscalchi University of Enna Kore, Georgia Institute of Technology

O.E. Scharenborg

Di-Yuan Liu iFlytek

More Authors... External organisation

DOI: https://doi.org/10.1109/ICASSP43922.2022.9746683

Automatic speech recognition; MISP challenge; Microphone array; Audio-visual; Wake word spotting

To reference this document use:

http://resolver.tudelft.nl/uuid:488cad9d-badf-4818-8cb2-1b28d5d44c01

More Info

Published Date

2022

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

In this paper we discuss the rationale of the Multimodal Information based Speech Processing (MISP) Challenge, and provide a detailed description of the data recorded, the two evaluation tasks and the corresponding baselines, followed by a summary of submitted systems and evaluation results. The MISP Challenge aims at tackling speech processing tasks in different scenarios by introducing information about an additional modality (e.g., video or text), which will hopefully lead to better environmental and speaker robustness in realistic applications. In the first MISP Challenge, two benchmark datasets recorded in a real-home TV room, together with two reproducible open-source baseline systems, have been released to promote research in audio-visual wake word spotting (AVWWS) and audio-visual speech recognition (AVSR). To our knowledge, MISP is the first open evaluation challenge to tackle real-world issues of AVWWS and AVSR in the home TV scenario.

Files

Paper_misp2021_icassp2022.pdf

(pdf | 0.267 Mb)

Unknown license

Download not available

The_First_Multimodal_Informati... (pdf)

(pdf | 0.958 Mb)

- Embargo expired on 01-07-2023

Unknown license