A classification approach combining machine learning and representational similarity analysis was performed on neurophysiological data (i.e., electrophysiological responses) to distinguish speakers of different stances. The trial-based classification ...
Given that metaphoric gestures may play a non-negligible role in multimodal communication, this paper probes into how metaphoric gestures co-contribute to the construction of rhetorical behavior in the discourse type of public ...
In recent years, research on multimodal interaction has made rapid progress owing to the development of artificial intelligence and big data technology, as well as new findings in the study of human psychology and behavior. Nevertheless, the human- ...
We propose a neural-network-based method to detect language anomalies using electroencephalogram (EEG) signals. To the best of our knowledge, there have been few studies on classifying single-trial EEG signals related to language processing such ...
Audio-visual understanding is usually challenged by the gap in bridging complementary audio and visual information. Motivated by recent audio-visual studies, a closed-set word-level speech recognition scheme is proposed for the Mandarin Audio-...
Spoken language understanding (SLU) converts user utterances into structured semantic forms. There are still two main issues for SLU: robustness to ASR errors and the data sparsity of new and extended domains. In this paper, we propose a robust SLU ...
Spoken language understanding (SLU) is an important part of a spoken dialogue system (SDS). In this paper, we focus on how to extract a set of act-slot-value tuples from users’ utterances in the 1st Chinese Audio-Textual Spoken Language Understanding ...
Spoken language understanding (SLU) is a key component of conversational dialogue systems, converting user utterances into semantic representations. Previous works focus almost exclusively on parsing semantics from textual inputs (top hypothesis of speech ...
In this paper, we present a series of methods to improve the performance of spoken language understanding in the 1st Chinese Audio-Textual Spoken Language Understanding Challenge (CATSLU 2019), which aims to improve robustness to automatic ...
Mental health disorders are among the leading causes of disability. Despite the prevalence of mental health disorders, there is a large gap between the needs and resources available for their assessment and treatment. Automatic behaviour analysis for ...
The ever-growing research in computer vision has created new avenues for user interaction. Speech commands and gesture recognition are already being applied alongside various touch-based inputs. It is therefore foreseeable that the use of multimodal input ...
High-accuracy physiological emotion recognition typically requires participants to wear or attach obtrusive sensors (e.g., Electroencephalograph). To achieve precise emotion recognition using only wearable body-worn physiological sensors, my doctoral ...
Emotion recognition in the wild has been a hot research topic in the field of affective computing. Though some progress has been achieved, emotion recognition in the wild remains an unsolved problem due to the challenges of head movement, ...
Group cohesiveness is a compelling and often-studied construct in group dynamics and group performance. The enormous number of web images of groups of people can be used to develop an effective method to detect group cohesiveness. This paper ...
In this paper, we propose a hybrid deep learning network for predicting group cohesion in images. This is a regression problem whose objective is to predict the Group Cohesion Score (GCS), which lies in the range [0, 3]. In order to solve this ...
With the rapid progress in computing and sensory technologies, we will enter the era of human-robot coexistence in the not-too-distant future, and it is time to address the challenges of multimodal interaction. Should a robot take a humanoid form? ...
Recent years have initiated a paradigm shift from pure task-based human-machine interfaces towards socially-aware interaction. Advances in deep learning have led to anthropomorphic interfaces with robust sensing capabilities that come close to or even ...
Nowadays, more and more papers are submitted to various periodicals and conferences. Typically, reviewers need to read through a paper and give it a review comment and score based on certain criteria. This review process is labor ...
Internet of Things technologies yield large amounts of real-life speech data related to human emotions. Yet, labelled data of human emotion from spontaneous speech are extremely limited due to the difficulties in the annotation of such large volumes of ...
Using neural networks to classify infant vocalisations into important subclasses (such as crying versus speech) is an emergent task in speech technology. One of the biggest roadblocks standing in the way of progress lies in the datasets: The performance ...