Search by Subject

Artificial Intelligence, Machine Learning, Computer Vision, Natural language processing

Searched The ACM Full-Text Collection (691,749 records)

Showing 1 - 20 of 2,231 Results

invited-talk
November 2021
Published By ACM
Modern Learning Methodologies for Co-Saliency Detection
- Junwei Han
HUMA'21: Proceedings of the 2nd International Workshop on Human-centric Multimedia Analysis, November 2021, pp 1, https://doi.org/10.1145/3475723.3487886

Visual saliency computing aims to imitate the human visual attention mechanism to identify the most prominent or unique areas or objects from a visual scene. It is one of the basic low-level image processing techniques and can be applied to many ...
Metrics: Total Citations 0; Total Downloads 38; Last 12 Months 8; Last 6 weeks 1
research-article
November 2021
Published By ACM
Using Feature Interaction among GPS Data for Road Intersection Detection
HUMA'21: Proceedings of the 2nd International Workshop on Human-centric Multimedia Analysis, November 2021, pp 31–37, https://doi.org/10.1145/3475723.3484249

Road intersection plays a vital role in road network construction, automatic drive, and intelligent transportation systems. Most methods detect road intersections only using geometrical features without spatio-temporal features, leading to insufficient ...
Metrics: Total Citations 0; Total Downloads 85; Last 12 Months 50; Last 6 weeks 5
Supplementary Material: huma3497.mp4
research-article
November 2021
Published By ACM
A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric
- Yitian Yuan,
- Xiaohan Lan,
- Xin Wang,
- Long Chen,
- Zhi Wang,
- Wenwu Zhu
HUMA'21: Proceedings of the 2nd International Workshop on Human-centric Multimedia Analysis, November 2021, pp 13–21, https://doi.org/10.1145/3475723.3484247

Temporal Sentence Grounding in Videos (TSGV), i.e., grounding a natural language sentence which indicates complex human activities in a long and untrimmed video sequence, has received unprecedented attentions over the last few years. Although each newly ...
Metrics: Total Citations 13; Total Downloads 193; Last 12 Months 119; Last 6 weeks 13
short-paper
October 2021
Published By ACM
Multimodal Product Identification: Submission to Watch and Buy 2021 Challenge
WAB'21: Proceedings of the 1st Workshop on Multimodal Product Identification in Livestreaming and WAB Challenge, October 2021, pp 9–13, https://doi.org/10.1145/3475956.3484486

This technical report describes the overview of our approach to the "Watch and Buy: Multimodal Product Identification Challenge". Specifically, we tackle this problem with a three-stage framework, i.e., product detection, retrieval and classification. ...
Metrics: Total Citations 0; Total Downloads 93; Last 12 Months 30; Last 6 weeks 2
research-article
October 2021
Published By ACM
An Empirical Study of Uncertainty Gap for Disentangling Factors
Trustworthy AI'21: Proceedings of the 1st International Workshop on Trustworthy AI for Multimedia Computing, October 2021, pp 1–8, https://doi.org/10.1145/3475731.3484954

Disentangling factors has proven to be crucial for building interpretable AI systems. Disentangled generative models would have explanatory input variables to increase the trustworthiness and robustness. Previous works apply a progressive ...
Metrics: Total Citations 0; Total Downloads 69; Last 12 Months 30; Last 6 weeks 3
research-article
October 2021
Published By ACM
Frequency Centric Defense Mechanisms against Adversarial Examples
ADVM '21: Proceedings of the 1st International Workshop on Adversarial Learning for Multimedia, October 2021, pp 62–67, https://doi.org/10.1145/3475724.3483610

Adversarial example (AE) aims at fooling a Convolution Neural Network by introducing small perturbations in the input image. The proposed work uses the magnitude and phase of the Fourier Spectrum and the entropy of the image to defend against AE. We ...
Metrics: Total Citations 3; Total Downloads 82; Last 12 Months 35; Last 6 weeks 3
research-article
October 2021
Published By ACM
An Investigation on Sparsity of CapsNets for Adversarial Robustness
- Lei Zhao,
- Lei Huang
ADVM '21: Proceedings of the 1st International Workshop on Adversarial Learning for Multimedia, October 2021, pp 55–61, https://doi.org/10.1145/3475724.3483609

The routing-by-agreement mechanism in capsule networks (CapsNets) is used to build visual hierarchical relationships with a characteristic of assigning parts to wholes. The connections between capsules of different layers become sparser with more ...
Metrics: Total Citations 0; Total Downloads 31; Last 12 Months 8; Last 6 weeks 0
research-article
Open Access
October 2021
Published By ACM
Real World Robustness from Systematic Noise
ADVM '21: Proceedings of the 1st International Workshop on Adversarial Learning for Multimedia, October 2021, pp 42–48, https://doi.org/10.1145/3475724.3483607

Systematic error, which is not determined by chance, often refers to the inaccuracy (involving either the observation or measurement process) inherent to a system. In this paper, we exhibit some long-neglected but frequent-happening adversarial examples ...
Metrics: Total Citations 0; Total Downloads 165; Last 12 Months 60; Last 6 weeks 8
research-article
Open Access
October 2021
Published By ACM
Urban Footpath Image Dataset to Assess Pedestrian Mobility
UrbanMM'21: Proceedings of the 1st International Workshop on Multimedia Computing for Urban Data, October 2021, pp 23–30, https://doi.org/10.1145/3475721.3484313

This paper presents an urban footpath image dataset captured through crowdsourcing using the mapillary service (mobile application) and demonstrating its use for data analytics applications by employing object detection and image segmentation. The study ...
Metrics: Total Citations 0; Total Downloads 435; Last 12 Months 338; Last 6 weeks 22
research-article
Open Access
October 2021
Published By ACM
UrbanAccess: Query Driven Urban Analytics Platform for Detecting Complex Accessibility Event Patterns using Tactile Surfaces
UrbanMM'21: Proceedings of the 1st International Workshop on Multimedia Computing for Urban Data, October 2021, pp 19–21, https://doi.org/10.1145/3475721.3484312

The smart city concept has now become one of the key enablers in urban city management. The adoption and permeation of ICT and AI-driven techniques have enabled the authorities to resolve poor urban planning issues with improved delivery of citizen ...
Metrics: Total Citations 0; Total Downloads 153; Last 12 Months 113; Last 6 weeks 21
proceeding
October 2021
Published By ACM
ADVM '21: Proceedings of the 1st International Workshop on Adversarial Learning for Multimedia
Deep learning has achieved significant success in multimedia fields involving computer vision, natural language processing, and acoustics. However, research in adversarial learning also shows that they are highly vulnerable to adversarial examples. ...
Metrics: Total Citations 13; Total Downloads 1,276; Last 12 Months 591; Last 6 weeks 57
research-article
October 2021
Published By ACM
Spatio-temporal Convolutional Attention Network for Spotting Macro- and Micro-expression Intervals
FME'21: Proceedings of the 1st Workshop on Facial Micro-Expression: Advanced Techniques for Facial Expressions Generation and Spotting, October 2021, pp 25–30, https://doi.org/10.1145/3476100.3484463

Emotional detection based on facial expressions is an important procedure in high-risk tasks such as criminal investigation or lie detection. To reduce the impact of the inconsistency in the duration of macro- and micro-expression, we propose an ...
Metrics: Total Citations 2; Total Downloads 141; Last 12 Months 73; Last 6 weeks 7
Supplementary Material: FME21-fme3488.mp4
research-article
October 2021
Published By ACM
Facial Action Unit Detection with Local Key Facial Sub-region based Multi-label Classification for Micro-expression Analysis
FME'21: Proceedings of the 1st Workshop on Facial Micro-Expression: Advanced Techniques for Facial Expressions Generation and Spotting, October 2021, pp 11–18, https://doi.org/10.1145/3476100.3484462

Micro-expressions describe unconscious facial movements which reflect a person's psychological state even when there is an attempt to conceal it. Often used in psychological and forensic applications, their manual recognition requires professional ...
Metrics: Total Citations 2; Total Downloads 218; Last 12 Months 124; Last 6 weeks 12
Supplementary Material: FME21-fp3480.mp4
research-article
October 2021
Published By ACM
Invertable Frowns: Video-to-Video Facial Emotion Translation
ADGD '21: Proceedings of the 1st Workshop on Synthetic Multimedia - Audiovisual Deepfake Generation and Detection, October 2021, pp 25–33, https://doi.org/10.1145/3476099.3484317

We present Wav2Lip-Emotion, a video-to-video translation architecture that modifies facial expressions of emotion in videos of speakers. Previous work modifies emotion in images, uses a single image to produce a video with animated emotion, or puppets ...
Metrics: Total Citations 0; Total Downloads 111; Last 12 Months 54; Last 6 weeks 1
research-article
October 2021
Published By ACM
DmyT: Dummy Triplet Loss for Deepfake Detection
ADGD '21: Proceedings of the 1st Workshop on Synthetic Multimedia - Audiovisual Deepfake Generation and Detection, October 2021, pp 17–24, https://doi.org/10.1145/3476099.3484316

Recent progress in deep learning-based image generation has made it easier to create convincing fake videos called deepfakes. While the benefits of such technology are undeniable, it can also be used as realistic fake news support for mass disinformation. ...
Metrics: Total Citations 2; Total Downloads 204; Last 12 Months 85; Last 6 weeks 7
keynote
October 2021
Published By ACM
"Deepfake" Portrait Image Generation
- Jianfei Cai
ADGD '21: Proceedings of the 1st Workshop on Synthetic Multimedia - Audiovisual Deepfake Generation and Detection, October 2021, pp 5, https://doi.org/10.1145/3476099.3480396

With the prevailing of deep learning technology, especially generative adversarial networks (GAN), generating photo-realistic facial images has made huge progress. Image generation techniques have many good applications such as data augmentation, ...
Metrics: Total Citations 0; Total Downloads 202; Last 12 Months 104; Last 6 weeks 16
research-article
Open Access
October 2021
Published By ACM
Contextual Image Parsing via Panoptic Segment Sorting
MULL'21: Multimedia Understanding with Less Labeling on Multimedia Understanding with Less Labeling, October 2021, pp 27–36, https://doi.org/10.1145/3476098.3485056

Real-world visual recognition is far more complex than object recognition; there is stuff without distinctive shape or appearance, and the same object appearing in different contexts calls for different actions. While we need context-aware visual ...
Metrics: Total Citations 1; Total Downloads 151; Last 12 Months 89; Last 6 weeks 7
Supplementary Material: mull07aux.zip
research-article
October 2021
Published By ACM
Hybrid Mutimodal Fusion for Dimensional Emotion Recognition
- Ziyu Ma,
- Fuyan Ma,
- Bin Sun,
- Shutao Li
MuSe '21: Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge, October 2021, pp 29–36, https://doi.org/10.1145/3475957.3484457

In this paper, we extensively present our solutions for the MuSe-Stress sub-challenge and the MuSe-Physio sub-challenge of Multimodal Sentiment Challenge (MuSe) 2021. The goal of MuSe-Stress sub-challenge is to predict the level of emotional arousal and ...
Metrics: Total Citations 8; Total Downloads 323; Last 12 Months 162; Last 6 weeks 13
Supplementary Material: muse3470.mp4
research-article
Open Access
October 2021
Published By ACM
Multimodal Emotion Recognition and Sentiment Analysis via Attention Enhanced Recurrent Model
- Licai Sun,
- Mingyu Xu,
- Zheng Lian,
- Bin Liu,
- Jianhua Tao,
- Meng Wang,
- Yuan Cheng
MuSe '21: Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge, October 2021, pp 15–20, https://doi.org/10.1145/3475957.3484456

With the proliferation of user-generated videos in online websites, it becomes particularly important to achieve automatic perception and understanding of human emotion/sentiment from these videos. In this paper, we present our solutions to the MuSe-...
Metrics: Total Citations 8; Total Downloads 822; Last 12 Months 556; Last 6 weeks 45
Supplementary Material: muse21-3469.mp4
research-article
October 2021
Published By ACM
Multimodal Sentiment Analysis based on Recurrent Neural Network and Multimodal Attention
- Cong Cai,
- Yu He,
- Licai Sun,
- Zheng Lian,
- Bin Liu,
- Jianhua Tao,
- Mingyu Xu,
- Kexin Wang
MuSe '21: Proceedings of the 2nd on Multimodal Sentiment Analysis Challenge, October 2021, pp 61–67, https://doi.org/10.1145/3475957.3484454

Automatic estimation of emotional state has a wide application in human-computer interaction. In this paper, we present our solutions for the MuSe-Stress and MuSe-Physio sub-challenge of Multimodal Sentiment Analysis (MuSe 2021). The goal of these two ...
Metrics: Total Citations 8; Total Downloads 528; Last 12 Months 342; Last 6 weeks 19
Supplementary Material: MuSe21-fp3467.mp4
