An Active Speaker Detection Method in Videos using Standard Deviations of Color Histogram
Description
Active Speaker Detection (ASD) is the task of predicting which of the people whose faces
appear in a video, if any, is speaking at a given point in time within the recorded
video. This work presents a novel algorithm capable of detecting the active speakers in
each video using the standard deviations of Color Histograms (CHs) computed at the mouth
region from one frame to another. The method relies on the assumption that the lips of an
active speaker are in motion: while talking, they open and close, revealing the inner parts
of the mouth, such as the tongue, the teeth, and the vocal cavity, which differ in color.
The mouth region itself can be detected with already existing algorithms. This
region can be analyzed during the speaking process for the changes in color activity, and
this can be used to predict whether a person is speaking or not. If a person is not speaking,
the lips are at rest and the CH of that person's mouth region remains stable from frame to
frame. As a result, the standard deviations for such a region are negligible. A threshold
can therefore be determined experimentally to predict whether a person is speaking or
otherwise. The dataset was built from 53 online videos from the Channels TV station, from
which 250 video clips were created. Each clip is between 15 and 60 seconds long, for a
total of 3.6 hours. Each video contains the faces of at most two speakers, in no
particular order: sometimes only one speaker's face appears, and at other times both
appear. The status of each speaker (active or not) was manually labeled for use in the
performance evaluation of the proposed algorithm. This
method was able to predict the active speakers with an accuracy of 99.19%.
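
The idea described above can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the function names, the number of histogram bins, the way the per-bin standard deviations are averaged into one score, and the threshold value of 0.01 are all assumptions made for illustration. The mouth regions are assumed to have been cropped already by some existing detector and are represented here simply as lists of (r, g, b) pixels.

```python
from statistics import pstdev

def color_histogram(pixels, bins=8):
    """Normalized per-channel histogram of (r, g, b) pixels with values 0..255."""
    hist = [0.0] * (3 * bins)
    count = 0
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + value * bins // 256] += 1
        count += 1
    return [h / count for h in hist] if count else hist

def mouth_activity(mouth_regions, bins=8):
    """Average, over histogram bins, of each bin's standard deviation across frames.

    Assumption: the abstract does not say how the deviations are aggregated,
    so a plain mean over bins is used here.
    """
    if len(mouth_regions) < 2:
        return 0.0
    hists = [color_histogram(region, bins) for region in mouth_regions]
    per_bin_std = [pstdev(values) for values in zip(*hists)]
    return sum(per_bin_std) / len(per_bin_std)

def is_speaking(mouth_regions, threshold=0.01, bins=8):
    """Hypothetical decision rule; the threshold is illustrative, not from the paper."""
    return mouth_activity(mouth_regions, bins) > threshold

# Synthetic data: a closed mouth shows only lip-colored pixels, while an open
# mouth also shows a dark cavity and bright teeth.
closed_mouth = [(180, 90, 90)] * 100
open_mouth = [(180, 90, 90)] * 50 + [(30, 10, 10)] * 30 + [(240, 240, 230)] * 20

silent_clip = [closed_mouth] * 6
talking_clip = [closed_mouth, open_mouth] * 3

print(is_speaking(silent_clip))   # histograms never change
print(is_speaking(talking_clip))  # histograms alternate between two shapes
```

In this toy data the silent clip produces identical histograms in every frame, so its score is zero, while the talking clip alternates between two different histograms and scores well above the illustrative threshold. In practice the threshold would have to be tuned on labeled clips, as the abstract indicates.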
Keywords
T Technology (General), TK Electrical engineering. Electronics Nuclear engineering