Publications:Building video databases to boost performance quantification -- the DXM2VTS database


Do not edit this section

Keep all hand-made modifications below

Title Building video databases to boost performance quantification – the DXM2VTS database
Author Dereje Teferi and Josef Bigun
Year 2006
PublicationType Conference Paper
HostPublication Proceedings of The Ninth Scandinavian Conference on Artificial Intelligence (SCAI 2006)
Conference The Ninth Scandinavian Conference on Artificial Intelligence (SCAI 2006), Helsinki University of Technology, Espoo, Finland, October 25-27, 2006
Diva url
Abstract Building a biometric database is an expensive task that requires a high level of cooperation from a large number of participants. Currently, despite increased demand for large multimodal databases, only a few are available. The XM2VTS database is one of the most widely used audio-video databases in the research community, although it has become increasingly clear that it cannot quantify the performance of a recognition system in the presence of complex background, illumination, and scale variability. However, producing such databases could mean repeatedly recording a multitude of audio-video data outdoors, which makes it a very difficult task, if not an impossible one, mainly because of the additional demands put on participants. This work presents a novel approach to audio-visual database collection and maintenance that boosts the performance quantification of recognition methods and increases the efficiency of multimodal database construction. To this end, we present our segmentation procedure for separating the background of high-quality video recorded under controlled studio conditions so that it can be replaced with an arbitrary complex background. Furthermore, we show how an affine transformation and synthetic noise can be incorporated into the production of the new database to simulate real noise, e.g. motion blur due to translation, zooming, and rotation. The entire system is applied to the XM2VTS database, which already comprises several terabytes of data, to produce DXM2VTS, the Damascened XM2VTS database, essentially without an increase in resource consumption, i.e. storage space, video operator time, and time of clients populating the database. As a result, the DXM2VTS database is a damascened (sewn together) composition of two independently recorded real image sequences: a choice of complex background scenes and the original XM2VTS database.
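The damascening idea described in the abstract — masking out the studio background, compositing the subject over an arbitrary scene, and adding synthetic noise such as motion blur — can be illustrated with a minimal NumPy sketch. This is an assumed, simplified illustration of per-frame compositing and a simple averaging blur, not the paper's actual segmentation or noise pipeline; all function names here are hypothetical:

```python
import numpy as np

def damascene(foreground, mask, background):
    """Composite a segmented foreground over a new background.

    foreground, background: (H, W, 3) float arrays in [0, 1]
    mask: (H, W) binary array, 1 where the subject is
    """
    m = mask[..., None].astype(float)
    return m * foreground + (1.0 - m) * background

def motion_blur(frame, length=3):
    """Crudely simulate horizontal motion blur by averaging shifted copies
    of the frame (an assumption; the paper models blur from translation,
    zooming, and rotation via affine transformations)."""
    acc = np.zeros_like(frame)
    for d in range(length):
        acc += np.roll(frame, d, axis=1)
    return acc / length

# Tiny synthetic example: a bright square "subject" on a uniform new background.
H, W = 8, 8
fg = np.zeros((H, W, 3)); fg[2:6, 2:6] = 1.0
mask = np.zeros((H, W)); mask[2:6, 2:6] = 1.0
bg = np.full((H, W, 3), 0.5)

frame = damascene(fg, mask, bg)      # subject kept, background replaced
blurred = motion_blur(frame)         # synthetic motion blur added
```

Applied frame by frame, this kind of compositing reuses the original studio recordings unchanged, which is why the damascened database adds essentially no storage or recording cost.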