Research Repository

All Outputs (31)

Visual Attention Assisted Games (2023)
Conference Proceeding
Mandal, B., Puhan, N. B., & Homi Anil, V. (2023). Visual Attention Assisted Games. In 2023 IEEE Conference on Games (CoG). https://doi.org/10.1109/cog57401.2023.10333186

In this work, we propose a committee of attention models developed for improving the deep reinforcement learning frequently used for games. The game environment is manifested with spatial and temporal attention mechanisms so as to focus on important...

Optimization and Performance Evaluation of Hybrid Deep Learning Models for Traffic Flow Prediction (2023)
Conference Proceeding
Goparaju, S. U., Biju, R., M, P., MC, B., Gangadharan, D., Mandal, B., & C, P. (2023). Optimization and Performance Evaluation of Hybrid Deep Learning Models for Traffic Flow Prediction. https://doi.org/10.1109/vtc2023-spring57618.2023.10200600

Traffic flow prediction has been regarded as a critical problem in intelligent transportation systems. An accurate prediction can help mitigate congestion and other societal problems while facilitating safer, cost- and time-efficient travel. However,...

Deep Neural Network Based Attention Model for Structural Component Recognition (2023)
Conference Proceeding
Sarangi, S., & Mandal, B. (2023). Deep Neural Network Based Attention Model for Structural Component Recognition. https://doi.org/10.5220/0011688400003417

The recognition of structural components from images/videos is a highly complex task because huge components, with their long extent, appear alongside relatively small components. The latter are frequently overestimated or...

StructureNet: Deep Context Attention Learning for Structural Component Recognition (2022)
Conference Proceeding
Kaothalkar, A., Mandal, B., & Puhan, N. (2022). StructureNet: Deep Context Attention Learning for Structural Component Recognition. https://doi.org/10.5220/0010872800003124

Structural component recognition using images is a very challenging task because large components and their long continuations exist jointly with very small components; the latter are often left out or missed by the existing methodo...

Cross-spectral Periocular Recognition: a Survey (2019)
Conference Proceeding
Behera, S., Mandal, B., & Puhan, N. (2019). Cross-spectral Periocular Recognition: a Survey. In Emerging Research in Electronics, Computer Science and Technology (731–741). https://doi.org/10.1007/978-981-13-5802-9_64

Among the many biometrics such as face, iris, fingerprint and others, the periocular region has advantages because it is non-intrusive and serves as a balance between the iris or eye region (very stringent, small area) and the whole fac...

DeepPCA Based Objective Function for Melanoma Detection (2018)
Conference Proceeding
Sultana, N. N., Puhan, N. B., & Mandal, B. (2018). DeepPCA Based Objective Function for Melanoma Detection. In 2018 International Conference on Information Technology (ICIT). https://doi.org/10.1109/icit.2018.00025

In this paper, we propose an objective function for the convolutional neural network to acquire the variation separability as opposed to the categorical cross entropy which maximizes according to the target labels. This approach is an unsupervised le...

Deep Adaptive Temporal Pooling for Activity Recognition (2018)
Conference Proceeding
Song, S., Cheung, N., Chandrasekhar, V., & Mandal, B. (2018). Deep Adaptive Temporal Pooling for Activity Recognition. https://doi.org/10.1145/3240508.3240713

Deep neural networks have recently achieved competitive accuracy for human activity recognition. However, there is room for improvement, especially in modeling of long-term temporal importance and determining the activity relevance of different tempo...

Deep Residual Network With Subclass Discriminant Analysis For Crowd Behavior Recognition (2018)
Conference Proceeding
Mandal, B., Fajtl, J., Argyriou, V., Monekosso, D., & Remagnino, P. (2018). Deep Residual Network With Subclass Discriminant Analysis For Crowd Behavior Recognition. https://doi.org/10.1109/ICIP.2018.8451190

In this work, we extract rich representations of crowd behavior from video using a fine-tuned deep convolutional neural residual network. Using spatial partitioning trees, we create subclasses within the feature maps from each of the crowd behavior a...

I2R VC @ ImageClef2017: Ensemble of Deep Learnt Features for Lifelog Video Summarization (2017)
Conference Proceeding
Molino, A., Mandal, B., Jie, L., Lim, J., Subbaraju, V., & Chandrasekhar, V. (2017). I2R VC @ ImageClef2017: Ensemble of Deep Learnt Features for Lifelog Video Summarization.

In this paper, we describe our approach for the ImageCLEF-lifelog summarization task. A total of ten runs were submitted, which used only visual features, only metadata information, or both. In the first step, a set of relevant frames is drawn from t...

An empirical approach for automatic face clustering on personal lifelogging images (2017)
Conference Proceeding
Subbaraju, V., Xu, Q., Mandal, B., Li, L., & Lim, J. (2017). An empirical approach for automatic face clustering on personal lifelogging images. https://doi.org/10.1109/siprocess.2017.8124519

Life-logging applications generate a vast amount of personalized data that provides vital insights into the user's daily life. One such key insight is the people whom the user has come across/interacted with during regular life. This can be obtained...

Analysis of Human Attentions for Face Recognition on Natural Videos and Comparison with CV Algorithm on Performance (2017)
Conference Proceeding
Ragab Sayed, M., Yuting Lim, R., Mandal, B., Li, L., Hwee Lim, J., & Sim, T. (2017). Analysis of Human Attentions for Face Recognition on Natural Videos and Comparison with CV Algorithm on Performance. In No. 7: Science of Intelligence: Computational Principles of Natural and Artificial Intelligence.

Researchers have conducted many studies on human attention and eye gaze patterns for face recognition (FR), hoping to inspire new ideas for developing computer vision (CV) algorithms that perform like or even better than humans. Yet, while these s...

Distinguishing Posed and Spontaneous Smiles by Facial Dynamics (2017)
Conference Proceeding
Mandal, B., Lee, D., & Ouarti, N. (2017). Distinguishing Posed and Spontaneous Smiles by Facial Dynamics. In Computer Vision – ACCV 2016 Workshops (552–566). https://doi.org/10.1007/978-3-319-54407-6_37

The smile is one of the key elements in identifying the emotions and present state of mind of an individual. In this work, we propose a cluster of approaches to classify posed and spontaneous smiles using deep convolutional neural network (CNN) face features...

Spontaneous Versus Posed Smiles—Can We Tell the Difference? (2016)
Conference Proceeding
Mandal, B., & Ouarti, N. (2017). Spontaneous Versus Posed Smiles—Can We Tell the Difference? https://doi.org/10.1007/978-981-10-2107-7_24

A smile is an irrefutable expression that shows the physical state of the mind in both true and deceptive ways. Generally, it shows a happy state of mind; however, 'smiles' can be deceptive, for example people can give a smile when they feel happy an...

Multimodal Multi-Stream Deep Learning for Egocentric Activity Recognition (2016)
Conference Proceeding
Song, S., Chandrasekhar, V., Mandal, B., Li, L., Lim, J., Babu, G. S., … Cheung, N. (2016). Multimodal Multi-Stream Deep Learning for Egocentric Activity Recognition. https://doi.org/10.1109/cvprw.2016.54

In this paper, we propose a multimodal multi-stream deep learning framework to tackle the egocentric activity recognition problem, using both the video and sensor data. First, we experiment with and extend a multi-stream Convolutional Neural Network to le...

Egocentric activity recognition with multimodal fisher vector (2016)
Conference Proceeding
Song, S., Cheung, N., Chandrasekhar, V., Mandal, B., & Lin, J. (2016). Egocentric activity recognition with multimodal fisher vector. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). https://doi.org/10.1109/icassp.2016.7472171

With the increasing availability of wearable devices, research on egocentric activity recognition has received much attention recently. In this paper, we build a Multimodal Egocentric Activity dataset which includes egocentric videos and sensor data...