Abstract

Visual speech recognition is the task of determining what is being said by interpreting a speaker's lip movements. Learning to lip-read is challenging for people with hearing impairments because the skill is acquired largely by trial and error. This paper proposes an approach that aims both to ease this learning process and to assist them in day-to-day life. The approach has three steps: lip detection using OpenCV, feature extraction using autoencoders and a Long Short-Term Memory (LSTM) neural network, and multi-class classification using softmax. The output is presented as captions displayed below the video.
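
As a rough illustration of the final classification step only, the sketch below shows how softmax turns a vector of raw class scores into probabilities and how the most probable class could be chosen as the caption word. The vocabulary, the scores, and the helper names `softmax` and `classify` are assumptions made for this example; they are not taken from the system described in this paper.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical vocabulary and scores, standing in for the output of the
# feature extraction stage for one video clip.
labels = ["hello", "thanks", "goodbye"]
scores = [2.0, 0.5, -1.0]

word, confidence = classify(scores, labels)
print(word)  # prints: hello
```

In a full pipeline the scores would come from the network's last layer, and the selected word would be rendered as the caption.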