OpenCV was used for image processing.
Motivation
- Done in fulfillment of ECE 499 at the University of Victoria
- Chosen because it poses multiple image processing challenges that are not easily solved with current open-source software
- This type of technology has the potential to give us better control over our devices and improve our quality of life in a technological society
Challenges
- Skin detection from a colour image
- Gesture determination from skin mask
- Tracking of gestures in an image stream
Algorithm Overview
The code can be broken up into three sections (excluding camera control):
- Skin Detection
  - Takes a colour image as input and returns a binary skin mask and information about the detected face (dimensions and location)
  - Skin detection is performed in the YCrCb colour space
  - The face location and dimensions are determined using a Haar cascade classifier
  - The Haar cascade classifier is CUDA-accelerated to enhance performance
- Gesture Recognition
  - Takes a binary skin mask and face information as input and returns, for each hand in the skin mask, a finger count and a flag indicating the state of the thumb
  - The binary skin mask associated with the face is analysed using contour analysis
  - From the known dimensions and location of the face, the sizes of different hand features are assumed
  - A model of what the hand should look like for different gestures is generated
- Gesture Tracking
  - Takes in gestures with timestamps, along with face information
  - Sorts gestures into different queues based on the face they belong to and the side of the body they are on
  - Old gestures are discarded
  - If a gesture is held for more than 500 ms before transitioning to a different gesture that is held for 500 ms (all within a 3 second timeframe), a callback is executed
  - If no callback is registered, this step is skipped
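
The gesture tracking rules above can be illustrated with a minimal sketch. This is not the project's actual code: the `GestureTracker` class, its `add` method, and the callback signature are hypothetical names chosen for illustration, and the hold time is simplified to "at least 500 ms between the first and last sample of a run". Only the 500 ms hold time, the 3 second window, and the per-face, per-side queues come from the description.

```python
from collections import defaultdict, deque

HOLD_MS = 500     # minimum hold time per gesture (from the description above)
WINDOW_MS = 3000  # samples older than this are discarded (from the description above)


class GestureTracker:
    """Hypothetical tracker that keeps one queue per (face_id, side) pair."""

    def __init__(self, callback=None):
        self.callback = callback
        self.queues = defaultdict(deque)

    def add(self, face_id, side, gesture, timestamp_ms):
        """Record one gesture sample; return (old, new) if a transition fired."""
        if self.callback is None:
            return None  # no callback registered, so tracking is skipped
        queue = self.queues[(face_id, side)]
        queue.append((timestamp_ms, gesture))
        # Discard samples that have fallen out of the 3 second window.
        while queue and timestamp_ms - queue[0][0] > WINDOW_MS:
            queue.popleft()
        transition = self._find_transition(queue)
        if transition is not None:
            queue.clear()  # assumption: reset so the same transition fires once
            self.callback(face_id, side, *transition)
        return transition

    @staticmethod
    def _find_transition(queue):
        # Collapse consecutive identical gestures into [gesture, start, end] runs.
        runs = []
        for ts, gesture in queue:
            if runs and runs[-1][0] == gesture:
                runs[-1][2] = ts
            else:
                runs.append([gesture, ts, ts])
        if len(runs) < 2:
            return None
        (old, old_start, old_end), (new, new_start, new_end) = runs[-2], runs[-1]
        # Both the old and the new gesture must have been held long enough.
        if old_end - old_start >= HOLD_MS and new_end - new_start >= HOLD_MS:
            return old, new
        return None


# Usage: two fingers held for 500 ms, then five fingers held for 500 ms.
fired = []
tracker = GestureTracker(callback=lambda face, side, old, new: fired.append((face, side, old, new)))
for t in range(0, 600, 100):
    tracker.add(0, "left", 2, t)
for t in range(600, 1200, 100):
    tracker.add(0, "left", 5, t)
print(fired)  # [(0, 'left', 2, 5)]
```

Keeping a separate queue per face and body side means one person's left hand cannot complete a transition started by another hand. Clearing the queue after a callback is a design choice made for this sketch only.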
Limitations and Future Work
- Spatial tracking is not implemented. The data is there, so it could be added with minimal effort
- Spectral lighting or coloured/mixed lighting can cause skin detection to fail. This likely cannot be fixed without using multiple cascade classifiers to improve the skin detection technique
- People with similar skin tones may conflict and merge with each other if they are both in the image. More work on discrimination of similar skin tones would help here
- Gesture recognition needs to be more reliable. The current technique requires the user to be perpendicular to the camera