OpenCV was used for image processing.

Motivation

  • Done in fulfillment of ECE 499 at the University of Victoria
  • Chosen because it poses multiple image processing challenges that are not easily solved with current open source software
  • This type of technology has potential to enhance our control over technology and improve our quality of life in a technological society

Challenges

  • Skin detection from colour image
  • Gesture determination from skin mask
  • Tracking of gestures in an image stream

Algorithm Overview

The code can be broken up into three sections (excluding camera control):

  1. Skin Detection
    • Skin detection takes a colour image as input and returns a binary skin mask and information about this face (dimensions and location)
    • Skin detection is performed in the YCrCb colour space
    • The face location and dimensions are determined using a HAAR cascade classifier
      • The HAAR cascade classifier is cuda accelerated to enhance performance
  2. Gesture Recognition
    • Gesture recognition takes in a binary skin mask and face information and returns a finger count and a flag about the state of the thumb for each hand in the skin mask
    • Binary skin mask of face is analysed using contour analysis
    • By knowing the dimensions and location of the face, the size of different hand features are assumed
    • A model of what the hand should look like for different gestures is generated
  3. Gesture Tracking
    • Takes in gestures with timestamps, and face information
    • Sorts gestures into different queues based on the face it belongs to and the side of the body is it on
    • Old gestures are discarded
    • If a gesture is held for more than 500 ms before transitioning to a different gesture for 500 ms (within a 3 second timeframe) a callback is executed
      • If not callback is registered, this step is skipped

Limitations and Future Work

  1. Spatial tracking is not implemented, the data is there so it could be added with minimal effort
  2. Spectral lighting or coloured/mixed lighting can cause skin detection to fail, this likely cannot be fixed without the use of multiple cascade classifiers to improve the skin detection technique
  3. People with similar skin tones may conflict and mesh with each other if they are both in the image, more work into discrimation of similar skin tones would aid this
  4. More reliability in gesture recognition would be good. Current technique requires the user is perpendicular to the camera