
Gesture Recognition using Simple Computer Vision

A project to explore hand gesture recognition, a novel form of interaction with a computer.

Date: Fall 2007
Course: ComS 401: Projects in Computing
Teammates: Ben Baldus, Juan Lizzaraga
Awards: Best Student Project award


Project Goals

For our project-based class on computer vision and related topics, my team of three chose to study hand gesture recognition. The topic appealed to me because it offers a form of interaction that goes beyond the keyboard and mouse.

We began by reading related research and found some interesting work by Andrew Wilson of Microsoft Research. Wilson had built a system for pinch recognition, that is, for detecting the gesture made by touching the thumb and forefinger together. He used a webcam to recognize pinch gestures and then used those gestures for cursor control. We decided to replicate Wilson’s research first, and then build on his design by creating new techniques and applications.


Process

We began by using OpenCV, an open-source computer vision library in C++, to process the input from a webcam. In Wilson’s setup, the camera sits on top of the computer monitor, looking down at the keyboard. When the application starts, the camera records a reference image of the empty keyboard; each new frame is then compared against this reference image to determine whether hands have appeared over the keyboard.
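The comparison step can be sketched roughly as follows. This is a plain-Python illustration on tiny grayscale grids, not the project's OpenCV code; the function names and the `threshold` and `min_pixels` values are assumptions made for the example.

```python
def changed_mask(reference, frame, threshold=30):
    """Return a binary mask: 1 where the frame differs from the reference.

    Both images are lists of rows of grayscale values (0-255).
    `threshold` is an assumed value, not one taken from the project.
    """
    return [
        [1 if abs(f - r) > threshold else 0 for r, f in zip(ref_row, frame_row)]
        for ref_row, frame_row in zip(reference, frame)
    ]


def hands_present(mask, min_pixels=4):
    """Report whether enough pixels changed to suggest a hand is in view."""
    return sum(map(sum, mask)) >= min_pixels


# Reference image of the empty keyboard, then a frame with a bright "hand".
reference = [[50] * 5 for _ in range(4)]
frame = [row[:] for row in reference]
for y in range(1, 3):
    for x in range(1, 4):
        frame[y][x] = 200

mask = changed_mask(reference, frame)
print(hands_present(mask))
```

A fixed threshold like this is sensitive to lighting changes and shadows, which is one reason a plain difference image can produce unreliable segmentation.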


When the user pinches their thumb and forefinger together, the fingers enclose a ‘hole’ through which the keyboard remains visible. The hole is detected in several steps, including binary segmentation, which separates the hand from the keyboard background, and connected components analysis, which finds background regions of a certain size: the holes.
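One way to picture the hole search is a flood fill over the background pixels of the binary mask. The sketch below is an assumption-laden illustration in plain Python, not the project's implementation: `find_holes`, the 4-connectivity choice, the "must not touch the image border" test, and the size limits are all hypothetical.

```python
from collections import deque


def find_holes(mask, min_size=2, max_size=50):
    """Return background components fully enclosed by foreground pixels.

    `mask` is a list of rows with 1 for hand pixels and 0 for background.
    A component counts as a hole when it does not touch the image border
    and its pixel count lies in [min_size, max_size] (assumed limits).
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    holes = []
    for y0 in range(height):
        for x0 in range(width):
            if mask[y0][x0] != 0 or seen[y0][x0]:
                continue
            # Breadth-first flood fill over 4-connected background pixels.
            component, touches_border = [], False
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                component.append((x, y))
                if x in (0, width - 1) or y in (0, height - 1):
                    touches_border = True
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] == 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if not touches_border and min_size <= len(component) <= max_size:
                holes.append(component)
    return holes


# A ring of hand pixels enclosing a 2x2 hole.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
holes = find_holes(mask)
print(len(holes))
```

If the ring is broken at any point, the inner region joins the outer background and no hole is reported, which mirrors an open hand.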


The centroid of this hole is the point used for cursor control. Cursor movement begins when the user pinches their fingers together and ends when the pinch is released. To create mouse clicks, the interface treats a rapid closing and opening of the thumb and forefinger as the mouse-down and mouse-up events.

After several weeks of intensive coding in the lab, we were able to replicate Wilson’s process for pinch recognition. I took the lead on programming, which included reworking our code as we wrote it to make it more efficient. My two teammates contributed some code and did most of the testing. I put in many hours on my own because I was enjoying the project so much.
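The centroid and the start/stop behaviour of cursor movement can be illustrated with a short plain-Python sketch. The names `centroid` and `cursor_deltas` and the per-frame input format are assumptions for the example; click detection is left out because its timing is not specified here.

```python
def centroid(pixels):
    """Return the mean (x, y) position of a list of (x, y) pixels."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def cursor_deltas(holes_per_frame):
    """Yield (dx, dy) cursor movements while a pinch is held.

    `holes_per_frame` holds, for each frame, the hole's pixel list or None
    when no hole was detected. Movement is reported only between
    consecutive frames that both contain a hole, so tracking starts when
    the pinch begins and stops when it is released.
    """
    previous = None
    for hole in holes_per_frame:
        current = centroid(hole) if hole else None
        if previous is not None and current is not None:
            yield (current[0] - previous[0], current[1] - previous[1])
        previous = current


frames = [
    None,                                  # hand open: no hole
    [(2, 2), (3, 2), (2, 3), (3, 3)],      # pinch starts, centroid (2.5, 2.5)
    [(4, 3), (5, 3), (4, 4), (5, 4)],      # hole moved, centroid (4.5, 3.5)
    None,                                  # pinch released
]
deltas = list(cursor_deltas(frames))
print(deltas)
```

Reporting relative movement between frames, and not absolute positions, lets the user release the pinch, reposition the hand, and pinch again without the cursor jumping.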

Key Challenges

While we had succeeded in replicating Wilson’s research, our goal was to go beyond it. We quickly learned that one of the key technical challenges was the binary segmentation step: if the segmentation was not highly accurate, many false pinches would be recognized. Wilson himself noted this as one of the major drawbacks and an area for future research. We therefore spent a large part of the semester working to improve this step, with me leading the effort. I investigated several alternative algorithms for binary segmentation, but most were either too slow to run in real time or did not improve the segmentation. After much trial and error, we ended up combining the original segmentation with a skin detection algorithm. The original segmentation found pixels that had changed from the previous frame, and the skin detection found pixels within certain color value ranges. Combining the two improved the accuracy.
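A minimal sketch of combining the two tests is shown below, in plain Python. The way the two results were actually merged is not stated, so the logical AND here is an assumption, as are the function names, the `threshold`, and the `SKIN_RANGES` bounds, which are illustrative values and not the ones used in the project.

```python
# Illustrative per-channel (low, high) bounds for R, G, B; assumed values.
SKIN_RANGES = ((95, 255), (40, 200), (20, 170))


def is_skin(pixel, ranges=SKIN_RANGES):
    """Return True when every channel of an (r, g, b) pixel is in range."""
    return all(low <= value <= high for value, (low, high) in zip(pixel, ranges))


def is_changed(pixel, baseline_pixel, threshold=30):
    """Return True when any channel differs from the baseline by > threshold."""
    return any(abs(a - b) > threshold for a, b in zip(pixel, baseline_pixel))


def hand_mask(baseline, frame):
    """Mark pixels that both changed and look like skin (assumed AND rule)."""
    return [
        [1 if is_changed(p, b) and is_skin(p) else 0 for b, p in zip(base_row, row)]
        for base_row, row in zip(baseline, frame)
    ]


KEY = (40, 40, 40)        # dark keyboard pixel
SKIN = (200, 140, 110)    # skin-toned pixel
SHADOW = (5, 5, 5)        # shadow: changed, but not skin-colored

baseline = [[KEY, KEY, KEY]]      # the frame being compared against
frame = [[KEY, SKIN, SHADOW]]
mask = hand_mask(baseline, frame)
print(mask)
```

In this toy case the shadow pixel passes the change test but fails the color test, so it is excluded from the hand mask, which is the kind of false positive the combination is meant to suppress.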

A second key challenge was creating a new application for the pinch gesture technique. I had the idea of using it for window management (dragging and resizing), since this is a common and somewhat tedious task when many windows are open. I worked out how to use Windows hooks so that a pinch gesture could drag or resize a window from anywhere inside it. This turned out to be a good fit for hand gestures, since the actions are quick and easy to perform and, as we found, have a certain “fun” factor that users really enjoy.

Results

At the end of the semester, our group tested our application with ten users, wrote a paper, and presented our findings to the class along with a live demo. We won the “Best Project Award” for the class, which was a great reward after all of our hard work.

Download our final paper in PDF format

Demos

Resizing a window:

 

Dragging a window:

 

Dragging windows in testing mode: