Sunday, June 21, 2009

Computer Vision: Working with OpenCV and beyond

For my Software Engineering project, I chose to work on something other than the database and information management systems we typically build in this course; I'd had enough of those last term. So this time the three of us in my group chose to work on a computer vision system. At first we were quite confused, because we knew nothing about this field, and given our inexperience it was really difficult. Even our project requirements were a bit hazy, so to speak :)

We tried to learn the basic concepts of computer vision and image analysis, and had to go through an extensive set of frameworks to evaluate which one fit our needs. Initially our project requirement was to detect/recognize human facial structure, but this was later retrofitted with object detection and feature extraction. Along the way we tried to study a lot of research papers (most of them seemed to me like Egyptian hieroglyphics, :(( ...).

The first framework we came across was Torch3Vision. This framework is built on top of Torch3 and gives the user a lot of image processing algorithms. Torch3Vision is built largely for face detection. While it was a really good framework, it proved too rigid to customize for our more general needs as the project requirements started to change.

We later started to study Content Based Image Retrieval (CBIR) techniques. Two of the frameworks we considered were GIFT (also known as GNUIFT, to avoid name confusion with another open source tool) and Lire. Lire is written in Java and uses the Lucene API in the backend, while GIFT is written in C/C++. These implementations let the user search images by 'Query by Example', which was not exactly what we wanted to do. So yeah, another dead end, or so it seemed ...

The last one, and of course the one we chose to work with, was OpenCV. OpenCV is an open source library initially built and later sponsored by Intel. It is used extensively in various fields of computer vision, so it's in active development and well documented. It also comes with a comprehensive e-book that can help you a lot when crunching through the heavy-duty libs for the first time.

We also took help from another project that uses the OpenCV library to detect objects using the 'Boosted Histograms' method; it's called objectdet. Here I'm going to go through the basic steps of running the OpenCV library on Linux as well as compiling objectdet on Linux. Though OpenCV comes with all major Linux distros, I recommend directly downloading and compiling it from the svn. They have a very extensive wiki; this page has detailed instructions for installing on all OSes. One thing I'd like to mention here is that the CMake based build is more straightforward than the Autotools one, which sometimes (on rare occasions) breaks.

After you have OpenCV set up and running, you can try to install 'objectdet' on your system (I'm running it on Ubuntu 9.04). It hasn't been in active development for a while, and running the old code with new libs can surely have its moments(!!!).


sudo apt-get install build-essential libxml++-dev libboost-filesystem-dev
svn checkout http://objectdet.googlecode.com/svn/trunk/ objectdet-read-only
cd objectdet-read-only/objectdet
chmod a+x autogen.sh
./autogen.sh
mv Makefile.am makefile.am
sudo ln -s /usr/lib/libboost_filesystem.a /usr/lib/libboost_filesystem-gcc.a
sudo ln -s /usr/lib/libboost_filesystem.so /usr/lib/libboost_filesystem-gcc.so
./configure --with-opencv-headers=/usr/local/include/opencv/
cd src
make
sudo make install
cd ..
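
A note on the two `ln -s` lines above: presumably the old build looks for the Boost filesystem library under a `-gcc` suffixed name, which is my guess and not something the objectdet docs state. If you would rather not create the links blindly, here is a minimal sketch of a slightly safer version. The function name `link_boost_alias` is made up for this post, and the library directory is passed in as an argument because the location can differ between distros.

```shell
#!/bin/sh
# Hypothetical helper (not part of objectdet or Boost):
#   link_boost_alias LIBDIR
# Creates libboost_filesystem-gcc.a / .so aliases inside LIBDIR.
# A library that is not present is skipped, and an alias that
# already exists is left untouched.
link_boost_alias() {
    libdir=$1
    for ext in a so; do
        src="$libdir/libboost_filesystem.$ext"
        dst="$libdir/libboost_filesystem-gcc.$ext"
        if [ ! -e "$src" ]; then
            echo "skip: $src not found" >&2
        elif [ -e "$dst" ] || [ -L "$dst" ]; then
            echo "skip: $dst already exists" >&2
        else
            ln -s "$src" "$dst" && echo "linked: $dst"
        fi
    done
}
```

To use it on a setup like mine, save it to a file, add a final line `link_boost_alias /usr/lib`, and run that file with sudo. Trying it on a scratch directory first shows what it would do without touching the system.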

Run the demo

src/objectdet -i images/000537.png -c classifiers/cabmodel_interm_nst40_VOC06car01train5_trainval_orienthistmod.xml

A simple window with a car being detected should appear. Here the xml file contains the trained data for the 'car' object. How do you create this trained xml? Well, that's the harder part; I'll try to write about that in my next post when I get some time away from my GSoC project.

If you are interested in working in computer vision and are searching for a good set of libraries, apart from the ones mentioned above, also take a look at these.

1. Gandalf (yeah Gandalf, I really liked the name)
2. This site hosts a lot of computer vision projects
3. LTI-LIB

