Tuesday, February 23, 2016

Classification of Objects in Android - Fourth Week

In my fourth week I worked on code for classifying images on Android. That is possible using the SVM computed in the Python program. The Support Vector Machine is saved to a file from Python and then loaded on Android. It is also necessary to use OpenCV to get the local descriptors from the test image and use the codebook to get the global descriptor. The codebook is also saved to a file from Python and loaded on Android in CSV format.
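
As a reference, exporting the codebook to CSV from Python can be sketched with NumPy like this (a minimal sketch; the variable names and filename are placeholders, not necessarily the ones in my program):

import numpy as np

# Placeholder: in the real program the codewords come from K-Means
codebook = np.random.rand(128, 32).astype(np.float32)
# One codeword per CSV line, so Android can parse it back into a matrix
np.savetxt("codebook.csv", codebook, delimiter=",")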

How to use OpenCV on an Android Activity

I created a new Activity in the Android project called TestActivity to make debugging easier. It uses a picture that was already taken and stored in memory. To make an Activity work with OpenCV you can follow this tutorial:
http://docs.opencv.org/2.4/doc/tutorials/introduction/android_binary_package/dev_with_OCV_on_Android.html.
In my case I added a LoaderCallback attribute to the TestActivity class:

// Callback invoked when the OpenCV manager finishes loading
private BaseLoaderCallback mLoaderCallback = new BaseLoaderCallback(this) {
    @Override
    public void onManagerConnected(int status) {
        switch (status) {
            case LoaderCallbackInterface.SUCCESS: {
                Log.d(TAG, "OpenCV loaded");
            } break;
            default: {
                super.onManagerConnected(status);
            } break;
        }
    }
};

And these lines inside the onCreate() method:

// --------------------------------------------------------------------------
// OpenCVLoader: initialize OpenCV and notify the callback with the result
if (!OpenCVLoader.initDebug()) {
    mLoaderCallback.onManagerConnected(LoaderCallbackInterface.INIT_FAILED);
} else {
    mLoaderCallback.onManagerConnected(LoaderCallbackInterface.SUCCESS);
}
// --------------------------------------------------------------------------

Loading an OpenCV SVM on Android 

To use the SVM previously computed with Python it was necessary to do something tricky. The SVM class has a load(filename) function. The problem is that on Android filenames are not as simple as on a PC: the file may be stored in external or internal memory, in the app's local folder, or in the res/raw/ directory of the source code. So I followed a sample from OpenCV for Android, a project called face-detection. They store the model file in the res/raw/ directory of the source code, open it, and make a local copy of it, so when the load method is called it uses the local copy.

// Open the model stored in res/raw/ and create a local file to copy it to
InputStream is = getResources().openRawResource(R.raw.lbpcascade_frontalface);
File cascadeDir = getDir("cascade", Context.MODE_PRIVATE);
mCascadeFile = new File(cascadeDir, "lbpcascade_frontalface.xml");
FileOutputStream os = new FileOutputStream(mCascadeFile);

// Copy the raw resource to the local file in 4 KB chunks
byte[] buffer = new byte[4096];
int bytesRead;
while ((bytesRead = is.read(buffer)) != -1) {
    os.write(buffer, 0, bytesRead);
}
is.close();
os.close();

// The local copy now has a real path that can be passed to a load method
mJavaDetector = new CascadeClassifier(mCascadeFile.getAbsolutePath());

The method getAbsolutePath() gives you what you need as the filename. After using that I was able to load the SVM and use it to predict the class of the image.

Result

Finally, I also had to create a Vlad class to calculate the global descriptor for the image using the codebook and the local descriptors. After I did that I had all the elements to classify the input image, which was a picture of potatoes, and it was correctly classified as such.


Sunday, February 21, 2016

Results of classification - Third Week

In my third week I began testing the object classification code, looking at accuracy and processing time. These are the results:

Using ORB Features

For k (number of clusters, centers or codewords) = 64

Time for getting all the local ORB descriptors of the 356 training images was 00:00:09.533.
Time for k-means with ORB and k=64 was 00:00:16.574.
Time for getting VLAD global descriptors for k=64 and the 356 training images was 00:00:58.26.
Time for calculating the SVM for k=64 was 00:00:13.407.
Time for getting VLAD descriptors for k=64 and the 179 testing images was 00:00:28.784.
Time for classifying was 00:00:00.001.
Accuracy was 76.705%.

For k=128

Time for k-means with ORB and k=128 was 00:00:20.
Time for getting VLAD global descriptors for k=128 and the 356 training images was 00:02:24.651.
Time for calculating the SVM for k=128 was 00:00:34.648.
Time for getting VLAD descriptors for k=128 and the 179 testing images was 00:01:09.031.
Time for classifying was 00:00:00.001.
Accuracy was 80.114%.

For k=256

Time for k-means with ORB and k=256 was 00:00:59.893.
Time for getting VLAD global descriptors for k=256 and the 356 training images was 00:02:58.63.
Time for calculating the SVM for k=256 was 00:01:12.594.
Time for getting VLAD descriptors for k=256 and the 179 testing images was 00:01:27.856.
Time for classifying was 00:00:00.003.
Accuracy was 77.841%.

Using SIFT Features

For k=64

Time for getting all the local SIFT descriptors of the 356 training images was 00:01:45.578.
Time for k-means with SIFT and k=64 was 00:01:33.133.
Time for getting VLAD global descriptors for k=64 and the 356 training images was 00:05:16.315.
Time for calculating the SVM for k=64 was 00:00:48.826.
Time for getting VLAD descriptors for k=64 and the 179 testing images was 00:02:05.115.
Time for classifying was 00:00:00.002.
Accuracy was 89.773%.

For k=128

Time for k-means with SIFT and k=128 was 00:02:51.927.
Time for getting VLAD global descriptors for k=128 and the 356 training images was 00:10:41.265.
Time for calculating the SVM for k=128 was 00:01:55.241.
Time for getting VLAD descriptors for k=128 and the 179 testing images was 00:04:17.92.
Time for classifying was 00:00:00.004.
Accuracy was 89.773%.

For k=256

Time for k-means with SIFT and k=256 was 00:05:23.914.
Time for getting VLAD global descriptors for k=256 and the 356 training images was 00:16:36.843.
Time for calculating the SVM for k=256 was 00:04:01.465.
Time for getting VLAD descriptors for k=256 and the 179 testing images was 00:05:41.41.
Time for classifying was 00:00:00.016.
Accuracy was 87.5%.

Results may differ using an SVM with another kernel, such as RBF; I only used a linear kernel. Accuracy was calculated as the number of correctly classified images divided by the total number of images in the testing set.

Wednesday, February 17, 2016

Object Classification - Second Week

About object classification

In my second week I started to develop software for training an object classifier using images. The idea is that, given an image as input, for example an image of potatoes, the classifier gives the class of the image as output. That is done using Computer Vision and Machine Learning techniques: the Computer Vision part is that the program gets descriptors for the image, and the Machine Learning part is that the descriptors are used to train a model that can classify another set of descriptors.

For example, an article could be classified as sports, political news, or another class depending on the words that are used. If an article has words like football, goals, points, team, etc., it is probably a sports article. In that sense, the words are the descriptors of the article, and they differ depending on the class they belong to. A model is a rule that takes a set of descriptors and predicts the class they describe.

About the program


I created a program in Python with OpenCV that uses ORB local descriptors, VLAD global descriptors and an SVM as classifier. It is free and can be downloaded from Github at https://github.com/HenrYxZ/object-classification. To run it just use the command line:

python main.py 

The program will look for a "dataset" directory inside the project folder. Then it will generate a Dataset object with the images found there. The images must be in a folder with the name of their class; for example, all the images of potatoes must be in a "potatoes" folder. You don't have to divide the images between training and testing sets, as the program will do that automatically, randomly selecting 1/3 of the images for testing and the rest for training (see the sketch below). Then the Dataset object is stored in a file so that it can be used later; it records which image is in which set and in which class.
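
As an illustration, the random split can be done roughly like this (a minimal sketch with my own naming here, not the exact code of the Dataset class):

import os
import random

def split_dataset(dataset_dir="dataset", test_fraction=1.0 / 3):
    # Class names are taken from the folder names inside dataset_dir
    training, testing = {}, {}
    for class_name in os.listdir(dataset_dir):
        class_dir = os.path.join(dataset_dir, class_name)
        if not os.path.isdir(class_dir):
            continue
        images = [os.path.join(class_dir, f) for f in os.listdir(class_dir)]
        # Shuffle, then take 1/3 of the images for testing, the rest for training
        random.shuffle(images)
        n_test = int(len(images) * test_fraction)
        testing[class_name] = images[:n_test]
        training[class_name] = images[n_test:]
    return training, testing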

After that the local descriptors for the training set are calculated. That is done using OpenCV functions; this tutorial shows how ORB descriptors are obtained: http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_feature2d/py_orb/py_orb.html. I resize every image whose longest side has more than 640 pixels so that its longest side becomes 640 pixels, preserving the aspect ratio, in order to get a similar number of local descriptors for every image. An image with too high a resolution may give a lot of local descriptors, and it would be difficult to have enough memory to store them all; if the dataset has too many images the program may crash because the computer may not be able to hold all the descriptors in RAM.
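
For reference, getting the ORB descriptors of one image with the resizing described above looks roughly like this (a sketch assuming the OpenCV 2.4 Python API used in the linked tutorial; the function name is my own):

import cv2

def orb_descriptors(path, max_side=640):
    # Load the image in grayscale (flag 0), as in the ORB tutorial
    img = cv2.imread(path, 0)
    h, w = img.shape[:2]
    if max(h, w) > max_side:
        # Shrink so the longest side is max_side, keeping the aspect ratio
        scale = float(max_side) / max(h, w)
        img = cv2.resize(img, (int(w * scale), int(h * scale)))
    orb = cv2.ORB()  # cv2.ORB_create() in OpenCV 3.x
    keypoints = orb.detect(img, None)
    keypoints, descriptors = orb.compute(img, keypoints)
    return descriptors  # one 32-byte binary descriptor per keypoint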

An example of ORB descriptors found in a Cassava root image


When the local descriptors are ready, a codebook is generated using K-Means (the technique is known as Bag of Words). A codebook is a set of vectors that theoretically have more descriptive power for recognizing classes; those vectors are called codewords. For example, a codeword for a car may be a wheel, because cars have wheels no matter how different they are. The K-Means algorithm finds centers that are representative of clusters by minimizing the distance between the elements of a cluster and its center. The descriptors are grouped into k clusters (k is predefined), randomly at first but then into the groups that minimize the distances. So for k=128, for example, there are going to be 128 codewords, which are the vectors obtained by K-Means. A center is an average of descriptors, and its cluster may contain descriptors of different classes.
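
In code, generating the codebook can be sketched with cv2.kmeans (assuming the OpenCV 2.4 Python API; the stopping criteria and number of attempts here are illustrative values, not necessarily the ones I used):

import numpy as np
import cv2

def compute_codebook(all_descriptors, k=128):
    # all_descriptors: N x d matrix stacking the local descriptors of
    # every training image; cv2.kmeans needs float32 data
    data = np.float32(all_descriptors)
    # Stop after 100 iterations or when the centers move less than 0.1
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.1)
    compactness, labels, centers = cv2.kmeans(
        data, k, criteria, 5, cv2.KMEANS_RANDOM_CENTERS)
    return centers  # k x d matrix, one codeword per row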

An example of a codebook with multiple words


Then for each image a VLAD (Vector of Locally Aggregated Descriptors) global descriptor is computed. This descriptor uses the local descriptors and the codebook to create one vector for the whole image. For each local descriptor in the image, the algorithm finds the nearest codeword in the codebook and adds the difference between the two vectors to the part of the global vector assigned to that codeword. The global vector has length equal to the dimension of the local descriptors multiplied by the number of codewords. In the program, after the VLAD vector is calculated there is a square-root normalization, where every component of the global vector is replaced by the square root of its absolute value, and then an l2 normalization, where the vector is divided by its norm. That has given better results.
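
A minimal sketch of the VLAD computation with the normalizations described above (I use a signed square root here, a common convention, so that the sign of each component is kept):

import numpy as np

def vlad(descriptors, codebook):
    # descriptors: N x d local descriptors of one image
    # codebook: k x d codewords from K-Means
    k, d = codebook.shape
    v = np.zeros((k, d), dtype=np.float32)
    for x in np.float32(descriptors):
        # Nearest codeword for this local descriptor
        i = np.argmin(np.linalg.norm(codebook - x, axis=1))
        # Accumulate the residual between descriptor and codeword
        v[i] += x - codebook[i]
    v = v.flatten()  # global vector of length k * d
    v = np.sign(v) * np.sqrt(np.abs(v))  # square-root normalization
    n = np.linalg.norm(v)  # l2 normalization
    return v / n if n > 0 else v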

VLAD global descriptors


These global descriptors are given to a Support Vector Machine as input, together with the labels, and it creates a model for the training set. I have used a linear kernel, but it is possible to use other kernels like RBF and maybe get better results. What the SVM does is find decision boundaries that divide groups of vectors into one class or another, and it does that by maximizing the margin, the separation between the decision boundary and the closest vectors of each class. OpenCV comes with an SVM class that has a train_auto function that automatically selects the best parameters for the machine, and that is what I use in the program.
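
In OpenCV 2.4's Python API that looks roughly like this (a sketch; X and y are my own placeholder names for the matrix of VLAD vectors and the label array):

import numpy as np
import cv2

def train_svm(X, y):
    # X: M x (k*d) float32 matrix of VLAD vectors, y: M labels (one per image)
    params = dict(kernel_type=cv2.SVM_LINEAR, svm_type=cv2.SVM_C_SVC)
    svm = cv2.SVM()
    # train_auto cross-validates to select the best parameters
    svm.train_auto(np.float32(X), np.float32(y), None, None, params)
    svm.save("svm.xml")  # the file that is later loaded on Android
    return svm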

After the SVM is trained, the VLAD vectors of the testing images are calculated in the same way as in training, and the global vectors are given to the SVM to predict their classes. Then the accuracy of the prediction is obtained by dividing the number of images that were correctly classified by the total number of images.
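
The prediction and accuracy step can be sketched like this (again with placeholder names; in the 2.4 API svm.predict returns the predicted label for one vector):

import numpy as np

def evaluate(svm, test_vlads, test_labels):
    # svm: the trained cv2.SVM; test_vlads: list of VLAD vectors of the
    # testing images; test_labels: their true classes
    predictions = [svm.predict(np.float32(v)) for v in test_vlads]
    correct = sum(1 for p, t in zip(predictions, test_labels) if p == t)
    return 100.0 * correct / len(test_labels)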

Implementation

I downloaded images from ImageNet of cassava roots, pinto beans and potatoes, and put them in a dataset folder inside the project. The best result was 80% accuracy with a codebook of k=128. The training set had a total of 356 images and the testing set a total of 179 images. The total processing time was less than 3 minutes on an Intel i3 ultrabook.

Color Segmentation - First Week

In my first week I downloaded the 1KK application and looked at the code. It is an open-source Android project developed by Trevor Rife and is hosted on Github. The program works by taking a picture of seeds using a green background with blue circles. The circles are used to estimate the length per pixel of a picture, by getting the average diameter of the circles in pixels and knowing their real diameter. To get the circles it's necessary to do a segmentation of the blue colors, and to extract the seeds from the image there is also a segmentation that removes the green and blue colors.

An example of an image taken with 1KK


The segmentation is done using ranges of colors. The problem with this type of segmentation is that the ranges have to be found by experimentation. RGB colors, in particular, are affected by lighting changes: a picture taken with a lot of light can have high values in the RGB channels, but with low illumination the values would be low. So with the same range and the same objects, the picture would give a different segmentation depending on the light. 1KK uses the HSV color space for segmentation. To test the predefined ranges I used a software tool I created that lets you try different color ranges and see the segmentation of the picture in real time (it also works with videos).
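
This kind of segmentation by color range can be sketched with cv2.inRange (the bounds here are the HSV range I mention further below, just as an example, and the filename is a placeholder):

import numpy as np
import cv2

img = cv2.imread("seeds.jpg")  # placeholder filename
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Keep only the pixels whose HSV values fall inside [lower, upper]
lower = np.array([40, 0, 0], dtype=np.uint8)
upper = np.array([255, 255, 255], dtype=np.uint8)
mask = cv2.inRange(hsv, lower, upper)  # 255 inside the range, 0 outside
segmented = cv2.bitwise_and(img, img, mask=mask)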

The software is free and is called color-range-selector. To run it you need to have Python and OpenCV installed. I used it on Windows 10 and these are the steps I followed to make it run:

How to install OpenCV for Python in Windows

  1. First you need to install Python 2.7 by downloading and executing the msi installer.
  2. Then install the Python Development Tools for Windows that are available here: http://aka.ms/vcpython27
  3. After that you will be able to install the NumPy library for Python by running in the command line:
    pip install numpy
  4. Follow the instructions in this tutorial to install OpenCV for Python: http://docs.opencv.org/master/d5/de5/tutorial_py_setup_in_windows.html

How to install and use the color range selector

  1. Download the project from Github by using the command line:
    git clone https://github.com/HenrYxZ/color-range-selector.git
  2. Change the directory to the project:
    cd color-range-selector
  3. Finally run the main program:
    python color_range_selector.py  

Selecting a range of color

Segmentation using HSV range of colors with color-range-selector
The image shows the segmentation using a range between (40, 0, 0) and (255, 255, 255) in HSV space. The result is capable of extracting the green background and the blue circles. But for another image the ranges don't work.

Testing a segmentation using HSV range and not working
But using a BGR color range seems to work better in the second case. The idea is to use a high minimum value for the red channel, because both green and blue are far from red in the BGR color space. These are the results using a range between (0, 0, 60) and (255, 255, 255):

Testing segmentation on BGR and working well

Segmentation on BGR with inferior results than HSV
In conclusion, the HSV space may be used for segmentation when the hue changes little. For example, it works well with seeds or potatoes of brownish hue, but not in the test using gray-hued papers. On the other hand, BGR allows more different seed colors, but the segmentation is affected by the illumination.

Tuesday, February 16, 2016

Introduction

Hi, I'm Hernaldo! The purpose of this blog is to document my work as a Research Intern at Texas A&M University. In 2015 I graduated with a B.S. in Computer Science from the Pontifical Catholic University of Chile (PUC). Then I was one of five students selected from the School of Engineering of my university to do a research internship from January to March 2016 at Texas A&M University.

My project has been about using Computer Vision in agriculture, and I'm advised by Professor Bruce Gooch. There is a group researching agricultural improvement that has developed some mobile applications. One of them is the 1KK app, which allows users to get morphological measures of seeds using the device camera. That is done with Computer Vision and algorithms implemented in SmartGrain. The idea is to improve the applications by testing the current ones and adding new ones.