The recommended build process uses Docker, so that you have the exact environment that I know this code works under. The Docker image can be built using the Dockerfile in this repository. It is also available pre-built as `sanchom/phd-environment`.
I recommend checking out this repository into a location on your host machine. Then launch the Docker container as needed, attaching the local directory to the container using Docker's `-v` flag.
```
git clone https://github.com/sanchom/sjm.git
cd sjm
docker run --rm -v `pwd`:/work -w /work sanchom/phd-environment scons
```
In case you're not familiar with Docker, this command makes the local directory visible inside the container at the path `/work`. It then runs the build command, `scons`, inside that work directory. The build products are put onto your host machine in the `sjm` directory. As soon as the build is finished, the Docker container stops and is removed (`--rm`).
- Download the Caltech 101 dataset.
- Extract the files: `tar -xvzf 101_ObjectCategories.tar.gz`. This should give you a directory called `101_ObjectCategories` with 102 sub-directories: one for each of the 101 object categories and one background category.
- Resize the images to have a maximum width or height of 300 pixels, with preserved aspect ratio. To do this, I use ImageMagick's `mogrify` command: `mogrify -verbose -resize 300x300 101_ObjectCategories/*/*.jpg`.
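If you'd rather not install ImageMagick, the same geometry computation is easy to reproduce. This is an illustrative sketch (the `resized_dims` helper is not part of this repository) of the target size that `-resize 300x300` produces: the longer side is capped at 300 pixels, aspect ratio is preserved, and images already within bounds are left alone.

```python
def resized_dims(width, height, max_side=300):
    """Aspect-preserving target size with the longer side capped at
    max_side, mirroring ImageMagick's `-resize 300x300` geometry.
    Images already within bounds are not upscaled."""
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale)
```

For example, a 600x400 image becomes 300x200, while a 200x100 image is left unchanged.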
On your local host machine, create a directory that the extracted features will end up in. In the following, I call that `path_for_extracted_features`.
Use the Docker container to run this command:

```
docker run --rm -v `pwd`:/work -v [path_to_your_101_Categories_directory]:/images \
  -v [path_for_extracted_features]:/features -w /work/naive_bayes_nearest_neighbor/experiment_1 \
  -e PYTHONPATH=/work sanchom/phd-environment \
  python extract_caltech.py --dataset_path=/images --process_limit=4 --sift_normalization=2.0 \
  --sift_discard_unnormalized --sift_grid_type FIXED_3X3 --sift_first_level_smoothing 0.66 --sift_fast \
  --sift_multiscale --features_directory=/features
```
This will extract multi-scale SIFT at 4 scales, with a small amount of additional smoothing applied at the first level. Features are extracted at each level on a 3x3 pixel grid. The first-level features are 16x16, and the patch size increases by a factor of 1.5 at each level. This also discards features from low-contrast regions (`--sift_normalization=2.0 --sift_discard_unnormalized`).
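To make the scale progression concrete, here is a quick sketch (the `sift_patch_sizes` helper is illustrative, not code from this repository) of the patch side length at each of the 4 levels, starting from 16x16 and growing by 1.5x:

```python
def sift_patch_sizes(base=16, factor=1.5, levels=4):
    """Patch side length at each scale level: base * factor**level."""
    return [round(base * factor ** i) for i in range(levels)]
```

With the defaults above, the four levels come out to 16, 24, 36, and 54 pixels.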
Now, you should have a directory structure on your local machine at `path_for_extracted_features` that mirrors the one at `path_to_your_101_Categories_directory`, but with `.sift` files instead of `.jpg` files.
Create a `101_categories.txt` file that lists all the 101 object categories (not BACKGROUND_Google). We ignore the background class, as suggested by the dataset creators.
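One way to generate this file is from the dataset's own sub-directory names. This is a sketch under the layout described above (the `write_category_list` helper is illustrative, not a script shipped with this repository):

```python
import os

def write_category_list(dataset_path, out_path="101_categories.txt"):
    """Write one category name per line, skipping BACKGROUND_Google."""
    categories = sorted(
        name for name in os.listdir(dataset_path)
        if os.path.isdir(os.path.join(dataset_path, name))
        and name != "BACKGROUND_Google"
    )
    with open(out_path, "w") as f:
        for category in categories:
            f.write(category + "\n")
    return categories
```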
Run this:

```
docker run --rm -v `pwd`:/work -v [path_to_extracted_features]:/features \
  -w /work sanchom/phd-environment ./naive_bayes_nearest_neighbor/experiment_1/experiment_1 \
  --category_list 101_categories.txt --features_directory /features \
  --alpha [alpha] --trees [trees] --checks [checks] \
  --results_file [results_file] --logtostderr
```
In our experiments, we fixed `alpha=1.6`, `trees=4`, and varied the `checks` variable depending on the particular experiment we were performing, but for optimal performance, `checks` should be greater than 128 (see Figure 4 from our paper).

The NBNN algorithm is implemented in `NbnnClassifier::Classify`.
Create a `101_categories.txt` file that lists all the 101 object categories (not BACKGROUND_Google). We ignore the background class, as suggested by the dataset creators.
Run this:

```
docker run --rm -v `pwd`:`pwd` -v [path_to_extracted_features]:/features \
  -w `pwd` sanchom/phd-environment ./naive_bayes_nearest_neighbor/experiment_3/experiment_3 \
  --category_list 101_categories.txt --features_directory /features \
  --k [k] --alpha [alpha] --trees [trees] --checks [checks] \
  --results_file [results_file] --logtostderr
```
In our experiments, we fixed `alpha=1.6`, `trees=4`, and varied `k` and `checks` depending on the experiment. For optimal results, `checks` should be above 1024 (see Figure 4 from our paper), and `k` should be around 10-20 (see Figure 3 from our paper).

The Local NBNN algorithm is implemented in `MergedClassifier::Classify`.
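As with plain NBNN above, a short sketch may help. Local NBNN searches a single merged index over all classes instead of one index per class: for each descriptor it takes the `k+1` nearest neighbors, treats the distance to the `(k+1)`-th as a shared background distance, and only updates the totals of classes appearing in the top `k`. This is an illustrative brute-force stand-in for the approximate FLANN search the real C++ code uses; the function name and data layout are not this repository's API.

```python
def local_nbnn_classify(query_descriptors, labeled_descriptors, k=10):
    """labeled_descriptors: list of (class_name, feature_vector) pairs,
    i.e. one merged pool of all classes' training descriptors."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    totals = {c: 0.0 for c, _ in labeled_descriptors}
    for d in query_descriptors:
        # Brute-force stand-in for the merged-index nearest-neighbor search.
        neighbors = sorted((sq_dist(d, v), c) for c, v in labeled_descriptors)[: k + 1]
        background = neighbors[-1][0]  # distance to the (k+1)-th neighbor
        closest = {}
        for dist, c in neighbors[:k]:
            closest.setdefault(c, dist)  # list is sorted, so first hit is the minimum
        for c, dist in closest.items():
            totals[c] += dist - background
    return min(totals, key=totals.get)
```

Because each descriptor triggers one search in one index rather than one search per class, this scales much better with the number of categories, which is the point of the Local NBNN variant.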
This section is being rewritten, but if you're curious, look in the raw text of this README file for a section that's been commented out.