A simple solver for "Where's Wally" images.
This repository contains a deep convolutional neural network trained to find Wally (aka Waldo).
The training code is based on the dlib example for training a car detector. The only change is the network architecture, which was adapted to detect objects as small as 24x24 pixels.
The definition of the network architecture can be found in src/detector.h.
The trained model has 7 convolutional layers and weighs only 350 kB, so I decided to embed it into the code directly.
To that end, I used the powerful serialize function family from dlib.
The model is serialized into a byte string, then compressed, and finally encoded as Base64.
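The compress-then-encode idea can be illustrated with a minimal Python sketch. It is only an analogy: the repository does this step with dlib, while the sketch below swaps in Python's standard zlib and base64 modules, and the names encode_model, decode_model and fake_model are hypothetical.

```python
import base64
import zlib


def encode_model(raw: bytes) -> str:
    """Compress serialized model bytes and return them as Base64 text."""
    compressed = zlib.compress(raw, 9)
    return base64.b64encode(compressed).decode("ascii")


def decode_model(text: str) -> bytes:
    """Reverse encode_model: Base64-decode first, then decompress."""
    return zlib.decompress(base64.b64decode(text))


# Placeholder bytes standing in for a serialized network (hypothetical data).
fake_model = b"\x00\x01weights" * 1000
embedded = encode_model(fake_model)

# The round trip must give back exactly the original bytes.
assert decode_model(embedded) == fake_model
```

Because the result is plain ASCII text, it can be pasted into source code as a string literal, which is what makes embedding a small model practical.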
As a result, instantiating the WallyFinder class from Python code is enough to get a fully working model that, when run on an image, returns a list of dictionaries with the keys xmin, ymin, xmax, and ymax that describe each bounding box.
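As a rough illustration of how such a result could be consumed, the sketch below works on a hand-written list shaped like the output described above. It does not import or call the module itself; the helper names box_size and box_center and the sample coordinates are hypothetical, and it assumes the values are pixel coordinates.

```python
from typing import Dict, Tuple

Box = Dict[str, float]


def box_size(box: Box) -> Tuple[float, float]:
    """Return (width, height) of a bounding box."""
    return box["xmax"] - box["xmin"], box["ymax"] - box["ymin"]


def box_center(box: Box) -> Tuple[float, float]:
    """Return the (x, y) center of a bounding box."""
    return (box["xmin"] + box["xmax"]) / 2, (box["ymin"] + box["ymax"]) / 2


# Hypothetical detections, shaped like the list of dictionaries described above.
detections = [{"xmin": 100, "ymin": 40, "xmax": 124, "ymax": 64}]

for box in detections:
    print("size:", box_size(box), "center:", box_center(box))
```

The same dictionaries can be passed directly to most drawing libraries, since the corner coordinates are all that is needed to draw a rectangle.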
These dependencies are only needed at build time, not at run-time:
CMake >=3.14
Ninja
gcc >=7
g++ >=7
These dependencies are needed at both build-time and run-time:
python >=3.6
CUDA >=7.5 (optional: it will use CPU instead if not found)
CUDNN >=5 (optional: it will use CPU instead if not found)
An installable wheel file can be created by running
python setup.py bdist_wheel
The compiled .whl file will be placed in the dist directory; it can then be distributed or installed using pip install.
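For scripted setups, the wheel can be located and handed to pip from Python. This is a minimal sketch, assuming the build was run from the repository root so that the wheel lands in a dist folder there; find_wheels and pip_install_command are hypothetical helper names, and the sketch only builds the command rather than running it.

```python
import sys
from pathlib import Path
from typing import List


def find_wheels(dist_dir: str = "dist") -> List[Path]:
    """Return all .whl files in dist_dir, sorted by name."""
    return sorted(Path(dist_dir).glob("*.whl"))


def pip_install_command(wheel: Path) -> List[str]:
    """Build a pip command that installs the wheel with the current interpreter."""
    return [sys.executable, "-m", "pip", "install", str(wheel)]


# Usage (not executed here):
#   import subprocess
#   for wheel in find_wheels():
#       subprocess.run(pip_install_command(wheel), check=True)
```

Invoking pip through sys.executable ensures the wheel is installed for the same interpreter that runs the script.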
The steps described above can be combined by running
python setup.py install --user
This will install the module to ~/.local/lib/.
You can also opt for a system-wide installation by removing the --user flag.