Labeling pixels on a road in images using a Fully Convolutional Network (FCN).
The load_vgg function loads the pre-trained VGG model.
The project implements the layers function.
The optimize function uses cross-entropy as the loss for the network and minimizes it with an Adam optimizer.
cross_entropy_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.AdamOptimizer(learning_rate)
train_op = optimizer.minimize(cross_entropy_loss)
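To make the loss above concrete, here is a minimal NumPy sketch of mean softmax cross-entropy. It assumes logits and labels are 2-D arrays of shape (pixels, classes) with one-hot labels; the function name and the sample values are made up for illustration and are not taken from the project.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over rows; labels are one-hot targets."""
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Per-pixel loss is -sum(labels * log_softmax(logits)); then average.
    return float(-(labels * log_probs).sum(axis=1).mean())

# Two "pixels", two classes (road / not road); values are made up.
logits = np.array([[2.0, 0.0], [0.0, 0.0]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = softmax_cross_entropy(logits, labels)
print(loss)
```

The first pixel is classified confidently and correctly, so it contributes little to the loss; the second pixel has equal logits, so it contributes log(2).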
The train_nn function is implemented and prints the time and loss for each epoch of training.
The project trains the model correctly, taking about 48 s per epoch, or 48 s x 40 epochs = 1920 s (32 minutes) in total.
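The per-epoch reporting described above can be sketched roughly as follows. This is not the project's train_nn; `train_sketch` and `run_batch` are hypothetical names, with `run_batch` standing in for one training step that returns the batch loss.

```python
import time

def train_sketch(epochs, batches, run_batch):
    """Run `epochs` passes over `batches`, printing time and mean loss per epoch.

    `run_batch` is a hypothetical callable standing in for one training
    step on a batch; it must return that batch's loss as a float.
    """
    history = []
    for epoch in range(1, epochs + 1):
        start = time.time()
        losses = [run_batch(batch) for batch in batches]
        mean_loss = sum(losses) / len(losses)
        elapsed = time.time() - start
        print("Epoch {}/{}: {:.1f}s, loss {:.4f}".format(
            epoch, epochs, elapsed, mean_loss))
        history.append(mean_loss)
    return history

# Toy usage with made-up numbers: the "loss" is just the batch value.
history = train_sketch(epochs=2, batches=[0.5, 0.3], run_batch=lambda b: b)

# Total training time implied by 48 s per epoch over 40 epochs.
total_seconds = 48 * 40
total_minutes = total_seconds / 60
```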
Final hyperparameters used for training:
L2_REG = 1e-5
STDEV = 1e-2
KEEP_PROB = 0.8
LEARNING_RATE = 1e-4
EPOCHS = 40
BATCH_SIZE = 8
IMAGE_SHAPE_KITI = (160,576)
NUM_CLASSES = 2
Results from the test images. As the GIF below shows, a pre-trained VGG-16 network combined with a fully convolutional network successfully labels the road. Performance was also improved by using skip connections, which are added element-wise to the upsampled layers.
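The element-wise addition in a skip connection can be illustrated with a small NumPy sketch. It is a simplification under stated assumptions: nearest-neighbour upsampling stands in for whatever learned upsampling the project uses, the function names are hypothetical, and the shapes are simply IMAGE_SHAPE_KITI divided by 32 and by 16.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) array.

    Stand-in for a learned upsampling layer, used here only to keep
    the sketch dependency-free.
    """
    return x.repeat(2, axis=0).repeat(2, axis=1)

def skip_connect(deep, shallow):
    """Upsample the deeper feature map and add the shallower one element-wise."""
    up = upsample2x(deep)
    if up.shape != shallow.shape:
        raise ValueError("skip inputs must match in shape after upsampling")
    return up + shallow

# Made-up values: a 5x18 deep map and a 10x36 shallower map, 2 classes each.
deep = np.ones((5, 18, 2))
shallow = np.full((10, 36, 2), 0.5)
fused = skip_connect(deep, shallow)
print(fused.shape)
```

The addition only works when both inputs have the same height, width, and channel count, which is why the shallower map is usually projected to NUM_CLASSES channels before being added.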
main.py checks that you are using a GPU. If you don't have a GPU on your system, you can use AWS or another cloud computing platform.
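A GPU check of this kind might look like the sketch below. It is an assumption about how such a check can be written, not a copy of what main.py does; `find_gpu` is a hypothetical helper, and it falls back to an empty string if TensorFlow is not installed.

```python
def find_gpu():
    """Return TensorFlow's default GPU device name, or '' if none is visible."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed, so no GPU can be reported.
        return ""
    return tf.test.gpu_device_name()

device = find_gpu()
if device:
    print("Default GPU device: {}".format(device))
else:
    print("No GPU found. Consider training on a machine with a GPU.")
```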
Make sure you have the following installed:
You may also need the Python Imaging Library (PIL) for SciPy's imresize function.
Download the KITTI Road dataset from here. Extract the dataset in the data folder. This will create the folder data_road with all the training and test images.
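Before running the project, it can help to confirm that the extraction produced the expected folder. The sketch below checks only for data_road inside a given directory; `dataset_ready` is a hypothetical helper, and the demonstration uses a temporary directory rather than the real data folder.

```python
from pathlib import Path
import tempfile

def dataset_ready(data_dir):
    """Return True if `data_dir` contains an extracted data_road folder."""
    return (Path(data_dir) / "data_road").is_dir()

# Demonstration in a temporary directory rather than the real data folder.
with tempfile.TemporaryDirectory() as tmp:
    before = dataset_ready(tmp)          # nothing extracted yet
    (Path(tmp) / "data_road").mkdir()    # simulate extracting the dataset
    after = dataset_ready(tmp)

print(before, after)
```

In the project itself the call would presumably be dataset_ready("data"), assuming the script is run from the repository root.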
Run the following command to run the project:
python main.py
Note: If running this in Jupyter Notebook, system messages, such as those regarding test status, may appear in the terminal rather than the notebook.