CUDA C neural network implementation with RPROP
Currently the network supports only one hidden layer (generalizing this with templates is a planned improvement).
Usage: ./gpu <neurons in hidden layer> <threads> <number of epochs>
- blocks executed = #threads/1024 + 1
- threads executed per block = #threads/#blocks
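
The launch arithmetic above can be sketched in host-side C (which also compiles under nvcc). This is only an illustration, not the project's actual code: the names LaunchConfig and launch_config are hypothetical, the blocks formula is assumed to read #threads/1024 + 1, both divisions are assumed to be integer divisions, and 1024 is assumed to be the per-block thread limit of the target GPU.

```c
#include <assert.h>

/* Hypothetical container for the two launch parameters. */
typedef struct {
    int blocks;             /* number of blocks in the grid */
    int threads_per_block;  /* number of threads in each block */
} LaunchConfig;

/* Derive the launch configuration from the <threads> argument.
 * Assumption: integer division, as in "#threads/1024 + 1". */
static LaunchConfig launch_config(int requested_threads)
{
    LaunchConfig cfg;

    assert(requested_threads > 0);

    /* blocks executed = #threads/1024 + 1 */
    cfg.blocks = requested_threads / 1024 + 1;

    /* threads executed per block = #threads/#blocks */
    cfg.threads_per_block = requested_threads / cfg.blocks;

    return cfg;
}
```

Under these assumptions, 512 requested threads give 1 block of 512 threads, 1024 give 2 blocks of 512, and 2048 give 3 blocks of 682. In the last case integer division means only 2046 threads actually run, so the requested count is an upper bound rather than an exact figure.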