Removed all C++ references from .cu files as required by 64-bit CUDA.
Changed the order of processing within the detector to reduce the memory requirement on the GPU.
Added the ability to select the GPU in the gpusurf_engine command-line utility.
Added a demo option to the gpusurf_engine command-line utility. The demo uses OpenCV to open a camera, runs the detector on the video stream, and displays the results:
gpusurf_engine --run-demo
Added an API call that allows the detector to be run with a user-specified threshold.
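
As a rough illustration of what a user-specified detection threshold does, the sketch below keeps only the candidate keypoints whose response exceeds the threshold. It is not the library's API: the `Keypoint` struct and the `filterByThreshold` function are hypothetical names invented for this example, and the real detector presumably applies its threshold on the GPU rather than on a host-side vector.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical keypoint record for illustration; not the library's actual type.
struct Keypoint {
    float x;
    float y;
    float response;  // detector strength at this location
};

// Return the candidates whose response is strictly greater than `threshold`.
// A higher threshold yields fewer, stronger keypoints.
std::vector<Keypoint> filterByThreshold(const std::vector<Keypoint>& candidates,
                                        float threshold) {
    std::vector<Keypoint> kept;
    std::copy_if(candidates.begin(), candidates.end(), std::back_inserter(kept),
                 [threshold](const Keypoint& k) { return k.response > threshold; });
    return kept;
}
```

Calling `filterByThreshold(points, 0.1f)` on a list of candidates would drop every keypoint with a response of 0.1 or less; raising the threshold trades keypoint count for keypoint strength.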