GPU-accelerated object detection can be enabled as long as you have a compatible (CUDA-enabled) GPU.
If you have a compatible GPU, you will also need to make sure the following required libraries are installed and configured for use.
To enable hardware-accelerated GPU detection on a device:
At this point StreamShuttle will try to use the GPU for object detection. It will fall back to the CPU if the required GPU libraries are not found. You should verify that it is actually using the GPU by monitoring the logs.
When enabling this for the first time, it is usually best to have only one device running; this makes it easier to browse the logs and pinpoint the relevant information.
On Linux: Open a terminal and run the following command.
tail -f ~/.config/StreamShuttle/streamshuttle/pm2/logs/hub-1-out.log
On Windows: Open PowerShell and run the following command.
Get-Content $env:USERPROFILE\AppData\Roaming\StreamShuttle\streamshuttle\pm2\logs\hub-1-out.log -Wait -Tail 30
Every time a device stream is restarted, information from the object detector will be logged. Specifically, seeing the following line means the GPU is being used correctly:
If the GPU detector is not working, you will instead see the following:
If the USING_GPU option is not currently working, there will generally be more information about why in the lines that directly precede it. Specifically, look for any line that begins with the following:
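As a quick way to surface that context on Linux, you can grep the hub log for the USING_GPU line along with the lines just before it. The sketch below runs against a hypothetical sample file with made-up log lines (your real log content will differ); point grep at the actual hub log path shown earlier instead.

```shell
# Hypothetical sample log, for illustration only -- real log lines will differ.
cat > /tmp/hub-sample.log <<'EOF'
detector: library check failed
detector: USING_GPU false
EOF

# Show each USING_GPU line plus the 5 lines before it,
# where the failure details are typically printed.
grep -B 5 "USING_GPU" /tmp/hub-sample.log
```

For the real log, replace `/tmp/hub-sample.log` with `~/.config/StreamShuttle/streamshuttle/pm2/logs/hub-1-out.log`.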
Once GPU detection is up and running, you can increase the rate at which object detection happens.
Depending on how many devices you have, it may make more sense to use a "global" detector rather than running a separate detector for each device. This decreases the total amount of RAM the graphics card uses.
To use the global detector, it first needs to be enabled in the "StreamShuttle Control Panel".
Once it is enabled in the control panel, you can enable the device-specific option.
Again, depending on your specific usage (e.g. the number of cameras and the type of GPU you have), it may make sense to add additional hubs in order to create more "Global Detectors". This allows you to spread the load as needed.
For example, each detector requires roughly 600MB of GPU memory. If you have a smaller GPU, running 8 separate detectors might exceed the available GPU memory. Instead, you could create a global detector and have all of the devices use it. This way you would occupy only 600MB, versus 4800MB if you ran a detector for each device.
Continuing with the same example: if a "Global Detector" is serving more than 8 devices, the FPS it can handle may become the bottleneck. In that case you can simply add another "Global Detector". You would then be using only 1200MB of memory while handling roughly double the FPS.
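The memory math from the two examples above can be sketched directly in the shell, using the rough 600MB-per-detector figure stated earlier:

```shell
# Rough GPU memory math from the examples above: ~600MB per detector.
PER_DETECTOR_MB=600
DEVICES=8

# One detector per device:
echo "Per-device detectors: $(( PER_DETECTOR_MB * DEVICES ))MB"   # 4800MB

# Global detectors (one, then two if FPS becomes the bottleneck):
echo "One global detector:  $(( PER_DETECTOR_MB * 1 ))MB"          # 600MB
echo "Two global detectors: $(( PER_DETECTOR_MB * 2 ))MB"          # 1200MB
```

The actual per-detector footprint depends on your model and GPU, so treat 600MB as a planning estimate rather than an exact figure.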
The last scenario is to use multiple GPUs. You can assign which GPUs are available to each hub via environment variables.
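As a sketch: with CUDA, GPU visibility is normally controlled through the standard CUDA_VISIBLE_DEVICES environment variable, which restricts which GPUs the CUDA runtime exposes to a process. Whether StreamShuttle reads this exact variable is an assumption here, and the hub start commands are deliberately left as placeholder comments:

```shell
# CUDA_VISIBLE_DEVICES restricts which GPUs the CUDA runtime exposes
# to a process (0-based GPU indices). Assumed behavior for StreamShuttle
# hubs; the actual hub start command is omitted -- substitute your own.

# First hub sees only GPU 0:
CUDA_VISIBLE_DEVICES=0 env | grep "^CUDA_VISIBLE_DEVICES"
# (start hub 1 here with the same variable set)

# Second hub sees only GPU 1:
CUDA_VISIBLE_DEVICES=1 env | grep "^CUDA_VISIBLE_DEVICES"
# (start hub 2 here with the same variable set)
```

Each hub (and therefore each "Global Detector" it runs) would then be pinned to its own GPU, spreading both memory usage and detection load across cards.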