mirror of
https://github.com/opencv/opencv_contrib.git
synced 2025-10-18 08:44:11 +08:00
KCF speedup (#1374)
* kcf: use the float data type rather than double. In our practice, float is precise enough and gives better performance. With this patch, one of my benchmarks gets about a 20% performance gain. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>

* Offload transpose matrix multiplication to OCL. The matrix multiplication in updateProjectMatrix is one of the hotspots. The matrix shape is unusual: m is very short while n is very large. Neither the clBLAS GEMM implementation nor the in-trunk implementation is efficient for this shape, so I implemented a standalone transpose matrix multiplication kernel. It gives about a 10% performance gain on an Intel desktop platform and about a 20% gain on a Braswell platform. CPU utilization is also lower. Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>

* Add verification code for the KCF OCL transpose mm kernel. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>

* tracking: show FPS in tracker sample

* tracking: fix MSVC warnings in KCF

* tracking: move OCL kernel initialization to constructor in KCF
Committed by: Vadim Pisarevsky
Parent: 0058eca130
Commit: 41995b76e8
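The commit message describes a transpose matrix multiplication whose short dimension m is small and whose long dimension n is large, together with verification code for the new kernel. The following is a minimal CPU-side sketch of that idea, not the OpenCL kernel from this commit. The names `transposeMultiply`, `transposeMultiplyRef`, and `maxAbsDiff` are hypothetical, and the row-major n x m layout is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the OpenCV kernel): computes C = A^T * A in float
// for a row-major matrix A with n rows and m columns (m small, n large).
// The result C is m x m, row-major.
std::vector<float> transposeMultiply(const std::vector<float>& a,
                                     std::size_t n, std::size_t m)
{
    std::vector<float> c(m * m, 0.0f);
    for (std::size_t k = 0; k < n; ++k) {      // single pass over the long dimension
        const float* row = &a[k * m];
        for (std::size_t i = 0; i < m; ++i)
            for (std::size_t j = 0; j < m; ++j)
                c[i * m + j] += row[i] * row[j];
    }
    return c;
}

// Double-precision reference of the same product, used only for verification.
std::vector<double> transposeMultiplyRef(const std::vector<float>& a,
                                         std::size_t n, std::size_t m)
{
    std::vector<double> c(m * m, 0.0);
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t i = 0; i < m; ++i)
            for (std::size_t j = 0; j < m; ++j)
                c[i * m + j] += static_cast<double>(a[k * m + i]) *
                                static_cast<double>(a[k * m + j]);
    return c;
}

// Largest absolute element-wise difference between the float result and the reference.
double maxAbsDiff(const std::vector<float>& x, const std::vector<double>& y)
{
    double worst = 0.0;
    for (std::size_t i = 0; i < x.size() && i < y.size(); ++i)
        worst = std::max(worst, std::fabs(static_cast<double>(x[i]) - y[i]));
    return worst;
}
```

A verification step in this style would compute both results for the same input and compare `maxAbsDiff` against a tolerance chosen for the data range. How the actual verification code in the commit does this is not shown on this page.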
@@ -1236,12 +1236,12 @@ public:
      */
     void write(FileStorage& /*fs*/) const;

-    double detect_thresh;         //!< detection confidence threshold
-    double sigma;                 //!< gaussian kernel bandwidth
-    double lambda;                //!< regularization
-    double interp_factor;         //!< linear interpolation factor for adaptation
-    double output_sigma_factor;   //!< spatial bandwidth (proportional to target)
-    double pca_learning_rate;     //!< compression learning rate
+    float detect_thresh;          //!< detection confidence threshold
+    float sigma;                  //!< gaussian kernel bandwidth
+    float lambda;                 //!< regularization
+    float interp_factor;          //!< linear interpolation factor for adaptation
+    float output_sigma_factor;    //!< spatial bandwidth (proportional to target)
+    float pca_learning_rate;      //!< compression learning rate
     bool resize;                  //!< activate the resize feature to improve the processing speed
     bool split_coeff;             //!< split the training coefficients into two matrices
     bool wrap_kernel;             //!< wrap around the kernel values
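The hunk above changes six tuning parameters from double to float. As a rough illustration of what that costs in precision, the sketch below stores a value as float and measures the relative rounding error. `KcfParamsSketch` and `floatRoundingError` are hypothetical names that only mirror the field list in the diff, and the numeric values used with them are illustrative, not confirmed defaults.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the parameter fields touched by the diff.
// It is not the OpenCV struct; only the six float members are mirrored.
struct KcfParamsSketch {
    float detect_thresh;        // detection confidence threshold
    float sigma;                // gaussian kernel bandwidth
    float lambda;               // regularization
    float interp_factor;        // linear interpolation factor for adaptation
    float output_sigma_factor;  // spatial bandwidth (proportional to target)
    float pca_learning_rate;    // compression learning rate
};

// Relative error introduced by storing a non-zero double value as float.
inline double floatRoundingError(double value)
{
    const float stored = static_cast<float>(value);
    return std::fabs(static_cast<double>(stored) - value) / std::fabs(value);
}
```

Callers that assign to such fields would typically use float literals (for example `0.075f`) so that no implicit double-to-float narrowing occurs, which is the kind of conversion compilers such as MSVC warn about.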