Optimization
http://answers.opencv.org/question/755/object-detection-slow/#760
Haar features are inherently slow - they make extensive use of floating point operations, which are a bit slow on mobile devices.
A quick solution would be to turn to LBP cascades - all you need is a few lines changed in your code. The performance gain is significant, and the loss in accuracy is minimal. Look for lbpcascades/lbpcascade_frontalface.xml.
If you want to dig deeper into optimizations, here is a generic optimization tip list (cross-posted from SO). Please note that face detection, being one of the most requested features of OpenCV, is already quite optimized, so improving it further may require deep knowledge.
Advice for optimization
A. Profile your app. Do it first on your computer, since it is much easier. Use the Visual Studio profiler and see which functions take the most time. Optimize them. Never optimize because you think something is slow, but because you measured it. Start with the slowest function, optimize it as much as possible, then take the second slowest.
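When a full profiler is not at hand (on the device itself, for instance), a coarse timer around a suspect function is a workable first measurement. The sketch below uses only std::chrono; `time_ms` and `busy_work` are hypothetical names invented for illustration, not OpenCV functions.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Run `fn` `runs` times and return the average wall-clock time per run, in milliseconds.
template <typename Fn>
double time_ms(Fn&& fn, int runs = 10) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int r = 0; r < runs; ++r) fn();
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / runs;
}

// Hypothetical stand-in for a function you suspect is slow: sum of squares below n.
std::uint64_t busy_work(int n) {
    std::uint64_t acc = 0;
    for (int i = 0; i < n; ++i) acc += static_cast<std::uint64_t>(i) * i;
    return acc;
}
```

Calling `time_ms([&] { result = busy_work(100000); })` before and after a change gives a rough comparison. It does not replace a profiler, which also tells you where inside the function the time goes.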
B. First, focus on algorithms. A faster algorithm can improve performance by orders of magnitude (100x). A C++ trick will give you maybe a 2x performance boost.
Classical techniques:
- Resize your video frames to be smaller. Many times, you can extract the information from a 200x300px image instead of a 1024x768 one. The area of the first one is about 13 times smaller.
- Use simpler operations instead of complicated ones. Use integers instead of floats. And never use double in a matrix or a for loop that executes thousands of times.
- Do as little calculation as possible. Can you track an object only in a specific area of the image, instead of processing it all for all the frames? Can you make a rough/approximate detection on a very small image and then refine it on a ROI in the full frame?
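The last idea (rough detection on a small image, refinement on a ROI in the full frame) needs a mapping from small-image coordinates back to the full frame. A minimal sketch follows, using integer arithmetic only. The `Rect` struct, the function name, the integer downscale factor and the padding margin are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical rectangle type: top-left corner plus size, in pixels.
struct Rect { int x, y, width, height; };

// Map a rectangle found on a frame downscaled by the integer factor `scale`
// back to full-resolution coordinates, pad it by `margin` pixels on each side,
// and clamp the result to the frame bounds.
Rect to_full_frame_roi(const Rect& small, int scale, int margin,
                       int frame_w, int frame_h) {
    assert(scale >= 1 && margin >= 0);
    const int x0 = std::max(0, small.x * scale - margin);
    const int y0 = std::max(0, small.y * scale - margin);
    const int x1 = std::min(frame_w, (small.x + small.width) * scale + margin);
    const int y1 = std::min(frame_h, (small.y + small.height) * scale + margin);
    return Rect{x0, y0, std::max(0, x1 - x0), std::max(0, y1 - y0)};
}
```

The margin is there because a detection on the small image is only accurate to within `scale` pixels, so the refinement step gets a little room around the rough box.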
C. In for loops, it may make sense to use C style instead of C++. A pointer to the matrix data or a float array is much faster than mat.at<type>(i, j) or std::vector<>. But change only if it's needed. Usually, most of the processing (90%) is done in some double for loop. Focus on it. It doesn't make sense to replace vector<> all over the place and make your code look like spaghetti.
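As a sketch of the pointer style in a double for loop, the example below fetches one row pointer per row and indexes it directly. `Image8` is a hypothetical minimal image type used so the example stands alone; with a cv::Mat the row pointer would come from `mat.ptr<uchar>(y)` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image: 8-bit, single channel, row-major, `stride` bytes per row.
struct Image8 {
    int width, height;
    std::size_t stride;
    std::vector<std::uint8_t> data;
    Image8(int w, int h)
        : width(w), height(h), stride(static_cast<std::size_t>(w)), data(stride * h) {}
};

// Binary threshold in place: pixels above `t` become 255, all others 0.
// The row pointer is computed once per row, not once per pixel.
void threshold_inplace(Image8& img, std::uint8_t t) {
    assert(img.data.size() >= img.stride * img.height);
    for (int y = 0; y < img.height; ++y) {
        std::uint8_t* row = img.data.data() + y * img.stride;
        for (int x = 0; x < img.width; ++x)
            row[x] = row[x] > t ? 255 : 0;
    }
}
```

Only the hot loop changes here. The storage is still a std::vector, which is in line with the advice not to rewrite containers everywhere.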
D. Some OpenCV functions convert data to double, process it, then convert back to the input format. Beware of them, they kill performance on mobile devices. Examples: warping, scaling, type conversions. Also, color space conversions are known to be slow. Prefer grayscale obtained directly from native YUV.
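To illustrate the YUV remark: in semi-planar formats such as NV21 or NV12, the frame starts with a full-resolution luma (Y) plane, which already is the grayscale image. The sketch below assumes such a buffer with even width and height; `GrayView` and `luma_plane` are hypothetical names, and the answer itself does not say which YUV layout it has in mind.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Non-owning view of an 8-bit grayscale image.
struct GrayView {
    const std::uint8_t* data;
    int width;
    int height;
};

// Return the luma plane of an NV21/NV12 frame as a grayscale view.
// Nothing is copied or converted: the Y plane is the first width*height bytes.
GrayView luma_plane(const std::vector<std::uint8_t>& yuv, int width, int height) {
    const std::size_t luma = static_cast<std::size_t>(width) * height;
    // Assumed layout: Y plane followed by interleaved chroma at half resolution.
    assert(yuv.size() >= luma + luma / 2);
    return GrayView{yuv.data(), width, height};
}
```

The view is valid only while the underlying buffer is alive, so it suits per-frame processing inside a camera callback better than long-term storage.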
E. ARM processors have NEON. Learn and use it. It is powerful!
A small example:
    float *a, *b, *c;
    // init a and b to 1000001 elements
    for (int i = 0; i < 1000001; i++)
        c[i] = a[i] * b[i];
can be rewritten as follows. It's more verbose, but trust me it's faster.
    #include <arm_neon.h>

    float *a, *b, *c;
    // init a and b to 1000001 elements
    float32x4_t a_, b_, c_;
    int i;
    // stop while a full group of 4 floats remains, so the loads never run past the end
    for (i = 0; i + 4 <= 1000001; i += 4)
    {
        a_ = vld1q_f32(&a[i]);     // load 4 floats from a into a NEON register
        b_ = vld1q_f32(&b[i]);
        c_ = vmulq_f32(a_, b_);    // perform 4 float multiplies in parallel
        vst1q_f32(&c[i], c_);      // store the four results in c
    }
    // the vector size is not always a multiple of 4, 8 or 16;
    // process the remaining elements
    for (; i < 1000001; i++)
        c[i] = a[i] * b[i];
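To keep the same routine compilable on the desktop, where profiling is easier, the NEON path can be guarded by the compiler's NEON macro with a scalar fallback. This is a sketch under that assumption; `multiply_arrays` is a hypothetical name, and the intrinsics are the same three used in the example.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Element-wise product c[i] = a[i] * b[i] for n floats.
// Uses NEON when the compiler targets it, plain scalar code otherwise.
void multiply_arrays(const float* a, const float* b, float* c, std::size_t n) {
    assert(n == 0 || (a && b && c));
    std::size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    // Process full groups of 4 floats; the bound keeps every load in range.
    for (; i + 4 <= n; i += 4) {
        const float32x4_t va = vld1q_f32(a + i);
        const float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(c + i, vmulq_f32(va, vb));
    }
#endif
    // Remaining elements on NEON builds, or the whole array on other targets.
    for (; i < n; ++i) c[i] = a[i] * b[i];
}
```

Both paths produce the same results, so the scalar build can serve as a reference when checking the NEON one on the device.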
Purists say you must write in assembler, but for the regular programmer guy that's a bit daunting. I found good results writing with intrinsics, like in the above example.
Also check this blog post and the following posts about NEON.
And, last but not least, I should mention that I had very good success converting the SSE optimizations (SSE is the counterpart of NEON on x86-64 processors) in OpenCV to NEON, like here. This is the image filtering code for uchar matrices (the regular image format). You shouldn't blindly convert instructions one by one, because there are better ways to do it, but take it as an example to start with.