1

Topic: Time delay OpenCL&Opencv on Geforce videocards

I am checking whether OpenCL with OpenCV is usable on video cards, and I found one peculiarity with GeForce cards. Processing on them is fast, but the return path, when the data has to be fetched back from video memory, is very slow, even slower than doing the same work on the CPU. On Intel graphics everything flies. Here is the code I tested:

    cv::Mat srcMat(sizeSrcCV, m_bRGB32 ? CV_8UC4 : CV_8UC3, srcBuf);
    cv::Mat dstMat(sizeDstCV, m_bRGB32 ? CV_8UC4 : CV_8UC3, dstBuf);
    cv::UMat dstUMat;
    cv::resize(srcMat, dstUMat, sizeDstCV, 0, 0, m_nResizeMethod);
    RtlCopyMemory(dstMat.data, dstUMat.getMat(cv::ACCESS_FAST).data, nLenActual);

The slowdown is in dstUMat.getMat(). Is there a way to overcome this? Could CUDA be used for these cards instead?

2

Re: Time delay OpenCL&Opencv on Geforce videocards

Hello, Vicul, you wrote:

V> The slowdown is in dstUMat.getMat(). Is there a way to overcome this? Could CUDA be used for these cards instead?

It is unlikely to help. The memory is arranged so that copying into video memory is much faster than copying back out. With integrated Intel graphics it is better, because there the memory is not copied anywhere: it resides in system memory from the start. So the algorithms have to be arranged accordingly:

1. Run the full processing pipeline on the GPU and copy nothing back.
2. If we are processing video, copying can be overlapped with computation at the cost of a one-frame lag. Roughly speaking, while the results of the previous frame are being copied back, the next frame has already been submitted for processing.
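The second option can be sketched roughly as follows. This is a CPU-only illustration of the scheduling idea, not OpenCV or OpenCL code: Frame, processOnDevice, readBack and runPipelined are hypothetical names, and the two stage functions are stand-ins for the GPU kernel and the device-to-host copy.

```cpp
#include <future>
#include <utility>
#include <vector>

using Frame = std::vector<int>;

// Hypothetical stand-in for the GPU stage (for example, a resize kernel).
Frame processOnDevice(const Frame& in) {
    Frame out(in);
    for (int& v : out) v *= 2;
    return out;
}

// Hypothetical stand-in for the slow device-to-host copy.
Frame readBack(Frame deviceResult) {
    return deviceResult;
}

// One-frame-lag pipeline: the read-back of frame i-1 runs on a worker
// thread while frame i is being processed on the calling thread.
std::vector<Frame> runPipelined(const std::vector<Frame>& input) {
    std::vector<Frame> output;
    output.reserve(input.size());
    std::future<Frame> pending;  // read-back of the previous frame, if any

    for (const Frame& frame : input) {
        // Runs while the previous read-back (if any) is still in flight.
        Frame processed = processOnDevice(frame);
        if (pending.valid()) {
            output.push_back(pending.get());  // deliver the previous frame
        }
        // Start copying this frame back; it is collected one iteration later.
        pending = std::async(std::launch::async, readBack, std::move(processed));
    }
    if (pending.valid()) {
        output.push_back(pending.get());  // flush the last frame
    }
    return output;
}
```

Frames come out in the original order, but each one is delivered one iteration late. In a real filter the worker would perform the actual device-to-host copy, and whether the overlap pays off depends on how the driver schedules transfers against kernels.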

3

Re: Time delay OpenCL&Opencv on Geforce videocards

V> The slowdown is in dstUMat.getMat(). Is there a way to overcome this? Could CUDA be used for these cards instead?

I suspect that resize simply creates a task and sends it to the GPU, which is why that call returns quickly. Then, when you request the result (getMat), the call has to wait for the result to be ready and then copy it. Possible solutions were described above by Nuzhny.

4

Re: Time delay OpenCL&Opencv on Geforce videocards

N> 1. Run the full processing pipeline on the GPU and copy nothing back.

No, that will not work for me. Everything has to be processed in a separate filter and then passed on to another one. I started all this in order to take the bulk of the CPU load off the CV functions. Is there really no way to play with memory contexts, for example to tell the video card to use system memory?

N> 2. If we are processing video, copying can be overlapped with computation at the cost of a one-frame lag. Roughly speaking, while the results of the previous frame are being copied back, the next frame has already been submitted for processing.

I will try. But I think one frame will not be enough: the lags there fluctuate a lot.

5

Re: Time delay OpenCL&Opencv on Geforce videocards

Hello, Vicul, you wrote:

V> No, that will not work for me. Everything has to be processed in a separate filter and then passed on to another one.
V> I started all this in order to take the bulk of the CPU load off the CV functions. Is there really no way to play with memory contexts,
V> for example to tell the video card to use system memory?

Well, that goes somewhat against the hardware: there is system memory, then the bus, then video memory (global memory and constant memory). There are deeper layers beyond that, but they are not essential here. The video card's multiprocessors cannot address system memory directly. There is a mechanism called pinned memory, but it does not help much. It is better to design the architecture so that copying is eliminated as far as possible.

6

Re: Time delay OpenCL&Opencv on Geforce videocards

N> It is better to design the architecture so that copying is eliminated as far as possible.

Thanks for the information, I will try.

7

Re: Time delay OpenCL&Opencv on Geforce videocards

N> It is better to design the architecture so that copying is eliminated as far as possible.

If all the math is moved into video memory, the question arises of what to do about rendering at the end. OpenCV works only with its own windows, which is not very convenient. For this purpose I normally use DirectShow filters such as VMR9 or EVR, which are bound to a window I have created via its HWND, and the video graph is connected there automatically. But then I again have to copy everything in order to drag the processed frames over to this renderer. So I need a new renderer based on the same EVR, into which all my math would have to be put. I would rather not write an EVR from scratch. Is there perhaps some base code for such a renderer that a custom one could be built on? Or is there another solution?

8

Re: Time delay OpenCL&Opencv on Geforce videocards

Hello, Vicul, you wrote:

V> If all the math is moved into video memory, the question arises of what to do about rendering at the end.

I am not familiar with DirectShow, but from OpenCL you can always output the result to an OpenGL texture without copying through system memory. The same is probably possible with a DirectX texture. Or do you mean something else?