#### Topic: Calculation on GPU

Greetings. I have zero experience programming video cards, and my knowledge on this subject is also close to zero, so if the question seems trivial or silly, please don't kick me too hard. I need to figure out whether it makes sense to use a GPU instead of the CPU for the following task. There is a fairly simple iterative algorithm. It performs on the order of a hundred iterations and operates on `double` numbers. It takes a few tens of numbers as input and produces on the order of 10 numbers as output. On a good CPU the algorithm completes in roughly 400 microseconds for one set of input parameters. The algorithm needs to be run for different sets of input parameters; the number of sets ranges from 100 to 5000. All data sets are independent and available simultaneously (in RAM). The task is to finish recomputing all the data sets as quickly as possible. Questions:

- Is it possible to do this calculation on a GPU?
- How many sets could be processed simultaneously?
- What speedup can be expected compared to a CPU that walks through all the data sets sequentially, one after another?
- Where can bottlenecks and problems be expected?
- Where can I find a code sample that does something similar?

The task is purely mathematical and does not involve displaying anything on the screen. Thanks.
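A quick back-of-the-envelope sketch of the total sequential runtime implied by the figures above (400 µs per set, 100 to 5000 sets):

```python
# Total sequential CPU time for the workload described above,
# using the figures from the post (400 us per set, 100..5000 sets).
per_set_us = 400
n_sets_min, n_sets_max = 100, 5000

total_min_s = n_sets_min * per_set_us / 1e6
total_max_s = n_sets_max * per_set_us / 1e6
print(total_min_s, "to", total_max_s, "seconds sequentially")  # 0.04 to 2.0 seconds
```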

#### Re: Calculation on GPU

Hello, DKM_MSFT, you wrote:

DKM> The algorithm needs to be run for different sets of input parameters; the number of sets ranges from 100 to 5000. All data sets are independent and available simultaneously (in RAM).

So, all in all, 2 seconds at most? It seems to me that to begin with you should try parallelizing the calculations across all cores and bolting on some SSE4 on the CPU, and don't forget about caching.
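The SIMD part of this advice can be sketched with NumPy, which vectorizes elementwise math using SSE/AVX under the hood. The iterative update below is a made-up stand-in, since the real algorithm isn't shown in the thread:

```python
import numpy as np

def run_batched(inputs: np.ndarray, n_iter: int = 100) -> np.ndarray:
    """Run a placeholder iterative update over ALL data sets at once.

    Processing the whole (n_sets, n_inputs) array in one shot lets NumPy
    apply SIMD instructions instead of looping over sets in Python.
    """
    x = inputs.astype(np.float64)
    for _ in range(n_iter):
        x = 0.5 * (x + 1.0 / (1.0 + x * x))  # hypothetical update rule
    return x[:, :10]  # the post says ~10 output numbers per set

sets = np.random.rand(5000, 30)  # 5000 sets, a few tens of inputs each
out = run_batched(sets)
print(out.shape)  # (5000, 10)
```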

#### Re: Calculation on GPU

Hello, DKM_MSFT, you wrote:

DKM> I need to figure out whether it makes sense to use a GPU instead of the CPU for the following task.

https://www.youtube.com/watch?v=G-W0mVL … mqec_u7kcV

DKM> Where can I find a code sample that does something similar?

https://www.physics.drexel.edu/~vallier … l_2010.pdf
http://ecee.colorado.edu/~siewerts/extr … mples.html

#### Re: Calculation on GPU

Hello, Kernan, you wrote:

K> So, all in all, 2 seconds at most?

Roughly that.

K> It seems to me that to begin with you should try parallelizing the calculations across all cores and bolting on some SSE4 on the CPU, and don't forget about caching.

Those options are also under consideration, but with them I have much more clarity. I'd like to understand the prospects of doing this on a GPU.

#### Re: Calculation on GPU

Hello, kov_serg, you wrote:

DKM>> I need to figure out whether it makes sense to use a GPU instead of the CPU for the following task.

_> https://www.youtube.com/watch?v=G-W0mVL … mqec_u7kcV

Thanks, I'll take a look. It's just that before I start digging deep into the problem myself, maybe someone local can tell me whether it's worth it at all.

#### Re: Calculation on GPU

Hello, DKM_MSFT, you wrote:

DKM> The algorithm performs on the order of a hundred iterations and operates on `double` numbers.

Double precision is cut down severely on GPUs; it's more or less decent only on older cards. Peak FP64 on a Titan Black is 1881 GFLOPS, while the new GTX 1080 gives only 277! An i7-6950X, for comparison, gives 240!!! And CPUs are much more effective for algorithms with complex branching. http://www.geeks3d.com/20140305/amd-rad … computing/

If the algorithm has a lot of branching, it won't be accelerated on a GPU. Peak performance is squeezed out of a GPU only when many operations are performed over the data, for example multiplying things in a loop. If there are many conditionals, it stalls on them and can be extremely slow.
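One way to picture the branching penalty: when threads in a GPU warp diverge, the hardware effectively evaluates both sides of the branch and selects results by a mask, so you pay for both paths. That transformation can be sketched in NumPy (the formulas here are arbitrary examples, not from the thread):

```python
import numpy as np

x = np.random.randn(1_000_000)

# Branchy per-element logic:  y = sqrt(x) if x > 0 else x * x
# A GPU warp with mixed signs executes BOTH paths and selects by mask,
# which is exactly what this branch-free form makes explicit:
mask = x > 0.0
y = np.where(mask, np.sqrt(np.abs(x)), x * x)  # abs() avoids NaN on the unused path
print(y.shape)  # (1000000,)
```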

#### Re: Calculation on GPU

Hello, DKM_MSFT, you wrote:

DKM> Thanks, I'll take a look. It's just that before I start digging deep into the problem myself, maybe someone local can tell me whether it's worth it at all.

It seems to me that with fairly moderate effort those two seconds can be squeezed down severalfold on a normal CPU (in fact, simply parallelizing the task across 8 threads turns 2 seconds into 1/4 of a second, provided the threads don't end up competing for the FPU). The question is what effort you are prepared to put in to get a faster result, and how badly you need that speedup.
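The 8-thread suggestion can be sketched with the standard library; in CPython, processes rather than threads are needed for CPU-bound math. The per-set function below is a placeholder for the real algorithm, which the thread doesn't show:

```python
import math
from concurrent.futures import ProcessPoolExecutor

def process_set(params):
    """Placeholder for the real per-set algorithm: ~100 iterations on doubles."""
    x = sum(params) / len(params)
    for _ in range(100):
        x = 0.5 * (x + math.cos(x))  # made-up iterative update
    return [x] * 10  # the post says ~10 output numbers per set

def run_all(data_sets, workers=8):
    # Independent sets -> embarrassingly parallel across CPU cores.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_set, data_sets, chunksize=64))

if __name__ == "__main__":
    data = [[float(i + j) for j in range(30)] for i in range(1000)]
    print(len(run_all(data)))  # 1000
```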

#### Re: Calculation on GPU

Hello, Pzz, you wrote:

Pzz> The question is what effort you are prepared to put in to get a faster result, and how badly you need that speedup.

Let's assume I'm ready to spend any effort to get the fastest possible result. What can I expect at best?

#### Re: Calculation on GPU

Hello, DKM_MSFT, you wrote:

Pzz>> The question is what effort you are prepared to put in to get a faster result, and how badly you need that speedup.

DKM> Let's assume I'm ready to spend any effort to get the fastest possible result. What can I expect at best?

I don't know. But I think that, taking into account that the video card has to be initialized, the code loaded into it, and so on, it's tens to hundreds of milliseconds. That is, comparable to what you already have.

#### Re: Calculation on GPU

Hello, Pzz, you wrote:

Pzz> I don't know. But I think that, taking into account that the video card has to be initialized, the code loaded into it, and so on, it's tens to hundreds of milliseconds. That is, comparable to what you already have.

I didn't phrase the question quite right. We need to run this calculation not just once, but constantly throughout the day (with different data). That is, after processing one set of independent data, the next set will need to be computed some time later. If the very first set is computed slowly because of initialization, that's fine. What speedup can I expect on subsequent runs of the same algorithm on other data sets?
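The amortization being asked about can be made concrete with a small sketch (all figures below are illustrative assumptions, not measurements of any real card; only the 2-second CPU number comes from the thread):

```python
# One-time GPU setup (context creation, code upload) vs. repeated runs.
# init_ms and gpu_batch_ms are assumptions for the sketch, not measurements.
init_ms = 200.0        # hypothetical one-time initialization cost
gpu_batch_ms = 100.0   # hypothetical time to process one batch of 5000 sets
cpu_batch_ms = 2000.0  # 5000 sets x 400 us sequentially on the CPU (from the thread)

def avg_gpu_ms(n_runs):
    """Average per-run cost once initialization is amortized over n_runs batches."""
    return (init_ms + n_runs * gpu_batch_ms) / n_runs

print(avg_gpu_ms(1))    # 300.0 ms: the first run pays the setup cost
print(avg_gpu_ms(100))  # 102.0 ms: over a day of runs the setup vanishes
```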