Yep, it's known that running BRP4 and BRP5 tasks concurrently causes significant performance degradation.
Does anyone have experience with whether the same happens with BRP5+BRP6? I think it should not, since both use the same binaries (only the filenames differ).
Well, all three BRP searches use the same application binaries. The difference is purely in the data and parameter sets, which lead to different GPU memory usage, different transfer sizes between CPU and GPU memory, etc.
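As a back-of-the-envelope illustration (the FFT lengths and buffer counts below are made up, not the real BRP parameter sets), the same binary fed different parameters can end up with quite different GPU footprints:

/* Hypothetical illustration only -- sizes are assumptions,
 * not the actual BRP5/BRP6 parameters. */
#include <stdio.h>

int main(void)
{
    const struct { const char *search; long fft_len; int n_buffers; } cfg[] = {
        { "search A", 1L << 22, 4 },
        { "search B", 1L << 23, 4 },
    };
    for (int i = 0; i < 2; i++) {
        /* complex single-precision samples: 8 bytes each */
        double mib = (double)cfg[i].fft_len * 8.0 * cfg[i].n_buffers / (1 << 20);
        printf("%s: ~%.0f MiB of GPU buffers to allocate and transfer\n",
               cfg[i].search, mib);
    }
    return 0;
}

Two searches with different footprints sharing one GPU can therefore contend for memory and PCIe bandwidth even though the code is identical.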
BM
No tasks available?
At this moment it's still in the testing phase (see higher up in the thread).
Hi!
Thanks again for the feedback. The reports about some BRP6 workunits needing ca. 3 times the run time of BRP5G tasks got me worried enough to run some experiments, and YES, we do have a problem here: the OpenCL versions of our BRP apps have a performance bug that, only for some "unfortunate" values of the maximum search frequency in combination with other parameters, leads to a very inefficient way of running certain threads on the GPU. The maximum search frequency for BRP6, 300 Hz, happens to be such a value.
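To give a flavour of the class of problem (a generic sketch with assumed numbers, not our actual kernel setup): OpenCL requires the global work size to be a multiple of the work-group size, and launch code that derives the work-group size from the thread count can collapse badly for "unfortunate" counts:

/* Generic illustration only -- NOT the actual BRP kernel code.
 * If the launch code picks the work-group size as the largest
 * divisor of the thread count within the device limit, a thread
 * count with no small divisors forces a work-group size of 1,
 * which cripples GPU utilization. */
#include <stdio.h>

/* largest divisor of n that is <= limit */
static int pick_group_size(long n, int limit)
{
    for (int s = limit; s >= 1; s--)
        if (n % s == 0)
            return s;
    return 1;
}

int main(void)
{
    const int max_group = 256;          /* assumed device limit */
    const long counts[] = { 245760L,    /* 960 * 256: full work-groups */
                            245759L };  /* no divisor <= 256: collapses to 1 */

    for (int i = 0; i < 2; i++) {
        int g = pick_group_size(counts[i], max_group);
        printf("%ld threads -> work-group size %d (%.1f%% of limit)\n",
               counts[i], g, 100.0 * g / max_group);
    }
    return 0;
}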
I think I'll roll out a new app version early next week. This won't affect performance for BRP5; it will *just* prevent the performance degradation in BRP6.
Again, thanks for the useful feedback that pointed me to this problem.
Cheers
HB
The new BRP6 application version 1.47 has been published for Beta test. It features the optimizations from HBE as announced e.g. here, and the "latest and greatest" BOINC API code, including the fix mentioned by Richard e.g. here.
BM
I just tried to get beta test work. The machine is in a venue that has the beta test apps preference setting enabled. It did get a new BRP6 task, but it was for the 1.39 app version (NVIDIA GTX 650 GPU). Should beta test tasks be distributed now, or is that going to happen later?
Cheers,
Gary.
Could you post or PM the hostid?
BM
Edit: or try to find out for yourself in the scheduler logs why the plan classes BRP6-Beta-cuda32* were rejected.
Just tried Gary's experiment, with the same result (got v1.39). Relevant section of the server log is:
Looks as if 'BRP6-Beta-cuda32-nv301' is OK, but 'BRP5-cuda32-nv301' is better.
Thanks for spotting this!
More precisely: 'BRP6-Beta-cuda32-nv301' is not better than 'BRP5-cuda32-nv301'.
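A toy sketch of the kind of selection rule involved (illustrative only, not the actual Einstein@Home scheduler code): if candidate versions are ranked purely by projected speed, an equally fast Beta version never wins the comparison; preferring the opted-in Beta version on ties fixes that.

/* Illustrative toy only -- not the actual scheduler code. */
#include <stdio.h>

struct app_version {
    const char *plan_class;
    double projected_flops;
    int is_beta;
};

/* naive rule: the candidate must be strictly faster to win */
static const struct app_version *
pick_naive(const struct app_version *cur, const struct app_version *cand)
{
    return cand->projected_flops > cur->projected_flops ? cand : cur;
}

/* adjusted rule: an opted-in beta also wins on equal estimates */
static const struct app_version *
pick_fixed(const struct app_version *cur, const struct app_version *cand,
           int allow_beta)
{
    if (cand->projected_flops > cur->projected_flops)
        return cand;
    if (allow_beta && cand->is_beta &&
        cand->projected_flops >= cur->projected_flops)
        return cand;
    return cur;
}

int main(void)
{
    struct app_version stock = { "BRP5-cuda32-nv301",      1.0e11, 0 };
    struct app_version beta  = { "BRP6-Beta-cuda32-nv301", 1.0e11, 1 };

    printf("naive rule picks: %s\n", pick_naive(&stock, &beta)->plan_class);
    printf("fixed rule picks: %s\n", pick_fixed(&stock, &beta, 1)->plan_class);
    return 0;
}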
Fixed.
BM
That looks better:
I'll let you know how they got on when I've cleared the v1.39s out of the way.