BRP4 1.31/1.32 GPU app release: feedback thread

Thorvin

Joined: 19 Feb 05

Posts: 8

Credit: 2640916

RAC: 0

Yes, after changing the Boinc

19 Nov 2012 9:09:56 UTC

Message 112975

Yes, after changing the BOINC version I get some tasks now, but the runtime is quite long compared to other tasks I can find (OpenCL too).

Is there another problem with my setup, or are the newer tasks just longer?

Markus

Thorvin

Joined: 19 Feb 05

Posts: 8

Credit: 2640916

RAC: 0

Interesting.... I did two

19 Nov 2012 10:39:01 UTC

Message 112976

Interesting....

I ran two WUs with a utilisation factor of 1, so a single WU on the GPU at a time, and both had very long runtimes...

Now I have changed to a factor of 0.5 (two at the same time), and the WUs run in about 2000 secs, like the ones I saw on other machines...

see here :
http://einsteinathome.org/host/6069022/tasks

Maybe in the first case the GPU task was competing with other tasks for half a CPU core, and now that the two have a complete CPU core to themselves it is better... Unfortunately I'm not at my PC at home right now, so no chance to verify...

Markus

Jord

Joined: 26 Jan 05

Posts: 2952

Credit: 5893653

RAC: 0

It's advised to free one CPU

19 Nov 2012 11:53:32 UTC

Message 112977 in response to message 112976

It's advised to free one CPU core, i.e. tell BOINC to use all but one CPU core, so that the GPU app can use this free core for itself. So in your case, set "On multiprocessors, use at most 87.5% of the processors".

Einstein OpenCL relies heavily on the CPU. Freeing one core will speed up calculations immensely.
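
For anyone with a different core count, here is a minimal sketch of the arithmetic behind that percentage. It assumes the 87.5% above corresponds to an 8-core host (7 of 8 cores); the function name is made up for illustration and is not part of BOINC.

```python
def cpu_percent_leaving_free(total_cores: int, free_cores: int = 1) -> float:
    """Percentage to enter in the 'use at most N% of the processors'
    preference so that `free_cores` cores stay idle for the GPU app."""
    if total_cores < 1 or not 0 <= free_cores < total_cores:
        raise ValueError("need 0 <= free_cores < total_cores")
    return 100.0 * (total_cores - free_cores) / total_cores


if __name__ == "__main__":
    # Assumed 8-core host: 7 of 8 cores is the 87.5% quoted above.
    print(cpu_percent_leaving_free(8))
    # A 4-core host would use 75% to keep one core free.
    print(cpu_percent_leaving_free(4))
```

If in doubt, round the result down rather than up, so that the core you meant to keep free is not handed back to CPU tasks.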

Oliver Behnke

Moderator

Administrator

Joined: 4 Sep 07

Posts: 987

Credit: 25171438

RAC: 0

RE: Einstein OpenCL relies

19 Nov 2012 13:05:14 UTC

Message 112978 in response to message 112977

Quote:

Einstein OpenCL relies heavily on the CPU. Freeing one core will speed up calculations immensely.

The more powerful a GPU is, the more CPU we need to "feed" it.

Oliver

Einstein@Home Project

Thorvin

Joined: 19 Feb 05

Posts: 8

Credit: 2640916

RAC: 0

Just to provide some numbers

19 Nov 2012 20:50:08 UTC

Message 112979

Just to provide some numbers ....

single GPU task without freeing a CPU core : 17000 secs
two GPU tasks without freeing a CPU core : 1700-1800 secs
two GPU tasks with a free CPU core : 1230 secs
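
To compare these runs fairly, it may help to divide each runtime by the number of tasks running at the same time. A rough sketch, using the figures above as posted and taking 1750 secs as an assumed midpoint of the 1700-1800 range:

```python
def seconds_per_task(runtime_s: float, concurrent: int) -> float:
    """Effective wall-clock seconds per finished task when `concurrent`
    tasks share the GPU and each takes `runtime_s` to complete."""
    return runtime_s / concurrent


def tasks_per_day(runtime_s: float, concurrent: int) -> float:
    """Approximate number of tasks finished in 24 hours."""
    return 86400 * concurrent / runtime_s


# (label, runtime in secs, tasks running at once)
RUNS = [
    ("single task, no free core", 17000, 1),
    ("two tasks, no free core", 1750, 2),   # assumed midpoint of 1700-1800
    ("two tasks, one free core", 1230, 2),
]

if __name__ == "__main__":
    for label, runtime_s, concurrent in RUNS:
        print(f"{label}: {seconds_per_task(runtime_s, concurrent):.0f} s/task, "
              f"{tasks_per_day(runtime_s, concurrent):.1f} tasks/day")
```

On these numbers, the last configuration works out to 615 secs of wall-clock time per finished task, against 875 secs without the free core.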

greetings from France
Markus

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5883

Credit: 119061938031

RAC: 24599706

RE: two GPU tasks without

20 Nov 2012 10:17:54 UTC

Message 112980 in response to message 112979

Quote:

two GPU tasks without freeing a CPU core : 1700-1800 secs
two GPU tasks with a free CPU core : 1230 secs

Take a look at these times reported for a HD7970 GPU. According to those, you should be able to do even better than you are currently - probably by freeing even more CPU cores :-). It would certainly be worthwhile trying to run x3, x4, x5, etc., to see if you get further improvement.

If you tried x5 (i.e. pref set to 0.2), BOINC would force 2 cores to stay unused, and you could then free up more cores, until you had no running CPU tasks at all. If you did this systematically, it would provide valuable information about the optimal running conditions. I suspect that your x2 results are not as good as the ones linked to because (perhaps) the linked results were produced with no CPU tasks being run concurrently.
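
As a rough illustration of how the x2, x3, x5 shorthand maps onto the utilisation factor preference, here is a small sketch. It assumes the number of simultaneous tasks is simply how many whole multiples of the factor fit into one GPU, which matches the 0.5 -> x2 and 0.2 -> x5 cases mentioned in this thread; the helper names are hypothetical and not part of BOINC.

```python
import math


def tasks_per_gpu(factor: float) -> int:
    """Number of tasks that fit on one GPU for a given utilisation factor
    (assumed to be floor(1 / factor))."""
    if not 0 < factor <= 1:
        raise ValueError("factor must be in (0, 1]")
    # Small epsilon guards against floating-point results just below an integer.
    return math.floor(1 / factor + 1e-9)


def factor_for(n_tasks: int) -> float:
    """Utilisation factor to enter in order to run `n_tasks` at once."""
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    return 1 / n_tasks


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        f = factor_for(n)
        print(f"x{n}: factor {f:.3f} -> {tasks_per_gpu(f)} task(s) per GPU")
```

How many CPU cores BOINC then reserves for those GPU tasks depends on the CPU fraction assigned to each task, which this sketch does not model.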

HD7970 results are quite interesting at the moment. The top pages of the top computers list have been dominated by NVIDIA-equipped hosts, but now there are quite a few new ones with 7970s suddenly right up there. The commentary from the Devs has (for quite a while) been that it's easier to optimise performance of apps with CUDA tools - everything is more mature, more efficient, etc. - so it's very interesting to see a 'less mature, lower efficiency' OpenCL app suddenly doing extremely well on that particular GPU. It will be interesting to see if a flood of ATI/AMD GPUs starts to arrive to even up the imbalance you can see on the server status page. See the 'GPU Productivity' section.

One final comment. When I looked at your results list, I noticed you only had 2 tasks 'in progress' and that there was about a 3 minute delay between one task finishing and a replacement starting. This was obviously the delay whilst a replacement was being downloaded. You really need to keep some extra tasks in your cache of work. Probably rather a *lot* more would be appropriate :-).

Cheers,
Gary.

Oliver Behnke

Moderator

Administrator

Joined: 4 Sep 07

Posts: 987

Credit: 25171438

RAC: 0

RE: The commentary from the

21 Nov 2012 10:50:11 UTC

Message 112981 in response to message 112980

Quote:

The commentary from the Devs has (for quite a while) been that it's easier to optimise performance of apps with CUDA tools - everything is more mature, more efficient, etc, so it's very interesting to see a 'less mature, lower efficiency' OpenCL app suddenly doing extremely well on that particular GPU.

The point being that, IMHO, ATI GPUs have always been better than NVIDIA GPUs, hardware-wise that is. The problem with AMD has been a) the lack of a sophisticated GPGPU programming ecosystem, b) the sad quality of their drivers and c) their poor responsiveness to community input (like bug reports). This is where NVIDIA still excels. However, the raw hardware power of ATI GPUs makes up for more and more of what is lacking on the software side...

Oliver

Einstein@Home Project

robertmiles

Joined: 8 Oct 09

Posts: 127

Credit: 29950866

RAC: 10888

Alex, RE: Well,

23 Nov 2012 6:04:16 UTC

Message 112982 in response to message 112946

Alex,

Quote:

Well, looks like there is a problem.
My nVidia system is no longer getting work for GTX550.
Message says:

2012-11-13 15:06:33.6218 [PID=26262] [version] Checking plan class 'BRP4SSE'
2012-11-13 15:06:33.6218 [PID=26262] [version] project prefs setting 'also_run_cpu' (1.000000) prevents using plan class.
2012-11-13 15:06:33.6218 [PID=26262] [version] Checking plan class 'BRP4cuda32'
2012-11-13 15:06:33.6219 [PID=26262] [version] parsed project prefs setting 'gpu_util_brp': 0.500000
2012-11-13 15:06:33.6219 [PID=26262] [version] driver version required max: -29053, supplied: 30697

I don't know which part is of interest, but I hope this can help.

Regards
Alexander

Just a guess: The lines above look like the only GPU workunits available at the time might have required using a full CPU core as well as a GPU, but your setting prevented it from getting the full CPU for such workunits. Could you check if this is correct?

Richard Haselgrove

Joined: 10 Dec 05

Posts: 2143

Credit: 3002141867

RAC: 701388

RE: Alex, Just a guess:

23 Nov 2012 9:46:58 UTC

Message 112983 in response to message 112982

Quote:

Alex,

Just a guess: The lines above look like the only GPU workunits available at the time might have required using a full CPU core as well as a GPU, but your setting prevented it from getting the full CPU for such workunits. Could you check if this is correct?

Robert, you've gone back and pulled up a very old post (13 November).

There was a configuration problem on the server that day - Oliver fixed it on 14 November.

Alex

Joined: 1 Mar 05

Posts: 451

Credit: 520516820

RAC: 241971

RE: RE: Alex, Just a

23 Nov 2012 16:39:42 UTC

Message 112984 in response to message 112983

Quote:

Quote:
Alex,

Just a guess: The lines above look like the only GPU workunits available at the time might have required using a full CPU core as well as a GPU, but your setting prevented it from getting the full CPU for such workunits. Could you check if this is correct?

Robert, you've gone back and pulled up a very old post (13 November).

There was a configuration problem on the server that day - Oliver fixed it on 14 November.

Anyway, THX for the effort.
Alexander
