i've searched a while for some information like this
now i can add some
7870 running arecibo 1.32 opencl
1x 18 minutes (about 1100s)
2x 29 minutes (about 1800s) (14.5 min per WU)
3x 46 minutes (about 2800s) (15.33 min per WU)
gpu usage measured via MSI Afterburner: with 3 tasks it is still at 60% with 90% peaks, so there seems to be not much difference to 2 tasks
cpu is a 3570k slightly overclocked to 4.2 GHz, with one core free for gpu feeding. without the free core the performance is really bad, about 2.5 hours for one task (compared to 18 minutes, that is really bad out of the box performance i think)
Quote:
i've searched a while for some information like this
now i can add some
7870 running arecibo 1.32 opencl
1x 18 minutes (about 1100s)
2x 29 minutes (about 1800s) (14.5 min per WU)
3x 46 minutes (about 2800s) (15.33 min per WU)
You should have asked sooner :-).
For quite a while, E@H has had a special project preference called 'GPU utilisation factor' which allows you to run multiple GPU tasks simultaneously. You don't need to have a 'bleeding edge' version of BOINC to use it either. With the app config feature, BOINC is catching up with what Bernd had put in place for multiple simultaneous GPU tasks. It does add the ability to tweak the CPU and GPU usage values as well, but I don't think that will be of much benefit until the opencl-ati app improves further - at which point the Devs will tweak the default values anyway.
Quote:
gpu usage measured via MSI Afterburner: with 3 tasks it is still at 60% with 90% peaks, so there seems to be not much difference to 2 tasks
With 2 tasks, BOINC would have automatically kept one CPU core free. When you changed to 3x, the total CPU requirement would have changed to 1.5 cores, so there would still have been only 1 CPU core kept free. Did you try changing prefs to free up a second core? If you do that, I would guess that you might see a further improvement in GPU utilisation and a reduction in crunch time. Probably not enough improvement to compensate for the loss of a CPU core to crunch CPU tasks, but you don't know until you try :-).
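The core arithmetic in the reply above can be sketched in a few lines of Python. This is only an illustration, not BOINC's actual scheduler code: it assumes each GPU task budgets 0.5 of a CPU core (inferred from "3x gives 1.5") and that only whole cores are set aside. The function name is made up for the example.

```python
import math

# Assumption: each GPU task budgets 0.5 of a CPU core, as implied by "3x -> 1.5".
CPU_PER_GPU_TASK = 0.5


def reserved_cpu_cores(gpu_tasks: int, cpu_per_task: float = CPU_PER_GPU_TASK) -> int:
    """Whole CPU cores kept free for feeding the GPU (hypothetical helper)."""
    total_requirement = gpu_tasks * cpu_per_task
    # Only complete cores are taken away from CPU crunching in this sketch.
    return math.floor(total_requirement)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        requirement = n * CPU_PER_GPU_TASK
        print(f"{n}x: requirement {requirement:.1f}, cores kept free: {reserved_cpu_cores(n)}")
```

Under these assumptions 2x and 3x both leave one core free and 4x leaves two, which matches what is described in this thread.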
Quote:
cpu is a 3570k slightly overclocked to 4.2 GHz, with one core free for gpu feeding. without the free core the performance is really bad, about 2.5 hours for one task (compared to 18 minutes, that is really bad out of the box performance i think)
Slightly overclocked :-). I know that overclock is very easy to achieve but I would hardly call it "slight" :-). It's a very decent overclock indeed and if they can get a bit cheaper, it would make them quite attractive for future budget builds.
EDIT: I've just noticed that you are using BOINC 7.0.28 which means you can't be using the app configuration feature described by Jord since you need the latest 7.0.42 for that. So are you using the special GPU utilisation pref setting after all? Because your message immediately followed Jord's report on the app config feature, I assumed you were using what he had just described. I should have checked your host before making that assumption :-).
Cheers,
Gary.
sorry if my post caused misunderstanding, i wasn't referring to the app_config variant, but the start post/topic in general
i only use the settings in the project preferences themselves. i had a lot of problems getting an app_info.xml to work in other projects, so i don't use them anymore; they just produce too many errors unless i constantly check them for correctness
mhm, i hope you understand what i want to say, i'm afraid my english is not the best
as for giving 2 cores to gpu feeding, i just forgot about that. i will try it again with 2 cores; maybe even 4 tasks are possible without filling the vram
4.2 ghz i would call slightly overclocked, because the single cores are guaranteed to work at 3.8, so it is only about 10% more clock speed at effectively the same voltage my cpu uses with turbo boost activated. as long as one does not have to give more voltage to the cores i would call the overclocking 'slight', or am i wrong?
Quote:
sorry if my post caused misunderstanding, i wasn't referring to the app_config variant, but the start post/topic in general
That's quite OK. My fault - I made a bad assumption which I later corrected when I looked closely at your BOINC version.
Quote:
i only use the settings in the project preferences themselves. i had a lot of problems getting an app_info.xml to work in other projects, so i don't use them anymore; they just produce too many errors unless i constantly check them for correctness
Yes, I quite agree. Anonymous platform (AP) can be tricky. Especially when you want to back out of it. It also requires a lot of manual intervention every time the app version changes. That's why I'm so interested in app config. No changes to be made when the app version changes.
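For anyone curious what an app config file looks like, here is a small Python sketch that writes one out. The element names (`app_config`, `app`, `name`, `gpu_versions`, `gpu_usage`, `cpu_usage`) follow the app_config.xml layout as I understand it from the BOINC documentation, so please check them against the docs for your client version. The app name `einsteinbinary_BRP4` and the 0.5/0.5 values are assumptions for illustration only, and `build_app_config` is a made-up helper.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, gpu_usage: float, cpu_usage: float) -> str:
    """Return the text of a minimal app_config.xml (illustrative sketch)."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # gpu_usage 0.5 means two tasks share one GPU; cpu_usage is the CPU budget per task.
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    if hasattr(ET, "indent"):  # pretty-printing is only available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumed app name and values; adjust to whatever your project actually uses.
    print(build_app_config("einsteinbinary_BRP4", 0.5, 0.5))
```

The output would go into a file called app_config.xml in the project's directory. Because the file refers to the app by name rather than by version, it should not need editing when a new app version is released.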
Quote:
mhm, i hope you understand what i want to say, i'm afraid my english is not the best
Your English is fine - far better than my non-existent German. I'm sure I understand you perfectly.
Quote:
as for giving 2 cores to gpu feeding, i just forgot about that. i will try it again with 2 cores; maybe even 4 tasks are possible without filling the vram
When you try with 2 free cores, please post your findings. It would be very interesting to know, for a 7870, if you get a significant improvement when running 3x. Your GPU has 2GB RAM which is enough to run 4x. You would automatically have 2 free cores for this. It would be useful to see if adding a third free core made any further improvement. Most people are using NVIDIA GPUs so good data about AMD is not very common in this thread.
Quote:
4.2 ghz i would call slightly overclocked, because the single cores are guaranteed to work at 3.8, so it is only about 10% more clock speed at effectively the same voltage my cpu uses with turbo boost activated. as long as one does not have to give more voltage to the cores i would call the overclocking 'slight', or am i wrong?
You are not really wrong but it would probably be more correct to describe the overclock as 'moderate' or 'medium' rather than 'slight'. 'Slight' means 'quite small' or 'hardly noticeable'. If you increased the stock speed from 3.4GHz to say 3.6 - 3.7GHz, you could call that slight. 4.2GHz is very noticeable :-). After all, even some professional overclockers using voltage and fancy cooling solutions sometimes have trouble getting past 4.6-4.8GHz.
In any case, I was really just making a joke - there was absolutely no criticism intended :-).
Cheers,
Gary.
the behavior of BRPS is a little bit strange
on 7870, running at 1150 MHz chip clock, 1300 MHz RAM clock
3570k running at 4.2 GHz
3x needs about 45 minutes per WU
4x needs about an hour, varying from 3237s (http://einsteinathome.org/workunit/142484182) up to 3700s, but apart from really good runs and disturbed ones i see a relatively stable 59 minutes, which makes 3540s
if i ignore the 50MHz higher clock since the 1x and 2x tries i get
1x 1080s per WU
2x 880s per WU
3x 900s per WU
4x 885s per WU
as i don't calculate an accurate average and disturb the 'benchmarks' with silly things like using the computer myself, i would say that 2 parallel tasks are enough to make good, maybe optimal, use of the gpu. it is also the easiest setting on my system, since it occupies one core and not two
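The per-WU figures above come from dividing the elapsed time of a batch by the number of tasks running at once. A minimal Python sketch of that arithmetic, using only the numbers reported in this post (the helper names are made up for illustration):

```python
SECONDS_PER_DAY = 86_400

# Effective seconds per WU as reported above for 1x to 4x on the 7870.
per_wu_seconds = {1: 1080, 2: 880, 3: 900, 4: 885}


def effective_seconds_per_wu(batch_seconds: float, concurrent_tasks: int) -> float:
    """Elapsed time of one batch divided by the tasks that ran in parallel."""
    return batch_seconds / concurrent_tasks


def wus_per_day(seconds_per_wu: float) -> float:
    """Work units finished per day at a given effective time per WU."""
    return SECONDS_PER_DAY / seconds_per_wu


if __name__ == "__main__":
    baseline = per_wu_seconds[1]
    for tasks, seconds in per_wu_seconds.items():
        gain = baseline / seconds - 1
        print(f"{tasks}x: {seconds}s per WU, {wus_per_day(seconds):.1f} WUs/day, {gain:+.1%} vs 1x")
```

For example, the stable 4x batch time of 3540s divided by 4 gives the 885s listed above. Since these are rough single observations rather than averages, small differences between 2x, 3x and 4x should not be over-interpreted.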
interestingly, two cores for the gpu give no advantage compared to one core. i use the tool 'process lasso' to bind the gpu tasks to one fixed core and to exclude most of the other tasks on my pc from that core, because i have found that 'core hopping' often eats a decent amount of performance for absolutely nothing
as far as i can see, every process that doesn't use multiple cores at the same time profits from fixed core binding under win7 (prof)
it seems to be a really awful scheduler that ignores the fact that cpu cache hits are better than cache misses
unfortunately i have no idea how to bind each application to a specific core, i can only get an exclusive core for gpu applications
it would be interesting to know if there is ANY single core application that profits from the ability to hop from one core to another
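On binding a single application to a specific core: one option besides Process Lasso is the Windows `start` command, which accepts an `/affinity` switch with a hexadecimal bitmask (bit 0 is the first core). The Python sketch below only builds the mask and the command line; it does not launch anything. The executable name is a placeholder and the helper names are made up for illustration.

```python
from typing import Iterable


def affinity_mask(cores: Iterable[int]) -> int:
    """Bitmask with one bit set per allowed core (cores are counted from 0)."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core index must not be negative")
        mask |= 1 << core
    return mask


def start_command(executable: str, cores: Iterable[int]) -> str:
    """Command line for cmd.exe that starts a program pinned to the given cores."""
    # The empty "" is the window title argument that `start` expects before the switches.
    return f'start "" /affinity {affinity_mask(cores):X} "{executable}"'


if __name__ == "__main__":
    # Hypothetical program pinned to the fourth core (index 3) of a quad core cpu.
    print(start_command("some_app.exe", [3]))
```

This pins a program at launch only. Whether it can be applied to tasks that the BOINC client starts by itself is a separate question that this sketch does not address.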
edit:
hey, seems like there is something positive about ati
the 7870 seems to be faster than 660OC ;)
1 unit 1,863.28s to 1,927.96s. 2 units 3,423.72s to 3,356.81s. System specs in signature. 7 cpu's working 1 free core. GPU not overclocked. So 1 unit is around 34 mins and 2 units gets done around 58 mins. 1 and 2 units run the same GPU Load and Memory Load. "roughly very little difference"
PC setup MSI-970A-G46 AMD FX-8350 8 core OC'd 4.45GHz 16GB ram PC3-10700 Geforce GTX 650Ti Windows 7 x64 Einstein@Home
sorry can't seem to edit my last post now... :( I am testing the 306.97 driver now and so far it seems about a min. faster at times. Will post more after some more testing and times.
PC setup MSI-970A-G46 AMD FX-8350 8 core OC'd 4.45GHz 16GB ram PC3-10700 Geforce GTX 650Ti Windows 7 x64 Einstein@Home
Updated list after 2 Weeks without change ;)
http://www.dskag.at/images/Research/EinsteinGPUperformancelist.pdf
Happy new Year :)
DSKAG Austria Research Team: http://www.research.dskag.at
HD6790 1GB GDDR5 (with AMD Phenom II 945) gets 1 WU in about 2900 sec (2850-2950).
My AMD gets one BRP4SSE in 124,500 sec. One WU for pulsar search #2 is done in 15,200 sec.
Hi,
The new Linux driver Catalyst 13.1 decreases runtime by about 10%! (AMD/ATI HD 5770, Juniper)
12.10 runtime ~ 4300
13.1 runtime ~ 3900