8 errors on linux (CasA) v1.08 (GWopencl-ati-Beta). feedback.

siu77
Joined: 5 Oct 12
Posts: 8
Credit: 16367384
RAC: 7245
Topic 197571

CasA (GWopencl-ati-Beta) works fine on the same rig under Windows 8.1. Arecibo GPU tasks work fine on Linux. The errors appear on Linux only when I add GWopencl-ati-Beta.
However, 1 CasA WU was valid on Linux. 8 CasA and 2 Arecibo GPU tasks have errors on Linux.
Hope this helps.

http://einsteinathome.org/account/tasks&offset=0&show_names=1&state=5&appid=0

cpu: celeron g1820
mb: gigabyte ga-z87-hd3
mem: 2x4 kingston khx1866c9d3k2/8gx
videocard: xfx r7-260x-cnb4

Debian sid. Fglrx 14.4-2 from the repos.
3.14-1-amd64 #1 SMP Debian 3.14.2-1 (2014-04-28) x86_64 GNU/Linux
OpenGL vendor string: Advanced Micro Devices, Inc.
OpenGL renderer string: AMD Radeon R7 200 Series
OpenGL version string: 4.4.12874 Compatibility Profile Context 14.10.1006

einsteinbinary_BRP4G: .33 .34
einstein_S6CasA: .33 .34

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2747472
RAC: 1576

Try running fewer of them under Linux; three GWopencl-ati-Beta tasks at once may be too many there. Your host under Linux also reports less GPU memory.

Claggy

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118618090724
RAC: 18064346

Your Windows machine has done only 5 tasks of the S6CasA variety as opposed to more than 60 of the BRP4G type. Because of that, it seems unlikely that you ever had 3 concurrent S6CasA tasks. Maybe you didn't even have 2.

If you want three concurrent GPU tasks, maybe the secret is to limit the mix to 1 S6CasA and 2 BRP4G. Maybe you get the failure if 3 S6CasA (or even 2) try to run concurrently. By observation, you should be able to see this.

You could try, for the S6CasA section of app_config.xml, to control this by adding 1 immediately before the line.

With only 2 CPU cores available for support, performance would suffer if you have too many CasA GPU tasks running. You have one BRP4G result that took 20ksecs whereas more recent results (when running together) are taking only 8ksecs.

Cheers,
Gary.
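
The thread does not show the exact lines, so the following is only a sketch of what the suggested app_config.xml change might look like. It assumes the option meant is BOINC's max_concurrent, placed before the gpu_versions block of the S6CasA section, and that the .33 and .34 values from the first post are gpu_usage and cpu_usage respectively. Neither assumption is confirmed by the posts.

```xml
<app_config>
  <app>
    <name>einstein_S6CasA</name>
    <!-- Assumption: cap S6CasA at one running task at a time -->
    <max_concurrent>1</max_concurrent>
    <gpu_versions>
      <!-- Assumption: .33 is the GPU share per task, .34 the CPU share -->
      <gpu_usage>.33</gpu_usage>
      <cpu_usage>.34</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>einsteinbinary_BRP4G</name>
    <gpu_versions>
      <gpu_usage>.33</gpu_usage>
      <cpu_usage>.34</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

With gpu_usage at .33 the client can schedule up to three tasks on the GPU at once; a value of 0.5 would allow two. The file belongs in the Einstein@Home project directory and is picked up when the client re-reads its config files.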

siu77
Joined: 5 Oct 12
Posts: 8
Credit: 16367384
RAC: 7245

Now I'm crunching only GWopencl-ati-Beta.

On Windows 8.1, 4 units run simultaneously without any problems.

On Linux, 3 units had been running for more than 5 hours. When I changed app_config.xml to 0.5 0.5 and pressed "read config files", the system froze and stopped reacting to the keyboard or mouse.

After a reboot, 2 CasA units are running. So far so good. No errors.

So, maybe there is something wrong with the code itself. At least for 260x.
---------------------------------------------------
>your host under Linux also reports less GPU memory.
Yes, 1926MB. I've checked memory usage with GPU-Z in Windows.
4 Arecibo units need about 1.7GB.
4 CasA units need about 1.3GB.
So, 1926MB should be enough.
---------------------------------------------------
>You could try, for the S6CasA section of app_config.xml, to control this by adding 1 immediately before the line.
Thanks, I'll play with this option.

>You have one BRP4G result that took 20ksecs.
Games. )
I was comparing performance with 2, 3 and 4 Arecibo GPU units running simultaneously, using 1 CPU core. It's approximately equal on this card.

siu77
Joined: 5 Oct 12
Posts: 8
Credit: 16367384
RAC: 7245

Same error on Windows 8.1 with 4 CasA units.
I started GPU-Z and carelessly moved the mouse; GPU load dropped to 0%, the driver recovered, and 1 WU got an error. After a reboot the units returned to normal and GPU load is 100%.

So, I will stay away from beta for now.

Thank you for responding.

Tom*
Joined: 9 Oct 11
Posts: 54
Credit: 366729484
RAC: 0

Whenever I have had a driver restart for the GPU, any in-progress Einstein workunits usually end in error. Why that is, when it's supposed to do checkpointing, I have no idea. Checkpointing usually works OK on a power failure restart, at least on BRP4 and BRP5. I haven't tested this on CasA jobs.
Fix the driver restarts and you will fix the errors.

mikey
Joined: 22 Jan 05
Posts: 12820
Credit: 1881884328
RAC: 1095756

Quote:
Whenever I have had a driver restart for the GPU, any in-progress Einstein workunits usually end in error. Why that is, when it's supposed to do checkpointing, I have no idea. Checkpointing usually works OK on a power failure restart, at least on BRP4 and BRP5. I haven't tested this on CasA jobs.
Fix the driver restarts and you will fix the errors.

In Windows, BOINC sometimes has a problem restarting the GPU after checkpointing. Most of us have put a line in our cc_config.xml file to prevent it from happening, like this:

1

Use Notepad to copy and save the file into your BOINC directory, then tell the BOINC Manager to read the config files, and it should be fine.
