Einstein FGRPB1G Linux/Nvidia Special app "AIO"

biodoc

Joined: 30 Aug 09

Posts: 1

Credit: 1879044714

RAC: 3412763

4 Jun 2022 16:32:05 UTC

Message 197263

Thanks Ian and Petri. The v0.95 app is up and running on my 3 gpus. Well done!

tito

Joined: 10 Jun 06

Posts: 28

Credit: 1246074391

RAC: 679339

4 Jun 2022 16:50:10 UTC

Message 197265 in response to message 197258

Ian&Steve C. wrote:

just keep an eye on your error rate. if you see a lot more errors, it might make sense to swap over to the 0.95 app.

Nothing unusual so far. Will keep an eye on error ratio.

gordonbb

Joined: 14 May 19

Posts: 26

Credit: 895570568

RAC: 0

7 Jun 2022 19:49:20 UTC

Message 197409

Thanks @petri33 & @Ian&Steve C. for making this available. I'm fairly new at BOINC so I don't quite get the nuances of configuring the Anonymous Platform.

Though I tried to set:

    <coproc>
      <type>NVIDIA</type>
      <count>0.5</count>
    </coproc>

in app_info.xml on systems with 8GB VRAM, once I reloaded boinc-client the second Task would fail with a Computational Error:

<pre>
<core_client_version>7.18.1</core_client_version>
<![CDATA[
<message>
process exited with code 65 (0x41, -191)</message>
<stderr_txt>
13:29:23 (3697396): [normal]: This Einstein@home App (v1.0 by petri33) was built at: Apr 28 2022 18:47:15

13:29:23 (3697396): [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/HSgammaPulsar_x86_64-pc-linux-gnu-opencl_v1.0'.
13:29:23 (3697396): [debug]: 1e+16 fp, 6.4e+09 fp/s, 1647640 s, 457h40m40s30
13:29:23 (3697396): [normal]: % CPU usage: 1.000000, GPU usage: 0.500000
command line: ../../projects/einstein.phys.uwm.edu/HSgammaPulsar_x86_64-pc-linux-gnu-opencl_v1.0 --inputfile ../../projects/einstein.phys.uwm.edu/LATeah3012L11.dat --alpha 2.59819959601 --delta -0.694603692878 --skyRadius 1.890770e-06 --ldiBins 15 --f0start 764.0 --f0Band 8.0 --firstSkyPoint 0 --numSkyPoints 1 --f1dot -1e-13 --f1dotBand 1e-13 --df1dot 1.69860773e-15 --ephemdir ../../projects/einstein.phys.uwm.edu/JPLEPH --Tcoh 2097152.0 --toplist 10 --cohFollow 10 --numCells 1 --useWeights 1 --Srefinement 1 --CohSkyRef 1 --cohfullskybox 1 --mmfu 0.1 --reftime 56100 --model 0 --f0orbit 0.005 --mismatch 0.1 --demodbinary 1 --BinaryPointFile ../../projects/einstein.phys.uwm.edu/templates_LATeah3012L11_0772_33406920.dat --debug 0 -o LATeah3012L11_772.0_0_0.0_33406920_0_0.out
output files: 'LATeah3012L11_772.0_0_0.0_33406920_0_0.out' '../../projects/einstein.phys.uwm.edu/LATeah3012L11_772.0_0_0.0_33406920_0_0' 'LATeah3012L11_772.0_0_0.0_33406920_0_0.out.cohfu' '../../projects/einstein.phys.uwm.edu/LATeah3012L11_772.0_0_0.0_33406920_0_1'
13:29:23 (3697396): [debug]: Flags: X64 SSE SSE2 GNUC X86 GNUX86
13:29:23 (3697396): [debug]: glibc version/release: 2.35/stable
13:29:23 (3697396): [debug]: Set up communication with graphics process.
EAH_SLEEP file found, value 0

kernel_compact 256 threads
kernel_raz 256 threads
kernel_ts_2_phase_diff_sorted 64 threads
kernel_prepare_power_toplist 256 threads
kernel_prepareSort 1024 threads
kernel_SortedPhoton 64 threads
kernel_setupPhotonPairsArray 64 threads
kernel_extractPhotonIndex 512 threads
Eah sleep true, 0
boinc_get_opencl_ids returned [0x55efcd653c40 , 0x55efcd649af0] 
Using OpenCL platform provided by: NVIDIA Corporation
Using OpenCL device "NVIDIA GeForce GTX 1070 Ti" by: NVIDIA Corporation
Max allocation limit: 2127691776
Global mem size: 8510767104
Could not open file: /tmp/dep-b9eb4b.d
OpenCL device has FP64 support
Could not open file: /tmp/dep-272e61.d
SemiCoh mode 0 start
skypoints(1)read_checkpoint(): Couldn't open file 'LATeah3012L11_772.0_0_0.0_33406920_0_0.out.cpt': No such file or directory (2)
skypoint loop(1)
S0:
binpoints loop 639
set_up_fft samples:16777216
% fft length: 16777216(0x1000000)
Using alternate fft kernel file: ../../clfft.kernel.Transpose2.cl.alt
Could not open file: /tmp/dep-a7ca6a.d
Using alternate fft kernel file: ../../clfft.kernel.Stockham3.cl.alt
Could not open file: /tmp/dep-ee75f4.d
Using alternate fft kernel file: ../../clfft.kernel.Transpose4.cl.alt
Could not open file: /tmp/dep-3d35bf.d
Using alternate fft kernel file: ../../clfft.kernel.Stockham5.cl.alt
Could not open file: /tmp/dep-e68c2c.d
Using alternate fft kernel file: ../../clfft.kernel.Transpose6.cl.alt
Could not open file: /tmp/dep-743a39.d
% Scratch buffer size: 136314880
ZError in OpenCL context: Unknown error executing clFlush on NVIDIA GeForce GTX 1070 Ti (Device 0).

... {above repeated many times } ...
Failed to allocate tmp buffer for photon data
13:29:28 (3697396): [CRITICAL]: ERROR: MAIN() returned with error '1'
FPU status flags: 
mv: cannot stat 'LATeah3012L11_772.0_0_0.0_33406920_0_0.out': No such file or directory
mv: cannot stat 'LATeah3012L11_772.0_0_0.0_33406920_0_0.out.cohfu': No such file or directory
13:29:28 (3697396): [normal]: done. calling boinc_finish(65).
13:29:28 (3697396): called boinc_finish(65)
Warning:  Program terminating, but clFFT resources not freed.
Please consider explicitly calling clfftTeardown( ).

</stderr_txt>
]]></pre>

Still, running just 1 Task/GPU I'm seeing a 45% decrease in time compared to the stock application (EVGA 1070ti @ 90W; Ubuntu 22.04 LTS; NVIDIA 510.73.05) and similar gains on a 2060, 2060 Super and a 1660ti.
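For anyone keeping an eye on error rates, the stderr above can be boiled down to a few numbers. The following is only a rough sketch: the function name `summarize_stderr` and the returned keys are made up for illustration, and the sample text is a handful of lines copied from the log in this post. It pulls out the reported memory sizes, counts the clFlush errors, and notes whether the buffer allocation failed.

```python
import re

# A few lines copied from the stderr posted above (not the full log).
SAMPLE = """\
Max allocation limit: 2127691776
Global mem size: 8510767104
% Scratch buffer size: 136314880
ZError in OpenCL context: Unknown error executing clFlush on NVIDIA GeForce GTX 1070 Ti (Device 0).
Failed to allocate tmp buffer for photon data
13:29:28 (3697396): called boinc_finish(65)
"""


def summarize_stderr(text: str) -> dict:
    """Extract memory figures and error indicators from a task's stderr text.

    Hypothetical helper: the key names are illustrative, not part of BOINC.
    """

    def grab(label: str):
        # Match "<label>: <integer>" on a line of its own.
        m = re.search(rf"^{re.escape(label)}:\s*(\d+)\s*$", text, re.MULTILINE)
        return int(m.group(1)) if m else None

    exit_match = re.search(r"called boinc_finish\((\d+)\)", text)
    return {
        "max_alloc_bytes": grab("Max allocation limit"),
        "global_mem_bytes": grab("Global mem size"),
        "scratch_bytes": grab("% Scratch buffer size"),
        "clflush_errors": len(re.findall(r"Unknown error executing clFlush", text)),
        "alloc_failed": "Failed to allocate tmp buffer" in text,
        "exit_code": int(exit_match.group(1)) if exit_match else None,
    }


if __name__ == "__main__":
    for key, value in summarize_stderr(SAMPLE).items():
        print(f"{key}: {value}")
```

One thing the numbers show: the reported maximum single allocation is exactly one quarter of the reported global memory size on this card.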

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18718876169

RAC: 6379427

7 Jun 2022 20:16:04 UTC

Message 197410 in response to message 197409

You don't make the change in task concurrency in the coproc_info.xml file. That file is autogenerated by the client's detection of the system gpus. It is not meant to be tampered with by the user.

You make the change to crunch multiple tasks concurrently on the gpu either in the project's Computing Preferences settings or in an app_config.xml file, which needs to be written by the user.

So change either here:

Project Preferences >> GPU utilization factor of FGRP apps: 1.00 >> 0.50

or here:

    <app_config>
      <app>
        <name>hsgamma_FGRPB1G</name>
        <gpu_versions>
          <gpu_usage>0.5</gpu_usage>
          <cpu_usage>0.9</cpu_usage>
        </gpu_versions>
      </app>
    </app_config>

The app_config.xml file goes into the project directory >> einstein.phys.uwm.edu

Choose one or the other method. Not both. Project Preferences is the easiest.
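If you would rather generate the file than type it by hand, something like the sketch below should do. It rebuilds the same app_config.xml layout as in this post with Python's standard library. The helper names `build_app_config` and `tasks_per_gpu` are made up for illustration, and the "floor(1 / gpu_usage)" rule is a simplification of how the client schedules fractional GPU usage, so treat the task count as an estimate.

```python
import math
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, gpu_usage: float, cpu_usage: float) -> str:
    """Return app_config.xml text for a single app (hypothetical helper)."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(versions, "cpu_usage").text = str(cpu_usage)
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


def tasks_per_gpu(gpu_usage: float) -> int:
    """Estimate concurrent tasks per GPU, assuming each reserves gpu_usage of it."""
    if gpu_usage <= 0:
        raise ValueError("gpu_usage must be positive")
    # Small epsilon guards against float rounding such as 1 / 0.1.
    return math.floor(1.0 / gpu_usage + 1e-9)


if __name__ == "__main__":
    # Values taken from the example in this post.
    print(build_app_config("hsgamma_FGRPB1G", 0.5, 0.9))
    print("tasks per GPU:", tasks_per_gpu(0.5))
```

Save the printed XML as app_config.xml in the einstein.phys.uwm.edu project directory, as described above.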

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3952

Credit: 46789852642

RAC: 64202147

7 Jun 2022 20:58:48 UTC

Message 197414

Keith, that coproc section he posted is actually from the app_info file, not the coproc_info file.

but yeah, use an app_config to do it.

Ian&Steve C.

Joined: 19 Jan 20

Posts: 3952

Credit: 46789852642

RAC: 64202147

7 Jun 2022 21:08:52 UTC

Message 197416 in response to message 197409

gordonbb wrote:

Thanks @petri33 & @Ian&Steve C. for making this available. I'm fairly new at BOINC so I don't quite get the nuances of configuring the Anonymous Platform.

Though I tried to set:

    <coproc>
      <type>NVIDIA</type>
      <count>0.5</count>
    </coproc>

in app_info.xml on systems with 8GB VRAM, once I reloaded boinc-client the second Task would fail with a Computational Error:

[stderr log snipped; identical to the one in message 197409]

this is the same problem that's popped up for a few folks (mostly Keith) with Ryzen systems.

if you keep getting a lot of errors, you could consider running 2x v0.95 app tasks, which might be faster than 1x v1.0 task.

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18718876169

RAC: 6379427

7 Jun 2022 21:20:47 UTC

Message 197420 in response to message 197416

Thanks for the corrections, Ian. I breezed over the post too fast to read what was really going on.

Yes, with the clFlush errors, you need to drop back to the v0.95 version. That stops the errors on my Ryzen hosts.

Still faster than the stock 1.28 application.

gordonbb

Joined: 14 May 19

Posts: 26

Credit: 895570568

RAC: 0

7 Jun 2022 21:47:42 UTC

Message 197422 in response to message 197416

Ian&Steve C. wrote:

this is the same problem that's popped up for a few folks (mostly Keith) with Ryzen systems.

if you keep getting a lot of errors, you could consider running 2x v0.95 app tasks, which might be faster than 1x v1.0 task.

Thank you @Ian&Steve C. & @Keith Myers.

I'll revert to the 0.95 version (yes, these are Ryzen systems) and put the app_config.xml file that I removed back and give it a try.

gordonbb

Joined: 14 May 19

Posts: 26

Credit: 895570568

RAC: 0

8 Jun 2022 0:24:52 UTC

Message 197431

Curious.

In my specific use case, running my GPUs at their lowest Power-Limit (Pascal) or at a reduced graphics clock (Turing), the 1.0 version significantly outperforms the 0.95 version, to the point that running 1 Task/GPU on the 1.0 version outperforms 2 Tasks/GPU on the 0.95 version.

For my 1070Ti, for example, the 1.0 version is 45% faster than the native Application, but the 0.95 version is only 24.7% faster with 1 Task/GPU and 22.2% faster with 2 Tasks/GPU.
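To put those percentages on a common footing, here is a small worked calculation. It assumes that "X% faster" means an X% reduction in effective time per task relative to the stock application (which matches the "45% decrease in time" wording earlier in the thread), and that the 2 Tasks/GPU figure is already normalised per task. Both are assumptions, and the function name `throughput_gain` is made up for illustration.

```python
def throughput_gain(time_reduction_pct: float) -> float:
    """Tasks-per-hour multiplier versus stock for a given % cut in per-task time.

    If a task takes (1 - p) of the stock time, throughput is 1 / (1 - p) of stock.
    """
    if not 0.0 <= time_reduction_pct < 100.0:
        raise ValueError("time_reduction_pct must be in [0, 100)")
    return 1.0 / (1.0 - time_reduction_pct / 100.0)


if __name__ == "__main__":
    # Percentages quoted in this post for the 1070Ti.
    cases = {
        "v1.0, 1 Task/GPU": 45.0,
        "v0.95, 1 Task/GPU": 24.7,
        "v0.95, 2 Tasks/GPU": 22.2,
    }
    for label, pct in cases.items():
        print(f"{label}: {throughput_gain(pct):.2f}x stock throughput")
```

Under that reading, the 1.0 version comes out at roughly 1.8x stock throughput, against roughly 1.3x for either 0.95 configuration.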

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18718876169

RAC: 6379427

8 Jun 2022 2:14:34 UTC

Message 197435 in response to message 197431

If I understand correctly, you are able to run the v1.0 application with just a single task per gpu and it doesn't error out?

If so that is a new datapoint for troubleshooting the application on Ryzen systems and 8GB cards.
