Gravitational Wave Engineering run on LIGO O1 Open Data

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7221554931

RAC: 967170

4 Apr 2019 14:10:19 UTC

Message 170511 in response to message 170507

Continuing observations on my Radeon VII running 0.11.

It hit 99% reported progress after about 80 minutes elapsed time, at which point GPU usage dropped to zero.

GPU-Z reports the GPU clock at 25 MHz(!!!), memory clock at 349 MHz, temperature at 30/31C, power at 20W, but memory usage still at 1541/189.

That lasted for a bit over 5 minutes (toplist statistics recalculation), after which the task completed and uploaded, with a reported elapsed time of 1:24:51.

If you care to take a look, the task is 839004421. As the second task needed to fulfill the quorum is unsent, it may be some time before we can see whether it validates.

Naturally running this task smashed my DCF up by over a factor of ten. As I've unsuspended the GRP work in my queue, that work is getting burned off in panic mode. It may be some time before I can try 2X or 3X on the GW work.

DF1DX

Joined: 14 Aug 10

Posts: 105

Credit: 3856326854

RAC: 4951862

4 Apr 2019 14:12:30 UTC

Message 170512

With my Radeon VII (Win 7, AMD 19.4.1, einstein_O1OD1E_0.11_windows_x86_64__GW-opencl-ati-V1.exe) I can confirm archae86's observation: very low GPU load and clock frequency (~ 400 MHz) with a single WU.

When starting a second WU my PC crashed after a few minutes. So back to Gamma 1.18.

On the NVIDIA 1050 Ti in my second host (https://einsteinathome.org/de/host/12247194/tasks/2/0), 3 WUs run in parallel, taking about 7800 s. No problems at the moment.

San-Fernando-Valley

Joined: 16 Mar 16

Posts: 406

Credit: 10174733455

RAC: 25970703

4 Apr 2019 15:35:37 UTC

Message 170513

The engineering 0.11 GPU WUs are running between 1 and 6 hours on fast NVIDIA GPUs! GPU load varies between 1% and at most 10%! That is a lot of effort for 144,000 GFLOPS.

Some WUs are getting an ERROR at address x'......0005'.

Running Windows 7 and Windows 10.

I'm stopping 0.11 for now and will wait for "improvements".

tolafoph

Joined: 14 Sep 07

Posts: 122

Credit: 74659937

RAC: 0

4 Apr 2019 16:35:37 UTC

Message 170515 in response to message 170510

crashtech wrote:

I'm not going to run these for now because they massively underutilize my GPUs. Maybe there is an app_config that would help?

I just started setting it to run more than one task. I have set it to 2 for now by changing the value for the GW app to 0.5 in the settings on the website: https://einsteinathome.org/de/account/prefs/project

Now the GPU is used between 50 and 55%. But the clock is not at its maximum, I believe. It can run at around 1800 MHz, but it's between 1600 and 1700 MHz most of the time.

For the next tasks I will try 3 tasks at once.

Edit: 2 tasks finished in about 7700s, compared to the 5500s for a single one.

mmonnin

Joined: 29 May 16

Posts: 291

Credit: 3397306540

RAC: 2959237

4 Apr 2019 18:12:05 UTC

Message 170517

Looks like these are similar to the first revisions of the current GPU tasks. Low GPU utilization due to being heavily CPU dependent. Hopefully it'll improve.

archae86

Joined: 6 Dec 05

Posts: 3157

Credit: 7221554931

RAC: 967170

4 Apr 2019 23:34:23 UTC

Message 170519 in response to message 170507

Running the GW 0.11 Windows AMD application at 3X changed things materially, although the Radeon VII still is very lightly used.

1X    3X    Variable
64%   73%   GPU Load
39W   53W   GPU-only Power Draw
463   880   GPU Clock (MHz)
798   839   Memory Clock (MHz)
37C   44C   GPU Temperature
39C   49C   Hot Spot GPU Temperature

The reported memory usage barely budged, which seems odd.

As you might suppose, the machine is considerably more productive at 3X than 1X on this work. I've tampered with some things mid-stream, and don't have many results, but on this machine a 1X run this morning took an elapsed time of 5091 seconds, while a set of three which just finished (yes, I offset them some, but not enough) took only about 6200 seconds, so a huge productivity boost. One of those validated on completion, which is comforting. This host has only four cores, and is not hyperthreaded, so although 4X might well work, and might be slightly more productive, I'm not tempted to try it.

My brand new (today) RX 570 host has six physical cores. If I get some days of stable running out of it on simpler stuff, I might give 4X on this work a try on it. That machine has so far run two of these tasks at 1X, with elapsed time around 3780 seconds. The much faster 1X time may mean the 570 is better for this work than a Radeon VII, but more likely the 9th generation 6-core CPU burns through the CPU portion of the job much faster than the older CPU on my Radeon VII host.
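To put a rough number on that productivity boost, here is a minimal sketch using only the two elapsed times quoted above. It assumes the three tasks overlapped for essentially the whole 6200 seconds, which is only approximately true since they were offset somewhat; the function names are made up for illustration.

```python
def seconds_per_task(batch_elapsed_s, tasks_in_batch):
    """Effective wall-clock seconds per task when several tasks share the GPU."""
    return batch_elapsed_s / tasks_in_batch


def speedup(single_elapsed_s, batch_elapsed_s, tasks_in_batch):
    """Throughput gain of running a batch concurrently versus one task at a time."""
    return single_elapsed_s / seconds_per_task(batch_elapsed_s, tasks_in_batch)


# Figures from the post: one task at 1X, and a set of three at 3X.
one_x_elapsed = 5091.0
three_x_elapsed = 6200.0

print(round(seconds_per_task(three_x_elapsed, 3)))        # 2067
print(round(speedup(one_x_elapsed, three_x_elapsed, 3), 2))  # 2.46
```

So under that assumption the host finishes a task roughly every 2067 seconds at 3X instead of every 5091 seconds at 1X, a gain of about 2.5 times.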

Meanwhile, my primary host is indicated as having 29 days of work on board, as I unintentionally allowed more of the new GW work to download when a spate of running GRP had driven the completion estimates way back down. I'm afraid a mass abort is in my future but I currently plan to run pure GW GPU for another half day.

tolafoph

Joined: 14 Sep 07

Posts: 122

Credit: 74659937

RAC: 0

4 Apr 2019 23:41:55 UTC

Message 170520 in response to message 170519

archae86 wrote:

Meanwhile, my primary host is indicated as having 29 days of work on board, as I unintentionally allowed more of the new GW work to download when a spate of running GRP had driven the completion estimates way back down. I'm afraid a mass abort is in my future but I currently plan to run pure GW GPU for another half day.

Yeah, the extremely different runtimes of 15 min vs. 2 h are messing with the work I get. I almost ran out of tasks. I changed the work buffer from 0.25 to 0.5 days. But if I run only the 15 min tasks, it might download way too many of the 2 h ones. So far, though, I haven't gotten any new GW units.
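The over-fetch risk can be sketched with simple arithmetic. This is a worst-case illustration, not a model of the real BOINC scheduler: it assumes a single device running one task at a time, and that the client sizes its request as if every new task took the 15 minutes of the short ones. The helper names are hypothetical.

```python
SECONDS_PER_DAY = 86400


def tasks_fetched(buffer_days, estimated_task_s):
    """Number of tasks that fill the buffer if each is estimated at estimated_task_s."""
    return int(buffer_days * SECONDS_PER_DAY // estimated_task_s)


def actual_backlog_days(n_tasks, actual_task_s):
    """Days of work those tasks really represent at their true runtime."""
    return n_tasks * actual_task_s / SECONDS_PER_DAY


# 0.5-day buffer, tasks estimated at 15 min but actually taking 2 h.
n = tasks_fetched(0.5, 15 * 60)
print(n)                                    # 48
print(actual_backlog_days(n, 2 * 3600))     # 4.0
```

Under those assumptions a half-day buffer filled on 15 minute estimates would turn into about four days of 2 hour tasks.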

Jim1348

Joined: 19 Jan 06

Posts: 463

Credit: 257957147

RAC: 0

5 Apr 2019 15:57:14 UTC

Message 170531 in response to message 170517

mmonnin wrote:

Looks like these are similar to the first revisions of the current GPU tasks. Low GPU utilization due to being heavily CPU dependent. Hopefully it'll improve.

Yes, that is the way it was, and it will get better.

But if they are having that much of a problem with OpenCL, I wonder what the chances are for CUDA?

crashtech

Joined: 16 Mar 17

Posts: 3

Credit: 3095732582

RAC: 3969557

5 Apr 2019 21:43:25 UTC

Message 170533

I am curious to try running this app starting with 4 at a time, with each instance having its own physical core, like so:


<app_config>
  <app>
    <name>einstein_O1OD1E</name>
    <gpu_versions>
      <gpu_usage>0.25</gpu_usage>
      <cpu_usage>2.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

Does anyone think this might work, ~~and if so, does anyone know the right project name to place in the app_config?~~

Edit: App name added, thanks to Keith Myers for the valuable information.
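For anyone who wants to try other task counts, here is a small sketch that generates the same file from the number of tasks to run per GPU. It relies on the relationship used in this thread, where gpu_usage is the fraction of a GPU each task claims (0.5 for 2 at a time, 0.25 for 4 at a time). The function name and template are illustrative, and the cpu_usage value is simply passed through as chosen above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

TEMPLATE = """<app_config>
  <app>
    <name>{name}</name>
    <gpu_versions>
      <gpu_usage>{gpu_usage:g}</gpu_usage>
      <cpu_usage>{cpu_usage:g}</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def build_app_config(app_name, tasks_per_gpu, cpus_per_task):
    """Return app_config.xml text that runs tasks_per_gpu tasks on each GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    text = TEMPLATE.format(
        name=escape(app_name),
        gpu_usage=1.0 / tasks_per_gpu,  # fraction of one GPU per task
        cpu_usage=cpus_per_task,
    )
    ET.fromstring(text)  # raises ParseError if the result is not well-formed XML
    return text


print(build_app_config("einstein_O1OD1E", 4, 2.0))
```

The output would be saved as app_config.xml in the project's directory, after which the client has to re-read its config files before the change takes effect.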

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18721586983

RAC: 6437668

5 Apr 2019 20:02:34 UTC

Message 170534

The name is listed in the client_state.xml file under the project section in the <app> <name> declaration. The name is what you input into your app_config file.
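If you would rather not read through the file by hand, a short script can list the names for you. This is a sketch under assumptions: it treats client_state.xml as well-formed XML and simply collects every <app> <name> it finds, and the sample document below is a hypothetical, heavily trimmed stand-in for a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for a real client_state.xml.
SAMPLE = """<client_state>
  <project>
    <master_url>https://einsteinathome.org/</master_url>
  </project>
  <app>
    <name>einstein_O1OD1E</name>
  </app>
</client_state>
"""


def app_names(client_state_xml):
    """Return the text of every <app><name> element in the given XML string."""
    root = ET.fromstring(client_state_xml)
    names = []
    for app in root.iter("app"):
        name = app.findtext("name")
        if name:
            names.append(name.strip())
    return names


print(app_names(SAMPLE))  # ['einstein_O1OD1E']
```

To use it on a real installation, read the contents of client_state.xml from your BOINC data directory (the location varies by platform) and pass the text to app_names.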
