Gravitational Wave search O2 Multi-Directional ("O2MD1")

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117559286677

RAC: 35351323

27 Jan 2020 20:42:19 UTC

Message 175450 in response to message 175448

Betreger wrote:

It seems odd they aren't being sent out again.

Why do you think that? They will be sent out again, and if it's a problem with the validator rather than the result, it will most likely get 'fixed'. The last result in Holmis' list has already been 'fixed', so it does look promising.

Cheers,
Gary.

Holmis

Joined: 4 Jan 05

Posts: 1118

Credit: 1055935564

RAC: 0

27 Jan 2020 23:23:03 UTC

Message 175456

Seems all of them have been fixed now. Thanks for taking care of it!

Bernd Machenschalk

Moderator

Administrator

Joined: 15 Oct 04

Posts: 4312

Credit: 250453446

RAC: 34968

4 Feb 2020 8:22:51 UTC

Message 175519

On the GPUs ("O2MDF") we have another chunk of the "V2" workunits. Based on previous experience, these should run about twice as long as expected (e.g. like the "G2" ones). I doubled the credit and the flops estimate to make up for that; I hope this helps.
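As a rough illustration of why doubling the flops estimate matters: assuming the client's runtime estimate scales linearly with the workunit's estimated floating-point operations divided by the device speed, doubling the estimate doubles the predicted runtime. The function name and all numbers below are hypothetical and chosen only for the arithmetic; this is a sketch, not the project's scheduler code.

```python
def estimated_runtime_seconds(fpops_est: float, device_flops: float) -> float:
    """Simplified runtime estimate: estimated operations divided by device speed."""
    if device_flops <= 0:
        raise ValueError("device_flops must be positive")
    return fpops_est / device_flops


# Hypothetical values, for illustration only.
old_fpops_est = 1.44e17   # assumed original flops estimate of a workunit
device_flops = 2.0e13     # assumed effective speed of the GPU app

old_estimate = estimated_runtime_seconds(old_fpops_est, device_flops)
new_estimate = estimated_runtime_seconds(2 * old_fpops_est, device_flops)

# The ratio is what matters: the predicted runtime doubles with the estimate.
print(f"before: {old_estimate:.0f} s, after: {new_estimate:.0f} s")
print(f"ratio: {new_estimate / old_estimate:.1f}")
```

Under the same assumption, a task that really does take twice as long would again finish close to its predicted time once the estimate is doubled.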

Betreger

Joined: 25 Feb 05

Posts: 992

Credit: 1588892395

RAC: 760174

27 Feb 2020 1:30:10 UTC

Message 175775

Validate errors have returned.

https://einsteinathome.org/workunit/438858562

https://einsteinathome.org/workunit/438909767

https://einsteinathome.org/workunit/438611172

Betreger

Joined: 25 Feb 05

Posts: 992

Credit: 1588892395

RAC: 760174

15 Mar 2020 16:20:42 UTC

Message 176045

I'm getting a fair number of "Error while computing" on both boxes.

https://einsteinathome.org/workunit/441408802

Betreger

Joined: 25 Feb 05

Posts: 992

Credit: 1588892395

RAC: 760174

31 Mar 2020 0:14:49 UTC

Message 176278

Validate errors are back.

https://einsteinathome.org/workunit/443513006

https://einsteinathome.org/workunit/442048724

https://einsteinathome.org/workunit/445576021

Mike Hewson

Moderator

Joined: 1 Dec 05

Posts: 6588

Credit: 316221159

RAC: 337627

31 Mar 2020 1:10:00 UTC

Message 176280 in response to message 176278

You have the dreaded CL_MEM_OBJECT_ALLOCATION_FAILURE on some of those Vela Junior tasks for your computer with a 2GB Nvidia card. See this thread for more information.

Cheers, Mike.

I have made this letter longer than usual because I lack the time to make it shorter ...

... and my other CPU is a Ryzen 5950X :-) Blaise Pascal

Betreger

Joined: 25 Feb 05

Posts: 992

Credit: 1588892395

RAC: 760174

2 Apr 2020 17:44:10 UTC

Message 176328

https://einsteinathome.org/workunit/447314688 I have 22 in a row of these. This box was running S@H just fine and runs pulsars fine also. The card is a GTX 1060 3GB. Methinks it is bad data, not the host. Most fail in ~1 min.

Betreger

Joined: 25 Feb 05

Posts: 992

Credit: 1588892395

RAC: 760174

2 Apr 2020 19:26:52 UTC

Message 176330

I rebooted the offending box and have now successfully completed 7 in a row, so the problem seems to have been the host. They are in a pending status, so time will tell.

Gary Roberts

Moderator

Joined: 9 Feb 05

Posts: 5872

Credit: 117559286677

RAC: 35351323

2 Apr 2020 23:29:15 UTC

Message 176334 in response to message 176328

Betreger wrote:

https://einsteinathome.org/workunit/447314688 I have 22 in a row of these. This box was running S@H just fine and runs pulsars fine also. The card is a GTX 1060 3GB. Methinks it is bad data, not the host. Most fail in ~1 min.

Betreger wrote:

I rebooted the offending box and have now successfully completed 7 in a row, so the problem seems to have been the host. They are in a pending status, so time will tell.

Unless the problem is the immediate consequence of the release of a new or modified app that has just been announced here, Technical News is NOT the best forum to report new problems in longer-running searches. There is a Problems forum specifically for that purpose. There is also no use in reporting a problem in multiple places. You just create more work for the already overworked Devs in trying to keep up with all the reports that are coming in. You also encourage the 'me too' and the 'maybe it could be me too (but it's actually different)' reports to be in different places as well.

At the start of every day, I check the problems forum first and try to deal with any overnight problem reports, if I can. When I checked your report, it must have been just before you rebooted because there were only failed tasks, and none in progress at the instant I looked. It's always a good idea to try a reboot before declaring that a problem exists.

It is still quite possible that there really are memory allocation issues with these higher-frequency tasks, so if you see further examples, please report them in the Problems forum.

Cheers,
Gary.
