Multi-Directional Gravitational Wave Search on O3 data (O3MD1/F)

TPCBF
Joined: 24 Nov 12
Posts: 17
Credit: 235981637
RAC: 1072403

Likewise have more than 150 of those GPU tasks crap out within a couple of seconds... :(

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5874
Credit: 117991328131
RAC: 21127825

Problem was first reported here.

It looks like bad tasks. A message has been sent to the staff. Maybe the bad tasks can be removed remotely. Otherwise, it could take until Monday for the problem to be corrected.

Cheers,
Gary.

Pop Piasa
Joined: 12 Nov 19
Posts: 2
Credit: 106576543
RAC: 0

Suddenly began seeing this error upon receiving a new batch on my host.

<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>

Is it my host or the WU that has a problem?

[edit: sorry, I thought I had newest messages first. I'm caught up now]

Thanks

Pop

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4330
Credit: 251182139
RAC: 41772

Sorry for that. Our bad. We need to restart that sub-run. Bad "O3MDFV1" WUs have been cancelled.

BM

Aurum
Joined: 12 Jul 17
Posts: 77
Credit: 3412397040
RAC: 133

alanb1951 wrote:

Aurum,

If the problem is on one or more of your Linux systems and you have app_config.xml files for Einstein with either <max_concurrent> or <project_max_concurrent> statements, try upgrading your BOINC client to 7.20.2 or later...

There was a known problem where the client didn't properly account for existing work for projects using either of those directives - the result was that it would request more work each time there was contact with the project.  OOPS!

The problem has been discussed in great detail on the forums at several projects - I'm a bit surprised you've never come across mention of it somewhere else :-)

Nope, haven't heard about it. Thanks for explaining it.

My only Win7 computer has already upgraded to 7.20.2 and doesn't have the problem. 

Now I'm trying to upgrade Linux computers but have yet to figure out how. Seems BOINC folks have changed a lot of the way they do things but haven't updated links and instructions. E.g.,

From this page https://boinc.berkeley.edu/

BOINC client 7.20.2 released
A new version of the client has been released for Windows and Mac OS. Download it here. Release notes are here. 27 Jul 2022

but the download link goes to https://boinc.berkeley.edu/download_all.php, which stops at 7.16.6.

The instructions at https://wiki.debian.org/BOINC are old too:

aurum@B-4:~$ sudo apt-get install boinc-client
Reading package lists... Done
Building dependency tree       
Reading state information... Done
boinc-client is already the newest version (7.16.6+dfsg-1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

aurum@B-4:~$ sudo apt-get upgrade boinc-client
Reading package lists... Done
Building dependency tree       
Reading state information... Done
boinc-client is already the newest version (7.16.6+dfsg-1).
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

I found 7.20.2 at https://github.com/BOINC/boinc/releases, but it unzips to a set of pdb files with no instructions on how to actually upgrade the BOINC client. I tried the Synaptic Package Manager, but it reinstalled 7.16.6. I'll keep looking. I added my comments to: BOINC Linux Download Page is confusing #5129

mountkidd
Joined: 14 Jun 12
Posts: 177
Credit: 12678659731
RAC: 5691894

The best source for Boinc Ubuntu releases is here.

zombie67 [MM]
Joined: 10 Oct 06
Posts: 121
Credit: 504026665
RAC: 524417

Wow. The CPU tasks are using 3200 MB of RAM each.

Edit:  And taking 2-3 days per task.

Reno, NV Team: SETI.USA

ML1
Joined: 20 Feb 05
Posts: 347
Credit: 86563414
RAC: 114

zombie67 wrote:

Wow. The CPU tasks are using 3200 MB of RAM each.

Edit:  And taking 2-3 days per task.

Good to know I'm not the only one seeing that!

The time estimate is also waaaay off, which confuses the BOINC scheduler into leaving the tasks to always run overdue...

Hopefully, the e@h servers won't time them out?

Happy crunchin',

Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

tasoss
Joined: 14 Mar 23
Posts: 3
Credit: 121450446
RAC: 0

Pop Piasa wrote:

Suddenly began seeing this error upon receiving a new batch on my host.

<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>

Is it my host or the WU that has a problem?

[edit: sorry, I thought I had newest messages first. I'm caught up now]

Thanks

Pop

I'm having this issue.

Is it normal?

mikey
Joined: 22 Jan 05
Posts: 12746
Credit: 1839147599
RAC: 3514

tasoss wrote:

Pop Piasa wrote:

Suddenly began seeing this error upon receiving a new batch on my host.

<![CDATA[
<message>
Incorrect function.
 (0x1) - exit code 1 (0x1)</message>

Is it my host or the WU that has a problem?

[edit: sorry, I thought I had newest messages first. I'm caught up now]

Thanks

Pop 

I'm having this issue.

Is it normal? 

First, unhide your PCs so people can see the error messages, your OS, and your drivers. Second, restart your PC if you are using Windows, as that will restart your CPU and GPU and give them a fresh start.
