did we momentarily run out of hs_gamma WUs? for AMD

Anonymous
Topic 198463

Because of problems I have had with an AMD cruncher, I wrote a script that runs every 5 minutes and checks for the presence of active hs_gamma (GPU) WUs. Yesterday I received an email from that PC indicating no active GPU crunching. Today I investigated and there are active hs_gamma jobs.

What I just noticed is that the outage emails are all timestamped hh:01. I cannot explain why this script would report no active GPU WUs "on the hour" but be OK at other times. Jobs either exist or they don't. The emails ran from 09:01 to 11:01 on 2/23/16, and today from 12:01 to 05:01. Now all is well.

Were these legitimate outage periods?

FYI: I wrote this script because, due to a power outage at my end and a bad UPS battery, I once lost 3 days of processing before I realized there was a problem.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4332
Credit: 252046247
RAC: 33693

We had an experimental GPU version for hsgamma_FGRP3, which finished over a year ago. Its successor, FGRP4, never had a GPU version. The last FGRP4 workunit was produced a few days ago; new "tasks to send" are now generated only when tasks time out or are reported as errors. The successor of FGRP4 is FGRPB1.

BM

Anonymous

Quote:

We had an experimental GPU version for hsgamma_FGRP3, which finished over a year ago. Its successor, FGRP4, never had a GPU version. The last FGRP4 workunit was produced a few days ago; new "tasks to send" are now generated only when tasks time out or are reported as errors. The successor of FGRP4 is FGRPB1.

BM

My Linux PC's "top" output shows "hsgamma_FGRPB1_". The script I run checks for the presence of this type of job and, if it does not find any, notifies me by email. It notified me 3 times on the 23rd and 6 times today (the 24th, at midnight, 1, 2, 3, 4, and 5) that this PC was not processing any of these tasks. I seem to recall that there was some scheduled downtime at Wisconsin. This machine keeps a very short backlog of units, so it uploads and downloads more often. Could that downtime have starved my PC, triggering an out-of-WU condition? Twice this machine has gone down due to a bad UPS and then recovered, but it does not restart BOINC automatically. This is why I monitor tasks, so I can minimize downtime. I just thought it curious.
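
For readers who want to try something similar, a check like the one described above could be sketched in Python roughly as follows. This is an illustration only, not the script actually used in this thread: the function names are made up, and it assumes a Linux `ps` that accepts `-eo comm=`, which prints bare command names truncated to 15 characters, the same way "top" shows them.

```python
import subprocess


def list_process_names():
    """Return the command name of every running process.

    Assumption: a Linux `ps` that supports `-eo comm=` (names only, no header).
    """
    result = subprocess.run(
        ["ps", "-eo", "comm="],
        capture_output=True, text=True, check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def has_active_task(process_names, prefix="hsgamma_FGRPB1_"):
    """Return True if at least one process name starts with the prefix."""
    return any(name.startswith(prefix) for name in process_names)
```

Run from cron every 5 minutes, the caller would send the notification email whenever `has_active_task(list_process_names())` is false. The mail step is left out here because it depends on the local setup.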

Anonymous

I understand what is happening. I completed all of the FGRPB1 jobs and continued on with other work. I checked the server status and there are more FGRPB1 WUs available, but I am between download periods. I will have to modify my script to check for GPU work. Blush.
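
One possible way to avoid this kind of false alarm is to separate "the BOINC client is down" from "the client is up but is not running any of the tasks being watched for". A rough Python sketch, assuming the client process is named `boinc` and that the list of process names comes from something like `ps -eo comm=`. `classify` and its status strings are hypothetical names, and the prefixes to watch would have to be filled in for whatever applications the machine actually runs.

```python
def classify(process_names, task_prefixes, client_name="boinc"):
    """Return 'client_down', 'no_watched_tasks', or 'ok'.

    Assumption: the BOINC client shows up under the name `client_name`.
    """
    if client_name not in process_names:
        return "client_down"
    for name in process_names:
        # A task counts as watched if its name starts with any given prefix.
        if any(name.startswith(prefix) for prefix in task_prefixes):
            return "ok"
    return "no_watched_tasks"
```

With this split, only "client_down" needs an immediate email. "no_watched_tasks" during a gap between downloads could be ignored, or reported only if it persists over several checks.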
