Help a noob out - acceptable rate of invalid returns

Burned
Joined: 25 Jun 21
Posts: 32
Credit: 388221900
RAC: 0
Topic 225627

Crunching GPU work for GRPBS1.  I'm getting some Invalid returns.  Is this just the nature of the science and computations?  Different systems produce different results and you just try to converge on the most likely correct answer?

Keith Myers
Joined: 11 Feb 11
Posts: 5035
Credit: 19001427488
RAC: 6772951

Yes, unavoidable.  You get a better chance if your wingmen have similar hardware and OS.

There is some discussion taking place about relaxing the validator limits or pairing wingmen with similar hardware.

[Edit] Comment for wrong project.

archae86
Joined: 6 Dec 05
Posts: 3161
Credit: 7297421691
RAC: 2186937

For that specific application, a pretty typical ratio of invalid to valid results for an individual system is roughly 1:100, or one out of every hundred.  If a system is persistently well above that, say one out of twenty or worse, and a quick check of the top systems among quorum partners suggests the background situation has not changed for everybody, then there is real reason for concern about the health of the system in question.

My quick look at your two hosts suggested to me that there was nothing unusual at hand so far.

In floating point, something as simple as conversions from internal to external representation can cause infinitesimal differences in results between runs in which things get paused at different places.  IEEE floating point is so very good that such differences are usually inconsequential, but nevertheless detectable.  Setting the acceptance limits on "close enough to count" is not a trivial matter.
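For anyone who wants to see the effect on their own machine, here is a minimal Python sketch. It is not how the Einstein@Home application or its validator works; the function name `checkpoint_roundtrip` and the 15-digit checkpoint format are assumptions made up for illustration. It only shows that writing a double out as decimal text with too few digits and reading it back can change the value slightly, and that a tolerance-based comparison still treats the two values as matching.

```python
import math


def checkpoint_roundtrip(value: float, digits: int) -> float:
    """Write a float as decimal text with `digits` significant digits, then read it back.

    Hypothetical stand-in for saving state to a checkpoint file and resuming.
    """
    return float(f"{value:.{digits}g}")


x = 0.1 + 0.2                          # internal binary value, slightly above 0.3
resumed = checkpoint_roundtrip(x, 15)  # assumed 15-digit text checkpoint
exact = checkpoint_roundtrip(x, 17)    # 17 significant digits round-trips a double

print(x == resumed)   # False: the 15-digit text form lost the last bit
print(x == exact)     # True: 17 digits preserve the value exactly
print(math.isclose(x, resumed, rel_tol=1e-12))  # True: "close enough to count"
```

The `rel_tol` value here is arbitrary. Choosing a real acceptance limit is exactly the non-trivial part: too tight and harmless differences get marked invalid, too loose and genuinely bad results slip through.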

[edited in response to Gary Roberts pointing out that I had it very wrong indeed]

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5877
Credit: 118567837580
RAC: 19413628

Just a small clarification of what Archae86 posted.  He said, "ratio of invalid to valid results for an individual system is roughly 100:1." but I'm sure he meant the other way around :-).

I would agree that it's quite normal to see around 1% of returned tasks marked as 'invalid' due to very minor precision differences in the different math libraries being used.  This can vary over time - maybe 0.5% at one point and perhaps as high as 2% at some other time.  If you see much greater than that consistently, it might be wise to investigate - things like clocks, voltages and temperature spring to mind :-).
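As a rough way to apply the rule of thumb from this thread to the counts on a host's task list, something like the following Python sketch would do. The function names, the three labels, and the exact cut-offs are assumptions for illustration only: 2% is the upper end of the normal range mentioned here, and 5% is the "one out of twenty" figure mentioned earlier in the thread. Pending and errored tasks are left out of the count.

```python
def invalid_rate(valid: int, invalid: int) -> float:
    """Fraction of validated-or-invalid tasks that were marked invalid."""
    total = valid + invalid
    if total == 0:
        raise ValueError("no completed tasks to assess")
    return invalid / total


def assess(rate: float) -> str:
    """Map an invalid rate to a rough verdict (thresholds are assumptions)."""
    if rate <= 0.02:      # up to about 2% is described as normal variation
        return "normal"
    if rate < 0.05:       # above normal but below one in twenty
        return "watch"
    return "investigate"  # one in twenty or worse: check clocks, voltages, temps


rate = invalid_rate(valid=990, invalid=10)
print(f"{rate:.1%} -> {assess(rate)}")
```

A single day's numbers can be noisy, so this only means much when the counts cover a few hundred tasks or more and the result stays high over time.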

Cheers,
Gary.

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

In QuChemPedIA@home I get a 50% rate of invalid results using Windows 10. It is a Linux project and I have to use VirtualBox. Yet I am number 28 in the RAC ranking list.

Tullio

Burned
Joined: 25 Jun 21
Posts: 32
Credit: 388221900
RAC: 0

Tullio, I'm not certain how VirtualBox works, but you may want to check your Linux C runtime libraries.  The project should probably have a recommendation as to what package(s) they want used.

Ian&Steve C.
Joined: 19 Jan 20
Posts: 4068
Credit: 48427460745
RAC: 34209552

tullio wrote:

In QuChemPedIA@home I get a 50% rate of invalid results using Windows 10. It is a Linux project and I have to use VirtualBox. Yet I am number 28 in the RAC ranking list.

Tullio

What does this have to do with Einstein?

tullio
Joined: 22 Jan 05
Posts: 2118
Credit: 61407735
RAC: 0

I have a number of Einstein results, both GPU and CPU. Pending rate is almost zero. I have six BOINC projects running. Sometimes it is interesting to compare how different projects handle the valid/invalid ratio.

Tullio
