Validate error - What this really means!

CGR
Joined: 2 Sep 12
Posts: 3
Credit: 272118
RAC: 0

RE: It is not uncommon that

Quote:

It is not uncommon that we have a few bad BRP4 "beams" every month that slip through pre-processing without being detected as such. Most of the tasks generated from these will end up as validate errors.

Normally we do have scripts and internal web pages that monitor these, and it's usually me who then cancels the respective workunits.


Is this what happened to WU #170248721? I never had a validate error before that and I would like to understand what caused it. So any information on this particular WU is appreciated.

Thanks in advance.

Alex
Joined: 1 Mar 05
Posts: 451
Credit: 507064930
RAC: 87789

RE: what happened to WU

Quote:
what happened to WU #170248721?

It's already marked as cancelled. This happens from time to time. Nothing to worry about.

Maximilian Mieth
Joined: 4 Oct 12
Posts: 130
Credit: 10286732
RAC: 4018

Thanks to this thread I am

Thanks to this thread I am well informed about validate errors, but what does 'validation inconclusive' actually mean? See e.g. this workunit.

MAGIC Quantum Mechanic
Joined: 18 Jan 05
Posts: 1886
Credit: 1408157900
RAC: 1159282

That happens when there are


That happens when different hosts return different results for the same task. At times it takes a 3rd result to see whether 2 of the results agree, or whether none of them do.
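To make that concrete, here is a minimal sketch of a quorum check. It is not the project's actual validator: the function name `validate`, the status strings, and the plain equality comparison are all assumptions for illustration (a real validator would compare result files within some tolerance).

```python
from collections import Counter


def validate(results, quorum=2):
    """Return (status, canonical_value) for the results received so far.

    `results` holds one comparable value per host. This is a hypothetical
    stand-in for a real comparison of result files.
    """
    if len(results) < quorum:
        # Not enough results yet to compare anything.
        return "pending", None
    value, count = Counter(results).most_common(1)[0]
    if count >= quorum:
        # Enough hosts agree: that value becomes the canonical result.
        return "valid", value
    # Results exist but no group of `quorum` hosts agrees yet.
    return "inconclusive", None


print(validate(["a", "b"]))       # two hosts disagree
print(validate(["a", "b", "a"]))  # a third result breaks the tie
```

With two differing results the sketch reports "inconclusive"; once a third result matches one of them, the matching pair is accepted.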

edenist
Joined: 18 Jan 05
Posts: 2
Credit: 17516500
RAC: 0

Hi everyone, I've just

Hi everyone,

I've just built a new system and have been running BOINC on it this last week.
It's an AMD A10-7850K, running on Ubuntu 14.04 x64.

It seems that almost all of my BRP5 WUs are resulting in validate errors when run with the opencl-ati application.

I had a couple which passed last week, but all the rest are failing. Is there an issue with running GPU code on the APUs?

Here are a few WUs which have validate errors...

http://einsteinathome.org/task/437729215
http://einsteinathome.org/task/437808773
http://einsteinathome.org/task/437841431

And this one passed [http://einsteinathome.org/task/437809655]

Any information would be helpful.

Cheers!

[EDIT]:
Taking a closer look at the stderr logs from each WU, it appears that the two which completed successfully never had an application restart during their run, whereas all the failed WUs did. Is it possible that the pause/resume is causing an error in the computation?
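For anyone who wants to check their own logs the same way, here is a rough sketch that counts application starts in a saved stderr text. It assumes the application prints "Starting data processing..." once per start or resume, as in the CUDA log excerpts elsewhere in this thread; whether the opencl-ati application logs exactly the same line is an assumption, and the helper names are made up.

```python
def count_starts(stderr_text, marker="Starting data processing..."):
    """Count the lines containing the start marker (assumed log format)."""
    return sum(1 for line in stderr_text.splitlines() if marker in line)


def was_restarted(stderr_text):
    """More than one start marker suggests at least one restart."""
    return count_starts(stderr_text) > 1


sample = (
    "[15:01:26][3632][INFO ] Starting data processing...\n"
    "[16:20:03][4012][INFO ] Starting data processing...\n"
)
print(count_starts(sample), was_restarted(sample))
```

Comparing the count for validated and failed tasks would show whether restarts really line up with the validate errors.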

Gordon Lack
Joined: 19 Jun 13
Posts: 6
Credit: 1378284
RAC: 557

I've just had a

I've just had a BRP5-opencl-ati complete and fail to validate.

http://einsteinathome.org/workunit/194577452

The stderr output finishes with:

Quote:
[01:39:49][4893][INFO ] Data processing finished successfully!

but the validate status on the job is:

Quote:
Workunit error - check skipped

(there are no errors mentioned...)

Mind you, I've just spotted that the link above now reports:

Quote:
errors WU cancelled

Logforme
Joined: 13 Aug 10
Posts: 332
Credit: 1714373961
RAC: 0

RE: I've just had a

Quote:
I've just had a BRP5-opencl-ati complete and fail to validate.


This problem is covered in another thread

John Jamulla
Joined: 26 Feb 05
Posts: 32
Credit: 1172292348
RAC: 547295

Sorry to bother all of you,

Sorry to bother all of you, but I would like to know what's wrong with my GPU crunching tasks.

I have a relatively new machine. It had all kinds of problems out of the box (bad mobo, memory, CPU) for roughly the first 6 months I had it. It is theoretically fixed now, though apparently not for the GPU with Einstein@Home.

The GPU tasks never seem to validate correctly.

It's an i7-3930K 6-core with an AsRock Z77 Extreme 6 mobo, OC-worthy mobo and memory, an excellent Corsair PSU, and a waterblock cooler. There isn't a heat problem with it, the memory and CPU are fine, and most of the CPU tasks are fine and working as expected (no errors most of the time).

The CPU is using 12 threads and is overclocked to 4.3 GHz (the GPU is not overclocked). I have a GTX 770 in it.
It appears the GPU is running, but no task ever validates. I don't see a single task from the GPU marked as "good".

In my list of tasks under "invalid", I either get "Validate error" or "Completed, marked as invalid"
My computer ID: 11453074
Tasks are all (ON GPU): BRP5-cuda32-nv301

Here's at least one of each type with validation errors, can someone tell me what's wrong?

Loading GPU driver 337.88 "fresh" from NVIDIA now to see if it matters...
http://einsteinathome.org/workunit/194670882 - Binary Radio Pulsar Search (Perseus Arm Survey) v1.39 (BRP5-cuda32-nv301)

This one completed, but was marked as invalid:
http://einsteinathome.org/workunit/194648623

How can I tell what's going wrong? There doesn't seem to be a graphics type problem with my GPU, it's fine on screen (no games on this machine, cruncher only).

I am set to run 2 tasks simultaneously on the GPU...

Gavin
Joined: 21 Sep 10
Posts: 191
Credit: 40644337738
RAC: 1

Here's a snippet from one of

Here's a snippet from one of your output files (it doesn't matter which, they are all the same) that hopefully gives a clue, in bold:

7.2.42

Activated exception handling...
[15:01:26][3632][INFO ] Starting data processing...
[15:01:26][3632][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 299 MB (1750 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[15:01:26][3632][INFO ] Using CUDA device #0 "GeForce GTX 770" (0 CUDA cores / 0.00 GFLOPS)
[15:01:26][3632][INFO ] Version of installed CUDA driver: 6000
[15:01:26][3632][INFO ] Version of CUDA driver API used: 3020
[15:01:27][3632][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...

I would be tempted to remove then re-install the driver for your GPU with a fresh copy from the NVidia website.
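If you want to compare the memory figures across several tasks, a small sketch like the following pulls the numbers out of the "Used in total" line. The regular expression is written against the line format shown in the snippet above; the function name `parse_memory` is made up for illustration.

```python
import re

# Matches e.g. "Used in total: 299 MB (1750 MB free / 2049 MB total)"
MEM_RE = re.compile(
    r"Used in total:\s*(\d+) MB \((\d+) MB free / (\d+) MB total\)"
)


def parse_memory(line):
    """Return used/free/total MB from a memory status line, or None."""
    m = MEM_RE.search(line)
    if m is None:
        return None
    used, free, total = map(int, m.groups())
    return {"used": used, "free": free, "total": total}


line = ("------> Used in total: 299 MB (1750 MB free / 2049 MB total) "
        "-> Used by this application (assuming a single GPU task): 0 MB")
info = parse_memory(line)
print(info)
# Sanity check: used + free should add up to the reported total.
print(info["used"] + info["free"] == info["total"])
```

For the line quoted above, 299 MB used plus 1750 MB free does add up to the 2049 MB total, so the memory report itself looks consistent.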

Claggy
Joined: 29 Dec 06
Posts: 560
Credit: 2699403
RAC: 0

RE: ------> Used in total:

Quote:
------> Used in total: 299 MB (1750 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[15:01:26][3632][INFO ] Using CUDA device #0 "GeForce GTX 770" (0 CUDA cores / 0.00 GFLOPS)
[15:01:26][3632][INFO ] Version of installed CUDA driver: 6000
[15:01:26][3632][INFO ] Version of CUDA driver API used: 3020

I wouldn't worry about that; the CUDA 3.2 API doesn't know how GTX 770s are made up.
Arvid Almstrom's GTX 780 host, the top NVIDIA host, also doesn't report the number of CUDA cores or GFLOPS:

http://einsteinathome.org/host/6216490

[18:11:04][11780][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 675 MB (2399 MB free / 3074 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[18:11:04][11780][INFO ] Using CUDA device #1 "GeForce GTX 780" (0 CUDA cores / 0.00 GFLOPS)
[18:11:04][11780][INFO ] Version of installed CUDA driver: 6000
[18:11:04][11780][INFO ] Version of CUDA driver API used: 3020

and on my GT650M:

[18:30:27][5600][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 71 MB (1978 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[18:30:27][5600][INFO ] Using CUDA device #0 "GeForce GT 650M" (0 CUDA cores / 0.00 GFLOPS)
[18:30:27][5600][INFO ] Version of installed CUDA driver: 6050
[18:30:27][5600][INFO ] Version of CUDA driver API used: 3020

Claggy
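As a side note on reading the version lines in these logs: CUDA encodes versions as 1000 * major + 10 * minor, so the numbers can be decoded with a small helper. The function name below is made up for illustration.

```python
def decode_cuda_version(code):
    """Convert CUDA's integer version (1000*major + 10*minor) to 'major.minor'."""
    major = code // 1000
    minor = (code % 1000) // 10
    return f"{major}.{minor}"


for code in (6000, 6050, 3020):
    print(code, "->", decode_cuda_version(code))
```

So 6000 and 6050 are the 6.0 and 6.5 drivers, and 3020 is the CUDA 3.2 API that the cuda32 application is built against.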
