It is not uncommon that we have a few bad BRP4 "beams" every month that slip through pre-processing without being detected as such. Most of the tasks generated from these will end up as validate errors.
Normally we do have scripts and internal web pages that monitor these, and it's usually me who then cancels the respective workunits.
RE: It is not uncommon that
Is this what happened to WU #170248721? I never had a validate error before that and I would like to understand what caused it. So any information on this particular WU is appreciated.
Thanks in advance.
RE: what happened to WU
Already marked as cancelled. Happens from time to time. Nothing to worry about.
Thanks to this thread I am
Thanks to this thread I am well informed about validate errors, but what does 'validation inconclusive' actually mean? See e.g. this workunit.
That happens when there are
That happens when different hosts return different results for the same task. At times it takes a 3rd result to see whether 2 of the results agree, or whether none of them do.
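To make the quorum idea concrete, here is a minimal Python sketch. It is not the project's validator: `agree`, `check_workunit`, the single-number "results" and the tolerance are all assumptions made up for illustration. It only shows why two disagreeing results leave a workunit inconclusive until a third one arrives.

```python
def agree(a, b, tol=1e-5):
    """Hypothetical comparison: two numeric results match within a relative tolerance."""
    return abs(a - b) <= tol * max(abs(a), abs(b), 1.0)


def check_workunit(results, quorum=2):
    """Return (status, flags) for the results received so far.

    status is "valid" once at least `quorum` results agree with each other,
    otherwise "inconclusive". flags marks which results belong to the
    agreeing group.
    """
    for candidate in results:
        flags = [agree(candidate, r) for r in results]
        if sum(flags) >= quorum:
            return "valid", flags
    return "inconclusive", [False] * len(results)


# Two hosts disagree: nothing can be decided yet.
print(check_workunit([1.0, 1.5]))
# A third host matches the first: those two validate, the odd one out does not.
print(check_workunit([1.0, 1.5, 1.0]))
```

In this toy model "inconclusive" is just a waiting state: no result is rejected until enough results exist to form an agreeing group.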
Hi everyone, I've just
Hi everyone,
I've just built a new system and have been running BOINC on it this last week.
It's an AMD A10-7850K, running on Ubuntu 14.04 x64.
It seems that almost all of my BRP5 WUs are resulting in validate errors when run with the opencl-ati application.
I had a couple which passed last week, but all the rest are failing. Is there an issue with running GPU code on the APUs?
Here are a couple of WUs which have validate errors...
http://einsteinathome.org/task/437729215
http://einsteinathome.org/task/437808773
http://einsteinathome.org/task/437841431
And this one passed [http://einsteinathome.org/task/437809655]
Any information would be helpful.
Cheers!
[EDIT]:
Taking a closer look at the stderr logs from each WU, it appears that the two which completed successfully never had an application restart during their run, whereas all the failed WUs did. Is it possible that the pause/resume is causing an error in the computation?
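One way to test the restart hypothesis across many tasks is to count application starts in each stderr log. A rough Python sketch follows. It assumes each launch writes the "Starting data processing..." line seen in the CUDA logs quoted in this thread; whether the opencl-ati application prints the same line is not confirmed here. `count_starts` and `was_restarted` are hypothetical helper names.

```python
START_MARKER = "Starting data processing..."  # assumed to be logged once per launch


def count_starts(stderr_text, marker=START_MARKER):
    """Count how many times the application appears to have been launched."""
    return sum(1 for line in stderr_text.splitlines() if marker in line)


def was_restarted(stderr_text):
    """True if the log shows more than one launch, i.e. at least one resume."""
    return count_starts(stderr_text) > 1


# Illustrative input only: assembled from the line format quoted in this
# thread, not copied from a real task.
sample = (
    "[15:01:26][3632][INFO ] Starting data processing...\n"
    "[16:40:02][4100][INFO ] Starting data processing...\n"
)
print(count_starts(sample), was_restarted(sample))
```

Running this over the stderr text of passed and failed tasks and tabulating `was_restarted` against the validation outcome would show whether the correlation holds beyond a handful of WUs.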
I've just had a
I've just had a BRP5-opencl-ati complete and fail to validate.
http://einsteinathome.org/workunit/194577452
The stderr output finishes with:
but the validate status on the job is:
(there are no errors mentioned...)
Mind you, I've just spotted that the link above now reports:
RE: I've just had a
This problem is covered in another thread
Sorry to bother all of you,
Sorry to bother all of you, but would like to know what's wrong with my GPU crunching tasks....
I have a relatively new machine that had all kinds of problems out of the box (bad mobo, memory, CPU) for the first 6 months or so. Theoretically it is fixed now, though apparently not for the GPU with Einstein@Home.
The GPU tasks never seem to validate correctly.
It's an i7-3930K 6-core on an ASRock Z77 Extreme 6 mobo, with O.C.-worthy mobo and memory, an excellent Corsair PSU, a waterblock cooler, etc. There isn't a heat problem with it, the memory and CPU are fine, and most of the CPU tasks work as expected (no errors most of the time).
The CPU is using 12 threads and is overclocked to 4.3 GHz (the GPU is not overclocked). I have a GTX 770 in it.
It appears the GPU is running, but NO TASK ever validates. I don't see a single task from the GPU marked as "good".
In my list of tasks under "invalid", I either get "Validate error" or "Completed, marked as invalid"
My computer ID: 11453074
Tasks are all (ON GPU): BRP5-cuda32-nv301
Here's at least one of each type with validation errors. Can someone tell me what's wrong?
Loading GPU driver 337.88 "fresh" from NVIDIA now to see if it matters...
http://einsteinathome.org/workunit/194670882 - Binary Radio Pulsar Search (Perseus Arm Survey) v1.39 (BRP5-cuda32-nv301)
This one completed, but was marked as invalid:
http://einsteinathome.org/workunit/194648623
How can I tell what's going wrong? There doesn't seem to be a graphics-type problem with my GPU; it's fine on screen (no games on this machine, cruncher only).
I am set to run 2 tasks simultaneously on the GPU...
Here's a snippet from one of
Here's a snippet from one of your output files (it doesn't matter which, they are all the same) that hopefully gives a clue, in bold:
7.2.42
Activated exception handling...
[15:01:26][3632][INFO ] Starting data processing...
[15:01:26][3632][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 299 MB (1750 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[15:01:26][3632][INFO ] Using CUDA device #0 "GeForce GTX 770" (0 CUDA cores / 0.00 GFLOPS)
[15:01:26][3632][INFO ] Version of installed CUDA driver: 6000
[15:01:26][3632][INFO ] Version of CUDA driver API used: 3020
[15:01:27][3632][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
I would be tempted to remove and then re-install the driver for your GPU, using a fresh copy from the NVIDIA website.
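For anyone comparing these figures across many tasks, the two interesting lines can be pulled out of the stderr text mechanically. This is only a sketch: the regular expressions are written against the exact line format quoted above, and `parse_gpu_log` is a hypothetical helper, not part of BOINC or the Einstein@Home application.

```python
import re

# Patterns match the line format quoted in this thread; other application
# versions may word these lines differently.
MEM_RE = re.compile(r"Used in total: (\d+) MB \((\d+) MB free / (\d+) MB total\)")
DEV_RE = re.compile(
    r'Using CUDA device #(\d+) "([^"]+)" \((\d+) CUDA cores / ([\d.]+) GFLOPS\)'
)


def parse_gpu_log(text):
    """Extract memory figures and device details from stderr text."""
    info = {}
    for line in text.splitlines():
        mem = MEM_RE.search(line)
        if mem:
            used, free, total = map(int, mem.groups())
            info.update(used_mb=used, free_mb=free, total_mb=total,
                        memory_consistent=(used == total - free))
        dev = DEV_RE.search(line)
        if dev:
            info.update(device_index=int(dev.group(1)),
                        device_name=dev.group(2),
                        cuda_cores=int(dev.group(3)),
                        gflops=float(dev.group(4)))
    return info


SAMPLE = (
    "------> Used in total: 299 MB (1750 MB free / 2049 MB total) "
    "-> Used by this application (assuming a single GPU task): 0 MB\n"
    '[15:01:26][3632][INFO ] Using CUDA device #0 "GeForce GTX 770" '
    "(0 CUDA cores / 0.00 GFLOPS)\n"
)
print(parse_gpu_log(SAMPLE))
```

The `memory_consistent` flag simply checks that "used" equals "total" minus "free" (2049 - 1750 = 299 here), so the memory line itself is self-consistent; the zeros are confined to the core count and GFLOPS fields.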
RE: ------> Used in total:
I wouldn't worry about that; the CUDA 3.2 API doesn't know how GTX 770s are made up.
Arvid Almstrom's GTX 780 machine, the top NVIDIA host, also doesn't report the number of CUDA cores or GFLOPS:
http://einsteinathome.org/host/6216490
[18:11:04][11780][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 675 MB (2399 MB free / 3074 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[18:11:04][11780][INFO ] Using CUDA device #1 "GeForce GTX 780" (0 CUDA cores / 0.00 GFLOPS)
[18:11:04][11780][INFO ] Version of installed CUDA driver: 6000
[18:11:04][11780][INFO ] Version of CUDA driver API used: 3020
and on my GT650M:
[18:30:27][5600][INFO ] CUDA global memory status (initial GPU state, including context):
------> Used in total: 71 MB (1978 MB free / 2049 MB total) -> Used by this application (assuming a single GPU task): 0 MB
[18:30:27][5600][INFO ] Using CUDA device #0 "GeForce GT 650M" (0 CUDA cores / 0.00 GFLOPS)
[18:30:27][5600][INFO ] Version of installed CUDA driver: 6050
[18:30:27][5600][INFO ] Version of CUDA driver API used: 3020
Claggy
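A plausible mechanism for the zeros, offered as an assumption and not checked against the application's source: the CUDA runtime reports a device's compute capability and multiprocessor count but not a core count, so programs usually derive the core count from a lookup table keyed by compute capability. A table compiled in the CUDA 3.2 era would have no entry for Kepler cards such as the GTX 770 (compute capability 3.0). The Python sketch below uses hypothetical names (`cuda_cores` and the two tables) to show how an unknown architecture falls through to 0.

```python
# Cores per multiprocessor by (major, minor) compute capability, limited to
# what a CUDA 3.2-era tool could know (Tesla and Fermi generations).
CUDA32_ERA_CORES_PER_SM = {
    (1, 0): 8, (1, 1): 8, (1, 2): 8, (1, 3): 8,
    (2, 0): 32, (2, 1): 48,
}

# The same table extended with Kepler, as a newer tool would ship it.
KEPLER_AWARE_CORES_PER_SM = {**CUDA32_ERA_CORES_PER_SM, (3, 0): 192, (3, 5): 192}


def cuda_cores(major, minor, sm_count, table=CUDA32_ERA_CORES_PER_SM):
    """Total cores, or 0 when the compute capability is missing from the table."""
    return table.get((major, minor), 0) * sm_count


# GTX 580 (Fermi, compute capability 2.0, 16 multiprocessors): known to the old table.
print(cuda_cores(2, 0, 16))
# GTX 770 (Kepler, compute capability 3.0, 8 multiprocessors): unknown, so 0.
print(cuda_cores(3, 0, 8))
# Same card with a Kepler-aware table.
print(cuda_cores(3, 0, 8, table=KEPLER_AWARE_CORES_PER_SM))
```

If the application works this way, any GFLOPS estimate computed from the core count would also come out as 0.00, which matches the log lines above and says nothing about whether the card computes correctly.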