[SOLVED] CUDA Error - Exit Code 240

joe areeda
Joined: 13 Dec 10
Posts: 285
Credit: 320378898
RAC: 0
Topic 196553

Well, my brand-new GTX 670 has started to produce nothing but compute errors.

They seem to happen anywhere from 35 to 120 seconds into the task.

The ones I've looked at all say something like:

[20:13:34][4633][INFO ] CUDA global memory status (GPU setup complete):
------> Used in total: 1717 MB (331 MB free / 2048 MB total) -> Used by this application (assuming a single GPU task): 620 MB
[20:14:32][4633][INFO ] Checkpoint committed!
[20:15:31][4633][ERROR] Error during CUDA device->host HS power spectrum transfer (error: 702)
[20:15:31][4633][ERROR] Demodulation failed (error: 1008)!
20:15:31 (4633): called boinc_finish
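
For anyone comparing many of these stderr logs, here is a minimal Python sketch that pulls the ERROR lines and their numeric codes out of text in the format shown above. The function name `extract_errors` and both regular expressions are assumptions based only on the lines quoted here; they are not part of the Einstein@Home application.

```python
import re

# Line layout assumed from the excerpt above: [HH:MM:SS][pid][LEVEL] message
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[(\d+)\]\[(\w+)\s*\]\s*(.*)$")
# Numeric code assumed to appear as "(error: NNN)" inside the message
CODE_RE = re.compile(r"\(error:\s*(\d+)\)")

def extract_errors(log_text):
    """Return a list of (time, pid, error_code, message) for each ERROR line."""
    found = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m or m.group(3) != "ERROR":
            continue  # skip INFO lines and lines in another format
        code = CODE_RE.search(m.group(4))
        found.append((m.group(1), int(m.group(2)),
                      int(code.group(1)) if code else None, m.group(4)))
    return found

SAMPLE = """\
[20:13:34][4633][INFO ] CUDA global memory status (GPU setup complete):
[20:14:32][4633][INFO ] Checkpoint committed!
[20:15:31][4633][ERROR] Error during CUDA device->host HS power spectrum transfer (error: 702)
[20:15:31][4633][ERROR] Demodulation failed (error: 1008)!
20:15:31 (4633): called boinc_finish
"""

for when, pid, code, message in extract_errors(SAMPLE):
    print(when, pid, code, message)
```

Running it over several task logs makes it easy to see whether every failure reports the same pair of codes.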

Here are some examples:
http://einsteinathome.org/task/312162146
http://einsteinathome.org/task/312162145
http://einsteinathome.org/task/312161996

For the record, I'm on this system, http://einsteinathome.org/host/5771544, running Scientific Linux (a Red Hat derivative) with NVIDIA driver v304.43 from the repository.

That sounds like a hardware or driver problem to me. So I've slapped together a little Matlab script that:

- Generates a 4096x4096 array of random doubles
- Copies it to GPU memory, reads it back and compares the two
- Does an FFT on the array (1D, row-wise) on the CPU and on the GPU, then does an iFFT, takes the amplitude and computes an RMS error between the original and the result. Each device is checked against itself: CPU results with CPU results and GPU with GPU.
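
The round-trip part of that check can be sketched in plain Python. This is only a CPU-side illustration under stated assumptions: the original script is Matlab and is not shown in the thread, the `fft`, `ifft` and `roundtrip_rms` helpers are hypothetical names written for this example, and the matrix is much smaller than 4096x4096 so that it runs quickly. The GPU copy and GPU FFT steps are not covered.

```python
import cmath
import math
import random

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(x):
    """Inverse FFT with the usual 1/n normalization."""
    n = len(x)
    return [v / n for v in fft(x, inverse=True)]

def roundtrip_rms(rows):
    """Mean over rows of the RMS error between each row and |ifft(fft(row))|."""
    per_row = []
    for row in rows:
        back = ifft(fft(row))
        # Amplitude as sqrt(x * conj(x)); valid here because inputs are >= 0.
        amp = [math.sqrt((v * v.conjugate()).real) for v in back]
        mse = sum((a - b) ** 2 for a, b in zip(row, amp)) / len(row)
        per_row.append(math.sqrt(mse))
    return sum(per_row) / len(per_row)

random.seed(0)  # fixed seed so repeated runs use the same data
rows = [[random.random() for _ in range(64)] for _ in range(8)]
print("mean row RMS error:", roundtrip_rms(rows))
```

Because the amplitude step discards sign, the comparison only makes sense for non-negative input, which is why the sketch draws values from [0, 1).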

All of that works just fine.
The memory transfers compare without errors, and the average round-trip errors for doubles are very similar for CPU (1.51e-13) and GPU (1.39e-13). For single-precision values the errors are 8.14e-5 (CPU) and 7.65e-5 (GPU).

Timing for the double-precision (DP) FFTs
[pre]
        FFT    iFFT     AMP     RMS
CPU    8.30   14.40    9.25   12.73
GPU    0.97    0.95    1.06    1.96
[/pre]

Timing for single precision
[pre]
        FFT    iFFT     AMP     RMS
CPU    6.16   11.31    4.80    8.23
GPU    0.27    0.27    1.06    0.78
[/pre]

The FFT and iFFT are done 100 times row-wise on a 4096x4096 matrix.

AMP is the conversion of the complex iFFT result back to real with sqrt(xi.*conj(xi)).

RMS is the mean over rows of the root-mean-square error of (original - result).
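
To put the two timing tables side by side, here is a small Python sketch that computes the CPU/GPU ratio for each step. The numbers are copied from the tables above; the dictionary layout and the `speedups` helper are just assumptions made for this illustration.

```python
# Timings copied from the two tables above (same units as reported there).
timings = {
    "double": {"CPU": [8.30, 14.40, 9.25, 12.73], "GPU": [0.97, 0.95, 1.06, 1.96]},
    "single": {"CPU": [6.16, 11.31, 4.80, 8.23], "GPU": [0.27, 0.27, 1.06, 0.78]},
}
steps = ["FFT", "iFFT", "AMP", "RMS"]

def speedups(table):
    """Return {step: CPU time / GPU time} for one precision."""
    return {s: c / g for s, c, g in zip(steps, table["CPU"], table["GPU"])}

for precision, table in timings.items():
    for step, ratio in speedups(table).items():
        print(f"{precision:6s} {step:4s} CPU/GPU = {ratio:.1f}x")
```

By these figures the FFT itself runs roughly 8.6 times faster on the GPU in double precision and roughly 23 times faster in single precision.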

Any ideas what else I can test?

Joe

joe areeda

Well, I'm getting more information although I'm still confused.

My home setup is dual monitors and multiple computers on separate switch boxes.

I disconnected one monitor from that system and didn't switch the other, and lo and behold, the 3 running CUDA jobs finished successfully.

Now I'm trying to leave one disconnected but switch the other to a different system to see what happens.

I did notice that the latest drivers detect monitor connects and disconnects on the fly.

Anybody know how to turn that off?

Joe

joe areeda

OK, so with only one monitor connected, I can switch back and forth to that system with CUDA tasks running and they complete without error. We'll see if they also validate, but I expect they will.

Just goes to prove "one man's signal is another man's noise", or is it "one man's feature is another man's bug"?

Anyway, this fancy new NVIDIA driver, 304.43, should not be used with dual monitors if you're switching video.

At least I got some CUDA diagnostic and timing tests started. Anybody with Matlab and the Parallel Computing Toolbox who is interested in a short script is welcome to it.

Joe

joe areeda

They are validating.

Is there a way to add the [SOLVED] tag to the thread title?

Joe

Gundolf Jahn
Joined: 1 Mar 05
Posts: 1079
Credit: 341280
RAC: 0

Quote:
Is there a way to add the [SOLVED] tag to the thread title?


Yes.

Post a new message to this thread. When editing that message within the hour, you can change the thread title too.

Regards,
Gundolf

Computers aren't everything in life. (Just kidding)

joe areeda

Quote:
Quote:
Is there a way to add the [SOLVED] tag to the thread title?

Yes.

Post a new message to this thread. When editing that message within the hour, you can change the thread title too.

Regards,
Gundolf


Thanks Gundolf!

joe areeda

A new message to try to indicate the issue was resolved.

As Gundolf said, post a new message; don't reply to or quote an existing one.
