Greetings all,
I've been getting a lot of compute errors lately on tasks given to my GPU. I upgraded my drivers, but the errors continue. Some tasks fail after as little as 30 seconds of processing time. I do not have this issue on the other project I run (SETI@Home), and the same work units appear to be computed fine by other people.
My GPU is an Nvidia GeForce GT 650M in a MacBook running Windows 10 via Boot Camp.
Here is some copy and paste from the work units:
Error in OpenCL context: CL_MEM_OBJECT_ALLOCATION_FAILURE error executing CL_COMMAND_WRITE_BUFFER on GeForce GT 650M (Device 0).
Error during OpenCL bloc_info host->device transfer - candidates (error: -4)
ERROR: opencl_prepare_power_toplist() returned with error 0
01:39:56 (13300): [CRITICAL]: ERROR: MAIN() returned with error '1'
FPU status flags: PRECISION
01:40:08 (13300): [normal]: done. calling boinc_finish(65).
01:40:08 (13300): called boinc_finish
ERROR: /home/bema/fermilat/src/bridge_fft_clfft.c:1175: kernel kernel_sortedPhoton failed. status=-4
error in opencl_qsort
Error in OpenCL context: CL_MEM_OBJECT_ALLOCATION_FAILURE error executing CL_COMMAND_NDRANGE_KERNEL on GeForce GT 650M (Device 0).
18:15:30 (13352): [CRITICAL]: ERROR: MAIN() returned with error '1'
FPU status flags: PRECISION
18:15:42 (13352): [normal]: done. calling boinc_finish(65).
18:15:42 (13352): called boinc_finish
Error in OpenCL context: CL_MEM_OBJECT_ALLOCATION_FAILURE error executing CL_COMMAND_NDRANGE_KERNEL on GeForce GT 650M (Device 0).
ERROR: /home/bema/fermilat/src/bridge_fft_clfft.c:1175: kernel kernel_sortedPhoton failed. status=-4
error in opencl_qsort
15:33:32 (2616): [CRITICAL]: ERROR: MAIN() returned with error '1'
FPU status flags: PRECISION
15:33:44 (2616): [normal]: done. calling boinc_finish(65).
15:33:44 (2616): called boinc_finish
Copyright © 2024 Einstein@Home. All rights reserved.
Your GT 650M has only 1 GB of memory. Also, it supports only OpenCL 1.1. Both specifications seem crucial to me.
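For reference, the "status=-4" and "(error: -4)" in the pasted logs are the numeric form of CL_MEM_OBJECT_ALLOCATION_FAILURE, which fits with the card running short of memory. Below is a minimal Python sketch for decoding such codes from a pasted log. The numeric values come from the OpenCL header cl.h; the helper name and the regular expression are my own assumptions, not part of BOINC or the Einstein@Home application.

```python
import re

# Numeric values as defined in the OpenCL header cl.h (first few codes only).
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Assumption: codes appear as "status=N" or "(error: N)", as in the logs above.
STATUS_PATTERN = re.compile(r"(?:status=|\(error:\s*)(-?\d+)")


def decode_status_codes(log_text):
    """Hypothetical helper: return (code, name) for each status found in the log."""
    codes = [int(match) for match in STATUS_PATTERN.findall(log_text)]
    return [(code, OPENCL_ERRORS.get(code, "unknown")) for code in codes]
```

Calling decode_status_codes() on the stderr text of a failed task lists every code it finds, so it is easy to see whether all the failures share the same memory-related cause.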
This is a relatively new occurrence, and some GPU tasks seem to chug along okay. Is it just luck of the draw on what units are sent my way?
Are you running more than 1 WU at a time by any chance? Check whether your GPU has any memory errors using HWInfo. Also check the GPU and system memory usage while the WUs are running.
I don't believe it's anything to do with the tasks being sent to you. A task is just a set of different parameters with which to analyse a particular block of data. A given block of data will have many, many tasks spread over a wide range of hosts. If there were some data related problem, there would be lots of complaints from a wide range of volunteers.
I would be quite confident that the issue is to do with your hardware and the conditions under which it is operating. In addition to having problems with GPU tasks, there are also GW task failures. Of the 27 compute errors currently in the database, 7 are GW CPU tasks. The fact that this has recently started happening points to some component degrading or going out of spec, or perhaps to something heat related. Your MacBook shows as having 8 threads. How many are actually running CPU tasks? How good is the flow of cooling air?
In my experience, running BOINC on a laptop may shorten its life considerably. Laptops are really not designed to handle 100% load for extended periods. It's critical to monitor operating temperatures closely and to clean air passageways/filters regularly. How confident are you that there is a very good supply of cooling air?
I notice you have two computers crunching here. The second appears to be a desktop with a Pentium dual core processor and no discrete GPU. I have quite a few Pentium dual core hosts and I run budget GPUs like the RX 460 very successfully (RAC >250K). If you were to put something like a GTX 1050 or an RX 460 in your host, you could improve your output considerably and protect your MacBook from, potentially, a premature demise. Both the mentioned GPUs are reasonably small and quite low on power requirements. Perhaps your existing case and PSU might be suitable but you would need to check that. Both GPUs are available without the need for any external PCIe power connector.
Cheers,
Gary.
Gary,
That's some good information and I appreciate you taking the time. I have some technical background to the point of building a system or two, but not quite to the level of deciphering the logs that the BOINC client gives.
I didn't think it was the task itself, for the reasons you mentioned. I can see some of the ones I error on were completed fine by other people. I suspected something specific to my machine or settings so you've hit some good information for me there.
It is a 2013-era MacBook and it is probably on its way out from good use and all. I run CPU tasks almost any time it's on, but I only run the GPU during idle times. As I type this I have 8 tasks going, 4 each for SETI and Einstein. Air flow should be okay, but it's possible that age and normal wear and tear have caused heat-related degradation internally. While I doubt it's specifically heat related, the keyboard does like to act up sometimes, and some of the keys have the coating rubbing off.
A new computer is in my plans, so I may dial back my usage on this laptop and prolong its life as much as I can. I'd hate for it to pop and sizzle on me at a bad moment. Better check when my last backup was, ha :)
Thanks again!
No problem - you're very welcome!
Good luck with your MacBook. They're very expensive to fix/replace!
Cheers,
Gary.