I've installed Einstein@Home with BOINC 7.4.42 on my Intel/Nvidia Win 8.1 64-bit PC. Every time I start Einstein it begins to work normally on the tasks, but after a few minutes it makes my PC freeze (while its temperature makes my lap sweat). First the keyboard stops working, then the mouse, and I'm forced to do a cold reset. I already tried pausing single tasks and all tasks without success. I have to pause the whole project in BOINC to avoid this. Other BOINC projects run normally.
B.t.w. I've never seen that screensaver. Maybe the moment when it starts to freeze is when the screensaver should start working. What's going wrong here? I don't know. It's strange.
Does anybody have a clue or experienced the same problem?
Copyright © 2024 Einstein@Home. All rights reserved.
Albert makes my computer freeze
Hi Meck,
Welcome to Einstein@Home!
From your description, I presume your machine is a laptop and that it's getting rather hot. If so, that's pretty much your problem right there. Laptop cooling systems often have problems coping with the heat generated if you have a full 100% crunching load (the GPU plus all CPU cores). A lot of heat will be generated under these conditions. If a machine overheats, it will lock up or shut down for self protection.
The first thing to check is that none of the airways are blocked or restricted in any way. If the machine is relatively new, you wouldn't expect (yet) any significant buildup of dust/fluff blocking fans/filters but you should check anyway. If you work with the machine on your lap, perhaps you are restricting the airways yourself.
If the cooling system is clear and unrestricted, your only options are to reduce the crunching load by using some sort of throttling software or by restricting, through preferences, the number of cores/GPU allowed to run tasks. You might consider 'suspending' (in BOINC Manager) all CPU tasks and see if the GPU can run on its own. If it can, you might 'resume' a single CPU task and see if that combination will also work (1 GPU task + 1 CPU task). You would want to see no lockups for a significant time to be sure. You should be monitoring temperatures with a temperature measuring utility to see how hot the machine is getting.
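For the temperature monitoring mentioned above, one option on a machine with an Nvidia GPU is to poll `nvidia-smi`, the command-line tool that comes with the Nvidia driver. What follows is only a rough sketch in Python, assuming `nvidia-smi` is on the PATH. The function names and the 80 °C limit are made up for illustration and are not a recommended 'safe' value.

```python
import subprocess

def parse_gpu_temps(text):
    """Turn nvidia-smi CSV output (one temperature per line) into a list of ints."""
    return [int(line.strip()) for line in text.splitlines() if line.strip()]

def read_gpu_temps():
    """Ask nvidia-smi for the current temperature of each GPU, in degrees C."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_temps(out)

def too_hot(temps, limit_c=80):
    """True if any GPU is above the chosen limit (80 is an illustrative value)."""
    return any(t > limit_c for t in temps)

# Example use (needs an Nvidia GPU and driver):
#   temps = read_gpu_temps()
#   print(temps, "too hot" if too_hot(temps) else "ok")
```

Running something like this every minute or so while resuming tasks one at a time would show which combination pushes the temperature past the limit you have chosen.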
By trial and error, you may be able to find a suitable mix of CPU/GPU tasks that your machine is able to handle. When you know this mix, you should set your preferences to allow this combination. You will find links to various sets of preferences on your account page on the website. You can also set preferences locally in BOINC Manager. Local preferences take precedence over website preferences. When things are running smoothly and preferences are in place, you can 'resume' all remaining 'suspended' CPU tasks and allow BOINC to manage things.
Cheers,
Gary.
RE: ... By trial and error,
O.K. Thank you Gary for your information. I'm gonna try several settings and hope this won't damage the machine completely one day. Are there any risky parameters that one should under no circumstances change?
I am having the same problem
I am having the same problem after rejoining Einstein after an absence of a few months. I have used TThrottle since I bought this laptop 2 years ago, and never had this problem before. I also notice that I started getting GPU jobs for Einstein, which I did not get before.
The only way I could send this post was to use Task Manager to shut down all of BOINC right after restarting the laptop. If I let BOINC Manager start normally, my machine locks up all keyboard, mouse, and touch screen input faster than I can turn off Einstein from within the Manager. From the very brief look at the TThrottle graphs I get, it seems that TThrottle is having no effect on GPU percent usage or temperature. It just shoots straight up as soon as BOINC Manager completes the connection.
First, does anybody have a suggestion how to turn off Einstein tasks without letting BOINC Manager do a full start? I would like to do this to isolate the problem.
Second, assuming it is Einstein causing the lockup, what gives?
RE: I'm gonna try several
Providing sufficient cooling is always going to be a problem for laptops under 100% load. My personal opinion is that people using laptops for crunching are taking quite a risk of having heat related issues (and shortened life) much earlier than for systems where you can have better cooling by improving the airflow or taking off side panels, etc. You should decide on a 'safe' temperature you are prepared to accept and stay below that limit. What is 'safe' is very much a matter of opinion. You should decide that after doing a bit of research. You should monitor temperatures regularly and inspect the cooling system if you see increases above the usual values.
Not really. The machine will tell you when it doesn't like the operating conditions. The worst that will happen is that the machine will become unstable and crash or reboot itself if you try to run too much at once. If you get an occasional crash, the machine is telling you it's too hot so you should reduce the load to keep the temperature below that range by a sensible margin, particularly if you're concerned about prolonging its life.
Cheers,
Gary.
RE: ... From the very brief
I don't use Windows or TThrottle but, from memory, I think I read that TThrottle only throttles CPU usage. I'm surprised that the machine fails as quickly as you describe. That suggests your cooling may be close to fully blocked off. When was the last time you inspected and cleaned all filters/heat sink fins, etc, and checked that fans are free-running and moving the right amount of air? You should stick your hand in front of the air exhaust point to test the flow rate and how hot it is. If you can't feel a good strong, warm flow when the machine is working hard, you have a blockage problem somewhere.
If cleaning the airways doesn't allow you enough time to get into BOINC Manager and suspend crunching there, you could always change computing settings on the website to suspend crunching when the machine is 'in use'. That way, on startup, your machine will be 'in use' while you play around with BOINC manager and disable enough stuff to allow a gradual startup, task by task, to see how much your machine can actually cope with.
Your machine has 8 cores. You could also think about changing the website settings to reduce the number of cores BOINC is allowed to use. There are other preference settings to restrict BOINC in various ways. Have you looked at any of those?
You could disable BOINC from auto-starting at all during Windows startup. There would be a simple setting (probably a tick box) somewhere to do that. The last time I used Windows (XP) was around 8 years ago (or more) and auto-starting was easy to disable then, although I don't remember exactly how :-).
If you change prefs on the website to restrict BOINC, you still have the problem of the time needed to start BOINC and then read the changed settings by communicating with the website and receiving a response. If the lockup is as fast as you suggest (and cleaning doesn't help) I would be editing the preferences locally without starting BOINC at all. It's very easy to do with a simple text editor like Windows notepad (if it still exists - I don't know). In the BOINC Data directory (folder) you have files like client_state.xml, global_prefs.xml and perhaps global_prefs_override.xml if you've ever changed your preference settings locally through BOINC Manager. If you browse global_prefs.xml, you will see all your website settings including this one
<run_if_user_active>1</run_if_user_active>
as well as all the others, many of which are reasonably obvious from the tags used. In a lot of cases, including this one, the value of '1' means 'yes', ie. allow tasks to run even if the user is active on the machine. If you change it to '0' it means 'no', don't allow tasks to run if the user is active. There is a separate setting you could also change if the machine still locks up with just a single GPU task running. If you make the change on both settings and save the file, you could then immediately start BOINC and have no crunching while you were typing on the keyboard or moving the mouse. Check on the website prefs page to see the delay period before BOINC would try to start crunching after the last 'activity'.
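If you'd rather not hand-edit the file, the same change can be scripted. Below is a minimal sketch in Python, not an official BOINC tool. It assumes the two settings appear in the file as `run_if_user_active` and `run_gpu_if_user_active` (the tag names as I understand BOINC to use them) and that the file parses as ordinary XML. The function name is hypothetical. It only changes tags that are already present and leaves everything else alone.

```python
import xml.etree.ElementTree as ET

# Tag names assumed from BOINC's preference files; check your own file first.
ACTIVE_TAGS = ("run_if_user_active", "run_gpu_if_user_active")

def disable_run_while_active(xml_text):
    """Return the prefs XML with every existing 'run while active' flag set to 0."""
    root = ET.fromstring(xml_text)
    for tag in ACTIVE_TAGS:
        for elem in root.iter(tag):   # also catches copies inside venue sections
            elem.text = "0"
    return ET.tostring(root, encoding="unicode")

# Example use with BOINC NOT running (the path is an assumption, adjust it to
# your own BOINC Data directory, and keep a backup copy of the file):
#   path = r"C:\ProgramData\BOINC\global_prefs.xml"
#   with open(path, encoding="utf-8") as f:
#       new_text = disable_run_while_active(f.read())
#   with open(path, "w", encoding="utf-8") as f:
#       f.write(new_text)
```

Bear in mind that the website preferences will replace this file again the next time the client fetches them, so this only buys time to get into BOINC Manager and make a lasting change.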
Your machine hasn't got sufficient cooling to operate at 100% load. Reduce your compute load until your cooling system is able to cope. You could restrict the number of CPU cores BOINC is allowed to use and/or you could deselect those science runs that use your GPU. You should look through all the preference settings on the website and ask questions if you don't understand the purpose of what you find there. I would be quite surprised if your machine couldn't run with a single GPU task and a (perhaps heavily) reduced number of CPU cores. What you run can easily be selected on the website. Using the GPU can be very productive and in the future, more science runs will tend to use it for the increased productivity.
Cheers,
Gary.
RE: First, does anybody
Copied from my reply on the BOINC message board (in case that board dies again before you get a chance to read it).
Thank you Richard. I managed
Thank you Richard. I managed to kill Einstein, then restart BOINC Manager. The laptop is running nicely now, crunching WCG, CPU only, without problems.
In answer to other suggestions: yes, I clean the air passages regularly. I only use 6 CPUs for BOINC, and this has worked well for me since getting this machine. The fans and TThrottle are now running as I normally see them. TThrottle definitely throttled the GPU on this machine previously, and on my previous laptop.
I am waiting for S@H jobs to come back before running the GPU again, as I have never had issues with them. After that, I will try Einstein again, CPU only.
BREAKING NEWS: S@H is back. I'm running one Cuda50 and one openCL_intel task without issues. TThrottle is throttling GPU to maintain the set GPU and CPU temperatures. For now, it appears my machine and Einstein GPU tasks do not play well together.
I just hope that there are no other users locked out of using their computers because of an incompatibility with Einstein. That sort of thing could give all BOINC projects a bad name.
Since I deactivated GPU usage
Since I deactivated GPU usage I haven't had that freeze problem again. I suppose there's an incompatibility between Einstein and my (original and up to date) Nvidia driver. But I'm going to keep testing it by changing parameters.
Another bad thing that increases the CPU load is that some projects, fortunately not Einstein, start new tasks even when they're told not to. After removing a task, or suspending and reactivating it, new tasks are automatically created although they shouldn't be because of my manual choice. Sometimes only stopping the whole project and then deleting some tasks helps to avoid an uncontrollable increase in the number of tasks.
Some other things you can do
Some other things you can do to help out are:
Raise the back of the laptop by putting something underneath. A couple of pink erasers work and you don't have to worry about scratching your desktop surface (if that's a concern). Whatever you use, make sure not to cover or block the vents.
Blow air at the laptop using a small fan.
Get a laptop cooler which sits under the laptop and blows air onto the underside of the laptop. These can be had relatively inexpensively in most electronics/computer sections.
In terms of settings, get a monitoring program that tells you the temperatures. If you are going to run your GPU then I suggest cutting back the number of CPU cores you use. You can also reduce the % CPU used. Start at 90% and work backwards in 5% increments. This allows a little cooling off.
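The "start at 90% and work backwards" idea can be written down as a small loop. This is just a sketch in Python of the procedure, not something BOINC provides. `read_temp_c` and `apply_cpu_pct` are hypothetical callbacks you would supply yourself (for example, reading your monitoring program's value after letting the machine settle for a while, and changing the % CPU setting by hand or by script).

```python
def find_cpu_limit(read_temp_c, apply_cpu_pct, limit_c,
                   start_pct=90, step_pct=5, min_pct=10):
    """Lower the CPU percentage in steps until the temperature stays at or
    below limit_c. Returns the first percentage that works, or None."""
    pct = start_pct
    while pct >= min_pct:
        apply_cpu_pct(pct)            # hypothetical: set the % CPU preference
        if read_temp_c() <= limit_c:  # hypothetical: settled temperature reading
            return pct
        pct -= step_pct
    return None

# Example with a made-up machine whose temperature rises with CPU load:
#   state = {}
#   result = find_cpu_limit(lambda: 50 + 0.5 * state["pct"],
#                           lambda p: state.update(pct=p),
#                           limit_c=85)
```

If the loop returns None, even the lowest setting tried was too hot, which points back at the cooling itself rather than the settings.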