Notification of errors

joe areeda
Joined: 13 Dec 10
Posts: 285
Credit: 320378898
RAC: 0
Topic 195755

I've been keeping my Ubuntu system up to date mostly through automatic updates.

Evidently I left a background system needing a reboot for a week and started getting errors like:

[22:45:38][28960][INFO ] Starting data processing...
Error: API mismatch: the NVIDIA kernel module has version 270.29,
but this NVIDIA driver component has version 270.41.03.  Please make
sure that the kernel module and all NVIDIA driver components
have the same version.
[22:45:38][28960][ERROR] Couldn't initialize CUDA driver API (error: 100)!
[22:45:38][28960][ERROR] Demodulation failed (error: 1020)!

There were other, much more complicated error messages, but rebooting seems to let things run without error, at least for a while. We'll see if the tasks start to complete and validate.

I think what happened was that the nVidia driver got updated, but a reboot was needed to load the new kernel module.
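
A mismatch like this can be picked out of the log text itself. The following is a minimal sketch, assuming the error message keeps the wording quoted above; the function name `find_api_mismatch` is made up for illustration and is not part of BOINC or the NVIDIA tools.

```python
import re

# A dotted version number such as 270.29 or 270.41.03.
VERSION = r"(\d+(?:\.\d+)*)"

# Pattern based on the error text quoted above. The message is wrapped
# across lines in the log, so whitespace is matched loosely.
MISMATCH_RE = re.compile(
    r"kernel\s+module\s+has\s+version\s+" + VERSION
    + r",\s+but\s+this\s+NVIDIA\s+driver\s+component\s+has\s+version\s+" + VERSION
)


def find_api_mismatch(log_text):
    """Return (kernel_module_version, component_version), or None if no mismatch is reported."""
    match = MISMATCH_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = (
    "Error: API mismatch: the NVIDIA kernel module has version 270.29,\n"
    "but this NVIDIA driver component has version 270.41.03.  Please make\n"
)
print(find_api_mismatch(sample))
```

If the two versions differ, the running kernel module is older or newer than the installed driver files, which is the situation a reboot clears up.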

So is there a way to be notified of errors or repeated errors without checking?
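
One low-tech option would be a small script on the host that scans the application log for `[ERROR]` lines and raises a flag when they repeat. This is only a sketch, assuming the `[time][pid][LEVEL] message` format shown above; the threshold and the function names are illustrative assumptions, and how the alert is delivered (email, cron output, and so on) is left out.

```python
import re
from collections import Counter

# Matches lines such as:
# [22:45:38][28960][ERROR] Demodulation failed (error: 1020)!
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[(\d+)\]\[(\w+)\s*\]\s*(.*)$")


def count_errors(lines):
    """Count how often each distinct ERROR message appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(3) == "ERROR":
            counts[match.group(4)] += 1
    return counts


def should_notify(counts, threshold=2):
    """Flag the host once the total number of errors reaches the (assumed) threshold."""
    return sum(counts.values()) >= threshold


sample_log = [
    "[22:45:38][28960][INFO ] Starting data processing...",
    "[22:45:38][28960][ERROR] Couldn't initialize CUDA driver API (error: 100)!",
    "[22:45:38][28960][ERROR] Demodulation failed (error: 1020)!",
]
counts = count_errors(sample_log)
print(dict(counts), should_notify(counts))
```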

Joe

telegd
Joined: 17 Apr 07
Posts: 91
Credit: 10212522
RAC: 0

Notification of errors

I am afraid it doesn't help you, but I never let my system update itself. Even with a fairly reliable OS like *buntu, the room for badness is just too great. If the machine is behind a NAT/Router/Firewall and doesn't share the LAN with untrusted clients, I can't see that an OS like Linux needs to be updated obsessively. Do it once every few weeks when you are sitting at the computer and can choose what needs to be done.

Also, I have had Kernel updates break my system too many times to count. The thought of updating it unmonitored just makes my skin crawl...

Just my personal opinion. Your situation may be different than mine.

joe areeda
Joined: 13 Dec 10
Posts: 285
Credit: 320378898
RAC: 0

RE: I am afraid it doesn't

Quote:

I am afraid it doesn't help you, but I never let my system update itself. Even with a fairly reliable OS like *buntu, the room for badness is just too great. If the machine is behind a NAT/Router/Firewall and doesn't share the LAN with untrusted clients, I can't see that an OS like Linux needs to be updated obsessively. Do it once every few weeks when you are sitting at the computer and can choose what needs to be done.

Also, I have had Kernel updates break my system too many times to count. The thought of updating it unmonitored just makes my skin crawl...

Just my personal opinion. Your situation may be different than mine.

This is one of my machines that is both on the Internet and behind the firewall so security updates are important.

Over the years, I've found myself updating with less and less checking of exactly what is being updated. I used to spend countless hours trying to figure out what Microsoft and RedHat/CentOS were changing, and then accepted 99.44% of the updates anyway. With Ubuntu, I haven't declined a suggested update yet and things seem to get better.

Perhaps you're right and I just have to accept occasional loss of valuable credits or spend a lot more time checking if I really need to update.

I do appreciate the comment.

Joe

mikey
Joined: 22 Jan 05
Posts: 12783
Credit: 1873219624
RAC: 1892407

RE: I've been keeping my

Quote:

I've been keeping my Ubuntu system up to date mostly through automatic updates.

Evidently I left a background system needing a reboot for a week and started getting errors like:

So is there a way to be notified of errors or repeated errors without checking?

Joe

There is not currently a system to do that. In the past, when that was done at Seti, there were many folks who either couldn't figure out how to fix the problem and just gave up, didn't like the response, etc. So the whole process was dropped; I am not sure how long it lasted, but not too long. Also, there are a lot of people who ignore the messages from a project when they do come through, so that way of communicating is not currently a good one. The problems are most likely project related. Scientists are good at what they do, but communication is not always one of those things!

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4332
Credit: 251751499
RAC: 36047

RE: There is not currently

Quote:
There is not currently a system to do that. In the past, when that was done at Seti, there were many folks who either couldn't figure out how to fix the problem and just gave up, didn't like the response, etc. So the whole process was dropped; I am not sure how long it lasted, but not too long.

In principle I like the idea of being notified (e.g. per email or PM) of errors on computers that run BOINC unsupervised. If there is or has been some code in BOINC to do that, I'd be happy if someone could point me to it. Even a hint of the time when that was implemented could be helpful (e.g. message board post).

BM


mikey
Joined: 22 Jan 05
Posts: 12783
Credit: 1873219624
RAC: 1892407

RE: RE: There is not

Quote:
Quote:
There is not currently a system to do that. In the past, when that was done at Seti, there were many folks who either couldn't figure out how to fix the problem and just gave up, didn't like the response, etc. So the whole process was dropped; I am not sure how long it lasted, but not too long.

In principle I like the idea of being notified (e.g. per email or PM) of errors on computers that run BOINC unsupervised. If there is or has been some code in BOINC to do that, I'd be happy if someone could point me to it. Even a hint of the time when that was implemented could be helpful (e.g. message board post).

BM

Oh good lord, that was eons ago, even before I was a 'forum moderator' at Seti, and I left Seti in March 2007. It was tried a few times but didn't work out well. As you know, less than 5% of the crunchers for a given project use the message boards; even though we can seem like a lot, we really aren't. The problem was that most had no clue they were having problems, and when they were told that they were, most just gave up and walked away. I am guessing, and I really AM guessing, that this is why the backoff on the number of units a PC can download was implemented for PCs that are having problems. I am out of town right now, but when I get back I will look up the name of the guy who might be able to pin it down closer for you. His name is Paul ?, he is handicapped, has a lot of different kinds of machines and wrote his own wiki about Boinc, but I just cannot think of his name right now!

Jord
Joined: 26 Jan 05
Posts: 2952
Credit: 5893653
RAC: 3

RE: Paul ? Paul D.

Quote:
Paul ?


Paul D. Buck.

Is he still around then? I thought he left another time?

paul milton
Joined: 16 Sep 05
Posts: 329
Credit: 35825044
RAC: 0

it could be done in such a

it could be done in such a way that it's informative, like:

dear valued participant,
we're sending this email to let you know that one of your hosts (host i.d. AND the "nice" computer name) is showing some errors. these errors could be caused by X Y Z and could be fixed by A B C or a simple reboot. please visit our forums here (link) for assistance in troubleshooting these errors.

-------

instead of sending a super technical email. though for power users there could be settings to be more technical, etc.
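
as a rough illustration of the two levels of detail, here is a small Python sketch. everything in it is hypothetical: the function name, the `technical` switch and the wording are placeholders, not anything BOINC provides.

```python
def build_error_notice(host_id, host_name, error_lines, technical=False):
    """Build a friendly notice; append the raw error lines only if technical=True."""
    lines = [
        "Dear valued participant,",
        "",
        f'One of your hosts (host ID {host_id}, "{host_name}") is showing some errors.',
        "A simple reboot may help; please visit the project forums for assistance.",
    ]
    if technical:
        # Power users who opted in also get the raw messages.
        lines.append("")
        lines.append("Most recent error messages:")
        lines.extend(f"  {error}" for error in error_lines)
    return "\n".join(lines)


errors = ["Demodulation failed (error: 1020)!"]
friendly = build_error_notice(12345, "garage-box", errors)
detailed = build_error_notice(12345, "garage-box", errors, technical=True)
print(friendly)
```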

it's a nice idea, but might be a pain to implement. would have been nice for when my other system went nuts. i was going to set up a ping script to keep tabs on it, but oddly enough, even though it was totally unresponsive, it did in fact reply to pings. so that idea went out the window lol
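
since a host can answer pings while being otherwise hung, checking whether work is still progressing may say more. a minimal Python sketch, assuming the watched machine regularly updates some file that the checker can see (the path and the 30 minute limit are hypothetical choices); it simply compares the file's modification time with the current time.

```python
import os
import time


def seconds_since_update(path, now=None):
    """Seconds elapsed since the file at `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def looks_hung(path, max_age_seconds=30 * 60, now=None):
    """True if the file is missing or has not been modified within max_age_seconds."""
    try:
        return seconds_since_update(path, now) > max_age_seconds
    except OSError:
        # A missing or unreadable file is treated as a sign of trouble.
        return True


if __name__ == "__main__":
    # Hypothetical path: any file the workload is known to touch regularly.
    print(looks_hung("/var/lib/boinc-client/stdoutdae.txt"))
```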

seeing without seeing is something the blind learn to do, and seeing beyond vision can be a gift.

mikey
Joined: 22 Jan 05
Posts: 12783
Credit: 1873219624
RAC: 1892407

RE: RE: Paul ? Paul D.

Quote:
Quote:
Paul ?

Paul D. Buck.

Is he still around then? I thought he left another time?

Yes that's him, I will look for him when I get home in a couple of days and see what he remembers. Thanks!

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4332
Credit: 251751499
RAC: 36047

RE: it could be done in

Quote:
it could be done in such a way that it's informative, [...] instead of sending a super technical email, though for power users there could be settings to be more technical.

I'm not thinking of spamming ordinary participants with error messages that they could see on their desktop computers anyway. It would be a feature that would be disabled by default, and only techies that run "headless" computers would enable it to get notified of errors on these.

BM

Note to me: It would probably best be implemented in "update_stats" (run once per day)
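
A rough sketch of the selection logic such a daily pass might use, written in Python for illustration only. It is not BOINC code: the `Host` record, the `notify_on_errors` flag and the function name are all assumptions, and the real "update_stats" is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Host:
    host_id: int
    name: str
    notify_on_errors: bool = False  # hypothetical opt-in flag, disabled by default
    errors_last_day: int = 0        # errors reported since the previous daily run


def hosts_to_notify(hosts, min_errors=1):
    """Select only hosts whose owner opted in and that reported enough errors."""
    return [
        host for host in hosts
        if host.notify_on_errors and host.errors_last_day >= min_errors
    ]


hosts = [
    Host(1, "desktop", notify_on_errors=False, errors_last_day=7),
    Host(2, "headless-gpu", notify_on_errors=True, errors_last_day=3),
    Host(3, "headless-ok", notify_on_errors=True, errors_last_day=0),
]
print([host.name for host in hosts_to_notify(hosts)])
```

With the flag off by default, ordinary desktop participants are never mailed, no matter how many errors their hosts report.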

BM

mikey
Joined: 22 Jan 05
Posts: 12783
Credit: 1873219624
RAC: 1892407

RE: RE: it could be done

Quote:
Quote:
it could be done in such a way that it's informative, [...] instead of sending a super technical email, though for power users there could be settings to be more technical.

I'm not thinking of spamming ordinary participants with error messages that they could see on their desktop computers anyway. It would be a feature that would be disabled by default, and only techies that run "headless" computers would enable it to get notified of errors on these.

BM

Note to me: It would probably best be implemented in "update_stats" (run once per day)

Sounds like a good plan. I will be home tomorrow and will look for Paul then.
