All things Nvidia GPU

Boca Raton Community HS
Joined: 4 Nov 15
Posts: 240
Credit: 10588985586
RAC: 21940003

Thanks for the reply - this answers my question!

Stephen "Heretic"
Joined: 5 Feb 17
Posts: 94
Credit: 645067679
RAC: 0

 . . OK, there are definitely gremlins in this system.  Share data is on and the graph is still there, but it shows only a single spike (representing a single day or so?) and then nothing, nada, zilch!

 . . I feel it is toying with me ... :)

Stephen

 

Stephen "Heretic"
Joined: 5 Feb 17
Posts: 94
Credit: 645067679
RAC: 0

 . . Well that single spike is still all that the graph shows, go figure.

 . . Sadly the weather is getting hotter here. The room holds at 28 °C overnight and hits 34 to 35 during the day. So I cannot keep the temps down and will soon have to shut down.  I have already shut down the hottest machine.

 . . Have fun ppl!

Stephen

 

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7224334931
RAC: 1016651

Stephen "Heretic" wrote:

 . . Well that single spike is still all that the graph shows, go figure.
<snip>

Stephen

I suggested to you early on (October 9) that a plausible cause was that a huge spike meant the autoscaling was rendering the other days not visible.  I was premature, as you had not yet enabled your data to be included in the published files.  But that explanation looks likely to me at the moment.

BOINCStats shows graphs for your Einstein account here:

https://www.boincstats.com/stats/5/user/detail/941970/charts

You might notice that the daily credit graph credits you with a single-day score of a bit over 367 million on October 10.  On the few recent days I checked, you scored a little under 2 million.

I think the graph you have been complaining about has a 60 day date range.  I suggest you look at it about December 11.  If the "vertical scale ruined by a spike" theory is correct, you may suddenly see something quite different from what you would have seen on December 7.
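
To put rough numbers on the autoscaling theory, here is a minimal sketch. It uses the two figures quoted above (about 367 million for the spike, about 2 million for a typical day) and assumes a linearly scaled y-axis and a 200 pixel plot height; the plot height is an illustrative guess, not something taken from the stats site.

```python
# Figures quoted in the post: one day of about 367 million credits,
# typical recent days a little under 2 million.
SPIKE = 367_000_000
TYPICAL = 2_000_000
CHART_HEIGHT_PX = 200  # assumed plot height, purely illustrative


def bar_height_px(value, axis_max, chart_height_px=CHART_HEIGHT_PX):
    """Height of a bar when the y-axis runs linearly from 0 to axis_max."""
    return value / axis_max * chart_height_px


# While the spike is inside the date range, it sets the axis maximum.
with_spike = bar_height_px(TYPICAL, axis_max=SPIKE)
# Once the spike drops out of the range, a typical day sets the maximum.
without_spike = bar_height_px(TYPICAL, axis_max=TYPICAL)

print(f"typical day with spike in range:    {with_spike:.2f} px")
print(f"typical day after spike leaves:     {without_spike:.0f} px")
```

Under those assumptions a normal day is drawn about one pixel high while the spike is in range, which would look like a flat line.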

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117651002777
RAC: 35178260

Stephen "Heretic" wrote:
. . Well that single spike is still all that the graph shows, go figure.

archae86 has given you the reason for that.  The problem is that when you finally allowed stats sharing, your full former production was assigned to that initial day.  With the large scale that was needed, the single days of production that followed were too small to register.

If you want to see a much better graphical representation of your production without bothering about stats sites, just use the stats tab in BOINC Manager - Advanced view.  If you insert a <save_stats_days>100</save_stats_days> line in cc_config.xml, you will be able to view much better graphs spanning over 3 months of total credit and RAC for both the machine itself and for all your hosts.
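
For anyone who would rather not edit the file by hand, the following is a minimal sketch of adding that line programmatically. It assumes the setting belongs inside the `<options>` section of `cc_config.xml`, as the BOINC client configuration documentation describes; the function name and the sample input are hypothetical.

```python
import xml.etree.ElementTree as ET


def set_save_stats_days(config_xml: str, days: int) -> str:
    """Return cc_config.xml text with <save_stats_days> set inside <options>.

    Assumes the usual layout: <cc_config><options>...</options></cc_config>.
    """
    root = ET.fromstring(config_xml)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")
    options = root.find("options")
    if options is None:
        # Create the section if the file does not have one yet.
        options = ET.SubElement(root, "options")
    entry = options.find("save_stats_days")
    if entry is None:
        entry = ET.SubElement(options, "save_stats_days")
    entry.text = str(days)
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal input; a real file may hold many other options.
example = "<cc_config><options></options></cc_config>"
updated = set_save_stats_days(example, 100)
print(updated)
```

The client typically has to re-read its config files (or be restarted) before the change takes effect.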

Stephen "Heretic" wrote:
 . . Sadly the weather is getting hotter here. The room holds at 28 °C overnight and hits 34 to 35 during the day. So I cannot keep the temps down and will soon have to shut down.  I have already shut down the hottest machine.

I'm in Brisbane and yesterday was a 35C day as well - a foretaste of what is to come as summer approaches.  I have ~140 machines running - without aircon but with industrial grade forced air ventilation in the computer room.  Outside air is sucked in at one end of the room and exhausted at the other end by large fans - the type used in places like bakeries, etc.

I do shut down about half the fleet when the really hot weather arrives, but haven't had to do that yet.  Last year, that started on 17th November and I was able to progressively restart those from late February/early March this year.  I find that running GPU tasks only and using a totally open frame for the case allows a machine to survive.  If the room temp hits 38C, I have a script that disables all crunching on all running machines for a selected time interval.  That does wonders for dropping the room temperature very quickly.  Last summer was not too bad - I only had to use that for a few hours on just 3 extreme days.
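
The script itself is not shown in this thread. As a rough illustration only, a sketch along these lines could build the `boinccmd` calls that suspend computing for a fixed time once the room reaches 38C. The host names, the password, and the way the room temperature is obtained are all hypothetical, and it assumes each client accepts remote GUI RPC connections.

```python
def suspend_commands(hosts, seconds, password):
    """Build one boinccmd call per host that stops all computing for `seconds`.

    Uses boinccmd's --set_run_mode with an optional duration in seconds,
    after which the client returns to its previous mode.
    """
    commands = []
    for host in hosts:
        commands.append([
            "boinccmd", "--host", host, "--passwd", password,
            "--set_run_mode", "never", str(seconds),
        ])
    return commands


def commands_for_temperature(room_temp_c, hosts, seconds, password, limit_c=38.0):
    """Return suspend commands only when the room is at or above the limit."""
    if room_temp_c < limit_c:
        return []
    return suspend_commands(hosts, seconds, password)


# Hypothetical fleet and reading; nothing is executed here, the commands
# are only printed so they can be inspected first.
fleet = ["host-01", "host-02"]
for cmd in commands_for_temperature(38.5, fleet, seconds=2 * 3600, password="secret"):
    print(" ".join(cmd))
```

Each list could be passed to `subprocess.run` once the commands look right for the machines in question.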

Cheers,
Gary.

Stephen "Heretic"
Joined: 5 Feb 17
Posts: 94
Credit: 645067679
RAC: 0

 . . Thanks Arch (if I can call you that),

 . . That clarifies a lot.  I had suspected there must have been some kind of glitch that produced that spike.  Given the numbers quoted, it must have somehow dumped my historical info on that single day to get numbers that high.  I had observed that either side of the spike is a flat line, which would be the actual values now dwarfed by the false "High".  I have a while to wait for it to normalise. Something to look forward to ...

Stephen

 

Stephen "Heretic"
Joined: 5 Feb 17
Posts: 94
Credit: 645067679
RAC: 0

 Hi Gary,

 . . That was my conclusion when I saw the data Archae86 referred to. No other explanation for a number that high.

 . . 140 machines??  I would have to move out to house that many. The forced ventilation sounds like a good solution, but I don't think I would be comfortable working in a wind tunnel environment :).  I might try that suggestion to expand the stats graphs to more than the current 60 days. ATM I am shutting down crunching on the hot machine during the day and restarting it at night.

 . . There is a small A/C unit up here but it is currently kaput.  I paid an A/C guy heaps to fix it and he just topped up the gas (damned expensive gas, that), but it stopped again shortly after. I think the capacitor in the condenser fan circuit has gone.  Trouble is, these cheap units are not designed to be readily serviced. I need to get a set of inline 240 V plugs to isolate the fan so I can remove it and replace that capacitor. It is only a 1.5 kW unit but it could help.

Stephen

 

Stephen "Heretic"
Joined: 5 Feb 17
Posts: 94
Credit: 645067679
RAC: 0

 . . Hi folks,

 . . NOTICE!!

 . . Nvidia users running Maxwell cards might want to update their video drivers to 470 or later to improve their GPU performance by running the latest v1.28 app.  The boffins have very nicely added Maxwell cards to the list of those who can receive this app. I have tested it with great satisfaction on GTX950 and GTX970 cards. Runtimes for GRPBS#1 are typically about 21 mins on the 950 and 11 mins on the 970s. The actual times will vary depending on manufacturer and general PC resources.

Stephen

 

ENJOY!

 

mikey
Joined: 22 Jan 05
Posts: 12689
Credit: 1839094536
RAC: 3733

Stephen "Heretic" wrote:

 Hi Gary,

 . . That was my conclusion when I saw the data Archae86 referred to. No other explanation for a number that high.

 . . 140 machines??  I would have to move out to house that many. The forced ventilation sounds like a good solution, but I don't think I would be comfortable working in a wind tunnel environment :).  I might try that suggestion to expand the stats graphs to more than the current 60 days. ATM I am shutting down crunching on the hot machine during the day and restarting it at night.

 . . There is a small A/C unit up here but it is currently kaput.  I paid an A/C guy heaps to fix it and he just topped up the gas (damned expensive gas, that), but it stopped again shortly after. I think the capacitor in the condenser fan circuit has gone.  Trouble is, these cheap units are not designed to be readily serviced. I need to get a set of inline 240 V plugs to isolate the fan so I can remove it and replace that capacitor. It is only a 1.5 kW unit but it could help.

Stephen

Would it be cheaper to replace it?

And the 'wind tunnel' effect could be reduced if you let the air in, say, above the door and put the exhaust fan in the ceiling. That way most of the air flow would be up, across and out, not through the space where you are. The other consideration with 140 machines is the electricity to run them; that would make for one heck of an electric bill.

Stephen "Heretic"
Joined: 5 Feb 17
Posts: 94
Credit: 645067679
RAC: 0

mikey wrote:

Would it be cheaper to replace it?

And the 'wind tunnel' effect could be reduced if you let the air in, say, above the door and put the exhaust fan in the ceiling. That way most of the air flow would be up, across and out, not through the space where you are. The other consideration with 140 machines is the electricity to run them; that would make for one heck of an electric bill.

 . . Every 3 months when I get my power bill just running 3 units makes me question my own commitment to the cause :)

 . . Maybe he has the biggest, bestest Solar array ever conceived :)

 . . If I can find the right inline connector unit to isolate the fan for maintenance, the capacitor should not cost much, so a repair would be much cheaper than replacing the unit.

Stephen

 
