What determines whether a host is assigned to a series with long or short WUs?
One of my hosts was working on a series with long WUs. When that series was finished, the host was assigned to a series with short WUs. But in less than 24 hours, the host was assigned to a series with long WUs again, even though the series with short WUs was not finished. Other hosts are still working on WUs in that series.
My host was completing a short WU in 70 min. When it was working on two WUs at the same time, it returned one result every 35 min. Several of the hosts still working on the series with short WUs take longer to complete a WU, some up to 1000 sec. more per WU, and they only work on one WU at a time.
So why was my host detached from that series in less than 24 hours, when it in fact completed and returned results faster than hosts still working on the same series?
And my host was running non-stop.
Another one of my hosts has been working on a series with long WUs for quite some time now.
The host has completed 55 WUs from the same series. It takes the host 7.5 hours to complete one WU. Even so, the WUs the host is assigned come in sequence (..631, ..630, ..629). Only 3-4 members are working on this series with long WUs, and it has been the same teammates all along. So my host must be ahead in that small team, since it gets WUs in sequence? It's the first to get a new WU from the series.
Why do you assign a bunch of hosts to short WUs and only a few hosts to long WUs, and even detach fast hosts from series with short WUs? (My host was not the only fast host to be detached.)
Why don't you assign more hosts to each series in general?
This doesn't make sense to me. Don't you want to get the job (a series) done as quickly as possible?
I have 3 hosts running for E@H. All 3 are working on long WUs. Short WUs would be nice from time to time.
Dimmerjas
Long or short WU's?
Here Bruce describes the current algorithm for determining whether a host is "slow". If it is, it will be given a "short datafile" (i.e. a datafile that corresponds to short workunits) when it needs to download a new one. Fast hosts are given datafiles (and thus long or short tasks, depending on the datafile) simply as they come, more or less randomly from the database.
BM
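To make the rule in the post above concrete, here is a minimal Python sketch. It is not the project's scheduler code: the function name `pick_datafile`, the `(name, kind)` tuples, and the fallback for a slow host when no short datafile exists are all assumptions made for illustration. How a host gets classified as "slow" is not modelled at all.

```python
import random

def pick_datafile(host_is_slow, datafiles, rng=random):
    """Pick a datafile for a host that needs to download a new one.

    datafiles: list of (name, kind) tuples, kind is "short" or "long".
    Hypothetical helper; only mirrors the rule described in the post.
    """
    if host_is_slow:
        short = [d for d in datafiles if d[1] == "short"]
        if short:
            # Slow hosts are steered to short datafiles.
            return rng.choice(short)
        # Assumption: with no short datafile left, fall through to any.
    # Fast hosts take whatever comes, more or less at random.
    return rng.choice(datafiles)

# Made-up datafile names, for illustration only.
files = [("file_a", "short"), ("file_b", "long"), ("file_c", "long")]
print(pick_datafile(True, files))   # always the short datafile here
print(pick_datafile(False, files))  # any of the three
```

Under this rule a fast host can still land on short WUs purely by chance, which is what the following posts are about.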
Mr Bruce better find a way to NOT send short WUs to fast computers. My Core2 will finish almost 3 of the short WUs per CPU in 1 hour, which means I could run out of work since the quota is only 64 per CPU, am I right? It's a close call, but I could experience idle time if there are no long WUs mixed in. I want to keep this rig at 100% load 24/7, and I want only 1 project. Next week I will be slamming in a faster Core2 and going for a considerably higher overclock; 64 short WUs per CPU would leave me with many hours of idle time.
Team Philippines
RE: Mr Bruce better find a
Read here.
But perhaps there should be an algorithm that sets an upper bound for credits/CPUSeconds and only distributes a large dataset to a very fast host?
RE: readhere But perhaps
I read that already. I think the quota increased from 32 to 64 per CPU just after Bruce wrote that, or is he going to increase it even more? With 64 per CPU, it's still not enough if I were to get only short WUs. I finished a short one in 1306.77 seconds; multiplied by 64, that's only 83633 seconds, and 1 day is 86400, so I would have almost 3000 seconds idle per CPU if I suddenly got only short WUs. I hope the chance of getting only shorties is remotely small.
I think the "upper bound algorithm" is the way to go; then a quota of 32 per CPU would be more than enough for now.
Team Philippines
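The idle-time estimate in the post above can be checked with a few lines of Python. This is only a sketch of the poster's arithmetic; the function name `idle_seconds` is made up here, and it assumes every WU takes exactly the quoted 1306.77 seconds and that the quota is the only limit on work.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

def idle_seconds(seconds_per_wu, daily_quota):
    """Seconds per day a CPU would sit idle once the daily quota is used up."""
    busy = seconds_per_wu * daily_quota
    return max(0.0, SECONDS_PER_DAY - busy)

# Figures from the post: 1306.77 s per short WU, quota of 64 per CPU.
print(round(idle_seconds(1306.77, 64), 2))  # 2766.72
```

So the "almost 3000 seconds" is about 2767 seconds, or roughly 46 minutes of idle time per CPU per day in the worst case of receiving only short WUs.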
RE: I read that already, i
I'll bump up the quota a little bit more...
Bruce
Director, Einstein@Home
A host that can handle 4 WUs at the same time, and needs 95 min. for 1 short WU because of that, is obviously considered a "slow" host and can keep working on short WUs, even though it returns a result every 23.75 min. (95 min./4).
My host can only handle 2 WUs at the same time. It was using 70 min. on each WU and returning a result every 35 minutes. That made it a "fast" host, and it was detached from the same series of short WUs as the host described above.
So what makes a host slow or fast? The CPU time used on a WU, or the interval between returned results?
My host never reached the daily quota of 32 WUs, when it was assigned to a series with long WUs instead. The host had a small queue, and the long WUs were simply put in after the short WUs.
Dimmerjas
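The two measures contrasted in the post above can be put side by side in a short Python sketch. The helper name `minutes_between_results` is hypothetical, and the sketch assumes the host runs all its WUs in parallel at a steady rate; it says nothing about which measure the project actually uses.

```python
def minutes_between_results(minutes_per_wu, concurrent_wus):
    """Average interval between returned results when WUs run in parallel."""
    return minutes_per_wu / concurrent_wus

# (minutes per WU, WUs crunched at the same time), figures from the post
hosts = {
    "4 WUs at once, 95 min each": (95, 4),
    "2 WUs at once, 70 min each": (70, 2),
}

for label, (per_wu, parallel) in hosts.items():
    interval = minutes_between_results(per_wu, parallel)
    print(f"{label}: one result every {interval} min")
```

Ranked by time per WU, the second host is faster (70 min against 95 min); ranked by interval between results, the first host is faster (23.75 min against 35 min).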
Everything needs to be crunched. Sometimes there are not many (or any) long units split for the queue, so you will get short ones at that time.
I am unsure how the splitter and schedulers work, but I am sure that it takes a lot longer to split off a longer WU, just as it takes us longer to crunch it. Plus, who knows how it's decided whether the data should go into a long or a short WU. Bruce and Bernd could probably explain this a little more.
It just might be that a lot of the long units are already done, and now the short units need to get done. I am unsure if there is a lot of control over that.
Of my seven machines, six take on long and short, and one takes only short. I have seen all 7 on short at times. It's the luck of the draw and what is available at the time you ask for work. They only generate work at a certain pace (it looks like they currently keep less than a 1 day cache).
He kicked the number up to 72 per processor per day (up to 4 processors, or 288 results a day). At that rate you would need to do results in 3 minutes on a 1-4 processor machine to run out. The machines with more than 4 processors will have to adjust their time accordingly.
I am also betting that with the number at 72, your machine should get some large ones in that, unless there are none available.
RE: At that rate you would
I guess you mean 3 results per hour. With 3 minutes per WU you would crunch 20 shorties per hour, or 480 per day per CPU.
Team Philippines
RE: RE: At that rate you
Correct, fast math got me (plus my phone rang).
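For anyone who wants to redo the quota arithmetic from this exchange, a small Python sketch follows. The function names are invented for illustration, and it assumes a CPU crunches one WU at a time around the clock.

```python
MINUTES_PER_DAY = 24 * 60  # 1440

def results_per_day(minutes_per_wu):
    """Results one CPU returns per day at a fixed time per WU."""
    return MINUTES_PER_DAY / minutes_per_wu

def minutes_per_wu_to_exhaust(quota_per_cpu):
    """Time per WU at which one CPU uses up exactly its daily quota."""
    return MINUTES_PER_DAY / quota_per_cpu

print(72 * 4)                         # 288 results a day on 4 processors
print(minutes_per_wu_to_exhaust(72))  # 20.0 minutes, i.e. 3 results per hour
print(results_per_day(3))             # 480.0 per day at 3 minutes per WU
```

With a quota of 72 per processor, a CPU runs out of work only if it finishes WUs in under 20 minutes each, which matches the corrected figure of 3 results per hour.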