Quote:
After reading all this, I'm still not clear on the future. We're soon to complete S5, right?
S5 data collection at the detectors finishes ~ the middle of this calendar year. The first S5 data analysis, 'S5R1' if you like, finishes in ~ a month ( perhaps a bit longer given recent server troubles ).
There isn't a one-to-one relationship between data collection and analysis because of the many different things that could be searched for. We'll be working on new/enlarged data sets at E@H soon - the software details, data sets etc are being hammered into shape at present. One point made at the AEI presentation by Bruce Allen was to keep the crunching parameters ( file sizes etc ) suitable for the lower speed connections around the world. I gather broadband is NOT the majority mode of communication but dial-up IS.
Quote:
Are we soon to also run out of crunching work for E@H?
No, plenty of science yet.
Quote:
Will there be any crunching down-time for us?
No, servers willing.
Quote:
And if so, when and for how long?
Not-applicable.
Just imagine Dirk Hartog ( or insert your favourite sea-farer here ) returning with a huge list of depth soundings, times/dates, latitudes/longitudes from a round the world trip ..... it'll take a while to do the map and deduce the geography. No shortage of cartography work .... :-)
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
Quote:
After reading all this, I'm still not clear on the future. We're soon to complete S5, right? Are we soon to also run out of crunching work for E@H? Will there be any crunching down-time for us? And if so, when and for how long?
We are soon to complete S5R1. I imagine S5R2 will follow. There will be no break if past performance is any sort of a guide.
The front page blurb says we are crunching the most sensitive 840 hours of data from S5. If we allow a factor of 0.75 for the fraction of time the detectors achieve science mode, those 840 hours would have taken about 1120 wall-clock hours to accumulate - i.e. less than 7 weeks. They have been accumulating S5 data for over a year now (I believe). To me that says they are accumulating data much faster than we are crunching it. I'm sure somebody will correct me if I'm wrong, but I think it's just about impossible to run out of data to crunch :).
Cheers,
Gary.
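For anyone who wants to check Gary's arithmetic, here is a quick back-of-the-envelope sketch in Python (the 0.75 duty cycle is his assumed figure, not an official one):

```python
# Quick check of the argument above. The 0.75 duty cycle is an
# assumption from the post, not an official figure.
science_hours = 840                      # most sensitive S5 data
duty_cycle = 0.75                        # assumed science-mode fraction

wall_clock_hours = science_hours / duty_cycle    # 1120 h
weeks = wall_clock_hours / (24 * 7)              # ~6.7 weeks
print(f"{wall_clock_hours:.0f} h of running = {weeks:.1f} weeks")
# A year of S5 running has therefore gathered far more data than
# the 840 h currently being analysed.
```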
Thanks for the answers!
Reno, NV Team: SETI.USA
Quote:
One point made at the AEI presentation by Bruce Allen was to keep the crunching parameters ( file sizes etc ) suitable for the lower speed connections around the world. I gather broadband is NOT the majority mode of communication but dial-up IS.
Hmmm. How about some kind of "poll", asking users what bandwidth they can guarantee? Or, I could imagine - thou hast to code it! ;) - some preference/setting in the client: "prefer BIG chunks" / "rather like it small and handy".
I mean, after some time it would be clear where to put the focus, rather than optimising peripheral questions...
Though, I do not know how deeply this approach of reducing file size impacts the maths behind it all.
One thing doesn't sound very logical: why reduce the size? You'll have to fetch/return it all anyway at some point, and the retrieving/sending of data can be made asynchronous (once an up-/download fails through connection loss, the transfer would be resumed next time... just like 'ftp'). The question is: can a client already operate on an incomplete dataset, or must it be fully available at the start?
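On the resume-like-'ftp' point: HTTP supports the same idea via Range requests, which is presumably how any BOINC-style client could pick up a broken transfer. A minimal illustrative sketch, assuming a made-up URL and file name - this is not the actual BOINC transfer code:

```python
# Sketch of a resumable download using HTTP Range requests.
# The URL and file name are hypothetical, for illustration only.
import os
import urllib.request

def resume_download(url: str, dest: str, chunk: int = 64 * 1024) -> None:
    """Append the missing bytes of url to dest, resuming if dest exists."""
    done = os.path.getsize(dest) if os.path.exists(dest) else 0
    req = urllib.request.Request(url)
    if done:
        # Ask only for the bytes we don't have yet; a real client would
        # also verify the server replies 206 Partial Content.
        req.add_header("Range", f"bytes={done}-")
    with urllib.request.urlopen(req) as resp, open(dest, "ab") as out:
        while True:
            block = resp.read(chunk)
            if not block:
                break
            out.write(block)

resume_download("http://example.org/data/h1_0450.0.zip", "h1_0450.0.zip")
```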
Interesting points. The whole dataset will be the largest yet. But I see the sense in setting the defaults to suit the weakest links ( no offense intended ) in the chain.
Bernard, any thoughts?
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
At the beginning of the run, the bandwidth bottleneck will be at the users' end - obviously, most significantly in the case of dial-up users.
But towards the end of the S5R1 run, we are seeing a different pattern. As we fill in the gaps for WUs which didn't get finished for whatever reason the first time round, I'm seeing my machines switch from datapack to datapack with increasing frequency - and almost all of them are h1 short WUs, with the biggest downloads and the shortest crunch time. One user yesterday commented on receiving 170MB in, near enough, a single download session.
That's tough for the end-user, but it must be horrendous for the server too - I'd like to see a download traffic graph, which must be growing almost exponentially!
If there was any chance of coding for XUL's "small and handy" option, there could be a double benefit: a preference opt-in for users on slower connections, and a server choice when only a few results need to be completed to finish off a sequence.
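For what it's worth, a server-side preference along those lines might look something like the sketch below - purely illustrative logic with made-up file names and sizes, not necessarily how the E@H scheduler actually works: prefer work whose data files the host already holds, and hold back large new downloads from hosts that have opted for small ones.

```python
# Illustrative sketch only -- not Einstein@Home's real scheduler.
# It combines the two ideas in this thread: a per-user bandwidth
# preference, and preferring work whose data the host already holds.
# File names and sizes are made up.

FILE_MB = {"h1_0450.0": 17.0, "h1_0450.1": 17.0, "l1_0300.0": 12.0}

def download_cost(wu_files, host_files):
    """MB the host would still have to fetch for this work unit."""
    return sum(FILE_MB[f] for f in wu_files - host_files)

def pick_workunit(queue, host_files, max_new_mb=None):
    """queue: list of (name, set_of_data_files). Pick the cheapest WU."""
    best = min(queue, key=lambda wu: download_cost(wu[1], host_files))
    if max_new_mb is not None and download_cost(best[1], host_files) > max_new_mb:
        return None  # nothing small enough for this host right now
    return best

# A dial-up host that already holds h1_0450.0 gets matching work first:
queue = [("wu_a", {"h1_0450.0"}), ("wu_b", {"l1_0300.0"})]
print(pick_workunit(queue, {"h1_0450.0"}, max_new_mb=5.0))
# -> ('wu_a', {'h1_0450.0'}): zero new download needed
```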
Quote:
But towards the end of the S5R1 run, we are seeing a different pattern. As we fill in the gaps for WUs which didn't get finished for whatever reason the first time round, I'm seeing my machines switch from datapack to datapack with increasing frequency - and almost all of them are h1 short WUs, with the biggest downloads and the shortest crunch time. One user yesterday commented on receiving 170MB in, near enough, a single download session.
This is something I have noticed as well. For example:
As of right now, my machine crunches 2 WUs every hour or so [dual core system]. Every 12 WUs come with an approximately 17-megabyte file associated with the smaller parts. So every 6 hours, I require what amounts to about 20 megs of data.
Due to the fantastically bad uptime of the project in the last 2 weeks [not bitching. Just saying], as of now I am grabbing the maximum amount of data allowable for my machine - 144 units. That is approximately 3.5 days of crunch time. So right now my client - just my dual core client, I have three other machines - requires about 70 megs of data per day transferred to me to keep up with the processing pace.
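Those figures hang together, as a quick check shows (all numbers taken from the post itself):

```python
# Sanity check on the figures quoted above (all from the post).
wus_per_hour = 2                 # dual-core host
mb_per_12_wus = 17               # data file shared by each batch of 12 WUs

mb_per_day = wus_per_hour * 24 / 12 * mb_per_12_wus    # 68 MB/day
days_for_144 = 144 / wus_per_hour / 24                 # 3 days
print(f"~{mb_per_day:.0f} MB/day; 144 WUs last ~{days_for_144:.0f} days")
# ~68 MB/day agrees with the ~70 MB/day quoted; at a steady 2 WUs/hour,
# 144 WUs is closer to 3 days than 3.5.
```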
I too would love to see a plot of bandwidth and traffic for the server[s? Not sure how E@H is structured in the back end]. I think the technical term is a "metric shitload" of data.
Quote:
At the beginning of the run, the bandwidth bottleneck will be at the users' end - obviously, most significantly in the case of dial-up users.
But towards the end of the S5R1 run, we are seeing a different pattern. As we fill in the gaps for WUs which didn't get finished for whatever reason the first time round, I'm seeing my machines switch from datapack to datapack with increasing frequency - and almost all of them are h1 short WUs, with the biggest downloads and the shortest crunch time. One user yesterday commented on receiving 170MB in, near enough, a single download session.
Wince... and let's be devil's advocates and say that was to simply finalise a single stray shorty unit!
Quote:
That's tough for the end-user, but it must be horrendous for the server too - I'd like to see a download traffic graph, which must be growing almost exponentially!
It would be interesting to see what other evidence supports the hypothesis that such end-of-run thrashing ( reminds me of virtual memory and swap files ) materially contributed to the failures. Certainly plausible.
Quote:
If there was any chance of coding for XUL's "small and handy" option, there could be a double benefit: a preference opt-in for users on slower connections, and a server choice when only a few results need to be completed to finish off a sequence.
Touché!! But I hear the stomping footfalls of the devs!! :-)
Cheers, Mike
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
As someone who CAN'T, as of yet, get any type of broadband, I would echo the above comments regarding dial-up as still being the most common means of accessing the web. The largest downloads I have had to date have been 31MB and, as best I can recall, that took the better part of 3 1/2 hrs. Manageable, but far from convenient.
F. Prefedt
In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move.....Douglas Adams
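Those dial-up figures are self-consistent, for what it's worth:

```python
# 31 MB over ~3.5 hours works out to roughly dial-up throughput.
mb, hours = 31, 3.5
kbit_per_s = mb * 8 * 1024 / (hours * 3600)
print(f"~{kbit_per_s:.0f} kbit/s effective")   # ~20 kbit/s
```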
I have a 4 Mbit/s ADSL connection and have downloaded several SuSE Linux releases of about 3.6 GB each. But since I use BitTorrent with an Azureus client, it can still take days.
Tullio
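For comparison, at the nominal line rate that download would be quick - the "days" come from the swarm, not the link:

```python
# 3.6 GB at a nominal 4 Mbit/s would take about two hours; the extra
# days reflect BitTorrent peer availability, not the ADSL link itself.
gb, mbit_per_s = 3.6, 4
ideal_hours = gb * 8 * 1024 / mbit_per_s / 3600
print(f"~{ideal_hours:.1f} h at full line rate")   # ~2.0 h
```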