I think it is truly amazing that anyone with fairly basic hardware can be involved in such a frontier project - searching for effects of ultra low magnitude due to cosmic events & neighbourhoods of incredible violence & extremes. Following along the intellectual trail of Mr Einstein is definitely a buzz .... :-)
The app improvements have contributed, along with a more persistent and powerful user base, to the early closure of R3 crunching and what sounds like a more ambitious usage of the algorithms for R4. Weighty Houghs eh? I'll have to read up about the Hough I think, as currently I could run over one in my car and not know it from a Wombat!
Quote:
a fat, blind, squat, living cylinder with short legs.
Our local vet Pippa is renowned for having an episode on Animal Planet under the topic of wombat 'mange'. Other marsupial dermatologists ( all two of you ) should watch Discovery Channel more often ..... :-)
I gather the sky grid will/could be used to select position pairings of a shorter-runtime/longer-runtime nature, so that the average runtime of each pair fluctuates relatively less ( compared to the current ~ 30% variance ) over all such pairings? That is, if one unit is some amount 'x' below the ( sinusoidal principal value ) average, then mate it with a unit about the same amount x above said average. Send them both out together, wrapped/separate/consecutive, to the one host ......
Cheers, Mike.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
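The pairing idea in the post above can be sketched in a few lines. This is only an illustration under stated assumptions: the unit ids and runtime estimates are made up, and `pair_short_with_long` is a hypothetical helper, not anything the project's workunit generator is known to do.

```python
def pair_short_with_long(runtimes):
    """Pair the shortest remaining unit with the longest remaining one.

    `runtimes` maps a (hypothetical) unit id to an estimated runtime.
    Returns a list of (short_id, long_id) tuples.
    """
    order = sorted(runtimes, key=runtimes.get)  # ids, shortest first
    pairs = []
    lo, hi = 0, len(order) - 1
    while lo < hi:
        pairs.append((order[lo], order[hi]))
        lo += 1
        hi -= 1
    return pairs  # with an odd count, the median unit stays unpaired


# Made-up estimates in hours, spread roughly +/-30% around 10 h.
estimates = {"a": 7.0, "b": 13.0, "c": 8.5, "d": 11.5, "e": 9.0, "f": 11.0}
pairs = pair_short_with_long(estimates)
pair_means = [(estimates[s] + estimates[l]) / 2 for s, l in pairs]
print(pairs)
print(pair_means)
```

With these symmetric toy numbers every pair averages the same runtime; real estimates would only approximately cancel.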
There have been two proposals in our group: mine was to distribute the sky-grid points such that every workunit covers the whole sky, just with a much coarser grid, and let the remaining workunits cover the points in between, so that we end up with a grid of the same coverage. So given 4 WUs the distribution of the gridpoints (based on the numbering used in R3) would be:
WU#1: 1, 5, 9 ...
WU#2: 2, 6, 10 ...
WU#3: 3, 7, 11 ...
WU#4: 4, 8, 12 ...
Bruce proposed to slice the grid in right ascension instead of declination, where a runtime variation is noticeable, too, but below the usual error of 4-5%. We're currently discussing both ideas (and hopefully more) with the people more involved in post-processing the data - in the end they will have to live with the results.
BM
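The interleaved distribution listed above (WU#1 gets points 1, 5, 9 ...) is a plain round-robin stride. A minimal sketch, assuming 1-based point numbers as in the post; the function name and the 12-point example are illustrative only:

```python
def split_round_robin(num_points, num_wus):
    """Assign 1-based grid point numbers to workunits by striding.

    Workunit k receives points k, k + num_wus, k + 2 * num_wus, ...
    so each workunit samples the whole grid at a coarser spacing.
    """
    return {
        wu: list(range(wu, num_points + 1, num_wus))
        for wu in range(1, num_wus + 1)
    }


assignment = split_round_robin(12, 4)
for wu, points in assignment.items():
    print(f"WU#{wu}: {points}")
```

Taken together the workunits still cover every point exactly once, which is the "same coverage" property described in the post.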
Well, the significant variation ( sinusoidal ) has come from the spherical to rectangular geometry/co-ordinate mapping. That's inevitable. 'Square' degree blocks near the equator become 'trapezoidal', even 'triangular', toward the poles.
So is the 'best' path along ~ contours ( like around a mountain side using cross-country skis ), being declination, OR along the ~ fall-line ( ski straight down the slope from the top ), being right ascension? Or not have a 'connected' path at all? Do we want 'efficient', or do we want 'fair'? We certainly don't want to miss a signal ...... hmmmm
Cheers, Mike.
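The shrinking of 'square' degree blocks toward the poles mentioned above is standard spherical geometry. As a worked formula (the symbols are chosen here for illustration: right ascension α, declination δ), a block spanning Δα in right ascension between declinations δ₁ and δ₂ covers the solid angle:

```latex
\Delta\Omega
  = \int_{\delta_1}^{\delta_2} \int_{\alpha}^{\alpha + \Delta\alpha}
      \cos\delta \,\mathrm{d}\alpha \,\mathrm{d}\delta
  = \Delta\alpha \,(\sin\delta_2 - \sin\delta_1)
  \approx \cos\delta \;\Delta\alpha \,\Delta\delta
```

So a block of fixed Δα and Δδ has an area proportional to cos δ: largest at the equator and vanishing at the poles. Whether this alone accounts for the observed runtime variation is not something the thread states; it only shows where a sinusoidal dependence on declination can come from.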
Yes, there will still be a variation in calculation time between individual sky locations. But the idea is to distribute the sky positions over the workunits such that the sum of the variations is more or less constant over the workunits. Both proposals would achieve this with regular schemes, i.e. without too much "intelligence" necessary in the Workunit Generator, just by re-ordering the points in the skygrid files.
BM
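To illustrate why re-ordering can even out the per-workunit totals, here is a toy comparison. Everything in it is an assumption made for the sketch: the cost model (a smooth sinusoidal bump of 30% along the point ordering), the point count, and the helper names. It compares contiguous slices against a strided split.

```python
import math


def wu_totals(costs, num_wus, strided):
    """Total cost per workunit for a strided or a contiguous split."""
    if strided:
        groups = [costs[k::num_wus] for k in range(num_wus)]
    else:
        size = len(costs) // num_wus
        groups = [costs[k * size:(k + 1) * size] for k in range(num_wus)]
    return [sum(g) for g in groups]


def relative_spread(totals):
    """(max - min) divided by the mean of the per-workunit totals."""
    return (max(totals) - min(totals)) / (sum(totals) / len(totals))


# Toy cost model: per-point cost varies smoothly along the ordering.
n = 400
costs = [1.0 + 0.3 * math.sin(math.pi * i / (n - 1)) for i in range(n)]

contiguous_spread = relative_spread(wu_totals(costs, 4, strided=False))
strided_spread = relative_spread(wu_totals(costs, 4, strided=True))
print(f"contiguous: {contiguous_spread:.4f}  strided: {strided_spread:.6f}")
```

In this toy model the strided split gives nearly identical totals, while contiguous slices differ noticeably; how close a real grid comes to that depends on the actual cost per sky position.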
Doh, of course! Brilliant ... shuffle the ordering in the sky grid files. Just don't forget the permutation, 'cos you have to gather that back up after the return of results .... :-)
So some selected, and pretty well fixed, permutation scheme(s) among the various work units. Any other levers/parameters in that?
Cheers, Mike.
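The point about not forgetting the permutation can be made concrete: if the shuffle is stored as an index list, its inverse puts results back into the original grid order. A small sketch with a made-up six-point grid and a hand-picked permutation (both purely illustrative):

```python
def invert_permutation(perm):
    """Return inv such that inv[perm[i]] == i for every position i."""
    inverse = [0] * len(perm)
    for new_pos, old_pos in enumerate(perm):
        inverse[old_pos] = new_pos
    return inverse


points = ["p0", "p1", "p2", "p3", "p4", "p5"]
perm = [0, 3, 1, 4, 2, 5]  # shuffled[i] is points[perm[i]]

shuffled = [points[i] for i in perm]
inverse = invert_permutation(perm)

# Gather back: original position i now sits at shuffled[inverse[i]].
restored = [shuffled[inverse[i]] for i in range(len(points))]
print(shuffled)
print(restored)
```

A fixed, regular scheme such as a stride needs no stored table at all, since the inverse can be computed from the point number.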
Both solutions would yield nice visualizations :-). BTW, the runtime variation over right ascension was responsible for the "wiggles" in the runtime graphs, so it's clearly visible. I'm not sure whether the S5R4 data might be different in this respect, as it might stretch over a longer observation time => even less variation over RA? The all-sky approach would still probably be more uniform in runtime.
Yes, that was my intention.
But the problem is that you actually change the grid by making it much coarser, which in combination with limiting the number of candidates sent back ('toplist') changes the statistics of the results. It's definitely the post-processing and the final analysis of the results that will drive the decision here.
BM
And that way you will increase the amount of post-processing work, which could be avoided if the split model stayed the same as in S5R3. You would have to compile the result grid first, and only then would you be able to calculate any statistics, even in a single frequency band.
So I think Mike's proposal of pairs would not be so hard to implement, weighed against the complexity and time needed for the post-processing.
And one more thing I have to ask here. Is it possible to reuse the grid files our clients already have in the new run? This would significantly decrease the traffic, and my costs as well (in several places I have a satellite provider whose plan is not yet unlimited).
Well, the skygrid files make up only (roughly) 10% of the disk space necessary to process a single Workunit, and they have a longer "lifetime" (are reused more frequently) than the other big files that are transferred. So the overall share of the skygrid files in the data traffic must be even less than 10%.
CU
Bikeman
... Is it possible to use grid files our clients already have ...
Well, the skygrid files make up only (roughly) 10% ....
Yes, the savings are small but I feel compelled to cache the skygrids and then roll them out to all hosts so that a new host (and also an existing one when changing frequency significantly) can at least save something.
Also, when you have as many hosts as I have, there is waste when several hosts in my farm are working on similar frequencies and probably have independently downloaded some of the same large data files. Not to mention the waste when the scheduler seems hell bent on shifting incessantly to adjacent data bands and downloading further large files at the same time - and marking files for deletion at some high sequence number instead of allowing the host to keep getting tasks for the same data at lower sequence numbers.
It would be very nice to have some sort of BOINC preference that allowed any request for data to check a local cache first just in case the data was locally available. Of course there would need to be a companion setting to tell the client to also cache what it was downloading so that the local repository could be filled progressively and automatically. This may need to be done by someone in the E@H camp since I'm not sure other projects package their tasks the same way E@H does and so it might not be appropriate for BOINC as a whole.
It would also be nice to have the whole data set available on DVD so that people like me could avoid monthly excess data charges that my ISP is now hitting me with. Probably time for me to change ISP to someone who has more sensible data limits. My ISP must be relying on inertia to keep its customers as it's really not competitive anymore with its plans and charges.
Cheers,
Gary.
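The "check a local cache first, and fill it while downloading" wish above amounts to a read-through cache. The sketch below is not a BOINC feature and uses no BOINC API; `fetch_with_cache`, the file name, and the fake downloader are all hypothetical, just to show the control flow.

```python
import tempfile
from pathlib import Path


def fetch_with_cache(name, cache_dir, download):
    """Return (data, was_cached) for a file, filling the cache on a miss.

    `download` is any callable taking a file name and returning bytes;
    it stands in for whatever would really fetch data from a server.
    """
    cache_dir = Path(cache_dir)
    cached = cache_dir / name
    if cached.exists():
        return cached.read_bytes(), True  # served locally, no transfer
    data = download(name)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)  # keep a copy for later requests
    return data, False


calls = []


def fake_download(name):
    """Pretend download that records how often it was invoked."""
    calls.append(name)
    return b"grid data for " + name.encode()


with tempfile.TemporaryDirectory() as tmp:
    first, hit1 = fetch_with_cache("skygrid_example.dat", tmp, fake_download)
    second, hit2 = fetch_with_cache("skygrid_example.dat", tmp, fake_download)

print(hit1, hit2, len(calls))
```

On a farm of hosts the cache directory would have to be a shared location, and stale files would need some expiry rule; neither is addressed here.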