Looking at the numbers, we see that the ABP2 and S5GC1 caches are draining, while BRP3 and S5GC1HF are still growing. So when the number of tasks to send for BRP3 grows to about 30,000, we'll see the WUG paused for some time. Am I right?
ABP2 and S5GC1 are draining because those research runs are essentially complete, and we're merely chasing down the stragglers.
BRP3 and S5GC1HF are still growing because they are current, active research runs with 'Work still remaining', as it says further down the page.
ABP2 and S5GC1 are being phased out. In particular, there are no workunit generators (WUGs) running that would generate new workunits for these applications. New tasks for these applications will only be generated for the "workunits without canonical result", e.g. because tasks "in progress" get reported as client errors (or because we manually raise the number of tasks to be sent out for these workunits in order to finish these "runs" faster).
Eventually the "Workunits without canonical result" of these applications will reach zero. A week later these will be purged from the database and total tasks and workunits for these applications will show zero, too. Then we usually "deprecate" this application, which means that it doesn't show up on the server status page at all anymore. During the past few weeks this could be observed with "S5GCE".
The variation we see in the numbers of S5GC1HF is actually pretty small; it has reached a more or less steady state by now. When BRP3 goes from experimental to production level and the BRP3 throughput is increased, the numbers for S5GC1HF will decrease noticeably.
For less than 15 minutes the "Tasks to send" count of BRP3 was slightly above 30,000, and the BRP WUGs stopped. However, they are started again every five minutes to see whether the number of unsent tasks has dropped below 20,000, and terminate immediately if it hasn't. The two WUGs shown as running are actually an artifact of the delay caused by the WUGs' database access and the script that checks their status: technically the processes are running, but they won't generate additional workunits.
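The start/stop behaviour described above is a classic hysteresis throttle: stop generating at a high-water mark, resume only once the queue has drained below a lower one. A minimal sketch of one five-minute WUG invocation (the names and the batch size are illustrative, not the actual BOINC daemon code; only the 20,000/30,000 thresholds come from the post above):

```python
# Sketch of the WUG throttle described above: a cron-style loop starts
# the WUG every five minutes; it exits immediately unless the unsent-task
# count has fallen below the low-water mark, then refills the queue.
LOW_WATER = 20_000   # resume generating only below this
HIGH_WATER = 30_000  # fill the queue back up to this level

def wug_pass(unsent_tasks, batch=1_000):
    """One five-minute WUG invocation; returns the new unsent-task count."""
    if unsent_tasks >= LOW_WATER:
        return unsent_tasks          # terminate immediately, generate nothing
    while unsent_tasks < HIGH_WATER:  # generate workunits in batches
        unsent_tasks += batch
    return unsent_tasks
```

The gap between the two thresholds is what keeps the WUGs from flapping on and off every few minutes around a single cutoff.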
(edit) Because it's such a large and complex database, and traditionally a bit persnickety, it's probably not trivial (or wise) to query it for these tallies any closer to the true real-time figures. Each query reduces throughput performance...
There is a great mechanism created especially for such situations. It is called a REGISTER. This wonderful thing is designed to accumulate exactly the kind of numbers we see on the status page. I first saw it in "1C:Accounting", an integrated programmable complex for trade accounting (http://v8.1c.ru/eng/the-system-of-programs/); you know, we have very complicated tax laws and tax accounting here in Russia. Any event (query, insertion or deletion of database records) changes the corresponding register, and a register always holds the current total of whatever it tracks. That gives us current information as soon and as often as we want to receive it.
And it would be wonderful to watch the numbers change in real time, like on a counter. But I think this proposal should be addressed to the BOINC developers.
I think a more powerful mechanism already exists. It's called a READ ONLY TRANSACTION, in Oracle Database terms.
:)
A read-only transaction adds load to the database server. Instead, it would be better to use stored procedures (maybe I'm thinking of MS SQL? I'm not a SQL guru yet, sorry) attached to database events. On each such event, the attached procedure updates the corresponding registers. Then there is no need for a transaction at all, only a single addition or subtraction on the register in question. That is why I insist on using registers.
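The register idea described above can be sketched with database triggers. Here is a minimal, self-contained illustration using SQLite standing in for the project's MySQL database; the table and column names are invented for the example:

```python
import sqlite3

# Sketch of a trigger-maintained "register": every insert or delete on
# the task table adjusts a single counter row, so reading the current
# tally is a one-row lookup instead of a full COUNT(*) over a huge table.
# (SQLite stands in for MySQL here; all names are illustrative.)
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE task (id INTEGER PRIMARY KEY, app TEXT);
    CREATE TABLE register (name TEXT PRIMARY KEY, value INTEGER);
    INSERT INTO register VALUES ('tasks_to_send', 0);

    CREATE TRIGGER task_ins AFTER INSERT ON task BEGIN
        UPDATE register SET value = value + 1 WHERE name = 'tasks_to_send';
    END;
    CREATE TRIGGER task_del AFTER DELETE ON task BEGIN
        UPDATE register SET value = value - 1 WHERE name = 'tasks_to_send';
    END;
""")

con.executemany("INSERT INTO task (app) VALUES (?)", [("BRP3",)] * 5)
con.execute("DELETE FROM task WHERE id = 1")
count = con.execute(
    "SELECT value FROM register WHERE name = 'tasks_to_send'").fetchone()[0]
print(count)  # 4
```

The trade-off is that the counter update happens inside every write, which shifts a small cost onto the hot path in exchange for cheap reads.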
In Oracle, a read-only transaction does not increase the server load if its duration is shorter than the undo_retention parameter (900 seconds by default). But of course, each project database will be in a different situation.
Does MySQL support an equivalent? The BOINC server was designed without a DAL, and is too tightly coupled to MySQL to swap it out for a different DB. That's unfortunate, since some of the larger projects could use the increased scalability of competing products.
MySQL has a SERIALIZABLE mode for transactions, but MySQL is a "blocking" database and this mode would push its performance toward zero (if I understand right).
I think it would be wonderful to add a row called "overall number of tasks" to the "Workunits and tasks" table. Right now we only see the overall number for S5GC1HF, but it would be very interesting to see these numbers for every run we do.
BTW, does anybody there have information (ideally in table form) about all the runs we have completed so far? E.g. the total number of WUs in the run, total number of FLOPS, total time to complete the run (wall-clock time, preferably in days), total number of cores/computers/processors/volunteers involved, total machine time, and average productivity of the run in cobblestones per day or FLOPS?
P.S. Thank you, Jord, for the help with the table ;)
P.P.S. Sadly, the "Total needed" days for S5GC1HF have grown to almost 110. This is because of BRP3 coming out into its production phase, even though it uses only 0.2% of a core. It seems many people reduced their "on multiprocessor systems, use at most ... %" setting to open the road for BRP3. Usually "Total needed" starts decreasing once 40-50% is already complete. :(
P.P.P.S. Yeah! The ABP2 "tasks to send" is zero now. But the last 43 WUs are still in progress. I'll be glad to crunch them as fast as possible if any of them suddenly errors out ;)
Total number of tasks isn't really meaningful for the BRP searches. The LIGO data is heavily culled, so we're only analyzing a small chunk of each science run. In contrast, Arecibo is generating new data every day, and all of its take is being processed, so there will always be more data being fed into the BRP system. I don't know whether the other telescopes whose data is being processed are doing daily collections suitable for a BRP search, or whether it's just a case of existing, completed data sets, collected for something else, that happen to be suitable for analysis.
There was an ABP1/2 progress page that showed the rate of data processed vs. data collected, but it hasn't been updated for BRP yet:
http://einstein-dl.aei.uni-hannover.de/EinsteinAtHome/ABP/
Yes, I understand that the BRP search is somewhat open-ended. But all the other searches do have a certain volume of data split into WUs, so the number of these WUs is known from the beginning of each run. That's why I asked for the row to be added.
Oh! I see the status page content has changed. Now we see the BRP3 status instead of ABP2, and a "Computing" section above it. It looks like BRP3 will last about a year, continually taking time away from S5GC1HF, which reaches the second half of its computation volume today. The "total needed" runtimes of all previous searches went down while they were in progress, but this new one (S5GC1HF) has been going up almost from the beginning. Is that because BRP3 is consuming enough CPU time to make the search longer?
Yes. When we started S5GC1HF there was no radio-pulsar search (ABP2 or BRP3) running (well, to be precise, the last tasks of ABP2 were being shipped).
BRP3 was delayed by almost two months, and we're still experimenting with it. While we ramp up its output, the computation spent on S5GC1HF decreases, and the estimated end of S5GC1HF moves later.
Right now I'm trying to find the limits of the system, i.e. how much BRP3 work we can ship. I'm shifting the scheduling ratio between GW and RP work towards BRP3 about twice a week. I know that we couldn't run the project with more than ~40% ABP2 work. With the new setup and the new workunit generators of BRP3, some limitations have been removed; I'd like to see how far we could get in case we need to.
Then ultimately I'd like to settle for about 50/50 GW/RP search in normal operation.
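The moving estimate described above is simple arithmetic: "Total needed" is roughly the remaining work divided by the throughput currently devoted to that search, so every shift of the scheduling ratio towards BRP3 stretches the S5GC1HF estimate. A toy illustration (all figures invented for the example, not actual project numbers):

```python
# Toy model of the "Total needed" estimate: remaining work divided by
# the share of total project throughput devoted to this one search.
# All figures below are invented for illustration only.
def days_needed(remaining_work, total_throughput, search_share):
    """Estimated days to finish, given this search's throughput share."""
    return remaining_work / (total_throughput * search_share)

TOTAL = 100.0        # project throughput, arbitrary work units per day
REMAINING = 5_000.0  # S5GC1HF work left, same units

before = days_needed(REMAINING, TOTAL, 1.0)  # GW search had all throughput
after = days_needed(REMAINING, TOTAL, 0.5)   # after moving to a 50/50 split
print(before, after)  # 50.0 100.0
```

Halving the share doubles the estimate, which is why "Total needed" can climb even while steady progress is being made.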
Use the [pre ][/pre ] tags in BBCode (hold the space).
So, BRP becomes equally important as the GW search, does it?