Allen,
First off, I commend Gary Roberts for his VERY DETAILED explanation of what he thinks is going on with your computer(s), and I feel that you should definitely heed his advice.
With that said... I have another, if different, approach to your problem(s).
I have installed BoincTasks Js 1.33 on my main rig.
You can get it at: https://efmer.com/boinctasks/boinctasks-connect-to-linux-machine/
If you follow the directions precisely, you will eventually get this, or a version that reflects your computers:
I have intentionally set the number of my Universe@Home tasks higher than what the deadline for the tasks calls for so that you can see what I'm talking about.
In Einstein@Home, I am running 4 tasks on two separate GPUs at 2 tasks per GPU. Even though I have nearly 2,000 tasks in the queue (I am running a 3080 & 2070S, which are much faster for tasks than your GPUs), they will get finished before the deadline, as indicated in the columns 'TimeLeft' & 'Deadline', highlighted in blue.
In Universe@Home, I am running a single task on each of my CPU threads on a 3950X, which has 16C/32T. In the same two columns as Einstein, again highlighted in blue, you will see that the 'TimeLeft' column has nearly three times the amount in the 'Deadline' column (by intent).
That means that they will 'time out' and be aborted by the BOINC Manager before they can even be started.
So to cure this problem, I will need to lessen the task numbers to a value much less than what I have right now so that the 'TimeLeft' value is much less than the 'Deadline' value. That can only be done by reducing the number of days that I have in my preferences. As Gary Roberts suggested, maybe drop your 10+10 down to 2+0 and see what happens.
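The 'TimeLeft' versus 'Deadline' comparison can be sketched roughly in Python. This is only an illustration: the function name, the 2-hour task length and the 14-day deadline are hypothetical numbers I picked, not values read from BOINC, and it assumes every task takes the same time.

```python
import math

def queue_finishes_in_time(queued_tasks, hours_per_task, concurrent_tasks, hours_to_deadline):
    """Estimate how long a queue takes and whether it beats the deadline.

    Assumes (hypothetically) that all tasks take the same time and that
    `concurrent_tasks` of them run side by side.
    """
    # Tasks run in batches; the last batch may be only partly full.
    batches = math.ceil(queued_tasks / concurrent_tasks)
    estimated_hours = batches * hours_per_task
    return estimated_hours, estimated_hours <= hours_to_deadline

# Hypothetical example: 32 threads, 2-hour tasks, 14-day deadline.
deadline_hours = 14 * 24
print(queue_finishes_in_time(2000, 2.0, 32, deadline_hours))
print(queue_finishes_in_time(20000, 2.0, 32, deadline_hours))
```

If the second value comes back False, the queue is too deep for the deadline and the work buffer (the "days" in your preferences) needs to come down.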
Also, just another tidbit: if you are using your computer(s) as a daily driver in addition to a BOINC cruncher, you must consider how many CPU cores/threads you have in total versus how many cores/threads you are using for BOINC. I leave at least 2 cores available for the system to operate, maybe three if I do a lot of other computing or have many browser tabs open at the same time. This all depends on what type of CPU you are running and how many cores you have, not to mention how much system memory you have. For reference, I have 32GB of system memory.
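To turn "leave 2 cores free" into the percentage that the BOINC computing preferences ask for ("Use at most N% of the CPUs"), a small Python sketch like this works. The function name is made up for illustration.

```python
def boinc_cpu_percent(total_threads, reserved_threads):
    """Percentage of CPU threads left for BOINC after reserving some for the system."""
    if not 0 <= reserved_threads < total_threads:
        raise ValueError("reserved_threads must be between 0 and total_threads - 1")
    return 100.0 * (total_threads - reserved_threads) / total_threads

# A 16C/32T CPU with 2 threads kept free for daily use.
print(boinc_cpu_percent(32, 2))
```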
One more thing... if you are using the web to make all of your 'preferences', you must realize that the last task preference you made will become the same for all of your computers, unless you are also changing the venue for each computer (in Einstein, there are only 4: Home, Work, School, Generic). If for some reason you want to have different preferences for each computer, then I suggest that you use the local preferences in your BOINC Manager so that each computer will have its own preferences and not be affected by another computer's preferences.
These are just my thoughts on the subject...
Proud member of the Old Farts Association
Gary,
Okay, I think I have digested everything that you told me and have activated it on both my RX560's machine and on my laptop. I believe everything you said, as I just was checking my laptop and found that it had just aborted a bunch of tasks that I wouldn't have known about since by the time I got around to checking, the abortion notice would have been out of sight on the Event Log.
I will be resetting all of my machines shortly. I set to 2 and 0 and have suspended a great many tasks on both machines. The RX 560 machine had 850 tasks waiting.
I do have Boinc in Advanced Mode, so I can't use that as an excuse. I just didn't know any better.
Btw, this 560's machine is running Windows7 and the laptop is running Windows10. I suppose you knew that, but just making sure. The only Linux machine is running the RX580 and is doing fine, however, I found that I can only run 5 tasks at a time or it goes bonkers and starts having errors. At present, none of my machines are running CPU tasks and I am not attached to any other DC projects.
Presently I have 5 tasks on the 560's machine in waiting and in about 30 minutes that number will be one, so I will have to release some more soon.
I surely appreciate all the fantastic information you gave me and don't consider it to be overkill in any way.
Will keep you advised.
Thank you,
Allen
George,
That's a nice program. Unfortunately, I'm running Windows on most of my machines, so I won't be able to use it.
Thanks for the explanation of what's going on and how you managed to solve the problem. It looks very nice.
I'm trying some things and will let you all know how it works out.
Thanks again,
Allen
Boinc Tasks IS for Windows PCs:
https://efmer.com/boinctasks/download-boinctasks/
Thanks Mikey, didn't realize that it was for Windows too.
Nice!
I'm on the website looking at your tasks list and things are looking great! Well done!
The most recently completed task was returned around 8 hours prior to deadline so you are now winning. The current deadline (7+ hours from now) only has ~20 tasks outstanding which will take around 3 hours or so. After that the next deadline is a further ~12hrs in the future and less than 30 tasks for that one. 30 tasks should take less than 5 hours so everything is continuing to improve.
Here is a specific question. Did you originally set x3 running with an app_config file or did you use the GPU utilization factor on the website? I'm guessing it must be the latter because your current crunch times are ~55 mins which you reported in a previous message when you were running x3. The reason why nothing has changed must be that you are still running x3. Please double check BOINC Manager and confirm that a total of 6 are running at any one time.
There is a good reason for this. If you use the website GPU utilization factor, any change you make ONLY gets transmitted to your client when new work is sent. The very last thing you need at the moment is more new work so, if 6 tasks indeed are running, just leave things as they are. Everything seems to be just peachy! :-)
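For anyone who would rather use an app_config file than the website factor, here is a rough Python sketch that writes one out. The layout follows the documented BOINC app_config.xml format, but the app name "hsgamma_FGRPB1G" and the cpu_usage value are my assumptions, so check the real app name in your client_state.xml before using anything like this.

```python
def make_app_config(app_name, tasks_per_gpu, cpu_per_task=1.0):
    """Build app_config.xml text that runs `tasks_per_gpu` tasks on each GPU."""
    # BOINC starts another task on a GPU while the gpu_usage values still sum to <= 1.
    gpu_usage = 1.0 / tasks_per_gpu
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{app_name}</name>\n"
        "    <gpu_versions>\n"
        f"      <gpu_usage>{gpu_usage:.2f}</gpu_usage>\n"
        f"      <cpu_usage>{cpu_per_task:.2f}</cpu_usage>\n"
        "    </gpu_versions>\n"
        "  </app>\n"
        "</app_config>\n"
    )

# Assumed app name; x2 means two tasks per GPU.
print(make_app_config("hsgamma_FGRPB1G", 2))
```

The file goes in the project's folder inside the BOINC data directory, and as far as I know the client picks it up after "Read config files" in BOINC Manager or a client restart, without needing new work to be sent.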
With nice stable run times you should be able to resume quite a few more tasks. It should be quite safe to resume a further 12-15 hours of work, i.e. around 100 tasks. These will have deadlines further into the future and shouldn't disturb BOINC. If there is any sign of a BOINC panic (running tasks stopping and new tasks starting) just suspend what you resumed and try a smaller number. I haven't dealt with a BOINC panic in quite a long time so my memory might be faulty :-).
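The estimates above are simple throughput arithmetic, which can be sketched in Python like this. It assumes 6 tasks running at once at about 55 minutes each (the x3 on two GPUs case) and a steady flow of tasks; the function name is just for illustration.

```python
def hours_to_clear(tasks, concurrent, minutes_per_task):
    """Rough hours needed to finish `tasks` when `concurrent` of them run at a time."""
    return tasks * minutes_per_task / concurrent / 60.0

# Assumed: 6 tasks at once, ~55 minutes each.
for n in (20, 30, 100):
    print(n, "tasks:", round(hours_to_clear(n, 6, 55), 1), "hours")
```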
It's all OK, don't stress about it. You're winning now!
OK. I can see the OS for each machine on the website. I just looked at the Linux machine - tasks taking 40 mins so 8 mins per task at x5. I saw a change from 40m to 32m at one point. That would appear to be a change from x5 to x4.
There is no benefit running at x5 or x4, just an increased risk of worse times and of things going wrong due to added stress. Did you actually test for improvement over x2 performance? I run lots of RX 570 GPUs (slightly worse performance than RX 580s) and I get slightly under 8 mins at x2. The testing I did some years ago showed that x2 was the best.
Edit: There is another reason for not using x5. FGRPB1G tasks use close enough to 1GB each that a GPU with 4GB risks running out of memory. There might be less of a risk if you're not using a full graphical desktop, but it would be better to lessen the risk by running a lower number.
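Both points can be checked with a few lines of Python. This is only a sketch: the function names are made up, and the 1GB per task figure is the rough estimate mentioned above, not a measured value.

```python
def effective_minutes_per_task(elapsed_minutes, concurrent):
    """Elapsed time of one batch divided by the number of tasks run side by side."""
    return elapsed_minutes / concurrent

def fits_in_vram(concurrent, gb_per_task, vram_gb):
    """True if the tasks' combined memory fits on the card (ignores desktop use)."""
    return concurrent * gb_per_task <= vram_gb

# x5 at 40 minutes and x4 at 32 minutes give the same per-task rate.
print(effective_minutes_per_task(40, 5), effective_minutes_per_task(32, 4))

# Roughly 1GB per task on a 4GB card.
print(fits_in_vram(5, 1.0, 4.0), fits_in_vram(2, 1.0, 4.0))
```

If the effective minutes per task do not drop when you add another concurrent task, the extra task buys nothing and only adds memory pressure.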
Cheers,
Gary.
Hi Allen,
I have another reason for running Linux Ubuntu which makes it quite simple to see what your GPUs are doing.
I'm not sure if you can do these on a Windows machine, but on Linux Ubuntu they work just fine.
I have two different commands to use in the Terminal window, essentially they are the same but one gives more information than the other, and vice-versa. These are 'live' windows so they are constantly changing a bit, but the pictures I'm showing are 'static'.
If you type in the Terminal window "watch -n 1.0 nvidia-smi" (without quotes) you'll get this:
If you type in the Terminal window "watch -n 1 nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,utilization.memory,clocks.current.sm,power.draw,memory.used,pcie.link.gen.current,pcie.link.width.current --format=csv" (without quotes) you'll get this:
If you'll notice, the two GPUs are listed by name in the second command. But in the first command you only see the GPUs listed as "Bus - Id" which are "A" & "B" because the name is too long to fit (i.e. - NVIDIA GeForce EVGA RTX 3080). Also, you can see how fast the fans are running on each GPU as well as the temperatures in the first one, but not in the second.
But in the second Terminal window you will see how much power each GPU draws, as well as the GPU's current clock, memory use, which PCIe generation (Gen 4 or Gen 3) is in use and the PCIe link width. In the first window you'll see the total memory used vs the total available memory on each GPU, but in the second you'll see them listed as a percent (%).
But most importantly, you will see the "GPU-Util" (GPU Utility) use on each GPU in each window. On my computers running at 2X tasks per GPU, I'm using all available GPU resources. The advantage of this is that you can see how much work your GPU is doing. If you only run one task per each GPU, you may not utilize all of the GPU resources. If you are running more than two tasks per GPU, you will definitely be at 100% GPU utilization, but your times will become slower the more tasks you are trying to run. The game is to get as much GPU utilization without losing any performance in times (i.e. - slower times per individual task). You'll just need to work that out for yourself.
The "Processes:" in the first window towards the bottom are just what work is being done by the CPU & GPUs.
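If you want to log these numbers instead of watching them, the CSV output of the second command can be parsed with a short Python sketch. The sample text below is made up to resemble the `--format=csv` layout (it is not a capture from my machine), and the function names and the 90% threshold are my own choices.

```python
import csv
import io

# Hypothetical sample in the style of:
#   nvidia-smi --query-gpu=name,pci.bus_id,utilization.gpu,power.draw --format=csv
SAMPLE = """name, pci.bus_id, utilization.gpu [%], power.draw [W]
NVIDIA GeForce RTX 3080, 00000000:0A:00.0, 100 %, 215.40 W
NVIDIA GeForce RTX 2070 SUPER, 00000000:0B:00.0, 98 %, 150.12 W
"""

def parse_gpu_csv(text):
    """Turn the CSV text into a list of dicts keyed by the header names."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    header = next(reader)
    return [dict(zip(header, row)) for row in reader if row]

def underused(gpus, threshold=90):
    """Names of GPUs whose utilization is below `threshold` percent."""
    names = []
    for gpu in gpus:
        util = int(gpu["utilization.gpu [%]"].rstrip(" %"))
        if util < threshold:
            names.append(gpu["name"])
    return names

gpus = parse_gpu_csv(SAMPLE)
print(underused(gpus))
```

On a live system the text would come from running the nvidia-smi command itself (for example through Python's subprocess module) rather than from the sample string.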
I don't know if you'll ever switch to using Linux, but I thought I'd share a bit of what is beneficial.
Proud member of the Old Farts Association
they are listed as "A" and "B" because that is the actual busid in hex, not just random or because the name field is too long. your system just happens to use 0A and 0B as the hex busids. you can see the bus id listed in your second command which is exactly the same as the normal nvidia-smi command. (ie, 00000000:0A:00.0). the 535 series drivers actually expanded the width of the output so you can see the full device name rather than cutting it off with (...).
GPU-Util = GPU utilization (not utility), how much it is being used in terms of a percentage.
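a quick Python sketch of how the bus id string breaks down, assuming the usual domain:bus:device.function layout shown above. the function name is just for illustration.

```python
def pci_bus_number(bus_id):
    """Return the bus part of a PCI id like '00000000:0A:00.0' as an integer."""
    # Layout: domain:bus:device.function, with each field written in hex.
    domain, bus, device_function = bus_id.split(":")
    return int(bus, 16)

for bus_id in ("00000000:0A:00.0", "00000000:0B:00.0"):
    print(bus_id, "-> bus", pci_bus_number(bus_id))
```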
_________________________________________________________________________
Gary,
Everything is running well, except I cut the running tasks a bit too short on my RX580 and it stopped running. I had to hurry and release some of the tasks. Running five tasks, they run out pretty fast.
I am thinking of cutting tasks to 4 and seeing what happens. I think in the beginning I did check the run times of different numbers of tasks running at once, but I could easily be mistaken. When Milkyway shut down, I was moving pretty fast to E@H and may have taken some shortcuts that are biting me in the butt now.
Other than some very minor tweaks, I am leaving everything alone for now.
The two 560's machine is in fact running 4 total tasks at this time. I am using local prefs and I do have an app_config file for it. Times are still in the 50's.
Thanks, Allen
George,
Does this work with the ATI cards too? I only have Linux on one machine and it has an RX580 in it.
Nice bunch of info. Btw, I am reducing the number of tasks on this machine to see what happens with the timing, thanks!
Allen