Thanks for the lesson on PATH and ./ Gary. I had the PATH to my bin directory already set, but no bin directory, so I made that. The problem now is that when I execute $ ohgodatool on the compiled file, I get "Permission denied". When I do $ sudo ohgodatool , I get, "sudo: ohgodatool: command not found", but sudo works for me for other things. (Same results also when I execute the commands with ./ from within the original folder on my desktop.)
Permissions for the ohgodatool binary are -rw-rw-r--, which seems okay, right?
The master directory of the ohgodatool C file also had two other files, ohgodatool-args.c and ohgodatool-utils.c, which I compiled separately. I don't know whether I am supposed to pack those into one compiled file somehow? There is no documentation for this !@#$%^ program that the cryptominers seem to use with aplomb.
Ideas are not fixed, nor should they be; we live in model-dependent reality.
I was reading through the documentation for a similar program on GitHub, amdcovc, that says "NOTE for AMD Crimson/Catalyst drivers: If no X11 server is running, then this program requires root privileges." So maybe that ohgodatool program needs to be run from root? But, because amdcovc has actual documentation for installing and running it and is much newer than THAT OTHER program, I might give it a go instead of trying to decipher THAT OTHER program.
In either case, how does one run a program from root?
cecht wrote: In either case, how does one run a program from root?
root has a couple of meanings :
(a) the user root.
On the command line, i.e. when interacting via a terminal screen: prefix any command with sudo and then a space. You will be asked for a password (on Ubuntu and its variants that is your own user password, not a separate root password) and then, if you entered that correctly, whatever command follows will be executed with all privileges. I think of sudo as 'super-user do'.
If you want to be superuser for a while, e.g. the remainder of a terminal session: then just type su and hit return; you will be asked for the root password and, if correct, you will be super-user/root. (Ubuntu and its variants ship with no root password set, so there use sudo -i instead.) All commands that you now use will be treated as issued by root.
(b) the directory root.
Prefix the path with / eg. alpha in the root directory is /alpha
Linux is by default such that you have to be root to run a program in root's directory spaces anyway.
Now from context : "NOTE for AMD Crimson/Catalyst drivers: If no X11 server is running, then this program requires root privileges." implies you have to be super-user/root to get it to work right. But
"Permissions for the ohgodatool binary are -rw-rw-r--, which seems okay, right?"
is not going to run for anybody as the execute bits are not set. You want ( for example, for the file's owner alone to execute ) -rwxrw-r-- instead. Use chmod for this, e.g.
sudo chmod 764 ohgodatool
chmod sets the bits as if they were a group of three octal digits, one each for the owner, the group and everyone else. So 7 = 111 yielding rwx, 6 = 110 gives rw-, 4 = 100 gives r-- where a '-' means binary zero.
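To see that mapping in action without touching the real binary, here is a small sketch that applies the same mode to a throwaway file and prints what a long listing would show. It assumes GNU stat (the one Ubuntu ships), and demo.bin is just a made-up scratch file name.

```shell
#!/bin/sh
# Scratch directory so nothing real is touched (demo.bin is a hypothetical name).
workdir=$(mktemp -d)
touch "$workdir/demo.bin"

# 7 = rwx for the owner, 6 = rw- for the group, 4 = r-- for everyone else.
chmod 764 "$workdir/demo.bin"

# GNU stat: %a prints the octal mode, %A the ls-style permission string.
mode_octal=$(stat -c '%a' "$workdir/demo.bin")
mode_text=$(stat -c '%A' "$workdir/demo.bin")
echo "$mode_octal $mode_text"

rm -r "$workdir"
```

The leading '-' in the printed string is the file-type character, not one of the nine permission bits.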
Cheers, Mike.
( edit ) FWIW some googling seems to indicate that while ohgodatool is capable of setting values in some internal table on the video card, it doesn't mean those settings will be honored. I think you have entered There Be Dragons territory.
I have made this letter longer than usual because I lack the time to make it shorter ...
... and my other CPU is a Ryzen 5950X :-) Blaise Pascal
You can't run something unless the execute permission bit (x) is set, so you need the binary to show -rwxrwxr-x. You can change permissions to exactly that using the command 'chmod 775 <filename>'. Each digit in the group of three is constructed on the basis that r=4 w=2 x=1. As an example 644 gives -rw-r--r--. Ignoring the very first character (it's special: it shows the file type), the three groups of 3 characters that follow set the permissions for the owner, other members of the group the owner belongs to, and everyone else in the world, respectively. If you want to know the full power of chmod try 'man chmod' to read its manual page.
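If the r=4 w=2 x=1 arithmetic is easier to follow as code, here is a rough sketch in plain sh that rebuilds the nine permission characters from a three-digit mode. The function names digit_to_rwx and mode_to_rwx are made up for this illustration; they are not part of chmod or any standard tool.

```shell
#!/bin/sh
# Turn one octal digit (0-7) into its rwx triplet using r=4, w=2, x=1.
digit_to_rwx() {
    d=$1
    out=""
    if [ $((d & 4)) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
    if [ $((d & 2)) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
    if [ $((d & 1)) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    printf '%s' "$out"
}

# Turn a three-digit mode such as 775 into owner/group/world triplets.
mode_to_rwx() {
    mode=$1
    owner=${mode%??}        # first digit
    rest=${mode#?}          # last two digits
    group=${rest%?}         # middle digit
    world=${mode#??}        # last digit
    printf '%s%s%s\n' "$(digit_to_rwx "$owner")" "$(digit_to_rwx "$group")" "$(digit_to_rwx "$world")"
}

mode_to_rwx 775   # rwxrwxr-x
mode_to_rwx 644   # rw-r--r--
```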
However, before you do that, a *strong* word of caution. I just had a look at the github link you gave. There is a README there but it doesn't really tell you anything - except that there are other tools "in the branches" including one that is "paid for in BTC".
There are three C programs and some header files but no 'makefile' - a recipe for properly compiling and linking stuff, for setting permissions and installing to the right place, etc. - so without that, or any information about what each program does and the arguments you need to supply when running them, I would think you are at risk of bricking your GPU if you blindly proceed, hoping for the best.
Can you document exactly what you did to compile and link the source? Did you look through the screen output for any warnings or error messages? A good thing to do now would be to take a long listing of your build directory and save the output in a file in the parent directory so as not to add this file to the contents of the build directory. Just go to the build directory and use the command 'ls -l > ../files.lst'. Then use 'cd ..' to go up to the parent directory and you will find a new file (files.lst) there. If you copy and paste the contents of that file into a message (use code tags to preserve spacing) we might be able to understand what was built.
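One possible way to keep both the compiler's screen output and the listing in one go is sketched below. record_build and build.log are names invented for this example; only files.lst and the 'ls -l > ../files.lst' step come from the paragraph above.

```shell
#!/bin/sh
# Run a build command inside build_dir, keep everything it prints in
# ../build.log, then save a long listing to ../files.lst.
# Both files land in the parent directory so they do not clutter the build.
record_build() {
    build_dir=$1
    shift
    (
        cd "$build_dir" || exit 1
        # 2>&1 folds warnings and errors (stderr) into the same log.
        "$@" > ../build.log 2>&1
        status=$?
        ls -l > ../files.lst
        exit "$status"
    )
}
```

You would call it as record_build followed by the build directory and then whatever compile command you actually used; the function returns that command's exit status, so a failed compile is still visible.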
What you should do is go to where you found out about this tool and ask for information on what exactly it does, what arguments it uses, what risks there are when using it, etc. You need to satisfy yourself that the tool will work for your particular GPU model(s) and what the benefits/risks might really be.
PS: As I was about to post the above, I did a check and can see that you have posted again about some other tool. Since I've already made the effort, I'll leave the above stand and ask for more information (a link) to the next thing you have found :-).
Please note that running things as root is a really, REALLY, ***REALLY*** bad idea unless you are absolutely sure it's absolutely necessary and you know exactly WHY it's necessary. Running random stuff you got off the internet somewhere is a big enough security risk without compounding the risk by running it as root. Because you're running a ubuntu variant, you can run with root privileges by prefacing the command you want to run with 'sudo'. Try 'man sudo'.
You need to understand the comment about Catalyst drivers and "no X11 server running". Firstly, if you have a graphical desktop environment running you ARE running X11. Secondly, Catalyst drivers (aka fglrx) are the old proprietary (and now deprecated since about mid-2016) drivers for older AMD GPUs and NOT your current Polaris GPUs.
Cheers,
Gary.
Craig,
I should have posted the first part of my earlier reply immediately and then created the PS in a further post. It took a lot longer than I had intended to do the PS since I also started to try to find a good basic introduction to the Linux shell (bash). I didn't notice that Mike had also replied when I eventually posted my combined answer to both your posts.
If you are keen to do more than just launch BOINC and basically let things take care of themselves, you will need to master the command line. My first experience was when I used the Bourne shell back in the late 1970s/1980s in a University environment so it was very easy for me after I retired and decided to investigate setting up a crunching farm using Linux. The bash shell is just the Bourne shell on steroids so, for me, the hard part was choosing and configuring the desktop rather than coping with the command line :-).
I found a recommendation for this introduction to the shell and it looks very suitable for the purpose. I browsed some of the stuff on the website and it seems very clear and understandable. I haven't gone any further than the website itself but the really interesting thing is that there is a link to a free downloadable version of a 555 page book in pdf format (by the same author) which supposedly covers the whole topic in much greater detail than what is on the website. When I get some time, I might download it and see how useful it is.
Cheers,
Gary.
Mike Hewson wrote: root has a couple of meanings ...
Thanks for the explanations! Cool stuff.
Thank you also Gary for the explanations and guidance and the link to that command line book. I'm looking forward to digging into it.
The amdcovc program https://github.com/matszpk/amdcovc does have a makefile, but when I tried it, it threw an error when it couldn't find an #include file. However...
For the moment I'm not going to pursue that or installing ohgodatool until I do more research on Linux AMD utilities (and bone up on Linux basics) - I don't want to run into any of Mike's Dragons or try out Gary's ***REALLY*** bad ideas. In the meanwhile, I think I'll put the Win7 HDD back in that host just so I can get both GPUs running cooler with power limiting and underclocking and hope I eventually learn to do similar GPU tweaks in the Ubuntu system.
Okay, after sleeping on it, processing what y'all told me, and doing some reading on C compiling, I got this to work from within the ohgodatool-master directory on my desktop:
$ cc ohgodatool.c ohgodatool-args.c ohgodatool-utils.c -o ohgodatool -lm
(It didn't work without the -lm option, which links in the C math library.)
then *tada*:
$ ./ohgodatool
OhGodATool v1.2.1
Usage: ./ohgodatool [-i GPUIdx | -f VBIOSFile] [Generic Options] [--core-state StateIdx] [--mem-state StateIdx] [--volt-state StateIdx] [State modification options]
Generic modification options:
--set-fanspeed <percent>
--set-tdp <W>
--set-tdc <W>
--set-max-power <W>
--set-max-core-clock <Mhz>
--set-max-mem-clock <Mhz>
State selection options (must be used before state modification options):
--core-state <index>
--mem-state <index>
--volt-state <index>
State modification options:
--mem-clock <Mhz>
--core-clock <Mhz>
--mem-vddc-idx <index>
--core-vddc-idx <index>
--mvdd <mV>
--vddci <mV>
--core-vddc-off <mV>
--vddc-gfx-off <mV>
--vddc-table-set <mV>
Display options (shows the selected states, or if none selected, all states):
--show-mem
--show-core
--show-voltage
--show-fanspeed
--show-temp
---------------------
and...
$ sudo ./ohgodatool -i 0 --show-core
DPM state 0:
VDDC: 750 (voltage table entry 0)
VDDC offset: 0
Core clock: 300
DPM state 1:
VDDC: 65282 (voltage table entry 1)
VDDC offset: -26
Core clock: 588
DPM state 2:
VDDC: 65283 (voltage table entry 2)
VDDC offset: -26
Core clock: 952
DPM state 3:
VDDC: 65284 (voltage table entry 3)
VDDC offset: -26
Core clock: 1076
DPM state 4:
VDDC: 65285 (voltage table entry 4)
VDDC offset: -26
Core clock: 1143
DPM state 5:
VDDC: 65286 (voltage table entry 5)
VDDC offset: -26
Core clock: 1208
DPM state 6:
VDDC: 65287 (voltage table entry 6)
VDDC offset: -26
Core clock: 1250
DPM state 7:
VDDC: 65288 (voltage table entry 7)
VDDC offset: 0
Core clock: 1286
-------------------------------
With lance in hand I will now slay some Dragons. (If you don't hear back from me, then, well, you'll know how it ended.)
Quest completed. With nary a scratch, the Dragons were slain! No permissions had to be changed. And I didn't use ohgodatool.
There was one on-line discussion that put me on the right track, perhaps the one Gary referred to earlier: https://www.reddit.com/r/gpumining/comments/903lqt/underclock_undervolt_with_ohgodatool/, where I saw a comment that the way to get ohgodatool to work was to invoke another program called ROC-smi.
That program is from the Radeon Open Compute group on GitHub https://github.com/RadeonOpenCompute/ROC-smi.
All I really wanted to do, for the time being, was mimic what I had done under Windows 7 with AMD's WattMan utility, which was underclock (cap the performance states of) my two AMD GPUs (RX570 & RX460). I knew from simple tweaks in WattMan that I could get cooler running GPUs without much of a performance hit for running E@H tasks.
After reading the ROC-smi documentation, it appeared that capping the GPU clock speed to a particular state was pretty straightforward, without the need for ohgodatool, so, trusting the AMD ROC folks, I downloaded it and unpacked it on my desktop. No need to compile anything; the Python program just does its thing as is.
From within that folder, following the README.md documentation, I got a read-out of both GPU settings with:
~/Desktop/ROC-smi-master$ ./rocm-smi -a
, which, among lots of other metrics, included clock speed tables that I recognized from the WattMan utility. Here is the table for the RX570 while it was running E@H:
GPU[0] : Supported GPU clock frequencies on GPU0
GPU[0] : 0: 300Mhz
GPU[0] : 1: 588Mhz
GPU[0] : 2: 952Mhz
GPU[0] : 3: 1076Mhz
GPU[0] : 4: 1143Mhz
GPU[0] : 5: 1208Mhz
GPU[0] : 6: 1250Mhz
GPU[0] : 7: 1286Mhz *
indicating that it was currently running at top speed, state 7.
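For anyone who would rather not eyeball the table, a minimal sketch for picking out the starred row with awk follows. It assumes the line layout shown in the table above (state index, colon, frequency, then a '*' on the active row); active_state is a made-up helper name, not a rocm-smi feature.

```shell
#!/bin/sh
# Print the index of the clock state marked with '*' in a rocm-smi style table.
# Input lines look like:  GPU[0] : 7: 1286Mhz *
active_state() {
    awk '$NF == "*" {
        idx = $(NF - 2)        # third field from the end, e.g. "7:"
        sub(/:$/, "", idx)     # drop the trailing colon
        print idx
    }'
}
```

Piping the table text into active_state prints just the state number, which is handy if you want to log it over time.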
So to cap the clock speed at what I know worked under Windows, I just:
~/Desktop/ROC-smi-master$ ./rocm-smi -d 0 --setsclk 0 1 2 3 4 5
which essentially told the RX570 (device 0) to not use any performance state above 5.
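The list of states after --setsclk is simply every index from 0 up to the cap. As a sketch, the helper below (cap_sclk_command, a name invented here) builds that command line for a given device and top state and only prints it, so it can be checked before anything is sent to the card. The -d and --setsclk options are the ones used in the command above.

```shell
#!/bin/sh
# Print the rocm-smi command that caps a device at a given top state.
# Nothing is executed; the command is only echoed for inspection.
cap_sclk_command() {
    device=$1
    max_state=$2
    levels=""
    i=0
    while [ "$i" -le "$max_state" ]; do
        levels="$levels $i"      # builds " 0 1 2 ... max_state"
        i=$((i + 1))
    done
    echo "./rocm-smi -d $device --setsclk$levels"
}

cap_sclk_command 0 5
```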
Prior to this modification, while running E@H with the default GPU settings (max clock speed), I knew what the RX570 had been pulling:
$ sudo cat /sys/kernel/debug/dri/0/amdgpu_pm_info | grep GPU
124.88 W (average GPU)
GPU Temperature: 81 C
GPU Load: 90 %
and now, after the ROC-smi modification:
103.247 W (average GPU)
GPU Temperature: 76 C
GPU Load: 100 %
..shaving off 20W!
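To put a number on it, here is a rough sketch that pulls the wattage out of the two readings above and works out the difference. The sample text is copied from this post; average_watts and power_saving are hypothetical helper names, and the awk match assumes the "(average GPU)" wording shown above.

```shell
#!/bin/sh
# Pull the wattage out of a line such as "124.88 W (average GPU)".
average_watts() {
    awk '/average GPU/ { print $1 }'
}

# Print the drop in watts and as a percentage of the "before" figure.
power_saving() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        saved = before - after
        printf "%.1f W saved (%.1f%%)\n", saved, 100 * saved / before
    }'
}

before=$(printf '124.88 W (average GPU)\nGPU Temperature: 81 C\nGPU Load: 90 %%\n' | average_watts)
after=$(printf '103.247 W (average GPU)\nGPU Temperature: 76 C\nGPU Load: 100 %%\n' | average_watts)
power_saving "$before" "$after"
```

With the figures above this prints "21.6 W saved (17.3%)".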
I did the same sort of thing for the RX460. Tomorrow I'll look at the E@H task run times and compare them to pre-mod times and compare power use of the host.
I may sometime play around with fan speeds and voltage levels to eke out further efficiency gains, but for now I'm pleased as punch. I'll have to use ROC-smi to reset clock speeds after each system reboot, but I had to do that anyway with AMD's WattMan under Windows.
I've learned a lot, thanks for everyone's help. Glad I made the switch to Linux/Lubuntu.
(Full disclosure: before issuing any rocm-smi code, I clicked on the test-rocm-smi.sh file that came with the package. That launched a series of tests, one of which put some case fan or GPU fan into a tizzy. After the shell script completed the fan was still going all out, so I rebooted the system. Everything was fine after that.)
update: hang on, there is an issue with the rx570 (device 0) reverting to default clock settings (using all 8 available states, 0-7, max speed 1286 MHz) within 40-50 minutes of underclocking it (limiting it to states 0-5, max speed 1208 MHz). For the rx460 (device 1), the modified clock settings (limiting to first 4 states, max speed 1102 MHz) are stable. Trying to sort that out...
cecht wrote: Quest completed. With nary a scratch, the Dragons were slain! No permissions had to be changed. And I didn't use ohgodatool.
I am in complete shock and awe as to how you managed to slay the beast and come out of that hellish inferno completely unscathed :-). I know, ... you must have had some guru knight do it for you, ... so time to 'fess up now ;-).
Seriously, that's one helluva completed 'quest' for a supposed Linux noob. An extremely impressive outcome!
One of the things I really like about AMD is the way they are embracing open source development. They had to do something to catch up and become competitive with NVIDIA and although it's taken a while, it's really starting to pay off now. The Radeon Open Compute (ROC) initiative should lead to good things in the future as it matures.
With regard to keeping your temperatures in check, there are two factors to consider. As you've already found out, frequency is one, but core voltage may be an even bigger one. From what I've read, you may get a bigger benefit by a modest undervolting, perhaps in combination with a smaller frequency reduction. I don't have hands-on experience. Whilst I do read what people do, I have far too many disparate GPUs and disparate hosts in which they are installed to cope with the effort of trying to fine tune them all. I decided a long time ago to just use stock conditions and forced ventilation to keep them cool enough to survive. They're doing quite OK so far.
With regard to needing to re-apply your settings with each reboot, you can just put the commands in a script which can be invoked automatically as part of the reboot process. Also, as Keith mentioned somewhere recently, if you want added convenience in tweaking all sorts of stuff to do with BOINC, you could bypass the way your distro sets up everything in a special location with separate user and group of boinc:boinc and just install under your home directory with your standard user owning everything. You then don't have to worry about being careful to preserve the separate boinc ownership and permissions regime. I really know nothing about Ubuntu and how things are organised there but I'm sure it would be fairly simple to make the transition - particularly for a flourishing Linux Pro like yourself :-).
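As a very rough sketch of what such a script might look like: the one below re-applies a clock cap per device and, to be safe, only prints the commands unless APPLY=1 is set. The rocm-smi path and the -d / --setsclk options are taken from the earlier posts; the variable names, the "device:top-state" format and the 1:3 entry for the RX460 (reading "first 4 states" as states 0-3) are assumptions of mine. How you hook it into the boot process depends on your distro, and it may need root privileges on some setups.

```shell
#!/bin/sh
# Re-apply clock caps after a reboot. Dry run by default: commands are
# printed, not executed, until APPLY=1 is set in the environment.
ROCM_SMI=${ROCM_SMI:-$HOME/Desktop/ROC-smi-master/rocm-smi}   # path from the thread
CAPS=${CAPS:-"0:5 1:3"}   # device:top-state pairs; 1:3 is an assumption

apply_caps() {
    for pair in $CAPS; do
        device=${pair%%:*}     # text before the colon
        top=${pair##*:}        # text after the colon
        levels=""
        i=0
        while [ "$i" -le "$top" ]; do
            levels="$levels $i"
            i=$((i + 1))
        done
        if [ "${APPLY:-0}" = "1" ]; then
            # $levels is left unquoted on purpose so each state is its own argument.
            "$ROCM_SMI" -d "$device" --setsclk $levels
        else
            echo "$ROCM_SMI -d $device --setsclk$levels"
        fi
    done
}

apply_caps
```

Run it once as is to check that the printed commands look right, then again with APPLY=1 to actually send them.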
Cheers,
Gary.