einsteinbinary_: page allocation failure

Cat22
Joined: 13 May 21
Posts: 28
Credit: 921320068
RAC: 1528774

I am using the .run file, and my gcc is 14.2.0.

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18727012524
RAC: 6648035

Did you read through that openSUSE thread I linked?

It seems that changing from Wayland to X.org is one possible fix.

The other is removal of the fbdev=1 from the kernel command line or the /usr/lib/modprobe.d/50-nvidia-default.conf file.
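
Before removing anything, it can help to see where fbdev=1 is actually set. The following is only a rough sketch in Python, not an official tool: the function names `find_fbdev_settings` and `report` are made up for illustration, and it assumes the option shows up either as a bare `fbdev=1` token or with a module prefix such as `nvidia-drm.fbdev=1`. The two paths are the ones mentioned above; whether your system uses them is an assumption.

```python
import re
from pathlib import Path

# Assumption: the setting appears as "fbdev=1", either standalone
# (modprobe.d "options" line) or with a module prefix and a dot
# (kernel command line, e.g. "nvidia-drm.fbdev=1").
FBDEV_ON = re.compile(r"(?:^|[\s.])fbdev=1(?=\s|$)")


def find_fbdev_settings(text):
    """Return the non-comment lines of `text` that enable fbdev=1."""
    hits = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and commented-out settings.
        if not stripped or stripped.startswith("#"):
            continue
        if FBDEV_ON.search(stripped):
            hits.append(stripped)
    return hits


def report(paths=("/proc/cmdline",
                  "/usr/lib/modprobe.d/50-nvidia-default.conf")):
    """Print, for each file, the lines that enable fbdev=1 (read-only)."""
    for path in map(Path, paths):
        try:
            hits = find_fbdev_settings(path.read_text())
        except OSError as exc:
            print(f"{path}: not readable ({exc})")
            continue
        if hits:
            for hit in hits:
                print(f"{path}: {hit}")
        else:
            print(f"{path}: no fbdev=1 found")
```

Calling `report()` only reads the two files and prints what it finds; it changes nothing. Editing the kernel command line or the modprobe file is still a manual step and depends on how your distro manages them.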

 

Cat22
Joined: 13 May 21
Posts: 28
Credit: 921320068
RAC: 1528774

I have been using X only so far. I eliminated the Einstein stuff for now, and with MilkyWay@home running I have gone over 24 hours and counting with no issue. There may be a problem with the nvidia driver, but for sure there is something the Einstein binary is doing that's causing problems. In my initial post I was hoping an Einstein developer would look at it and maybe glean some info that would help identify the root cause. Rolling back nvidia tells me that there is some new feature in the driver that Einstein interacts with somehow. Unfortunately for me, I can't get the driver to compile when I go back to version 555.xxx. Some functions are different or have been removed in the newer kernels compared to what the nvidia driver expects.

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18727012524
RAC: 6648035

The issue is with the nvidia drivers. Nvidia did a stoopid: they told developers to do one thing, then changed their minds midway, did something completely different, and broke many things that all the devs were expecting to work.

 

Cat22
Joined: 13 May 21
Posts: 28
Credit: 921320068
RAC: 1528774

Any idea what release they are going to fix it in?

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18727012524
RAC: 6648035

I don't know, as the nvidia drivers are maintained and released by each distro independently of Nvidia. The only way to choose which Nvidia driver to use is to go straight to Nvidia, download the .run file for the release you are interested in, and run it.

The releases direct from Nvidia are updated fastest and have the corrected code. But again, the drivers can interact incorrectly with the graphics subsystem installed by the distro, and then you are in a big mess again.

Best is to ask in the help forums for your distro about the solutions they are working on and when they will be deployed.

 

Keith Myers
Joined: 11 Feb 11
Posts: 4964
Credit: 18727012524
RAC: 6648035

Just wanted to point out that I haven't heard of any issues with the Nvidia drivers from any other distros, like the Ubuntu that I am using or the Mint or Arch camps, since they updated to the 550.120 drivers with the fbdev fix.

It seems only Suse users are having the issues. The Suse distro maintainers are slow to incorporate the changes or something.

No blame to be put on Einstein here.

 

Cat22
Joined: 13 May 21
Posts: 28
Credit: 921320068
RAC: 1528774

I ran some CUDA apps that put a load on both GPUs; in fact I got both GPU loads up over 95%. I let that run for over 30 hours and no issue showed up in dmesg. I wonder what it is that Einstein does that kicks off the errors I see? BTW I run Einstein on 2 other PCs that have identical OSes but different hardware: i9-9900ks and different GPUs (RTX 3070 and GTX 1660 Ti on both), instead of my problem child, which is an i9-13900KF with 2 RTX 2060 GPUs.
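
For comparing runs like this, one option is to count the "page allocation failure" lines in dmesg per task, so a 30 hour CUDA run and an Einstein run can be compared by number rather than by eye. This is only a sketch under assumptions: the helper name `count_page_alloc_failures` is made up, and it assumes the kernel's usual `<task>: page allocation failure: ...` line shape with either no timestamp or the default numeric `[seconds]` timestamp (not the `dmesg -T` human-readable form).

```python
import re
from collections import Counter

# Assumed line shape: an optional "[  1234.567890]" timestamp, then the
# task name, then ": page allocation failure".
PAGE_FAIL = re.compile(
    r"^(?:\[\s*[\d.]+\]\s*)?(?P<task>.+?): page allocation failure\b"
)


def count_page_alloc_failures(log_text):
    """Count page allocation failure lines in `log_text`, keyed by task name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PAGE_FAIL.match(line)
        if match:
            counts[match.group("task")] += 1
    return counts
```

To use it, save the kernel log to a file (for example by redirecting the output of `dmesg`), read the file into a string, and pass it in. If the task named in the thread title is the only key with a non-zero count, that at least narrows down which process is triggering the failures, though it does not say whether the application or the driver is at fault.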
