On FC41 with the clang18-libs package installed, compilation of ASGW tasks fails because of a missing symbol:
ld.lld: error: undefined symbol: __printf_alloc
Here is an example of a failed compilation job:
All-Sky Gravitational Wave search on O3 v1.07 (GW-opencl-ati-2)
https://einsteinathome.org/task/1688515297
The binary search works fine, however:
Binary Radio Pulsar Search (MeerKAT) v0.17 (BRP7-opencl-ati)
https://einsteinathome.org/task/1683420145
Any suggestions as to what could be wrong here?
How can I debug such issues?
Copyright © 2024 Einstein@Home. All rights reserved.
What GPU? What drivers do you have installed? Your host(s) are hidden on the website, so we have no way to inspect the host details or any further details from the failed tasks that might be helpful.
_________________________________________________________________________
Task 1688515297
...
Failed OpenCL buildlog:
ld.lld: error: undefined symbol: __printf_alloc
>>> referenced by /tmp/comgr-6ce7dc/input/linked.bc.o:(XLALLoopOverCoarseGridFrequencyBins)
>>> referenced by /tmp/comgr-6ce7dc/input/linked.bc.o:(XLALLoopOverCoarseGridFrequencyBins)
Error: Creating the executable from LLVM IRs failed.
XLAL Error - XLALOpenCLGetProgramFromSource (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/LIBC215/TARGET/linux-x86_64/EinsteinAtHome/source/lalsuite/lalpulsar/lib/GPUUtils/OpenCLUtils.c:705): clBuildProgram failed with OpenCL error: CL_BUILD_PROGRAM_FAILURE
XLAL Error - XLALOpenCLGetProgramFromSource (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/LIBC215/TARGET/linux-x86_64/EinsteinAtHome/source/lalsuite/lalpulsar/lib/GPUUtils/OpenCLUtils.c:705): Generic failure
XLAL Error - XLALGCTOpenCLKernelsSetup (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/LIBC215/TARGET/linux-x86_64/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT_OpenCL.c:212): Check failed: XLALOpenCLGetProgramFromSource ( source, &(GCTOpenCLKernels.HierarchSearchGCTProgramm) ) == XLAL_SUCCESS
XLAL Error - XLALGCTOpenCLKernelsSetup (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/LIBC215/TARGET/linux-x86_64/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT_OpenCL.c:212): Internal function call failed: Generic failure
XLAL Error - MAIN (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/LIBC215/TARGET/linux-x86_64/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT.c:1394): Check failed: XLALGCTOpenCLKernelsSetup( uvar->SortToplist, uvar->getMaxFperSeg, uvar->computeBSGL, detectorIDs, usefulParams.BSGLsetup ) == XLAL_SUCCESS
XLAL Error - MAIN (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/LIBC215/TARGET/linux-x86_64/EinsteinAtHome/source/lalsuite/lalapps/src/pulsar/GCT/HierarchSearchGCT.c:1394): Internal function call failed: Generic failure
2024-11-28 13:00:47.2900 (131814) [CRITICAL]: ERROR: MAIN() returned with error '-1'
Code-version: %% LAL: 7.1.4.1 (CLEAN )
%% LALPulsar: 3.1.0.1 (CLEAN )
%% LALApps: 7.3.0.1 (CLEAN )
...
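As a starting point for debugging, the undefined symbols and the kernels that reference them can be pulled out of a build log like the one above. This is only a minimal sketch: the function name `undefined_symbols` is made up for illustration, and the two line patterns are assumed from the ld.lld output quoted in this post, not from any documented log format.

```python
import re

# Pattern assumed from the log above: "ld.lld: error: undefined symbol: NAME"
UNDEFINED = re.compile(r"^ld\.lld: error: undefined symbol: (\S+)")
# Pattern assumed from the log above: ">>> referenced by FILE:(FUNCTION)"
REFERENCED = re.compile(r"^>>> referenced by (\S+):\((.+)\)\s*$")


def undefined_symbols(build_log: str) -> dict[str, list[str]]:
    """Map each undefined symbol to the functions that reference it."""
    result: dict[str, list[str]] = {}
    current = None  # symbol whose ">>> referenced by" lines we are reading
    for raw in build_log.splitlines():
        line = raw.strip()
        m = UNDEFINED.match(line)
        if m:
            current = m.group(1)
            result.setdefault(current, [])
            continue
        m = REFERENCED.match(line)
        if m and current is not None:
            func = m.group(2)
            if func not in result[current]:  # the log may repeat a reference
                result[current].append(func)
        elif not line.startswith(">>>"):
            current = None  # any other line ends the current error block
    return result


SAMPLE = """\
ld.lld: error: undefined symbol: __printf_alloc
>>> referenced by /tmp/comgr-6ce7dc/input/linked.bc.o:(XLALLoopOverCoarseGridFrequencyBins)
>>> referenced by /tmp/comgr-6ce7dc/input/linked.bc.o:(XLALLoopOverCoarseGridFrequencyBins)
Error: Creating the executable from LLVM IRs failed.
"""

if __name__ == "__main__":
    print(undefined_symbols(SAMPLE))
    # prints {'__printf_alloc': ['XLALLoopOverCoarseGridFrequencyBins']}
```

Knowing which kernel pulls in the symbol narrows the search: here it is a single kernel, so the problem is likely in how the device code for that kernel is linked rather than in the whole program.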
name: amdgpu
vermagic: 6.11.8-300.fc41.x86_64 SMP preempt mod_unload
rocm-clinfo
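To make it obvious which machine a driver report comes from, the module information can be tagged with the host name before posting. The sketch below assumes the text comes from running `modinfo amdgpu` on the host in question and that it uses the `key: value` layout shown above; `parse_modinfo` and `host_report` are illustrative names, not part of any tool.

```python
import platform


def parse_modinfo(text: str) -> dict[str, str]:
    """Parse 'key: value' lines, as printed by modinfo, into a dict."""
    info: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            # Some keys can appear more than once; keep the first value only.
            info.setdefault(key.strip(), value.strip())
    return info


def host_report(modinfo_text: str) -> str:
    """Return a one-line summary prefixed with this machine's host name."""
    info = parse_modinfo(modinfo_text)
    return (
        f"host={platform.node()} "
        f"module={info.get('name', '?')} "
        f"vermagic={info.get('vermagic', '?')}"
    )
```

Pasting the saved `modinfo amdgpu` output into `host_report` on each computer gives one comparable line per machine.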
The NVIDIA build works perfectly fine: https://einsteinathome.org/task/1688586317
I asked what drivers are installed on the W5700 system producing the errors. This appears to be the clinfo from your Radeon VII/Nvidia system, which has not completed any tasks.
_________________________________________________________________________
reanimator - just unhide your computers.
That way you don't have to litter up the forum.
cheers
sfv
The reports should be publicly available now.
The amdgpu kernel module is in use. It is part of the signed Linux kernel 6.11.8-300.fc41.x86_64. No extra DKMS driver from the ROCm package is installed.
@Ian&Steve C. What kind of information are you looking for?
My guess is that some LLVM/Clang library is missing, so the ld.lld linker cannot resolve the function __printf_alloc and fails to produce a valid executable.
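One way to test that guess is to check whether any installed device bitcode library even mentions the symbol. The sketch below is an assumption-heavy heuristic: it treats the missing piece as a `.bc` bitcode file, it expects you to supply the directory where your distribution installs such files (no path is assumed here), and a byte match only shows that the name appears in a file, not that the file defines it. The function name `files_naming_symbol` is made up for illustration.

```python
from pathlib import Path


def files_naming_symbol(symbol: str, root: Path, pattern: str = "*.bc") -> list[Path]:
    """Return files under root whose raw bytes contain the symbol name.

    This is a rough text search, not a real symbol-table lookup.
    """
    needle = symbol.encode("ascii")
    hits: list[Path] = []
    for path in sorted(root.rglob(pattern)):
        if path.is_file() and needle in path.read_bytes():
            hits.append(path)
    return hits
```

Calling `files_naming_symbol("__printf_alloc", Path(bitcode_dir))` with an empty result would support the missing-library theory. For a file that does match, `llvm-nm` can list its symbols and show whether the name is defined or only referenced there.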
Again, you’re showing info from the wrong system. I’m not sure how else to get that across. You have two computers, one with a W5700 and one with a Radeon VII. Both times I asked for the drivers installed on the W5700 system, and both times you replied with info from the Radeon VII system.
But the built-in Linux drivers usually don’t work well on Einstein, from what I’ve seen. Try installing the ROCm drivers or the driver installer from AMD’s website for your hardware.
_________________________________________________________________________
Sorry for the confusion. Yes, I do indeed have two systems with the same FC41 OS installed, running the same kernel version. On both systems I observe the same failure with ASGWS tasks. So with respect to the installed driver, the systems are identical.