I'm getting the following error on a number of GPU work units (WUs). A bunch have also completed successfully, but a run of crashes yesterday made me suspend GPU work. I'm new to Linux, so I was wondering whether I'm missing some special library or something. Any suggestions appreciated. Computer is:
<core_client_version>7.16.17</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)</message>
<stderr_txt>
putenv 'LAL_DEBUG_LEVEL=3'
2022-02-07 13:24:35.5589 (65335) [normal]: This program is published under the GNU General Public License, version 2
2022-02-07 13:24:35.5590 (65335) [normal]: For details see http://einstein.phys.uwm.edu/license.php
2022-02-07 13:24:35.5590 (65335) [normal]: This Einstein@home App was built at: Aug 5 2021 17:20:50
2022-02-07 13:24:35.5590 (65335) [normal]: Start of BOINC application '../../projects/einstein.phys.uwm.edu/einstein_O3AS_1.01_x86_64-pc-linux-gnu__GW-opencl-nvidia'.
[DEBUG} GPU type: 1
[ERROR] Couldn't get OpenCL device from BOINC (-1)!
2022-02-07 13:24:35.5882 (65335) [debug]: Flags: LAL_DEBUG, OPTIMIZE, HS_OPTIMIZATION, GC_SSE2_OPT, X64, SSE, SSE2, GNUC X86 GNUX86
2022-02-07 13:24:35.5882 (65335) [debug]: glibc version/release: 2.34/stable
2022-02-07 13:24:35.588268 - mytime()
2022-02-07 13:24:35.5884 (65335) [debug]: Set up communication with graphics process.
einstein_O3AS_1.01_x86_64-pc-linux-gnu__GW-opencl-nvidia: unrecognized option `--device'
Usage: einstein_O3AS_1.01_x86_64-pc-linux-gnu__GW-opencl-nvidia [-h|--help] [-v|--version] [@<config-file>] [--log] [--semiCohToplist] [--DataFiles1] [--IFOs] [--skyRegion] [--numSkyPartitions] [--partitionIndex] [--skyGridFile] [--dAlpha] [--dDelta] [-f|--Freq] [--dFreq] [-b|--FreqBand] [--f1dot] [--df1dot] [--f1dotBand] [--f2dot] [--df2dot] [--f2dotBand] [--f3dot] [--df3dot] [--f3dotBand] [--peakThrF] [-m|--mismatch1] [--gridType1] [--metricType1] [-g|--gammaRefine] [-G|--gamma2Refine] [-o|--fnameout] [--fnameChkPoint] [-n|--nCand1] [--printCand1] [--refTime] [--ephemEarth] [--ephemSun] [--minStartTime1] [--maxStartTime1] [--printFstat1] [--assumeSqrtSX] [--nStacksMax] [-T|--tStack] [--segmentList] [--recalcToplistStats] [--loudestSegOutput] [--writeLeanerOutput] [--tlCompartments] [--computeBSGL] [--Fstar0sc] [--oLGX] [--getMaxFperSeg] [--SortToplist] [--FstatMethod] [--FstatMethodRecalc] [--injectionSources] [--injectSqrtSX] [--timestampsFiles] [--Tsft] [--useGPUSemiCoh] [--GPUDevice]
2022-02-07 13:24:35.5891 (65335) [CRITICAL]: ERROR: MAIN() returned with error '1'
DEPRECATION WARNING: program has invoked obsolete function XLALGetVersionString(). Please see XLALVCSInfoString() for information about a replacement.
Code-version: %% LAL: 6.21.0.1 (CLEAN 8d0838c264f9ff9adc8c3cdbfa17b5154eaa2994)
%% LALPulsar: 1.18.2.1 (CLEAN 8d0838c264f9ff9adc8c3cdbfa17b5154eaa2994)
%% LALApps: 6.25.1.1 (CLEAN 8d0838c264f9ff9adc8c3cdbfa17b5154eaa2994)
FPU status flags:
2022-02-07 13:24:35.5894 (65335) [debug]: worker done. return(1) to caller
2022-02-07 13:24:35.5894 (65335) [normal]: done. calling boinc_finish(1).
13:24:35 (65335): called boinc_finish
[ERROR] Couldn't get OpenCL device from BOINC (-1)!
einstein_O3AS_1.01_x86_64-pc-linux-gnu__GW-opencl-nvidia: unrecognized option `--device'
Looks like you lost the OpenCL compute portion of the drivers.
A quick check with clinfo will confirm whether it has gone missing. If clinfo isn't installed yet: sudo apt install clinfo
Either reload the Nvidia drivers or reinstall the OpenCL portion: sudo apt install ocl-icd-libopencl1
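For anyone following along, the clinfo check can be sketched as a small script. This is only a sketch under assumptions: the helper name opencl_platform_count is made up for illustration, and the /etc/OpenCL/vendors path is the usual place the ICD loader looks for vendor files on Debian/Ubuntu-style systems, which may differ on other distros.

```shell
#!/bin/sh
# Hypothetical helper: read clinfo output on stdin and print the platform count.
# Prints 0 when no "Number of platforms" line is found at all.
opencl_platform_count() {
    awk 'tolower($0) ~ /number of platforms/ { print $NF; found = 1; exit }
         END { if (!found) print 0 }'
}

# Only run the live check when clinfo is actually installed.
if command -v clinfo >/dev/null 2>&1; then
    count=$(clinfo 2>/dev/null | opencl_platform_count)
    echo "OpenCL platforms: $count"
fi

# The ICD loader discovers vendors through files in this directory
# (assumed location); an NVIDIA install normally provides nvidia.icd here.
ls /etc/OpenCL/vendors/ 2>/dev/null || echo "no /etc/OpenCL/vendors directory"
```

A count of 0, or a missing nvidia.icd, would be consistent with the "Couldn't get OpenCL device from BOINC" error in the log above.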
It is probably due to a recent (unattended) upgrade to the Nvidia drivers (you can check the dpkg or apt log to be sure); a simple reboot should fix it.
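A rough way to do that log check, assuming a Debian/Ubuntu system where dpkg writes to /var/log/dpkg.log (the helper name nvidia_pkg_events is hypothetical, chosen only for this sketch):

```shell
#!/bin/sh
# Hypothetical helper: filter dpkg.log lines on stdin for NVIDIA package changes.
# Those dpkg.log lines look like: DATE TIME ACTION PACKAGE OLD_VERSION NEW_VERSION
nvidia_pkg_events() {
    awk '($3 == "install" || $3 == "upgrade" || $3 == "remove") && $4 ~ /nvidia/'
}

# Show the most recent NVIDIA-related package events, if the log is readable.
log=/var/log/dpkg.log
if [ -r "$log" ]; then
    nvidia_pkg_events < "$log" | tail -n 20
fi
```

If an upgrade of an nvidia package shows up shortly before the first failed task, that points at the driver update. /var/log/apt/history.log usually records the same events grouped by apt run.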
Thanks! And yeah, I saw that in the error and figured it had something to do with it, just wasn't sure how. Unfortunately, you were correct about clinfo. Number of Platforms = 0. Also unfortunately, no joy with sudo apt install ocl-icd-libopencl1. It said it was "already the newest version (2.2.14-2)". I'll have to search around the Nvidia drivers to see what's available.
When the drivers go ka-bloooey for unknown reasons, it is often best and fastest to just do a purge of Nvidia drivers and reinstall.
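As a hedged sketch of what a purge and reinstall might look like: the script below only prints a plan for review and does not run anything. It assumes Ubuntu with apt and the ubuntu-drivers tool; the package globs and the function name nvidia_reinstall_plan are illustrative assumptions, so adjust them for your distro and driver series.

```shell
#!/bin/sh
# Print (do not execute) a purge-and-reinstall plan so it can be reviewed first.
# Assumes Ubuntu with apt and ubuntu-drivers; other distros need other commands.
nvidia_reinstall_plan() {
    cat <<'EOF'
sudo apt purge 'nvidia-*' 'libnvidia-*'
sudo apt autoremove
sudo ubuntu-drivers autoinstall
sudo reboot
EOF
}

nvidia_reinstall_plan
```

After the reboot, running clinfo again should report at least one platform before GPU work is resumed in BOINC.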