Can't get GPU work (how to read scheduler logs?)

Harri Liljeroos
Joined: 10 Dec 05
Posts: 4463
Credit: 3265752395
RAC: 1917399
Topic 198518

I am having trouble getting GPU work (Nvidia). My host is requesting work for the Nvidia GPU, but nothing gets sent. Here's the scheduler log from one request. It suggests to me that about 70,000 seconds' worth of work is requested, but the scheduler does not react to that. It only checks the situation for the Intel GPU and the CPU. Currently the CPU does not need new work, and my preferences do not allow the Intel GPU to be used. So what is going wrong? Or can someone explain how to read this log?

2016-03-19 17:44:00.5992 [PID=19226]   Request: [USER#xxxxx] [HOST#7964949] [IP xxx.xxx.xxx.75] client 7.6.22
2016-03-19 17:44:00.7143 [PID=19226]    [send] effective_ncpus 8 max_jobs_on_host_cpu 999999 max_jobs_on_host 999999
2016-03-19 17:44:00.7143 [PID=19226]    [send] effective_ngpus 3 max_jobs_on_host_gpu 999999
2016-03-19 17:44:00.7143 [PID=19226]    [send] Not using matchmaker scheduling; Not using EDF sim
2016-03-19 17:44:00.7143 [PID=19226]    [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2016-03-19 17:44:00.7143 [PID=19226]    [send] CUDA: req 70436.38 sec, 1.00 instances; est delay 0.00
2016-03-19 17:44:00.7143 [PID=19226]    [send] Intel GPU: req 0.00 sec, 0.00 instances; est delay 0.00
2016-03-19 17:44:00.7143 [PID=19226]    [send] work_req_seconds: 0.00 secs
2016-03-19 17:44:00.7144 [PID=19226]    [send] available disk 36.73 GB, work_buf_min 17280
2016-03-19 17:44:00.7144 [PID=19226]    [send] active_frac 0.998884 on_frac 0.999224 DCF 1.516099
2016-03-19 17:44:00.7153 [PID=19226]    [send] [HOST#7964949] is reliable
2016-03-19 17:44:00.7153 [PID=19226]    [send] set_trust: random choice for error rate 0.000010: yes
2016-03-19 17:44:00.7154 [PID=19226]    [mixed] sending locality work first (0.6233)
2016-03-19 17:44:00.7541 [PID=19226]    [version] Checking plan class 'X64O1F'
2016-03-19 17:44:00.7565 [PID=19226]    [version] reading plan classes from file '/BOINC/projects/EinsteinAtHome/plan_class_spec.xml'
2016-03-19 17:44:00.7566 [PID=19226]    [version] plan class ok
2016-03-19 17:44:00.7566 [PID=19226]    [version] Don't need CPU jobs, skipping version 104 for einstein_O1AS20-100F (X64O1F)
2016-03-19 17:44:00.7566 [PID=19226]    [version] Checking plan class 'SSE2O1F'
2016-03-19 17:44:00.7566 [PID=19226]    [version] plan class ok
2016-03-19 17:44:00.7567 [PID=19226]    [version] Don't need CPU jobs, skipping version 104 for einstein_O1AS20-100F (SSE2O1F)
2016-03-19 17:44:00.7567 [PID=19226]    [version] no app version available: APP#35 (einstein_O1AS20-100F) PLATFORM#9 (windows_x86_64) min_version 0
2016-03-19 17:44:00.7567 [PID=19226]    [version] no app version available: APP#35 (einstein_O1AS20-100F) PLATFORM#2 (windows_intelx86) min_version 0
2016-03-19 17:44:00.7569 [PID=19226]    [mixed] sending non-locality work second
2016-03-19 17:44:00.7736 [PID=19226]    [send] [HOST#7964949] will accept beta work.  Scanning for beta work.
2016-03-19 17:44:00.7876 [PID=19226]    [version] Checking plan class 'opencl-intel_gpu'
2016-03-19 17:44:00.7876 [PID=19226]    [version] parsed project prefs setting 'gpu_util_brp': 1.000000
2016-03-19 17:44:00.7877 [PID=19226]    [version] [HOST#7964949] device name: 'Intel(R) HD Graphics 4000'; OpenCL driver version: 9.18.10.3165; platform version: OpenCL 1.2; device version: OpenCL 1.2
2016-03-19 17:44:00.7877 [PID=19226]    [version] Peak flops supplied: 4.48e+10
2016-03-19 17:44:00.7877 [PID=19226]    [version] plan class ok
2016-03-19 17:44:00.7877 [PID=19226]    [version] Don't need Intel GPU jobs, skipping version 134 for einsteinbinary_BRP4 (opencl-intel_gpu)
2016-03-19 17:44:00.7877 [PID=19226]    [version] Checking plan class 'opencl-intel_gpu-Beta'
2016-03-19 17:44:00.7877 [PID=19226]    [version] parsed project prefs setting 'gpu_util_brp': 1.000000
2016-03-19 17:44:00.7877 [PID=19226]    [version] [HOST#7964949] device name: 'Intel(R) HD Graphics 4000'; OpenCL driver version: 9.18.10.3165; platform version: OpenCL 1.2; device version: OpenCL 1.2
2016-03-19 17:44:00.7877 [PID=19226]    [version] Peak flops supplied: 4.48e+10
2016-03-19 17:44:00.7877 [PID=19226]    [version] plan class ok
2016-03-19 17:44:00.7877 [PID=19226]    [version] Don't need Intel GPU jobs, skipping version 134 for einsteinbinary_BRP4 (opencl-intel_gpu-Beta)
2016-03-19 17:44:00.7877 [PID=19226]    [version] Checking plan class 'opencl-intel_gpu-new'
2016-03-19 17:44:00.7877 [PID=19226]    [version] parsed project prefs setting 'gpu_util_brp': 1.000000
2016-03-19 17:44:00.7877 [PID=19226]    [version] [HOST#7964949] device name: 'Intel(R) HD Graphics 4000'; OpenCL driver version: 9.18.10.3165; platform version: OpenCL 1.2; device version: OpenCL 1.2
2016-03-19 17:44:00.7877 [PID=19226]    [version] driver version 918103165, min: 0, max: 1018103906
2016-03-19 17:44:00.7877 [PID=19226]    [version] Peak flops supplied: 4.48e+10
2016-03-19 17:44:00.7877 [PID=19226]    [version] plan class ok
2016-03-19 17:44:00.7877 [PID=19226]    [version] Don't need Intel GPU jobs, skipping version 134 for einsteinbinary_BRP4 (opencl-intel_gpu-new)
2016-03-19 17:44:00.7877 [PID=19226]    [version] Checking plan class 'opencl-intel_gpu'
2016-03-19 17:44:00.7878 [PID=19226]    [version] parsed project prefs setting 'gpu_util_brp': 1.000000
2016-03-19 17:44:00.7878 [PID=19226]    [version] [HOST#7964949] device name: 'Intel(R) HD Graphics 4000'; OpenCL driver version: 9.18.10.3165; platform version: OpenCL 1.2; device version: OpenCL 1.2
2016-03-19 17:44:00.7878 [PID=19226]    [version] Peak flops supplied: 4.48e+10
2016-03-19 17:44:00.7878 [PID=19226]    [version] plan class ok
2016-03-19 17:44:00.7878 [PID=19226]    [version] Don't need Intel GPU jobs, skipping version 134 for einsteinbinary_BRP4 (opencl-intel_gpu)
2016-03-19 17:44:00.7878 [PID=19226]    [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#9 (windows_x86_64) min_version 0
2016-03-19 17:44:00.7878 [PID=19226]    [version] no app version available: APP#19 (einsteinbinary_BRP4) PLATFORM#2 (windows_intelx86) min_version 0
2016-03-19 17:44:00.7931 [PID=19226] [debug]   [HOST#7964949] MSG(high) No work sent
2016-03-19 17:44:00.7931 [PID=19226] [debug]   [HOST#7964949] MSG(high) see scheduler log messages on https://einsteinathome.org/host_sched_logs/7964/7964949
2016-03-19 17:44:00.7931 [PID=19226]    Sending reply to [HOST#7964949]: 0 results, delay req 60.00
2016-03-19 17:44:00.7932 [PID=19226]    Scheduler ran 0.197 seconds
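
For anyone trying to read a log like the one above, a small script can pull out the two things that matter here: how many seconds each resource requested (the "[send] ...: req" lines) and which plan classes the scheduler actually checked (the "[version] Checking plan class" lines). The following is only a minimal sketch in Python. The regular expressions are guesses based on the lines shown in this post, not an official BOINC log format, and the names `summarize`, `REQ_RE`, `PLAN_RE` and `SAMPLE` are made up for illustration.

```python
import re

# Assumed pattern for per-resource request lines, e.g.
# "[send] CUDA: req 70436.38 sec, 1.00 instances; est delay 0.00"
REQ_RE = re.compile(
    r"\[send\] (?P<res>[\w ]+): req (?P<sec>[\d.]+) sec, (?P<inst>[\d.]+) instances"
)
# Assumed pattern for plan-class checks, e.g.
# "[version] Checking plan class 'opencl-intel_gpu'"
PLAN_RE = re.compile(r"\[version\] Checking plan class '(?P<plan>[^']+)'")


def summarize(log_text):
    """Return (requested seconds per resource, plan classes checked, in order)."""
    requests = {}
    plan_classes = []
    for line in log_text.splitlines():
        m = REQ_RE.search(line)
        if m:
            requests[m.group("res")] = float(m.group("sec"))
            continue
        m = PLAN_RE.search(line)
        if m and m.group("plan") not in plan_classes:
            plan_classes.append(m.group("plan"))
    return requests, plan_classes


# A few lines copied from the log in this post.
SAMPLE = """\
2016-03-19 17:44:00.7143 [PID=19226]    [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
2016-03-19 17:44:00.7143 [PID=19226]    [send] CUDA: req 70436.38 sec, 1.00 instances; est delay 0.00
2016-03-19 17:44:00.7143 [PID=19226]    [send] Intel GPU: req 0.00 sec, 0.00 instances; est delay 0.00
2016-03-19 17:44:00.7143 [PID=19226]    [send] work_req_seconds: 0.00 secs
2016-03-19 17:44:00.7541 [PID=19226]    [version] Checking plan class 'X64O1F'
2016-03-19 17:44:00.7877 [PID=19226]    [version] Checking plan class 'opencl-intel_gpu'
2016-03-19 17:44:00.7877 [PID=19226]    [version] Checking plan class 'opencl-intel_gpu'
"""

requests, plan_classes = summarize(SAMPLE)
wanted = [res for res, sec in requests.items() if sec > 0]
print("resources asking for work:", wanted)
print("plan classes checked:", plan_classes)
```

Run on the sample lines, the only resource asking for work is CUDA, while the plan classes checked are a CPU one and an Intel GPU one. That matches what is described above: work is requested for the Nvidia GPU, but the scheduler never gets to a CUDA plan class in this request.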

Pollux_P3D
Joined: 8 Feb 11
Posts: 30
Credit: 212418648
RAC: 0

Harri Liljeroos wrote:
Can't get GPU work


A server is probably down!

MarkHNC
Joined: 31 Aug 12
Posts: 37
Credit: 170965842
RAC: 0

I get this when I try to reach the server status page:

"Forbidden

"You don't have permission to access /server_status.html on this server.
Apache/2.2.3 (CentOS) Server at einstein.phys.uwm.edu Port 443"

Pollux_P3D
Joined: 8 Feb 11
Posts: 30
Credit: 212418648
RAC: 0

That will probably be "repaired" on Monday

Harri Liljeroos
Joined: 10 Dec 05
Posts: 4463
Credit: 3265752395
RAC: 1917399

GPU work is flowing once again; I received BRP6 and BRP4G work units just a few minutes ago. The scheduler log also looks OK now.
