[22:17:42][7988][INFO ] Seed for random number generator is 1147371725.
Activated exception handling...
22:19:04 (5132): Can't set up shared mem: -1. Will run in standalone mode.
22:19:04 (5132): called boinc_finish
Activated exception handling...
22:19:51 (5224): Can't set up shared mem: -1. Will run in standalone mode.
22:19:51 (5224): called boinc_finish
I can't run it offline; the app exits after a few seconds...
Should I supply some command-line options for it? I copied the real executable into the corresponding slot directory and am trying to run it from there (BOINC switched off).
Activated exception handling...
19:54:35 (3092): Can't set up shared mem: -1. Will run in standalone mode.
[19:54:35][3092][INFO ] Starting data processing...
[19:54:35][3092][INFO ] Using OpenCL platform provided by: Intel(R) Corporation
[19:54:35][3092][INFO ] Using OpenCL device "Intel(R) HD Graphics 4600" by: Intel(R) Corporation
[19:54:35][3092][INFO ] Checkpoint file unavailable: status.cpt (No such file or directory).
------> Starting from scratch...
[19:54:35][3092][INFO ] Header contents:
------> Original WAPP file: ./p2030.20130203.G203.76-01.67.N.b5s0g0.00000_DM77.70
------> Sample time in microseconds: 65.4762
------> Observation time in seconds: 274.62705
------> Time stamp (MJD): 56327.094780941181
------> Number of samples/record: 0
------> Center freq in MHz: 1214.289551
------> Channel band in MHz: 0.33605957
------> Number of channels/record: 960
------> Nifs: 1
------> RA (J2000): 62859.8098984
------> DEC (J2000): 72421.9974003
------> Galactic l: 0
------> Galactic b: 0
------> Name: G203.76-01.67.N
------> Lagformat: 0
------> Sum: 1
------> Level: 3
------> AZ at start: 0
------> ZA at start: 0
------> AST at start: 0
------> LST at start: 0
------> Project ID: --
------> Observers: --
------> File size (bytes): 0
------> Data size (bytes): 0
------> Number of samples: 4194304
------> Trial dispersion measure: 77.7 cm^-3 pc
------> Scale factor: 0.000858516
[19:54:36][3092][INFO ] Seed for random number generator is 1174326477.
[19:54:38][3092][ERROR] Error in OpenCL context: Out of device memory.
[19:54:38][3092][ERROR] Error in OpenCL context: Out of device memory.
[19:54:38][3092][INFO ] Derived global search parameters:
------> f_A probability = 0.08
------> single bin prob(P_noise > P_thr) = 1.32531e-008
------> thr1 = 18.139
------> thr2 = 21.241
------> thr4 = 26.2686
------> thr8 = 34.6478
------> thr16 = 48.9581
(and is still running). Pretty much the same as the live result for task 406867971.
Recipe:
Copied all the symlinks out of the slot directory into a temporary folder for reference.
Made a test working folder with the full versions of the same files.
Looked in client_state for the description of the same task.
Concatenated a command line from the data there:
Could you quickly try this in a test build of your app? It seems only the developers can help any further in this matter.
MrS
Scanning for our furry friends since Jan 2002
@Oliver
Which AMD SDK version did you use to build the intel_gpu version of your app?
Also, there are NV-related comments in your OpenCL code. So is there no OpenCL version for NV just because the CUDA build is faster, or are there other issues?
RE: Could you quickly try
Yes, I will try to replicate all the differences eventually, because this 100% CPU issue deserves understanding IMO.
RE: Cause this 100% CPU
Yeah.. you could probably improve the power efficiency of the iGPU by >50% just by getting rid of the CPU usage! Assuming performance wouldn't suffer, of course.
MrS
Scanning for our furry friends since Jan 2002
RE: RE: Cause this 100%
That was my thinking too. If the CPU could be 'held in reserve but not doing much' (the way the current Einstein app seems to be working), rather than 'spinning like mad', I reckon I'd see a 10W (>10%) reduction in total system power draw. That's a significant drop.
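To make the 'spinning' versus 'held in reserve' distinction concrete, here is a minimal sketch of a wait loop that sleeps between completion checks instead of busy-waiting. It is not taken from the Einstein@Home code: `wait_with_sleep` and its parameters are hypothetical names, and the predicate stands in for whatever completion check the host code uses (in an OpenCL host that could, for instance, be a `clGetEventInfo` query for `CL_EVENT_COMMAND_EXECUTION_STATUS` compared against `CL_COMPLETE`).

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not from the app): waits until is_done() returns true,
// sleeping between checks so the waiting thread gives up the CPU instead of
// spinning. Returns how many times is_done() was called.
inline int wait_with_sleep(const std::function<bool()>& is_done,
                           std::chrono::microseconds poll_interval) {
    int polls = 1;                      // counts the first check in the loop condition
    while (!is_done()) {
        std::this_thread::sleep_for(poll_interval);  // yield the core while waiting
        ++polls;                        // counts the next check
    }
    return polls;
}
```

The trade-off is latency: a longer poll interval lowers CPU load but can leave the GPU idle between kernels, so whether performance suffers would have to be measured.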
RE: @Oliver And what AMD
2.6, for our Linux and Windows builds.
Faster builds? Where did you see that (line number)?
No, our app just won't validate correctly when run on NVIDIA GPUs. We intended and developed it to be able to, but never got round to getting it working correctly. I'd have loved to drop CUDA in favour of OpenCL, but as I said, OpenCL is a dead end concerning NVIDIA...
Oliver
Einstein@Home Project
I was referring to these lines in the code (they imply you considered the OpenCL build suitable for NV too, hence my question about speed/validity. The answer is validity issues, OK):
// defined in OpenCL 1.1 (but Apple is still using 1.0)
#ifndef CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV
#define CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV 0x4000
#define CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV 0x4001
#endif
// NVIDIA-specific
bool nvidia = false;
cl_uint nvCompCapMajor = 0;
cl_uint nvCompCapMinor = 0;
in ocl_utilities.cpp.
#define VENDOR_AMD 1
#define VENDOR_NVIDIA 2
in demod_binary_ocl.cpp
BTW, no INTEL vendor is defined at all.
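For illustration, a vendor check that also covers Intel could look roughly like the sketch below. This is not the project's code: `classify_vendor`, `VENDOR_UNKNOWN` and `VENDOR_INTEL` are hypothetical, only `VENDOR_AMD` and `VENDOR_NVIDIA` (with their values) come from the lines quoted above, and the input is assumed to be the vendor string reported by the OpenCL runtime, such as the "Intel(R) Corporation" seen in the log output in this thread.

```cpp
#include <string>

// VENDOR_AMD and VENDOR_NVIDIA mirror the #defines quoted from
// demod_binary_ocl.cpp; VENDOR_UNKNOWN and VENDOR_INTEL are assumptions
// added for this sketch only.
enum Vendor {
    VENDOR_UNKNOWN = 0,
    VENDOR_AMD = 1,
    VENDOR_NVIDIA = 2,
    VENDOR_INTEL = 3
};

// Hypothetical helper: maps an OpenCL vendor string to a Vendor value
// using simple substring matching.
inline Vendor classify_vendor(const std::string& vendor) {
    if (vendor.find("NVIDIA") != std::string::npos) return VENDOR_NVIDIA;
    if (vendor.find("Advanced Micro Devices") != std::string::npos ||
        vendor.find("AMD") != std::string::npos) return VENDOR_AMD;
    if (vendor.find("Intel") != std::string::npos) return VENDOR_INTEL;
    return VENDOR_UNKNOWN;
}
```

Substring matching is a deliberate simplification; exact vendor strings can differ between driver versions, so an unknown vendor should fall back to a generic code path.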
PS. I run BOINC 6 on my dev netbook, where most of the AstroPulse profiling is done, and your app requires BOINC 7 to be downloaded. So I installed CodeXL on another host, with an ATI HD6950 + BOINC 7, but have not had time to try profiling so far.
Will post results later (the app downloaded OK there, along with some tasks to do).
RE: you considered OpenCL
Yep.
Yep, we don't have any Intel-specifics in the code so far (even VENDOR_AMD is unused right now)...
Great!
Einstein@Home Project
RE: [22:17:42][7988][INFO ]
I can't run it offline; the app exits after a few seconds...
Should I supply some command-line options for it? I copied the real executable into the corresponding slot directory and am trying to run it from there (BOINC switched off).
Well, I tried much the same, and it starts off:
(and is still running). Pretty much the same as the live result for task 406867971.
Recipe:
Copied all the symlinks out of the slot directory into a temporary folder for reference.
Made a test working folder with the full versions of the same files.
Looked in client_state for the description of the same task.
Concatenated a command line from the data there:
Hit enter
Noted "[19:54:35][3092][INFO ] Application startup - thank you for supporting Einstein@Home!"
Went and made a cup of coffee.
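The first two recipe steps (finding what each slot-directory link points at, then fetching the full file) could be helped along with something like the sketch below. It rests on an assumption not stated in this thread: that the "symlinks" in a BOINC slot directory are small text files of the form `<soft_link>path</soft_link>`. The function name `soft_link_target` is hypothetical.

```cpp
#include <string>

// Hypothetical helper: given the text of a slot-directory link file, returns
// the path between <soft_link> and </soft_link>, or an empty string if the
// text does not have that form. The link format itself is an assumption.
inline std::string soft_link_target(const std::string& contents) {
    const std::string open_tag = "<soft_link>";
    const std::string close_tag = "</soft_link>";
    std::string::size_type start = contents.find(open_tag);
    if (start == std::string::npos) return "";
    start += open_tag.size();                       // skip past the opening tag
    const std::string::size_type end = contents.find(close_tag, start);
    if (end == std::string::npos) return "";
    return contents.substr(start, end - start);     // path between the tags
}
```

The returned path is relative to the slot directory, so the real file would be copied from there into the test working folder.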
Edit - came back a few minutes later to find