BRP6_1.56_cuda55 - Driver API not loading on either Host

JBird
Joined: 22 Dec 14
Posts: 1963
Credit: 4046216051
RAC: 0
Topic 198146

BRP6_1.56_cuda55 - Driver API not loading on either Host - 6 Run/in Pending so far
=
[13:38:41][2120][INFO ] Using CUDA device #0 "GeForce GTX 960" (1024 CUDA cores / -1676.60 GFLOPS)
[13:38:41][2120][INFO ] Version of installed CUDA driver: 7050
[13:38:41][2120][INFO ] Version of CUDA driver API used: 3020
==
[15:54:57][3868][INFO ] Using CUDA device #0 "GeForce GTX 970" (1664 CUDA cores / -38.46 GFLOPS)
[15:54:57][3868][INFO ] Version of installed CUDA driver: 7050
[15:54:57][3868][INFO ] Version of CUDA driver API used: 3020
=
What am I missing?

Thanks

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220564931
RAC: 970958

RE: What am I missing?

Quote:

What am I missing?

Thanks


That you have validation on each host?

So what is the concern here?

FYI, I've got at least nine validations across three hosts and five GPUs, and on a spot check of one stderr from each host, all contain lines like:
[pre]
[09:01:47][6004][INFO ] Version of installed CUDA driver: 6050
[09:01:47][6004][INFO ] Version of CUDA driver API used: 3020
[/pre]

Gary Roberts
Moderator
Joined: 9 Feb 05
Posts: 5872
Credit: 117537716817
RAC: 35323014

I've combined the other two

I've combined the other two messages you posted here because it's not nice for others to have the topics of their threads moved off in an entirely different direction without their consent. You only need to post once. Churning out the same message in different places is more likely to get your concerns ignored than to elicit useful responses.

Quote:
What am I missing?


An actual description as to exactly what it is that you think is a 'problem'.

You said, "Driver API not loading on either Host ...." but I don't know what you mean by that. The stderr output tells you the version of the driver you have installed on your computer and it also tells you about the version of the API used when the BRP6 app was built by the Devs. These are two entirely different things. I'm not a programmer so I don't know the full significance but I'm sure you don't need to be worried about this.
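To make the two numbers easier to compare: NVIDIA encodes CUDA versions as 1000 × major + 10 × minor, so 7050 reads as 7.5 and 3020 as 3.2. A minimal Python sketch of that decoding follows. The function name `decode_cuda_version` is invented for this example and is not part of the BRP6 app or of any CUDA library.

```python
def decode_cuda_version(code: int) -> str:
    """Turn an integer such as 7050 into a dotted version such as '7.5'.

    Assumes NVIDIA's usual encoding: 1000 * major + 10 * minor.
    """
    major, rest = divmod(code, 1000)
    minor = rest // 10
    return f"{major}.{minor}"


if __name__ == "__main__":
    # Values quoted from the stderr excerpts in this thread.
    for code in (7050, 6050, 3020):
        print(code, "->", decode_cuda_version(code))
```

Decoding only tells you which CUDA release each number refers to. It does not by itself say which libraries the app loaded at run time.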

If you think there is a problem, what you should do is provide a link to the complete file - like this one for example, so that someone prepared to help can see the full context of what you are concerned about. Then you should ask specific questions. Perhaps you wanted to know why the reported GFLOPS was negative? Perhaps you wanted to know the significance of "CUDA driver version (7050)" as opposed to "CUDA driver API (3020)"? Whatever it was, you should ask the specific question.
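On the negative GFLOPS question, nothing in this thread confirms the cause. One common way such a figure can arise is a 32-bit signed integer wrapping around during a peak-speed estimate. The Python sketch below only demonstrates that effect. The formula (cores × 2 × clock), the kHz unit, the clock value and both function names are assumptions chosen for illustration; none of them is taken from the BRP6 source code or from either host.

```python
def wrap_int32(value: int) -> int:
    """Reduce an integer to the range of a signed 32-bit int (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def estimated_gflops(cores: int, clock_khz: int) -> float:
    """Hypothetical peak estimate: cores * 2 operations per cycle * clock in kHz.

    The product is in kFLOPS. Wrapping it to 32 bits first mimics what a
    32-bit C 'int' would typically hold; the result is then scaled to GFLOPS.
    """
    kflops = wrap_int32(cores * 2 * clock_khz)
    return kflops / 1e6


if __name__ == "__main__":
    # 1024 cores as in the GTX 960 log line; 1,278,500 kHz is an assumed clock.
    print(f"{estimated_gflops(1024, 1_278_500):.2f} GFLOPS")
    # A slower assumed clock stays below 2**31 and remains positive.
    print(f"{estimated_gflops(1024, 1_000_000):.2f} GFLOPS")
```

If this guess is right, the negative figure would be a cosmetic reporting quirk rather than a sign that the GPU is misbehaving, which fits with the tasks validating.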

Before even getting to the point of reporting something as a 'problem', you should look to see if there is consistency between what is being reported for different tasks on the same host and also on different hosts. If there is consistency from task to task and host to host, it's very likely that's 'just the way it is' rather than there being any problem at all. The final clincher is when tasks start validating, as some of yours have. The other point to realise is that the stderr outputs have indicators in square brackets - like [INFO] or [ERROR] for example. Unless you can see [ERROR], there is no reason to presume a problem exists.
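As a rough sketch of that check, the following Python snippet counts the bracketed indicators in a stderr text. It assumes every line of interest has the shape `[time][pid][LEVEL ] message` seen in the excerpts above; the function name `count_levels` is made up for this example.

```python
import re
from collections import Counter

# Matches lines such as: [13:38:41][2120][INFO ] Using CUDA device #0 ...
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\[(\d+)\]\[(\w+)\s*\]\s*(.*)$")


def count_levels(stderr_text: str) -> Counter:
    """Count how many lines carry each bracketed indicator (INFO, ERROR, ...)."""
    levels = Counter()
    for line in stderr_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            levels[match.group(3)] += 1
    return levels


if __name__ == "__main__":
    sample = (
        '[13:38:41][2120][INFO ] Using CUDA device #0 "GeForce GTX 960"\n'
        "[13:38:41][2120][INFO ] Version of installed CUDA driver: 7050\n"
        "[13:38:41][2120][INFO ] Version of CUDA driver API used: 3020\n"
    )
    levels = count_levels(sample)
    print(dict(levels))
    print("problem reported" if levels["ERROR"] else "no [ERROR] lines")
```

Running the same count over several tasks and hosts gives a quick view of whether the output is consistent from one to the next.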

Cheers,
Gary.

JBird
Joined: 22 Dec 14
Posts: 1963
Credit: 4046216051
RAC: 0

Touché and apologies for

Touché and apologies for being a bit cryptic; as well as spattering my Posts in 3 places.
You did raise *another question I've had for a while, re: negative GFLOPS reporting; which I believe, I can remedy if I can find the post.

But, back to my original query: Yes I should have said,
"The *appropriate (5050) CUDA Driver API was not loaded"

I was looking for and expecting to *see/confirm -
[Version of CUDA driver API used: 5050]
Which would indicate CUDA 5.5 libraries actually loaded and used;

as opposed to the older 3020 associated with CUDA 3.2

=Both sets of cudart and cufft are of course in my Einstein Folder;
but 5050 (our primary advancement here) wasn't called and used.
=
Yes the 7050 is part of the NVidia 7.5 SDK Driver package associated with my Installed GTX 960 and 970 - It's backward compatible with earlier CUDA versions.
But this is NVidia GPU specific.
=
My Point and Wish, is to see 5.5/5050 USED - and *how to get there or make that happen. = This, was the gist of my "What am I missing?" statement.
=
Looking forward to the faster runtimes associated with it; as well as progress Forward to 6.5 and beyond, in the Windows Environment
6.5 is pretty darn amazing in that Dept (runtimes)
=
Anticipating latency Blessings from DirectX 12 - associated with Win 10, which is just around the corner (July 29 Release)
==
So again, sorry for any cryptic syntax or "gray" queries yesterday - I *thought I was Posting in the right Place(s) - no replies/response after an Hour? I was on the move, then.
Thanks Gary and archae86 for taking time to look at it.

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220564931
RAC: 970958

RE: My Point and Wish, is

Quote:
My Point and Wish, is to see 5.5/5050 USED - and *how to get there or make that happen. = This, was the gist of my "What am I missing?" statement.
=
Looking forward to the faster runtimes associated with it;


While I'm not even slightly expert on these matters, the stderr text simply asserts "Version of CUDA driver API used: 3020", which hardly precludes that some components of 5.5 were actually employed. Further, it is nearly certain that some were.

After all, the greater than 20% productivity improvement many of us have observed on various Kepler and Maxwell cards came from something and I'm not aware that other programming improvements were made in this revision aside from the CUDA level upgrade.

JBird
Joined: 22 Dec 14
Posts: 1963
Credit: 4046216051
RAC: 0

Whom do we ask then, for

Whom do we ask then, for confirmation/assurance that 5050 will load?
Must say, I'm not convinced, when I see 3020 instead.
=
Is there an app_info.xml or app_config.xml that can address this plan_class and Driver API appropriately?

Significant runtime improvements await.
=
Do look at Properties- Details you see in Einstein *and Seti folders of the cufft and cudart dlls.

Richard Haselgrove
Joined: 10 Dec 05
Posts: 2143
Credit: 2956096472
RAC: 716593

Use Process Explorer while

Use Process Explorer while the task is running to see which DLL files are being used, and where they're being loaded from.

I was caught out many years ago at SETI Beta, when I confidently predicted that an application would fail because it had been deployed wrongly - but it ran successfully and validated.

Process Explorer demonstrated (screenshot in that link) that Windows had found some SDK files I'd forgotten I'd even installed on that machine - and they were the right ones for the application.

In general, Windows will try to use the files supplied by Einstein first, but if you've installed developer files (not needed for normal crunching), they may be used instead.

floyd
Joined: 12 Sep 11
Posts: 133
Credit: 186610495
RAC: 0

RE: While I'm not even

Quote:
While I'm not even slightly expert on these matters, the stderr text simply asserts "Version of CUDA driver API used: 3020", which hardly precludes that some components of 5.5 were actually employed.


I'm no expert either but to me "Version of CUDA driver API used: 3020" actually does mean no features of CUDA later than 3.2 can be directly used. All improvement would then come from the supplied CUDA 5.5 library being more efficient on CUDA 3.2 code than the older CUDA 3.2 library.

Which raises the question why we need to use a particular library at all. Couldn't the application utilize whatever CUDA library is installed on the target system, provided it implements a minimum set of features?

archae86
Joined: 6 Dec 05
Posts: 3157
Credit: 7220564931
RAC: 970958

RE: Use Process Explorer

Quote:
Use Process Explorer while the task is running to see which DLL files are being used, and where they're being loaded from.


On my way to trying this I also noticed that the slot directories which have the 1.52 executable have cudart and cufft dlls indicating 3.2, while the slot directories with the 1.56 executable have versions with file names indicating 5.5.

Moving on to your actual suggestion, for a currently running Parkes 1.56 task, I find
[pre]
cudart32_55.dll NVIDIA CUDA Runtime, Version 5.5.20 NVIDIA Corporation 6.14.11.5050
cufft32_55.dll NVIDIA CUDA FFT library, Version 5.5.20 NVIDIA Corporation 6.14.11.5050[/pre]

While there are dozens of other dll's I don't find any dll's labeled as CUDA 3.2.

[If some reading this have, like me, never previously used Process Explorer to find dll usage, it may save you some searching if I mention that to get there you use View | Show Lower Pane, then View | Lower Pane View | DLLs.]
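Once the DLL names are copied out of Process Explorer, the CUDA release can be read off the file names. The Python sketch below does that for names shaped like the two listed above. The helper name `cuda_dll_version` is invented for this example, and the rule that the last digit of the suffix is the minor version is an assumption based on these two names.

```python
import re

# e.g. cudart32_55.dll or cufft32_55.dll; an optional trailing build
# number after a second underscore is tolerated.
DLL_RE = re.compile(r"^(cudart|cufft)(32|64)_(\d{2,})(?:_\d+)?\.dll$", re.IGNORECASE)


def cuda_dll_version(filename: str):
    """Return (library, bitness, 'major.minor') or None if the name does not match."""
    match = DLL_RE.match(filename)
    if not match:
        return None
    library, bits, digits = match.groups()
    # Assumption: the last digit is the minor version, the rest is the major.
    return library.lower(), int(bits), f"{digits[:-1]}.{digits[-1]}"


if __name__ == "__main__":
    # Names as reported for the running 1.56 task, plus one unrelated DLL.
    for name in ("cudart32_55.dll", "cufft32_55.dll", "kernel32.dll"):
        print(name, "->", cuda_dll_version(name))
```

Anything that comes back as `None` is simply not a cudart or cufft library by this naming pattern, so the dozens of other DLLs in the list drop out.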
