Heads-up: BRP4 might run briefly out of work

Mad_Max
Joined: 2 Jan 10
Posts: 165
Credit: 2250401027
RAC: 626015

Bernd Machenschalk wrote:

BRP4G will remain out of work. Our intention is to direct the computing power of the official Intel CPU BRP4G apps to FGRP5.

Does this mean that there will be no more BRP4 tasks for x86 CPUs, and that the only task type available for x86_64 CPUs now and in the foreseeable future is FGRP5?

And do I understand correctly (correct me if I'm wrong) that the current distribution is:

O3AS and BRP7 searches are for NV+AMD GPUs only (no CPU app)

BRP4A for ARM CPUs

BRP4 for Intel+ARM GPUs only (no CPU app)

FGRP5 for x86 CPUs only (no GPU app)

Link
Joined: 15 Mar 20
Posts: 136
Credit: 12236532
RAC: 39503

Mad_Max wrote:
BRP4 for Intel+ARM GPUs only (no CPU app)

All applications for ARM are for CPU, not GPU.

Keith Myers
Joined: 11 Feb 11
Posts: 5054
Credit: 19116400962
RAC: 5634827

There is no difference between the BRP4 CPU and GPU tasks on ARM platforms. You can run the BRP4 tasks on the Nvidia GPU in Jetsons with Gaurav's application under the anonymous platform.

Same with the BRP4G tasks before they were removed.

I've been doing so since forever on my Jetson Nano and TX2-NX SBCs:

Nano

TX2-NX

Link
Joined: 15 Mar 20
Posts: 136
Credit: 12236532
RAC: 39503

Keith Myers wrote:
There is no difference between the BRP4 CPU and GPU tasks on ARM platforms. You can run the BRP4 tasks on the Nvidia GPU in Jetsons with Gaurav's application under the anonymous platform.

I was talking about the official applications; they are only for ARM CPUs. With the anonymous platform you can also run BRP4 on x86_64 CPUs, either with the official BRP4/BRP4G application (it's the same executable for both) or with the enhanced app. Even the old 1.33 CUDA application should still work on Nvidia GPUs if someone still has it on their computer.

Bernd Machenschalk
Moderator
Administrator
Joined: 15 Oct 04
Posts: 4343
Credit: 252703861
RAC: 36052

O3AS is our current GW search, the main purpose of the project, and we try to direct as much computing power as possible to it. However, it requires the most computation, so much that it's hardly feasible even for modern fast CPUs, so it's GPU-only (NVidia & AMD).

BRP7 also requires a lot of computation, so we're trying to use it for the GPUs that can't run O3AS. Besides AMD & NVidia, we're trying to use the Intel GPUs that (still) have double precision support, although the distinction doesn't seem to work perfectly (yet). Sorry if you have to manually disable it on your Intel GPU because it doesn't work.

FGRP5 is well suited for modern CPUs, Intel and fast ARM64 (including Apple Silicon).

BRP4 is our least demanding search in terms of memory, disk space, computing power, etc., so we have application versions for all low-profile devices: Android mobile devices, PowerPC Macs, Raspberry Pi, and also the GPUs that lack double precision capability.

BRP4A is the application that we currently use to test Apple Silicon versions (CPU & Metal GPU). It is split out from BRP4 mainly to have finer control over the validation.

When we do have some BRP4 analysis that requires more urgency than the low-profile devices can deliver, we put it into the BRP4G pipeline. In that pipeline, the workunits are actually bundles of multiple (4-16) single BRP4 tasks, because single BRP4 tasks would run too fast on the devices we run these on (currently fast CPUs) and flood our DB with requests. Currently, though, we don't have such urgency, and the BRP4G pipeline is suspended.
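
The effect of bundling on server load can be illustrated with a bit of arithmetic. This is only a rough sketch: the 60-second task runtime and the host count are made-up assumptions, and `results_per_hour` is a hypothetical helper, not anything from the project's code. Only the 4-16 bundle range comes from the post above.

```python
def results_per_hour(task_seconds, bundle_size, hosts):
    """Results reported to the server per hour by `hosts` identical devices,
    each working through one bundle of `bundle_size` tasks at a time."""
    seconds_per_result = task_seconds * bundle_size
    return hosts * 3600 / seconds_per_result


# Assumed numbers for illustration only: 60 s per single task, 1000 hosts.
for bundle_size in (1, 4, 16):
    rate = results_per_hour(task_seconds=60, bundle_size=bundle_size, hosts=1000)
    print(f"bundle of {bundle_size:2d}: {rate:.0f} results/hour")
```

Under these assumptions the number of results the database has to handle drops in direct proportion to the bundle size, which is the point of bundling fast tasks.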

This is our policy regarding "official" applications. Of course with anonymous platform apps you can do what you want and bypass that as you like. As long as the number of people doing this is small enough to not endanger the whole system, this is OK for us and welcome. But bear in mind that we have the policy above for good reason.
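
For quick reference, the policy above could be sketched as a small lookup table. This is only an illustrative paraphrase: the device labels and the `searches_for` helper are hypothetical and do not correspond to any project configuration or BOINC API.

```python
# Hypothetical device labels; the mapping paraphrases the post above and is
# not an official project configuration.
OFFICIAL_APPS = {
    "O3AS": {"nvidia_gpu", "amd_gpu"},
    "BRP7": {"nvidia_gpu", "amd_gpu", "intel_gpu_fp64"},
    "FGRP5": {"intel_cpu", "fast_arm64_cpu"},  # fast ARM64 includes Apple Silicon
    "BRP4": {"android", "powerpc_mac", "raspberry_pi", "gpu_without_fp64"},
    "BRP4A": {"apple_silicon_cpu", "apple_silicon_gpu"},  # test versions
}

# BRP4G (bundled BRP4 tasks for fast CPUs) is suspended, so it is left out.


def searches_for(device):
    """Return the searches whose official apps target the given device label."""
    return sorted(name for name, devices in OFFICIAL_APPS.items() if device in devices)


print(searches_for("nvidia_gpu"))
print(searches_for("intel_cpu"))
```

Anonymous platform setups are deliberately not modelled here, since they bypass this mapping.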

Have a nice Xmas and thanks for supporting Einstein@Home!

BM

GWGeorge007
Joined: 8 Jan 18
Posts: 3170
Credit: 5121986723
RAC: 3812264

Bernd Machenschalk wrote:

O3AS is our current GW search, the main purpose of the project.

.....snip.....

Have a nice Xmas and thanks for supporting Einstein@Home!

Thank you Bernd for a nice clarification of the projects!

I hope your Holidays are pleasant as well!

George

Proud member of the Old Farts Association

Mad_Max
Joined: 2 Jan 10
Posts: 165
Credit: 2250401027
RAC: 626015

Thanks for the details, Bernd!

Bernd Machenschalk wrote:

O3AS is our current GW search, the main purpose of the project and we try to direct as much computing power as possible to it. However, it requires the most computation, so that it's hardly feasible even for modern fast CPUs; so it's GPU only (NVidia&AMD).

BRP7 also requires a lot of computation, so we're trying to utilize the GPUs for it that can't run O3AS.

A small optimization recommendation on this part, which would be nice to act on after the holidays, just in case you haven't noticed it on your own yet.

If, as you write, the GW sub-project is the highest priority and you want to direct ALL available and suitable computing power to it, then it looks like it's time to check and tune the "work generator" app for the O3AS search (or simply allocate more computing resources to it).

Recently (actually for at least several months now) I have observed that my GPUs, which handle GW tasks just fine (usually >99.x% of WUs completed and validated successfully), have begun to receive BRP7 tasks quite often instead of O3AS (both are allowed in my settings). When I got curious about why this was happening and did a little "debugging", I found that it happens when the server scheduler simply has no O3AS tasks ready to send, so it sends BRP7 tasks instead (unless the user has disabled BRP7 in the settings, of course).

I determined this from the fact that if I disable BRP7 in the settings, the client sometimes receives "no work available for O3AS" responses from the server when requesting tasks (so I turned BRP7 back on to avoid possible downtime). The project's official statistics page supports this too: the "Tasks to send" line for O3AS usually floats around relatively low values, typically from several dozen to several hundred tasks, sometimes dropping to almost zero, whereas all other active sub-projects have at least several thousand tasks ready to send at any given moment.

So it looks like the O3AS work generator is already failing to keep up with the high demand. Because of this, some of the GPU computing power that could have been used for O3AS (i.e., the hardware meets the requirements and the owner has opted in) is inadvertently "leaking" to BRP7 for purely technical reasons.
And the overall maximum throughput of the GW search is now, in fact, limited on the server side: when additional computing power arrives from volunteers with suitable GPUs, almost all of that ADDITIONAL power will similarly "leak" from O3AS to BRP7.

Note: of course, this does not mean that new users or newly added hardware will not receive GW tasks at all. They will get plenty of GW work too. It's just that the work proportions will shift slightly for all existing users participating in both searches: they will receive slightly fewer GW tasks and slightly more BRP7 tasks after each increase in the total GPU computing power involved in the project.
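
The "leak" argument can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how the real scheduler works: it assumes the work generator produces a fixed number of O3AS tasks per period, that BRP7 never runs dry, and that every GPU accepts both searches. The function name and all numbers are made up for illustration.

```python
def split_gpu_work(o3as_supply, gpu_demand):
    """Return (o3as_sent, brp7_sent) for one scheduling period.

    o3as_supply: O3AS tasks the work generator makes available in the period.
    gpu_demand:  tasks requested by GPUs that are eligible for both searches.
    """
    o3as_sent = min(o3as_supply, gpu_demand)
    brp7_sent = gpu_demand - o3as_sent  # fallback once O3AS has run dry
    return o3as_sent, brp7_sent


# Assumed generator rate of 1000 tasks per period; demand grows as GPUs join.
for demand in (800, 1000, 1200, 1500):
    o3as, brp7 = split_gpu_work(o3as_supply=1000, gpu_demand=demand)
    print(f"demand={demand}: O3AS={o3as}, BRP7={brp7}")
```

In this model, once demand exceeds the generator rate, O3AS throughput stays flat and every additional request ends up as BRP7 work, which matches the behaviour described above.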

AndreyOR
Joined: 28 Jul 19
Posts: 44
Credit: 745356910
RAC: 904399

Mad_Max wrote:
"... Tasks to send" line for the O3AS project usually floats around relatively low values - usually only from several dozen to several hundred tasks, sometimes dropping to almost zero. Whereas for all other active sub-projects, this means at least several thousand tasks ready to be sent at any given moment. ...

I remember reading that there's a reason for O3AS work being released in smaller batches. I don't remember what it is, though; perhaps someone who does can chime in.

Also, a CUDA O3AS app for Windows is in the works; when it's ready, it should significantly increase the overall processing pace of O3AS. According to Bernd, the Linux version was released first because it was much easier to implement, and the Windows version has so far been problematic to get working.
