Am I getting longer running tasks ...

Bert Hyman

Joined: 5 Dec 05

Posts: 15

Credit: 6206746

RAC: 0

26 Jan 2020 1:20:51 UTC

Topic 220540

... or has my computer sprung a leak?

Until recently, the "Gravitational Wave Search" tasks ran in about 10 hours on my old Intel i3-based Linux box. For the past week or so, I've found some running for over 15 hours. The machine's done nothing but run 2 tasks at a time, each receiving 100% of a thread on a 2-core 4-thread processor, for more than a week. The processes are shown as each using 25% of the CPU.

The two running tasks show an estimated 15 hour running time. They're

h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_13

h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_12

Those tasks and all those waiting to run have an "Estimated Computation Size" of 144,000 GFLOPs.

Is it my machine or the universe at fault here?
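
For a rough sense of scale, the figures quoted above can be turned into an implied processing rate. This is only a back-of-the-envelope sketch: it assumes the 144,000 GFLOPs "Estimated Computation Size" is a count of operations spread evenly over the estimated 15 hours, and the function name is made up for illustration. The client's own estimate may take more into account than this.

```python
def implied_gflops_per_second(size_gflop, hours):
    """Average rate (GFLOP per second) needed to finish size_gflop in the given hours."""
    return size_gflop / (hours * 3600)

# Figures from the post: 144,000 GFLOPs estimated size, 15 hour estimated runtime.
rate = implied_gflops_per_second(144_000, 15)
print(f"{rate:.2f} GFLOP/s per task")  # prints "2.67 GFLOP/s per task"
```

If that implied rate looks far off from what the machine normally manages, either the size estimate or the host is the odd one out.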

Richie

Joined: 7 Mar 14

Posts: 656

Credit: 1702989778

RAC: 0

The focus of the "search"

26 Jan 2020 1:53:15 UTC

Message 175410

The focus of the "search" changed from O2MD1V1_VelaJr1 to O2MD1C1_CasA. That can be seen in the names of the tasks.

Different data... and these CasA tasks may take longer to crunch through. Personally, all six of my CasA tasks so far have ended up as 'validate error' (Linux host). I don't know yet if it's my computer or something else.
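
As a small illustration of reading the search out of a task name, here is a sketch that pulls the label (e.g. O2MD1C1_CasA) from the names quoted in this thread. The layout it assumes, a double underscore followed by the run and target, is inferred only from those names and is not an official format; `search_label` is a made-up helper.

```python
def search_label(task_name):
    """Return the 'run_target' label after the '__' separator, or None if absent.

    Assumption: the label is the first two underscore-separated fields
    following the double underscore, as in the task names in this thread.
    """
    _, sep, tail = task_name.partition("__")
    if not sep:
        return None  # no '__' in the name, so this layout does not apply
    parts = tail.split("_")
    if len(parts) < 2:
        return None
    return "_".join(parts[:2])

print(search_label("h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_13"))
print(search_label("LATeah1002F_1320.0_178994_0.0_0"))
```

The first call gives O2MD1C1_CasA; the second name has no double underscore, so the helper returns None instead of guessing.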

Bert Hyman

Joined: 5 Dec 05

Posts: 15

Credit: 6206746

RAC: 0

Thanks. Looking back, I now

26 Jan 2020 2:49:02 UTC

Message 175411 in response to message 175410

Thanks. Looking back, I now see that the earlier tasks that ran long were actually "Gamma Ray Pulsar something." One was LATeah1002F_1320.0_178994_0.0_0.

So none of the new ones have even completed on my machine. I have 2 running and 6 more in the queue. Looking at the elapsed + estimated remaining time for the 2 active tasks, it looks like 16 hours each now.

See you all (much) later, I guess.

Keith Myers

Joined: 11 Feb 11

Posts: 4964

Credit: 18744270710

RAC: 7012190

All my O2MD1C1_CasA tasks

26 Jan 2020 7:44:32 UTC

Message 175413 in response to message 175410

All my O2MD1C1_CasA tasks have gone straight to validate errors too. You are not alone. Probably the tasks are bad.

ursmii

Joined: 15 Sep 19

Posts: 2

Credit: 20585157

RAC: 0

why should I spend so much

26 Jan 2020 10:19:03 UTC

Message 175414

Why should I spend so much computing time if so many were invalid?

bye bye einstein ...

Bert Hyman

Joined: 5 Dec 05

Posts: 15

Credit: 6206746

RAC: 0

The two tasks that finished

26 Jan 2020 13:54:38 UTC

Message 175415 in response to message 175413

The two tasks that finished both failed with validate errors.

h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_13_0

h1_0186.00_O2C02Cl1In0__O2MD1C1_CasA_186.15Hz_12_0

Looks like something is wrong.

At least it's keeping that part of the basement warm.

San-Fernando-Valley

Joined: 16 Mar 16

Posts: 409

Credit: 10206773455

RAC: 23113523

... that's an excellent

26 Jan 2020 14:07:19 UTC

Message 175416 in response to message 175414

... that's an excellent question ...

maybe somebody can explain that to you?

Holmis

Joined: 4 Jan 05

Posts: 1118

Credit: 1055935564

RAC: 0

When things like this happens

26 Jan 2020 16:28:51 UTC

Message 175420

When things like this happen, the project staff usually grants credit manually for failed tasks (validate errors) after the problem is fixed. So usually you'll get credit for the work done, but the science is still lost.

Bert Hyman

Joined: 5 Dec 05

Posts: 15

Credit: 6206746

RAC: 0

San-Fernando-Valley wrote:...

26 Jan 2020 18:52:00 UTC

Message 175422 in response to message 175416

San-Fernando-Valley wrote:

... that's an excellent question ...

maybe somebody can explain that to you?

Explain what, exactly?

Bert Hyman

Joined: 5 Dec 05

Posts: 15

Credit: 6206746

RAC: 0

Holmis wrote:When things like

26 Jan 2020 18:55:47 UTC

Message 175423 in response to message 175420

Holmis wrote:

When things like this happen, the project staff usually grants credit manually for failed tasks (validate errors) after the problem is fixed. So usually you'll get credit for the work done, but the science is still lost.

Not worried so much about the credits, but about the repeated failures of the tasks.

If there's something wrong on my end, I'd like to fix it. If there's a problem with the software that's massaging the data, I'll stop taking new tasks until I see that it's been fixed. Or, if the error reports are actually spurious, I'll let things continue as they are.

Holmis

Joined: 4 Jan 05

Posts: 1118

Credit: 1055935564

RAC: 0

Bert Hyman wrote:Holmis

26 Jan 2020 19:59:12 UTC

Message 175425 in response to message 175423

Bert Hyman wrote:

Holmis wrote:
When things like this happen, the project staff usually grants credit manually for failed tasks (validate errors) after the problem is fixed. So usually you'll get credit for the work done, but the science is still lost.

Not worried so much about the credits, but about the repeated failures of the tasks.

I do share that worry.

Quote:

If there's something wrong on my end, I'd like to fix it.

If you look at your tasks and find that others also get validate errors, then it is most probably not your fault.
A validate error is declared when there is something obviously wrong with the returned result. The validator does a sanity check on each result when it is returned, before trying to compare it with your wingman's result; if that check fails, the result is declared a validate error.
If you run your gear outside its default operating parameters and your wingmen return good results, that might be a good indication that your hardware is returning bad results.

Quote:

If there's a problem with the software that's massaging the data, I'll stop taking new tasks until I see that it's been fixed.

If you get numerous validate errors and your wingmen do too, then there is probably something wrong with the tasks. Feel free to stop running them until the staff gets a chance to examine the problem.

Quote:

Or, if the error reports are actually spurious, I'll let things continue as they are.

Nothing wrong with reporting errors! That's one way the staff gets informed of what's happening!
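
The rule of thumb in this post can be written down as a small decision function. This is only a sketch of that reasoning, not anything the project provides: the function name, the outcome strings, and the returned messages are all invented for illustration, and you would fill in the lists by hand from your task pages.

```python
def triage(my_outcomes, wingman_outcomes):
    """Guess where a run of validate errors comes from.

    Both arguments are lists of outcome strings such as "valid" or
    "validate error" (labels chosen for this sketch).
    """
    mine_bad = any(o == "validate error" for o in my_outcomes)
    wingmen_bad = any(o == "validate error" for o in wingman_outcomes)
    if not mine_bad:
        return "no validate errors on this host"
    if wingmen_bad:
        # Others fail the same way, so the host is unlikely to be the cause.
        return "probably the tasks"
    # Only this host fails while wingmen return good results.
    return "possibly this host's hardware or settings"

print(triage(["validate error", "validate error"], ["validate error"]))
```

With the situation described in this thread (validate errors on your own host and on others), the sketch points at the tasks rather than the machine.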
