Posts by MarkRBright

1) Message boards : Number crunching : Server is out of disk space (Message 936)
Posted 24 Nov 2020 by Profile MarkRBright
Post:
Trying to get into this bucket myself, but there doesn't seem to be any space left in it! We need a bigger bucket!
2) Message boards : Number crunching : CNode (Message 867)
Posted 9 Oct 2020 by Profile MarkRBright
Post:
I see...
I just implemented a new fix...
It can restart the tasks and we'll see if it's better?

Works for me. Thanks
3) Message boards : Number crunching : CNode (Message 861)
Posted 8 Oct 2020 by Profile MarkRBright
Post:
Today in the morning (around 7:00 ETC) an error occurred on one of the servers. The problem has already been fixed. You can reboot tasks if they take too long or there are errors. The new tasks should work well. Can you confirm? :)

Not fixed for me! 72 WUs today on 2 machines, all failed, including in the last half hour. Something still isn't right.
4) Message boards : Number crunching : Why not do twice as much work? (Message 625)
Posted 27 Jun 2020 by Profile MarkRBright
Post:
Am I missing something here or is perhaps this project missing something?
If you look at the extract from my log file below you can hopefully see that this project could get through twice as many work units as it does. As it stands, when it gets a task, it is always immediately followed by a "Project requested delay of 303 seconds". The task itself only takes about 140 seconds from start to completion of the subsequent upload. The result is that it waits about 160 seconds before it reports it as complete and requests another Work Unit. This is despite having "report completed tasks immediately" set in my cc_config.xml file.
Is there a way of me overriding this sleep time to something more fitting for my computers? Or is it controlled by the Project?
If so, can the project's admin reduce the delay to something more in line with the time it takes to do the job?
Looking at the delays of the other projects that I currently do work for, the delay appears to be anything from 7 to 90 seconds, with your 303 being exceptional. Clearly the smaller the number, the less idle time for your project on all of your crunchers' machines. The flip side is that there would be more calls to your servers, so I accept it's a balance. In my case I would suggest that about 150 seconds would ensure fairly constant crunching which would literally allow me to do twice as much work for your project, and would hopefully not cause your servers to go into meltdown.
Is there perhaps any other way round this?
Yours optimistically
Mark

27/06/2020 08:50:24 | iThena | Requesting new tasks for CPU
27/06/2020 08:50:26 | iThena | Scheduler request completed: got 1 new tasks
27/06/2020 08:50:26 | iThena | Project requested delay of 303 seconds
27/06/2020 08:50:28 | iThena | Starting task PERF_TESTS_0_8916908_2_30_0
27/06/2020 08:52:40 | iThena | Computation for task PERF_TESTS_0_8916908_2_30_0 finished
27/06/2020 08:52:43 | iThena | Started upload of PERF_TESTS_0_8916908_2_30_0_r1417084583_0
27/06/2020 08:52:45 | iThena | Finished upload of PERF_TESTS_0_8916908_2_30_0_r1417084583_0
27/06/2020 08:55:31 | iThena | Sending scheduler request: To report completed tasks.
27/06/2020 08:55:31 | iThena | Reporting 1 completed tasks
27/06/2020 08:55:31 | iThena | Requesting new tasks for CPU
27/06/2020 08:55:33 | iThena | Scheduler request completed: got 1 new tasks
27/06/2020 08:55:33 | iThena | Project requested delay of 303 seconds
27/06/2020 08:55:36 | iThena | Starting task PERF_TESTS_0_8920207_2_30_0
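
For anyone who wants to check the arithmetic, the timings in the extract above can be worked out with a short script. This is only a rough sketch in Python: the helper names (parse_line, first_event) are made up for illustration and are not part of BOINC, and it assumes the log lines keep the "date time | project | message" layout shown above.

```python
from datetime import datetime

# Lines copied from the log extract above (one full request/report cycle).
LOG = """\
27/06/2020 08:50:26 | iThena | Scheduler request completed: got 1 new tasks
27/06/2020 08:50:26 | iThena | Project requested delay of 303 seconds
27/06/2020 08:50:28 | iThena | Starting task PERF_TESTS_0_8916908_2_30_0
27/06/2020 08:52:40 | iThena | Computation for task PERF_TESTS_0_8916908_2_30_0 finished
27/06/2020 08:52:45 | iThena | Finished upload of PERF_TESTS_0_8916908_2_30_0_r1417084583_0
27/06/2020 08:55:31 | iThena | Sending scheduler request: To report completed tasks.
"""


def parse_line(line):
    """Split one log line into (timestamp, message). Hypothetical helper."""
    stamp, _project, message = (part.strip() for part in line.split("|", 2))
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), message


def first_event(events, prefix, start=0):
    """Return (index, timestamp) of the first message starting with prefix."""
    for index in range(start, len(events)):
        when, message = events[index]
        if message.startswith(prefix):
            return index, when
    raise ValueError(f"no log line starts with {prefix!r}")


events = [parse_line(line) for line in LOG.splitlines() if line.strip()]

i, fetched = first_event(events, "Scheduler request completed")
j, uploaded = first_event(events, "Finished upload", i)
k, reported = first_event(events, "Sending scheduler request", j)

busy = int((uploaded - fetched).total_seconds())   # task received -> upload done
idle = int((reported - uploaded).total_seconds())  # upload done -> next request

print(f"busy {busy}s, idle {idle}s, idle share {idle / (busy + idle):.0%}")
# prints: busy 139s, idle 166s, idle share 54%
```

So on this one cycle the machine is working for roughly 139 seconds and then waiting roughly 166 seconds, which is where my "about 140" and "about 160" figures come from.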



