New workunits

Message boards : News : New workunits

BilBg
Joined: 19 Jun 12
Posts: 221
Credit: 623,640
RAC: 0
Message 4290 - Posted: 29 Mar 2015, 13:23:29 UTC - in response to Message 4284.

I discovered (with 32GB of RAM ...) that RAM usage crept up and up.
...
The computer runs 24/7 ...So...what I do now is Reboot once per day.

You must have some program with a very serious memory leak (a bug) to fill 32 GB in one day.

(I have 3 GB of RAM on Windows XP, and sometimes I don't need a reboot for weeks.
Every 2-3 days I restart only the one program with a memory leak here: a cracked (by Russians) Skype 4.2.0.187. I hate the new Skype.
(No, I will not 'update'; no, I am not afraid of Russian cracks.)
)

It is easy to see which program uses so many GB of RAM with any of:
Windows Task Manager
Process Explorer
Process Lasso
System Explorer
ProcessHacker

(no need of links - first result in Google is OK for all of them)
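
The tools listed above show memory use at one moment, while a leak shows up as growth over time. A minimal Python sketch of that idea, assuming you have written down each process's memory use at regular intervals; the process names and numbers are made up for illustration, and `find_leak_suspects` is a hypothetical helper, not part of any of those tools:

```python
from typing import Dict, List


def find_leak_suspects(samples: Dict[str, List[float]],
                       min_growth_mb: float = 500.0) -> List[str]:
    """Return process names whose memory use never shrinks and grows a lot.

    samples maps a process name to its memory use in MB, oldest reading first.
    """
    suspects = []
    for name, readings in samples.items():
        if len(readings) < 2:
            continue  # one reading cannot show a trend
        never_shrinks = all(b >= a for a, b in zip(readings, readings[1:]))
        growth = readings[-1] - readings[0]
        if never_shrinks and growth >= min_growth_mb:
            suspects.append(name)
    # Biggest grower first.
    suspects.sort(key=lambda n: samples[n][-1] - samples[n][0], reverse=True)
    return suspects


# Made-up hourly readings in MB, for illustration only.
readings = {
    "browser.exe": [800, 950, 700, 900],       # goes up and down: normal
    "leaky_app.exe": [200, 1500, 4200, 9000],  # only ever grows: suspect
    "boinc.exe": [40, 41, 41, 42],             # grows, but trivially
}
print(find_leak_suspects(readings))
```

The threshold of 500 MB is arbitrary; a program that fills 32 GB in a day would stand out with almost any value.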
____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

James Lee*
Joined: 28 Sep 13
Posts: 29
Credit: 103,641,240
RAC: 2
Message 4292 - Posted: 29 Mar 2015, 14:05:09 UTC

I had memory problems on occasion. www.wisecleaner.com took care of it. Hope that helps.
____________

BilBg
Joined: 19 Jun 12
Posts: 221
Credit: 623,640
RAC: 0
Message 4293 - Posted: 29 Mar 2015, 14:55:12 UTC - in response to Message 4292.
Last modified: 29 Mar 2015, 15:01:09 UTC

 
Several programs I use have "Memory Clean"/"Memory Usage Trim" Options:
Memory Cleaner (Koshy John)
Process Lasso
System Explorer

In fact, all such programs call standard Windows functions to do the job.

Using such tools to "trim the memory usage" of processes may be helpful up to a point.
Windows does this trimming on its own when an active program wants more memory than there is free RAM at the moment.

The real cure is to find the 'bad' (RAM-leaking) program, then find a different version that doesn't have the bug, or find an alternative program for the same job.
If no alternative or other version is viable, restart the 'bad' (RAM-leaking) program as often as needed (e.g. once a day).

P.S.
BUT I don't think that the Atom CPU that started this 'RAM talk' has any issue.
The Atom is just not that fast ;)
 
____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

Yank
Joined: 13 Mar 13
Posts: 8
Credit: 5,995,680
RAC: 0
Message 4294 - Posted: 29 Mar 2015, 16:38:17 UTC - in response to Message 4288.

Same here, pretty much 100% failure across all my systems.


Same problem with my systems. Will be back later.

Yank
Joined: 13 Mar 13
Posts: 8
Credit: 5,995,680
RAC: 0
Message 4295 - Posted: 29 Mar 2015, 20:20:14 UTC

Looking over the US Navy team members' task downloads, most of the 'error while downloading' failures occurred at 04:00 UTC on 3/29/2015. Some members are now receiving some work units, but many have simply stopped the Asteroids@home project for now.

nanoprobe
Joined: 15 Jan 13
Posts: 12
Credit: 904,320
RAC: 0
Message 4296 - Posted: 29 Mar 2015, 23:26:53 UTC - in response to Message 4288.

Same here, pretty much 100% failure across all my systems.

Same here. I'm done until this is fixed.

Kyong
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Jun 12
Posts: 576
Credit: 52,667,664
RAC: 0
Message 4298 - Posted: 30 Mar 2015, 6:43:04 UTC

I tried something else; I didn't cancel the workunits. Unfortunately, I was without an internet connection during the weekend, so I couldn't do anything else or check it.

Mio
Joined: 17 Mar 15
Posts: 6
Credit: 11,520
RAC: 0
Message 4299 - Posted: 30 Mar 2015, 8:46:11 UTC - in response to Message 4289.

I had an estimated time of 2000 sec there; after a day it grew to 20000 sec

Usually BOINC, when you start on some new app/project, 'thinks' the opposite: the estimated time is 10x more than the real one.
In your case BOINC somehow got the estimate wrong the other way.

As you can guess, it is not possible for a Xeon 3.00GHz to need >9000 seconds for this same task with the AVX app while your Atom 1.66GHz manages it in 2000 seconds with the SSE3 app.

The estimate should improve after you complete some more tasks.


Yes, I know that the Atom CPU is like the CPU of a calculator, but I have another mobile CPU attached to another project. It is an Intel Pentium M 1.7GHz and it works on tasks without any issues: when a task shows 32000 secs, it finishes it in that time. It is an older CPU than the Intel Atom, which has 4 cores (2 real cores, 2 threads) and a 64-bit architecture. So it seems strange to me that the newer mobile processor is slower on this project than a 15-year-old CPU...
____________

Alez
Joined: 31 Oct 12
Posts: 7
Credit: 4,381,920
RAC: 0
Message 4300 - Posted: 30 Mar 2015, 10:33:23 UTC

Still nothing but download errors. Any update?

30/03/2015 11:31:38 | Asteroids@home | Scheduler request completed: got 1 new tasks
30/03/2015 11:31:40 | Asteroids@home | Started download of input_2376_7
30/03/2015 11:31:43 | Asteroids@home | Incomplete read of 39.000000 < 5KB for input_2376_7 - truncating
30/03/2015 11:31:43 | Asteroids@home | Finished download of input_2376_7
30/03/2015 11:31:43 | Asteroids@home | [error] File input_2376_7 has wrong size: expected 61489, got 0
30/03/2015 11:31:43 | Asteroids@home | [error] Checksum or signature error for input_2376_7
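
For anyone who wants to count these failures instead of reading the event log by eye, here is a minimal Python sketch. It assumes only the log line format shown above; `failed_downloads` is a hypothetical helper, not part of BOINC:

```python
import re
from typing import Dict, Tuple

# Matches the "[error] File ... has wrong size" lines shown in the log above.
WRONG_SIZE = re.compile(
    r"\[error\] File (?P<name>\S+) has wrong size: "
    r"expected (?P<expected>\d+), got (?P<got>\d+)"
)


def failed_downloads(log_text: str) -> Dict[str, Tuple[int, int]]:
    """Map each failed file name to (expected_bytes, received_bytes)."""
    failures = {}
    for line in log_text.splitlines():
        match = WRONG_SIZE.search(line)
        if match:
            failures[match["name"]] = (int(match["expected"]), int(match["got"]))
    return failures


log = """\
30/03/2015 11:31:43 | Asteroids@home | Finished download of input_2376_7
30/03/2015 11:31:43 | Asteroids@home | [error] File input_2376_7 has wrong size: expected 61489, got 0
30/03/2015 11:31:43 | Asteroids@home | [error] Checksum or signature error for input_2376_7
"""
print(failed_downloads(log))
```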

mikey
Joined: 1 Jan 14
Posts: 285
Credit: 27,835,920
RAC: 60
Message 4301 - Posted: 30 Mar 2015, 11:17:52 UTC - in response to Message 4299.

I had an estimated time of 2000 sec there; after a day it grew to 20000 sec

Usually BOINC, when you start on some new app/project, 'thinks' the opposite: the estimated time is 10x more than the real one.
In your case BOINC somehow got the estimate wrong the other way.

As you can guess, it is not possible for a Xeon 3.00GHz to need >9000 seconds for this same task with the AVX app while your Atom 1.66GHz manages it in 2000 seconds with the SSE3 app.

The estimate should improve after you complete some more tasks.


Yes, I know that the Atom CPU is like the CPU of a calculator, but I have another mobile CPU attached to another project. It is an Intel Pentium M 1.7GHz and it works on tasks without any issues: when a task shows 32000 secs, it finishes it in that time. It is an older CPU than the Intel Atom, which has 4 cores (2 real cores, 2 threads) and a 64-bit architecture. So it seems strange to me that the newer mobile processor is slower on this project than a 15-year-old CPU...


Does the PC stop crunching at some points? If it doesn't crunch 24/7, that could be the problem.

BilBg
Joined: 19 Jun 12
Posts: 221
Credit: 623,640
RAC: 0
Message 4302 - Posted: 30 Mar 2015, 16:06:23 UTC - in response to Message 4299.
Last modified: 30 Mar 2015, 16:42:52 UTC

... but I have another mobile CPU attached to another project ... when a task shows 32000 secs, it finishes it in that time ...

On this "other project" (SETI@home):
http://setiathome.berkeley.edu/show_host_detail.php?hostid=7522532
... this Pentium M processor 1.70GHz CPU has "Number of tasks completed 41":
http://setiathome.berkeley.edu/host_app_versions.php?hostid=7522532

So BOINC has had the time/'experience' to show you a better estimated time.

Check for yourself what BOINC knows about your Atom on this project and its apps:
http://asteroidsathome.net/boinc/host_app_versions.php?hostid=154098

***

The task contains no 'written' seconds; instead it has <rsc_fpops_est>XXXXXXXXXX</rsc_fpops_est>
(copied and pasted from a 26.03.2015 copy of my client_state.xml)
<workunit>
    <name>period_search_1010_1426497677.081390_228913</name>
    <app_name>period_search</app_name>
    <version_num>10210</version_num>
    <rsc_fpops_est>55757157098049.469000</rsc_fpops_est>
    <rsc_fpops_bound>5575715709804947.000000</rsc_fpops_bound>
    <rsc_memory_bound>64000000.000000</rsc_memory_bound>
    <rsc_disk_bound>100000000.000000</rsc_disk_bound>
    <file_ref>
        <file_name>input_4801_69</file_name>
        <open_name>period_search_in</open_name>
    </file_ref>
</workunit>


fpops_est = Floating-Point OPerationS _ Estimation (this is Not per sec, this is Estimation of total needed fpops to finish the task)

BOINC uses rsc_fpops_est (which is set by the project) and several other factors (which estimate in general the speed of the computer system) to show estimated times.
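
To make the idea concrete, here is a deliberately simplified Python sketch: it divides the total operation count by an assumed host speed. The host speeds below are hypothetical, and BOINC's real estimate also applies the "several other factors" mentioned above, so treat this as an illustration of the principle rather than BOINC's actual calculation:

```python
def estimated_runtime_seconds(rsc_fpops_est: float, host_flops: float) -> float:
    """Simplified estimate: total operations divided by operations per second."""
    if host_flops <= 0:
        raise ValueError("host_flops must be positive")
    return rsc_fpops_est / host_flops


RSC_FPOPS_EST = 55757157098049.469  # value from the <workunit> quoted above

# Hypothetical host speeds in floating-point operations per second.
for label, flops in [("slow host", 1.0e9), ("fast host", 5.0e9)]:
    seconds = estimated_runtime_seconds(RSC_FPOPS_EST, flops)
    print(f"{label}: about {seconds / 3600:.1f} hours")
```

The same task gets a very different estimate depending on the speed BOINC assumes for the host, which is why the estimate is poor until BOINC has measured a few completed tasks.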

You can simply ignore the BOINC estimates, or you may go crazy ;)
If you really want to go crazy:
https://www.google.com/#q=boinc+estimated+cpu+time+remaining
https://www.google.com/#q=boinc+estimated+cpu+time+code+rsc_fpops_est

http://boinc.berkeley.edu/trac/wiki/CreditNew

***

http://boinc.berkeley.edu/trac/wiki/RuntimeEstimation

The only case when the estimated time matters is when it is 'too' low.

As you can see from these two lines:
<rsc_fpops_est>55757157098049.469000</rsc_fpops_est>
<rsc_fpops_bound>5575715709804947.000000</rsc_fpops_bound>
... here on Asteroids@home the task is allowed to run for at most 100 times the initial estimate (after that BOINC aborts it automatically; this prevents hung apps from running 'forever')

On SETI@home this was (~year ago (?)) 10x and now is 20x
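
The 100x figure can be checked directly from the two values quoted above. A small Python sketch, using a trimmed copy of the workunit entry; `bound_to_estimate_ratio` is a hypothetical helper name:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <workunit> entry from client_state.xml quoted above.
WORKUNIT_XML = """
<workunit>
    <name>period_search_1010_1426497677.081390_228913</name>
    <rsc_fpops_est>55757157098049.469000</rsc_fpops_est>
    <rsc_fpops_bound>5575715709804947.000000</rsc_fpops_bound>
</workunit>
"""


def bound_to_estimate_ratio(workunit_xml: str) -> float:
    """Return rsc_fpops_bound / rsc_fpops_est for one <workunit> element."""
    workunit = ET.fromstring(workunit_xml)
    estimate = float(workunit.findtext("rsc_fpops_est"))
    bound = float(workunit.findtext("rsc_fpops_bound"))
    return bound / estimate


ratio = bound_to_estimate_ratio(WORKUNIT_XML)
print(f"bound / estimate = {ratio:.1f}")
```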
 
____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

cliff
Joined: 19 Nov 14
Posts: 93
Credit: 30,066,240
RAC: 0
Message 4306 - Posted: 31 Mar 2015, 0:27:43 UTC - in response to Message 4298.

I tried something else; I didn't cancel the workunits. Unfortunately, I was without an internet connection during the weekend, so I couldn't do anything else or check it.


Hi Boss,

Any idea of a timeframe to sort the download problem out?

Regards,
Cliff

AmigaForever
Joined: 11 Jul 13
Posts: 49
Credit: 1,346,640
RAC: 571
Message 4307 - Posted: 31 Mar 2015, 4:01:57 UTC

After the last five WUs worked properly (the last one is still running), all I get now are download errors:

31.03.2015 05:54:22 | Asteroids@home | [error] File input_3039_76 has wrong size: expected 57485, got 0
31.03.2015 05:54:22 | Asteroids@home | [error] Checksum or signature error for input_3039_76
31.03.2015 05:54:22 | Asteroids@home | [error] File input_3039_70 has wrong size: expected 57485, got 0
31.03.2015 05:54:22 | Asteroids@home | [error] Checksum or signature error for input_3039_70
31.03.2015 05:54:23 | Asteroids@home | Incomplete read of 39.000000 < 5KB for input_2985_76 - truncating
31.03.2015 05:54:23 | Asteroids@home | Finished download of input_2985_76
31.03.2015 05:54:23 | Asteroids@home | [error] File input_2985_76 has wrong size: expected 66069, got 0
31.03.2015 05:54:23 | Asteroids@home | [error] Checksum or signature error for input_2985_76

robertmiles
Joined: 17 Aug 14
Posts: 48
Credit: 5,196,960
RAC: 0
Message 4310 - Posted: 1 Apr 2015, 0:44:25 UTC
Last modified: 1 Apr 2015, 0:46:06 UTC

Could the server software that limits downloads be changed so that it doesn't count "Download failed" as a reason to stop a computer from downloading more workunits?

That should help clear out the workunits that were set up with missing input files on the server faster, without the current restriction on how many good workunits get done each day.

I've found that there is no need to abort workunits with "Download failed"; just use the normal method for reporting them back to the server.

MLx
Joined: 6 Oct 12
Posts: 1
Credit: 985,560
RAC: 0
Message 4317 - Posted: 1 Apr 2015, 21:39:46 UTC

So what is currently the suggested action for users? Keep going through the faulty WUs, or not bother?

Looking at a tcpdump of my traffic to asteroidsathome.net, and trying the URL in a web browser, it seems that instead of the WU's data, the server provides just the 39-byte MD5 sum. Going by the presence of the "Content-Location" header and its contents, I'm guessing this is caused by a misconfigured mod_rewrite in Apache?

GET /boinc/download/270/input_4410_20 HTTP/1.1
User-Agent: BOINC client (x86_64-apple-darwin 7.4.36)
Host: asteroidsathome.net
Accept: */*
Accept-Encoding: deflate, gzip
Content-Type: application/x-www-form-urlencoded
Accept-Language: en_GB

HTTP/1.1 200 OK
Date: Wed, 01 Apr 2015 20:48:27 GMT
Server: Apache/2.2.22 (Debian)
Content-Location: input_4410_20.md5
Vary: negotiate
TCN: choice
Last-Modified: Thu, 12 Mar 2015 14:06:09 GMT
ETag: "3e17cb7-27-51117e2c28e47;512afa50be3e5"
Accept-Ranges: bytes
Content-Length: 39
Content-Type: application/x-md5

e7dd4418e7b33fa674bacff3eeab2de3 70797

GET /boinc/download/317/input_4532_19 HTTP/1.1
...
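
The symptom in this capture can be recognised mechanically. A minimal Python sketch, assuming the .md5 body is 32 hex digits, a space, a byte count and a trailing newline (the trailing newline is my assumption; it is what makes the captured body come to 39 bytes). `looks_like_md5_sidecar` is a hypothetical helper, not something the BOINC client provides:

```python
import re

# Assumed layout of the .md5 body: 32 hex digits, a space, a byte count, newline.
MD5_SIDECAR = re.compile(rb"^([0-9a-f]{32}) (\d+)\n$")


def looks_like_md5_sidecar(requested_name: str, content_location: str,
                           body: bytes) -> bool:
    """True when the server answered a data request with the .md5 file."""
    served_other_file = content_location == requested_name + ".md5"
    return served_other_file and MD5_SIDECAR.match(body) is not None


# Body from the capture above, with the assumed trailing newline.
body = b"e7dd4418e7b33fa674bacff3eeab2de3 70797\n"
print(len(body))
print(looks_like_md5_sidecar("input_4410_20", "input_4410_20.md5", body))
```

This matches the "Incomplete read of 39.000000" lines in the client logs earlier in the thread: the client receives 39 bytes where it expected tens of kilobytes.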

nanoprobe
Joined: 15 Jan 13
Posts: 12
Credit: 904,320
RAC: 0
Message 4318 - Posted: 1 Apr 2015, 22:02:47 UTC

What is this nonsense. 10 task daily limit on a 2600k? This computer doesn't have an Nvidia GPU.



47589 Asteroids@home 4/1/2015 4:59:00 PM Requesting new tasks for CPU
47590 Asteroids@home 4/1/2015 4:59:02 PM Scheduler request completed: got 0 new tasks
47591 Asteroids@home 4/1/2015 4:59:02 PM No tasks sent
47592 Asteroids@home 4/1/2015 4:59:02 PM Tasks for NVIDIA GPU are available, but your preferences are set to not accept them
47593 Asteroids@home 4/1/2015 4:59:02 PM This computer has finished a daily quota of 10 tasks

mikey
Joined: 1 Jan 14
Posts: 285
Credit: 27,835,920
RAC: 60
Message 4322 - Posted: 2 Apr 2015, 11:02:40 UTC - in response to Message 4318.
Last modified: 2 Apr 2015, 11:04:27 UTC

What is this nonsense. 10 task daily limit on a 2600k? This computer doesn't have an Nvidia GPU.



47589 Asteroids@home 4/1/2015 4:59:00 PM Requesting new tasks for CPU
47590 Asteroids@home 4/1/2015 4:59:02 PM Scheduler request completed: got 0 new tasks
47591 Asteroids@home 4/1/2015 4:59:02 PM No tasks sent
47592 Asteroids@home 4/1/2015 4:59:02 PM Tasks for NVIDIA GPU are available, but your preferences are set to not accept them
47593 Asteroids@home 4/1/2015 4:59:02 PM This computer has finished a daily quota of 10 tasks


Sounds like too many errors were reported, so your work fetch has been throttled. This is standard at a lot of projects, but since you have your PCs hidden I am only guessing. As you return valid units your daily quota will climb rapidly; if I am remembering right, it almost doubles every day as long as no more errors creep in.
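
As a rough illustration of how quickly a doubling quota recovers, here is a Python sketch. It takes the "doubles every day" recollection at face value, which may not match the BOINC scheduler's actual rules, and the normal quota of 320 is a made-up number:

```python
def days_to_recover(start_quota: int, full_quota: int) -> int:
    """Count days until a quota that doubles daily reaches full_quota."""
    if start_quota <= 0 or full_quota <= 0:
        raise ValueError("quotas must be positive")
    days = 0
    quota = start_quota
    while quota < full_quota:
        quota *= 2  # assumed: one doubling per error-free day
        days += 1
    return days


# 10 is the throttled quota from the log; 320 is a made-up normal quota.
print(days_to_recover(10, 320))
```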

BilBg
Joined: 19 Jun 12
Posts: 221
Credit: 623,640
RAC: 0
Message 4326 - Posted: 2 Apr 2015, 11:12:16 UTC - in response to Message 4317.
Last modified: 2 Apr 2015, 11:12:54 UTC

it seems that instead of the WU's data, the server provides just the 39-byte MD5 sum

If you go to:
http://asteroidsathome.net/boinc/download/317/

... you will see that input_4532_19.md5 exists but input_4532_19 is not there.

Kyong (the admin) says it is caused by the BOINC server and by re-using the "same input name as before":
"I have checked how the BOINC server is checking the input files that have the same input name as before"
http://asteroidsathome.net/boinc/forum_thread.php?id=424&postid=4212#4212

It seems that the BOINC server deletes the WU files prematurely.
____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

nanoprobe
Joined: 15 Jan 13
Posts: 12
Credit: 904,320
RAC: 0
Message 4332 - Posted: 2 Apr 2015, 18:42:09 UTC - in response to Message 4322.

What is this nonsense. 10 task daily limit on a 2600k? This computer doesn't have an Nvidia GPU.



47589 Asteroids@home 4/1/2015 4:59:00 PM Requesting new tasks for CPU
47590 Asteroids@home 4/1/2015 4:59:02 PM Scheduler request completed: got 0 new tasks
47591 Asteroids@home 4/1/2015 4:59:02 PM No tasks sent
47592 Asteroids@home 4/1/2015 4:59:02 PM Tasks for NVIDIA GPU are available, but your preferences are set to not accept them
47593 Asteroids@home 4/1/2015 4:59:02 PM This computer has finished a daily quota of 10 tasks


Sounds like too many errors were reported, so your work fetch has been throttled. This is standard at a lot of projects, but since you have your PCs hidden I am only guessing. As you return valid units your daily quota will climb rapidly; if I am remembering right, it almost doubles every day as long as no more errors creep in.

Looks like the download failure rate is 90%+. I've had 4454 failures to get 413 tasks that can be run since March 19th.
Makes no sense to me why this issue has not been fixed.
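
The "90%+" figure follows from those two counts, assuming the 4454 failures and the 413 runnable tasks are separate tasks (4867 in total). As a quick Python check:

```python
failures = 4454   # failed downloads reported in the post
runnable = 413    # tasks that could actually be run

# Assumes the two counts do not overlap.
failure_rate = failures / (failures + runnable)
print(f"{failure_rate:.1%} of {failures + runnable} tasks failed to download")
```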

Richie
Joined: 25 Jul 14
Posts: 64
Credit: 100,582,080
RAC: 0
Message 4333 - Posted: 2 Apr 2015, 19:35:21 UTC - in response to Message 4332.
Last modified: 2 Apr 2015, 19:36:12 UTC

Makes no sense to me why this issue has not been fixed.


Well, the reason is very simple: the administrator of this project is busy. Obviously he hasn't had enough time to do it yet. He will fix it when he has enough time to study what causes the problem and apply the fix.



Copyright © 2020 Asteroids@home