Uploading errors...


Rizzo
8th May 2003, 13:26
OK, my F@H client seems to be able to upload some results, but no others...

(Warning, lots of code below).


[16:32:26] Completed 500000 out of 500000 steps (100)
[16:32:28] Writing final coordinates.
[16:32:28] Past main M.D. loop
[16:33:28]
[16:33:28] Finished Work Unit:
[16:33:28] - Reading up to 316596 from "work/wudata_09.arc": Read 316596
[16:33:28] - Reading up to 0 from "work/wudata_09.xtc": Read 0
[16:33:28] goefile size: 167
[16:33:28] logfile size: 51618
[16:33:28] Leaving Run
[16:33:31] - Writing 375493 bytes of core data to disk...
[16:33:31] ... Done.
[16:33:31] - Shutting down core
[16:33:31]
[16:33:31] Folding@home Core Shutdown: FINISHED_UNIT
[16:33:34] CoreStatus = 64 (100)
[16:33:34] Sending work to server


[16:33:34] + Attempting to send results <- Works here
[16:33:40] + Results successfully sent
[16:33:40] Thank you for your contribution to Folding@home.
[16:33:40] + Number of Units Completed: 2


[16:33:44] + Attempting to send results <- Doesn't work here
[16:33:45] Couldn't send HTTP request to server (wininet)
[16:33:45] + Could not connect to Work Server (results)
[16:33:45] - Error: Could not transmit unit 07 (completed May 6). Keeping unit in queue.


[16:33:45] + Attempting to send results <- Doesn't work here
[16:33:45] Couldn't send HTTP request to server (wininet)
[16:33:45] + Could not connect to Work Server (results)
[16:33:45] - Error: Could not transmit unit 08 (completed May 7). Keeping unit in queue.
[16:33:45] + Attempting to get work packet
[16:33:45] - Connecting to assignment server
[16:33:45] - Successful: assigned to (171.64.122.144). <- But now it works again
[16:33:45] + News From Folding@Home: Welcome to Folding@Home
[16:33:46] Loaded queue successfully.
[16:33:50] + Closed connections
[16:33:50]
[16:33:50] + Processing work unit
[16:33:50] Core required: FahCore_78.exe
[16:33:50] Core found.
[16:33:50] Working on Unit 00 [May 8 16:33:50]
[16:33:50] + Working ...
[16:33:51]
[16:33:51] *------------------------------*
[16:33:51] Folding@home Gromacs Core
[16:33:51] Version 1.46 (April 21, 2003)
[16:33:51]
[16:33:51] Preparing to commence simulation
[16:33:51] - Looking at optimizations...
[16:33:51] - Created dyn
[16:33:51] - Files status OK
[16:33:51] Error: Work unit read from disk is invalid <- And what is this ..?
[16:33:51]
[16:33:51] Folding@home Core Shutdown: CORE_OUTDATED
[16:33:55] CoreStatus = 6E (110)
[16:33:55] + Core out of date. Auto updating...
[16:33:55] - Attempting to download new core...
[16:33:55] + Downloading new core: FahCore_78.exe
[16:33:56] + 10240 bytes downloaded
[16:33:56] + 20480 bytes downloaded

It now continues to download and work on this WU with no problems.


I have one WU to my name, so some stuff is getting through for sure, but why not those two units?

All the events happen in such a short space of time that it is very unlikely that either the 'net here or the Stanford servers were down...

phil
8th May 2003, 13:36
I have some similar entries re: problems uploading. It looks like they were cached and subsequently uploaded.

Medic193
8th May 2003, 14:37
I've never had a problem. Maybe you are just catching it at a bad time.

Bruce
8th May 2003, 23:30
I've answered that question many, many times but I don't mind explaining it here.

When you complete a WU, it must be returned to the same server that sent it to you. If that server is congested or down (or you've got a problem on your end) you may not be able to upload the result immediately. The purpose of the queue is to retain finished work that does not upload immediately. Ordinarily, your client will be able to contact a different server for a new assignment (as yours did) and go back to crunching.

According to the information you posted, you have successfully returned two units, and you've completed two more which need to be uploaded (in queue slots 7 and 8). If they don't upload automatically within a day or so, ask again and we'll figure out what's blocking them. You can add the command line parameter -verbosity 9 next time you start the client and you'll get much better debugging information. (You'll also get messages you don't really need.)
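
For anyone who wants to do the same count on their own log, here is a rough Python sketch. It is not part of the F@H client; the function name and the regular expressions are just my own illustration, and they assume the log lines look like the ones pasted above.

```python
import re

# Patterns taken from the log lines quoted in this thread.
SENT = re.compile(r"\+ Results successfully sent")
FAILED = re.compile(r"Could not transmit unit (\d+)")

def summarize_uploads(log_text):
    """Return (successful upload count, sorted queue slots that failed to send)."""
    sent = 0
    pending = set()
    for line in log_text.splitlines():
        if SENT.search(line):
            sent += 1
        match = FAILED.search(line)
        if match:
            pending.add(int(match.group(1)))
    return sent, sorted(pending)

sample = """\
[16:33:34] + Attempting to send results
[16:33:40] + Results successfully sent
[16:33:45] - Error: Could not transmit unit 07 (completed May 6). Keeping unit in queue.
[16:33:45] - Error: Could not transmit unit 08 (completed May 7). Keeping unit in queue.
"""
print(summarize_uploads(sample))  # (1, [7, 8])
```

Note that the success message does not name the unit, so a slot that fails once and uploads later will still be listed; treat the result as "slots that failed at least once in this log".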

Your "now it works again" is NOT a successful upload, but is the download of a new WU . . .

If there is a results file in the queue, the client automatically tries to upload it at least every 6 hours until it succeeds. Look for the "Thank you" message like you had at the beginning (and an increasing count of Units Completed).
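
The retry behaviour can be pictured with a small Python simulation. This is only a sketch of the idea and not the client's actual code: `retry_until_sent`, `try_send` and `flaky_send` are made-up names, and the only number taken from above is the 6-hour interval.

```python
RETRY_INTERVAL_HOURS = 6  # "at least every 6 hours", as described above

def retry_until_sent(queue, try_send, max_rounds=10):
    """Try every queued unit once per round until the queue is empty.

    queue: unit numbers waiting to upload.
    try_send: hypothetical callable(unit) -> bool, True if the upload worked.
    Returns (simulated hours elapsed, units still queued).
    """
    pending = list(queue)
    hours = 0
    for round_no in range(max_rounds):
        if round_no > 0:
            hours += RETRY_INTERVAL_HOURS  # wait before the next round
        # Keep only the units that failed again this round.
        pending = [unit for unit in pending if not try_send(unit)]
        if not pending:
            break
    return hours, pending

# Fake server for the demo: unit 7 is accepted on its 2nd try, unit 8 on its 3rd.
attempts = {}
def flaky_send(unit):
    attempts[unit] = attempts.get(unit, 0) + 1
    return attempts[unit] >= {7: 2, 8: 3}[unit]

print(retry_until_sent([7, 8], flaky_send))  # (12, [])
```

Nothing actually sleeps here; the hours are just counted so the example runs instantly.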

You'll also note that the new WU actually caused the download of a new "core" which essentially upgraded the analysis software on your machine.

Rizzo
9th May 2003, 04:58
Thanks Bruce, I understand now.

I'll keep an eye on them to make sure they go eventually.

Rizzo
12th May 2003, 10:22
OK, so it has been a few days since I had the uploading errors, and they are still there. But it is worse now.

The client doesn't want to upload or download. I stopped the client and deleted the config file so I could check the settings, and this is what I get.


# Windows Console Edition #####################################################
###############################################################################

Folding@home Client Version 3.24

http://foldingathome.stanford.edu
email:help@foldingathome.stanford.edu

###############################################################################
###############################################################################



[14:14:35] Configuring Folding@home...

[14:15:08] - Ask before connecting: No
[14:15:08] - Use IE connection settings: Yes
[14:15:08] - User name: rizwindu (Team 131)
[14:15:08] - User ID = 21A1FD20280899FC
[14:15:08] - Machine ID: 1
[14:15:08]
[14:15:08] Loaded queue successfully.
[14:15:08] + Benchmarking ...
[14:15:11]
[14:15:11] + Processing work unit
[14:15:11] Core required: FahCore_65.exe
[14:15:11] Core found.


[14:15:11] + Attempting to send results
[14:15:11] Working on Unit 02 [May 12 14:15:11]
[14:15:11] + Working ...
[14:15:11] Error: Got status code 413 from server
[14:15:11] + Could not connect to Work Server (results)
[14:15:11] - Error: Could not transmit unit 01 (completed May 10). Keeping unit in queue.


[14:15:11] + Attempting to send results
[14:15:12] Folding@Home Client Core Version 2.47 (June 14, 2002)
[14:15:12]
[14:15:12] Proj: work/wudata_02
[14:15:12] Done: 15790 -> 100013 (decompressed 633.3 percent)
[14:15:12] nsteps: 5000000 dt: 2.000000 dt_dump: 250.000000 temperature: 296.000000
[14:15:12] xyzfile:
[14:15:12] " 220 p648_TZ1_1LEcap
[14:15:12] 1 N3 48.023986 -2.845941 -22.373507 124 ..."
[14:15:12] keyfile:
[14:15:12] "parameters ./proj648.prm
[14:15:12] NOVERSION
[14:15:12] ARCHIVE
[14:15:12]
[14:15:12] printout 1000
[14:15:12] writeout ..."
[14:15:12]
[14:15:12] Hashes matched on file work/wudata_02.dyn
[14:15:12] Error: Got status code 413 from server
[14:15:12] + Could not connect to Work Server (results)
[14:15:12] - Error: Could not transmit unit 07 (completed May 6). Keeping unit in queue.


[14:15:12] + Attempting to send results
[14:15:14] Couldn't send HTTP request to server (wininet)
[14:15:14] + Could not connect to Work Server (results)
[14:15:14] - Error: Could not transmit unit 08 (completed May 7). Keeping unit in queue.
[14:15:16] ARC file integrity verified
[14:15:16] Restarting from checkpointed files.
[14:15:16]
[14:15:16] Protein: p648_TZ1_1LEcap
[14:15:16] - Run: 8 (Clone 95, Gen 4)
[14:15:16] - Frames Completed: 389, Remaining: 11
[14:15:16] - Dynamic steps required: 137500
[14:15:16]
[14:15:16] Writing local files:
[14:15:16]
[14:15:16] parameters work/wudata_02.prm
[14:15:16] - Writing "work/wudata_02.key": (overwrite) successful.
[14:15:16] - Writing "work/wudata_02.xyz": (overwrite) successful.
[14:15:16] - Writing "work/wudata_02.prm": (overwrite) successful.
[14:15:17] - Writing "work/wudata_02.key": (append) successful.
[14:15:17]
[14:15:17] PROJECT="work/wudata_02", NSTEPS=137500, DT=2.0000, DTDUMP=25.000000, TEMP=296.00
[14:15:17] TINKER: Software Tools for Molecular Design
[14:15:17] Version 3.8 October 2000
[14:15:17] Copyright (c) Jay William Ponder 1990-2000
[14:15:17] portions Copyright (c) Michael Shirts 2001
[14:15:17] portions Copyright (c) Vijay S Pande 2001


Then it starts doing this (the times are different because this is from another occasion when I tried the 'delete the config file' thing):


[20:00:29] portions Copyright (c) Michael Shirts 2001
[20:00:29] portions Copyright (c) Vijay S Pande 2001
[20:06:27] Finished a frame (1)
[20:14:12] Finished a frame (2)
[20:17:57] Finished a frame (3)
[20:21:42] Finished a frame (4)
[20:25:02] Finished a frame (5)
[20:28:14] Finished a frame (6)
[20:31:26] Finished a frame (7)
[20:34:38] Finished a frame (8)
[20:37:51] Finished a frame (9)
[20:41:03] Finished a frame (10)
[20:44:16] Finished a frame (11)
[20:47:28] Finished a frame (12)
[20:50:39] Finished a frame (13)
[20:53:51] Finished a frame (14)
[20:57:02] Finished a frame (15)
[21:00:15] Finished a frame (16)
[21:03:26] Finished a frame (17)
[21:06:39] Finished a frame (18)
[21:09:51] Finished a frame (19)
[21:13:03] Finished a frame (20)
[21:16:15] Finished a frame (21)
[21:19:28] Finished a frame (22)
[21:22:40] Finished a frame (23)
[21:25:53] Finished a frame (24)
[21:29:04] Finished a frame (25)
[21:32:17] Finished a frame (26)
[21:35:30] Finished a frame (27)
[21:38:42] Finished a frame (28)
[21:41:54] Finished a frame (29)
[21:45:06] Finished a frame (30)
[21:48:18] Finished a frame (31)
[21:51:29] Finished a frame (32)
[21:54:42] Finished a frame (33)
[21:57:53] Finished a frame (34)
[22:01:05] Finished a frame (35)
[22:04:17] Finished a frame (36)
[22:07:29] Finished a frame (37)
[22:10:41] Finished a frame (38)
[22:13:53] Finished a frame (39)
[22:17:04] Finished a frame (40)
...
...
...


Every now and then it will try to send / receive work, but it won't work.

I don't understand what is going on because I have sent / received work from behind the firewall here at uni before (both gah and fah).

phil
12th May 2003, 10:26
Change the "Use IE connection settings: Yes" to "No" and retry. I have mine set to "No" with no problems.

Rizzo
12th May 2003, 10:26
Just realised that Bruce would probably want the -verbosity 9 flag so I restarted the client with that flag as well.


# Windows Console Edition #####################################################
###############################################################################

Folding@home Client Version 3.24

http://foldingathome.stanford.edu
email:help@foldingathome.stanford.edu

###############################################################################
###############################################################################

Arguments: -forceasm -advmethods -verbosity 9

[14:24:59] - Ask before connecting: No
[14:24:59] - Use IE connection settings: Yes
[14:24:59] - User name: rizwindu (Team 131)
[14:24:59] - User ID = 21A1FD20280899FC
[14:24:59] - Machine ID: 1
[14:24:59]
[14:24:59] Loaded queue successfully.
[14:24:59] + Benchmarking ...
[14:25:01] The benchmark result is 4048
[14:25:01]
[14:25:01] + Processing work unit
[14:25:01] Core required: FahCore_65.exe
[14:25:01] Core found.
[14:25:01] - Autosending finished units...
[14:25:01] Trying to send all finished work units


[14:25:01] + Attempting to send results
[14:25:01] - Reading file work/wuresults_01.dat from core
[14:25:01] (Read 1507021 bytes from disk)
[14:25:01] Working on Unit 02 [May 12 14:25:01]
[14:25:01] + Working ...
[14:25:01] - Calling 'FahCore_65.exe -dir work/ -suffix 02 -forceasm -lifeline 2884
'

[14:25:01] Error: Got status code 413 from server
[14:25:01] + Could not connect to Work Server (results)
[14:25:01] - Error: Could not transmit unit 01 (completed May 10). Keeping unit in queue.


[14:25:01] + Attempting to send results
[14:25:01] - Reading file work/wuresults_07.dat from core
[14:25:01] (Read 1049998 bytes from disk)
[14:25:02] Folding@Home Client Core Version 2.47 (June 14, 2002)
[14:25:02]
[14:25:02] Proj: work/wudata_02
[14:25:02] Done: 15790 -> 100013 (decompressed 633.3 percent)
[14:25:02] nsteps: 5000000 dt: 2.000000 dt_dump: 250.000000 temperature: 296.000000
[14:25:02] xyzfile:
[14:25:02] " 220 p648_TZ1_1LEcap
[14:25:02] 1 N3 48.023986 -2.845941 -22.373507 124 ..."
[14:25:02] keyfile:
[14:25:02] "parameters ./proj648.prm
[14:25:02] NOVERSION
[14:25:02] ARCHIVE
[14:25:02]
[14:25:02] printout 1000
[14:25:02] writeout ..."
[14:25:02]
[14:25:02] Hashes matched on file work/wudata_02.dyn
[14:25:02] ARC file integrity verified
[14:25:02] Restarting from checkpointed files.
[14:25:02]
[14:25:02] Protein: p648_TZ1_1LEcap
[14:25:02] - Run: 8 (Clone 95, Gen 4)
[14:25:02] - Frames Completed: 391, Remaining: 9
[14:25:02] - Dynamic steps required: 112500
[14:25:02]
[14:25:02] Writing local files:
[14:25:02]
[14:25:02] parameters work/wudata_02.prm
[14:25:02] - Writing "work/wudata_02.key": (overwrite) successful.
[14:25:02] - Writing "work/wudata_02.xyz": (overwrite) successful.
[14:25:02] - Writing "work/wudata_02.prm": (overwrite) successful.
[14:25:03] - Writing "work/wudata_02.key": (append) successful.
[14:25:03]
[14:25:03] PROJECT="work/wudata_02", NSTEPS=112500, DT=2.0000, DTDUMP=25.000000, TEMP=296.00
[14:25:03] TINKER: Software Tools for Molecular Design
[14:25:03] Version 3.8 October 2000
[14:25:03] Copyright (c) Jay William Ponder 1990-2000
[14:25:03] portions Copyright (c) Michael Shirts 2001
[14:25:03] portions Copyright (c) Vijay S Pande 2001

Bruce
14th May 2003, 00:50
HTTP error 413 is "Request Entity Too Large".

That typically happens when you've got a proxy server between you and Stanford which limits the size of uploads. There were several reports of (XXXX - - name forgotten, but I can find it) which had a limit. Later versions defaulted to unlimited, or it could be set with a configuration setting. In one case, the proxy-server was administered locally and was fixed easily. In a couple of cases, the proxy-server was at the ISP, so it took a few persuasive calls to tech support to get it fixed.
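
To see how a size cap would play out with the numbers in this thread, here is a small Python sketch. The 1 MiB limit is purely an assumption for illustration (nobody here knows the real limit on that proxy, or whether there is one); the file sizes are the ones reported by the -verbosity 9 log above.

```python
# Hypothetical cap for illustration only; the real proxy limit is unknown.
ASSUMED_PROXY_LIMIT = 1024 * 1024  # 1 MiB in bytes

# Result sizes as reported in the -verbosity 9 log in this thread.
RESULT_SIZES = {
    "wuresults_01.dat": 1507021,
    "wuresults_07.dat": 1049998,
}

def too_large(sizes, limit):
    """Return the file names whose size in bytes exceeds the limit, sorted."""
    return sorted(name for name, size in sizes.items() if size > limit)

print(too_large(RESULT_SIZES, ASSUMED_PROXY_LIMIT))
# ['wuresults_01.dat', 'wuresults_07.dat']
```

Under that assumed cap both files that drew a 413 would be rejected. The unit that did upload on 8th May logged 375493 bytes of core data, which would fit under it, but that is only consistent with a size limit and does not prove one.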

Rizzo
14th May 2003, 04:58
Hmm, that is weird. I'm sure I've uploaded stuff from here before.

I will try and find out if there is a limit (I'm at uni on campus, so it might be hard to change it though).

Bruce
14th May 2003, 12:15
http://forum.folding-community.org/viewtopic.php?p=31324&highlight=#31324