We just acquired three new build slaves for Hudson, which are running Windows XP x64. We're having issues deploying to these that we haven't seen before (we have two other XP32 machines already slaved).
When we first reboot the server, or just after restarting the Server service, the node's log on hudson shows the following (domain name changed to protect the innocent):
Connecting to beast.example.com Copying slave.jar The parameter is incorrect. jcifs.smb.SmbException: The parameter is incorrect. at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:542) at jcifs.smb.SmbTransport.send(SmbTransport.java:644) at jcifs.smb.SmbSession.sessionSetup(SmbSession.java:371) at jcifs.smb.SmbSession.send(SmbSession.java:235) at jcifs.smb.SmbTree.treeConnect(SmbTree.java:161) at jcifs.smb.SmbFile.doConnect(SmbFile.java:858) at jcifs.smb.SmbFile.connect(SmbFile.java:901) at jcifs.smb.SmbFile.connect0(SmbFile.java:827) at jcifs.smb.SmbFile.open0(SmbFile.java:917) at jcifs.smb.SmbFile.open(SmbFile.java:951) at jcifs.smb.SmbFileOutputStream.(SmbFileOutputStream.java:142) at jcifs.smb.SmbFileOutputStream.(SmbFileOutputStream.java:97) at jcifs.smb.SmbFileOutputStream.(SmbFileOutputStream.java:67) at jcifs.smb.SmbFile.getOutputStream(SmbFile.java:2793) at hudson.os.windows.ManagedWindowsServiceLauncher.copySlaveJar(ManagedWindowsServiceLauncher.java:198) at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:152) at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:175) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269) at java.util.concurrent.FutureTask.run(FutureTask.java:123) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676) at java.lang.Thread.run(Thread.java:613)
On any subsequent attempts to "Launch slave service", we get:
Connecting to beast.example.com Copying slave.jar 0xC0000205 jcifs.smb.SmbException: 0xC0000205 at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:542) at jcifs.smb.SmbTransport.send(SmbTransport.java:644) at jcifs.smb.SmbSession.send(SmbSession.java:242) at jcifs.smb.SmbTree.send(SmbTree.java:111) at jcifs.smb.SmbFile.send(SmbFile.java:729) at jcifs.smb.SmbFile.open0(SmbFile.java:934) at jcifs.smb.SmbFile.open(SmbFile.java:951) at jcifs.smb.SmbFileOutputStream.(SmbFileOutputStream.java:142) at jcifs.smb.SmbFileOutputStream.(SmbFileOutputStream.java:97) at jcifs.smb.SmbFileOutputStream.(SmbFileOutputStream.java:67) at jcifs.smb.SmbFile.getOutputStream(SmbFile.java:2793) at hudson.os.windows.ManagedWindowsServiceLauncher.copySlaveJar(ManagedWindowsServiceLauncher.java:198) at hudson.os.windows.ManagedWindowsServiceLauncher.launch(ManagedWindowsServiceLauncher.java:152) at hudson.slaves.SlaveComputer$1.call(SlaveComputer.java:175) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269) at java.util.concurrent.FutureTask.run(FutureTask.java:123) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:651) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:676) at java.lang.Thread.run(Thread.java:613)
It seems like samba itself, not Hudson, may be the issue. We've double-checked group memberships and directory permissions for C:\hudson and they're identical to the other two slaves.
Using smbclient from the MacOSX server that's actually running Tomcat+Hudson (but does not execute builds), I was able to get a strange response on one attempt:
smb: \hudson\> get hudson-slave.exe NT_STATUS_INSUFF_SERVER_RESOURCES opening remote file \hudson\hudson-slave.exe
Googling around suggest an IRPStackSize issue might be the culprit, but jacking that up 5 at a time (eventually to 50 = 0x32) and restarting the Server service doesn't seem to help.
As an aside, launching JNLP client works just fine, although we'd prefer to have it as a service.
Hudson version is 1.323, by the way (only one behind, nothing in the changelog looks particularly relevant).
Looks like JCIFS may have a fix for this. From a coworker:
"jcifs-1.3.10 released / Bugfix for SmbException: The parameter is incorrect posted by Mike, June 4, 2009 This release fixes a bug that could sporadically trigger a "The parameter is incorrect" error."
"Just looked at the current hudson source, they're using jcifs-1.3.3 so they are behind and do not have this (as well as several other) update(s)."
I'll see about pushing this into the upstream bug tracker, and perhaps give a shot at integrating the newer version and rebuilding locally.
Update 1: filed an issue tracker entry here
Update 2: we've switched over to JNLP and used that to install a service, which is set to automatically start. This has been working without offline issues for a day or two now. Will keep watching the upstream bug to see if/when any activity happens there.