The macosxhints Forums

msp3k 02-04-2009 04:55 PM

SSH server won't run?
 
Hi gurus,

A new problem has developed on my 10.5.6 machine. At some point my SSH daemon died, and it now refuses to run. When I go to System Preferences -> Sharing, Remote Login is checked, but I am unable to ssh into the machine:

$ ssh -l <admin-user> <host>
ssh: Connection to host <host> port 22: Connection refused

When I uncheck System Preferences -> Sharing -> Remote Login and then re-check it, I see this in the log files:

==> /var/log/system.log <==
Feb 4 15:42:24 <host> com.apple.service_helper[57046]: getaddrinfo(): nodename nor servname provided, or not known
Feb 4 15:42:24 <host> com.apple.launchd[1] (com.openssh.sshd): Unknown key: SHAuthorizationRight

But ps -ef shows no sshd running.

On another machine where sshd is working, I see the following:
/usr/libexec/launchproxy /usr/sbin/sshd -i
/usr/sbin/sshd -i

And, on the working machine, there is no mention of SHAuthorizationRight in the log files.

Thanks for any help you can give,

Michael

trevor 02-04-2009 05:01 PM

sshd will not start if there is no configuration file. Have you erased /etc/sshd_config? Or is it unreadable?

Trevor

msp3k 02-04-2009 06:02 PM

$ ls -ald /etc/sshd_config
-rw-r--r-- 1 root wheel 3525 Mar 5 2008 /etc/sshd_config
$ cat /etc/sshd_config
# $OpenBSD: sshd_config,v 1.75 2007/03/19 01:01:29 djm Exp $

# This is the sshd server system-wide configuration file. See
# sshd_config(5) for more information.

# This sshd was compiled with PATH=/usr/bin:/bin:/usr/sbin:/sbin

# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented. Uncommented options change a
# default value.

#Port 22
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::

# Disable legacy (protocol version 1) support in the server for new
# installations. In future the default will change to require explicit
# activation of protocol 1
Protocol 2

# HostKey for protocol version 1
#HostKey /etc/ssh_host_key
# HostKeys for protocol version 2
#HostKey /etc/ssh_host_rsa_key
#HostKey /etc/ssh_host_dsa_key

# Lifetime and size of ephemeral version 1 server key
#KeyRegenerationInterval 1h
#ServerKeyBits 768

# Logging
# obsoletes QuietMode and FascistLogging
SyslogFacility AUTHPRIV
#LogLevel INFO

# Authentication:

#LoginGraceTime 2m
#PermitRootLogin yes
#StrictModes yes
#MaxAuthTries 6

#RSAAuthentication yes
#PubkeyAuthentication yes
#AuthorizedKeysFile .ssh/authorized_keys

# For this to work you will also need host keys in /etc/ssh_known_hosts
#RhostsRSAAuthentication no
# similar for protocol version 2
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for
# RhostsRSAAuthentication and HostbasedAuthentication
#IgnoreUserKnownHosts no
# Don't read the user's ~/.rhosts and ~/.shosts files
#IgnoreRhosts yes

# To disable tunneled clear text passwords, change to no here! Also,
# remember to set the UsePAM setting to 'no'.
#PasswordAuthentication yes
#PermitEmptyPasswords no

# SACL options
#SACLSupport yes

# Change to no to disable s/key passwords
#ChallengeResponseAuthentication yes

# Kerberos options
#KerberosAuthentication no
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes
#KerberosGetAFSToken no

# GSSAPI options
#GSSAPIAuthentication no
#GSSAPICleanupCredentials yes
#GSSAPIStrictAcceptorCheck yes
#GSSAPIKeyExchange no

# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication. Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
# Also, PAM will deny null passwords by default. If you need to allow
# null passwords, add the " nullok" option to the end of the
# securityserver.so line in /etc/pam.d/sshd.
#UsePAM yes

#AllowTcpForwarding yes
#GatewayPorts no
#X11Forwarding no
#X11DisplayOffset 10
#X11UseLocalhost yes
#PrintMotd yes
#PrintLastLog yes
#TCPKeepAlive yes
#UseLogin no
#UsePrivilegeSeparation yes
#PermitUserEnvironment no
#Compression delayed
#ClientAliveInterval 0
#ClientAliveCountMax 3
#UseDNS yes
#PidFile /var/run/sshd.pid
#MaxStartups 10
#PermitTunnel no

# no default banner path
#Banner /some/path

# override default of no subsystems
Subsystem sftp /usr/libexec/sftp-server

# Example of overriding settings on a per-user basis
#Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# ForceCommand cvs server

tlarkin 02-04-2009 06:14 PM

OK, a few things come to mind

Has anything changed on the network? New router, NAT enabled, etc?

Do you have ssh keys for this server? They could be expired or corrupt; you may need to trash your ~/.ssh directory to wipe out your keys and generate new ones.

Can machines ssh back into the machine where the problem occurs?

hayne 02-04-2009 06:26 PM

Quote:

Originally Posted by msp3k (Post 517384)
But ps -ef shows no sshd running

Note that you don't necessarily get 'sshd' running just from turning on the Remote Login checkbox.
What happens instead is that 'launchd' starts listening on port 22 and it will start 'sshd' when needed.

Try running the following command:
sudo lsof -i -P | grep 22
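
A plain 'grep 22' also matches unrelated lines, such as a process whose file descriptor happens to be numbered 22. A minimal sketch of a tighter filter follows. It assumes the usual lsof layout in which a listening TCP socket ends with a name like "*:22" followed by "(LISTEN)"; the function name port22_listeners is made up for illustration.

```shell
# Hypothetical helper: reads `lsof -i -P` output on stdin and prints only
# the lines for TCP sockets that are listening on port 22.
port22_listeners() {
    # Last field must be the state "(LISTEN)"; the field before it is the
    # socket name, which must end in ":22".
    awk '$NF == "(LISTEN)" && $(NF-1) ~ /:22$/'
}

# Usage (root is needed to see sockets owned by launchd):
#   sudo lsof -i -P | port22_listeners
```

If this prints nothing while Remote Login is checked, nothing is listening on port 22.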

hayne 02-04-2009 06:39 PM

Quote:

Originally Posted by msp3k (Post 517384)
Feb 4 15:42:24 <host> com.apple.service_helper[57046]: getaddrinfo(): nodename nor servname provided, or not known

For what it's worth, I see by looking in the Darwin source code that that error message comes from the following lines in the function 'getaddrinfo' (in the file openssh/openbsd-compat/fake-rfc2553.c):
Code:

        /* Don't try DNS if AI_NUMERICHOST is set */
        if (hints && hints->ai_flags & AI_NUMERICHOST)
                return (EAI_NONAME);

I don't know what AI_NUMERICHOST means, but maybe that will give you a clue.

msp3k 02-04-2009 08:23 PM

Quote:

Originally Posted by hayne (Post 517408)
Note that you don't necessarily get 'sshd' running just from turning on the Remote Login checkbox.
What happens instead is that 'launchd' starts listening on port 22 and it will start 'sshd' when needed.

Try running the following command:
sudo lsof -i -P | grep 22

ntpd 46 root 22u IPv6 0x6909b18 0t0 UDP localhost:123
mDNSRespo 56 _mdnsresponder 22u IPv4 0x6908cc0 0t0 UDP *:56559

put17 02-04-2009 08:24 PM

Can you confirm that the file /System/Library/LaunchDaemons/ssh.plist looks _exactly_ like below:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>Label</key>
        <string>com.openssh.sshd</string>
        <key>Program</key>
        <string>/usr/libexec/sshd-keygen-wrapper</string>
        <key>ProgramArguments</key>
        <array>
                <string>/usr/sbin/sshd</string>
                <string>-i</string>
        </array>
        <key>SHAuthorizationRight</key>
        <string>system.preferences</string>
        <key>SessionCreate</key>
        <true/>
        <key>Sockets</key>
        <dict>
                <key>Listeners</key>
                <dict>
                        <key>Bonjour</key>
                        <array>
                                <string>ssh</string>
                                <string>sftp-ssh</string>
                        </array>
                        <key>SockServiceName</key>
                        <string>ssh</string>
                </dict>
        </dict>
        <key>StandardErrorPath</key>
        <string>/dev/null</string>
        <key>inetdCompatibility</key>
        <dict>
                <key>Wait</key>
                <false/>
        </dict>
</dict>
</plist>

and that your system matches the following:

Code:

$ grep "^ssh " /etc/services
ssh              22/udp    # SSH Remote Login Protocol
ssh              22/tcp    # SSH Remote Login Protocol


put17 02-04-2009 08:25 PM

Quote:

Originally Posted by msp3k (Post 517433)
ntpd 46 root 22u IPv6 0x6909b18 0t0 UDP localhost:123
mDNSRespo 56 _mdnsresponder 22u IPv4 0x6908cc0 0t0 UDP *:56559

That confirms that launchd is unable to bind to port 22, so the problem is not with the sshd configuration, but rather with launchd (which listens, and then hands off to sshd once an incoming connection request is detected).

Please reply to my post above; all of those details pertain to launchd.

msp3k 02-04-2009 08:29 PM

Quote:

Originally Posted by tlarkin (Post 517404)
OK, a few things come to mind

Has anything changed on the network? New router, NAT enabled, etc?

Nothing's changed on the network.

Quote:

Originally Posted by tlarkin (Post 517404)
Do you have ssh keys for this server? They could be expired or corrupt; you may need to trash your ~/.ssh directory to wipe out your keys and generate new ones.

I don't believe that this is the problem. I believe that this error is occurring because the problem host isn't listening to port 22 at all. I knock, but it's not answering the door.

Quote:

Originally Posted by tlarkin (Post 517404)
Can machines ssh back into the machine where the problem occurs?

That's the problem. On the problem host, I can ssh out. But from the outside, I cannot ssh in.

msp3k 02-04-2009 08:35 PM

Quote:

Originally Posted by put17 (Post 517434)
Can you confirm that the file /System/Library/LaunchDaemons/ssh.plist looks _exactly_ like below:

Yes.

Quote:

Originally Posted by put17 (Post 517434)
and that your system matches the following:

Code:

$ grep "^ssh " /etc/services
ssh              22/udp    # SSH Remote Login Protocol
ssh              22/tcp    # SSH Remote Login Protocol


Yes again.

put17 02-04-2009 08:56 PM

Can you please copy and paste the output of the following commands:

Code:

grep -A1 "SockService" /System/Library/LaunchDaemons/ssh.plist

and

grep "^ssh " /etc/services

Even though that repeats some of the testing above, I think there's a good chance that this is the culprit. Just for fun, here is some output showing how I can break launchd's sshd entry to produce the same error you are getting:

Code:

bash-3.2$ pwd
/System/Library/LaunchDaemons
bash-3.2$ sudo vim ssh.plist
bash-3.2$ grep -A1 "SockService" ssh.plist
                        <key>SockServiceName</key>
                        <string>sshTHISDOESNOTEXIST</string>

So I'm telling it to listen on a service name that doesn't exist. Then I unload and load the configuration:

Code:

bash-3.2$ sudo launchctl
launchd% unload ./ssh.plist
launchd% load ./ssh.plist
getaddrinfo(): nodename nor servname provided, or not known
launchd% exit

Same error as you are getting. Fixing it again gets it to load the service just fine:

Code:

bash-3.2$ sudo vim ssh.plist
bash-3.2$ grep -A1 "SockService" ssh.plist
                        <key>SockServiceName</key>
                        <string>ssh</string>
bash-3.2$ sudo launchctl
launchd% unload ./ssh.plist
launchd% load ./ssh.plist
launchd% exit
$ sudo lsof -i -P | grep ":22"
launchd      1          root  27u  IPv6 0x4040984      0t0    TCP *:22 (LISTEN)
launchd      1          root  46u  IPv4 0x5f46a68      0t0    TCP *:22 (LISTEN)
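
The failure in this test comes down to a name lookup: the SockServiceName string has to resolve to a port, and an unknown name produces the getaddrinfo() error. As a rough sketch, the following mimics that lookup against a file in /etc/services format. The function name lookup_service and its arguments are invented for illustration, and the real resolver may consult sources other than /etc/services.

```shell
# Hypothetical helper: print the port/protocol entries for a service name
# from a file in /etc/services format. Returns non-zero if none are found.
# usage: lookup_service NAME [FILE]
lookup_service() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                  # drop trailing comments
        $1 == name { print $2; found = 1 }  # first field is the service name
        END { exit !found }                 # exit status 1 when nothing matched
    ' "${2:-/etc/services}"
}

# Usage:
#   lookup_service ssh
#   lookup_service sshTHISDOESNOTEXIST || echo "unknown service"
```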


msp3k 02-04-2009 11:23 PM

Quote:

Originally Posted by put17 (Post 517445)
Can you please copy and paste the output of the following commands:

Code:

grep -A1 "SockService" /System/Library/LaunchDaemons/ssh.plist

and

grep "^ssh " /etc/services


Code:

                        <key>SockServiceName</key>
                        <string>ssh</string>

and

Code:

ssh              22/udp    # SSH Remote Login Protocol
ssh              22/tcp      # SSH Remote Login Protocol


msp3k 02-04-2009 11:25 PM

And:

Code:

$ sudo lsof -i -P | grep ":22"
password:
$


hayne 02-04-2009 11:52 PM

Quote:

Originally Posted by msp3k (Post 517456)
Code:

ssh              22/udp    # SSH Remote Login Protocol
ssh              22/tcp      # SSH Remote Login Protocol


That's strange - you seem to have 6 spaces after the "tcp" in the above line from /etc/services
Is it possible that you have edited that file in the past?

What do you get when you run the command 'md5' on the /etc/services file ?
Code:

% md5 /etc/services
MD5 (/etc/services) = 6591b0b9196c5fe81eb9c25fbaaf25cd
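
Comparing checksums by eye is error-prone. Here is a small sketch of automating the comparison, assuming either the macOS 'md5' command or the GNU 'md5sum' command is available. The function name md5_matches is hypothetical.

```shell
# Hypothetical helper: succeed if FILE's MD5 digest equals EXPECTED.
# usage: md5_matches FILE EXPECTED
md5_matches() {
    if command -v md5 >/dev/null 2>&1; then
        # macOS/BSD: -q prints only the digest
        actual=$(md5 -q "$1")
    else
        # GNU coreutils: digest is the first field of the output
        actual=$(md5sum "$1" | awk '{print $1}')
    fi
    [ "$actual" = "$2" ]
}

# Usage:
#   md5_matches /etc/services 6591b0b9196c5fe81eb9c25fbaaf25cd && echo "matches"
```

A match only shows that the file is identical to the reference copy; a mismatch can still be harmless if the reference machine is configured differently.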


msp3k 02-05-2009 08:59 AM

Quote:

Originally Posted by hayne (Post 517459)
That's strange - you seem to have 6 spaces after the "tcp" in the above line from /etc/services
Is it possible that you have edited that file in the past?

That may be my fault. At that time I was looking at the machine from home via remote desktop. Being a remote desktop noob I couldn't figure out how to cut-and-paste from the remote desktop to the browser running locally. I resorted to typing it as carefully as I could.

Quote:

Originally Posted by hayne (Post 517459)
What do you get when you run the command 'md5' on the /etc/services file ?
Code:

% md5 /etc/services
MD5 (/etc/services) = 6591b0b9196c5fe81eb9c25fbaaf25cd


$ md5 /etc/services
MD5 (/etc/services) = 6591b0b9196c5fe81eb9c25fbaaf25cd

msp3k 02-05-2009 09:05 AM

Oh my...

This problem exists on the workstation that I use as a source for my netinstall images.
(Covers eyes with hand, peeks through fingers...)
I bet this means that this problem also exists on the machines that I imaged earlier this week.

...Yes. Yes it does.
(*sigh*)

hayne 02-05-2009 12:34 PM

Quote:

Originally Posted by msp3k (Post 517504)
This problem exists on the workstation that I use as a source for my netinstall images.

So do you understand what caused the problem? It isn't clear to me.

msp3k 02-05-2009 01:49 PM

Quote:

Originally Posted by hayne (Post 517530)
So do you understand what caused the problem? It isn't clear to me.

No, I have no clue. The only thing I think I understand is that for some reason launchd isn't listening to port 22.

I'm at a loss.

tlarkin 02-05-2009 01:52 PM

Quote:

Originally Posted by msp3k (Post 517549)
No, I have no clue. The only thing I think I understand is that for some reason launchd isn't listening to port 22.

I'm at a loss.

Well, by default it does that, and then activates sshd once a connection comes in. So, either your image is a botched OS install, or you did something in the image to configure the ssh or launchd items and it is not working.

What happens if you use launchctl and manually launch the ssh daemon? Does it work then?

hayne 02-05-2009 02:28 PM

When you turn on Remote Login in the Sharing Preferences, it reads the file "/System/Library/LaunchDaemons/ssh.plist" (as others have referred to above).

You can see this happening by running the following command in a Terminal window while you go to Sharing Preferences and turn on Remote Login:
sudo filebyproc.d | grep -i launchd

Maybe your copy of that file is corrupted somehow (even though the 'grep's above look fine) - what do you get from 'md5' applied to that file?
Code:

% md5 /System/Library/LaunchDaemons/ssh.plist
MD5 (/System/Library/LaunchDaemons/ssh.plist) = 32efeb6f3fb420f60e6ab3ef0ebfb06d


msp3k 02-05-2009 04:26 PM

Quote:

Originally Posted by tlarkin (Post 517550)
Well, by default it does that, and then activates sshd once a connection comes in. So, either your image is a botched OS install, or you did something in the image to configure the ssh or launchd items and it is not working.

What happens if you use launchctl and manually launch the ssh daemon? Does it work then?

# launchctl unload -w /System/Library/LaunchDaemons/ssh.plist
# launchctl load -w /System/Library/LaunchDaemons/ssh.plist
getaddrinfo(): nodename nor servname provided, or not known

I'm in the process of rebuilding my imaging machine now.

msp3k 02-05-2009 04:33 PM

Quote:

Originally Posted by msp3k (Post 517569)
I'm in the process of rebuilding my imaging machine now.

Oh, and I've discovered something nasty.

I restored the factory default image to the image master's hard drive.

I booted the image master and then set up its LDAP configuration, binding it to the LDAP master.

Immediately after this, all of the clients installed using the old image lost their LDAP binding. They all, each and every one, reported that the LDAP server was not responding.

I have no idea why this happened.

tlarkin 02-05-2009 04:46 PM

Quote:

Originally Posted by msp3k (Post 517571)
Oh, and I've discovered something nasty.

I restored the factory default image to the image master's hard drive.

I booted the image master and then set up its LDAP configuration, binding it to the LDAP master.

Immediately after this, all of the clients installed using the old image lost their LDAP binding. They all, each and every one, reported that the LDAP server was not responding.

I have no idea why this happened.

I have never seen or heard of this. The only thing I can think of that would cause it is your Open Directory Master server (or whatever server you are bound to) having had DNS or IP changes made to it.

msp3k 02-05-2009 11:52 PM

Quote:

Originally Posted by hayne (Post 517557)
When you turn on Remote Login in the Sharing Preferences, it reads the file "/System/Library/LaunchDaemons/ssh.plist" (as others have referred to above).

You can see this happening by running the following command in a Terminal window while you go to Sharing Preferences and turn on Remote Login:
sudo filebyproc.d | grep -i launchd

Maybe your copy of that file is corrupted somehow (even though the 'grep's above look fine) - what do you get from 'md5' applied to that file?
Code:

% md5 /System/Library/LaunchDaemons/ssh.plist
MD5 (/System/Library/LaunchDaemons/ssh.plist) = 32efeb6f3fb420f60e6ab3ef0ebfb06d


MD5 (/System/Library/LaunchDaemons/ssh.plist) = be1e3b8480daa175f18cf391e17b823f

Mine differs from yours?

Here's the complete contents of my file:
Code:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>Label</key>
        <string>com.openssh.sshd</string>
        <key>Program</key>
        <string>/usr/libexec/sshd-keygen-wrapper</string>
        <key>ProgramArguments</key>
        <array>
                <string>/usr/sbin/sshd</string>
                <string>-i</string>
        </array>
        <key>SHAuthorizationRight</key>
        <string>system.preferences</string>
        <key>SessionCreate</key>
        <true/>
        <key>Sockets</key>
        <dict>
                <key>Listeners</key>
                <dict>
                        <key>Bonjour</key>
                        <array>
                                <string>ssh</string>
                                <string>sftp-ssh</string>
                        </array>
                        <key>SockServiceName</key>
                        <string>ssh</string>
                </dict>
        </dict>
        <key>StandardErrorPath</key>
        <string>/dev/null</string>
        <key>inetdCompatibility</key>
        <dict>
                <key>Wait</key>
                <false/>
        </dict>
</dict>
</plist>


msp3k 02-06-2009 12:29 AM

Quote:

Originally Posted by tlarkin (Post 517572)
I have never seen or heard of this. The only thing I can think of that would cause it is your Open Directory Master server (or whatever server you are bound to) having had DNS or IP changes made to it.

I'm discovering lots of things about my Apple computers that don't seem to work 100%.

I was thinking that maybe the problem was that my image master was already bound when I made the image. And although I had the workflow set to bind the client on install, I thought that perhaps there was something left behind from the binding of the image master when it did so, such that all clients made from that image were incorrectly using some value with the LDAP server that should have instead been unique to each host. (Apple does seem to put a lot of strange things that I don't understand in their LDAP database.) And that perhaps when I bound the freshly re-made image master with the same name, it overwrote something in the LDAP server that caused the other clients to suddenly be out of whack.

But that was just a guess.

My new approach will be to leave the image master un-bound when I make a new image, and then bind it via a post-install script after it's been installed on the client.

tlarkin 02-06-2009 01:10 AM

Quote:

Originally Posted by msp3k (Post 517622)
I'm discovering lots of things about my Apple computers that don't seem to work 100%.

I was thinking that maybe the problem was that my image master was already bound when I made the image. And although I had the workflow set to bind the client on install, I thought that perhaps there was something left behind from the binding of the image master when it did so, such that all clients made from that image were incorrectly using some value with the LDAP server that should have instead been unique to each host. (Apple does seem to put a lot of strange things that I don't understand in their LDAP database.) And that perhaps when I bound the freshly re-made image master with the same name, it overwrote something in the LDAP server that caused the other clients to suddenly be out of whack.

But that was just a guess.

My new approach will be to leave the image master un-bound when I make a new image, and then bind it via a post-install script after it's been installed on the client.

I have scripts to change binding on a client on my site, linked below.

hayne 02-06-2009 03:00 AM

Quote:

Originally Posted by msp3k (Post 517618)
MD5 (/System/Library/LaunchDaemons/ssh.plist) = be1e3b8480daa175f18cf391e17b823f

Mine differs from yours?

Here's the complete contents of my file:
Code:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>Label</key>
        <string>com.openssh.sshd</string>
[...]


The ssh.plist on my Mac (with 10.5.6) has the following extra lines just after the "<dict>":
Code:

        <key>Disabled</key>
        <true/>



All times are GMT -5.
