
Re: SSH Disconnects


 



Hello Tom,

Are you sure your Rpi isn't losing its wifi connection during the 0130 to 0230 UTC window? Try running this command on your Rpi. As you can see from my example, one of my Rpis loses its link often. Once it loses the link, all SSH sessions will be automatically disconnected.

sudo grep -e "wlan0: carrier" /var/log/syslog
--
Jul  6 00:02:26 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 00:40:10 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 00:40:11 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 01:07:34 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 01:07:55 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 01:08:01 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 01:08:40 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 01:50:26 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 01:51:39 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 02:12:24 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 02:12:28 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 02:12:34 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 02:44:34 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 02:44:40 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 03:21:36 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 03:21:42 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 03:26:31 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 04:02:53 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 04:16:37 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 04:54:59 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 04:55:01 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 05:21:12 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 05:21:26 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 05:21:32 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 05:21:48 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 05:55:40 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 05:55:55 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 06:43:52 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 06:43:54 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
Jul  6 07:22:05 rpi0w-2 dhcpcd[515]: wlan0: carrier lost
Jul  6 07:22:05 rpi0w-2 dhcpcd[515]: wlan0: carrier acquired
--
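
With a long log it can help to see whether the drops cluster in a particular hour, such as the 0130 to 0230 UTC window. The following is only a rough sketch: the function name count_drops_by_hour is made up for illustration, and it assumes the classic syslog timestamp layout ("Mon DD HH:MM:SS host ...") shown in the output above.

```shell
#!/bin/bash
# Sketch: count "carrier lost" events per hour from syslog-style lines on stdin.
# Assumption: the third whitespace-separated field is the HH:MM:SS timestamp.
# count_drops_by_hour is a hypothetical helper name, not an existing tool.
count_drops_by_hour() {
    grep -e "wlan0: carrier lost" |
        awk '{ split($3, t, ":"); drops[t[1]]++ }
             END { for (h in drops) print h ":00", drops[h] }' |
        sort
}

# Possible usage, with the same log file as the command above:
#   sudo cat /var/log/syslog | count_drops_by_hour
```

Each output line is an hour followed by the number of link drops logged in that hour. The timestamps are in whatever timezone the Rpi logs in, so convert to UTC if needed before comparing.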


If you're seeing a similar issue on your side, you need to improve your wifi connection's reliability. To get a signal strength report, you can run this script:

/usr/local/bin/get-wifi-stats.sh
--
#!/bin/bash

# 05/25/21 - dranch - minor fixes
# 04/19/21 - dranch - original version

echo -en "\nWifi Signal strength: "
iw wlan0 station dump | grep signal | awk '{print $2}'
echo -e "   -70 dbm or less  : is a weak signal and the link will drop if it gets much weaker"
echo -e "   -60 dbm to -50   : is a good signal"
echo -e "   -40 dbm or better: is a great signal"

echo -en "\nWifi RX speed: "
iw wlan0 station dump | grep 'rx bitrate' | awk '{print $3}'
echo -en "Wifi TX speed: "
iw wlan0 station dump | grep 'tx bitrate' | awk '{print $3}'
echo -en "Wifi session reconnects since last reboot: "
iw wlan0 station dump | grep failed | awk '{print $3}'

echo " "
--
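
If you want the script's rule of thumb applied automatically (for logging from cron, say), the bands can be turned into a small function. This is a sketch only: classify_signal is a hypothetical name, it expects an integer dBm value, and the "in-between" label for readings that fall between the listed bands (-69 to -61 and -49 to -41) is my own assumption since the script does not name them.

```shell
#!/bin/bash
# Sketch: map an integer dBm reading onto the bands printed by the script above.
# classify_signal is a hypothetical helper; "in-between" is an assumed label
# for values the script's legend does not cover.
classify_signal() {
    local dbm=$1
    if   [ "$dbm" -le -70 ]; then echo "weak"
    elif [ "$dbm" -ge -40 ]; then echo "great"
    elif [ "$dbm" -ge -60 ] && [ "$dbm" -le -50 ]; then echo "good"
    else echo "in-between"
    fi
}

# Possible usage, assuming iw reports a line of the form "signal: -55 [-55] dBm":
#   classify_signal "$(iw wlan0 station dump | awk '/signal:/ {print $2; exit}')"
```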


If you cannot fix this link drop issue, you need to change your approach to cope with it. Running your sessions within a re-attachable screen / tmux / etc. session might be important.
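
As one possible way to do that with tmux, the sketch below attaches to a named session if it already exists and creates it otherwise, so whatever was running keeps going across an SSH drop. The function name resume_or_start, the default session name "wspr", and the DRY_RUN switch are all made up for illustration; only the tmux new-session -A -s option combination comes from tmux itself.

```shell
#!/bin/bash
# Sketch: attach to a named tmux session, creating it if it does not exist yet.
# resume_or_start, the "wspr" default, and DRY_RUN are hypothetical names.
resume_or_start() {
    local name=${1:-wspr}
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show the command that would be run.
        echo "tmux new-session -A -s $name"
    else
        # -A makes new-session attach when the session already exists.
        tmux new-session -A -s "$name"
    fi
}

# Possible usage after logging in over SSH:
#   resume_or_start wspr
```

Start the long-running program inside that session and detach with Ctrl-b d when you are done. After a link drop, SSH back in and run the same command to pick up where you left off.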

--David
KI6ZHD


On 07/05/2021 12:59 PM, Tom McKee K4ZAD wrote:

Thanks guys for the responses. They set me on the way toward a possible resolution, which I haven't yet achieved.

SOME CLARIFICATIONS:

1. The Putty error message is - Network Error: Software caused connection to abort.

2. Several different Putty "keep alive" time settings failed to prevent the problem.

3. The problem is not router based. At the abort, Putty stays connected to the RPi; only the login to the RPi is broken. Also, the router log shows no activity at the time of the problem.

4. If the RPi WSPR TX is run directly on the RPi (no SSH), the login does not break even after hours of activity, and it also does not break at 01:30 to 02:30 UTC as it does when run via SSH.

I believe these indicate that the problem must be with the RPi's SSHD implementation.

ACTIONS TAKEN:

1. root/usr/share/openssh/sshd_config was edited to enable ClientAliveInterval and set its value to 14400 (4 hours) and to enable ClientAliveCountMax and set its value to 3. This provides 12 hours of daylight operation, but does not stop the abort at 01:30 UTC. There are probably about 30 possible settings in my sshd_config, but only 3 (5 with my changes) are enabled. I guess the defaults are generally OK, or should others be enabled?

2. As suggested, I have viewed dmesg, but all I see there seems to relate to boot up, not loss of login. Most of the other Internet advice about sshd logs is generic Linux and references files that I don't find on my RPi.

3. I have looked at the log files in root/var/log. The applicable files all carry a time stamp consistent with my logging back into the Pi after the abort. The only one that I can read is debug, and there is nothing in it with a time stamp consistent with the abort; all times are for my re-logging into the Pi.

So the problem remains, and any further suggestions would be appreciated.

Thanks,

Tom K4ZAD

