I think there is another status reply I made before this one, pending...
I moved to 64-bit Bookworm on the Pi-3, and set up the Pi-1 to monitor the serial interface on the Pi-3 so I can capture the crash.
Linux rms-gw3 6.1.0-rpi8-rpi-v8 #1 SMP PREEMPT Debian 1:6.1.73-1+rpt1 (2024-01-25) aarch64 GNU/Linux
A new process appears in the panic string this time, but the behavior is similar.
17h 24m uptime, and we panic with this message (a clip). This time the panic string names kworker, which I think is a generic kernel worker task.
----------
[62681.000007] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[62681.006386] Modules linked in: mkiss ax25 cmac algif_hash aes_arm64 aes_generic algif_skcipher af_alg bnep vc4 snd_soc_hdmi_codec drm_display_helper cec drm_dma_helper drm_kms_helper brcmfmac snd_soc_core binfmt_misc brcmutil hci_uart snd_compress cfg80211 btbcm snd_pcm_dmaengine fb_sys_fops raspberrypi_hwmon bcm2835_codec(C) syscopyarea sysfillrect cdc_acm bcm2835_v4l2(C) bcm2835_isp(C) bluetooth sysimgblt v4l2_mem2mem bcm2835_mmal_vchiq(C) videobuf2_dma_contig videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev ecdh_generic snd_bcm2835(C) ecc snd_pcm rfkill libaes raspberrypi_gpiomem snd_timer vc_sm_cma(C) snd mc uio_pdrv_genirq uio drm fuse dm_mod drm_panel_orientation_quirks backlight ip_tables x_tables ipv6 i2c_bcm2835
[62681.074035] CPU: 0 PID: 8218 Comm: kworker/u8:0 Tainted: G         C         6.1.0-rpi8-rpi-v8 #1  Debian 1:6.1.73-1+rpt1
[62681.085158] Hardware name: Raspberry Pi 3 Model B Rev 1.2 (DT)
[62681.091075] Workqueue: events_unbound flush_to_ldisc
--------
I was not making any changes to the Pi-3 at the time, and only recognized the system had faulted when I attempted to check Winlink mail via VHF from another system. It would not respond, but I could see lights on the TNC decoding my attempts to connect.
A few hours earlier I was checking on system health and noticed this:
Feb 02 09:42:04 rms-gw3 rmsgw_aci[7546]: Channel Stats: 1 read, 1 active, 0 down, 1 updated, 0 errors  <---- last good update, which matches the timestamp on the Winlink status page
Feb 02 10:14:01 rms-gw3 rmsgw_aci[7616]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors  <---- things start to break down here
Feb 02 10:42:01 rms-gw3 rmsgw_aci[7664]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
Feb 02 10:54:39 rms-gw3 rmsgw_aci[7698]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
Feb 02 10:54:54 rms-gw3 rmsgw_aci[7721]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
Feb 02 10:55:01 rms-gw3 rmsgw_aci[7742]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
Feb 02 10:55:11 rms-gw3 rmsgw_aci[7775]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
Feb 02 10:55:52 rms-gw3 rmsgw_aci[7802]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
Feb 02 10:59:19 rms-gw3 rmsgw_aci[7835]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
Feb 02 11:14:01 rms-gw3 rmsgw_aci[7875]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
Feb 02 11:42:01 rms-gw3 rmsgw_aci[7976]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
Feb 02 12:14:01 rms-gw3 rmsgw_aci[8193]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors
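For anyone comparing logs, the point where the channel first reports "down" can be picked out mechanically. This is only a minimal Python sketch, assuming the journal line format shown above; the name first_down and the regular expression are illustrative and not part of rmsgw.

```python
import re

# Matches the counters in an rmsgw_aci "Channel Stats" journal line
# in the format shown in the log excerpt above.
STATS = re.compile(
    r"^(?P<stamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) .*Channel Stats: "
    r"(?P<read>\d+) read, (?P<active>\d+) active, (?P<down>\d+) down, "
    r"(?P<updated>\d+) updated, (?P<errors>\d+) errors"
)


def first_down(lines):
    """Return the timestamp of the first line reporting a channel down, or None."""
    for line in lines:
        match = STATS.search(line)
        if match and int(match.group("down")) > 0:
            return match.group("stamp")
    return None


sample = [
    "Feb 02 09:42:04 rms-gw3 rmsgw_aci[7546]: Channel Stats: 1 read, 1 active, 0 down, 1 updated, 0 errors",
    "Feb 02 10:14:01 rms-gw3 rmsgw_aci[7616]: Channel Stats: 1 read, 1 active, 1 down, 0 updated, 0 errors",
]
print(first_down(sample))
```

On the two sample lines this reports the 10:14:01 entry, the first one with a non-zero "down" count.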
Tracing the script that does ACI, I notice it is failing to detect a device from the netstat command and reporting a device is down.
root@rms-gw3:/usr/local/etc/rmsgw# netstat --protocol=ax25 -l
Active AX.25 sockets
Dest       Source     Device  State        Vr/Vs    Send-Q  Recv-Q
*          WA6BGS-10          LISTENING    000/000  0       0
In the Device column, it should show ax0. Even though the device was not displayed in the netstat output, the gateway was passing traffic between 9:42 and 12:14 before it crashed.
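The empty Device column is easy to miss if the output is split on whitespace, because the State field slides into its place. A rough Python sketch of a check that uses the header's column positions instead; it assumes the fixed-width layout shown above, the function name listeners_without_device is hypothetical, and the "healthy" sample row with ax0 is constructed for illustration rather than captured from a working system.

```python
def listeners_without_device(netstat_text):
    """Return Source callsigns of AX.25 sockets whose Device column is empty."""
    lines = netstat_text.splitlines()
    # Locate the column header; rows after it are sockets.
    header_idx = next(
        (i for i, line in enumerate(lines) if line.startswith("Dest")), None
    )
    if header_idx is None:
        return []
    header = lines[header_idx]
    src = header.index("Source")
    dev = header.index("Device")
    state = header.index("State")

    missing = []
    for row in lines[header_idx + 1:]:
        if not row.strip():
            continue
        # Slice by column position so an empty Device field stays empty.
        if not row[dev:state].strip():
            missing.append(row[src:dev].strip())
    return missing


HEADER = "Dest       Source     Device  State        Vr/Vs    Send-Q  Recv-Q"
broken = "\n".join([
    "Active AX.25 sockets",
    HEADER,
    "*          WA6BGS-10          LISTENING    000/000  0       0",
])
# Hypothetical healthy output, assuming the device would be listed as ax0.
healthy = "\n".join([
    "Active AX.25 sockets",
    HEADER,
    "*          WA6BGS-10  ax0     LISTENING    000/000  0       0",
])
print(listeners_without_device(broken))
print(listeners_without_device(healthy))
```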
I'm the only user, so it's very lightly used while I work out these bugs.
Now I have restarted the Pi-3 and, to try something different, I did not give kissattach an IP address. It starts okay and passes traffic, but the ACI script fails because the rmschanstat script looks for an IP to determine whether the interface is up.
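One possible way to test "interface is up" without depending on an IP address is to read the interface flags from sysfs. This is a sketch only: it assumes a Linux /sys/class/net tree and the interface name ax0, the helper names are hypothetical, and how rmschanstat actually performs its check is not shown here.

```python
from pathlib import Path

IFF_UP = 0x1  # "interface is up" flag bit, as defined in linux/if.h


def flags_say_up(flags_text):
    """True if the hex flags string read from sysfs has the IFF_UP bit set."""
    return bool(int(flags_text.strip(), 16) & IFF_UP)


def interface_is_up(name="ax0", sysfs_root="/sys/class/net"):
    """Check IFF_UP for an interface without requiring it to have an IP address."""
    flags_file = Path(sysfs_root) / name / "flags"
    try:
        return flags_say_up(flags_file.read_text())
    except OSError:
        # Interface does not exist or the file is unreadable.
        return False


if __name__ == "__main__":
    print("ax0 up:", interface_is_up("ax0"))
```

Note that this only says the interface is administratively up; it says nothing about whether the TNC behind it is responding.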
I'm running out of things to change or try besides abandoning the Pi.
32-bit Bullseye and Bookworm - same results with mostly similar crash times.
Pi-1 and Pi-3 - same results.
64-bit Bookworm on Pi-3 - same results so far.
I'm following the same build/config recipe each time. Nearly identical packages added from the repos, and the same git code pull for the rmsgw software.
I put the xml and config files in the same place from the same source, and the gateway starts as expected each time.
64-bit Bookworm appears to be a 32/64-bit kernel (lscpu), but every app I'm running returns "ELF 64-bit LSB", including the rmsgw app I compiled from git source.
Any suggestions will be considered.
I would like to compare config notes with others, especially if you are using a TNC.
-Jon