zbitx, compiler optimizations #zbitx


 

Hi all,

from what I've understood about the timing issues with CW on zbitx, it seems that the CPU load is at least part of the problem.

I personally would love an option to switch off the waterfall (either in the menu or via set options), because I only need it when searching for signals, not once I have found one.

Apart from that, I wonder why the binary is compiled with debug symbols and no optimization at all.

I replaced "gcc -g" with "gcc -O3" in the build script, and the CPU load running FT8 decreased from 60-70% to 45-55%, so there is quite some room for optimization.

Interestingly, the CPU load is considerably higher on CW: 65-70% optimized versus 75-80% original (unoptimized). I guess this might be due to polling the key's GPIOs instead of handling them via interrupts (which would of course make the code more complex).

Is there a reason why the build script is not using optimization? (For development, "-g" is perfectly understandable, but for "production"?)
Maybe compiler optimization could at least be one piece of the puzzle for improving CW response.

vy73,
Peter DL4PIT


 

Next improvement could be this:
https://radxa.com/products/zeros/zero2pro/


 

It wouldn't be a 1:1 swap. The GPIO pins are in a different configuration on the Radxa. It might be possible to change all the pin assignments in the software to get it to work.


From: [email protected] <[email protected]> on behalf of pd0zz via groups.io <pd0zz@...>
Sent: Thursday, May 8, 2025 12:37 PM
To: [email protected] <[email protected]>
Subject: Re: [BITX20] zbitx, compiler optimizations #zbitx
Next improvement could be this:
https://radxa.com/products/zeros/zero2pro/


 

Peter,

I have been experimenting with getting CW to work smoothly on this device. Previously I published a modification to sbitx_gtk.c that turned off the waterfall (and more) on the dual GUIs whenever the device was in Transmit Mode. That made it much more responsive in CW Mode, especially when using a Straight Key. However, it still stutters if it is off doing something else when it should be switching right into Transmit Mode.

Today I set up the DASH input pin as a pin IRQ. That works. I have it switching into Transmit Mode immediately on a key down. But I have been struggling to figure out a way to pass the key-down / key-up state from the IRQ processing routine to the complex string of CW processing functions. I have not given up on that. But I'm out of time for today.

So, I don't think it is so much the load on the device, although running waterfalls on two GUIs contributes to that load. I think it is more of a control issue. Just an opinion.

72,
Jody - K3JZD


 

Hi Jody,

very interesting, thank you!

If you managed to detect key-down very fast with an IRQ handler, wouldn't it be sufficient to exit/skip time-consuming tasks in ui_tick()?
I mean setting a new variable, e.g. "key_wants_tx", so that time-consuming code (like the waterfall?) exits when this variable is set. This way, the further handling of key-down could stay where it is; it would simply be called sooner.

Or did you try that already and it was not sufficient?

73 Peter
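The "key_wants_tx" idea above could look roughly like the following minimal sketch. Only the names ui_tick and key_wants_tx come from the thread; key_irq_handler, key_tx_finished, draw_waterfall, handle_cw_key and the counters are hypothetical stand-ins, not the actual sbitx code. The flag is an atomic because an interrupt callback may run in a different thread than the UI timer.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag from the discussion: set by the key interrupt,
 * cleared once transmit handling is finished. */
atomic_bool key_wants_tx = false;

/* Counters used only to make the behaviour of the sketch observable. */
int waterfall_rows_drawn = 0;
int key_polls = 0;

/* Hypothetical interrupt callback: does nothing except raise the flag. */
void key_irq_handler(void) { atomic_store(&key_wants_tx, true); }

/* Hypothetical hook called when the transmit tail has expired. */
void key_tx_finished(void) { atomic_store(&key_wants_tx, false); }

/* Stand-in for an expensive GUI task; it checks the flag on every row
 * so it can bail out early if the key goes down mid-draw. */
static void draw_waterfall(int rows) {
    for (int row = 0; row < rows; row++) {
        if (atomic_load(&key_wants_tx)) return;
        waterfall_rows_drawn++;
    }
}

/* Stand-in for the existing key handling, which stays where it is. */
static void handle_cw_key(void) { key_polls++; }

/* Simplified tick: skip the expensive work while the key wants TX,
 * but always reach the key handling. */
void ui_tick_sketch(void) {
    if (!atomic_load(&key_wants_tx)) draw_waterfall(100);
    handle_cw_key();
}
```

With this shape the key handling itself is unchanged; it is just reached sooner on every tick while the flag is set.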


 

Fascinating. Sure, the waterfall can be eliminated, and that is a good idea. Is the big yellow display of the dots and dashes worth eliminating, too? I sure don't need that, although it could be a nice touch. As a CW guy, I'm a bit disappointed in the CW operation, and I appreciate your good work. Regards, John W9NET


 

Today I set up the DASH input pin as a pin IRQ. That works. I have it switching into Transmit Mode immediately on a key down. But I have been struggling to figure out a way to pass the key-down / key-up state from the IRQ processing routine to the complex string of CW processing functions. I have not given up on that. But I'm out of time for today.

I haven't had time yet to look into that problem, but couldn't the libgpiod feature of reporting timestamps for pin state changes be used for that?
- see the gpiod_edge_event_get_timestamp_ns function.
73,
Wojtek - SP5DAA
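As a rough illustration of why edge timestamps help: if every key-down and key-up edge carries a nanosecond timestamp, the element lengths can be computed later, even when the processing loop runs late. The sketch below is self-contained and does not call libgpiod; struct key_event and key_down_durations_ms are hypothetical names, and the assumption is that the timestamps would be filled in from something like gpiod_edge_event_get_timestamp_ns.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one key edge, e.g. filled in from a GPIO edge event. */
enum key_edge { KEY_DOWN, KEY_UP };

struct key_event {
    enum key_edge edge;
    uint64_t timestamp_ns; /* time of the edge in nanoseconds */
};

/* Computes the duration of each key-down period in milliseconds.
 * Writes at most out_cap durations to out and returns how many were written.
 * Repeated edges of the same kind (e.g. contact bounce) are ignored. */
size_t key_down_durations_ms(const struct key_event *events, size_t n,
                             uint32_t *out, size_t out_cap)
{
    size_t count = 0;
    int is_down = 0;
    uint64_t down_at = 0;

    for (size_t i = 0; i < n; i++) {
        if (events[i].edge == KEY_DOWN && !is_down) {
            is_down = 1;
            down_at = events[i].timestamp_ns;
        } else if (events[i].edge == KEY_UP && is_down) {
            is_down = 0;
            if (count < out_cap) {
                out[count++] =
                    (uint32_t)((events[i].timestamp_ns - down_at) / 1000000u);
            }
        }
    }
    return count;
}
```

At 20 WPM a dot lasts 60 ms and a dash 180 ms, so durations recovered this way stay accurate regardless of when the consumer gets around to reading the events.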


 

Hi,

digging deeper, I profiled the sbitx binary. For better profiling support, I moved the different parts of ui_tick into their own subroutines, so they can be identified separately in the profiler output:
handle_remote_cmd, handle_gtk_invalidations, handle_tuning_knob, handle_tick_count_routine (the part running only every 50/100/200 ticks), handle_cw_key, handle_scroll.

I found that the program spends nearly 60% of its time in an empty delay loop (i2c_delay):

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 58.73     39.57    39.57  8844853     0.00     0.00  i2c_delay
 15.66     50.12    10.55  4487474     0.00     0.00  get_field
  9.62     56.60     6.48    14238     0.00     0.00  rx_linear
  2.70     58.42     1.82     3862     0.00     0.00  tx_process
  2.35     60.00     1.58        1     1.58    23.54  sound_loop
  1.40     60.94     0.94    11294     0.00     0.00  spectrum_update
  1.35     61.85     0.91    83078     0.00     0.00  handle_gtk_invalidations
  1.34     62.75     0.90  4600445     0.00     0.00  vfo_read

The i2c_delay seems to be the main reason for ui_tick being clogged (ui_tick -> handle_tick_count_routine -> zbitx_poll -> i2cbb calls); see details in the attached profiler output.

Just as a test I changed delayTicks to 1 (so i2c_delay counts to 1 instead of 400), and this seems to remove the clogging of ui_tick completely.

To anyone who knows the code much better than I do:
I guess the burning of CPU time in i2c_delay has some reason; can anyone enlighten me why this is done?
Would there be any chance to put the i2c handling (communication with the screen Pi, right?) in a separate thread?

73, Peter
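One possible shape for the "separate thread" question, as a minimal sketch only: a worker thread performs the slow polling and publishes the latest result under a mutex, so the UI tick only takes a quick snapshot. All names here (slow_i2c_poll, poller_start, poller_stop, poller_snapshot, struct display_state) are hypothetical, and the 2 ms sleep merely stands in for the bit-banged I2C exchange; this is not how sbitx is currently structured.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical data the UI wants from the display controller. */
struct display_state {
    int last_value;
    unsigned long polls;
};

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static struct display_state shared_state;
static atomic_bool poller_running;
static pthread_t poller_thread;

/* Hypothetical stand-in for the slow bit-banged I2C transaction. */
static int slow_i2c_poll(void) {
    struct timespec ts = { 0, 2 * 1000 * 1000 }; /* pretend it takes 2 ms */
    nanosleep(&ts, NULL);
    return 42;
}

/* Worker: the slow part runs outside the lock; only the copy is locked. */
static void *poller_main(void *arg) {
    (void)arg;
    while (atomic_load(&poller_running)) {
        int value = slow_i2c_poll();
        pthread_mutex_lock(&state_lock);
        shared_state.last_value = value;
        shared_state.polls++;
        pthread_mutex_unlock(&state_lock);
    }
    return NULL;
}

/* Starts the worker; returns 0 on success (pthread_create's result). */
int poller_start(void) {
    atomic_store(&poller_running, true);
    return pthread_create(&poller_thread, NULL, poller_main, NULL);
}

/* Asks the worker to finish and waits for it. */
void poller_stop(void) {
    atomic_store(&poller_running, false);
    pthread_join(poller_thread, NULL);
}

/* Called from the UI tick: returns a copy of the latest state and
 * never waits for an I2C transaction to complete. */
struct display_state poller_snapshot(void) {
    pthread_mutex_lock(&state_lock);
    struct display_state copy = shared_state;
    pthread_mutex_unlock(&state_lock);
    return copy;
}
```

Whether this helps in practice depends on what causes the stalls: a busy-wait in a worker thread still occupies a core, so it would move the delay out of ui_tick without reducing the CPU load itself.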


 

Peter, would it help to use the timing implementation from WiringPi?


 

Hi Peter,

That is essentially what I did. There is an existing global variable (in_tx) that is True when in Transmit Mode. In the existing code, once the key down is discovered (which does not always happen right away), this in_tx global is held True until the Semi-QSK tail timer expires. Right now I am using that global being True to bypass all GUI processing in the ui_tick() function (which is being called regularly by a clock timer). That shuts down all updates to everything on both GUIs and makes the system much more responsive to key strokes (both Iambic and Straight Key).

I now have GPIO Pin Interrupts working fine for the DASH Input and the PTT (DOT) Input. (For some reason or another, the PTT and the DOT use the same GPIO Input Pin, which seems to complicate things.) I am presently forcing the in_tx global to True from within both of my IRQ handlers. However, it looks like I may need to trigger something else that is ahead of that in_tx global in the processing train instead.

Using the IRQ handler to determine both key down and key up when in Straight Key Mode was my first idea. But that does not look like anything I can use due to how the existing code is structured. I'm trying to band-aid, not totally restructure; that may be a mistake.

So, right now I am trying to figure out which existing function I can wake up from my key-down IRQ handlers to start polling the GPIO Pins and continue to act on the subsequent key-down events. There are a number of CW processing functions. It is not really straightforward. I might have to print out the code and go through it on paper, because I end up going around in circles whenever I try to chase through this in my code editor.

But this head-banging stuff is good for the brain, they say.

72,
Jody - K3JZD


 

Peter,

My response was based on your first message, not your second one, where you profiled the various functions in ui_tick(). I used the brute force approach to get rid of everything except polling the key inputs.

72,
Jody - K3JZD


 

You folks are going to fix this, I'm convinced of it.

More power to you! I would be trying to help you, but obviously you know what you're doing, and I've got students getting ready for final exams that I have to help. Meanwhile, my satellite control system is getting ready for prime time.

Gordon KX4Z

On May 9, 2025, at 14:11, Jody - K3JZD via groups.io <k3jzd.jody@...> wrote:

Peter,

My response was based on your first message, not your second one, where you profiled the various functions in ui_tick(). I used the brute force approach to get rid of everything except polling the key inputs.

72,
Jody - K3JZD


 

Peter, thanks for sharing the interesting performance data!

You may be right that delayTicks could be shortened. There is a note in the i2cbb.c source code that says:

delayTicks = 400; // Delay value empirically chosen to be twice the value that just start to cause I2C NACKs - N3SB

That note was in the sbitx code for machines with the RPI4 processor. I don't know what it should be for the Raspberry Pi Zero 2W processor in the zbitx. Maybe cut it down in proportion to the processor speed difference?
--
Mike KB2ML

On Fri, May 9, 2025 at 12:48 PM, Peter, DL4PIT wrote:


Just for test I changed delayTicks to 1 (so i2c_delay counts to 1 instead of
400), and this seems to remove the clogging of the ui_tick completely.

Anyone knowing the code much better than I do:
I guess the burning of CPU time in i2c_delay has some reason, can anyone
enlighten me why this is done?


 

Peter, I've been looking again at your note about time spent in i2c_delay. It is a simple busy-wait loop that counts a constant 400 'delayTicks' (set in i2cbb.c). I put in some timing code for debugging and found 400 delayTicks to be just 5.8 microseconds on my sbitx RPI4. Since the delay is meant to make the i2c bus work right, I assume that the _time_ value should be constant across processors, though the delayTicks count could maybe be adjusted for slower processors.
Long story, shorter ... what if you change i2c_delay() to get rid of the busy-wait?

#include <wiringPi.h>

static void i2c_delay(void) {
    // delayMicroseconds() takes an unsigned int, so 5.8 would be truncated to 5; round up to 6
    delayMicroseconds(6); // maybe put this value up where delayTicks was set in i2cbb_init()
}

which lets wiringPi use its own timing mechanism to wait the right time, on all platforms.
I haven't done any profiling like you did, but would love to hear whether it changes the picture at all if you profile it again. It is called almost 9 million times in your sample! (If wiringPi uses a busy-wait, then I should find a different timer.)

NOTE: THIS HAS BEEN ONLY CASUALLY TESTED!

Mike - KB2ML

On Fri, May 9, 2025 at 12:48 PM, Peter, DL4PIT wrote:

I found that the program is running in an empty time delay loop (i2c_delay)
for nearly 60% of the time:

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 58.73     39.57    39.57  8844853     0.00     0.00  i2c_delay
--
Mike KB2ML
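To illustrate the difference between a count-based and a time-based delay, here is a minimal sketch that waits for a fixed duration measured against the monotonic clock, so the wait is the same length regardless of processor speed. The names monotonic_ns and delay_ns_spin are hypothetical, and the 5.8 microsecond figure in the usage note is simply the value measured above on an RPI4. Note that this is still a busy-wait: it keeps the timing constant across boards but does not free the CPU.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current value of the monotonic clock in nanoseconds. */
static uint64_t monotonic_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Spins until at least duration_ns nanoseconds have elapsed.
 * Unlike a fixed loop count, the wait time does not depend on
 * processor speed or compiler optimization level. */
void delay_ns_spin(uint64_t duration_ns) {
    const uint64_t start = monotonic_ns();
    while (monotonic_ns() - start < duration_ns) {
        /* busy-wait: intentionally empty */
    }
}
```

A call such as delay_ns_spin(5800) would then stand in for counting to 400. Whether the clock_gettime overhead is acceptable at this call rate, and whether the I2C bus still behaves, would need to be checked on the actual hardware.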


 


On May 8, 2025, at 11:51, Peter, DL4PIT <pr@...> wrote:

Is there a reason why the build script is not using optimization? (for development side "-g" is perfectly understandable, but for "production" side?)
Maybe compiler optimization could at least contribute to the puzzle pieces for optimizing CW response.

I wrote a Makefile in [...] and made further changes in [...], and yeah, I have it building without debug info by default. But if you type "make SBITX_DEBUG=1" then you get ASAN too, which finds memory problems at runtime (buffer overflows and things like that). But it seems to me that ASAN might be causing a memory leak itself (which becomes critical more quickly on zbitx with so little RAM), and it also deoptimizes the build even more, so that's only for development, not for actual use.

And I have ft8lib as a submodule (because it also needs fixes and improvements), and the sbitx Makefile also builds ft8lib if it hasn't been built yet. Also with ASAN, if debug is turned on.