Appending zeros makes no assumptions about out-of-band data; it's a stupid trick to obtain more than N equally spaced points of the DTFT. Or the Zoom DFT, if you prefer. Or the Z transform evaluated on the unit circle, if you prefer. Or the CZT. These are all different terms for the same thing, at least in this context where we don't move off the unit circle. To prove it: write out the sum for the DFT of the padded sequence, notice that adding 0 is very literally the same thing as doing nothing, change the bounds on the sum in accordance with this mathematical fact, compare the formula to one of the other transforms, realize it's the same, achieve enlightenment. The Fourier Cinematic Universe has tons of redundant and mostly-redundant terminology; this is one example of hundreds.
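If you'd rather check this numerically than push symbols around, here is a minimal sketch using NumPy. The helper `dtft`, the test signal, and the sizes N = 16 and M = 64 are all made up for illustration. It compares a zero-padded FFT against a brute-force evaluation of the DTFT sum at M equally spaced points on the unit circle.

```python
import numpy as np

def dtft(x, omega):
    """Brute-force DTFT of a finite sequence x at angular frequencies omega (rad/sample)."""
    n = np.arange(len(x))
    # Each row of the matrix is exp(-j * omega_k * n); the product is the DTFT sum.
    return np.exp(-1j * np.outer(omega, n)) @ x

rng = np.random.default_rng(0)
x = rng.standard_normal(16)      # N = 16 arbitrary samples (illustrative only)
M = 64                           # how many points on the unit circle we want

# np.fft.fft with n=M appends zeros to x until it has length M, then takes the DFT.
padded = np.fft.fft(x, n=M)

# The same M points, computed directly from the length-N sum with no zeros anywhere.
direct = dtft(x, 2 * np.pi * np.arange(M) / M)

max_diff = np.max(np.abs(padded - direct))

# Every (M/N)-th padded bin is one of the original N-point DFT bins, untouched.
original_bins_preserved = np.allclose(padded[:: M // len(x)], np.fft.fft(x))

print(f"max |padded - direct| = {max_diff:.2e}")
print(f"original bins preserved: {original_bins_preserved}")
```

The two results should agree to floating-point rounding, and the original bins reappear unchanged among the padded ones. Padding only fills in points between them.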
You will not get more time resolution without more bandwidth, i.e. more GHz on your VNA. You just won't. It's easy to evaluate "between" time bins (this is what padding and the CZT are doing), but you are really just looking at the Fourier conjugate of your frequency-domain window function convolved onto the original discrete samples of the DFT.
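A toy illustration of that point, as a sketch only. The setup is hypothetical: two ideal, equal-amplitude reflections 0.5 ns apart, a sweep that starts at 0 Hz with 10 MHz steps, and helper names `time_response` and `count_peaks` that I made up. A real VNA sweep will differ in all of these. The idea is to count the distinct peaks in the time-domain magnitude for a 1 GHz span versus a 4 GHz span, and to see whether extra padding changes the answer.

```python
import numpy as np

def time_response(n_points, df, delays, pad_to=8192):
    """Magnitude of the zero-padded inverse DFT of ideal reflections at the given delays."""
    f = np.arange(n_points) * df                        # swept frequencies, starting at 0 Hz
    H = sum(np.exp(-2j * np.pi * f * d) for d in delays)
    # ifft with n=pad_to appends zeros to H; rescale so a lone reflection peaks at 1.
    h = np.fft.ifft(H, n=pad_to) * pad_to / n_points
    t = np.arange(pad_to) / (pad_to * df)               # time axis, period 1/df
    return t, np.abs(h)

def count_peaks(mag, rel_threshold=0.5):
    """Count local maxima that rise above rel_threshold times the global maximum."""
    mid = mag[1:-1]
    is_peak = (mid > mag[:-2]) & (mid >= mag[2:]) & (mid > rel_threshold * mag.max())
    return int(np.count_nonzero(is_peak))

df = 10e6                          # 10 MHz frequency step (assumed)
delays = (10.0e-9, 10.5e-9)        # two reflections 0.5 ns apart (assumed)

_, narrow = time_response(100, df, delays)                         # 1 GHz span
_, narrow_padded_more = time_response(100, df, delays, pad_to=65536)
_, wide = time_response(400, df, delays)                           # 4 GHz span

peaks_narrow = count_peaks(narrow)
peaks_narrow_padded_more = count_peaks(narrow_padded_more)
peaks_wide = count_peaks(wide)

print("1 GHz span, 8192-point padding: ", peaks_narrow, "peak(s)")
print("1 GHz span, 65536-point padding:", peaks_narrow_padded_more, "peak(s)")
print("4 GHz span, 8192-point padding: ", peaks_wide, "peak(s)")
```

For this particular spacing, the 1 GHz sweep should show a single merged lump no matter how finely it is padded, while the 4 GHz sweep separates the two reflections. The exact behavior depends on the relative phase of the reflections, so treat it as a demonstration of the trend and not a resolution formula.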
If this is not obvious to you after a quick reminder, you may be underinvested in understanding Fourier Transform techniques in general. Trust me, it's worth the time to understand the idea of going from FT (continuous->continuous) to DFT (discrete->discrete) by way of sampling. Multiplying by a narrowly-spaced delta comb (ideal sampling) in one domain is the same as convolving with a widely-spaced delta comb (aliasing) in the other. Convolving a narrowly-spaced delta comb with a one-sample-width box (non-ideal sampling) is the same as multiplying by a wide sinc function in the other domain (windowing). Sample rate <-> aliasing, sample aperture <-> windowing. Your VNA imposes a rectangular frequency-domain window due to its finite frequency range. This transforms into a finite-width sample aperture that determines the best time-domain resolution it can accomplish. Using any Fourier-related technique to peer "between" time-domain samples is really just looking at the aperture function, which will by default be the Fourier conjugate of a rectangle, which will be a sinc with lots of ringing. Don't mistake the ringing for additional data. It's the mathematical consequence of the choice of a rectangular window function, nothing more.
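To see that the ringing belongs to the window and not to the device under test, here is a small sketch that measures the highest sidelobe of the aperture function for two windows. The helper `peak_sidelobe_db`, the 201-point sweep length, and the choice of a Hann window for comparison are my own assumptions, not anything your VNA necessarily does.

```python
import numpy as np

def peak_sidelobe_db(window, pad_factor=64):
    """Highest sidelobe (dB relative to the main peak) of the window's time-domain response."""
    n = len(window)
    # Zero-padded inverse DFT: a finely sampled view of the aperture function.
    h = np.abs(np.fft.ifft(window, n=pad_factor * n))
    h /= h.max()                      # for a real, non-negative window the peak is at index 0
    # Walk down the main lobe until the magnitude stops decreasing (the first null).
    i = 1
    while h[i + 1] < h[i]:
        i += 1
    # The magnitude is symmetric, so searching up to the halfway point covers every sidelobe.
    return 20 * np.log10(h[i : len(h) // 2].max())

n_points = 201                        # number of swept frequency points (assumed)
rect_db = peak_sidelobe_db(np.ones(n_points))
hann_db = peak_sidelobe_db(np.hanning(n_points))

print(f"rectangular window: highest sidelobe {rect_db:.1f} dB")
print(f"Hann window:        highest sidelobe {hann_db:.1f} dB")
```

The rectangular window should come out at roughly -13 dB, which is the familiar sinc ringing, and the Hann window should come out far lower. The price is a wider main lobe, so tapering trades ringing for resolution. It does not create any.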
Again, if this rustled up memories of DSP class, great. If not, I'm sorry, I can't fit a DSP class in an email, and I am a poor teacher besides. If you want recommendations for DSP books, I used Orfanidis, but I predate a bunch of newer material (3Blue1Brown videos) that looks quite good, so the most I can say with confidence is: this is worth your time to understand. This will not be the last time you run into it. Not by a long shot. "Trust me bro" may or may not work here, but it definitely won't work the next time you run into this, or the time after, or the time after.