

testing on the web


 

On Wed, Mar 30, 2022 at 4:29 PM J. B. Rainsberger <me@...> wrote:
On Tue, Mar 29, 2022 at 5:31 PM Russell Gold <russ@...> wrote:
When I get a failing unit test, I generally just revert the change that caused it and try again. If that’s not practical for some reason, I will try this.

You're assuming that you know the change that caused it. The whole point of this technique is to help find the cause of a defect when we don't know the cause of the defect by other means.

Based on this reaction, I clarified the situation in the article: we use the Saff Squeeze when we don't already know what caused the test to fail. Thank you for helping me see one of my unstated assumptions.
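For anyone who hasn't tried the technique, here is a minimal sketch of one squeeze iteration in JUnit 5. The `Invoice` class, its planted defect, and the numbers are invented for illustration; they are not from the article:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// Hypothetical class with a planted defect, for illustration only.
class Invoice {
    private final double amount;
    private final double discountRate;

    Invoice(double amount, double discountRate) {
        this.amount = amount;
        this.discountRate = discountRate;
    }

    double subtotal() { return amount; }
    double discountAmount() { return 0.0; } // the defect: ignores discountRate
    double total() { return subtotal() - discountAmount(); }
}

class InvoiceTest {
    // The original failing test: it tells us *that* total() is wrong,
    // but not *where*.
    @Test
    void totalIncludesDiscount() {
        Invoice invoice = new Invoice(100.00, 0.10);
        assertEquals(90.00, invoice.total()); // fails: actual is 100.00
    }

    // One Saff Squeeze iteration: inline total() into the test and assert
    // on each intermediate value, moving the failure earlier.
    @Test
    void totalIncludesDiscount_squeezed() {
        Invoice invoice = new Invoice(100.00, 0.10);
        assertEquals(100.00, invoice.subtotal());      // passes: prune this branch
        assertEquals(10.00, invoice.discountAmount()); // fails: squeeze here next
        // Next iteration: delete the passing assertion, inline discountAmount(),
        // and repeat until the failing assertion names a tiny unit.
    }
}
```

Each iteration keeps a test failing for the same underlying reason while shrinking the code under test, until the failing assertion points at a unit small enough to become the missing test.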
--
J. B. (Joe) Rainsberger :: :: ::


--
J. B. (Joe) Rainsberger :: :: ::
Teaching evolutionary design and TDD since 2002


 








On Mar 30, 2022, at 3:29 PM, J. B. Rainsberger <me@...> wrote:

On Tue, Mar 29, 2022 at 5:31 PM Russell Gold <russ@...> wrote:


On Mar 29, 2022, at 2:58 PM, J. B. Rainsberger <me@...> wrote:

On Tue, Mar 29, 2022 at 1:57 PM Russell Gold <russ@...> wrote:
I’ve never heard of the “Mikado method,” but that’s pretty much the way I learned to do TDD.

Saff’s technique seems interesting, and I will keep it in mind for the future; I’m not sure yet exactly where it would help. Most of the problems I deal with are errors that wind up being caught in integration testing, because we missed a unit test. There, the problem is that it is not immediately obvious what test is missing, because we haven’t even reproduced the problem other than the integration test: and it doesn’t directly call the production code, so we cannot inline anything useful.

That's _exactly_ what the Saff Squeeze does: it starts with a failing, bigger unit test and produces a missing, smaller unit test. It merely uncovers that test systematically instead of relying on your intuition to imagine it.

But I don’t have a failing unit test. I have a failing integration test (and one that typically takes close to an hour to run through all setup and initialization each time).

I infer that you're using the term "integration test" to mean "unit test for a bigger unit". It's still a unit, even if it's a large one. :) (Yes, I'm trying to rehabilitate the original meaning of "unit": any independently-inspectable part of the system. And yes, that means that the entire system is a unit.)

Ah, yes, terminology problems.

As my current team uses it, a unit test tests code by making calls. It does no I/O and does not interact with system services, including timers. What we’re calling an integration test is one which interacts with the entire system in a way that simulates how a user or external system would do it. In this case, it sets up Kubernetes and Docker, builds new images, and uses HTTPS calls to make things happen. That is not a “bigger unit” in this sense. It’s particularly slow because the testers wrote it incrementally - each test adds to the previous environment before running its own logic, so you pretty much have to run the entire suite. We’ve had discussions on this point.

I’ve also heard these called “acceptance tests” and “functional tests” and “system tests.” We’re using the Maven Failsafe plugin to run these, so…
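For context, the arrangement looks roughly like the sketch below. The class name and URL are invented; the convention itself is Failsafe's default, which picks up classes named `*IT`, `IT*`, or `*ITCase` during `mvn verify`, while Surefire runs the `*Test` classes:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import org.junit.jupiter.api.Test;

// The *IT suffix keeps this out of the unit-test (Surefire) run and in the
// Failsafe run during `mvn verify`. The host name is a placeholder.
class CheckoutFlowIT {

    @Test
    void deployedServiceAnswersItsHealthCheck() throws Exception {
        // Assumes the Kubernetes/Docker environment is already up;
        // this test only exercises it over HTTPS.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("https://test-cluster.example.com/healthz")).build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        assertEquals(200, response.statusCode());
    }
}
```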

When I get a failing unit test, I generally just revert the change that caused it and try again. If that’s not practical for some reason, I will try this.

You're assuming that you know the change that caused it. The whole point of this technique is to help find the cause of a defect when we don't know the cause of the defect by other means.

Correct; since we run the unit tests after each change, we know which one caused it.

That obviously doesn’t work for learning tests, where the change was the creation of the unit test itself. I’ll have to keep this technique in mind for the next time I write one of those.
-----------------
Author, HttpUnit <> and SimpleStub <>
Now blogging at <>

Have you listened to Edict Zero <>? If not, you don’t know what you’re missing!


 

Terminologically, I believe acceptance tests should be written by the customer or product owner (or at least directly derived from their acceptance criteria), whereas the other kinds of tests are written by the developers to verify the code does what they intend it to.




 


Concur. Also, in XP, called Customer Tests.

On Mar 31, 2022, at 9:07 AM, Steve Gordon <sgordonphd@...> wrote:

Terminologically, I believe acceptance tests should be written by the customer or product owner (or at least directly derived from their acceptance criteria), whereas the other kinds of tests are written by the developers to verify the code does what they intend it to.


Ron Jeffries
I know we always like to say it'll be easier to do it now than it
will be to do it later. Not likely. I plan to be smarter later than
I am now, so I think it'll be just as easy later, maybe even easier.
Why pay now when we can pay later?


 

I infer that you're using the term "integration test" to mean "unit test for a bigger unit". It's still a unit, even if it's a large one. :) (Yes, I'm trying to rehabilitate the original meaning of "unit": any independently-inspectable part of the system. And yes, that means that the entire system is a unit.)

Respectfully, this definition makes the term unit test worthless and replaceable with just "test" or at best "automated test".

Differentiating between unit tests (tests that only test your code, not dependencies, and ideally a very small part of your code like a single method or class) and integration tests (which include more of the system than unit tests and may include infrastructure concerns) can be very useful both for discussion purposes and as a means of splitting your tests into groups or projects, I've found.

Steve


On Wed, Mar 30, 2022 at 3:30 PM J. B. Rainsberger <me@...> wrote:
On Tue, Mar 29, 2022 at 5:31 PM Russell Gold <russ@...> wrote:


On Mar 29, 2022, at 2:58 PM, J. B. Rainsberger <me@...> wrote:

On Tue, Mar 29, 2022 at 1:57 PM Russell Gold <russ@...> wrote:
I’ve never heard of the “Mikado method,” but that’s pretty much the way I learned to do TDD.

Saff’s technique seems interesting, and I will keep it in mind for the future; I’m not sure yet exactly where it would help. Most of the problems I deal with are errors that wind up being caught in integration testing, because we missed a unit test. There, the problem is that it is not immediately obvious what test is missing, because we haven’t even reproduced the problem other than the integration test: and it doesn’t directly call the production code, so we cannot inline anything useful.

That's _exactly_ what the Saff Squeeze does: it starts with a failing, bigger unit test and produces a missing, smaller unit test. It merely uncovers that test systematically instead of relying on your intuition to imagine it.

But I don’t have a failing unit test. I have a failing integration test (and one that typically takes close to an hour to run through all setup and initialization each time).

I infer that you're using the term "integration test" to mean "unit test for a bigger unit". It's still a unit, even if it's a large one. :) (Yes, I'm trying to rehabilitate the original meaning of "unit": any independently-inspectable part of the system. And yes, that means that the entire system is a unit.)

If you're starting with a failing test that takes 1 hour to run, then the Saff Squeeze remains perfectly effective, but perhaps you can't afford the time to do it that way. Or you merely might not have the patience for it. I wouldn't blame you. In that case, you might need to guess at some smaller failing unit tests, then Squeeze from there.

Even so, I propose two possibilities:

1. I can imagine situations in which Saff Squeeze would lead you to discovering the cause of the defect sooner than groping in the dark, hoping to find the failing part of the code, even if the first few iterations of the Squeeze took an entire day.

2. The Saff Squeeze might be helpful, even if you can't afford to _run_ the tests for the first few iterations. Perhaps even systematically inlining would work better (and feel better) than using a debugger or other, less-systematic techniques.
When I get a failing unit test, I generally just revert the change that caused it and try again. If that’s not practical for some reason, I will try this.

You're assuming that you know the change that caused it. The whole point of this technique is to help find the cause of a defect when we don't know the cause of the defect by other means.
--
J. B. (Joe) Rainsberger :: :: ::




--
J. B. (Joe) Rainsberger :: :: ::
Teaching evolutionary design and TDD since 2002



--
Steve Smith


 

It's funny how long the terminology "unit tests" has hung on. Perhaps that's because of its vagueness. When I worked on IBM mainframes, a typical "unit" test covered a single unit of compilation, which was generally the amount of compiled code for which we had enough base registers available: usually 4, 8 or 12K. So as you may imagine it was pretty small. If you were testing more than one such unit, that was integration. If you were testing the whole system, that was a system test. Of course, these were terms used by the testing group, since programmers didn't do testing (other than ad hoc) back then. :-)

The traditional (XP) view is that so-called "unit" tests verify that the intention of the programmer has been met, while "acceptance" tests verify the intention of the user or customer. For unit tests, I've long since moved to using Hill's term, microtests, which means for me "the kind of unit tests I aspire to write."

Charlie




 

On Thu, Mar 31, 2022 at 4:07 PM Steven Smith <ssmith.lists@...> wrote:
I infer that you're using the term "integration test" to mean "unit test for a bigger unit". It's still a unit, even if it's a large one. :) (Yes, I'm trying to rehabilitate the original meaning of "unit": any independently-inspectable part of the system. And yes, that means that the entire system is a unit.)

Respectfully, this definition makes the term unit test worthless and replaceable with just "test" or at best "automated test".

I don't think so. It describes a particular intent of the test: to check an inspectable unit, rather than limit oneself only to end-to-end tests or tests through the end-user interface. It merely simplifies the definition to label "the entire system" as a unit. In practice, unit tests are rarely end-to-end tests, except that some programmers insist on using end-to-end tests (or at least very large unit tests) to check small units of the system.

Moreover, it stands in contrast to "system test", which has the specific intent of checking system-level issues that are generally not considered faults of a single, particular unit.

Differentiating between unit tests (tests that only test your code, not dependencies, and ideally a very small part of your code like a single method or class) and integration tests (which include more of the system than unit tests and may include infrastructure concerns) can be very useful both for discussion purposes and as a means of splitting your tests into groups or projects, I've found.

Those are useful distinctions. Using the terms "unit test" and "integration test" for them is both familiar and inaccurate. The terms become arbitrary jargon that newcomers have to learn, then unlearn.

Yes, I know that those terms are not going away, even with those inaccurate usages. They are inaccurate nonetheless. I witness the confusion about the intent of so-called "integration tests" at least a few times per month. (Hint: they're often not actually checking integration.) When I use more-accurate terms in a smaller community, understanding improves. The rest of the world is welcome both to join us and not join us. I get paid the same.
--
J. B. (Joe) Rainsberger :: :: ::

--
J. B. (Joe) Rainsberger :: :: ::
Teaching evolutionary design and TDD since 2002


 

On Thu, Mar 31, 2022 at 8:04 PM Charlie Poole <charliepoole@...> wrote:
It's funny how long the terminology "unit tests" has hung on. Perhaps that's because of its vagueness. When I worked on IBM mainframes, a typical "unit" test covered a single unit of compilation, which was generally the amount of compiled code for which we had enough base registers available: usually 4, 8 or 12K. So as you may imagine it was pretty small. If you were testing more than one such unit, that was integration. If you were testing the whole system, that was a system test. Of course, these were terms used by the testing group, since programmers didn't do testing (other than adhoc) back then. :-)

I didn't know that the term literally originated as referring to compilation units, but that makes immediate sense to me. Modernizing this meaning to "unit of inspection" (since we don't have those limitations as often any more) seems even more natural now.

The traditional (XP) view is that so-called "unit" tests verify that the intention of the programmer has been met, while "acceptance" tests verify the intention of the user or customer. For unit tests, I've long since moved to using Hill's term, microtests, which means for me "the kind of unit tests I aspire to write."

And here is where I whine a little. :) "micro" means "small", so could we pretty please use "microtests" to mean "small tests" instead of turning the word into new jargon? We can all acknowledge a general bias towards small tests, but not all our best tests are microtests all the time. Sadly, our audience is routinely hearing "only write microtests or you're a bad person". We can't stop that, but we can slow it down. I'd like to try.
--
J. B. (Joe) Rainsberger :: :: ::


--
J. B. (Joe) Rainsberger :: :: ::
Teaching evolutionary design and TDD since 2002


 


On Apr 29, 2022, at 10:22 AM, J. B. Rainsberger <me@...> wrote:

On Thu, Mar 31, 2022 at 8:04 PM Charlie Poole <charliepoole@...> wrote:

The traditional (XP) view is that so-called "unit" tests verify that the intention of the programmer has been met, while "acceptance" tests verify the intention of the user or customer. For unit tests, I've long since moved to using Hill's term, microtests, which means for me "the kind of unit tests I aspire to write."

And here is where I whine a little. :) "micro" means "small", so could we pretty please use "microtests" to mean "small tests" instead of turning the word into new jargon? We can all acknowledge a general bias towards small tests, but not all our best tests are microtests all the time. Sadly, our audience is routinely hearing "only write microtests or you're a bad person". We can't stop that, but we can slow it down. I'd like to try.

That suggests that size matters - my experience is that it is primarily *speed* and lack of real-world dependencies that are most useful. So the meaning I seek is not that a test is small (whatever that may mean), but that it is super fast (on the order of a couple of milliseconds) and tests the project code only - no dependencies on I/O, system clocks, or any other environment behavior. If that includes a large code unit, fine - as long as it is very fast and clearly repeatable. I don’t think “microtests” captures that meaning. “Unit test” can be a problem, as I have had some people use it to mean anything written with an xUnit library. “Pure unit test” is the least bad name I have found so far.
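A minimal sketch of what I mean by "no system clock", with the clock dependency injected via java.time.Clock. The SessionTimeout class and the mutable clock are invented for illustration, not from any real codebase:

```java
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import org.junit.jupiter.api.Test;

// Hypothetical unit that would otherwise read the system clock directly.
class SessionTimeout {
    private final Clock clock;
    private final Instant startedAt;

    SessionTimeout(Clock clock) {
        this.clock = clock;
        this.startedAt = clock.instant();
    }

    boolean expiredAfter(Duration limit) {
        return Duration.between(startedAt, clock.instant()).compareTo(limit) > 0;
    }
}

// A controllable clock: the test advances time explicitly, so the test is
// repeatable and runs in milliseconds with no sleeping or polling.
class MutableClock extends Clock {
    private Instant now;

    MutableClock(Instant start) { this.now = start; }

    void advance(Duration d) { now = now.plus(d); }

    @Override public Instant instant() { return now; }
    @Override public ZoneId getZone() { return ZoneOffset.UTC; }
    @Override public Clock withZone(ZoneId zone) { return this; }
}

class SessionTimeoutTest {
    @Test
    void expiresOnlyAfterTheLimitPasses() {
        MutableClock clock = new MutableClock(Instant.parse("2022-04-29T12:00:00Z"));
        SessionTimeout session = new SessionTimeout(clock);

        assertFalse(session.expiredAfter(Duration.ofMinutes(30)));
        clock.advance(Duration.ofMinutes(31));
        assertTrue(session.expiredAfter(Duration.ofMinutes(30)));
    }
}
```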
-----------------
Author, HttpUnit <> and SimpleStub <>
Now blogging occasionally at <>

Have you listened to Edict Zero <>? If not, you don’t know what you’re missing!







 

Joe,

It's ALL jargon! In the first meaning (in most dictionaries I checked) of "language peculiar to a particular trade, profession, or group,"
rather than in the sense of something negative or not understood.

The "micro-test" movement has grown rather large, so I think you should critique it (if you want to) by using the definitions that
already exist, not just what the word "sounds like."

Now I'll whine too. :-)

Our TDD, XP and Agile movements have been plagued by the search for terms, which everyone will automatically understand
(in the same way we understand them) upon first encountering them. There really are no terms like that. In the rather basic
dictionary I have by my desk, there are 14 definitions of "test" and 13 for "unit." ("Behavior" has only four, which will make some
proponents happy, but they are mostly pretty non-specific and don't include "Category for which Charlie used to get a low
grade on his report card!")

Generally, to understand the "jargon" of a group, you need at least a definition. Sometimes you may need to read an article,
have a conversation or even digest a book or two. While that may sound like a problem, I've never seen it arise within the
actual teams that do the work. Of course, when we get on the internet with people following different usage, we may not
immediately understand one another. I think that's just a fact of life. We should mitigate it by not calling "hot" things "cold"
or "fast" stuff "slow" or "small" tests "big". But non-self-explanatory terms are easily bested by people who work at understanding.

Also, don't get me started about slogans!

Charlie




 


Hi Charlie--

Also particularly amusing: I describe what I do as a programmer as TDD, and what we do as a team to get on the same page to drive features as BDD. But you can swap the acronyms, too. (Context usually makes it clear what we're talking about, but I've had a few customer calls where I had to stop them and ask for a definition.)

Cheers,
Jeff

Jeff Langr / +1-719-287-4335




April 29, 2022 at 9:47 AM
Our TDD, XP and Agile movements have been plagued by the search for terms, which everyone will automatically understand
(in the same way we understand them) upon first encountering them. There really are no terms like that. In the rather basic
dictionary I have by my desk, there are 14 definitions of "test" and 13 for "unit." ("Behavior" has only four, which will make some
proponents happy, but they are mostly pretty non-specific and don't include "Category for which Charlie used to get a low
grade on his report card!")


 

Hi Jeff,

Context as well as the fact that human beings are extraordinarily good at resolving ambiguity... at least when they choose to try. :-)

Charlie




 

Charlie,

On 4/29/22 11:47 AM, Charlie Poole wrote:
Also, don't get me started about slogans!
After you've finished with slogans, would you care to tackle brand names? :-)

- George

--
----------------------------------------------------------------------
* George Dinwiddie *
Software Development
Consultant and Coach
----------------------------------------------------------------------


 

On Fri, Apr 29, 2022 at 12:11 PM Russell Gold <russ@...> wrote:
On Apr 29, 2022, at 10:22 AM, J. B. Rainsberger <me@...> wrote:

On Thu, Mar 31, 2022 at 8:04 PM Charlie Poole <charliepoole@...> wrote:

The traditional (XP) view is that so-called "unit" tests verify that the intention of the programmer has been met, while "acceptance" tests verify the intention of the user or customer. For unit tests, I've long since moved to using Hill's term, microtests, which means for me "the kind of unit tests I aspire to write."

And here is where I whine a little. :) "micro" means "small", so could we pretty please use "microtests" to mean "small tests" instead of turning the word into new jargon? We can all acknowledge a general bias towards small tests, but not all our best tests are microtests all the time. Sadly, our audience is routinely hearing "only write microtests or you're a bad person". We can't stop that, but we can slow it down. I'd like to try.

That suggests that size matters - my experience is that it is primarily *speed* and lack of real-world dependencies that are most useful. So the meaning I seek is not that a test is small (whatever that may mean), but that it is super fast (on the order of a couple of milliseconds) and tests the project code only - no dependencies on I/O, system clocks, or any other environment behavior. If that includes a large code unit, fine - as long as it is very fast and clearly repeatable. I don’t think “microtests” captures that meaning. “Unit test” can be a problem, as I have had some people use it to mean anything written with an xUnit library. “Pure unit test” is the least bad name I have found so far.

"Fast test"? Why not? I talk about "fast tests" and "slow tests" quite frequently. Moreover, it's the first division between groups of tests that most people find _helpful_, rather than arbitrary.

I agree that "microtest" merely correlates strongly with "fast test", but I use microtests more for design reasons and fast tests more for fast feedback reasons. The two tend to go together, but they are slightly different.

And they're all unit tests. :)
--
J. B. (Joe) Rainsberger :: :: ::


--
J. B. (Joe) Rainsberger :: :: ::
Teaching evolutionary design and TDD since 2002


 

On Fri, Apr 29, 2022 at 12:49 PM Charlie Poole <charliepoole@...> wrote:
It's ALL jargon! In the first meaning (in most dictionaries I checked) of "language peculiar to a particular trade, profession, or group,"
rather than in the sense of something negative or not understood.

Yup. And since human nature is to let meanings wander but keep the words the same, if we want a self-regulating system, then we need some people to nudge us back in the direction of using words that convey more broadly the meanings we wish to convey. I am one of those people. :) I have no illusion about fixing anything, but if I can make it easier for more people to understand us, I'll do that.

The "micro-test" movement has grown rather large, so I think you should critique it (if you want to) by using the definitions that
already exist, not just what the word "sounds like."

Well... "micro" suggests "small" and I happen to know the origin of the term as well as its originators, so I feel pretty confident in clarifying the original intention. :)
Now I'll whine too. :-)

Our TDD, XP and Agile movements have been plagued by the search for terms, which everyone will automatically understand
(in the same way we understand them) upon first encountering them. There really are no terms like that. In the rather basic
dictionary I have by my desk, there are 14 definitions of "test" and 13 for "unit." ("Behavior" has only four, which will make some
proponents happy, but they are mostly pretty non-specific and don't include "Category for which Charlie used to get a low
grade on his report card!")

Generally, to understand the "jargon" of a group, you need at least a definition. Sometimes you may need to read an article,
have a conversation or even digest a book or two. While that may sound like a problem, I've never seen it arise within the
actual teams that do the work. Of course, when we get on the internet with people following different usage, we may not
immediately understand one another. I think that's just a fact of life. We should mitigate it by not calling "hot" things "cold"
or "fast" stuff "slow" or "small" tests "big". But non-self-explanatory terms are easily bested by people who work at understanding.

Yup. I have no illusions of success, but I see the value in nudging. Either it helps or it doesn't. We all have different pet projects. I don't mind.

I've encountered a person who believed that "refactor" means "add a feature" and there is much confusion about what "integration test" means. Sometimes big groups come to understand X to mean not X. If I can help reduce that confusion, I'm happy to continue, at least in my spare time.
--
J. B. (Joe) Rainsberger :: :: ::


--
J. B. (Joe) Rainsberger :: :: ::
Teaching evolutionary design and TDD since 2002


 

It's interesting to me.
Today I wrote a test confirming that a queue writes the correct changes to a DB.

I started by writing the test in our e2e folder, because I assumed the test would take multiple seconds. However, since we are using Docker containers for all the DBs and queue services, I was actually able to remove all my polling loops waiting for data to be updated, and the test looks like a typical unit test (it's 5 lines long), though it takes 700ms to execute in the IDE.

I used to be able to distinguish unit/integration/e2e by saying unit tests all run in memory, integration involves one network call, and e2e involves multiple network/frontend calls and most closely resembles what an end user would experience. But now, my faith in such distinctions is fading.
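Roughly, the shape of such a test looks like the sketch below. The names, the table, and the library are all assumptions: Testcontainers is one common way to get throwaway Docker-backed databases from JUnit, and the direct INSERT stands in for whatever the queue consumer actually does:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.junit.jupiter.api.Test;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

@Testcontainers
class QueueWritesToDbTest {

    // A real database in a throwaway container. Sharing one static container
    // across tests keeps per-test cost in the sub-second range described above.
    @Container
    static PostgreSQLContainer<?> db = new PostgreSQLContainer<>("postgres:14");

    @Test
    void consumerWritesTheExpectedRow() throws Exception {
        try (Connection conn = DriverManager.getConnection(
                     db.getJdbcUrl(), db.getUsername(), db.getPassword());
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE TABLE events (payload TEXT)");
            // Stand-in for the queue consumer doing its write.
            stmt.execute("INSERT INTO events VALUES ('order-created')");

            ResultSet rows = stmt.executeQuery("SELECT payload FROM events");
            rows.next();
            // No polling loop: the write is synchronous against the container.
            assertEquals("order-created", rows.getString("payload"));
        }
    }
}
```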



 


All of it very interesting indeed!

Wrt. integration tests -> I’ve recently split our IT suite into two so that we can easily run more of our ITs during build. Now we have:

  • “Integration Tests – Local”: all tests that reach outside of the process, but the access is local only, no network calls. Examples: everything using the filesystem (and we have a few layers building on top of each other) or calling OS APIs like time functions, locale, etc. These can now run in our team build.
  • “Integration Tests – Authenticated”: tests that make authenticated network calls (like calling cloud APIs). These are not easy to run in our build: first, because we don’t allow credentials of any kind to be checked in; second, the build process does not have access to the public internet. There’s a way to allow the build to query for credentials, but it’s not trivial and a “later” concern for today. Those we run locally using each user’s locally stored authentication.
  • (there would be a third one – Public Internet Unauthenticated, but we don’t make such calls in our code)
  • I also sometimes wonder if it’d make sense to have partial UI as ITs. Today I consider those e2e tests.


UTs – anything that can run in-memory, minus slow (>30ms) ones, minus tests that span architectural boundaries (even if configured in memory).
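A sketch of one way to encode tiers like these mechanically, using JUnit 5 tags — the tag names mirror the categories above, but the class and test bodies are invented. The team build can then select only the local tier, e.g. `mvn verify -Dgroups=integration-local`:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.nio.file.Files;
import java.nio.file.Path;
import org.junit.jupiter.api.Tag;
import org.junit.jupiter.api.Test;

class LayeredStoreIT {

    // "Integration Tests – Local": leaves the process (filesystem) but never
    // the machine, so it can run in the team build.
    @Tag("integration-local")
    @Test
    void roundTripsAPayloadThroughTheFileStore() throws Exception {
        Path file = Files.createTempFile("store", ".dat");
        try {
            Files.writeString(file, "payload");
            assertEquals("payload", Files.readString(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }

    // "Integration Tests – Authenticated": makes authenticated network calls,
    // so it runs only on developer machines with locally stored credentials.
    @Tag("integration-authenticated")
    @Test
    void listsObjectsInTheCloudBucket() {
        // Body omitted: would call the cloud API using local credentials.
    }
}
```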




 

Test execution speed is only one negative effect of this overly-tight coupling. In the past, it provided practical motivation to split things apart; now that the speed difference has diminished, we have lost one of the blunt instruments that used to help us nudge programmers towards the benefits of looser coupling.

Coupling always feels comfortable until suddenly and violently it blocks your path. Usually when you _really_ want to change some little thing, then realize that you need to change the same thing in 20 places. And that's even before the code becomes legacy code for you.

That means that this "lesson" becomes even harder to teach, but not much less urgent to learn.

At the same time, 700 ms is pretty slow for a "fast" test. 1,000 such tests take 11.5 minutes to run, even with 0 startup time.

The trick is to need fewer of these tests, which still leads to isolating technology integration from the domain core _at a minimum_.
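A minimal sketch of that isolation, assuming a ports-and-adapters shape (all names invented): the domain core depends only on a port, so the container-backed tests shrink to the one adapter that really touches the database or queue.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;

// The port: the only thing the domain core knows about persistence.
interface EventStore {
    void append(String event);
    List<String> all();
}

// Fast in-memory adapter, used by the many domain tests.
class InMemoryEventStore implements EventStore {
    private final List<String> events = new ArrayList<>();
    @Override public void append(String event) { events.add(event); }
    @Override public List<String> all() { return new ArrayList<>(events); }
}

// Domain core: pure logic, no I/O, millisecond tests.
class OrderProcessor {
    private final EventStore store;
    OrderProcessor(EventStore store) { this.store = store; }
    void process(String orderId) { store.append("processed:" + orderId); }
}

class OrderProcessorTest {
    @Test
    void recordsEachProcessedOrder() {
        EventStore store = new InMemoryEventStore();
        new OrderProcessor(store).process("42");
        assertEquals(List.of("processed:42"), store.all());
    }
}
```

Only the real EventStore adapter then needs one of the 700 ms container-backed tests; the rest of the suite stays in the millisecond range.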

J. B. Rainsberger :: :: ::




 

You know... now that I think about this more, I have a change of mind.

A 700 ms test is perfectly fine... until we have 20 of them. Then it starts to become noticeable. If we need 20 of them, then there's clearly a coupling problem; with only 20 of them, the coupling problem isn't yet a disaster. It's still a fuzzy kitten that doesn't yet understand the sharpness of its claws.

That still works pretty well to nudge the programmer towards dealing with the coupling problem. And many of them will still reach for changes to the tests. I'll be there to advise changes to the design instead. :)

J. B. Rainsberger :: :: ::

