Many clicker trainers are familiar with what is almost universally known by the ridiculously simple name of The Training Game. It’s a shaping game played among humans, and most often a learner is sent from the room while the group determines a (physically and socially safe) behavior to shape, and then a trainer shapes the learner with the clicker to perform the chosen behavior.
There are a number of variations on this game, many useful. The trainer (and observers) can learn a great deal by doing this, and it’s a great way to test various training concepts and approaches. There is a variation I have not used in nearly a decade, however, with good reason: It broke the learner.
Corrections OR Reinforcement
Long ago I used to share the Training Game just as I first experienced it, with both a how-to and how-not-to session. Not only did we shape a behavior with reinforcement, we also did it with punishment, using “no” to guide the learner away from unwanted behavior (facing the window) and into the right behavior (facing the light switch).
It wasn’t a great demo. It did a fantastic job of showing that all-punishment training was exhausting and frustrating for both the learner and the trainer, and we usually got some good quotes from the learner, such as, “I just didn’t feel that I could do anything right; I had no idea what you wanted.” But it wasn’t directly applicable, because few trainers use only punishment (or at least few believe that they do).
We always specified that the “no” was supposed to be fairly neutral, but it was still a punisher, and it was easy to see the frustration in the learner. Then one day I did it with a group of kids, anxious to learn about dogs and training, and two girls were experimenting with the “no” session. The learner seemed to be working, was experimenting with alternate behaviors, was concentrating — and then she quit. She didn’t get frustrated and slow down or get stuck, she quit. Turned, walked away, sat against the wall with her arms about herself, and refused eye contact or speech with everyone.
Ooookay. Punishment, even mild, can have fallout and will reduce the motivation of the learner. Nothing we didn’t already know, but maybe we don’t actually have to demonstrate that in sessions.
But what about mixing punishment or corrections with reinforcement? How would that work? I wouldn’t have set up such a situation myself, but I once had the opportunity to observe it.
“You’re Doing It Wrong”
The speaker was a man well-known in his field, and he had adopted some clicker training into his program. He was introducing the concept to his audience, some of whom were familiar with clicker training and some of whom were not. He decided to demonstrate by using the training game, like so many presenters before him. So he selected a volunteer, identified a behavior, and then started shaping.
He had a pretty good volunteer, who immediately started offering lots of small and large behaviors to choose from. She turned her head different directions, she walked forward and backward, she moved sideways. She raised her arms. She squatted. It was a trainer’s dream — so many points from which to select the next behavior!
The clicks, however, were fairly few and far between, compared to the rate of offered behavior, and the learner began to slow. Finally she got a couple of clicks for walking forward, so she tried that for a bit.
As she passed a certain chair, the instructor clicked, and she paused to experiment with the chair. Looking? Touching? Sitting? Nothing else worked. So she repeated the situation, going back a few steps and then walking up past the chair again. Again, he clicked as she passed the chair. And again she experimented, with no further clicks.
She went back once more, walking slowly toward the chair, eyes intensely focused on the instructor for any subtle clue. As she reached the chair, he clicked, and she stopped. She touched the chair a couple of times without feedback, and then she just hesitated.
And then the instructor spoke. “You’re not paying any attention,” he said. “Did you even listen to what the clicks mean? I already explained how this works. If I click, you are to do that again. Now do it again, and listen to what I click.”
My jaw kind of dropped at that moment. We had leapt from a trainer being unclear with his criteria and confusing his willing learner, to a trainer blaming the animal for being inattentive and stupid instead of acknowledging his own errors.
And then the session continued. I will not pain you with all the details, because it was very painful for me to watch. The low rate of clicks and their unclear timing eventually caused the learner to just give up and stand still helplessly. (His goal behavior actually had nothing to do with the chair at all; he only wanted her to walk to the front of the room.) That’s when he started joking about her intelligence and abilities and got the observers laughing at her.
Eventually the “game” ended, much to both the learner’s and my relief. The instructor traced the learner’s path, pointing out what she did wrong in response to each click he’d given — “When I clicked here, you should have turned this way, but instead you went straight. Why would you do that? Why didn’t you pay attention? This was what I marked, so you must do this thing.”
He asked as she sat down how she felt about the session (showing an impressive lack of body language observation). She said she found it very frustrating.
“That’s why we have to start with a puppy,” he answered, “so it understands.”
Okay, I’m interrupting my own narrative here, but if one cannot efficiently explain a binary system (click/no click) with all our human language available to an intelligent adult who is motivated and engaged enough to pay hundreds of dollars to attend a seminar, then one probably cannot efficiently explain it to a puppy.
However, it’s often said (and often true) that we can learn something from everything, and indeed this afforded a lot of good observations about clicker training, learners, and trainers.
We all gravitate toward reinforcement.
We all gravitate toward reinforcement. If reinforcement occurs repeatedly for the same behavior or in the same place, we’ll tend to do that or go there.
By clicking three times for proximity to the chair, he firmly fixed the chair in the learner’s mind, even though that wasn’t his intention. This is a great technique for fixing a learner in place, but be careful to vary your clicks if you want movement.
Redirected aggression is real.
During the session, while the learner was “stuck” and confused, one of the spectators asked what she was thinking. She said crisply, “I’m sort of aggressive.”
While social protocol prevented her from directing her feelings toward the esteemed speaker we’d all come to hear, she could redirect toward someone less socially protected. (“Why is my dog snapping at my kid or my other dogs? I do all this training with him….”)
Redirected aggression may appear in group classes when dogs are overfaced and unable to succeed at the training tasks being asked of them, and instead of snapping at their owners, they snark at the other dogs.
The fundamental things apply, as clicks go by…
Rate of reinforcement is important; the learner needed more feedback, and even if the clicks were unclear, she might have stayed in the game longer if she were getting “paid” better.
But even a higher rate of reinforcement could not buffer the effects of punishment. Even when the clicks came more frequently, it only took one punitive comment to freeze her again. By the end, she didn’t care about earning positive reinforcement anymore; she just wanted out.
Also, of course, the learner must be able to distinguish via the marker exactly what behavior is being reinforced. Timing and criteria matter. A lot.
The rat is always right.
The rat is always right. THE RAT IS ALWAYS RIGHT.
This last is a phrase attributed to B.F. Skinner, explaining that when the animal in a learning experiment did something other than what the researchers or trainers predicted or wanted, it wasn’t that the rat had done something wrong. The rat had simply been itself, and if it wasn’t pushing a lever to get food, it was because the trainer had not adequately taught or explained that lever-pushing was the key.
I frequently hear from new clients that their dogs are stubborn, because they don’t behave as asked or expected. I explain that this is almost never the case, and that generally the dog is not stubborn, just confused. Splitting the behavior into small, fluent pieces leads to immediate improvement!
Sometimes people find it hard to believe that the dog isn’t stubborn. After all, it’s a concept they’ve heard frequently, and it’s an easy blame-shift. I told the dog to sit, and I can’t be the problem, so the dog must be stubborn!
But it’s easy to see in a situation like this human example that “stubborn” is rarely the problem. This learner had actually traveled over 2,000 miles to hear this speaker. She volunteered for the game. She had no interest at all in humiliating him or proving him wrong — she had invested a lot of time and money and effort into coming to learn! And yet he easily fell into the common traps of blaming her inattention, lack of intelligence, etc.
Now I’m not trying particularly to bash this speaker for his mistakes; he’s still learning, like the rest of us, and he’s just not quite ready yet to teach this method to others. He’s good at what he’s good at. And no doubt he felt some pressure when his volunteer didn’t immediately grasp what he wanted, and it was a lot easier to blame the learner than to revise his own training in front of a watching and listening crowd.
But if a highly-regarded professional can slip down the blame slope, how much more should we be watching ourselves at home?
“It’s Not You, It’s Me”
If my learner seems to stall out, get “bored,” act stubborn, or just plain be stupid, I have to consider the ugly truth: it’s probably not the learner. Stop the session and look at the training plan — are we splitting finely enough? (That’s a good first check, as it’s often the problem.) How is our timing — is the click accurately marking exactly what we want and nothing else?
And here’s the good news: if I screw up a click, my dog gets a free cookie. It’s definitely not what I want, but it’s probably not a crisis. There’s very little fallout from a free tiny treat.
But if I punish my dog for not understanding, I’m setting up trouble with the entire training scenario.
Alena and I were comparing stories the other day about dogs who recognized us as trainers when we arrived. One went totally berserk aggressive at me, anticipating punishment. (It’s okay, there was a happy ending! By the end of the session, she was snuggled up beside me on the couch and begging to do more work.) Another dog greeted Alena happily until recognizing a leash and treats as predicting a training session, at which point she ducked and slunk away to hide. Punishment had made a big impression, even though there were also treats.
And seriously, who wants to get in trouble for not understanding poor teaching? We’ve all had that moment in school, right? I certainly remember being frustrated and angry when a teacher scolded us for not understanding something that hadn’t been explained, or hadn’t been explained well.
It doesn’t mean training can’t work; it just means I need to revise and improve my training.
The rat is always right. The rat is always right.