The Quadrants of Dog Training are Nonsense
The Motivation Matrix
I’ve wanted to tackle the famous quadrants of dog training for a while. So here goes. I can’t avoid getting a bit technical in places, but I’ll minimise it as much as possible.
I’m sure you’ve all seen the operant conditioning 2×2 matrix known as the quadrants of dog training. I’ve included its latest iteration, the motivation matrix. I want to address why we refute the quadrants, and I’ll use scientific literature and real world examples to do so. If you’ve been here for any length of time you’ll have seen me dismiss the quadrants of dog training as nonsense multiple times. I will delve into why I hold this position.

First, some background on the quadrants. They originated as an interpretation of Operant Conditioning as discovered by B.F. Skinner. They are often used by dog trainers almost interchangeably with the term Operant Conditioning. That is an error, though they are directly part of Operant Conditioning.
Operant Conditioning has its own definition. Well, two actually: the erroneous dog training version, and B.F. Skinner’s actual scientific version. Sadly they are very different, and that is where a lot of the problems with dog training have come from. I am not going to go any further into this today; Operant Conditioning is a blog topic of its own. This blog is specifically about the quadrants, which aren’t off to a good start when the very discoverer of Operant Conditioning unknowingly refutes them in his literature. Exactly how the person(s) who came up with the quadrants didn’t pick up on Skinner’s refutation of the very concept is beyond me.
So let’s begin. First we need to establish definitions for terms:
Reinforcement:
Positive reinforcement: When a type of behaviour has a consequence called reinforcing, it is more likely to happen again. A positive reinforcer strengthens any behaviour that produces it. Example: A glass of water when you are thirsty is positively reinforcing. If you do something that gets you a drink of water when you are thirsty, you are more likely to do so again on similar occasions.
Negative reinforcement: A negative reinforcer strengthens any behaviour that reduces or terminates it. Example: If your shoe is hurting your foot, the relief from pain in taking off the shoe is negatively reinforcing, so you are more likely to take off the shoe next time it hurts your foot.
Hopefully that’s so far so good. You should be able to correlate those two types of reinforcement with the left hand column of the matrix, in the green boxes.
The ultimate state of positive reinforcement is when all your needs are met: food, water, shelter, a mate, no stress in your life, you’re achieving everything you want, everything is calm and easy. You are relaxed, satiated and happy. This is what we are all striving towards. There is a deeper step into positive reinforcement, with a serious dark side, but that’s a topic for a different day; this definition will do for now.
Let’s now tackle why things are reinforcing. A common misconception exists about this, and in fact I’ve used that exact misconception above in the negative reinforcement definition. The misconception is that reinforcement exists because it feels / tastes / smells good. That is actually secondary. The reason reinforcement exists, and where it comes from, is deeper. Mammalian behavioural reinforcement has developed and exists as the underpinning survival mechanism for both the individual and the species. Everything that we find reinforcing can in some way be traced back to a related survival and / or mating mechanism. Avoiding pain prevents injury, in turn allowing survival and mate finding. Being able to source food and water on demand provides the best opportunity for survival of the individual, their mate(s) and offspring, and thus the species. It therefore follows that being successful at these activities is reinforcing, as everyone wants to survive, and most want to reproduce. Historically, with survival of the fittest, only those at the top of the survival, resource gathering and protecting games could mate. You can take this to the Nth degree: the foods that make our mouths water are those that evolutionarily are the most important and difficult to obtain. Again, it’s survival related, and it all leads toward the ultimate state of reinforcement.
Let’s now turn to punishment. I’ll get a bit technical here.
The stimuli that function as reinforcers when they are reduced or removed can be thought of as aversive stimuli, or punishing stimuli. The punishing stimuli, in terms of the quadrants, can be thought of as the opposite of reinforcement: they are designed to reduce or remove behaviour, not increase it. The problem is that when that theory was tested, it didn’t hold true, as I’ll demonstrate.
Positive Punishment: Smacking a child to stop them from doing something again.
Negative Punishment: Putting an offender in prison (for smacking a child???) is a negative punishment – you could think of this as taking away freedom.
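For anyone who thinks better in code, here is a minimal sketch of how the four quadrant labels are conventionally assigned. It assumes the usual trainers' reading: "positive" means a stimulus is added, "negative" means one is taken away, "reinforcement" means the behaviour becomes more likely, and "punishment" means it becomes less likely. The names Quadrant and quadrant_label are hypothetical, made up purely for illustration. The sketch only encodes the labels; it says nothing about whether they hold up when tested.

```python
from enum import Enum


class Quadrant(Enum):
    """The four conventional labels (hypothetical names, for illustration only)."""
    POSITIVE_REINFORCEMENT = "positive reinforcement"
    NEGATIVE_REINFORCEMENT = "negative reinforcement"
    POSITIVE_PUNISHMENT = "positive punishment"
    NEGATIVE_PUNISHMENT = "negative punishment"


def quadrant_label(stimulus_added: bool, behaviour_increases: bool) -> Quadrant:
    """Return the conventional quadrant label for one consequence.

    stimulus_added: True if something is presented, False if something is removed.
    behaviour_increases: True if the behaviour becomes more likely afterwards.
    """
    if behaviour_increases:
        # Left-hand column of the matrix: reinforcement.
        if stimulus_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    # Right-hand column of the matrix: punishment.
    if stimulus_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT
```

Under those assumptions, the glass of water is quadrant_label(True, True), taking off the painful shoe is quadrant_label(False, True), the smack is quadrant_label(True, False) and prison is quadrant_label(False, False). Notice that the function has to be told whether the behaviour increases or decreases, which is exactly the part that can only be observed after the fact.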
In the positive punishment example above, B.F. Skinner ended up determining that it could not be technically distinguished from presenting a negative reinforcer, despite saying that punishment is easily confused with but different to negative reinforcement. Bit strange, no?
In the negative punishment example, B.F. Skinner could not technically distinguish it from removing a positive reinforcer. Very odd indeed. What does that mean then?
It means that by design and definition, punishment is not reinforcement, but, in a technical sense in terms of analysing behaviour in experiments, the very person who coined the terms, along with Operant Conditioning, could not technically distinguish between positive punishment and negative reinforcement, nor could he distinguish between negative punishment and removal of positive reinforcement. The reason for this I’ll get into below, but for now, the lesson to take away is that removal of your positive reinforcement – is punishment. Anything that takes you away from your goal of the ultimate positive reinforcement, or any positive reinforcement, is punishment. That’s a lot to take in, I know.
Example: If your positive reinforcement is to meet that dog over there to determine if they are friend or foe (survival and / or mating based reinforcement), and your handler stops you and shoves a treat in your face as an alternative reinforcement, they are by definition punishing you by removing your ability to achieve your positive reinforcement. Worse, it can be viewed as rewarding punishment when the treat is considered in the equation. There are some good words for when we “pay” someone to do something for us that they don’t want to do... Bribery and manipulation spring to mind. Not things that form the foundation of great relationships, are they?
It gets worse: How can the quadrants exist, when behaviourists can’t design any experiments that can tell when you aren’t in the left hand column of reinforcement when testing behaviour? Bit of an issue when determining what works and what doesn’t.
Here is the crux of the problem:
According to B.F. Skinner in About Behaviorism: [I’m paraphrasing] If the effect of punishment were simply the reverse of reinforcement, a great deal of behaviour could be easily explained. Unfortunately, this is not the case. When punishment is administered, the punished person (or dog, or any mammal) remains inclined to behave in a punishable way, but punishment is avoided by doing something else instead, often nothing more than stubbornly doing nothing.
Let me put that into a real-world example for you. I used to use a slip leash on my dog, Rollo. A slip leash works by strangulation when the dog pulls on the leash, making it a positive punishment (or presenting a negative reinforcer, if you prefer). Why did I use this tool? Because he did things like lunge at dogs he wanted to meet, ignoring my commands to leave it. Meeting the dogs was (and still is) Rollo’s positive reinforcement; he loves it. By having Rollo wear a slip leash, I managed to get him to walk directly past other dogs on the street and not do anything except keep walking. I didn’t know this at the time, but Rollo was stubbornly doing nothing every time he walked past another dog and ignored it. I had Rollo in this tool for two years. I met Robert Hynes somewhere around this point, and Robert told me punishment doesn’t work, citing the Skinner text I’ve paraphrased above. So I took Robert’s challenge, took the slip leash off Rollo, swapped it for a flat collar and went to town. What did Rollo do to the first dog we came across on that walk after two years in a slip leash? He lunged at it with full power against my hold on the leash in a desperate attempt to meet the dog. Rollo repeated this with the very next dog, which was humiliating when the other owner asked me if Rollo was my new rescue dog.
What’s my point? Rollo’s behaviour followed exactly what B.F. Skinner wrote in About Behaviorism. The punishment Rollo was subjected to for two whole years did not change Rollo’s desire to do the behaviour of meeting other dogs one iota. Not in two years of being strangled every time he acted on his desire. The very moment the positive punishment (or negative reinforcement???) was removed, the original behaviour returned, and with some serious force. Rollo was meeting that dog and there wasn’t a damn thing I could do about it. Rollo knew this too. He knew I couldn’t stop him getting his positive reinforcement.
If we stick to the quadrants a moment, think on this: if your dog is doing behaviours that you don’t like and you try to extinguish those behaviours via punishment, you will have some success at stopping them, BUT you will have to employ that punishment for the rest of the dog’s life, because you cannot and will not change the desire of the dog to do that behaviour by punishing it. It is scientifically impossible. As soon as the punishing control mechanism you have used is either taken away or not present in the environment, the original behaviour will return, and likely stronger than before. That is a fail in changing behaviour.
Think on this example: How many dogs pull on a leash? It hurts after a while, some dogs get bald spots where they rub the fur off by pulling. How many dogs stop pulling because it hurts? Next to zero. People have to search out a harsher leash like a slip or a prong. Why? It needs to hurt more, or be more punishing to stop the dog from doing it. The moment the lesser leash is reintroduced, the pulling returns, because punishment cannot change the behaviour. Only modify it temporarily. Changing behaviour is the topic of a previous blog. Well worth your time, even if I do say so myself.
So what is the point of the punishment side of the quadrants above if it doesn’t work to actually change behaviour and, even worse, doesn’t technically exist according to the Professor who discovered Operant Conditioning? Well, if you note the language of trainers and people who use the quadrants, they use the words “modify behaviour”, which is a trick. Yes, behaviour can be modified, but it cannot be permanently changed by punishment; ergo, the punishment side of the quadrants is only useful if you can control the dog’s behaviours that you don’t like in every situation in which they might occur. That distils down to control, always and everywhere, for life. The two punishment parts of the quadrants are therefore about control, and only control. There is no room for anything else, otherwise the behaviour returns. This is why people using punishment need treats, slip leashes, prong collars, e-collars, etc. for life.
If you want to change your dog’s behaviour, the punishment quadrants have to be shunned, they cannot exist for you. The only way to change behaviour permanently is to change the emotional feeling that is causing the behaviour (I’ve written a blog on that exact topic, handily for you…….) to a positive reinforcement to do something else. Only the dog can change their emotional feeling and therefore behaviour, via operant conditioning, which as I’ve mentioned, under B.F. Skinner has a very different look and definition to the one that spawned those damn quadrants, which don’t really exist.
Oh, and you see the indusive (I don’t think that’s actually a word, but the quadrants are nonsense so it makes sense for a made up word to appear) and compulsive arrows through the matrix? Both of those terms fit neatly into manipulating the dog, with varying degrees of physical force and / or mental coercion, to do something. That’s a tangent from here for another blog (is that 3 new blogs I’ve talked myself into in this one?), but have we really got to a stage where we have to manipulate man’s best friend to do things for us that they’ve done for tens of thousands of years without said manipulation? Utterly shameful.
The answer is no, we don’t need to do that. We do, however, need to understand that punishment doesn’t change behaviour, that it is technically indistinguishable from removal of positive reinforcement, and that only your dog can decide what is reinforcement and what is punishment in each situation, not you on your dog’s behalf. Thus, the quadrants are nonsense.