I happen to own two Swiss Army knives, both from Switzerland, although I have never been there. One is mine and the other belonged to my late father. For a long time I kept them carefully separate so I knew which one was which. They are both special to me; mine was brought to me by my parents after they visited Switzerland, and my father’s was given to me by my mother at his funeral. They both have toothpicks and I haven’t lost either toothpick. They both have two blades, a big one and a little one. They both have a couple of screwdrivers (a Robertson and a regular flat blade one) and they both have the inevitable corkscrew, which I have never ever used since I don’t drink. They are very handy items even though they don’t come with every tool in the book.
Positive reinforcement is a bit like my Swiss Army knife. I can use my knife both for cutting string and for opening a paint can. I can use positive reinforcement to teach a dog to stay or to come. I can use the Swiss Army knife to pick my teeth or to clean out the groove that the little tab belongs in on the lawn mower. I can use positive reinforcement to help a dog with thunderstorm anxiety or to jump over a jump. Both of these tools are widely useful. The key to being successful is having a solid understanding of how the tool works, what it does and what it doesn’t do.
The easiest way I know of to understand positive reinforcement is to divide the term into its two component terms. Positive refers to anything that the trainer adds. So, if I GIVE the dog a treat, I am being positive. If I YELL at the dog, I am also being positive. Sadly, positive has a second meaning that is not useful when we are talking about behaviour. In everyday speech, when we say we are being positive, we usually mean something pleasant or desirable. In training we simply mean adding something to the interaction. Reinforcement is anything that increases the behaviour; so when the trainer adds something, and the dog does more of whatever we are training, we see positive reinforcement in effect. This means that if the trainer YELLS at the dog and the dog BITES more often, we are positively reinforcing biting! That sure doesn’t sound like the positivity of feeling good or better.
In its simplest form, positive reinforcement increases desired behaviours. If we want the dog to sit more often, we wait for the dog to sit and add something he wants. If the dog happens to sit, we could get right up and feed him a treat. If this happens 3 times in the first hour, 7 times in the second hour, 22 times in the third hour, 21 times in the fourth hour, and 34 times in the fifth hour, we can see a trend where, over time, the dog is sitting more and more often because the trainer added a treat each time. Easy peasy, right? This works for simple skills where the dog offers the behaviour in its entirety without having to learn something unusual or unnatural, and we call it capturing because we are capturing the behaviour as is. What about more complicated things?
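As an aside for anyone who keeps a training log, the hourly sit counts from the example can be checked with a few lines of Python. This is only a sketch: the variable and function names are my own, and fitting a straight line is just one simple way to ask whether the counts are going up overall. It is not a formal test of anything.

```python
# Sits per hour, taken from the example in the text.
sits_per_hour = [3, 7, 22, 21, 34]

# Hour-to-hour change: a positive number means more sits than the hour before.
changes = [later - earlier
           for earlier, later in zip(sits_per_hour, sits_per_hour[1:])]


def slope(counts):
    """Least-squares slope of the counts against hour number (1, 2, 3, ...)."""
    n = len(counts)
    hours = range(1, n + 1)
    mean_hour = sum(hours) / n
    mean_count = sum(counts) / n
    numerator = sum((h - mean_hour) * (c - mean_count)
                    for h, c in zip(hours, counts))
    denominator = sum((h - mean_hour) ** 2 for h in hours)
    return numerator / denominator


print(changes)               # [4, 15, -1, 13]
print(slope(sits_per_hour))  # roughly 7.6 more sits with each passing hour
```

The fourth hour dips by one sit, but the fitted slope is still clearly positive, which is why one slow hour should not be read as the training having stopped working.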
Complex behaviours can be developed through a process known as shaping. You can think of shaping as a series of steps towards an end goal. Heeling off leash is a behaviour that is very easily shaped. Simply stand still and when your dog comes towards you, mark the behaviour (we use clickers, so I will just say click when I mean mark), and toss a treat away from you. We are adding the click, and the dog keeps coming back to us after each tossed treat, which means that the behaviour of being with us is increasing, so this is still positive reinforcement. Next, you could stand so that your right side is against a wall, and click when the dog comes into heel position, and toss the treat over your shoulder so that the dog goes behind you to get the treat. In this way, he will begin to get closer and closer to a nice stationary heel position. After a few reps of this, you might try taking a half step away from the wall, and continuing to click for being on your left side in heel position. You would of course continue to toss the treat behind you to get the dog out of position, and then back in again. When your dog is good at that, you might continue the progression by stepping further away from the wall. Eventually your dog would be choosing to come to your left side even if you were in the middle of the room. You could make a nice little game of this and begin walking forward one step and clicking when your dog keeps up with you, and then walking forward 2 steps and clicking for keeping up with that. Progressively, you add more steps and then changes of directions and halts and eventually, you have very nice off leash heeling.
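If it helps to see the heeling progression written down as a plan, here is a minimal sketch in Python. The class name, the step wording and the rule for moving on are all my own assumptions for illustration. In particular, the text only says "after a few reps" and "when your dog is good at that", so the number of clicks in a row needed before raising the criterion is a made-up setting you would adjust for your own dog.

```python
from dataclasses import dataclass


@dataclass
class ShapingPlan:
    """A hypothetical tracker for a list of shaping criteria."""
    steps: list[str]
    reps_to_advance: int = 5  # assumption: clicks in a row before moving on
    current: int = 0          # index of the criterion being trained now
    successes: int = 0        # clicks in a row at the current criterion

    def record(self, met_criterion: bool) -> str:
        """Log one repetition and return the criterion to train next."""
        if met_criterion:
            self.successes += 1
            on_last_step = self.current == len(self.steps) - 1
            if self.successes >= self.reps_to_advance and not on_last_step:
                self.current += 1
                self.successes = 0
        else:
            # Assumption: a miss restarts the count at the same criterion.
            self.successes = 0
        return self.steps[self.current]


heeling = ShapingPlan(
    steps=[
        "comes towards you while you stand still",
        "comes to your left side with the wall on your right",
        "comes to your left side half a step from the wall",
        "comes to your left side in the middle of the room",
        "keeps up for one step forward",
        "keeps up for two steps forward",
    ],
    reps_to_advance=3,
)

for clicked in [True, True, True, False, True]:
    next_criterion = heeling.record(clicked)

print(next_criterion)  # comes to your left side with the wall on your right
```

The point of the sketch is only that each criterion is a small step from the last one, and that you stay put at a step until the dog is reliably earning clicks there.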
Shaping can be applied to all sorts of tasks, from heeling to coming when called, all the way through complex things like backing away from you and spinning! Shaping is a skill though, and both you and your dog need to learn the process. With one of my own dogs, D’fer, I had to stop using him as a demo for shaping after he learned to “shovel snow” by lining up to the handle of a snow shovel, picking up the handle and walking the shovel forward in under ten clicks! He was really good at shaping but he gave my audiences an unrealistic expectation for how long shaping would take.
All this is very good information for building behaviours, but how about dealing with behaviours you don’t like? Positive reinforcement can be really good for that too! There is a family of procedures called differential reinforcement. Two common versions are Differential Reinforcement of an Alternative behaviour, shortened to DRA, and Differential Reinforcement of Other behaviour, shortened to DRO. What this means is simply that you mark or click for something other than the behaviour you want to get rid of. Let’s say you have a puppy who jumps up. When he is doing ANYTHING other than jumping up, you can mark that and treat. The fact is that most of the time the puppy is NOT jumping up, so you have lots of behaviours you can reinforce. You can reinforce for four on the floor, running around, barking at the window, lying down, sitting, grabbing a toy, looking at the other dog, following the cat or getting on furniture. Some of those behaviours might also be on your list of things you don’t want your pup to do, but none of them are compatible with jumping up. This is a very flexible procedure, so you can either choose one of those things, or all of them, and either way you will decrease the jumping up behaviour. In the long run it is best to choose one behaviour and click that and only that, but at the beginning it may be best to just click for everything that isn’t jumping up, to make the point that there are lots of things other than jumping up that you can click.
Another way to decrease behaviours is to reinforce the shortest version of the behaviour. This works well for behaviours that go on and on, such as barking. Barking can be looked at as one behaviour, or it can be looked at as a long series of behaviours. When we look at it as a long series, it is easy to get control over it by clicking for the first few barks. If you were to count how many barks happen before your dog just naturally stops barking, you might find, for instance, that he barks an average of 78 times. I would want to get my click in before the fifth bark if possible, and initially, I would be following the click up with super duper good treats and lots of them. I would want to make a big impression because barking is such an exciting behaviour. I would keep doing this until I had an INCREASE in short bursts of barking. It is much easier to get rid of barking altogether when it occurs in short bursts instead of over a long duration! Once I was in control of the barking to the extent that it only happened in short bursts, I could start clicking for single barks.
All of these procedures have pitfalls, which is why I have a job! I have had many students say to me “that worked for a bit, but it isn’t working any more”. When you are using a Swiss Army knife as your main tool, you may sometimes have to stop and change tactics in order to be successful. Knowing that there are other blades to use when your first choice fails you is a valuable skill to develop as a trainer. Positive reinforcement is my favourite training tool by far, and it is really useful to play with various ways of implementing it.