Placebo and the power of belief

Professor Irving Kirsch works at the Department of Psychology at the University of Hull.  He has done extensive research on placebo effects, anti-depressant medication, hypnosis, suggestion and the power of belief. 

His work indicating that the effect of most modern anti-depressants may be largely down to placebo has recently caused a huge stir in the press. Recent NICE guidelines also draw heavily on his meta-analysis of placebo. 

In September 2007 he spoke at the Royal College of Physicians about placebo and the power of belief.  We reproduce his talk here.

Half a century ago, Harry Beecher wrote his famous article entitled ‘The Powerful Placebo’ in which he argued that placebos have a powerful effect.  This was a groundbreaking article because in those times double-blind trials were not the norm.  There’s a piece of conventional wisdom - which turns out to be a misinterpretation but is nevertheless conventional wisdom - that about one third of patients respond to placebo. 

So that was the view in 1955.  If we go to 2001, there was a meta-analysis done by Hróbjartsson and Gøtzsche (followed up in 2004), questioning Beecher’s initial conclusion.  What they did was a meta-analysis comparing placebo effects to the effects of no treatment at all.  They came to the conclusion that there was little evidence in general that placebo had powerful clinical effects. 

Now, in evaluating this more recent meta-analysis there are a couple of things that one has to keep in mind.  First, many of the studies that were included in this meta-analysis had design flaws which would make it a little difficult to find placebo effects.  For example, some of them were not double-blind.  Some of the placebos, instead of being pills that were indistinguishable from the treatment, included things like asking people to engage in leisure reading or talking to them about football, food, television, pets and holidays.  These are not what one would typically think of as a placebo, but they happened to have been called the placebo in the studies and that was the criterion used for the meta-analysis.  There was even a case of a placebo being administered without the patient’s knowledge - it was an anti-bacterial cream that was administered during childbirth.  Given the nature of the placebo effect, one would not expect to find one if the patient doesn’t know she is being treated. 

The effect size was not compared to the effects of conventional treatments, which was interesting, particularly because in some of these studies the actual treatment did not have a significant effect.  And there was quite a large degree of variation in the conditions being evaluated: they included infertility, anaemia, bacterial infections, herpes, colds, pain and many others. 

Which conditions respond to the placebo effect?

graph of placebo effect in diabetes

Placebo effects depend in part on the condition that’s being treated.  You may find strong placebo effects in some conditions, weak placebo effects in others, no placebo effect at all in some other conditions. For example, placebo doesn’t affect diabetes.  There’s no placebo effect on blood sugar at all, looking at the FDA’s data.  By contrast, the response to pain-reducing placebo seems to be about half of the response to active pain-reducing medications.  Interestingly enough, that’s true regardless of whether the active medication is aspirin or morphine, meaning that placebo morphine is significantly more powerful than placebo aspirin.  In depression you get this whopping placebo response that is more than 80 percent of the response to the medication.  Now I will come back to that because that’s quite a broad claim to make. During this talk I will give you the details of the data.

Reanalysing placebo effect

Bruce Wampold  did a re-analysis of the meta-analysis of Hróbjartsson and Gøtzsche in 2005.  He came to the conclusion that Beecher had come to: the placebo is powerful.  He compared the effect of placebos versus no treatment at all across the large body of placebo literature.  But he limited his study to situations where there would be some suspicion that there should be a placebo effect.  He came up with the statistic that seven is the number needed to treat. 

The number needed to treat is the number of patients you would have to treat with placebo to get one benefit or prevent one negative effect compared to no treatment at all.  It’s a sort of strange statistic because it’s counterintuitive: the lower the number, the more effective the treatment.  If we have a number needed to treat of seven for placebo treatment as compared to no treatment, is that a powerful effect or is it a powerless effect?  One way to look at that is to compare it to some conventional medical treatments.  So here’s that number of seven for placebo.  The number for radiotherapy in breast cancer is about the same, with a number needed to treat of eight.  Using calcium plus vitamin D for osteoporosis in elderly patients, to prevent fractures, the number needed to treat is 15.  For beta-blockers for heart disease, the number needed to treat is 40, and for aspirin to prevent heart disease, the number needed to treat is 208.
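
The arithmetic behind the number needed to treat can be sketched in a few lines of Python. This is an illustration rather than part of the talk: it assumes the standard definition, in which the number needed to treat is the reciprocal of the absolute risk reduction. The function names are made up for the example, and the table simply restates the figures quoted above.

```python
def number_needed_to_treat(control_event_rate, treated_event_rate):
    """NNT = 1 / absolute risk reduction (the standard definition)."""
    risk_reduction = control_event_rate - treated_event_rate
    if risk_reduction <= 0:
        raise ValueError("treatment shows no benefit over control")
    return 1.0 / risk_reduction


def absolute_benefit(nnt):
    """Fraction of treated patients who benefit, as implied by an NNT."""
    return 1.0 / nnt


# NNT values as quoted in the talk (not recomputed from trial data).
QUOTED_NNT = {
    "placebo vs no treatment": 7,
    "radiotherapy for breast cancer": 8,
    "calcium plus vitamin D for osteoporosis": 15,
    "beta-blockers for heart disease": 40,
    "aspirin to prevent heart disease": 208,
}


def ranked_by_effectiveness(nnt_table):
    """Lower NNT means more effective, so sort in ascending order."""
    return sorted(nnt_table, key=nnt_table.get)


if __name__ == "__main__":
    for name in ranked_by_effectiveness(QUOTED_NNT):
        nnt = QUOTED_NNT[name]
        share = absolute_benefit(nnt)
        print(f"{name}: NNT {nnt}, roughly {share:.1%} of patients benefit")
```

Reading the statistic as "one in N patients benefits" is what makes the comparison intuitive: one in seven for placebo sits next to one in eight for radiotherapy and one in 208 for aspirin.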

number needed to treat comparison graph 

So is the placebo effect powerful?  Maybe it is, maybe it isn’t, but if it isn’t, then there’s certainly not a powerful treatment effect for many well-established conventional medical interventions.

Placebo and depression

The effect of placebos on depression is particularly strong and in fact accounts for about 80 percent of the response to anti-depressant medication.  I’d like to spend some time documenting that and telling you where that figure came from.  It started with a meta-analysis that I worked on in 1998 with Guy Sapirstein, who was at that time a graduate student.  The purpose was to evaluate the placebo effect in trials of anti-depressant medication. We weren’t really interested in anti-depressant medication at the time; I sort of backed into that interest.  What I’ve been interested in for my entire professional career is the effects of beliefs and expectancies and hence the placebo effect. 

We looked at how powerful the placebo is in the area of depression because it’s a disorder that ought to be susceptible to placebo effects. Central to depression is the sense of hopelessness.  If you give someone a treatment that instils hope you ought to be countering a major symptom of depression, so there ought to be a placebo effect.  We looked for clinical trials in which groups of people had been assigned to placebo and also to no treatment at all. They were all clinical trials of some kind of treatment, so we also had a group assigned to a drug and an effect from that to look at as well. 

We looked at the level of depression before treatment and the level of depression after treatment, for people given an active medication, for people given a placebo pill and for people put on a waiting list.  The last group was followed up to see what happened spontaneously to their depression, serving as a no-treatment control.  This last group is particularly important, which was a point that was also made by Hróbjartsson and Gøtzsche.  If you want to know what the effect of a drug is you have to subtract out the response to the placebo; that’s the conventional wisdom of double-blind clinical trials.  By the same token, if you want to know what the effect of the placebo is you have to subtract out what would have happened if you hadn’t given the placebo.  People might get better and it may have nothing to do with the placebo: it might be the passage of time, it might be spontaneous remission, it might be a statistical phenomenon called regression to the mean.  If you select a group of people to follow who are extreme on one measure, the next time you measure them, just for statistical reasons, they’re likely to be less extreme than they were before. 
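
Regression to the mean is easy to demonstrate with a small simulation. The sketch below is not from the talk and every number in it is hypothetical: it assumes each person has a stable underlying severity, that every measurement adds independent noise, and that only people scoring above a cutoff at baseline are enrolled. Nobody is treated, yet the enrolled group scores lower on average the second time.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n_people=10000, cutoff=20.0, seed=0):
    """Return (mean baseline, mean follow-up) for people selected as extreme.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    baseline_scores = []
    followup_scores = []
    for _ in range(n_people):
        true_severity = rng.gauss(15.0, 4.0)            # stable underlying level
        baseline = true_severity + rng.gauss(0.0, 3.0)  # noisy first measurement
        if baseline < cutoff:
            continue                                    # only extreme scorers are enrolled
        followup = true_severity + rng.gauss(0.0, 3.0)  # second measurement, no treatment given
        baseline_scores.append(baseline)
        followup_scores.append(followup)
    return mean(baseline_scores), mean(followup_scores)


if __name__ == "__main__":
    before, after = simulate_regression_to_mean()
    print(f"mean score at selection: {before:.1f}")
    print(f"mean score at follow-up: {after:.1f}")
```

The drop appears because people who score above the cutoff are, on average, those whose noise happened to push them upward on that occasion. This is why the change in a waiting-list group has to be subtracted before anything is credited to the placebo.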

Graph of response to anti-depressant medication, placebo and no treatment

So this was a fairly conventional meta-analysis in some ways.  What we see is a whopping response to anti-depressant medication, a standardised mean difference of more than 1.5, that is, more than one and a half standard deviations.  These effects are much larger than we usually get in most medical treatments.  There is also, however, a whopping response to placebo.  In contrast, in the wait list groups and those not given any treatment at all, there is very little response.  So if we compare the no treatment response to the placebo response, that difference is the placebo effect. 

Pie chart showing the effect of antidepressants with 50% of effect attributed to placebo 

Now we can look at the response to drug treatment and we can partition it.  This is the exact same data presented in a different format. If we look at anti-depressant medication, the data of our meta-analysis would indicate that about 25 percent of that response would occur anyway, even if there was no treatment at all, including no placebo treatment.  Fifty percent of that response is a placebo effect; that’s pretty large.  Here you can also see the thing that surprised me then, but no longer surprises me.  The effect of the drug, when you subtract the placebo response from the drug response, was relatively small - only 25 percent of the drug response. Given everything that I had read about anti-depressants, I would have expected a much larger effect for the actual medication.  So I started thinking ‘now why is the drug effect so small?’ 
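
The partitioning described here is just two subtractions, and a short Python sketch makes the steps explicit. The function name is made up for the example, and the three input values are hypothetical: they are picked only so that the shares come out at the 25, 50 and 25 percent quoted above, and they are not the meta-analysis figures themselves.

```python
def partition_drug_response(drug_response, placebo_response, no_treatment_response):
    """Split the total response in the drug group into three shares.

    natural course = what happens with no treatment at all
    placebo effect = placebo response minus no-treatment response
    drug effect    = drug response minus placebo response
    """
    natural_course = no_treatment_response
    placebo_effect = placebo_response - no_treatment_response
    drug_effect = drug_response - placebo_response
    return {
        "natural course": natural_course / drug_response,
        "placebo effect": placebo_effect / drug_response,
        "drug effect": drug_effect / drug_response,
    }


# Hypothetical improvements in standard deviation units (illustrative only).
shares = partition_drug_response(
    drug_response=1.6,
    placebo_response=1.2,
    no_treatment_response=0.4,
)

if __name__ == "__main__":
    for component, share in shares.items():
        print(f"{component}: {share:.0%} of the drug response")
```

The three shares always sum to the whole drug response, which is why the same data can be drawn either as three bars or as a single pie chart.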

Does the type of anti-depressant make a difference?

Anti-depressant drug effects by type graph 

We’d been looking at clinical trials dealing with all sorts of medications, from the older tricyclic medications, to SSRIs, MAO inhibitors and so on.  We thought, perhaps some of these are effective and some not and we’re underestimating the effect of the really effective ones.  So we went back to the clinical trials that we had analysed and categorised them in terms of what kind of active drug was used.  As you can see, there was tremendous consistency – if you look at the percentage of the drug response reproduced by placebo there is virtually no difference.

anti-depressant by type percent effect graph 

You’ll notice something else.  Alongside the tricyclics, SSRIs and other anti-depressants, there’s a fourth category, ‘other drugs’.  Here’s surprise number two.  In some of the clinical trials medications were tested that are not typically considered anti-depressants, but which were being evaluated for anti-depressant efficacy.  These included lithium for unipolar depressed patients, barbiturates, synthetic thyroid hormone for patients with no thyroid disorder, and benzodiazepines.  The effects are the same. 

What do all of these active drugs have in common with each other that they don’t share with placebo?  What is creating this difference that’s so consistent, not only between all the different anti-depressants?  What is it that lithium has in common with barbiturates or thyroid medication or tricyclics or SSRIs in relation to the treatment of depression?  Well, one thing that they all have in common is they all produce side effects.

‘Breaking blind’

Imagine that you are recruited as a patient to take part in a clinical trial for depression. You have to give informed consent.  As part of the informed consent procedure you will be told, ‘You may be given placebo, it’s double-blind, we’re not gonna let you know and the physician won’t know until after the trial is over.  The therapeutic effects may take some weeks to occur but you have to be aware there’s also a possibility of side effects.’

As part of the consent procedure you listen to all of the side effects that have been reported for this medication.  So you might get dry mouth, you might get drowsy, you might feel some nausea.  Most patients break blind – that is, they guess whether they are getting the real medication or not.  Patients receiving placebo guess correctly only about half the time, but 80 percent of patients receiving the active drug will correctly guess what they’ve received.  This makes sense.  If I was a patient in a clinical trial, the first thing I would be wondering is ‘which group have I been randomised to? Am I getting an active treatment or am I being given a placebo?  I have no way of knowing, but wait a second - my mouth is getting dry.  That’s one of the things they warned me about, yippee, I’m in the active drug group’. 

Producing a placebo with side effects

It’s not a confirmed hypothesis but it’s a plausible one.  What all of these drugs have in common, that’s not shared by placebo, is the component of breaking blind, so the person now has greater expectations for improvement.  We presumed that a central component of the placebo effect is patients’ expectations of improvement and in fact, we have some data that substantiates that as a factor.  If you do something that’s going to produce greater expectations of improvement, you ought to get greater improvement.  That might explain at least part of this difference that all active medications seem to have in comparison to a placebo.  Is there any way to control for that?  Well there is, but it’s very difficult and has been done in a relatively small number of trials.  It has not been done with SSRIs - pharmaceutical companies, which fund most of the research and the clinical trials on anti-depressants, have not been interested in following up on this research with the SSRIs.  The control is the use of what’s called an ‘active placebo’, that is a medication that produces side effects, hopefully side effects similar to those of the treatment under investigation, but which is not thought to have any therapeutic benefit itself for the condition.  So you can use an active placebo as a control in order to prevent the breaking of blind.  There were eight or nine studies that were done decades ago, using active placebos rather than inert placebos in clinical trials of tricyclic medication for depression.

If you look at published trials of tricyclic medications compared to inert placebos, about 75 percent of them report a significant benefit of medication over placebo.  If you look at the eight or nine clinical trials in which atropine was used as a placebo, none showed a significant difference between drug and placebo.  Atropine doesn’t treat depression, but does produce a dry mouth.  I’d love to see that followed up with additional research and especially additional research using new anti-depressants.  I’ve tried to get some funding to do that but I haven’t been successful, so someone else will have to do it.

'Listening to Prozac, but hearing placebo'

We published the meta-analysis I’ve just described to you and we titled it ‘Listening to Prozac’ - which was the name of a popular book at the time - ‘but Hearing Placebo’.  The reaction that some people had was, ‘Well this can’t be true. There have been so many clinical trials of anti-depressant medication, and it’s constituted a revolution in treatment.  We know that anti-depressants work, there must be something wrong with the particular set of clinical trials that you uncovered in your literature search and that led to this inappropriate conclusion.’

Unpublished trials

We decided to find another set of clinical trials and we did this by using the Freedom of Information Act, and requesting from the Food and Drug Administration in the US the data sent to them by the pharmaceutical companies in the process of obtaining approval for what were then the six most widely prescribed anti-depressant medications.  Specifically we had them send all the data that the pharmaceutical companies had given them for the approval of Fluoxetine (Prozac), Paroxetine (Seroxat/Paxil), Sertraline (Lustral/Zoloft), Venlafaxine (Effexor), Nefazodone (Dutonin/Serzone) and Citalopram (Cipramil/Celexa). Brand names vary from country to country in some cases. 

There are a couple of advantages of this FDA data set.  One is that it includes unpublished as well as published studies. One of the requirements of the FDA in approving medications is that they be sent information on all the clinical trials that were sponsored by the industry.  The second is they all use the same outcome measure which was the Hamilton rating scale for depression, and that’s useful because we didn’t have to calculate statistical effect size and then wonder, ‘well how does that correspond to what would be clinically meaningful?’

We now have a common standard that we can look at and know clinically what these scores mean. 

Vanishing clinical significance

Clinical significance of anti-depressant drug effects graph 

So here’s what we got, looking at these medications.  We see a large, clinically meaningful change in response to active medication, in this case about ten points on the Hamilton depression inventory.  There’s a large change in the placebo group as well, more than eight points on the Hamilton, a duplication by placebo of 82 percent.  You may have noticed that first I said more than 80 percent and then I showed you the slides from the earlier meta-analysis showing a 75 percent duplication by placebo.  That figure was when you have only the published literature. Here you have the unpublished studies as well.  The mean difference on the Hamilton depression inventory is 1.8 points, less than two points.  To put that into some kind of perspective, the criterion used by NICE for establishing clinical significance of a drug-placebo difference in depression was three points on the Hamilton depression inventory.  So, at least by NICE standards I think I can safely say that this difference is not clinically meaningful or statistically significant. 
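
The comparison against the NICE criterion can be written out as a short calculation. This is a sketch using the rounded figures from the talk, not the underlying trial data: a drug-group change of about ten Hamilton points and a drug-placebo difference of 1.8 points, which implies a placebo-group change of about 8.2 points. The function and variable names are made up for the example.

```python
NICE_THRESHOLD_POINTS = 3.0  # drug-placebo difference NICE treated as clinically significant


def compare_to_placebo(drug_change, placebo_change, threshold=NICE_THRESHOLD_POINTS):
    """Summarise a drug-placebo comparison on the Hamilton scale."""
    difference = drug_change - placebo_change
    return {
        "difference": difference,
        "placebo_share": placebo_change / drug_change,
        "clinically_significant": difference >= threshold,
    }


# Rounded values taken from the talk; 8.2 is implied by 10 minus 1.8.
result = compare_to_placebo(drug_change=10.0, placebo_change=8.2)

if __name__ == "__main__":
    print(f"drug-placebo difference: {result['difference']:.1f} points")
    print(f"share duplicated by placebo: {result['placebo_share']:.0%}")
    print(f"meets NICE criterion: {result['clinically_significant']}")
```

Because every trial in the FDA data set used the same outcome scale, the difference can be compared directly with the three-point criterion instead of being converted into a standardised effect size first.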

Are dose levels significant?

We had a concern about this data as well, and the concern was as follows.  There are two ways of doing clinical trials for anti-depressants. The most common way is to start with a dose and allow the physician to adjust the dose of the anti-depressant or placebo (the physician presumably doesn’t know which it is) as needed during the course of the trial. There’s nothing wrong with that, because it mimics what would presumably happen in clinical practice. 

But there’s some important information that this doesn’t give you, namely, what is a clinically effective dose?  In order to answer that question, there’s another kind of clinical trial that’s done, called a fixed dose trial.  In fixed dose trials patients are randomised to receive particular doses, low doses, high doses, medium-sized doses of the agent being investigated. This allows you to establish what a clinically effective dose is.  So our concern was that ten of these 40 trials were fixed dose trials, which means they included some groups of patients who were given anti-depressants but in low doses.  We wondered if we were underestimating the drug effect because we were including some patients who were given too low a dose.  So we went to those ten dose response trials and what we did was to compare the effect of the lowest dose of the drug to the effect of the highest dose.  As you can see there’s no difference at all; there’s no dose response relationship. 

‘The Emperor’s New Drugs’

Graph comparing the effects of the lowest and highest doses in fixed dose trials

In fact, across these ten trials there were 40 comparisons of one dose to another dose.  There was one that turned out to be significant: it was in a trial of Prozac in which the low dose was significantly more effective than the high dose.  So we published our findings under the title ‘The Emperor’s New Drugs’. The piece appeared along with nine commentaries by 14 authors, which included authors of some of the clinical trials that we had analysed that were part of the FDA database.  And the reaction this time was actually quite different to what you might expect.  Everyone agreed that the numbers were right, and they said ‘it’s true and we knew it all along’. 

A group of clinical trialists who had worked on the original drug trials wrote in response to our article.  They said ‘Many have long been unimpressed by the magnitude of the differences observed between treatments and controls, what some of our colleagues refer to as the “dirty little secret” in the pharmaceutical literature’. (Hollon et al. 2002)

Publication bias

So we hadn’t discovered anything new at all. We had just made public a secret that had already been known by the FDA, by the pharmaceutical companies, by the clinical trialists, but not by the general public, not by physicians, not by third-party payers.  The question was, how was this secret kept?  And here was my next surprise.  Significant differences between drug and placebo were found in only half of the trials.  My studies have never been sponsored by the pharmaceutical industry, so if I do a study, no matter what the results are, I can publish it.  If I do a study and it’s a pharmaceutical industry trial, I can publish it only if I have permission from the company.  Most negative trials are not published. 

So how do these drugs get approved?  The FDA’s not basing their decision on published trials, they’ve got the full data set.  The criterion used by the FDA for approving anti-depressant medication is two clinical trials showing a significant difference between drug and placebo.  That in itself doesn’t sound bad except there are a couple of catches.  One is that there’s no limit on the number of trials you can conduct in order to find the two that give you the significant drug-placebo difference. That’s voodoo science; significance values are out of the window at that point.  Negative trials just don’t count and the degree of difference is not considered. 
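
A small binomial calculation shows why an unlimited number of attempts undermines the significance values. The sketch below is an illustration, not an analysis from the talk, and it rests on simplifying assumptions: the drug has no true effect at all, the trials are independent, and each trial has a 5 percent chance of producing a significant result in the drug’s favour by luck alone. The function name is made up for the example.

```python
from math import comb


def prob_at_least_k_significant(n_trials, k=2, alpha=0.05):
    """Chance that k or more of n independent trials are 'significant' by luck.

    Assumes a drug with no true effect and a false-positive rate of alpha
    per trial (both are simplifying assumptions for illustration).
    """
    prob_fewer_than_k = sum(
        comb(n_trials, i) * alpha**i * (1 - alpha) ** (n_trials - i)
        for i in range(k)
    )
    return 1.0 - prob_fewer_than_k


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        chance = prob_at_least_k_significant(n)
        print(f"{n} trials run: {chance:.2%} chance of two lucky 'positive' trials")
```

With exactly two trials the chance of both being false positives is small, but it climbs steadily as more trials are run and only the successes are counted.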

So what do we do now?  I’m going to start by talking about what do we do now in terms of the treatment of depression specifically, and then I’m going to broaden that a little bit to talking about the implications that some of this might have for treatment, both by conventional and complementary and alternative medicine. 

Prescribing placebos?

Well, one thing we might consider doing is prescribing placebos.  They are cheap and have very few side effects, but of course there’s a problem.  The ethical problem is that to prescribe placebos effectively you probably have to deceive people.  And I’ve not studied this, but I will bet you that trust is part of the healer-patient relationship.  If you want to maintain trust you have to earn it by behaving in a trustworthy manner.  If we start deceiving patients, aside from what’s ethical or not ethical, in the long run we’re going to lose the patients’ trust and with it an important part of our capacity to produce benefits for patients.  So I don’t advocate prescribing placebos. 

Alternative treatment options

What else might we do then?  Well, one option is to prescribe St John’s Wort. If we look at the data we have about the same improvement for hypericum that we do for conventional anti-depressants.  It is also slightly but significantly better than a placebo.  There’s no deception involved so we have that advantage, and we have far fewer side effects.  There are significantly more side effects with conventional anti-depressants than there are with St John’s Wort. 

The third alternative would be to prescribe physical exercise. We have data indicating that physical exercise could be effective in the treatment of mild and moderate depression.  That’s not to say it’s not also effective for severe depression - we don’t know. 

Then, of course, there’s psychotherapy which has also been shown to be as effective as anti-depressant medication, which means it is slightly more effective than pill placebo.  In this case it’s shown to work for severe depression as well as for mild and moderate depression. Psychotherapy might also have some side effects but I don’t think they are too severe and I don’t think we need to worry about them. 

The bigger picture

Now let’s broaden this out a little.  I’ve narrowed it down to one disorder because it’s one that I’ve studied and one where I thought the results were important.  There does seem to be a placebo effect which can be meaningful and perhaps powerful for at least some conditions, depression being one, pain being another.  What implications does this data have for medical treatment and what can we do with it, especially since we can’t ethically prescribe placebos?  Well, one possibility is to concentrate on the placebo aspects of treatment so that we can generate the placebo effect without placebos.  This is based on the notion, which is sometimes true, that you can have a real drug effect and it can be boosted by placebo factors. 

Placebo effect from genuine treatment

Some lovely evidence of this comes from a series of studies by Benedetti and his colleagues in Italy, on hidden versus open treatment.  I’m going to tell you about two of his trials though he has others as well.  One was with morphine for post-operative pain and the other was with diazepam for states of anxiety associated with surgery.  In these studies he asks the patient to give consent to the following procedure: ‘we may or may not be giving you an active drug. If we give you an active medication we may or may not let you know when we start it.’  And the patients are hooked up to an IV and the IV is delivering saline.  That means you can switch from saline to the active drug without the patient’s awareness. Benedetti asked ‘what’s the effect of knowing that you’re getting the morphine, what’s the effect of knowing that you’re getting the diazepam?’  There’s no placebo in these studies but it is assessing the placebo effect. 

open and hidden interventions and how they alter the effects of pain medication

Here’s what happens with morphine and pain reduction.  Pain reduction is cut in half when the person doesn’t know that they’re getting morphine, when there’s been no signal that the morphine is being administered.  So about half of the effect of morphine is due to the knowledge that one is getting the drug.  With diazepam for anxiety, there’s no effect at all unless the patient knows they’re getting it; in fact there’s actually a slight increase in the state of anxiety.  What’s happened here is just the opposite of what we want to do: we’ve reduced the placebo component of treatment.  That’s interesting scientifically, but it’s not what we want to do clinically, so the question is, can we augment it? 

The perfect medicine?

I want to finish up with this question about prescribing placebos.  I know we can’t do it, but I can’t help but fantasise about what it would be like if we could just prescribe placebos.  Then I think, ‘well gee, if we prescribe placebos then we could have commercials for them’.  So what would  a placebo advertisement look like?  And I suppose it might be something like the following:

‘Prevaricain - a genuine placebo medication, tested in more clinical trials than any other treatment, so powerful it’s the standard by which all other medications are tested, so effective it’s used in the treatment of thousands of ailments, and so safe that it can be given to infants, the elderly and pregnant women.  Remember, if it’s placebo you can believe in it’.