In review after review of clinical trials on complementary and alternative medicine (CAM), researchers have informed therapists that many of the studies out there are not of high enough quality to prove benefits of the modalities being examined. Furthermore, there’s an inadequate number of trials to move CAM speedily along on the road to universal acceptance. What’s the problem? And why do we need these trials anyway?
This issue emerged in our Aug./Sept. 2002 Somatic Research column on rheumatology (“Massage and other Natural Wonders”) in which Dr. Edzard Ernst, professor and director of the department of complementary medicine at University of Exeter’s post-graduate medical school in England, was quoted as saying, “Generally speaking, complementary and alternative medicine is grossly under-researched. Because of the popularity of complementary and alternative medicine, adequately defining risk-benefit relationships is an urgent matter.”1 His conclusions are typical of those found in current study reviews.
Randomized controlled trials (RCTs) are considered the “gold standard” of clinical research. As for quality, Ernst said in a recent e-mail interview from his office, a high-quality trial is one that “minimizes selection bias which otherwise can distort results completely.” While some CAM studies are carefully planned and do meet RCT criteria, there seems to be a plethora of studies in which methodological weaknesses are the norm.
I recently spoke with Dr. Maria Hernandez-Reif of the Touch Research Institute (TRI) in Miami, Fla., about the current status of massage therapy research. TRI researchers have been at the forefront in providing sound clinical research in the field and have added significantly to our current body of knowledge on the effects of massage.
“I think it is very important for massage therapy research to follow the guidelines of other quality scientific research,” said Hernandez-Reif, “which includes having at least a comparison group, randomly assigning participants to the groups and having outcome measures. Think carefully about why you’re doing the particular study, have solid questions you would like to answer at the end of the study and have the controls.”
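The random assignment Hernandez-Reif describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TRI's actual procedure: the participant IDs, the group names and the `assign_groups` helper are all hypothetical.

```python
import random


def assign_groups(participant_ids, groups=("massage", "control"), seed=None):
    """Randomly assign participants to groups of (nearly) equal size.

    Hypothetical helper for illustration; real trials typically use
    concealed allocation schedules prepared before enrollment.
    """
    rng = random.Random(seed)          # seeded so the allocation can be audited
    shuffled = list(participant_ids)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)

    assignment = {group: [] for group in groups}
    # Deal shuffled participants out to the groups in turn.
    for index, participant in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(participant)
    return assignment


# Hypothetical roster of 20 participants.
participants = [f"P{n:02d}" for n in range(1, 21)]
allocation = assign_groups(participants, seed=2002)

for group, members in allocation.items():
    print(group, len(members), sorted(members))
```

Because chance alone decides who lands in which group, neither the therapist nor the participant can steer the "likely responders" into the treatment group, which is the selection bias Ernst warns about.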
Does It Fit?
One might wonder whether some methodological weaknesses in CAM research are related to the fact that this is a holistic field in which many treatments are individualized. For instance, we have previously reported on research into flower essences for depression. This approach necessarily demands a treatment protocol designed to fit each individual subject. How can this type of research fulfill the criteria of group protocol and controls? Are we trying to force a square peg into the round hole of standards originally established for allopathic medicine?
Dr. Jonathan Berman, director of the Office of Clinical and Regulatory Affairs at the National Center for Complementary and Alternative Medicine (NCCAM), National Institutes of Health (NIH) had this to say: “Standard, medically-oriented clinical trials have two fundamental requirements: therapeutic materials and procedures that can be reproduced, and trials that are conducted in a reproducible manner. The first requirement means doctors and patients know what the patients are getting in the trial and can give similar treatment to future patients. The second requirement means that if the results of the trial are favorable, future patients are likely to have favorable outcomes,” he said.
“Some CAM therapeutic materials, such as vitamins and acupuncture needles, can easily be made in a reproducible manner. Others, such as plant extracts, are more complex and more difficult to make reproducibly. Nevertheless, even the latter can probably be made so that variation is acceptable. People are people, whether they are receiving CAM or conventional therapies, and clinical trials involving CAM therapies can be performed just as well as clinical trials involving other products.
“One of the arts of clinical trials,” he continued, “is designing them to address issues that are meaningful to patients in everyday use. As already mentioned, complex botanicals can be reproducibly made and studied. But if everyday use of the therapy requires highly individualized treatments, it may or may not be possible to design formal studies of those treatments.”
Ernst takes a slightly different stance. “The problems we encounter in CAM are seen also in other fields, e.g., surgery, psychotherapy and occupational therapy. Individualization is no real obstacle.” As an example, he emphasized “the many RCTs of homeopathy that have incorporated this principle.”
Researchers at TRI have consistently adhered to the RCT criteria for randomization and controls. “Regardless of what treatment you are studying,” said Hernandez-Reif, “the study has to follow the same rigors of clinical trials.” Even in an individualized flower essence study, she said, it is necessary to include a control group receiving a dummy flower essence. “I think that’s the primary problem — studies are lacking control groups. You also need to have other types of measures. In massage therapy, subjects feel obligated to tell you their pain went from (a rating of) 8 to 4, and they know they’re expected to feel better. There might be some bias on the part of the participant.”
To counter the possible bias of self-reporting, TRI researchers try to incorporate objective measures in their studies, such as physiological measures (heart rate, blood pressure), biochemical measures (stress hormones) or immunological measures (changes in the immune system, such as natural killer cells). By collecting saliva or urine samples and measuring biochemistry, she noted, researchers “might be able to document better or complement self-report measures.”
“More Research Needed”
This recurring phrase, found in the conclusions section of so many meta-analyses, points to the complicated and sometimes frustrating task of amassing sufficient data to prove benefits of a CAM therapy. Hernandez-Reif explained the process: “In meta-analysis, researchers go through the literature and find articles published on a particular subject. They will be trying to collapse across a number of studies to see whether, if you combine them all, you have an effect.” Specific criteria are established, which generally include randomization, controls and a sufficient sample size (enough participants in a study), she said.
“Typically the studies that have been published were not rigorously done to the point where they would meet the criteria,” she said. Thus, the reviewers may end up with only five or six sound, scientific studies meeting their criteria. At this point, they “take the means and standard deviations (SD) from those studies; if those are not present, then the studies are out. The researchers collapse across the good studies and perform a statistic that comes up with an effect size. The power of that effect is based on mean and standard deviations.”
The mean represents the average score, while the SD represents variability within scores. For example, the massage therapy and control groups are both asked for a self-reported rating (such as degree of discomfort or pain) on the first day of the study. Individual scores are added in each group and divided by the number of participants in that group; a mean score is the result. The SD is then computed in this way, said Hernandez-Reif. “How much does each individual number differ from the mean? If the SD is high, that means groups are not very similar within. In order for you to understand the group dynamics, you have to study the SD. When the SD is as high or higher than the mean, scores are spread out all over the place. You are looking for consistency,” she said.
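The arithmetic can be made concrete with a short Python sketch. The ratings below are invented for illustration (they do not come from any TRI study), and the sketch assumes the sample standard deviation, which divides by n - 1; a given study might report the population form instead.

```python
import statistics

# Hypothetical day-one pain ratings (0-10 scale) for two groups of eight.
massage_ratings = [8, 7, 9, 8, 7, 8, 9, 8]    # tightly clustered scores
control_ratings = [8, 2, 10, 5, 9, 1, 10, 7]  # widely scattered scores


def summarize(scores):
    """Return (mean, sample SD) for a list of ratings."""
    mean = statistics.mean(scores)   # sum of scores / number of participants
    sd = statistics.stdev(scores)    # typical distance of a score from the mean
    return mean, sd


for name, scores in [("massage", massage_ratings), ("control", control_ratings)]:
    mean, sd = summarize(scores)
    print(f"{name}: mean = {mean:.2f}, SD = {sd:.2f}")
```

The second group's scores are scattered from 1 to 10, so its SD is several times larger than the first group's; that scatter is the lack of consistency Hernandez-Reif is describing.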
“When doing a meta-analysis,” Hernandez-Reif explained, “you look at the mean and SD of solid studies, then do a bigger analysis across all the studies.” Researchers also look at other types of analysis, such as effect size. “Meta-analysis typically uses a lot of statistics. From the effect size, they can tell whether effects are small, medium or large.” When effects are small, more research is needed.
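One way to picture the "bigger analysis across all the studies" is an inverse-variance weighted average of each study's effect size. The sketch below assumes a simple fixed-effect model with standardized mean differences, and it labels the result using Cohen's conventional benchmarks (0.2, 0.5, 0.8). The article does not say which method or cutoffs any particular reviewers used, and the three studies listed are hypothetical.

```python
import math


def d_variance(d, n_treat, n_ctrl):
    """Approximate sampling variance of a standardized mean difference."""
    n_total = n_treat + n_ctrl
    return n_total / (n_treat * n_ctrl) + d ** 2 / (2 * n_total)


def pooled_effect(studies):
    """Fixed-effect pooled effect size: larger, more precise studies count more."""
    weights = [1 / d_variance(s["d"], s["n_treat"], s["n_ctrl"]) for s in studies]
    total = sum(weights)
    pooled = sum(w * s["d"] for w, s in zip(weights, studies)) / total
    std_error = math.sqrt(1 / total)
    return pooled, std_error


def label(d):
    """Describe an effect size using Cohen's conventional benchmarks."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Hypothetical studies that survived the reviewers' inclusion criteria.
studies = [
    {"d": 0.35, "n_treat": 15, "n_ctrl": 15},
    {"d": 0.60, "n_treat": 40, "n_ctrl": 40},
    {"d": 0.20, "n_treat": 12, "n_ctrl": 12},
]

pooled, std_error = pooled_effect(studies)
print(f"pooled effect = {pooled:.2f} (SE {std_error:.2f}): {label(pooled)}")
```

With only a handful of small studies, the standard error stays wide, which is one reason so many reviews end by calling for more research.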
Hernandez-Reif emphasized the importance of conducting a good clinical study with groups that are comparable. If, after treatment, “the mean goes from 8 to 4, but the SD is 5 or 6, your effect might not be strong enough.” In this case, there is too much variability in the scores; the SD is too high. “You also have to compare findings of the massage group to the control group. Sometimes it’s just a placebo effect from participating in a study. They report less pain and are less anxious. You have to have a strong effect.”
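Her point about variability and the control group can be sketched with one common effect-size statistic, Cohen's d, which divides the difference between two group means by their pooled SD. The choice of Cohen's d is an assumption made here for illustration, and the group sizes and scores are hypothetical; only the "8 to 4" drop and the SD of 5 or 6 echo her example.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized difference between two group means (pooled SD)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


n = 20  # hypothetical participants per group

# Massage group ends at 4, control group stays at 8, scores are consistent.
consistent = cohens_d(4, 1.5, n, 8, 1.5, n)

# Same 4-point difference, but scores are spread out (SD of 5.5).
scattered = cohens_d(4, 5.5, n, 8, 5.5, n)

# Control group also improves (to 6), as a placebo effect might produce.
with_placebo = cohens_d(4, 5.5, n, 6, 5.5, n)

print(f"consistent scores:        d = {consistent:.2f}")
print(f"scattered scores:         d = {scattered:.2f}")
print(f"scattered, placebo drop:  d = {with_placebo:.2f}")
```

The raw improvement is identical in the first two cases, yet the standardized effect shrinks sharply once the scores are scattered, and it shrinks again when the control group improves too.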
The Road to Success
The shortage of high-quality CAM trials cannot be blamed solely on inadequate efforts by some researchers; it also stems from the existing research environment. While Ernst stated there is no easy answer, as failings are so diverse, he noted, “In principle, the problem, in my view, is lack of funds, lack of expertise and lack of rigor.” Berman’s comments further expand this exploration of the root cause. “Demonstration that any therapy (CAM or non-CAM) is effective and safe involves two steps. In Step 1, a therapy is selected for scientific investigation. Since there are limited resources, choices must be made among many therapies.2 For some CAM therapies, there is already widespread use, but little if any systematically collected data is available,” said Berman.
“In Step 2, a therapy that has appeared to be effective and safe in the first stages of investigation is formally tested in large numbers of patients using scientifically rigorous procedures. It is only through these large, very expensive (often $20 million or more), and very time-consuming (easily five years or longer) studies that the clinical value of a therapy can be scientifically proven. In the CAM field, I see us at Step 1. One might say the present use of many therapies has suggested they appear to be effective and safe. However, more remains to be learned about them through formal studies.”
So we return once again to the problem of “more research needed.” What we have learned is the importance of adhering to RCT protocol – providing randomization and controls, and utilizing appropriate data analysis. Funding is, of course, a whole other matter. But as results come in and funding sources envision the potential of generating income from proven benefits, this hurdle may also be overcome.