Wool and Nuts: U curve


Saturday, July 24, 2010

The China Study one more time: Are raw plant foods giving people cancer?

In this previous post I analyzed some data from the China Study that included counties where there were cases of schistosomiasis infection. Following one of Denise Minger’s suggestions, I removed all those counties from the data. I was left with 29 counties, a much smaller sample size. I then ran a multivariate analysis using WarpPLS (warppls.com), like in the previous post, but this time I used an algorithm that identifies nonlinear relationships between variables.

Below is the model with the results. (Click on it to enlarge. Use the "Ctrl" and "+" keys to zoom in, and "Ctrl" and "-" to zoom out.) As in the previous post, the arrows explore associations between variables. The variables are shown within ovals. The meaning of each variable is the following: aprotein = animal protein consumption; pprotein = plant protein consumption; cholest = total cholesterol; crcancer = colorectal cancer.

What is total cholesterol doing at the right part of the graph? It is there because I am analyzing the associations between animal protein and plant protein consumption with colorectal cancer, controlling for the possible confounding effect of total cholesterol.

I am not hypothesizing anything regarding total cholesterol, even though this variable is shown as pointing at colorectal cancer. I am just controlling for it. This is the type of thing one can do in multivariate analyses; it is how you “control for the effect of a variable” in an analysis like this.
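To make "controlling for a variable" a bit more concrete, here is a minimal sketch in plain Python. It is not what WarpPLS does internally; it uses ordinary least squares and a standard residualization trick (remove the control variable's linear effect from both variables, then relate what is left). The data are synthetic and the names `x`, `y`, `z` are made up for illustration: `z` plays the role of the control variable, and by construction `x` has no effect of its own on `y`.

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """Least-squares slope of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def controlled_slope(x, y, z):
    """Slope of y on x after the linear effect of z is removed from both."""
    return slope(residuals(z, x), residuals(z, y))

# Synthetic data (an assumption for illustration): z drives both x and y,
# and x has NO direct effect on y.
random.seed(0)
z = [random.gauss(0, 1) for _ in range(500)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [2 * zi + random.gauss(0, 1) for zi in z]

naive = slope(x, y)                   # picks up the confounding through z
adjusted = controlled_slope(x, y, z)  # close to zero once z is controlled for
print("naive slope:", round(naive, 2))
print("slope controlling for z:", round(adjusted, 2))
```

The naive slope looks like a solid association, while the controlled slope shrinks toward zero. That is the sense in which an association can be reported as "not confounded" by a variable that was included in the model.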

Since the sample is fairly small, we end up with insignificant beta coefficients that would normally be statistically significant with a larger sample. But it helps that we are using nonparametric statistics, because they are still robust in the presence of small samples and deviations from normality. Also, the nonlinear algorithm is more sensitive to relationships that do not fit a classic linear pattern. We can summarize the findings as follows:

- As animal protein consumption increases, plant protein consumption decreases significantly (beta=-0.36; P<0.01). This is to be expected and helpful in the analysis, as it somewhat differentiates animal protein consumers from plant protein consumers. Those folks who got more of their protein from animal foods tended to get significantly less protein from plant foods.

- As animal protein consumption increases, colorectal cancer decreases, but not in a statistically significant way (beta=-0.31; P=0.10). The beta here is certainly high, and a P of 0.10 means that an association this strong would show up by chance only about 10 percent of the time if there were no real relationship, even with such a small sample.

- As plant protein consumption increases, colorectal cancer increases significantly (beta=0.47; P<0.01). The small sample size was not enough to make this association insignificant. The reason is that the distribution pattern of the data here is very indicative of a real association, which is reflected in the low P value.
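The P values in the list above come from a nonparametric analysis of only 29 counties. As a rough illustration of how a P value can be obtained without assuming normality, here is a sketch of a permutation test in plain Python. This is a generic technique, not necessarily the resampling method WarpPLS uses, and the two 29-point samples are synthetic stand-ins, not the China Study data.

```python
import random

def pearson_r(x, y):
    """Plain linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p(x, y, n_perm=5000, seed=1):
    """Two-sided P value: share of random shuffles of y whose |r| is at
    least as large as the |r| actually observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Synthetic samples of 29 points each (hypothetical, for illustration only).
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(29)]
y_related = [0.8 * xi + rng.gauss(0, 0.6) for xi in x]   # real association
y_unrelated = [rng.gauss(0, 1) for _ in range(29)]       # no association

p_related = permutation_p(x, y_related)
p_unrelated = permutation_p(x, y_unrelated)
print("P with a real association:", p_related)
print("P with no association:", p_unrelated)
```

The point is only that a strong enough pattern can yield a low P value even with 29 data points, because the test asks how often random reshuffling would produce a pattern that strong.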

Remember, these results are not confounded by schistosomiasis infection, because we are only looking at counties where there were no cases of schistosomiasis infection. These results are not confounded by total cholesterol either, because we controlled for that possible confounding effect. Now, control variable or not, you would be correct to point out that the association between total cholesterol and colorectal cancer is high (beta=0.58; P=0.01). So let us take a look at the shape of that association:

Does this graph remind you of the one in this post, the one with several U curves? Yes. And why is that? Maybe it reflects a tendency among the folks who had low cholesterol to have more cancer because the body needs cholesterol to fight disease, and cancer is a disease. And maybe it reflects a tendency among the folks who have high total cholesterol to do so because total cholesterol (and particularly its main component, LDL cholesterol) is in part a marker of disease, and cancer is often a culmination of various metabolic disorders (e.g., the metabolic syndrome) that are nothing but one disease after another.

To believe that total cholesterol causes colorectal cancer is nonsensical because total cholesterol is generally increased by consumption of animal products, of which animal protein consumption is a proxy. (In this reduced dataset, the linear univariate correlation between animal protein consumption and total cholesterol is a significant and positive 0.36.) And animal protein consumption seems to be protective against colorectal cancer in this dataset (negative association on the model graph).

Now comes the part that I find the most ironic about this whole discussion in the blogosphere that has been going on recently about the China Study; and the answer to the question posed in the title of this post: Are raw plant foods giving people cancer? If you think that the answer is “yes”, think again. The variable that is strongly associated with colorectal cancer is plant protein consumption.

Do fruits, veggies, and other plant foods that can be consumed raw have a lot of protein?

With a few exceptions, like nuts, they do not. Most raw plant foods have trace amounts of protein, especially when compared with foods made from refined grains and seeds (e.g., wheat grains, soybean seeds). So the contribution of raw fruits and veggies in general could not have had much influence on the plant protein consumption variable. To put this in perspective, the average plant protein consumption per day in this dataset was 63 g; even if they were eating 30 bananas a day, the study participants would not get half that much protein from bananas.

Refined foods made from grains and seeds are made from those plant parts that the plants absolutely do not “want” animals to eat. They are the plants’ “children” or “children’s nutritional reserves”, so to speak. This is why they are packed with nutrients, including protein and carbohydrates, but also often toxic and/or unpalatable to animals (including humans) when eaten raw.

But humans are so smart; they learned how to industrially refine grains and seeds for consumption. The resulting human-engineered products (usually engineered to sell as many units as possible, not to make you healthy) normally taste delicious, so you tend to eat a lot of them. They also tend to raise blood sugar to abnormally high levels, because industrial refining makes their high carbohydrate content easily digestible. Refined foods made from grains and seeds also tend to cause leaky gut problems, and autoimmune disorders like celiac disease. Yep, we humans are really smart.

Thanks again to Dr. Campbell and his colleagues for collecting and compiling the China Study data, and to Ms. Minger for making the data available in easily downloadable format and for doing some superb analyses herself.

Sunday, May 9, 2010

Long distance running causes heart disease, unless it doesn’t

Regardless of type of exercise, disease markers are generally associated with intensity of exertion over time. This association follows a J-curve pattern. Do too little of it, and you have more disease; do too much, and incidence of disease goes up. There is always an optimal point, for each type of exercise and marker. A J curve is actually a U curve, with a shortened left end. The reason for the shortened left end is that, when measurements are taken, usually more measures fall on the right side of the curve than on the left.

The figure below (click to enlarge) shows a schematic representation that illustrates this type of relationship. (I am not very good at drawing.) Different individuals have different curves. If the vertical axis was a measure of health, as opposed to disease, then the curve would have the shape of an inverted J.
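For those who prefer numbers to drawings, the J-curve idea can be sketched in a few lines of Python. The function and its parameters (`optimum`, `floor`, `steepness`) are entirely hypothetical; they do not come from any study. The sketch just shows a U curve whose measurements fall mostly on the right side of the optimum, which is what makes it look like a J.

```python
def disease_risk(exertion, optimum=3.0, floor=1.0, steepness=0.5):
    """Hypothetical U-shaped risk: lowest at `optimum`, rising on both sides."""
    return floor + steepness * (exertion - optimum) ** 2

# Most measurement points sit to the right of the optimum (3), so the
# left arm of the U is short and the curve looks like a J.
levels = [1, 2, 3, 4, 5, 6, 7, 8]
risk = [disease_risk(e) for e in levels]

best = levels[risk.index(min(risk))]   # exertion level with the lowest risk
health = [-r for r in risk]            # flip the axis: an inverted J

# Crude text plot: one row per exertion level.
for e, r in zip(levels, risk):
    print(e, "#" * round(r * 2))
print("optimal exertion level:", best)
```

Different individuals would have different values for the optimum and the steepness, which is another way of saying that different individuals have different curves.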

The idea that long distance running causes heart disease has been around for a while. Is it correct?

If it is, then one would expect to see certain things. For example, let’s say you take a group of long distance runners who have been doing that for a while, ideally runners above age 50. That is when heart disease becomes more frequent. This would also capture more experienced runners, with enough running experience to cause some serious damage. Let us say you measured markers of heart disease before and after a grueling long distance race. What would you see?

If long distance running causes heart disease, you would see a significant proportion with elevated markers of heart disease among the runners at baseline (i.e., before the race). After all, running is causing a cumulative problem. The levels of those markers would be correlated with practice, or participation in previous races, since the races are causing the damage. Also, you would see a uniformly bad increase in the markers after the race, as the running is messing up everybody more or less equally.

Sahlén and colleagues (2009), a group of Swedish researchers, studied males and females aged 55 or older who participated in a 30-km (about 19-mile) cross-country race. The full reference to the article is at the end of this post. The researchers included only runners who had no diagnosed medical disorders in their study. They collected data on the patterns of exercise prior to the race, and participation in previous races. Blood was taken before and after the race, and several measurements were obtained, including measurements of two possible heart disease markers: N-terminal pro-brain natriuretic peptide (NT-proBNP), and troponin T (TnT). The table below (click to enlarge) shows several of those measurements before and after the race.

We can see that NT-proBNP and TnT increased significantly after the race. So did creatinine, a byproduct of breakdown in muscle tissue of creatine phosphate; something that you would expect after such a grueling race. Yep, long distance running increases NT-proBNP and TnT, so it leads to heart disease, right?

Wait, not so fast!

NT-proBNP and TnT levels usually increase after endurance exercise, something that is noted by the authors in their literature review. But those levels do not stay elevated for too long after the race. Permanent elevation is a sign of a problem. Excessive elevation during the race is also a sign of a potential problem.

Now, here is something interesting. Look at the table below, showing the variations grouped by past participation in races.

The increases in NT-proBNP and TnT are generally lower in those individuals that participated in 3 to 13 races in the past. They are higher for the inexperienced runners, and, in the case of NT-proBNP, particularly for those with 14 or more races under their belt (the last group on the right). The baseline NT-proBNP is also significantly higher for that group. They were older too, but not by much.

Can you see a possible J-curve pattern?

Now look at this table below, which shows the results of a multiple regression analysis on its right side. Look at the last column on the right, the beta coefficients. They are all significant, but the first is .81, which is quite high for a standardized partial regression coefficient. It refers to an almost perfect relationship between the log of NT-proBNP increase and the log of baseline NT-proBNP. (The log transformations reflect the nonlinear relationships between NT-proBNP, a fairly sensitive health marker, and the other variables.)

In a multiple regression analysis, the effect of each independent variable (i.e., each predictor) on the dependent variable (the log of NT-proBNP increase) is calculated controlling for the effects of all the other independent variables on the dependent variable. Thus, what the table above is telling us is that baseline NT-proBNP predicts NT-proBNP increase almost perfectly, even when we control for age, creatinine increase, and race duration (i.e., amount of time a person takes to complete the race).

Again, even when we control for: AGE, creatinine increase, and RACE DURATION.

In other words, baseline NT-proBNP is what really matters; not even age makes that much of a difference. But baseline NT-proBNP is NEGATIVELY correlated with number of previous races. The only exception is the group that participated in 14 or more previous races. Maybe that was too much for them.
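Here is a small sketch of what a standardized partial regression coefficient on log-transformed variables looks like in practice. It is not a reanalysis of the Sahlén data: the numbers are synthetic, the variable names are made up, and only two predictors are used so that the well-known closed-form expression for two-predictor standardized betas can be applied in plain Python.

```python
import math
import random

def pearson_r(x, y):
    """Plain linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def standardized_betas(x1, x2, y):
    """Standardized partial regression coefficients for y ~ x1 + x2,
    using the closed form that exists for exactly two predictors."""
    r_y1, r_y2, r_12 = pearson_r(x1, y), pearson_r(x2, y), pearson_r(x1, x2)
    denom = 1 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom

# Synthetic runners (hypothetical values): the marker increase depends
# on the baseline marker level, and on age only through the baseline.
rng = random.Random(7)
n = 200
age = [rng.uniform(55, 75) for _ in range(n)]
baseline = [math.exp(rng.gauss(4.0, 0.6) + 0.02 * (a - 55)) for a in age]
increase = [b ** 0.9 * math.exp(rng.gauss(0, 0.25)) for b in baseline]

# Log transformations, as in the table discussed above.
log_baseline = [math.log(v) for v in baseline]
log_increase = [math.log(v) for v in increase]

beta_baseline, beta_age = standardized_betas(log_baseline, age, log_increase)
print("beta for log baseline:", round(beta_baseline, 2))
print("beta for age:", round(beta_age, 2))
```

In data built this way, the beta for the baseline comes out high while the beta for age stays near zero once the baseline is controlled for, which is the kind of pattern described above.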

Okay, one more table. This one, included below, shows regression analyses between a few predictors and the main dependent variable, which in this case is TnT elevation. No surprises here based on the discussion so far. Look at the left part, the column labeled as “B”. Those are correlation coefficients, varying from -1 to 1. Which is the predictor with the highest absolute correlation with TnT elevation? It is number of previous races, but the correlation is, again, NEGATIVE.

In follow-up tests after the race, 9 out of the 185 participants (4.9 percent) showed more decisive evidence of heart disease. One of those died while training a few months after the race. An autopsy was conducted showing abnormal left ventricular hypertrophy with myocardial fibrosis, coronary artery narrowing, and an old myocardial scar.

Who were the 9 lucky ones? You guessed it. Those were the ones who had the largest increases in NT-proBNP during the race. And large increases in NT-proBNP were more common among the runners who were too inexperienced or too experienced. The ones at the extremes.

So, here is a summary of what this study is telling us:

- The 30-km cross-country race studied is no doubt a strenuous activity. So if you have not exercised in years, perhaps you should not start with this kind of race.

- By and large, individuals who had elevated markers of heart disease prior to the race also had the highest elevations of those markers after the race.

- Participation in past races was generally protective, likely due to compensatory body adaptations, with the exception of those who did too much of that.

- Prevalence of heart disease among the runners was measured at 4.9 percent. This does not beat even the mildly westernized Inuit, but certainly does not look so bad considering that the general prevalence of ischemic heart disease in the US and Sweden is about 6.8 percent.

It seems reasonable to conclude that long distance running may be healthy, unless one does too much of it. The ubiquitous J-curve pattern again.

How much is too much? It certainly depends on each person’s particular health condition, but the bar seems to be somewhat high on average: participation in 14 or more previous 30-km races.

As for the 4.9 percent prevalence of heart disease among runners, maybe it is caused by something else, and endurance running may actually be protective, as long as it is not taken to extremes. Maybe that something else is a diet rich in refined carbohydrates and sugars, or psychological stress caused by modern life, or a combination of both.

Just for the record, I don’t do endurance running. I like walking, sprinting, moderate resistance training, and also a variety of light aerobic activities that involve some play. This is just a personal choice; nothing against endurance running.

Mark Sisson was an accomplished endurance runner; now he does not like it very much. (See his excellent book The Primal Blueprint.) Arthur De Vany is not a big fan of endurance running either.

Still, maybe the Tarahumara and hunter-gatherer groups who practice persistence hunting are not such huge exceptions among humans after all.

Reference:

Sahlén, A., Gustafsson, T.P., Svensson, J.E., Marklund, T., Winter, R., Linde, C., & Braunschweig, F. (2009). Predisposing Factors and Consequences of Elevated Biomarker Levels in Long-Distance Runners Aged >55 Years. The American Journal of Cardiology, 104(10), 1434–1440.

Saturday, December 19, 2009

Total cholesterol and cardiovascular disease: A U-curve relationship

The hypothesis that blood cholesterol levels are positively correlated with heart disease (the lipid hypothesis) dates back to Rudolf Virchow in the mid-1800s.

One famous study that supported this hypothesis was Ancel Keys's Seven Countries Study, conducted between the 1950s and 1970s. This study eventually served as the foundation on which much of the advice that we receive today from doctors is based, even though several other studies have been published since that provide little support for the lipid hypothesis.

The graph below (source: canibaisereis.com, with many thanks to O Primitivo) shows the results of one study, involving many more countries than Keys's Seven Countries Study, that actually suggests a NEGATIVE linear correlation between total cholesterol and cardiovascular disease.

Now, most relationships in nature are nonlinear, with quite a few following a pattern that looks like a U-curve (plain or inverted); sometimes called a J-curve pattern. The graph below (source also: canibaisereis.com) shows the U-curve relationship between total cholesterol and mortality, with cardiovascular disease mortality indicated through a dotted red line at the bottom.

This graph has been obtained through a nonlinear analysis, and I think it provides a better picture of the relationship between total cholesterol (TC) and mortality. Based on this graph, the best range of TC that one can be at is somewhere between 210, where cardiovascular disease mortality is minimized; and 220, where total mortality is minimized.

The total mortality curve is the one indicated through the full blue line at the top. In fact, it suggests that mortality increases sharply as TC decreases below 200.
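Reading the bottom of a U curve off a graph can be done a bit more systematically. One simple way, sketched below in Python, is to take three points on the curve and compute the vertex of the parabola that passes through them. The three (TC, mortality) pairs are hypothetical values invented for illustration; they were not read from the graph above, and a real nonlinear analysis would use all of the data rather than three points.

```python
def parabola_vertex_x(p1, p2, p3):
    """x position of the vertex of the parabola through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    if den == 0:
        raise ValueError("points are collinear; there is no vertex")
    return x2 - 0.5 * num / den

# Hypothetical (TC, mortality) readings from a U-shaped curve.
points = [(180, 661.25), (220, 601.25), (260, 701.25)]

best_tc = parabola_vertex_x(*points)
print("TC at the bottom of the U:", best_tc)
```

With points that are lower in the middle than at both ends, the vertex is the minimum of the curve, i.e., the TC level at which the (hypothetical) mortality is lowest.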

Now, these graphs relate TC with disease and mortality, and say nothing about LDL cholesterol (LDL). In my own experience, and that of many people I know, a TC of about 200 will typically be associated with a slightly elevated LDL (e.g., 110 to 150), even if one has a high HDL cholesterol (i.e., greater than 60).

Yet, most people who have an LDL greater than 100 will be told by their doctors, usually with the best of intentions, to take statins, so that they can "keep their LDL under control". (LDL levels are usually calculated, not measured directly, which itself creates a whole new set of problems.)

Alas, reducing LDL to 100 or less will typically reduce TC below 200. If we go by the graphs above, especially the one showing the U-curves, these folks' risk for cardiovascular disease and mortality will go up - exactly the opposite effect that they and their doctors expected. And that will cost them financially as well, as statin drugs are expensive, in part to pay for all those TV ads.