
Monday, December 24, 2012

The 2012 Atherosclerosis egg study: More smoking is associated with more plaque, unless you eat more eggs

I blogged before about the study by David Spence and colleagues, published online in July 2012 in the journal Atherosclerosis (). This study attracted a lot of media attention (e.g., ). The article is titled: “Egg yolk consumption and carotid plaque”. The study argues that “regular consumption of egg yolk should be avoided by persons at risk of cardiovascular disease”. It hints at egg yolks being unhealthy in general, possibly even more so than cigarettes.

I used the numbers in Table 2 of the article (only 5 rows of data, one per quintile; i.e., N=5) to conduct a type of analysis that is rarely if ever conducted in health studies – a moderating effects analysis. A previous blog post summarizes the results of one such analysis using WarpPLS (). It looked into the effect of the number of eggs consumed per week on the association between blood LDL cholesterol and plaque (carotid plaque). The conclusion, which is admittedly tentative due to the small sample (N=5), was that at a consumption of 2.3 eggs per week or more, plaque decreased as LDL cholesterol increased ().

Recently I ran an analysis on the moderating effect of number of eggs consumed per week on the association between cumulative smoking (measured in “pack years”) and plaque. As it turns out, if you fit a 3D surface to the five data points that you get for these three variables from Table 2 of the article, you end up with a relatively smooth surface. Below is a 3D plot of the 5 data points, followed by a best-fitting 3D surface (developed using an experimental algorithm).





Based on this best-fitting surface you could then generate a contour graph, shown below. The “lines” are called “isolines”. Each isoline connects combinations of eggs per week and cumulative smoking for which the plaque value is constant. Next to the isolines are the corresponding plaque values. The first impression is indeed that both egg consumption and smoking are causing plaque buildup, as plaque clearly increases as one moves toward the top-right corner of the graph.



But focus your attention on each individual isoline, one at a time. It is clear that plaque remains constant for increases in cumulative smoking, as long as egg consumption increases. Take for example the isoline that refers to 120 mm2 of plaque area. An increase in cumulative smoking from about 14.5 to 16 pack years leads to no increase in plaque if egg consumption goes up from about 2 to 2.3 eggs per week.
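
For readers curious about how a surface like this can be fitted and its isolines drawn, below is a generic sketch in Python. The five data points are made up for illustration (they are not the Table 2 numbers), and the thin-plate-spline fit is a stand-in for the experimental algorithm mentioned above, not a reproduction of it.

# Generic sketch: fit a smooth surface to a handful of (eggs/week,
# pack-years, plaque) points and draw isolines of constant plaque.
# All five points below are hypothetical, for illustration only.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import Rbf

eggs = np.array([1.0, 1.8, 2.3, 2.9, 3.5])            # eggs per week (hypothetical)
packs = np.array([10.0, 12.0, 14.5, 16.0, 18.0])      # pack-years (hypothetical)
plaque = np.array([60.0, 90.0, 120.0, 150.0, 180.0])  # plaque area, mm2 (hypothetical)

# Thin-plate-spline radial basis surface passing through the five points
surface = Rbf(eggs, packs, plaque, function="thin_plate")

# Evaluate the surface on a grid and plot contours of constant plaque
eg, pk = np.meshgrid(np.linspace(eggs.min(), eggs.max(), 100),
                     np.linspace(packs.min(), packs.max(), 100))
contours = plt.contour(eg, pk, surface(eg, pk), levels=10)
plt.clabel(contours, inline=True, fontsize=8)
plt.xlabel("Eggs per week")
plt.ylabel("Cumulative smoking (pack-years)")
plt.title("Isolines of constant plaque area (illustrative)")
plt.show()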

These within-isoline trends, which are fairly stable across isolines (they are all slanted to the right), clearly contradict the idea that eggs cause plaque buildup. So, why does plaque buildup seem to clearly increase with egg consumption? Here is a good reason: egg consumption is very strongly correlated with age, and plaque increases with age. The correlation is a whopping 0.916. And I am not talking about cumulative egg consumption, which the authors also measure, through a variable called “egg-yolk years”. No, I am talking about eggs per week. In this dataset, older folks were eating more eggs, period.

The correlation between plaque and age is even higher: 0.977. Given this, it makes sense to look at individual isolines. This would be analogous to what biostatisticians often call “adjusting for age”, or analyzing the effect of egg consumption on plaque buildup “keeping age constant”. A different technique is to “control for age”; that technique would have been preferable had the correlations been lower (say, below 0.7), because collinearity would then have remained within acceptable levels.

The underlying logic of the “keeping age constant” technique is fairly sound in the face of such a high correlation, which would make “controlling for age” very difficult due to collinearity. When we “keep age constant”, the results point at egg consumption being protective among smokers.
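
To make the collinearity concern concrete, here is a minimal back-of-the-envelope calculation using only the correlation reported above. The variance inflation factor formula for the two-predictor case is standard; the threshold of 5 mentioned in the comments is a common rule of thumb, not something taken from the article.

# With two predictors, the variance inflation factor (VIF) for egg
# consumption is 1 / (1 - r^2), where r is its correlation with age.
r_eggs_age = 0.916                       # correlation reported above
vif = 1.0 / (1.0 - r_eggs_age ** 2)
print(f"VIF for eggs/week with age as the other predictor: {vif:.1f}")
# Prints about 6.2, above the rule-of-thumb threshold of 5 that many
# analysts use, which is why "keeping age constant" (stratifying) is
# the safer route with this dataset.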

But diehard fans of the idea that eggs are unhealthy could explain the results differently. Maybe egg consumption causes plaque to go up, but smoking has a protective effect. Again taking the isoline that refers to 120 mm2 of plaque area, these diehard fans could say that an increase in egg consumption from 2 to 2.3 eggs per week leads to no increase in plaque if cumulative smoking goes up from about 14.5 to 16 pack years.

Not too long ago I also blogged about a medical case study of a man who ate approximately 25 eggs (20 to 30) per day for over 15 years (probably well over), was almost 90 years old (88) when the case was published in the prestigious The New England Journal of Medicine, and was in surprisingly good health (). This man was not a smoker.

Perhaps if this man smoked 25 cigarettes per day, and ate no eggs, he would be in even better health eh!?

Monday, October 1, 2012

The anatomy of a VAP test report

The vertical auto profile (VAP) test is an enhanced lipid profile test. It has been proposed, chiefly by the company Atherotech (), as a more complete test that directly measures lipid fractions that standard panels only estimate through calculation. The VAP test is particularly known for providing direct measurements of LDL cholesterol, instead of calculating them through equations ().

At the time of this writing, a typical VAP test report would provide direct measures of the cholesterol content of LDL, Lp(a), IDL, HDL, and VLDL particles. It would also provide additional measures referred to as secondary risk factors, notably particle density patterns and apolipoprotein concentrations. Finally, it would provide a customized risk summary and some basic recommendations for treatment. Below is the top part of a typical VAP test report (from Atherotech), showing measures of the cholesterol content of various particles. LDL cholesterol is combined for four particle subtypes, the small-dense subtypes 4 and 3, and the large-buoyant subtypes 2 and 1. A breakdown by LDL particle subtype is provided later in the VAP report.



In the table above, HDL cholesterol is categorized into two subtypes, the large-buoyant subtype 2 and the small-dense subtype 3. Interestingly, most of the HDL cholesterol in the table is supposedly of the least protective subtype, which seems to be a common finding in the general population. VLDL cholesterol is categorized in a similar way. IDL stands for intermediate-density lipoprotein; this is essentially a VLDL particle that has given off some of its content, particularly its triglyceride (or fat) cargo, but still remains in circulation.

Lp(a) is a special subtype of the LDL particle that is purported to be markedly atherogenic. Mainstream medicine generally considers Lp(a) particles themselves to be atherogenic, which is highly debatable. Among other things, cardiovascular disease (CVD) risk and Lp(a) concentration follow a J-curve pattern, and Lp(a)’s range of variation in humans is very large. A blog post by Peter (Hyperlipid) has a figure right at the top that illustrates the former J-curve assertion (). The latter fact, related to range of variation, generally leads to a rather wide normal distribution of Lp(a) concentrations in most populations; meaning that a large number of individuals tend to fall outside Lp(a)’s optimal range and still have a low risk of developing CVD.

Below is the middle part of a typical VAP report, showing secondary risk factors, such as particle density patterns and apolipoprotein concentrations. LDL particle pattern A is considered to be the most protective, supposedly because large-buoyant LDL particles are less likely to penetrate the endothelial gaps, which are about 25 nm in diameter. Apolipoproteins are proteins that bind to fats for their transport in lipoproteins, to be used by various tissues for energy; free fatty acids also need to bind to proteins, notably albumin, to be transported to tissues for use as energy. Redundant particles and processes are everywhere in the human body!



Below is the bottom part of a typical VAP report, providing a risk summary and some basic recommendations. One of the recommendations is “to lower” the LDL target from 130 mg/dL to 100 mg/dL due to the presence of the checked emerging risk factors on the right, under “Considerations”. What that usually means in practice is a recommendation to take drugs, especially statins, to reduce LDL cholesterol levels. A recent post here and the discussion under it suggest that this would be a highly questionable recommendation in the vast majority of cases ().



What do I think about VAP tests? I think that they are useful in that they provide a lot more information about one’s lipids than standard lipid profiles, and more information is better than less. On the other hand, I think that people should be very careful about what they do with that information. There are even more direct tests that I would recommend before a decision to take drugs is made (, ), if that decision is ever made at all.

Monday, November 28, 2011

Triglycerides, VLDL, and industrial carbohydrate-rich foods

Below are the coefficients of association calculated by HealthCorrelator for Excel (HCE) for user John Doe. The coefficients of association are calculated as linear correlations in HCE (). The focus here is on the associations between fasting triglycerides and various other variables. Take a look at the coefficient of association at the top, with VLDL cholesterol, indicated with a red arrow. It is a very high 0.999.


Whoa! What is this – 0.999! Is John Doe a unique case? No, this strong association between fasting triglycerides and VLDL cholesterol is a very common pattern among HCE users. The reason is simple. VLDL cholesterol is not normally measured directly, but typically calculated based on fasting triglycerides, by dividing the fasting triglycerides measurement by 5. And there is an underlying reason for that - fasting triglycerides and VLDL cholesterol are actually very highly correlated, based on direct measurements of these two variables.

But if VLDL cholesterol is calculated based on fasting triglycerides (VLDL cholesterol  = fasting triglycerides / 5), how come the correlation is 0.999, and not a perfect 1? The reason is the rounding error in the measurements. Whenever you see a correlation this high (i.e., 0.999), it is reasonable to suspect that the source is an underlying linear relationship disturbed by rounding error.
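
A quick simulation illustrates the point; the triglyceride values below are randomly generated, not John Doe's data.

# Why the correlation is 0.999 rather than exactly 1: VLDL cholesterol
# is reported as fasting triglycerides / 5, rounded to whole mg/dl.
import numpy as np

rng = np.random.default_rng(42)
trig = rng.uniform(40, 300, size=200)   # simulated fasting triglycerides, mg/dl
vldl = np.round(trig / 5.0)             # lab-style rounded VLDL cholesterol estimate

r = np.corrcoef(trig, vldl)[0, 1]
print(f"Correlation between triglycerides and rounded TG/5: {r:.6f}")
# Prints a value just below 1, mirroring the HCE correlations above.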

Fasting triglycerides are probably the most useful measures on standard lipid panels. For example, fasting triglycerides below 70 mg/dl suggest an LDL particle pattern that is predominantly large and buoyant. This pattern is associated with a low incidence of cardiovascular disease (). Also, chronically high fasting triglycerides are a well known marker of the metabolic syndrome, and a harbinger of type 2 diabetes.

Where do large and buoyant LDL particles come from? They frequently start as "big" (relatively speaking) blobs of fat, which are actually VLDL particles. The photo is from the excellent book by Elliott & Elliott (); it shows, on the same scale: (a) VLDL particles, (b) chylomicrons, (c) LDL particles, and (d) HDL particles. The dark bar at the bottom of each shot is 1000 A in length, or 100 nm (A = angstrom; nm = nanometer; 1 nm = 10 A).


If you consume an excessive amount of carbohydrates, my theory is that your liver will produce an abnormally large number of small VLDL particles (also shown on the photo above), a proportion of which will end up as small and dense LDL particles. The liver will do that relatively quickly, probably as a short-term compensatory mechanism to avoid glucose toxicity. It will essentially turn excess glucose, from excess carbohydrates, into fat. The VLDL particles carrying that fat in the form of triglycerides will be small because the liver will be in a hurry to clear the excess glucose in circulation, and will have no time to produce large particles, which take longer to produce individually.

This will end up leading to excess triglycerides hanging around in circulation, long after they should have been used as sources of energy. High fasting triglycerides will be a reflection of that. The graphs below, also generated by HCE for John Doe, show how fasting triglycerides and VLDL cholesterol vary in relation to refined carbohydrate consumption. The two graphs are almost, but not exactly, identical in shape; again, the small difference is due to rounding error.



Small and dense LDL particles, in the presence of other factors such as systemic inflammation, will contribute to the formation of atherosclerotic plaques. Again, the main source of these particles would be an excessive amount of carbohydrates. What is an excessive amount of carbohydrates? Generally speaking, it is an amount beyond your liver’s capacity to convert the resulting digestion byproducts, fructose and glucose, into liver glycogen. This may come from spaced consumption throughout the day, or acute consumption in an unnatural form (a can of regular coke), or both.

Liver glycogen is sugar stored in the liver. This is the main source of sugar for your brain. If your blood sugar levels become too low, your brain will get angry. Eventually it will go from angry to dead, and you will finally find out what awaits you in the afterlife.

Should you be a healthy athlete who severely depletes liver glycogen stores on a regular basis, you will probably have an above average liver glycogen storage and production capacity. That will be a result of long-term compensatory adaptation to glycogen depleting exercise (). As such, you may be able to consume large amounts of carbohydrates, and you will still not have high fasting triglycerides. You will not carry a lot of body fat either, because the carbohydrates will not be converted to fat and sent into circulation in VLDL particles. They will be used to make liver glycogen.

In fact, if you are a healthy athlete who severely depletes liver glycogen stores on a regular basis, excess calories will be just about the only thing that will contribute to body fat gain. Your threshold for “excess” carbohydrates will be so high that you will feel like the whole low carbohydrate community is not only misguided but also part of a conspiracy against people like you. If you are also an aggressive blog writer, you may feel compelled to tell the world something like this: “Here, I can eat 300 g of carbohydrates per day and maintain single-digit body fat levels! Take that you low carbohydrate idiots!”

Let us say you do not consume an excessive amount of carbohydrates; again, what is excessive or not varies, probably dramatically, from individual to individual. In this case your liver will produce a relatively small number of fat VLDL particles, which will end up as large and buoyant LDL particles. The fat in these large VLDL particles will likely not come primarily from conversion of glucose and/or fructose into fat (i.e., de novo lipogenesis), but from dietary sources of fat.

How do you avoid consuming excess carbohydrates? A good way of achieving that is to avoid man-made carbohydrate-rich foods. Another is adopting a low carbohydrate diet. Yet another is to become a healthy athlete who severely depletes liver glycogen stores on a regular basis; then you can eat a lot of bread, pasta, doughnuts and so on, and keep your fingers crossed for the future.

Either way, fasting triglycerides will be strongly correlated with VLDL cholesterol, because VLDL particles contain both triglycerides (“encapsulated” fat, not to be confused with “free” fatty acids) and cholesterol. If a large number of VLDL particles are produced by one’s liver, the person’s fasting triglycerides reading will be high. If a small number of VLDL particles are produced, even if they are fat particles, the fasting triglycerides reading will be relatively low. Neither VLDL cholesterol nor fasting triglycerides will be zero though.

Now, you may be wondering, how come a small number of fat VLDL particles will eventually lead to low fasting triglycerides? After all, they are fat particles, even though they occur in fewer numbers. My hypothesis is that having a large number of small-dense VLDL particles in circulation is an abnormal, unnatural state, and that our body is not well designed to deal with that state. Use of lipoprotein-bound fat as a source of energy in this state becomes somewhat less efficient, leading to high triglycerides in circulation; and also to hunger, as our mitochondria like fat.

This hypothesis, and the theory outlined above, fit well with the numbers I have been seeing for quite some time from HCE users. Note that it is a bit different from the more popular theory, particularly among low carbohydrate writers, that fat is force-stored in adipocytes (fat cells) by insulin and not released for use as energy, also leading to hunger. What I am saying here, which is compatible with this more popular theory, is that lipoproteins, like adipocytes, also end up holding more fat than they should if you consume excess carbohydrates, and for longer.

Want to improve your health? Consider replacing things like bread and cereal with butter and eggs in your diet (). And also go see your doctor (); if he disagrees with this recommendation, ask him to read this post and explain why he disagrees.

Monday, September 12, 2011

Fasting blood glucose of 83 mg/dl and heart disease: Fact and fiction

If you are interested in the connection between blood glucose control and heart disease, you have probably done your homework. This is a scary connection, and sometimes the information on the Internetz makes people even more scared. You have probably seen something to this effect mentioned:
Heart disease risk increases in a linear fashion as fasting blood glucose rises beyond 83 mg/dl.
In fact, I have seen this many times, including on some very respectable blogs. I suspect it started with one blogger, and then got repeated over and over again by others; sometimes things become “true” through repetition. Frequently the reference cited is a study by Brunner and colleagues, published in Diabetes Care in 2006. I doubt very much the bloggers in question actually read this article. Sometimes a study by Coutinho and colleagues is also cited, but this latter study is actually a meta-analysis.

So I decided to take a look at the Brunner and colleagues study. It covers, among other things, the relationship between cardiovascular disease (they use the acronym CHD for this), and 2-hour blood glucose levels after a 50-g oral glucose tolerance test (OGTT). They tested thousands of men at one point in time, and then followed them for over 30 years, which is really impressive. The graph below shows the relationship between CHD and blood glucose in mmol/l. Here is a calculator to convert the values to mg/dl.
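
If you prefer to do the conversion yourself, the standard factor for glucose is about 18 (mg/dl = mmol/l × 18, from glucose's molar mass of roughly 180 g/mol); a minimal snippet:

# Glucose unit conversion for the mmol/l values in the Brunner et al. graphs.
def mmol_l_to_mg_dl(glucose_mmol_l: float) -> float:
    return glucose_mmol_l * 18.0

for g in (3.5, 5.5, 6.7):
    print(f"{g} mmol/l is about {mmol_l_to_mg_dl(g):.0f} mg/dl")
# 3.5 -> 63, 5.5 -> 99 (rounded to 100 below), 6.7 -> 121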


The authors note in the limitations section that: “Fasting glucose was not measured.” So these results have nothing to do with fasting glucose, as we are led to believe when we see this study cited on the web. Also, in the abstract the authors say that there is “no evidence of nonlinearity”, but in the results section they say that the data provides “evidence of a nonlinear relationship”. The relationship sure looks nonlinear to me. I tried to approximate it manually below.


Note that CHD mortality really goes up more clearly after a glucose level of 5.5 mmol/l (100 mg/dl). But it also varies significantly more widely after that level; the magnitudes of the error bars reflect that. Also, you can see that at around 6.7 mmol/l (121 mg/dl), CHD mortality is on average about the same as at 5.5 mmol/l (100 mg/dl) and 3.5 mmol/l (63 mg/dl). This last level suggests an abnormally high insulin response, bringing blood glucose levels down too much at the 2-hour mark – i.e., reactive hypoglycemia, which the study completely ignores.

These findings are consistent with the somewhat chaotic nature of blood glucose variations in normoglycemic individuals, and also with evidence suggesting that average blood glucose levels go up with age in a J-curve fashion even in long-lived individuals.

We also know that traits vary along a bell curve for any population of individuals. Research results are often reported as averages, but the average individual does not exist. The average individual is an abstraction, and you are not it. Glucose metabolism is a complex trait, which is influenced by many factors. This is why there is so much variation in mortality for different glucose levels, as indicated by the magnitudes of the error bars.

In any event, these findings are clearly inconsistent with the statement that "heart disease risk increases in a linear fashion as fasting blood glucose rises beyond 83 mg/dl". The authors even state early in the article that another study based on the same dataset, to which theirs was a follow-up, suggested that:
…. [CHD was associated with levels above] a postload glucose of 5.3 mmol/l [95 mg/dl], but below this level the degree of glycemia was not associated with coronary risk.
Now, exaggerating the facts, to the point of creating fictitious results, may have a positive effect. It may scare people enough that they will actually check their blood glucose levels. Perhaps people will remove certain foods like doughnuts and jelly beans from their diets, or at least reduce their consumption dramatically. However, many people may find themselves with higher fasting blood glucose levels, even after removing those foods from their diets, as their bodies try to adapt to lower circulating insulin levels. Some may see higher levels after doing other things that are likely to improve their health in the long term. Others may see higher levels as they get older.

Many of the complications from diabetes, including heart disease, stem from poor glucose control. But it seems increasingly clear that blood glucose control does not have to be perfect to keep those complications at bay. For most people, blood glucose levels can be maintained within a certain range with the proper diet and lifestyle. You may be looking at a long life if you catch the problem early, even if your blood glucose is not always at 83 mg/dl (4.6 mmol/l). More on this in my next post.

Tuesday, September 28, 2010

Income, obesity, and heart disease in US states

The figure below combines data on median income by state (bottom-left and top-right) with a plot of heart disease death rates against the percentage of the population with a body mass index (BMI) greater than 30. The data are recent, and have been provided by CNN.com and creativeclass.com, respectively.


Heart disease deaths and obesity are strongly associated with each other, and both are inversely associated with median income. US states with lower median income tend to have generally higher rates of obesity and heart disease deaths.

The reasons are probably many, complex, and closely interconnected. Low income is usually associated with high rates of stress, depression, smoking, alcoholism, and poor nutrition. Compounding the problem, these are normally associated with consumption of cheap, addictive, highly refined foods.

Interestingly, this is primarily an urban phenomenon. If you were to use hunter-gatherers as your data sources, you would probably see the opposite relationship. For example, non-westernized hunter-gatherers have no income (at least not in the “normal” sense), but typically have a lower incidence of obesity and heart disease than mildly westernized ones. The latter have some income.

Tragically, the first few generations of fully westernized hunter-gatherers usually find themselves in the worst possible spot.

Sunday, September 12, 2010

The China Study II: Wheat flour, rice, and cardiovascular disease

In my last post on the China Study II, I analyzed the effect of total and HDL cholesterol on mortality from all cardiovascular diseases. The main conclusion was that total and HDL cholesterol were protective. Total and HDL cholesterol usually increase with intake of animal foods, and particularly of animal fat. The lowest mortality from all cardiovascular diseases was in the highest total cholesterol range, 172.5 to 180; and the highest mortality in the lowest total cholesterol range, 120 to 127.5. The difference was quite large; the mortality in the lowest range was approximately 3.3 times higher than in the highest.

This post focuses on the intake of two main plant foods, namely wheat flour and rice, and their relationships with mortality from all cardiovascular diseases. After many exploratory multivariate analyses, wheat flour and rice emerged as the plant foods with the strongest associations with mortality from all cardiovascular diseases. Moreover, wheat flour and rice have a strong and inverse relationship with each other, which suggests a “consumption divide”. Since the data is from China in the late 1980s, it is likely that consumption of wheat flour is even higher now. As you’ll see, this picture is alarming.

The main model and results

All of the results reported here are from analyses conducted using WarpPLS. Below is the model with the main results of the analyses. (Click on it to enlarge. Use the "CTRL" and "+" keys to zoom in, and the "CTRL" and "-" keys to zoom out.) The arrows explore associations between variables, which are shown within ovals. The meaning of each variable is the following: SexM1F2 = sex, with 1 assigned to males and 2 to females; MVASC = mortality from all cardiovascular diseases (ages 35-69); TKCAL = total calorie intake per day; WHTFLOUR = wheat flour intake (g/day); and RICE = rice intake (g/day).


The variables to the left of MVASC are the main predictors of interest in the model. The one to the right is a control variable – SexM1F2. The path coefficients (indicated as beta coefficients) reflect the strength of the relationships. A negative beta means that the relationship is negative; i.e., an increase in a variable is associated with a decrease in the variable that it points to. The P values indicate the statistical significance of the relationship; a P lower than 0.05 generally means a significant relationship (95 percent or higher likelihood that the relationship is “real”).

In summary, the model above seems to be telling us that:

- As rice intake increases, wheat flour intake decreases significantly (beta=-0.84; P<0.01). This relationship would be the same if the arrow pointed in the opposite direction. It suggests that there is a sharp divide between rice-consuming and wheat flour-consuming regions.

- As wheat flour intake increases, mortality from all cardiovascular diseases increases significantly (beta=0.32; P<0.01). This is after controlling for the effects of rice and total calorie intake. That is, wheat flour seems to have some inherent properties that make it bad for one’s health, even if one doesn’t consume that many calories.

- As rice intake increases, mortality from all cardiovascular diseases decreases significantly (beta=-0.24; P<0.01). This is after controlling for the effects of wheat flour and total calorie intake. That is, this effect is not entirely due to rice being consumed in place of wheat flour. Still, as you’ll see later in this post, this relationship is nonlinear. Excessive rice intake does not seem to be very good for one’s health either.

- Increases in wheat flour and rice intake are significantly associated with increases in total calorie intake (betas=0.25, 0.33; P<0.01). This may be due to wheat flour and rice intake: (a) being themselves, in terms of their own caloric content, main contributors to the total calorie intake; or (b) causing an increase in calorie intake from other sources. The former is more likely, given the effect below.

- The effect of total calorie intake on mortality from all cardiovascular diseases is insignificant when we control for the effects of rice and wheat flour intakes (beta=0.08; P=0.35). This suggests that neither wheat flour nor rice exerts an effect on mortality from all cardiovascular diseases by increasing total calorie intake from other food sources.

- Being female is significantly associated with a reduction in mortality from all cardiovascular diseases (beta=-0.24; P=0.01). This is to be expected. In other words, men are women with a few design flaws, so to speak. (This situation reverses itself a bit after menopause.)
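
For readers who want a feel for what estimating coefficients like the betas above involves, here is a minimal sketch. It is not WarpPLS (the actual analysis corrects for nonlinearity and computes jackknifed P values); for purely linear paths, however, a path coefficient reduces to an ordinary least squares weight on standardized variables. The file name and column layout in the usage comment are assumptions for illustration.

# Minimal sketch: standardized (z-scored) least squares weights as an
# approximation of linear path coefficients. Not WarpPLS.
import numpy as np
import pandas as pd

def standardize(s):
    return (s - s.mean()) / s.std(ddof=1)

def path_coefficients(df, outcome, predictors):
    z = df[[outcome] + predictors].apply(standardize)
    X = np.column_stack([np.ones(len(z))] + [z[p].to_numpy() for p in predictors])
    y = z[outcome].to_numpy()
    betas, *_ = np.linalg.lstsq(X, y, rcond=None)
    return pd.Series(betas[1:], index=predictors)   # drop the intercept

# Hypothetical usage (file name and columns are assumptions):
# df = pd.read_csv("china_study_counties.csv")
# print(path_coefficients(df, "MVASC", ["WHTFLOUR", "RICE", "TKCAL", "SexM1F2"]))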

Wheat flour displaces rice

The graph below shows the shape of the association between wheat flour intake (WHTFLOUR) and rice intake (RICE). The values are provided in standardized format; e.g., 0 is the mean (a.k.a. average), 1 is one standard deviation above the mean, and so on. The curve is the best-fitting U curve obtained by the software. It actually has the shape of an exponential decay curve, which can be seen as a section of a U curve. This suggests that wheat flour consumption has strongly displaced rice consumption in several regions in China, and also that wherever rice consumption is high wheat flour consumption tends to be low.


As wheat flour intake goes up, so does cardiovascular disease mortality

The graphs below show the shapes of the association between wheat flour intake (WHTFLOUR) and mortality from all cardiovascular diseases (MVASC). In the first graph, the values are provided in standardized format; e.g., 0 is the mean (or average), 1 is one standard deviation above the mean, and so on. In the second graph, the values are provided in unstandardized format and organized in terciles (each of three equal intervals).



The curve in the first graph is the best-fitting U curve obtained by the software. It is a quasi-linear relationship. The higher the consumption of wheat flour in a county, the higher seems to be the mortality from all cardiovascular diseases. The second graph suggests that mortality in the third tercile, which represents a consumption of wheat flour of 501 to 751 g/day (a lot!), is 69 percent higher than mortality in the first tercile (0 to 251 g/day).
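
As an aside, the tercile grouping used in the second graph is straightforward to reproduce; a sketch, assuming a hypothetical DataFrame with the same column names as the model above:

# Split wheat flour intake into three equal intervals and compare mean
# mortality across them. DataFrame and column names are hypothetical.
import pandas as pd

def tercile_means(df, exposure, outcome):
    bins = pd.cut(df[exposure], bins=3)   # three equal intervals, as in the post
    return df.groupby(bins, observed=True)[outcome].mean()

# means = tercile_means(df, "WHTFLOUR", "MVASC")
# print(100 * (means.iloc[2] / means.iloc[0] - 1))  # percent difference, top vs bottom tercile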

Rice seems to be protective, as long as intake is not too high

The graphs below show the shapes of the association between rice intake (RICE) and mortality from all cardiovascular diseases (MVASC). In the first graph, the values are provided in standardized format. In the second graph, the values are provided in unstandardized format and organized in terciles.



Here the relationship is more complex. The lowest mortality is clearly in the second tercile (206 to 412 g/day). There is a lot of variation in the first tercile, as suggested by the first graph with the U curve. (Remember, as rice intake goes down, wheat flour intake tends to go up.) The U curve here looks similar to the exponential decay curve shown earlier in the post, for the relationship between rice and wheat flour intake.

In fact, the shape of the association between rice intake and mortality from all cardiovascular diseases looks a bit like an “echo” of the shape of the relationship between rice and wheat flour intake. Here is what is creepy. This echo looks somewhat like the first curve (between rice and wheat flour intake), but with wheat flour intake replaced by “death” (i.e., mortality from all cardiovascular diseases).

What does this all mean?

- Wheat flour displacing rice does not look like a good thing. Wheat flour intake seems to have strongly displaced rice intake in the counties where it is heavily consumed. Generally speaking, that does not seem to have been a good thing. It looks like this is generally associated with increased mortality from all cardiovascular diseases.

- High glycemic index food consumption does not seem to be the problem here. Wheat flour and rice have very similar glycemic indices (but generally not glycemic loads; see below). Both lead to blood glucose and insulin spikes. Yet, rice consumption seems protective when it is not excessive. This is true in part (but not entirely) because it largely displaces wheat flour. Moreover, neither rice nor wheat flour consumption seems to be significantly associated with cardiovascular disease via an increase in total calorie consumption. This is a bit of a blow to the theory that high glycemic carbohydrates necessarily cause obesity, diabetes, and eventually cardiovascular disease.

- The problem with wheat flour is … hard to pinpoint, based on the results summarized here. Maybe it is the fact that it is an ultra-refined carbohydrate-rich food; less refined forms of wheat could be healthier. In fact, the glycemic loads of less refined carbohydrate-rich foods tend to be much lower than those of more refined ones. (Also, boiled brown rice has a glycemic load that is about three times lower than that of whole wheat bread; whereas the glycemic indices are about the same.) Maybe the problem is wheat flour's  gluten content. Maybe it is a combination of various factors, including these.

Reference

Kock, N. (2010). WarpPLS 1.0 User Manual. Laredo, Texas: ScriptWarp Systems.

Acknowledgment and notes

- Many thanks are due to Dr. Campbell and his collaborators for collecting and compiling the data used in this analysis. The data is from this site, created by those researchers to disseminate their work in connection with a study often referred to as the “China Study II”. It has already been analyzed by other bloggers. Notable analyses have been conducted by Ricardo at Canibais e Reis, Stan at Heretic, and Denise at Raw Food SOS.

- The path coefficients (indicated as beta coefficients) reflect the strength of the relationships; they are a bit like standard univariate (or Pearson) correlation coefficients, except that they take into consideration multivariate relationships (they control for competing effects on each variable). Whenever nonlinear relationships were modeled, the path coefficients were automatically corrected by the software to account for nonlinearity.

- The software used here identifies non-cyclical and mono-cyclical relationships such as logarithmic, exponential, and hyperbolic decay relationships. Once a relationship is identified, data values are corrected and coefficients calculated. This is not the same as log-transforming data prior to analysis, which is widely used but only works if the underlying relationship is logarithmic. Otherwise, log-transforming data may distort the relationship even more than assuming that it is linear, which is what is done by most statistical software tools.

- The R-squared values reflect the percentage of explained variance for certain variables; the higher they are, the better the model fit with the data. In complex and multi-factorial phenomena such as health-related phenomena, many would consider an R-squared of 0.20 as acceptable. Still, such an R-squared would mean that 80 percent of the variance for a particular variable is unexplained by the data.

- The P values have been calculated using a nonparametric technique, a form of resampling called jackknifing, which does not require the assumption that the data is normally distributed to be met. This and other related techniques also tend to yield more reliable results for small samples, and samples with outliers (as long as the outliers are “good” data, and are not the result of measurement error).

- Only two data points per county were used (for males and females). This increased the sample size of the dataset without artificially reducing variance, which is desirable since the dataset is relatively small. This also allowed for the test of commonsense assumptions (e.g., the protective effects of being female), which is always a good idea in a complex analysis because violation of commonsense assumptions may suggest data collection or analysis error. On the other hand, it required the inclusion of a sex variable as a control variable in the analysis, which is no big deal.

- Since all the data was collected around the same time (late 1980s), this analysis assumes a somewhat static pattern of consumption of rice and wheat flour. In other words, let us assume that variations in consumption of a particular food do lead to variations in mortality. Still, that effect will typically take years to manifest itself. This is a major limitation of this dataset and any related analyses.

- Mortality from schistosomiasis infection (MSCHIST) does not confound the results presented here. Only counties where no deaths from schistosomiasis infection were reported have been included in this analysis. Mortality from all cardiovascular diseases (MVASC) was measured using the variable M059 ALLVASCc (ages 35-69). See this post for other notes that apply here as well.

Wednesday, September 8, 2010

The China Study II: Cholesterol seems to protect against cardiovascular disease

First of all, many thanks are due to Dr. Campbell and his collaborators for collecting and compiling the data used in this analysis. This data is from this site, created by those researchers to disseminate the data from a study often referred to as the “China Study II”. It has already been analyzed by other bloggers. Notable analyses have been conducted by Ricardo at Canibais e Reis, Stan at Heretic, and Denise at Raw Food SOS.

The analyses in this post differ from those other analyses in various aspects. One of them is that data for males and females were used separately for each county, instead of the totals per county. Only two data points per county were used (for males and females). This increased the sample size of the dataset without artificially reducing variance (for more details, see “Notes” at the end of the post), which is desirable since the dataset is relatively small. This also allowed for the test of commonsense assumptions (e.g., the protective effects of being female), which is always a good idea in a complex analysis because violation of commonsense assumptions may suggest data collection or analysis error. On the other hand, it required the inclusion of a sex variable as a control variable in the analysis, which is no big deal.

The analysis was conducted using WarpPLS. Below is the model with the main results of the analysis. (Click on it to enlarge. Use the "CTRL" and "+" keys to zoom in, and the "CTRL" and "-" keys to zoom out.) The arrows explore associations between variables, which are shown within ovals. The meaning of each variable is the following: SexM1F2 = sex, with 1 assigned to males and 2 to females; HDLCHOL = HDL cholesterol; TOTCHOL = total cholesterol; MSCHIST = mortality from schistosomiasis infection; and MVASC = mortality from all cardiovascular diseases.


The variables to the left of MVASC are the main predictors of interest in the model – HDLCHOL and TOTCHOL. The ones to the right are control variables – SexM1F2 and MSCHIST. The path coefficients (indicated as beta coefficients) reflect the strength of the relationships. A negative beta means that the relationship is negative; i.e., an increase in a variable is associated with a decrease in the variable that it points to. The P values indicate the statistical significance of the relationship; a P lower than 0.05 generally means a significant relationship (95 percent or higher likelihood that the relationship is “real”).

In summary, this is what the model above is telling us:

- As HDL cholesterol increases, total cholesterol increases significantly (beta=0.48; P<0.01). This is to be expected, as HDL is a main component of total cholesterol, together with VLDL and LDL cholesterol.

- As total cholesterol increases, mortality from all cardiovascular diseases decreases significantly (beta=-0.25; P<0.01). This is to be expected if we assume that total cholesterol is in part an intervening variable between HDL cholesterol and mortality from all cardiovascular diseases. This assumption can be tested through a separate model (more below). Also, there is more to this story, as noted below.

- The effect of HDL cholesterol on mortality from all cardiovascular diseases is insignificant when we control for the effect of total cholesterol (beta=-0.08; P=0.26). This suggests that HDL’s protective role is subsumed by the variable total cholesterol, and also that it is possible that there is something else associated with total cholesterol that makes it protective. Otherwise the effect of total cholesterol might have been insignificant, and the effect of HDL cholesterol significant (the reverse of what we see here).

- Being female is significantly associated with a reduction in mortality from all cardiovascular diseases (beta=-0.16; P=0.01). This is to be expected. In other words, men are women with a few design flaws. (This situation reverses itself a bit after menopause.)

- Mortality from schistosomiasis infection is significantly and inversely associated with mortality from all cardiovascular diseases (beta=-0.28; P<0.01). This is probably due to those dying from schistosomiasis infection not being entered in the dataset as dying from cardiovascular diseases, and vice-versa.

Two other main components of total cholesterol, in addition to HDL cholesterol, are VLDL and LDL cholesterol. These are carried in particles, known as lipoproteins. VLDL cholesterol is usually represented as a fraction of triglycerides in cholesterol equations (e.g., the Friedewald and Iranian equations). It usually correlates inversely with HDL; that is, as HDL cholesterol increases, usually VLDL cholesterol decreases. Given this and the associations discussed above, it seems that LDL cholesterol is a good candidate for the possible “something else associated with total cholesterol that makes it protective”. But waidaminet! Is it possible that the demon particle, the LDL, serves any purpose other than giving us heart attacks?

The graph below shows the shape of the association between total cholesterol (TOTCHOL) and mortality from all cardiovascular diseases (MVASC). The values are provided in standardized format; e.g., 0 is the average, 1 is one standard deviation above the mean, and so on. The curve is the best-fitting S curve obtained by the software (an S curve is a slightly more complex curve than a U curve).


The graph below shows some of the data in unstandardized format, and organized differently. The data is grouped here in ranges of total cholesterol, which are shown on the horizontal axis. The lowest and highest ranges in the dataset are shown, to highlight the magnitude of the apparently protective effect. Here the two variables used to calculate mortality from all cardiovascular diseases (MVASC; see “Notes” at the end of this post) were added. Clearly the lowest mortality from all cardiovascular diseases is in the highest total cholesterol range, 172.5 to 180; and the highest mortality in the lowest total cholesterol range, 120 to 127.5. The difference is quite large; the mortality in the lowest range is approximately 3.3 times higher than in the highest.


The shape of the S-curve graph above suggests that there are other variables that are confounding the results a bit. Mortality from all cardiovascular diseases does seem to generally go down with increases in total cholesterol, but the smooth inflection point at the middle of the S-curve graph suggests a more complex variation pattern that may be influenced by other variables (e.g., smoking, dietary patterns, or even schistosomiasis infection; see “Notes” at the end of this post).

As mentioned before, total cholesterol is strongly influenced by HDL cholesterol, so below is the model with only HDL cholesterol (HDLCHOL) pointing at mortality from all cardiovascular diseases (MVASC), and the control variable sex (SexM1F2).


The graph above confirms the assumption that HDL’s protective role is subsumed by the variable total cholesterol. When the variable total cholesterol is removed from the model, as it was done above, the protective effect of HDL cholesterol becomes significant (beta=-0.27; P<0.01). The control variable sex (SexM1F2) was retained even in this targeted HDL effect model because of the expected confounding effect of sex; females generally tend to have higher HDL cholesterol and less cardiovascular disease than males.

Below, in the “Notes” section (after the “Reference”) are several notes, some of which are quite technical. Providing them separately hopefully has made the discussion above a bit easier to follow. The notes also point at some limitations of the analysis. This data needs to be analyzed from different angles, using multiple models, so that firmer conclusions can be reached. Still, the overall picture that seems to be emerging is at odds with previous beliefs based on the same dataset.

What could be increasing the apparently protective HDL and total cholesterol in this dataset? High consumption of animal foods, particularly foods rich in saturated fat and cholesterol, is a strong candidate. Low consumption of vegetable oils rich in linoleic acid, and of foods rich in refined carbohydrates, are also good candidates. Maybe it is a combination of these.

We need more analyses!

Reference:

Kock, N. (2010). WarpPLS 1.0 User Manual. Laredo, Texas: ScriptWarp Systems.


Notes:

- The path coefficients (indicated as beta coefficients) reflect the strength of the relationships; they are a bit like standard univariate (or Pearson) correlation coefficients, except that they take into consideration multivariate relationships (they control for competing effects on each variable).

- The R-squared values reflect the percentage of explained variance for certain variables; the higher they are, the better the model fit with the data. In complex and multi-factorial phenomena such as health-related phenomena, many would consider an R-squared of 0.20 as acceptable. Still, such an R-squared would mean that 80 percent of the variance for a particular variable is unexplained by the data.

- The P values have been calculated using a nonparametric technique, a form of resampling called jackknifing, which does not require the assumption that the data is normally distributed to be met. This and other related techniques also tend to yield more reliable results for small samples, and samples with outliers (as long as the outliers are “good” data, and are not the result of measurement error).

- Collinearity is an important consideration in models that analyze the effect of multiple predictors on one single variable. This is particularly true for multiple regression models, where there is a temptation to add many predictors to the model to see which ones come out as the “winners”. This often backfires, as collinearity can severely distort the results. Some multiple regression techniques, such as automated stepwise regression with backward elimination, are particularly vulnerable to this problem. Collinearity is not the same as correlation, and thus is defined and measured differently. Two predictor variables may be significantly correlated and still have low collinearity. A reasonably reliable measure of collinearity is the variance inflation factor. Collinearity was tested in this model, and was found to be low.

- An effort was made here to avoid multiple data points per county (even though this was available for some variables), because this could artificially reduce the variance for each variable, and potentially bias the results. The reason for this is that multiple answers from a single county would normally be somewhat correlated; a higher degree of intra-county correlation than inter-county correlation. The resulting bias would be difficult to control for, via one or more control variables. With only two data points per county, one for males and the other for females, one can control for intra-county correlation by adding a “dummy” sex variable to the analysis, as a control variable. This was done here.

- Mortality from schistosomiasis infection (MSCHIST) is a variable that tends to affect the results in a way that makes it more difficult to make sense of them. Generally this is true for any infectious diseases that significantly affect a population under study. The problem with infection is that people with otherwise good health or habits may get the infection, and people with bad health and habits may not. Since cholesterol is used by the human body to fight disease, it may go up, giving the impression that it is going up for some other reason. Perhaps instead of controlling for its effect, as done here, it would have been better to remove from the analysis those counties with deaths from schistosomiasis infection. (See also this post, and this one.)

- Different parts of the data were collected at different times. It seems that the mortality data is for the period 1986-88, and the rest of the data is for 1989. This may have biased the results somewhat, even though the time lag is not that long, especially if there were changes in certain health trends from one period to the other. For example, major migrations from one county to another could have significantly affected the results.

- The following measures were used, from the same online dataset as the other measures: P002 HDLCHOL, for HDLCHOL; P001 TOTCHOL, for TOTCHOL; and M021 SCHISTOc, for MSCHIST.

- SexM1F2 is a “dummy” variable that was coded with 1 assigned to males and 2 to females. As such, it essentially measures the “degree of femaleness” of the respondents. Being female is generally protective against cardiovascular disease, a situation that reverses itself a bit after menopause.

- MVASC is a composite measure of the two following variables, provided as component measures of mortality from all cardiovascular diseases: M058 ALLVASCb (ages 0-34), and M059 ALLVASCc (ages 35-69). A couple of obvious problems: (a) they do not include data on people older than 69; and (b) they seem to capture a lot of diseases, including some that do not seem like typical cardiovascular diseases. A factor analysis was conducted, and the loadings and cross-loadings suggested good validity. Composite reliability was also good. So essentially MVASC is measured here as a “latent variable” with two “indicators” (a rough sketch of how such a composite can be built appears after these notes). Why do this? The reason is that it reduces the biasing effects of incomplete data and measurement error (e.g., exclusion of folks older than 69). By the way, there is always some measurement error in any dataset.

- This note is related to measurement error in connection with the indicators for MVASC. There is something odd about the variables M058 ALLVASCb (ages 0-34), and M059 ALLVASCc (ages 35-69). According to the dataset, mortality from cardiovascular diseases for ages 0-34 is typically higher than for 35-69, for many counties. Given the good validity and reliability for MVASC as a latent variable, it is possible that the values for these two indicator variables were simply swapped by mistake.
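
As promised above, here is a rough sketch of how a two-indicator composite like MVASC can be formed and its reliability checked without specialized software. The Spearman-Brown formula in the comment is the standard reliability estimate for a two-item composite; the data file and exact column names are assumptions.

# Build an equal-weight composite of two standardized indicators and
# estimate its reliability. Column names are hypothetical placeholders.
import pandas as pd

def two_indicator_composite(df, a, b):
    za = (df[a] - df[a].mean()) / df[a].std(ddof=1)
    zb = (df[b] - df[b].mean()) / df[b].std(ddof=1)
    composite = (za + zb) / 2              # equal-weight composite of z-scores
    r = za.corr(zb)
    reliability = 2 * r / (1 + r)          # Spearman-Brown estimate for two items
    return composite, reliability

# mvasc, rel = two_indicator_composite(df, "M058_ALLVASCb", "M059_ALLVASCc")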

Thursday, May 27, 2010

Postprandial glucose levels, HbA1c, and arterial stiffness: Compared to glucose, lipids are not even on the radar screen

Postprandial glucose levels are the levels of blood glucose after meals. In Western urban environments, the main contributors to elevated postprandial glucose are foods rich in refined carbohydrates and sugars. While postprandial glucose levels may vary somewhat erratically, they are particularly elevated in the morning after breakfast. The main reason for this is that breakfast, in Western urban environments, is typically very high in refined carbohydrates and sugars.

HbA1c, or glycated hemoglobin, is a measure of average blood glucose over a period of a few months. Blood glucose glycates (i.e., sticks to) hemoglobin, a protein found in red blood cells. Red blood cells are relatively long-lived, lasting approximately 3 months. Thus HbA1c (given in percentages) is a good indicator of average blood glucose levels, if you don’t suffer from anemia or a few other blood abnormalities.

Based on HbA1c, one can then estimate his or her average blood glucose level for the previous 3 months or so before the test, using one of the following equations, depending on whether the measurement is in mg/dl or mmol/l.

Average blood glucose (mg/dl) = 28.7 × HbA1c − 46.7
Average blood glucose (mmol/l) = 1.59 × HbA1c − 2.59
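
For convenience, the same two equations as code (a direct transcription; nothing is assumed beyond the formulas above):

# Estimated average blood glucose from HbA1c (in percent).
def average_glucose_mg_dl(hba1c_percent: float) -> float:
    return 28.7 * hba1c_percent - 46.7

def average_glucose_mmol_l(hba1c_percent: float) -> float:
    return 1.59 * hba1c_percent - 2.59

print(average_glucose_mg_dl(5.0))   # about 96.8 mg/dl for an HbA1c of 5 percent
print(average_glucose_mmol_l(5.0))  # about 5.4 mmol/l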

Elevated blood glucose levels cause damage in the body primarily through glycation, which leads to the formation of advanced glycation endproducts (AGEs). Given this, HbA1c can be seen as a proxy for the level of damage done by elevated blood glucose levels to various body tissues. This damage occurs over time; often after many years of high blood glucose levels. It includes kidney damage, neurological damage, cardiovascular damage, and damage to the retina.

Most regular blood exams focus on fasting blood glucose as a measure of glucose metabolism status. Many medical practitioners have as a target a fasting blood glucose level of 125 mg/dl (7 mmol/l) or less, and largely disregard postprandial glucose levels or HbA1c in their management of glucose metabolism. Leiter and colleagues (2005; full reference at the end of this post) showed that this focus on fasting blood glucose is a mistake. They are not alone; many others made this point, including some very knowledgeable bloggers who focus on diabetes (see “Interesting links” section of this blog). Leiter and colleagues (2005) also provided some interesting graphs and figures, including eye-opening correlations between various variables and arterial stiffness. The figure below (click to enlarge) shows the contribution of postprandial glucose to HbA1c.


Note that the lower the HbA1c is in the figure (horizontal axis), the higher is the postprandial glucose contribution to HbA1c. And, the lower the HbA1c, the closer the individuals are to what one could consider having a perfectly normal HbA1c level (around 5 percent). That is, fasting blood glucose is a relatively reliable measure of the tissue damage done by elevated blood glucose levels only in individuals whose HbA1c levels are very high.

The table below (click to enlarge) shows P values associated with the impact of various variables (listed on the leftmost column) on arterial stiffness. This measure, arterial stiffness, is strongly associated with an increased risk of cardiovascular events. Look at the middle column showing P values adjusted for age and height. The lower the P value, the more a variable affects arterial stiffness. The variable with the lowest P value by far is 2-hour postprandial blood glucose; the blood glucose levels measured 2 hours after meals.


The effect of fasting glucose levels on arterial stiffness was reported as statistically non-significant, with a P value of 0.049; note, however, that this P value is actually significant, although barely, at the 0.05 level (95 percent confidence). Interestingly, the following measures are not even on the radar screen, as far as arterial stiffness is concerned: systolic blood pressure, LDL cholesterol, HDL cholesterol, triglycerides, and fasting insulin levels.

What about the lipid hypothesis, and the “bad” LDL cholesterol!? This study is telling us that these are not very relevant for arterial stiffness when we control for the effect of blood glucose measures. Not even fasting insulin levels matters much! Wait, not even HDL!!! A high HDL has been definitely shown to be protective, but when we look at the relative magnitude of various effects, the story is a bit different. A high HDL’s protective effect exists, but it is dwarfed by the negative effect of high blood glucose levels, especially after meals, in the context of cardiovascular disease.

What all this points at is what we could call a postprandial glucose hypothesis: Lower your postprandial glucose levels, and live a longer, healthier life! And, by the way, if your postprandial glucose levels are under control, lipids do not matter much! Or maybe your lipids will fall into place, without any need for statin drugs, after your postprandial glucose levels are under control. One way or another, the outcome will be a positive one. That is what the data from this study is telling us.

How do you lower your postprandial glucose levels?

A good way to start is to remove foods rich in refined carbohydrates and sugars from your diet. Almost all of these are foods engineered by humans with the goal of being addictive; they usually come in boxes and brightly colored plastic wraps. They are not hard to spot. They are typically in the central aisles of supermarkets. The sooner you remove them from your diet, the better. The more completely you do this, the better.

Note that the evidence discussed in this post is in connection with blood glucose levels, not glucose metabolism per se. If you have impaired glucose metabolism (e.g., diabetes type 2), you can still avoid a lot of problems if you effectively control your blood glucose levels. You may have to be a bit more aggressive, adding low carbohydrate dieting (as in the Atkins or Optimal diets) to the removal of refined carbohydrates and sugars from your diet; the latter is in many ways similar to adopting a Paleolithic diet. You may have to take some drugs, such as Metformin (a.k.a. Glucophage). But you are certainly not doomed if you are diabetic.

Reference:

Leiter, L.A., Ceriello, A., Davidson, J.A., Hanefeld, M., Monnier, L., Owens, D.R., Tajima, N., & Tuomilehto, J. (2005). Postprandial glucose regulation: New data and new implications. Clinical Therapeutics, 27(2), S42-S56.

Wednesday, May 12, 2010

Is heavy physical activity a major trigger of death by sudden cardiac arrest? Not in Oregon

The idea that heavy physical activity is a major trigger of heart attacks is widespread. Endurance running and cardio-type activities are often singled out; some people refer to this as “death by running”.

Good cardiology textbooks, such as the Mayo Clinic Cardiology, tend to give us a more complex and complete picture. So do medical research articles that report on studies of heart attacks based on comprehensive surveys.

Reddy and colleagues (2009) studied sudden cardiac arrest events followed by death from 2002 to 2005 in Multnomah County in Oregon. This study was part of the ongoing Oregon Sudden Unexpected Death Study. Multnomah County has an area of 435 square miles, and had a population of over 677 thousand at the time of the study. The full reference to the article and a link to a full-text version are at the end of this post.

The researchers grouped deaths by sudden cardiac arrest (SCA) according to the main type of activity being performed before the event. Below is how the authors defined the activities, quoted verbatim from the article. MET is a measure of the amount of energy spent in an activity; one MET is the amount of energy spent by a person sitting quietly. (A small code sketch after the list illustrates this binning by MET score.)

- Sleep (MET 0.9): subjects who were sleeping when they sustained SCA.
- Light activity (MET 1.0–3.4): included bathing, dressing, cooking, cleaning, feeding, household walking and driving.
- Moderate activity (MET 3.5–5.9): included walking for exercise, mowing lawn, gardening, working in the yard, dancing.
- Heavy activity (MET score ≥6): included sports such as tennis, running, jogging, treadmill, skiing, biking.
- Sexual activity (MET score 1.3): included acts of sexual intercourse.
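Below is a small Python sketch of how one might bin an activity into these groups based on its MET score. The cutoffs follow the definitions above; the function name `activity_group` and the example MET values in `ACTIVITY_METS` are illustrative assumptions of mine, not figures from the study.

```python
# Illustrative only: MET cutoffs follow the study's groupings; example MET values are rough guesses.
def activity_group(met: float, sleeping: bool = False, sexual: bool = False) -> str:
    """Classify an activity into the study's groups by MET score."""
    if sleeping:
        return "Sleep"
    if sexual:
        return "Sexual activity"
    if met < 3.5:
        return "Light activity"
    if met < 6.0:
        return "Moderate activity"
    return "Heavy activity"

ACTIVITY_METS = {  # rough, commonly cited MET values (assumptions, not from the article)
    "cooking": 2.0,
    "gardening": 4.0,
    "jogging": 7.0,
}

for activity, met in ACTIVITY_METS.items():
    print(f"{activity}: {activity_group(met)}")
```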

What did they find? Not what many people would expect.

The vast majority of the people dying of sudden cardiac arrest were doing things that fit the “light activity” group above prior to their death. This applies to both genders. The figure below (click to enlarge) shows the percentages of men and women who died from sudden cardiac arrest, grouped by activity type.


Sudden cardiac arrests were also categorized as witnessed or un-witnessed. Witnessed arrests were seen happening by someone; in un-witnessed arrests, the person had last been seen alive within the preceding 24 hours. So the data for witnessed sudden cardiac arrests are a bit more reliable. The table below displays the distribution of mean age, gender, and known coronary artery disease (CAD) in those with witnessed sudden cardiac arrest.


Look at the bottom row, showing those with known coronary artery disease. Again, light activity is the main trigger. Sleep comes second. The numbers within parentheses refer to percentages within each activity group. Those percentages are not very helpful in the identification of the most important triggers, although they do suggest that coronary artery disease is a major risk factor. For example, among those who died from sudden cardiac arrest while having sex, 57 percent had known coronary artery disease. For light activity, 36 percent had known coronary artery disease.
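To make clear what “percentages within each activity group” means, here is a toy Python/pandas sketch with fabricated counts (the numbers below are not the study’s). The within-group percentage tells you how common known CAD was among, say, those who died during sexual activity; it is the share of all deaths falling in each group that points to the main triggers.

```python
# Illustrative only: fabricated counts, not the study's data.
import pandas as pd

counts = pd.DataFrame(
    {"with_CAD": [30, 120, 25, 10, 4], "without_CAD": [50, 210, 40, 15, 3]},
    index=["Sleep", "Light activity", "Moderate activity", "Heavy activity", "Sexual activity"],
)

# Percentage with known CAD *within* each activity group (the parenthesized numbers).
counts["pct_CAD_within_group"] = (
    100 * counts["with_CAD"] / (counts["with_CAD"] + counts["without_CAD"])
).round(1)

# Share of all SCA deaths occurring in each group (what identifies the main triggers).
group_totals = counts["with_CAD"] + counts["without_CAD"]
counts["pct_of_all_deaths"] = (100 * group_totals / group_totals.sum()).round(1)

print(counts)
```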

As a caveat, it is worth noting that heavy activity appears to be more of a trigger in younger individuals than in older ones. This may simply reflect the patterns of activities at different ages. However, this does not seem to properly account for the large differences observed in triggers; the standard deviation for age in the heavy activity group was large enough to include plenty of seniors. Still, it would have been nice to see a multivariate analysis controlling for various effects, including age.

So what is going on here?

The authors give us a hint. The real culprits may be bottled-up emotional stress and sleep disorders; the latter may be caused by stress, as well as by obesity and other related problems. They have some data pointing in those directions, and that makes some sense.

We humans have evolved “fight-or-flight” mechanisms that involve large hormonal discharges in response to stressors. Our ancestors needed those mechanisms; for example, to either fight or run for their lives in response to animal attacks.

Modern humans experience too many stressors while sitting down, as in stressful car commutes and nasty online interactions. Those stressors trigger “fight-or-flight” hormonal discharges that are followed by neither “fight” nor “flight” in most cases. This cannot be very good for us.

Death by running!? More like death by not running!

Reference:

Reddy, P.R., Reinier, K., Singh, T., Mariani, R., Gunson, K., Jui, J., & Chugh, S.S. (2009). Physical activity as a trigger of sudden cardiac arrest: The Oregon Sudden Unexpected Death Study. International Journal of Cardiology, 131(3), 345–349.

Sunday, May 9, 2010

Long distance running causes heart disease, unless it doesn’t

Regardless of the type of exercise, disease markers are generally associated with the intensity of exertion over time, and the association follows a J-curve pattern: do too little, and you have more disease; do too much, and the incidence of disease goes up. There is an optimal point for each type of exercise and marker. A J curve is really a U curve with a shortened left end; the left end is shortened because, when measurements are taken, more of them usually fall on the right side of the curve than on the left.

The figure below (click to enlarge) shows a schematic representation of this type of relationship. (I am not very good at drawing.) Different individuals have different curves. If the vertical axis were a measure of health, as opposed to disease, the curve would have the shape of an inverted J. A code sketch after the figure illustrates the same idea.


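For the statistically inclined, here is a rough Python sketch of the idea behind a J curve: fit a quadratic to a disease marker as a function of exercise volume and locate the minimum, which is the “optimal point”. The data are fabricated and the variable names (e.g., `exercise_hours`, `marker`) are mine; this is a toy illustration of the shape, not an analysis of any real dataset.

```python
# Illustrative only: fabricated data showing a J-shaped dose-response curve.
import numpy as np

rng = np.random.default_rng(1)
exercise_hours = rng.uniform(0, 15, 300)                      # weekly hours of a given exercise
optimal = 5.0                                                  # assumed true optimum (made up)
marker = 2 + 0.15 * (exercise_hours - optimal) ** 2 + rng.normal(0, 0.5, 300)

# Fit a quadratic: marker ~ a*x^2 + b*x + c
a, b, c = np.polyfit(exercise_hours, marker, deg=2)

# The fitted optimum (minimum of the parabola) is at x = -b / (2a).
estimated_optimum = -b / (2 * a)
print(f"Estimated optimal exercise volume: {estimated_optimum:.1f} hours/week")
```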
The idea that long distance running causes heart disease has been around for a while. Is it correct?

If it is, then one would expect to see certain things. For example, say you take a group of long distance runners who have been at it for a while, ideally runners above age 50; that is when heart disease becomes more frequent, and it also captures runners with enough years of running behind them to have caused some serious damage. Say you then measured markers of heart disease before and after a grueling long distance race. What would you see?

If long distance running causes heart disease, you would see a significant proportion of runners with elevated markers of heart disease at baseline (i.e., before the race); after all, the running would be causing a cumulative problem. The levels of those markers would be correlated with practice, or with participation in previous races, since the races would be causing the damage. And you would see a uniformly bad increase in the markers after the race, as the running messed up everybody more or less equally.

Sahlén and colleagues (2009), a group of Swedish researchers, studied males and females aged 55 or older who participated in a 30-km (about 19-mile) cross-country race. The full reference to the article is at the end of this post. The researchers included in their study only runners who had no diagnosed medical disorders. They collected data on the patterns of exercise prior to the race and on participation in previous races. Blood was taken before and after the race, and several measurements were obtained, including measurements of two possible heart disease markers: N-terminal pro-brain natriuretic peptide (NT-proBNP) and troponin T (TnT). The table below (click to enlarge) shows several of those measurements before and after the race.


We can see that NT-proBNP and TnT increased significantly after the race. So did creatinine, a byproduct of the breakdown of creatine phosphate in muscle tissue; something you would expect after such a grueling race. Yep, long distance running increases NT-proBNP and TnT, so it leads to heart disease, right?

Wait, not so fast!

NT-proBNP and TnT levels usually increase after endurance exercise, something the authors note in their literature review. But those levels do not normally stay elevated for long after the race. Permanently elevated levels are a sign of a problem, and so is excessive elevation during the race.

Now, here is something interesting. Look at the table below, showing the variations grouped by past participation in races.


The increases in NT-proBNP and TnT are generally lower in those individuals who participated in 3 to 13 races in the past. They are higher for the inexperienced runners and, in the case of NT-proBNP, also markedly higher for those with 14 or more races under their belt (the last group on the right). The baseline NT-proBNP is also significantly higher for that group. They were older too, but not by much.

Can you see a possible J-curve pattern?

Now look at this table below, which shows the results of a multiple regression analysis on its right side. Look at the last column on the right, the beta coefficients. They are all significant, but the first is .81, which is quite high for a standardized partial regression coefficient. It refers to an almost perfect relationship between the log of NT-proBNP increase and the log of baseline NT-proBNP. (The log transformations reflect the nonlinear relationships between NT-proBNP, a fairly sensitive health marker, and the other variables.)


In a multiple regression analysis, the effect of each independent variable (i.e., each predictor) on the dependent variable (the log of NT-proBNP increase) is calculated controlling for the effects of all the other independent variables on the dependent variable. Thus, what the table above is telling us is that baseline NT-proBNP predicts NT-proBNP increase almost perfectly, even when we control for age, creatinine increase, and race duration (i.e., amount of time a person takes to complete the race).

Again, even when we control for: AGE, creatinine increase, and RACE DURATION.
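Here is a minimal Python sketch of this kind of analysis, using statsmodels with fabricated data and hypothetical column names: regress the log of the NT-proBNP increase on the log of baseline NT-proBNP while controlling for age, creatinine increase, and race duration, with all variables standardized (z-scored) so the coefficients can be read as betas. It illustrates the technique, not the authors’ actual computation.

```python
# Illustrative only: fabricated data and hypothetical column names.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 185
baseline_ntprobnp = np.exp(rng.normal(4.0, 0.6, n))           # pg/ml, fabricated
ntprobnp_increase = baseline_ntprobnp * np.exp(rng.normal(0.5, 0.2, n))
age = rng.normal(61, 5, n)
creatinine_increase = rng.normal(15, 5, n)
race_duration = rng.normal(200, 25, n)                        # minutes, fabricated

df = pd.DataFrame({
    "log_increase": np.log(ntprobnp_increase),
    "log_baseline": np.log(baseline_ntprobnp),
    "age": age,
    "creat_inc": creatinine_increase,
    "duration": race_duration,
})

z = (df - df.mean()) / df.std()                                # standardize so coefficients are betas
X = sm.add_constant(z[["log_baseline", "age", "creat_inc", "duration"]])
fit = sm.OLS(z["log_increase"], X).fit()
print(fit.params.round(2))                                     # standardized partial coefficients
```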

In other words, baseline NT-proBNP is what really matters; not even age makes that much of a difference. But baseline NT-proBNP is NEGATIVELY correlated with number of previous races. The only exception is the group that participated in 14 or more previous races. Maybe that was too much for them.

Okay, one more table. This one, included below, shows regression analyses between a few predictors and the main dependent variable, which in this case is TnT elevation. No surprises here based on the discussion so far. Look at the left part, the column labeled as “B”. Those are correlation coefficients, varying from -1 to 1. Which is the predictor with the highest absolute correlation with TnT elevation? It is number of previous races, but the correlation is, again, NEGATIVE.
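As a side note, here is a tiny Python sketch (fabricated data) of why a coefficient like this can be read as a correlation: in a single-predictor regression on standardized variables, the slope equals the Pearson correlation and therefore lies between -1 and 1. Whether the study’s “B” column was computed exactly this way is an assumption on my part.

```python
# Illustrative only: fabricated data.
import numpy as np

rng = np.random.default_rng(3)
previous_races = rng.integers(0, 20, 100).astype(float)
tnt_elevation = 0.5 - 0.02 * previous_races + rng.normal(0, 0.1, 100)

# Pearson correlation between predictor and outcome.
r = np.corrcoef(previous_races, tnt_elevation)[0, 1]

# Slope of a one-predictor regression on z-scored variables.
zx = (previous_races - previous_races.mean()) / previous_races.std()
zy = (tnt_elevation - tnt_elevation.mean()) / tnt_elevation.std()
slope = np.polyfit(zx, zy, deg=1)[0]

print(f"Pearson r: {r:.3f}, standardized slope: {slope:.3f}")  # the two match
```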


In follow-up tests after the race, 9 out of the 185 participants (4.9 percent) showed more decisive evidence of heart disease. One of those died while training a few months after the race. An autopsy was conducted showing abnormal left ventricular hypertrophy with myocardial fibrosis, coronary artery narrowing, and an old myocardial scar.

Who were the 9 unlucky ones? You guessed it. They were the ones with the largest increases in NT-proBNP during the race. And large increases in NT-proBNP were more common among the runners who were too inexperienced or too experienced. The ones at the extremes.

So, here is a summary of what this study is telling us:

- The 30-km cross-country race studied is no doubt a strenuous activity. So if you have not exercised in years, perhaps you should not start with this kind of race.

- By and large, individuals who had elevated markers of heart disease prior to the race also had the highest elevations of those markers after the race.

- Participation in past races was generally protective, likely due to compensatory body adaptations, with the exception of those who did too much of that.

- Prevalence of heart disease among the runners was measured at 4.9 percent. This does not beat even the mildly westernized Inuit, but certainly does not look so bad considering that the general prevalence of ischemic heart disease in the US and Sweden is about 6.8 percent.

It seems reasonable to conclude that long distance running may be healthy, unless one does too much of it. The ubiquitous J-curve pattern again.

How much is too much? It certainly depends on each person’s particular health condition, but the bar seems to be somewhat high on average: participation in 14 or more previous 30-km races.

As for the 4.9 percent prevalence of heart disease among runners, maybe it is caused by something else, and endurance running may actually be protective, as long as it is not taken to extremes. Maybe that something else is a diet rich in refined carbohydrates and sugars, or psychological stress caused by modern life, or a combination of both.

Just for the record, I don’t do endurance running. I like walking, sprinting, moderate resistance training, and also a variety of light aerobic activities that involve some play. This is just a personal choice; nothing against endurance running.

Mark Sisson was an accomplished endurance runner; now he does not like it very much. (Click here to check his excellent book The Primal Blueprint). Arthur De Vany is not a big fan of endurance running either.

Still, maybe the Tarahumara and hunter-gatherer groups who practice persistence hunting are not such huge exceptions among humans after all.

Reference:

Sahlén, A., Gustafsson, T.P., Svensson, J.E., Marklund, T., Winter, R., Linde, C., & Braunschweig, F. (2009). Predisposing factors and consequences of elevated biomarker levels in long-distance runners aged >55 years. The American Journal of Cardiology, 104(10), 1434–1440.