Objectives
A variety of birth size measures are used in studies investigating the Developmental Origins of Health and Disease (DOHaD) hypothesis. Maternal-fetal medicine research has demonstrated that different birth size measures do not signal the same information about intrauterine growth and development. On this basis, we predict that differences between birth size measures across studies, and even within studies, will lead to different outcomes, but this question has been inadequately studied. The objective of this study was to investigate whether using different birth size measures can affect relationships between birth size and young adult health outcomes.
Methods
Using a U.S. population sample of 2921 white and African-American females, we tested two hypotheses: 1) different birth size measures will not have the same associations with age at menarche, young adult height, and young adult weight and 2) different categorical birth size measures will not classify the same infants.
Results
Different birth size measures had different strengths of associations with the three health outcomes and the categorical birth size measures identified different infants.
Conclusion
We demonstrate that the choice of birth size measures does affect the predictive relationship between birth size and young adult health outcomes. Moreover, if data from multiple studies with different birth size measures are synthesized (as in a review), it is likely that the data are describing different infants and this may compromise the conclusions that can be drawn. Implications of our findings for DOHaD research are discussed.
Numerous studies demonstrate a relationship between small birth size and chronic illness in adulthood, e.g. atherosclerotic disease, stroke, hypertension [1], psychopathology [2], and cancer [3]. This relationship is known as the Developmental Origins of Health and Disease (DOHaD) hypothesis. In addition to such long-term outcomes, intermediate outcomes on the trajectory to adult disorders exist. Girls who are small at birth reach menarche at a younger age [4-7], are shorter as young adults [8-12], and have more central adiposity in early adulthood [13,14]. These young adult health outcomes are, in turn, associated with an increased risk of adult cardiovascular disease (CVD) and breast cancer [15-17].
Although there is support for the DOHaD hypothesis, gaps in our knowledge remain. Many studies have replicated the original findings [18-22], but some have not [23-28]. Using different birth size measures may explain some of the divergence in results. In empirical studies, different measures can lead to inconsistent associations between birth size and health outcomes. In reviews, treating all birth size definitions as if they were equivalent markers for intrauterine growth restriction (IUGR), the hypothesized mechanism for DOHaD, could lead to misclassification errors and misinterpretation of results [29-31]. Unfortunately, the choice of a birth size measure in many epidemiologic studies is not justified with regard to the hypothesized physiological mechanisms, and the problem is exacerbated when researchers develop idiosyncratic birth size measures that may have no pathophysiologic relevance to small birth size. Surprisingly, this methodological issue is not well discussed in the field. For example, an excellent review of methodological limitations in testing the DOHaD hypothesis does not address birth size definitions [32].
The purpose of this study was to use a cohort of girls followed from birth to young adulthood to investigate whether using different birth size measures would affect the relationship between birth size and young adult health outcomes. We tested two hypotheses: 1) different birth size measures will not have the same associations with age at menarche, young adult height and young adult weight and 2) different categorical birth size measures will not classify the same infants.
Materials
We used data from the U.S.-based National Longitudinal Survey of Youth, 1979 Children and Young Adult Surveys (NLSY79). The NLSY79 enrolled a nationally representative sample of young people living in the U.S. in December 1978, born between 1957 and 1964. Extensive data on these respondents were collected annually through 1994 and biennially thereafter. Data on biological children born to the women have been collected biennially beginning in 1986.
The primary sample inclusion criterion for this study was a daughter born to an African-American or non-Hispanic white woman (hereafter referred to as white) in the NLSY79 before 1993. We focused our study on females because there are sex differences in DOHaD hypothesis outcomes [33-34], the age of menarche is easier to detect than age at spermarche, the age of menarche has been associated with adult chronic disease [35], and age of menarche is associated with small birth size [4-7].
The eligible sample consisted of 2921 girls (white n=1626; African-American n=1295) born to 2035 mothers (white n=1177; African-American n=858). Birth weight, birth length, and gestational age were required for all analyses; 473 girls (16.2% of the sample) were excluded for missing at least one of these measures. Two girls were removed from the sample because they had biologically implausible high birth weights for very short birth lengths. Additionally, some subjects were lost because of missing data in the young adult outcomes. The final sample for analysis of menarche was 2207 (75.6% of eligible sample), for young adult height 2155 (73.8% of eligible sample), and young adult weight 2114 (72.4% of eligible sample). The study sample contained more whites and was slightly better educated than the eligible sample.
Birth size indexes
After extensively reviewing the literature, we selected for comparison the eight most commonly used birth size measures in DOHaD research (Table 1). A birth size measure was included if it had been used in at least two studies. Birth weight, length, and gestational age (GA) were reported by mothers in the first weeks after delivery. Length and GA alone are not used in DOHaD studies, so in this study they are included only as components of some of the calculated measures. Note that the birth weight percentile adjusted for GA in female births is based on the work of Oken et al. [36].
| Type | Variable | Definition |
|---|---|---|
| Continuous | Birth weight | Mother reported in pounds and ounces during first interview following the birth; converted to grams |
| Continuous | Birth weight percentile, adjusted for gestational age | Weight adjusted for gestational age |
| Continuous | Weight/length ratio | Ratio of weight (in grams) to length (meters) as reported above |
| Continuous | Ponderal Index | Relationship between birth weight and birth length: birth weight (kg)/length (m)³ |
| Binary | Low birth weight | < 2500 grams |
| Binary | Small for Gestational Age (SGA) | Cut-off is <10th percentile based on population curves |
| Binary | Low weight/length ratio | Cut-off is <10th percentile based upon the NLSY study sample distribution |
| Binary | Low Ponderal Index | <10th percentile; percentile determined based on the NLSY study sample distribution |

Table 1: Definition of birth size measures.
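The definitions in Table 1 can be written out as a short Python sketch. It is illustrative only: the function names are hypothetical, the original analyses were run in R, and the choice of percentile estimator for the sample-based cut-offs is an assumption because the text does not specify one. The SGA cut-off is not covered, since it depends on external population curves.

```python
from statistics import quantiles

GRAMS_PER_OUNCE = 28.349523125  # avoirdupois ounce expressed in grams


def pounds_ounces_to_grams(pounds, ounces):
    """Convert a mother-reported weight in pounds and ounces to grams."""
    return (pounds * 16 + ounces) * GRAMS_PER_OUNCE


def weight_length_ratio(weight_g, length_m):
    """Weight/length ratio in grams per meter."""
    return weight_g / length_m


def ponderal_index(weight_g, length_m):
    """Ponderal index in kg/m^3: birth weight (kg) divided by length (m) cubed."""
    return (weight_g / 1000.0) / length_m ** 3


def is_low_birth_weight(weight_g):
    """Binary low birth weight indicator: under 2500 g."""
    return weight_g < 2500


def below_sample_10th_percentile(values):
    """Flag values that fall under the sample's own 10th percentile.

    Assumption: the cut-off is the first decile returned by
    statistics.quantiles (default 'exclusive' method). Other percentile
    estimators would give slightly different cut-offs.
    """
    cutoff = quantiles(values, n=10)[0]
    return [v < cutoff for v in values]


# Hypothetical infant reported as 7 lb 8 oz and 0.50 m long.
weight_g = pounds_ounces_to_grams(7, 8)
print(round(weight_g, 1), round(ponderal_index(weight_g, 0.50), 1))
```

Because the ponderal index divides by length cubed, it is far more sensitive to errors in reported length than the weight/length ratio is.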
Outcome variables
Three outcome variables were studied: age at menarche, and height and weight between the ages of 17 and 24 years. Age at menarche was first collected from mothers when girls were 8-13 years old. When girls turned 14, they were asked directly as a validation step; if they had not yet experienced menarche, they continued to be asked until they had reached this developmental milestone. Each girl’s year and month of birth were used to determine her age at menarche in months, which was then converted to years. Self-reported height was recorded in inches and self-reported weight in pounds when participants were between 17 and 24 years old and post-menarche.
Other key variables
Many DOHaD studies have used samples homogeneous with regard to race, but race can affect the relationship between birth size and adult health. This is particularly salient in light of well-described differences in African-American and white birth weights, average GA, and birth lengths [37]. African-American and white developmental and somatic differences are significant throughout the hypothesized trajectories for the development of adult disorders: at birth, in adolescence and young adulthood, and in mid-late adulthood [38].
Therefore, we used race as a covariate. Daughters were assigned the race designated by their mothers. These data have been validated in a sub-study that asked the daughters to self-describe their racial/ethnic status and parent and child self-reported race were found to agree.
To test our first hypothesis, we standardized young adult outcomes and birth size measures. Measures were standardized to a z metric, z = (y - M)/SD, where M and SD are the sample mean and standard deviation. Z scores allowed us to compare the sizes of the associations between birth size indexes and outcomes in the same metric. We calculated the regressions of each intermediate outcome (age at menarche, young adult height and weight) on each birth size measure listed in Table 1, using race as a covariate. We also ran regressions for each outcome/birth size measure combination that included a race by birth size measure interaction. Only one interaction was statistically significant across the twelve regressions, so interactions were not pursued further.
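The standardization step and a covariate-adjusted slope can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: it uses ordinary least squares and omits the random intercepts for mothers used in the actual analyses, the data are invented, and the function names are hypothetical. The adjusted slope relies on the standard result that adjusting for a categorical covariate is equivalent to centering both variables within each category.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize to z = (y - M) / SD using the sample mean and sample SD."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def adjusted_slope(x, y, group):
    """OLS slope of y on x from the model y ~ 1 + x + group.

    By the Frisch-Waugh-Lovell theorem this equals the slope obtained after
    centering x and y within each level of the categorical covariate.
    """
    def center_within(values):
        level_means = {
            g: mean(v for v, gg in zip(values, group) if gg == g)
            for g in set(group)
        }
        return [v - level_means[g] for v, g in zip(values, group)]

    xc, yc = center_within(x), center_within(y)
    return sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)


# Invented data for illustration only (not NLSY79 values).
birth_weight_g = [2900, 3100, 3300, 3500, 3000, 3200, 3400, 3600]
height_cm = [160.0, 163.0, 165.0, 168.0, 161.0, 162.5, 166.0, 167.0]
race = [0, 0, 0, 0, 1, 1, 1, 1]

beta = adjusted_slope(z_scores(birth_weight_g), z_scores(height_cm), race)
print(round(beta, 3))
```

Because both variables are z-scored first, the slope is read as the standard deviation change in the outcome per standard deviation change in birth size, which is what makes coefficients comparable across measures.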
Our second hypothesis was tested by examining the overlap between the four groupings of infants defined by accepted cut points on the four measures of birth size. We analyzed the overlap in two ways. First, for each birth size measure A, we calculated the conditional probability that a baby who was small as defined by A would also be small as defined by an alternative measure B. Second, for each pair of definitions, we calculated the two-by-two table for small vs. not-small according to both definitions and computed the phi coefficient. The phi coefficient is the equivalent of a Pearson r coefficient for binary variables. It takes the value 1.0 for perfect agreement between the two variables and the value 0 for chance agreement.
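Both overlap statistics follow directly from the two-by-two table. The sketch below, in Python with invented indicator vectors and hypothetical function names, shows one way to compute them; it is not the authors' code.

```python
from math import sqrt


def two_by_two(a, b):
    """Cell counts (n11, n10, n01, n00) for two binary indicators."""
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)
    return n11, n10, n01, n00


def conditional_probability(a, b):
    """P(small by B | small by A)."""
    n11, n10, _, _ = two_by_two(a, b)
    return n11 / (n11 + n10)


def phi_coefficient(a, b):
    """Phi coefficient: Pearson correlation of two binary variables."""
    n11, n10, n01, n00 = two_by_two(a, b)
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denominator


# Invented indicators for ten infants (1 = classified as small).
small_by_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
small_by_b = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print(round(conditional_probability(small_by_a, small_by_b), 3))
print(round(phi_coefficient(small_by_a, small_by_b), 3))
```

Note that the conditional probability is asymmetric (conditioning on A differs from conditioning on B), whereas phi is symmetric in the two definitions.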
Analyses were performed using R version 2.15.3 (http://www.r-project.org/). Regressions were calculated using the lme function in the nlme package. All regressions and mean comparisons included random intercepts for mothers to account for siblings in the sample.
Sample descriptive statistics
Table 2 displays the means and standard deviations for the three young adult health outcomes, the continuous birth measures, and the proportions for the binary variables for birth size. The first and second columns of data show results for whites and African-Americans. The third column presents the data from the whole sample and the fourth displays the p values for the comparison between whites and African-Americans.
As young adults, African-American girls reached menarche earlier, were shorter, and were heavier. African-American infants, however, were lighter and leaner (lower weight/length) than white infants. But they also had larger ponderal indexes (higher weight/length³). These race differences were in every case highly statistically significant. Therefore, all subsequent analyses include race as a covariate.
| Variable | White Girls (N=1475) | African-American Girls (N=971) | All Girls (N=2446) | p |
|---|---|---|---|---|
| Young adult outcomes: Mean (SD) | | | | |
| Age at menarche (mo) | 150.2 (14.7) | 143.6 (16.3) | 147.5 (15.7) | <.0001 |
| Height (cm) | 165.2 (7.1) | 163.9 (7.4) | 164.7 (7.3) | <.0001 |
| Weight (kg) | 63.0 (13.5) | 68.5 (17.0) | 65.3 (15.2) | <.0001 |
| Birth size measures: Mean (SD) | | | | |
| Birth weight (g) | 3339.6 (537.4) | 3101.7 (593.3) | 3245.2 (572.1) | <.0001 |
| Birth weight for gestational age (%) | 53.5 (28.8) | 40.4 (29.0) | 48.3 (29.6) | <.0001 |
| Birth weight (g)/Birth length (m) | 6550.1 (927.3) | 6305.9 (1227.3) | 6453.1 (1063.1) | <.0001 |
| Ponderal index | 25.6 (6.0) | 27.5 (14.6) | 26.3 (10.3) | <.0001 |
| Binary birth size measures: % Small birth size | | | | |
| Low birth weight | 6.3 | 13.2 | 9 | <.0001 |
| Low birth weight corrected for gestational age | 7.9 | 18.1 | 11.9 | <.0001 |
| Low birth weight/Birth length | 7.8 | 13.5 | 10.1 | <.0001 |
| Low ponderal index | 7.9 | 13.6 | 10.1 | <.0001 |

Table 2: Young adult outcomes and birth size measure by participant race: means and standard deviations.
Prediction of young adult outcomes by different birth size measures
The results from the series of regressions used to answer this question are summarized in Table 3. This table reports the standardized regression coefficients for the relationships between the birth size measures (continuous and binary) and the young adult outcomes. Because the regression coefficients are standardized, for each young adult outcome we can compare the magnitudes of the effects of different birth size measures.
| Predictor | Age at Menarche (N=2207): Binary | Age at Menarche (N=2207): Continuous | Height (N=2155): Binary | Height (N=2155): Continuous | Weight (N=2114): Binary | Weight (N=2114): Continuous |
|---|---|---|---|---|---|---|
| Birth weight | 0.016 | 0.052*** | 0.092*** | 0.204*** | 0.022 | 0.103** |
| Birth weight for gestational age (%) | 0.039** | 0.043*** | 0.086*** | 0.237*** | 0.074* | 0.133*** |
| Birth weight/Birth length | 0.028 | 0.055*** | 0.050** | 0.117*** | 0.034 | 0.078** |
| Ponderal index | 0.037* | -0.016 | 0.029 | -0.032 | 0.017 | 0.012 |

Note: Each row reports a separate set of regressions for that birth size measure. Analyses are clustered by mothers to account for multiple siblings. ‘*’=p<0.05, ‘**’=p<0.01, and ‘***’=p<0.001.

Table 3: Standardized coefficients for the regression of young adult outcomes on birth size using z-Transformed scores with race as a covariate.
The ponderal index (PI) has no consistent relationship with the young adult outcomes. It is statistically significant in only one of six regressions, and the sign of the coefficient varies. This pattern differs substantially from that of the other birth size variables, which we therefore discuss separately.
The associations between the other three birth size variables (birth weight, birth weight adjusted for gestational age, and weight/length) and the young adult outcomes are consistently positive; that is, smaller babies were shorter and lighter as adults and reached menarche at a younger age. However, the associations for the continuous versions of the birth size variables were on average about 2.5 times larger than those for the binary versions. Moreover, whereas the continuous versions were always statistically significant, only five of the nine binary birth size/adult outcome associations were significant.
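The comparison of continuous and binary coefficients can be checked against Table 3 with a few lines of Python. The averaging method is an assumption, since the text does not say how the "about 2.5 times" figure was obtained; taking the mean of the per-cell continuous/binary ratios is one reading that is consistent with it.

```python
# (binary coefficient, continuous coefficient, binary coefficient starred?)
# Values transcribed from Table 3, excluding the ponderal index row.
table3 = {
    ("birth weight", "menarche"): (0.016, 0.052, False),
    ("birth weight", "height"): (0.092, 0.204, True),
    ("birth weight", "weight"): (0.022, 0.103, False),
    ("birth weight for GA", "menarche"): (0.039, 0.043, True),
    ("birth weight for GA", "height"): (0.086, 0.237, True),
    ("birth weight for GA", "weight"): (0.074, 0.133, True),
    ("weight/length", "menarche"): (0.028, 0.055, False),
    ("weight/length", "height"): (0.050, 0.117, True),
    ("weight/length", "weight"): (0.034, 0.078, False),
}

# Assumed summary: mean of the per-cell continuous/binary ratios.
ratios = [continuous / binary for binary, continuous, _ in table3.values()]
mean_ratio = sum(ratios) / len(ratios)

# Number of binary coefficients carrying at least one significance star.
n_significant_binary = sum(starred for _, _, starred in table3.values())

print(round(mean_ratio, 2), n_significant_binary)
```

With the Table 3 values, the mean ratio comes out close to 2.5 and five of the binary coefficients are starred.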
The strength of association between birth size measures and young adult outcome also varied depending on the outcome. Adult height was most strongly predicted by the birth size variables, followed by adult weight, while age at menarche was the least strongly predicted. For example, being a standard deviation heavier at birth was associated with being a slight 0.05 of a standard deviation older at menarche, compared to being about 0.20 of a standard deviation taller.
Finally, Table 3 shows that for any given adult outcome, the strength of association varied depending on which birth size variable was used. However, there is no consistent pattern indicating that any one birth size measure best predicts all the young adult outcomes.
Different birth size measures to identify groups of infants
We next examined whether the categorical birth size measures would identify the same groups of infants. If so, cross-classification in 2×2 tables should reveal high agreement. We calculated the probability of an infant being in one group, given that the baby was in another group. We also analyzed the agreement between how the binary variables categorized infants.
The results are displayed in Table 4. Part A of the table shows the probability of being in the “small” category of each of the indexes, given that an infant was classified as “small” on another one. The conditional probabilities are all relatively low. Part B shows that agreement in categorizing infants is also poor, with the exception of a phi = 0.73 between infants in the low birth weight group and those in the low birth weight/length ratio group.
| | Low Birth Weight | Low Birth Weight Corrected For Gestational Age | Low Birth Weight/Birth Length | Low Ponderal Index |
|---|---|---|---|---|
| Low birth weight | 100 | 61.1 | 79.6 | 29 |
| Low birth weight corrected for gestational age | 46.2 | 100 | 48.6 | 27.4 |
| Low birth weight/Birth length | 71.5 | 57.7 | 100 | 46.3 |
| Low ponderal index | 25.8 | 32.3 | 46 | 100 |

Part A: Probability (%) of low birth size as defined by column, conditional on small birth size as defined by row.

| | Low Birth Weight | Low Birth Weight Corrected For Gestational Age | Low Birth Weight/Birth Length | Low Ponderal Index |
|---|---|---|---|---|
| Low birth weight | 1 | 0.48 | 0.73 | 0.2 |
| Low birth weight corrected for gestational age | 0.48 | 1 | 0.47 | 0.21 |
| Low birth weight/Birth length | 0.73 | 0.47 | 1 | 0.4 |
| Low ponderal index | 0.2 | 0.21 | 0.4 | 1 |

Part B: Phi coefficients for agreement between row and column definitions of small birth size.

Table 4: Identifying the same neonates: Agreement among birth size measures.
Variability in birth size measures can be an important challenge in DOHaD research, but is not well-studied. The goals of our study were to determine whether the eight most commonly used research birth size measures equivalently predicted three young adult female health outcomes and whether commonly used categorical birth size measures identified the same girls in their infancies.
We demonstrated three findings of importance to DOHaD researchers. First, the birth size measure used does have an effect on the strength of the association between birth size and young adult health outcomes, which is what we would expect from maternal-fetal medicine research indicating that they are markers of different intrauterine problems. This means that a researcher needs to carefully consider which definition of birth size he or she is going to use for which health outcome. For example, PI may show no relationship with an adult health outcome, but birth size used as a continuous measure may reveal a strong association.
Second, the outcome variable used also matters. Different birth size measures predicted different health outcomes and there was no single birth size definition that predicted all health outcomes. Third, the most commonly used categorical definitions of small birth size classified babies differently. This is particularly important in cross-study syntheses of data (reviews) because the studies reviewed may not be investigating the same infants. This problem can seriously undermine the validity of the conclusions drawn from such reviews, and it is not a problem that can be detected using the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) methodology for reviews.
What are possible explanations for our findings? The simplest is that although these definitions are assumed to measure the latent variable of IUGR, they are not all doing so. Simple birth weight or length does not necessarily indicate IUGR [39], particularly for late pregnancy growth restriction, as it will result in asymmetrical smallness. SGA and IUGR are often used synonymously, but they are not interchangeable. The category of SGA includes infants who have suffered IUGR, but it also includes neonates who are constitutionally small and there is no way to differentiate between the two. SGA also cannot differentiate between infants suffering IUGR early in pregnancy from those with late onset IUGR.
Because low birth weight, even adjusted for gestational age, may not be a good proxy for IUGR, clinical researchers have looked instead for an index of body composition [40]. Measures that quantify the relationship between soft tissue mass and skeletal growth have been demonstrated to be better proxies for IUGR, e.g. weight/length ratio [41] and the PI [42]. Resnik and Creasy state that PI is the most valid measure of IUGR in term infants [43]. This is of particular importance to DOHaD researchers who often restrict their samples to neonates delivered at term. The distribution of PI has also been found to be similar across races and genders [44].
However, others have disputed this claim, showing that PI does not correlate as highly with triceps skin folds (a standard indicator of fat distribution and thus, intrauterine nutrition) as does weight/length ratio [45].
PI may also help determine the timing of exposure to adverse intrauterine conditions causing IUGR [46]. This was confirmed in a study of serial ultrasound examinations compared with neonatal anthropometric measurements. Decreased birth weight and decreased PI signaled third trimester IUGR. First trimester IUGR produced neonates who had the highest birth weights and PIs of the entire IUGR group [39].
These clinical data make it clear that recognizing IUGR is difficult and depends not only on size and timing of birth but also on body composition. This highlights the importance of how results from the various birth size measures are used and interpreted. DOHaD research could be improved by incorporating clinical knowledge into how birth size is defined. Sub-grouping infants in a population study using different measures sequentially could shed more light on epidemiologic studies investigating possible mechanisms. For example, first identifying infants who are SGA and then measuring the PI within that group may give investigators a more valid sub-group of IUGR infants.
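The sequential sub-grouping idea can be sketched in Python as a two-step filter. This is a hypothetical illustration only: the record layout, the function name, the use of a low-PI criterion in the second step, and the cut-off value are all assumptions, and the cut-off in particular has no clinical standing.

```python
def sequential_subgroup(infants, pi_cutoff):
    """Two-step selection: keep SGA infants, then keep those with a low PI.

    pi_cutoff is a placeholder threshold supplied by the caller; choosing it
    is a clinical decision that this sketch does not address.
    """
    sga_infants = [infant for infant in infants if infant["sga"]]
    return [infant for infant in sga_infants
            if infant["ponderal_index"] < pi_cutoff]


# Invented records for illustration.
infants = [
    {"id": "a", "sga": True, "ponderal_index": 21.0},
    {"id": "b", "sga": True, "ponderal_index": 27.5},
    {"id": "c", "sga": False, "ponderal_index": 20.0},
]

selected = sequential_subgroup(infants, pi_cutoff=23.0)
print([infant["id"] for infant in selected])
```

Applying the measures in sequence rather than in parallel means an infant with a low PI who is not SGA is excluded, which is the intended narrowing of the candidate group.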
The second explanation for our findings is that the association between birth size and health outcomes may be differentially mediated or moderated by factors such as postnatal growth rate, childhood stress, or genotype. Epidemiologic studies often do not control for these variables. Bollen and colleagues have proposed an innovative statistical solution using structural equation modeling to partially compensate for the myriad errors associated with using birth size measures, as well as the influence of postnatal factors [47].
Although our primary study goal was to compare the effects of different birth size measures on health outcomes, there were some interesting substantive results. Simple (unadjusted) weight/length was the strongest predictor of age at menarche, although every continuous index except PI had a significant positive association with this outcome. The binary variable of SGA, previously reported in Reagan et al. [6], predicted younger age at menarche, a finding similar to that of Morris et al. [48]. Low PI also predicted younger age at menarche.
Continuous birth weight for gestational age was the strongest predictor of young adult height. However, every other variable except PI was also significantly correlated with height. Other researchers in DOHaD studies of young adult stature in females have reported that birth weight and length were positively correlated with later stature [8,11].
Birth size was positively associated with weight in young adulthood, and the continuous variable of weight for gestational age was the strongest predictor. The direction of this association may appear to contradict the DOHaD hypothesis, but our findings are similar to the results reported by Labayen et al. when they focused on the relationship between birth weight and late adolescent weight [13]. When they further investigated birth weight and later body composition, e.g., fat distribution, they discovered that birth weight was inversely associated with anthropometric markers of central adiposity. Similarly, Breukhoven and colleagues did not find a negative correlation between small size at birth or prematurity and young adult weight, but with Dual Energy X-ray Absorptiometry (DEXA) scanning they were able to detect higher central fat mass in the participants who had been preterm and smaller [14].
There were limitations in our goal to investigate the effects of different birth size measures in DOHaD research. The most significant is that we were not studying the typical adult outcomes of cardiovascular disease, hypertension, stroke, or diabetes. It is possible that different birth size measures would have had uniform associations with these adult outcomes, although we think this is doubtful. Second, birth size data and outcome data were self-reported. We cannot determine in what directions this may have influenced our results. Finally, we did not study males and therefore cannot generalize our results to them.
The gold standard for defining IUGR is direct serial measurement of fetal development in utero. However, these methods are difficult and expensive to apply on a population basis, so we expect that many researchers will continue to use anthropometric data routinely available at delivery to investigate the DOHaD hypothesis. Our findings suggest that researchers should choose birth size measures depending on their hypotheses. Sequential categorization of infants may best define a valid group of IUGR infants, or the use of structural equation modeling as described by Bollen may help mitigate measurement error in using birth size measures. Researchers doing a systematic or meta-analytic review of the literature should not synthesize data across different birth size measures, as such studies are likely not investigating the same infants. Instead, data should be analyzed in sub-groups by the birth size measure used.
Ethics: The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national guidelines on human experimentation (NIH) and with the Helsinki Declaration of 1975, as revised in 2008, and have been approved by the Institutional Review Board at The Ohio State University.
Financial Support: This work was supported by a grant from the U.S. National Institutes of Health (NIH): NR009384, Salsberry and Reagan, Co-PIs.
All authors have no conflicts of interest to disclose.
Citation: Pajer KA, Salsberry PJ, Gardner W, Reagan PB, Fang MZ, et al. (2019) Different Birth Size Measures are Not Equivalent in Predicting Health Outcomes in Young Women. J Neonatol Clin Pediatr 6: 035.
Copyright: © 2019 Kathleen A Pajer, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.