Are first babies more likely to be late?

Yes, and also more likely to be early. But just a little.

If you are pregnant with your first child, you might have heard that first babies are more likely to be late. Also, you might have heard that they are more likely to be early. As it turns out, both are true.

  • If "early" means preterm – before 37 weeks of pregnancy – first babies are more likely to be early. Based on live births recorded in the National Survey of Family Growth, about 12% of first babies are born preterm, compared to 10% of other babies.
  • And if "late" means after 40 weeks, first babies are more likely to be late: about 15%, compared to 10% of other babies.

The following figure shows the distribution of pregnancy length for live births (excluding multiple births and deliveries by C-section):

Distribution of pregnancy lengths for full-term single births. The shaded areas show 90% confidence intervals.

First babies are less likely to be "on time" at 39 weeks, and more likely to be a little late, between 41 and 43 weeks.

Among full-term pregnancies, first babies are born about 1.3 days later on average. But the average doesn’t tell the whole story.

How much longer?

Suppose you are at the beginning of Week 37. The average time until delivery at this point is 2.8 weeks.

Two weeks later, at the beginning of Week 39, the average remaining time is 1.2 weeks. As you would expect, with each week that goes by, the average remaining time goes down.

But then it stops.

The following figure shows the cruelest statistic in obstetrics, the average remaining time computed at the beginning of each week of pregnancy:

Average remaining time at the beginning of each week of pregnancy, for live births, excluding multiple births and deliveries by C-section.

Between Weeks 39 and 43, the remaining time until delivery barely changes. Time goes by, but the finish line keeps moving into the future.

At Week 39, if you ask a doctor when the baby will arrive, they say something like "Any day now." If you ask again at Week 40, they give the same answer. And again at Week 41. That might be frustrating to hear, but they are right; for almost five weeks, you are always one week away.
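For readers who want to see how a number like this can be computed, here is a minimal sketch. It is not the code from the notebooks: the function name and the toy data are made up for illustration, and it assumes pregnancy lengths are recorded in whole weeks, so a birth during the current week counts as zero remaining time.

```python
import numpy as np

def mean_remaining_weeks(lengths, start_week):
    """Average remaining time, in weeks, among pregnancies that reach start_week.

    Assumes lengths are in whole weeks (an assumption of this sketch).
    """
    lengths = np.asarray(lengths, dtype=float)
    # Condition on pregnancies that have not ended before start_week
    still_pregnant = lengths[lengths >= start_week]
    if still_pregnant.size == 0:
        return float("nan")
    return float(np.mean(still_pregnant - start_week))

# Made-up lengths in weeks, for illustration only (not NSFG data)
toy_lengths = [37, 38, 39, 39, 39, 40, 40, 41, 42, 43]
for week in range(37, 44):
    print(week, round(mean_remaining_weeks(toy_lengths, week), 2))
```

The key step is the conditioning: each week, the pregnancies that have already ended drop out of the average, which is why the remaining time can level off instead of shrinking.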

The situation is a little worse for first babies. The following figure shows average remaining time for first babies and others:

Average remaining time at the beginning of each week of pregnancy for first babies and others.

At the beginning of Week 39, the average remaining time is 1.3 weeks for first babies and 1.1 weeks for others. That difference is about 36 hours.

The gap persists for a week or so, but after Week 41, first babies and others are indistinguishable.

Maybe this week?

As you plan for the final weeks of pregnancy, the average time until delivery is not very helpful. You might prefer to know, at the beginning of each week, the probability of delivering in the next seven days.

The following figure answers that question for first babies and others:

Probability of delivering in the next week, computed at the beginning of each week.

At the beginning of Week 37, you can pack a bag if you want to, but there is only a 6% chance you will need it, first baby or not.

At the beginning of Week 38, the chance of delivering in the next week is about 11%, not much higher.

But at the beginning of Week 39, it is substantially higher: 54% for first babies and 61% for others.

This gap persists for a week or so; then after Week 41, the two curves are effectively the same.
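The probability of delivering in the next week is a conditional probability: among pregnancies that reach the beginning of a given week, what fraction end during that week? The sketch below shows one way to compute it. The function name and the toy data are hypothetical, and it assumes lengths are recorded in whole weeks; it is not the code from the notebooks.

```python
import numpy as np

def prob_deliver_this_week(lengths, week):
    """Fraction of pregnancies reaching `week` that end during that week.

    Assumes lengths are in whole weeks (an assumption of this sketch).
    """
    lengths = np.asarray(lengths)
    at_risk = lengths >= week      # still pregnant at the start of the week
    if not at_risk.any():
        return float("nan")
    delivered = lengths == week    # delivered during the week
    return float(delivered.sum() / at_risk.sum())

# Made-up lengths in weeks, for illustration only (not NSFG data)
toy_lengths = [37, 38, 39, 39, 39, 40, 40, 41, 42, 43]
for week in range(37, 44):
    print(week, round(prob_deliver_this_week(toy_lengths, week), 2))
```

Computing this separately for first babies and others, week by week, would give two curves like the ones in the figure.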

Are these differences real?

The results in this article might reflect real biological and medical differences between first babies and others. In that case, they are likely to be predictive: if you are expecting your first baby, you will have to wait a little longer, on average, than for subsequent births.

But these results might be due to measurement error.

  • By convention, the duration of pregnancy is measured from the first day of the mother’s last menstrual period. The reported lengths might not be precise and might be less precise for first-time mothers.
  • Also, NSFG data is based on interviews, not medical records, so it relies on the memories of respondents. Reported lengths might be less accurate for first babies.

But even if measurement errors are different for first babies, it’s not clear why they would be biased toward longer durations.

The apparent differences between first babies and others might also be caused by a confounding factor related to pregnancy length.

  • If a woman’s first baby is delivered by C-section, subsequent deliveries are more likely to be scheduled and less likely to be late. I excluded deliveries by C-section for this reason.
  • If first babies are less likely to be induced, more of them would be allowed to be late. I don’t know of a reason they would be, but the dataset doesn’t have information on induced labor, so I can’t confirm or rule out this possibility.

The results I’ve presented are statistically significant, which means that if there were no difference between first babies and others, we would be unlikely to see these gaps. The results are also consistent over the course of the survey, from 2002 to 2017. So it is unlikely that the apparent differences are due to random sampling.
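To make the idea of statistical significance concrete, here is a sketch of a permutation test for a difference in means. This article does not say which test produced the results above, so treat this as one common approach rather than the method used here; the function name and the toy data are made up.

```python
import numpy as np

def permutation_p_value(group_a, group_b, iters=1000, seed=0):
    """Estimate how often a difference in means at least as big as the
    observed one appears when group labels are shuffled at random."""
    rng = np.random.default_rng(seed)
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(iters):
        # Shuffling simulates a world with no difference between groups
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:len(a)].mean() - shuffled[len(a):].mean())
        if diff >= observed:
            count += 1
    return count / iters

# Made-up lengths in weeks, for illustration only (not NSFG data)
firsts = [39, 40, 40, 41, 41, 42]
others = [38, 39, 39, 40, 40, 41]
print(permutation_p_value(firsts, others))
```

A small p-value means that shuffled data rarely produce a gap as large as the observed one, which is what "unlikely to see these gaps" means in practice.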

More reading

This article is based on a case study in my book, Think Stats: Exploratory Data Analysis in Python, which you can download at no cost from Green Tea Press. It is also available in paper and electronic formats from O’Reilly Media (Amazon affiliate link).

I published a similar analysis (based on older data) on my blog, Probably Overthinking It, where you can read more articles on data science and Bayesian statistics.

If you enjoyed this article, you might also like "The Inspection Paradox is Everywhere", which is about a surprisingly ubiquitous statistical illusion.

Methodology

I used data from the National Survey of Family Growth (NSFG), which "gathers information on family life, marriage and divorce, pregnancy, infertility, use of contraception, and men’s and women’s health."

The dataset includes records of 43,292 live births, of which I excluded 737 multiple births and 11,003 deliveries by C-section. I also excluded 3 cases where the duration of pregnancy was reported to be 50 weeks or more. This analysis is based on the remaining 31,906 cases.

The NSFG is representative of United States residents, but it uses stratified sampling, so some groups are oversampled. I used weighted resampling to correct for oversampling and to generate the confidence intervals shown in the figures.
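The general idea of weighted resampling can be sketched as follows. This is an illustration, not the code from the notebooks: the function names, the toy values, and the toy weights are all hypothetical. Each resample draws rows with replacement, with probability proportional to the sampling weight, and the spread of a statistic across resamples gives a confidence interval.

```python
import numpy as np

def weighted_resample(values, weights, rng):
    """Draw len(values) items with replacement, with probability
    proportional to each item's sampling weight."""
    values = np.asarray(values)
    weights = np.asarray(weights, dtype=float)
    p = weights / weights.sum()
    idx = rng.choice(len(values), size=len(values), replace=True, p=p)
    return values[idx]

def bootstrap_ci(values, weights, stat=np.mean, iters=1000, level=90, seed=0):
    """Percentile confidence interval for `stat` from weighted resamples."""
    rng = np.random.default_rng(seed)
    stats = [stat(weighted_resample(values, weights, rng)) for _ in range(iters)]
    tail = (100 - level) / 2
    low, high = np.percentile(stats, [tail, 100 - tail])
    return float(low), float(high)

# Made-up lengths (weeks) and sampling weights, for illustration only
toy_lengths = [37, 38, 39, 39, 40, 40, 41, 42]
toy_weights = [1.0, 2.0, 1.5, 3.0, 2.5, 1.0, 2.0, 0.5]
print(bootstrap_ci(toy_lengths, toy_weights))
```

Respondents from oversampled groups carry smaller weights, so they are drawn less often, and each resample looks more like the population than the raw sample does.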

The details of data cleaning, validation, and resampling are in this Jupyter notebook. The details of the analysis are in this notebook.

About the author

Allen Downey is a Professor of Computer Science at Olin College in Massachusetts. He and his wife have two daughters: the first was born a week early; the second was two weeks late, after a little encouragement.

