How much would mothers earn if they didn't have children?
A review of the child penalty literature
The child penalty, sometimes also called the motherhood penalty, describes the adverse effects of having children on women’s labour market outcomes, such as employment or earnings. These child penalties are large in many countries and are thought to be one of the main drivers of gender inequality in high- and middle-income countries today. Below you can see child penalties in employment in different countries:
The child penalty in employment across the world. The scale is in % and shows the average drop in employment for women relative to men in the 10 years after childbirth. Source: Child Penalty Atlas
For the rest of the post, I will use annual earnings as the labour market outcome of interest and present all definitions and results in terms of earnings. Before we dive into how to measure child penalties, let’s look at the two most common definitions of the child penalty:
Definition 1: The impact of having children on a woman’s earnings, compared to what she would have earned counterfactually if she hadn’t had children.
Definition 2: The impact of having children on the earnings gap between men and women. This means taking the difference between the impact of children on women’s earnings and the impact on men’s earnings.
These two definitions would end up coinciding if the impact of having children on men’s earnings is negligible.
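In potential-outcomes shorthand, the two definitions can be sketched as follows. The notation is my own and not taken from any of the papers: Y(1) stands for earnings with children, Y(0) for counterfactual earnings without children, w and m for women and men, and t for years since the first birth.

```latex
\Delta^{w}_t = E\big[Y^{w}_t(1) - Y^{w}_t(0)\big], \qquad
\Delta^{m}_t = E\big[Y^{m}_t(1) - Y^{m}_t(0)\big]

\text{Definition 1:}\quad \Delta^{w}_t
\qquad\qquad
\text{Definition 2:}\quad \Delta^{w}_t - \Delta^{m}_t
```

Written this way, it is easy to see that the two definitions coincide whenever the effect on men is close to zero. In practice, papers usually report these effects as a percentage of counterfactual earnings rather than in levels.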
Constructing the counterfactuals that we need for both definitions is not easy! We obviously can’t run a randomised controlled trial where we tell lots of men and women to either have children or not have them and then compare the outcomes between the two groups. Unfortunately, it’s also not as easy as just comparing mothers to childless women and fathers to childless men. These groups differ in lots of ways that influence both whether they have children and what their earnings are. Instead we have to think of clever ways to estimate what the men and women in question would have earned if they hadn’t had children.
This post will discuss the two main approaches for estimating child penalties: event studies and using IVF patients. For both approaches, I will explain the main drawbacks and then present some newer papers trying to deal with these problems.
The event study approach
Let’s start with event studies. This is the most common way of estimating child penalties, and the employment child penalties that I showed above were estimated using this method. The main paper that popularised the method is Kleven et al. 2019, which uses Danish data to estimate the effects of having children on gender inequality.
How does an event study work? The idea is to line all of our observations up around “the event” which is the birth of the first child. In the data that the authors are using for this paper, there are lots of different individuals who we track over time and who have their first child at different ages. We define t=0 as the year of childbirth and look at everyone’s earnings in the 5 years before childbirth and the 10 years after childbirth. Here is a visualisation of how we line up observations from different people to get event times:
Source: Own visualisation with Canva.
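The same alignment can be sketched in a few lines of code. This is a toy illustration with made-up numbers, not the authors’ code: the names `panel`, `first_birth_year`, `add_event_time` and `mean_by_event_time` are all hypothetical.

```python
# Toy panel of (person, calendar year, annual earnings). All values are made up.
panel = [
    ("A", 1998, 30_000), ("A", 1999, 31_000), ("A", 2000, 24_000), ("A", 2001, 25_000),
    ("B", 2003, 40_000), ("B", 2004, 41_000), ("B", 2005, 33_000), ("B", 2006, 34_000),
]
# Year in which each person has their first child (also made up).
first_birth_year = {"A": 2000, "B": 2005}


def add_event_time(rows, birth_years, window=(-5, 10)):
    """Replace calendar year with event time t = year - year of first birth.

    Only observations from 5 years before to 10 years after the birth are kept.
    """
    lo, hi = window
    aligned = []
    for person, year, earnings in rows:
        t = year - birth_years[person]
        if lo <= t <= hi:
            aligned.append((person, t, earnings))
    return aligned


def mean_by_event_time(aligned):
    """Average earnings across people at each event time."""
    totals = {}
    for _, t, earnings in aligned:
        total, n = totals.get(t, (0, 0))
        totals[t] = (total + earnings, n + 1)
    return {t: total / n for t, (total, n) in sorted(totals.items())}


aligned = add_event_time(panel, first_birth_year)
averages = mean_by_event_time(aligned)
print(averages)
```

Person A has a child in 2000 and person B in 2005, but after the alignment both contribute to the same t=0 average.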
Once you have done this for lots and lots of respondents and lined everyone up in this way, average earnings around childbirth will look something like this:
Source: NLSY79 data, data and cleaning code from replication package by Henrik Kleven, own visualisation
We can already see that women’s and men’s earnings evolve roughly in parallel before childbirth and that there is a big drop in earnings for women around t=0 when the first childbirth happens.
How do we go from this to estimating the child penalty? We have the actual average earnings around childbirth that you can see above. What we need to do now is construct the counterfactual: What would the earnings have been for all those women if they didn’t have children? The way the regression does this is by estimating what earnings would have been as a function of the year and the age of the woman. In a slightly simplified way, you can imagine that for someone who has a child at age 20 in 2000, we would then construct the counterfactual earnings at childbirth as the predicted average earnings of a 20 year-old in 2000. When we do this for lots and lots of women, we then get what the average counterfactual earnings would have been at each point in time before and after childbirth.
Once we have done this for each event time, we can look at how far actual earnings fall below the counterfactual after childbirth. We then repeat the same procedure for the men, given that Kleven et al. use Definition 2 for the child penalty and therefore compare the drop in earnings between men and women after having a child. This is what their results look like:
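For readers who like to see it written down, the regression behind this is roughly of the following form. This is my paraphrase of the kind of specification used in event study papers such as Kleven et al. 2019, so treat the notation as a sketch and check the paper for the exact version. Here i indexes individuals, s the calendar year and t the event time.

```latex
Y_{ist} = \sum_{j \neq -1} \alpha_j \,\mathbf{1}[j = t]
        + \sum_{k} \beta_k \,\mathbf{1}[k = \mathrm{age}_{is}]
        + \sum_{y} \gamma_y \,\mathbf{1}[y = s]
        + \nu_{ist}

\tilde{Y}_{ist} = \sum_{k} \hat{\beta}_k \,\mathbf{1}[k = \mathrm{age}_{is}]
                + \sum_{y} \hat{\gamma}_y \,\mathbf{1}[y = s],
\qquad
P_t = \frac{\hat{\alpha}_t}{E[\tilde{Y}_{ist} \mid t]}
```

The age and year dummies produce the counterfactual prediction, i.e. predicted earnings with the event-time dummies switched off. The event-time coefficients measure how far actual earnings are from that prediction, relative to the year before birth (t=-1 is the omitted category). Dividing by the counterfactual turns the effect into a percentage.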
Source: Kleven et al. 2019
The long-run child penalty, measured at t=10, is 19.4%. We see that there is only a very small effect for fathers and their earnings remain very stable around the arrival of the first child. If we applied Definition 1, then the child penalty would only be a little bit larger, slightly over 20%.
The authors also look at the very long run and find that the child penalty in earnings stays pretty constant even 20 years after childbirth!
Problems with the event study approach
The problem with event studies is that they essentially assume that the timing of childbirth is unrelated to the earnings outcome. However, there is evidence that women time their first births around the time their earnings profile flattens out. If the earnings profile is flattening around childbirth, that means that the counterfactual earnings we estimate after childbirth are likely to be too high which then leads to event studies overestimating child penalties.
On the other hand, there is also reason to think that event studies might underestimate child penalties. This boils down to using bad controls. As I explained above, we use age and year variables in the regression in order to predict how earnings evolve over the life-cycle for different cohorts. The problem is that by including the age variables, we basically use all women of that age as the control group. However, more and more women of each age will have already had a child, meaning that their earnings are lower than what they would have been at that age in the absence of a child. By including these controls who are actually treated, we therefore bias our estimate of the child penalty downward.
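A stylised calculation shows the direction of this bias. This is not the actual regression, just a back-of-the-envelope sketch with made-up numbers: childless women of a given age earn 100, mothers earn 80 (so the true penalty is 20%), and the hypothetical function `estimated_penalty` measures the penalty against the average of all women of that age.

```python
def estimated_penalty(childless_earnings, mother_earnings, share_mothers):
    """Penalty measured against the average earnings of ALL women of that age.

    share_mothers is the fraction of the comparison group that already has a child.
    """
    contaminated_counterfactual = (
        (1 - share_mothers) * childless_earnings + share_mothers * mother_earnings
    )
    return (contaminated_counterfactual - mother_earnings) / contaminated_counterfactual


# True penalty when the comparison group contains no mothers at all.
true_penalty = (100 - 80) / 100

for share in (0.0, 0.25, 0.5, 0.75):
    print(share, round(estimated_penalty(100, 80, share), 3))
```

The more mothers there are in the comparison group, the lower the “counterfactual” earnings and the smaller the measured penalty, even though the true penalty is 20% throughout.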
The IVF approach
The other main approach to estimating child penalties comes from using IVF patients. One of the main papers here is Lundborg et al. 2017 who also happen to use data from Denmark so that we can compare their results nicely to those of Kleven et al. 2019.
The idea behind the IVF approach is to assume that whether the first IVF treatment is successful or not is essentially random. In the introduction, I explained that we unfortunately can’t just compare mothers to childless women because these groups are too different in many other ways. However, when we look at women going for IVF, these should all be similar since they are already selected for wanting to have children. The IVF procedure then acts as a random assignment of children to some of these women and if we compare their outcomes, we should get a causal estimate of the effect of having children. This is obviously the causal estimate according to Definition 1, since we’re directly comparing women with and without children and aren’t looking at gender inequality.
Importantly, Lundborg et al. 2017 focus on childless women going through their first IVF treatment. The reason they focus on childless women is that they want to estimate the effect of having children compared to not having children, rather than the effect of having additional children. The reason they focus on the first treatment only is to ensure that the outcomes are actually random. While going for IVF treatment in the first place is not random, the outcome of that first treatment should be. Since everyone does the first round by definition, there is no possibility for those who have a first successful treatment to be different from those who have a first unsuccessful treatment. However, subsequent rounds of IVF are then not random and not everyone does them which means we would get selection bias if we included outcomes from subsequent rounds. For example, those who have a stronger desire to have a child might go for more additional rounds.
In order to properly estimate the effects of having children, the authors employ an instrumental variables strategy. This means that in a first step, they estimate the impact of having a successful first IVF treatment on fertility, i.e. estimating the difference in fertility between those whose first treatment was successful vs. those whose first treatment was unsuccessful. This effect should initially be large and then become smaller over time since more and more women who have an unsuccessful first treatment go on to eventually have a child later on.
The second step is then to use these differences in fertility to estimate the impact of having children on earnings. The idea behind the instrumental variable approach is basically to take the “portion” of fertility that is entirely explained by the random outcome of the first treatment and to then see what the impact of this portion is on earnings. We’re trying to isolate the effect of the exogenous increase in fertility that is randomly assigned and not dependent on other underlying characteristics[1].
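With a yes/no instrument and no control variables, the two steps boil down to a simple ratio, often called the Wald estimator: the difference in earnings between the two groups divided by the difference in fertility. Here is a minimal sketch with made-up data, where the data are constructed so that mothers earn around 70 and childless women 100. The actual paper uses regressions with controls, so this is only meant to convey the logic, and `women` and `wald_estimate` are hypothetical names.

```python
from statistics import mean

# (first IVF treatment successful, has a child by follow-up, earnings)
# Made-up numbers: half of the women with a failed first treatment
# still go on to have a child later.
women = [
    (1, 1, 70), (1, 1, 72), (1, 1, 68), (1, 1, 70),
    (0, 1, 71), (0, 1, 69), (0, 0, 100), (0, 0, 100),
]


def wald_estimate(rows):
    """Return (first stage, reduced form, IV estimate) for a binary instrument."""
    success = [r for r in rows if r[0] == 1]
    failure = [r for r in rows if r[0] == 0]
    # Step 1: effect of a successful first treatment on having a child.
    first_stage = mean([r[1] for r in success]) - mean([r[1] for r in failure])
    # Effect of a successful first treatment on earnings.
    reduced_form = mean([r[2] for r in success]) - mean([r[2] for r in failure])
    # Step 2: scale the earnings difference by the fertility difference.
    return first_stage, reduced_form, reduced_form / first_stage


first_stage, reduced_form, iv_effect = wald_estimate(women)
print(first_stage, reduced_form, iv_effect)
```

Simply comparing the two groups gives an earnings gap of 15, because half of the “control” women have children too. Dividing by the fertility difference of 0.5 recovers the gap of 30 between mothers and childless women that was built into the toy data.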
Using this strategy, the authors find that there is a penalty in earnings of about 31% in years 0-1, 12% in years 2-5 and 11% in years 6-10 after the first IVF treatment. Here is what the results look like:
Source: Lundborg et al. 2017
Their estimate for the short-run impact is thus very similar to Kleven et al. 2019 while their estimate for the medium and long-run impact is quite a bit smaller[2]. We can see in the graph that the earnings seem to recover a lot more and quite quickly after having the first child. In contrast, the event study paper found that in the long run, the child penalty according to Definition 1 is still slightly over 20%, so roughly twice as big as what the IVF approach finds.
The same authors have a newer paper as well, Lundborg et al. 2024, where they track outcomes in the very long run, up to 25 years after the first IVF attempt. They find that the child penalty completely disappears 10 years after the first treatment and even turns into a child premium, i.e. those with children earning more than those without, after 15 years!
Source: Lundborg et al. 2024
Problems with the IVF approach
The main problem with using the IVF approach is what actually happens to the control group. We’re using those who are unsuccessful with their first treatment as the counterfactual for what happens to earnings when you don’t have a child. Some of these women will, however, be successful with a later treatment. That means that you’re comparing women who have a child now to those who have a child a little bit later and will then also see their earnings fall. The instrumental variable methodology partly controls for this but it can’t fully solve this problem.
Additionally, some women will go on to have further unsuccessful treatments. Given that IVF treatment is time-intensive and has side effects, this makes it likely that those women are able to focus on their career less. It also seems very plausible that being involuntarily childless and going through round after round of unsuccessful IVF treatment has negative mental health effects, which in turn have negative effects on labour market outcomes.
There is actually good evidence from two newer papers that unsuccessful IVF treatment does have these negative effects. Bögl et al. 2025 use Swedish data on patients using different reproductive technologies, including IVF. They find that those who initiate treatment but consistently remain infertile experience negative mental health effects. They also find an effect on couple stability, with divorce being 50% more likely for those who remain childless in the long run compared to those whose first IVF treatment was successful[3].
Martinenghi and Naghsh-Nejad 2025[4] use IVF patients from Australia to study the impacts of involuntary childlessness. They also find negative effects of remaining infertile on mental health and document direct income decreases during the period of IVF treatment. They explicitly note that these findings have important implications for the IVF approach to estimating child penalties. The results imply that the control group of women going through unsuccessful rounds of IVF might have lower earnings due to the IVF treatment and therefore would not be a good counterfactual. This means that it is likely that the IVF methodology underestimates the child penalty and accounting for this would therefore move us closer to the higher estimates from the event study methodology.
Some new papers trying to correct for these problems
Recently, there have been several papers trying to deal with the problems related to both approaches. One of them is Bensnes et al. 2023 who combine the two methodologies and find an estimate that, unsurprisingly, is in the middle! What they especially emphasise in their paper is that the standard event study finds basically zero effect on the father and a strong negative effect on the mother. However, their method finds a similar gap between partners due to a smaller negative effect on the mother and a positive effect on the father.
The second paper is Ilciukas 2025 who extends the IVF approach. He uses Dutch data and focuses on patients starting with intrauterine insemination (IUI) as a fertility treatment, with some of them later transitioning to IVF. His idea is to properly model all subsequent cycles instead of just success at the first cycle so that the control group only contains those who were never successful. This is a more complicated technique and allows him to bound estimates of the child penalty within a certain region. He finds that there is a persistent income reduction for mothers that is between 6% and 32%, which means that the event study result and the result from the IVF methodology are both within his bounds.
The third paper is Melentyeva and Riedel 2025. They try to solve the problem of bad controls and heterogeneous effects by age at birth in event studies by constructing a “rolling window” control group. For a woman who gives birth at age 28, the control group will be women who give birth between ages 29 and 33, i.e. in the 5 years after she has given birth. The counterfactual earnings are always calculated only with mothers who haven’t given birth yet to avoid the bad control problem. So the counterfactual earnings at 28 are calculated with all mothers who give birth between 29 and 33, but the counterfactual at 29 is calculated only with mothers who give birth between 30 and 33, and so on. The idea is that within a narrow window of 5 years, the women who give birth in that window are very similar to each other. The trade-off is that the smaller you make the window, the more similar the mothers are but the fewer years post-birth you can estimate. With the window of 5 years, you can estimate outcomes in the 4 years following the birth.
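To make the shrinking control group concrete, here is a small sketch of which women would serve as controls at each point. It reflects my reading of the description above and is not the authors’ code: it assumes controls are women who give birth in the 5 years after the treated mother (ages 29 to 33 for a birth at 28), and the function name `control_birth_ages` is hypothetical.

```python
def control_birth_ages(age_at_birth, event_time, window=5):
    """Ages at first birth of the women who can serve as controls.

    age_at_birth: age at which the treated mother has her first child
    event_time:   years since her first birth (0, 1, 2, ...)
    window:       controls give birth within this many years after her

    A control must not have given birth yet, so her own age at first
    birth has to be above the treated mother's current age.
    """
    current_age = age_at_birth + event_time
    last_control_age = age_at_birth + window
    return list(range(current_age + 1, last_control_age + 1))


for t in range(6):
    print(t, control_birth_ages(28, t))
```

For a mother who gives birth at 28, the control group shrinks by one birth age every year and is empty at t=5, which is why a window of 5 years only allows estimates for the 4 years following the birth.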
Applying this method to German data, they can estimate child penalties for women who give birth between 23 and 33 years old. They find heterogeneous effects with younger mothers losing more earnings in relative terms. Importantly, when they compare this method to running a conventional event study on the same data, the estimates from the rolling window approach are 30% bigger on average. This result implies that event studies might actually be underestimating child penalties, which would put the true effect even further away from the IVF results.
Where does this leave us?
In summary, event studies find larger child penalties than the IVF methodology, especially in the longer run: While the two main papers that we looked at here agree that the immediate effect on women’s earnings is roughly a 30% loss, after 10 years the IVF study only finds an effect of around 10% while the event study estimates around 20% earnings loss. After 20 years, the event study finds that the child penalty stays relatively constant while a follow-up to the IVF study finds that there is actually a child premium in the very long run, i.e. a positive effect on earnings.
There are some arguments that IVF estimates are downward biased and event study estimates upward biased, meaning that the truth should lie somewhere in between the two. However, there is also an argument to be made that event studies are downward biased and a recent paper using a different method for constructing the control group finds an even larger effect than the classic event study.
All the methods agree that there are significant penalties for some amount of time but there is disagreement over how fast and how fully earnings rebound after childbirth. This obviously matters for policy: If child penalties are a relatively short-lived phenomenon that only relates to the fact that most mothers take some time off after childbirth, we should probably be a lot less worried about them than if they are very persistent and it is hard for mothers to get back onto their initial career trajectory at all.
After reading all these papers, I’m inclined to believe that there are pretty persistent and long-run effects. It makes sense to me that temporary absences or temporary stretches of part-time work don’t just mean that you earn less during those times but that they influence your overall career progression.
Thanks to Ben Snodin and Matthew Farrugia-Roberts for giving me feedback on an earlier draft of this post. Thanks to Gabriel Leite Mariante for discussing the child penalty literature with me and helping me figure out some data issues.
[1] If you have never heard of instrumental variables before, this is probably still a bit hard to understand, sorry! I recommend reading Mostly Harmless Econometrics if you want to understand this better.
[2] The two papers aren’t perfectly comparable since Kleven et al. 2019 use births between 1985 and 2003, while Lundborg et al. 2017 use births between 1995 and 2005. Additionally, the sample of IVF women differs from the general population in that they are older and better educated. The authors do provide evidence in their paper that their estimated child penalties should generalise to the whole population.
[3] Specifically, this is looking at couples who are still childless 10 years after the first failed IVF attempt. 30% of those couples will have divorced by then compared to 20% of couples who were successful at their first attempt.
[4] Thanks to Stephanie Murray from Family Stuff for bringing this paper to my attention!








> “The event study assumes that the penalty is the same at each age and averages all these penalties together. Given that this assumption probably doesn’t hold, the estimated penalty might be biased either way, depending on who has children at what ages in the specific sample.”
Isn’t the estimate here a weighted average of age-specific treatment effects? Thus, we have some ATT that best describes the effect on the average treated unit (average-aged mother in the sample). So it’s not really a violated assumption here, but an important caveat regarding external validity. Not trying to be pedantic; I really enjoyed the article.
> In a slightly simplified way, you can imagine that for someone who has a child at age 20 in 2000, we would then construct the counterfactual earnings at childbirth as the predicted average earnings of a 20 year-old in 2000.
> The problem is that by including the age variables, we basically use all women of that age as the control group. However, more and more women of each age will have already had a child, meaning that their earnings are lower than what they would have been at that age in the absence of a child.
Thanks for the post. I would like to know if I understand this correctly, but it seems like this bias could be fixed if the counterfactual only uses 20 year-olds in 2000 who have not yet undergone childbirth? (I understand this will yield fewer and fewer samples as age increases, leading to higher variance estimates.)