What Fraction of His Money Do You Expect Landsburg to Lose?
I am ashamed to admit how many hours in the last few days I have devoted to this silly problem I first read on Steve Landsburg’s blog:
There’s a certain country where everybody wants to have a son. Therefore each couple keeps having children until they have a boy; then they stop. What fraction of the population is female?
Well, of course, you can’t know for sure, because, by some extraordinary coincidence, the last 100,000 families in a row might have gotten boys on the first try. But in expectation, what fraction of the population is female? In other words, if there were many such countries, what fraction would you expect to observe on average?
I have many things to say about it–my brother (who was in a PhD math program) and I argued for a good hour last night alone. But I am way behind on “real work” and so can’t do it justice.
This is not for everyone, but for those who are geeky enough, it’s actually pretty interesting to delve into this controversy. So here’s my plan for you, to get up to speed such that you will fully appreciate my attempt to resolve the dispute when I post it at some unspecified time in the future:
(1) Read Landsburg’s original statement of the problem. Try to figure it out yourself, then skim the comments to see all the knots his readers tied themselves into.
(2) Read Landsburg’s purported solution.
(3) Read the physicist who thinks Landsburg is wrong.
(4) Go back to the comments of Landsburg’s solution post, and skim them. A few people keep hitting on the key elements of the controversy. (To give you a hint: It’s crucial that not every “potential population” has the same size.)
(5) Finally go read Landsburg’s $15,000 challenge to the physicist.
(To give you a hint: It’s crucial that not every “potential population” has the same size.)
The result will differ depending on the number of couples you start with — and indeed if you assume infinite couples then the answer will be the same — but the crux of the disagreement really is on the rationale to get there.
Most people (as did the physicist) jumped to the conclusion that, given a 50-50 gender probability, the expected fraction of boys and girls is the same. But the point Prof Landsburg was trying to get at was that that answer ignores the fact that if you were to implement such a policy, your country would be more likely to end up with a higher fraction of boys (and fewer kids in total) than with a higher fraction of girls (and more kids in total), because of the long tail of the distribution.
The physicist in his post tried to get at the same number as did Prof Landsburg, but his calculations only show he failed to understand the point Prof Landsburg was trying to get at. (I myself got confused over Prof Landsburg’s family example.)
No, even if we include a realistic assumption, say, that people stop after having 4 kids even if all of them are girls, we still end up with an expected value of 0.5.
Landsburg’s solution depends on excluding all the children of families which do not yet have boys.
Bingo!
The assumption is only that people stop having kids after they have a boy. If you add the assumption that people will stop after having 1 boy or 4 kids, then it seems the probability distribution will be further skewed in the direction of males.
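Both claims can be checked for a single family by listing the outcomes exactly. Here is a minimal sketch in Python, assuming a 50/50 chance per birth and a cap of 4 children (the helper name `capped_family_outcomes` is made up for illustration). It computes the expected numbers of girls and boys, and separately the expected fraction of girls.

```python
from fractions import Fraction

def capped_family_outcomes(max_children):
    """Yield (probability, girls, boys) for a family that stops at its
    first boy or after max_children children, whichever comes first."""
    half = Fraction(1, 2)
    for girls in range(max_children):
        # `girls` daughters followed by one son
        yield half ** (girls + 1), girls, 1
    # every child was a girl and the family hit the cap
    yield half ** max_children, max_children, 0

outcomes = list(capped_family_outcomes(4))
expected_girls = sum(p * g for p, g, b in outcomes)
expected_boys = sum(p * b for p, g, b in outcomes)
expected_fraction = sum(p * Fraction(g, g + b) for p, g, b in outcomes)

print(expected_girls, expected_boys)   # 15/16 15/16
print(float(expected_fraction))        # about 0.3177
```

Under these assumptions the expected counts of girls and boys are equal (15/16 each), while the expected per-family fraction of girls is 61/192, about 0.32. The two comments above appear to be measuring different things.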
Dude, just write a simulation, and see what values you get more often than not. 😉 (try simulations on paper, if you can’t code)
Anyhow, here is an example of output you may get: (I’m using a random number generator for each kid.)
4 couples
1st generation: 0 boys, 4 girls. (4 reproducing)
2nd generation: 2 boys, 6 girls. (2 reproducing)
3rd generation: 4 boys, 6 girls.
fraction: 6 / 10 = .6
4 couples
1st generation: 2 boys, 2 girls. (4 reproducing)
2nd generation: 4 boys, 2 girls. (2 reproducing)
fraction: 2 / 6 = 0.333...
I got more girls than boys in the first run, and the opposite in the second. Try running this a few times yourself. Use only one family first, to see that boys can only beat girls in the first generation. They can then at best draw level in the 2nd generation, and after that they are screwed. This means that, in the universe of possible outcomes, the average of the probability distribution is slightly tilted in the boys’ favour. As you increase the number of families, the law of large numbers kicks in and you should see fewer such effects.
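For those who can code, a simulation along these lines might look like the following Python sketch. It assumes each birth is an independent 50/50 draw and that every couple keeps going until its first boy. The function names and the seed are illustrative choices, not anything from Landsburg's posts.

```python
import random

def girl_fraction_one_country(couples, rng):
    """Simulate one country and return girls / (girls + boys)."""
    girls = 0
    for _ in range(couples):
        # each girl means the couple tries again; the first boy ends the loop
        while rng.random() < 0.5:
            girls += 1
    boys = couples  # every couple ends with exactly one boy
    return girls / (girls + boys)

def average_girl_fraction(couples, trials, seed=0):
    """Average the per-country girl fraction over many simulated countries."""
    rng = random.Random(seed)
    total = sum(girl_fraction_one_country(couples, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    for k in (1, 4, 400):
        print(k, "couples:", average_girl_fraction(k, trials=20000))
```

With one couple per country the average should come out well below one half, and it should move towards one half as the number of couples grows.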
Or just list the outcomes with their probabilities yourself. For 1 couple, the possible outcomes are:
P(B) = 1/2 Girls fraction = 0
P(GB) = 1/4 Girls fraction = .5
P(GGB) = 1/8 Girls fraction = .67
P(GGGB) = 1/16 Girls fraction = .75
…
See, boys can only beat girls in the first generation, but they win big time when they do (1 to 0). Let’s see the expected value for our values:
E=(1/2)*0 + (1/4)*.5 + (1/8)*.67 + (1/16)*.75
E=.26
Obviously E will move in the girls’ favour as we add more cases, but it will still not reach 1/2. Apparently, it is approximately (1/2) – 1/(4k) when the number of couples k is large.
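The sum can also be carried out to many more terms, and for more than one couple. This is a rough Python sketch, assuming independent 50/50 births, in which case the total number of girls across k couples follows a negative binomial distribution. The function name and the cutoff on the number of terms are my own choices, and the starting probability underflows for very large k, so treat it as a small-k illustration only.

```python
import math

def expected_girl_fraction(couples):
    """E[girls / children] when each of `couples` families stops at its first boy.

    Sums over the total number of girls g, where
    P(g) = C(g + couples - 1, g) * (1/2) ** (g + couples).
    """
    prob = 0.5 ** couples            # P(0 girls): every family has a boy first
    expectation = 0.0
    max_girls = 50 * couples + 200   # beyond this the remaining tail is negligible
    for girls in range(max_girls):
        expectation += prob * girls / (girls + couples)
        # recurrence: P(g + 1) = P(g) * (g + couples) / (g + 1) / 2
        prob *= (girls + couples) / (girls + 1) / 2
    return expectation

if __name__ == "__main__":
    for k in (1, 4, 10, 400):
        print(k, expected_girl_fraction(k), 0.5 - 1 / (4 * k))
```

For one couple the full sum has a closed form, 1 - ln 2, about 0.307, so the four-term E = .26 above is still some way short of the limit. For 10 couples the value comes out close to 1/2 - 1/40 = 0.475.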
What’s the flaw in this model? (It assumes that we begin with a specified number of couples, and its focus is the number of children in the population once the last of those couples has had a boy or reached the specified threshold value.)
10 rem Sets up arrays for total number of boys and total number of girls
20 dim g(30000),b(30000)
30 rem X is the total number of “generations”
40 x = 100
50 rem n is the total size of the population
60 n = 30000
70 rem sg is the total number of girls
80 sg = 0
90 rem sb is the total number of boys
100 sb = 0
110 for i = 1 to n
120 rem g(i) is the total number of girls for family number i
130 g(i) = 0
140 rem b(i) is the total number of boys for family number i
150 b(i) = 0
160 if (g(i)+b(i)) = x then 260
170 rem s is the sex of each child (1 for girls, 0 for boys)
180 s = int(rnd(1)+0.5)
190 if s = 1 then 240
200 rem increments number of boys in family
210 b(i) = b(i)+1
220 goto 260
230 rem increments number of girls in family
240 g(i) = g(i)+1
250 goto 160
260 next i
270 for i = 1 to n
280 print i,g(i);"/";(g(i)+b(i))
290 rem sums number of girls in all families
300 sg = sg+g(i)
310 rem sums number of boys in all families
320 sb = sb+b(i)
330 next i
340 print "ratio = ";sg/(sg+sb)
Dude, you posted that huge code listing that nobody here will run, but forgot to tell us what output you got. 😉
Anyhow, you probably want to use smaller population seeds in order to see the point that Landsburg was trying to drive home. Otherwise, your ratio number will just give you the .5 limit approximation.
My results fluctuated around 0.5.
But–genuine, non-snarky question–if I’m talking about the population of a country, why wouldn’t I *want* a large population seed? Why wouldn’t results obtained with smaller seeds be worthy of skepticism?
FWIW, I just ran 3,000 simulations for populations of 10 couples each and came up with only 47.9% girls. But if the question really is focused on a *country*, then ten couples seems way too small. Even if you work with 400 couples–half the population of the world’s smallest country, Vatican City–the result after 3,000 trials is 49.9988% girls.
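Part of the difference between these runs may come from which average is being taken. The Python sketch below, under the same 50/50 assumption, computes two numbers from the same simulated countries: the pooled share of girls across all countries, and the mean of each country's own girl fraction. The names and the seed are illustrative.

```python
import random

def simulate_country(couples, rng):
    """Return (girls, boys) for one country where every couple stops at its first boy."""
    girls = 0
    for _ in range(couples):
        while rng.random() < 0.5:  # a girl: the couple tries again
            girls += 1
    return girls, couples          # one boy per couple

def compare_averages(couples, trials, seed=1):
    """Return (pooled girl share, mean of per-country girl fractions)."""
    rng = random.Random(seed)
    total_girls = 0
    total_children = 0
    fraction_sum = 0.0
    for _ in range(trials):
        girls, boys = simulate_country(couples, rng)
        total_girls += girls
        total_children += girls + boys
        fraction_sum += girls / (girls + boys)
    return total_girls / total_children, fraction_sum / trials

if __name__ == "__main__":
    print(compare_averages(couples=10, trials=20000))
```

The pooled share treats all the simulated countries as one big population and should sit near 0.5. The mean of the per-country fractions is the quantity the puzzle asks about, and for 10 couples it should land noticeably lower.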
Landsburg is not disputing the result, but how people got there. His point is not that the initial reply of 50-50 was wrong, but that the thinking was insufficient — and indeed that can be seen by starting with smaller populations to avoid rounding approximations.
As he puts it in some comment (paraphrasing): you could say that the reason the answer is 50-50 is because everyone has two legs: just because you got the response right (in the limiting case) doesn’t mean that your thinking is correct.
Okay, let me dig up the actual comment of his:
(I don’t necessarily endorse the way he puts things: but if you’ve read any of his stuff, you know he says it as it is. eheh)
Here is another answer of his. In italic is the reply he is answering to:
Anyhow, if I may change the topic to something “lighter”, what is the language you’re using there?
Let me guess: some dialect of Basic. 😉 But seriously, what dialect is it exactly? I didn’t know people were still using Basic these days eheh — well, other than Visual Basic that is, which only slightly resembles Basic anyhow.
You know, David Friedman has written a few QBasic programs to go with his age-old Price Theory textbook. These programs no longer run on modern systems, and there are no modern compilers for them. I have already done an initial port (to Java) of those a little while back, but writing them from scratch is probably in order. I’m guessing you’re a fellow econ enthusiast from the computer science field. If you’d be interested in collaborating on something, drop me a mail: rpmcruz AT alunos.dcc.fc.up.pt
Error on my part: 30 should have specified that X is the number of times a couple gets to try to have a male child.
And of course N is the number of couples, not the size of the population.
And “ratio” should have been “percentage of girls”; this is what I get for posting too quickly. Wish there were a way to delete one’s own posts on this site.
these are the kinds of problems the census bureau is faced with
nerds
Landsburg is an idiot.
The answer is obviously 0.5.
Landsburg got carried away and posted a wrong solution, and when people called him out, he came up with an unreasonable interpretation of the question which allows him to stick to the wrong answer.
He is wrong but he won’t lose his bet because his bet is a simple simulation that allows him to “prove” the wrong solution based on his wrong assumptions about what the question is.
Nah, Landsburg is not an idiot. He is arguably a jerk (for the harsh dismissal of people who disagree with him), but he’s not an idiot on this.
I’m not ready to give my full-blown analysis of this controversy, but I think Landsburg is basically right. If the question is indeed, “What is the mathematical expectation of the fraction of females?” then it’s not a good answer to say, “50%, since every child has a 50/50 chance of being female.”
However, I think this misses the spirit of the problem. I think most people, when they hear this puzzle, are trying to use intuition to figure out, “Is this rule going to give a tendency for more boys or girls?” I think most people don’t make a distinction between “Are there more boys or girls?” versus “Is the ratio of girls to boys higher or lower than 1?”
Landsburg is right that a person with good math skills should understand this distinction, but I don’t think it’s wrong when people rely on the intuition of “nature doesn’t care about the parents’ motivations” to say, “There is no reason to suppose more boys will be born than girls.”
In case some people aren’t getting it, try this example which someone posted in Landsburg’s comments:
* Suppose X and Y are independent random variables, each with a uniform distribution on [1, 2].
* Clearly E(X) = E(Y) = 1.5.
* However, E(X/Y) = E(Y/X) > 1.
That’s a little bit weird if you are attributing too much meaning to the term “expectation.” It looks like you’re saying, “We expect X to be bigger than Y, and Y to be bigger than X, at the same time.”
BTW I can’t remember how to calculate E(X/Y) in this case; I’m just reproducing the answer from Landsburg’s comments. Can someone remind me?
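One way to get the number, assuming X and Y are independent (the example needs this, though the comment being reproduced may not have said so): the expectation of the ratio then factors, and E(1/Y) is an elementary integral.

```latex
\begin{aligned}
E\!\left[\frac{X}{Y}\right] &= E[X]\,E\!\left[\frac{1}{Y}\right] \quad \text{(by independence)} \\
E\!\left[\frac{1}{Y}\right] &= \int_1^2 \frac{1}{y}\,dy = \ln 2 \approx 0.693 \\
E\!\left[\frac{X}{Y}\right] &= \frac{3}{2}\,\ln 2 \approx 1.040 > 1
\end{aligned}
```

By symmetry E(Y/X) has the same value. The result exceeds 1 because E(1/Y) = ln 2 is larger than 1/E(Y) = 2/3.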
This is basically the same reason that betting on black in Roulette doesn’t work. Each birth is 50/50 regardless of past history. You aren’t bound to get a boy at some point.
…and yes Bob, I did the math. Ricardo’s numbers are the same as mine.
Brian, the question is NOT, “If you grab a random child from all possible universes, what is the probability that it’s a girl?”
I thought Landsburg was wrong for about 12 hours myself.
No, you and Landsburg are correct given a finite population. It feels like one of those tee-hee-hee-look-at-how-clever-I-was math puzzles. If we throw in the reality that no one can have a thousand daughters before a son, you get the same <50% girls answer.
Brian, did you read the follow-up post? There is a very legitimate sense in which Landsburg is right.
And of course it was a “tee-hee-hee-look-at-how-clever-I-was” math puzzle. It’s not like the Chinese government was consulting with us.