New banner and icon for the blog

I've decided to try my hand at graphics design, to draw the banner and the icon for this blog. The following are some earlier drafts. Hopefully, they illuminate the meaning behind the blog's name.

Here are the results, the final designs for the banner and the icon. They're what appears on the blog now.


Merry Christmas! And happy one year anniversary for this blog!

Jesus asked his disciples, "what about you? Who do you say I am?"

The following is a selection of the things I've said about Jesus in various posts, in the year since I started this blog. This is who I say he is. Merry Christmas!

What does it mean that Christ created the world? It means that he was incarnated into the world. Otherwise, what has God (who is a spirit) to do with the world (which is physical)? To physically create the world, God - the One Father of All - breathed into the world his Secret Fire, the Imperishable Flame, the One that belongs only with God. He did so to "let these things be" - so that his plans and intentions would become physical reality through his Word.

The form of that Flame is none other than Christ come into the world. Merry Christmas to you all - for on that day the universe was (ontologically, not temporally) created.

Jesus is like a baby elephant. A large elephant can be groped at by blind men and never be comprehended because of its large size. But if that elephant had a baby - something begotten to be of the same elephant nature yet small enough to be felt by the blind - then they could get a good idea of the large elephant.

There are no miracles at the highest level: If we could somehow understand God completely through one last final miracle, we'd find that there are no exceptions or surprises or inconsistencies in God, for God is perfectly logical and consistent in himself. All the lower level miracles would be contained and explained in the last miracle to give a complete picture of God.
The last, deepest, greatest miracle is the Incarnation. It is a miracle at the level of the nature of God himself. It reveals to us everything about God. "Anyone who has seen me has seen the Father". It is the miracle that explains all other miracles.

What is love? As it is written, "Greater love has no one than this: to lay down one's life for one's friends". So at the cross, Jesus displays the unconditional love that continued to love his sinful enemies even while we crucified him. He took on all our sins and their consequences, sacrificing himself and saving us.

Jesus Christ is God himself incarnated as a man. In God's act of true love for us, Christ came - God came as a man - to fulfill the plan for our salvation. For what power does anyone else have to stop the course of sin? To save us? To reach us, he humbled himself down to our level, and took on the human form that he first granted us. Like us, he was conceived, born, and raised, and became a man familiar with our sorrow, who experienced our pain. Despite being fully human, he remained morally perfect, so that he could serve as the perfect example for us. Moreover, this was necessary for the next key part of the plan: his crucifixion and resurrection. 
I do not understand Jesus' death on the cross. There are theories of how it worked, but I doubt we have anything close to the full picture. This is only expected: the cross is nothing less than the intersection of all of existence - things on heaven and earth, visible and invisible, life and death, good and evil, sin and righteousness, God and his creation, story and Author - they all collide here. I think that a complete understanding of Christ's death and resurrection would require nothing short of the entirety of the mind of God. My telling of the story is utterly insufficient for it - nevertheless I will proceed.

The universe is actually designed by and for one person, and one person only: Jesus Christ, who is God himself. It's his game. He designed it to play it himself. Every parameter, every feature of the universe is designed solely for Christ's sake. Because Jesus made the universe to play it himself, it's an excellent, perfectly crafted game: it's made with incredible elegance, efficiency, and simplicity in its fundamental rules which are completely free of bugs or exceptions, yet the final result is rich and complex and intricate, and allows for a great deal of player expression.
How, then, shall we play? [...] Play like Jesus played: to express God's love back towards God, and to your fellow players. This is what the universe was designed for. After all, it was designed by and for Jesus.

The Incarnation:
This, at last, is the event for which all of creation was made and for which it had been waiting. God sent his son, in whom dwells the fullness of deity, to be incarnated as a human being. Thus began the final act in the grand story of the universe - the one that the whole universe had been building up to for 13.8 billion years.
The death and resurrection of Jesus Christ:
And here is the climax of the story: the singularity at the heart of existence, the purpose for which Christ came into the universe. Everything had been for this event. In the beginning, when God set the laws of the universe, he dictated that a hammer pounding a nail would be sufficient to drive it through Christ's body. When God created the Earth, he placed the iron atoms that would make up the spear that pierced Christ's side. When God created life, he designed it so that sufficient structural disruption would cause it, and therefore Christ, to die. When he created humans, he gave us sufficient brains for processing life, love, death, and resurrection. And when the serpent deceived Adam and Eve, God declared that the seed of the woman would crush the serpent. Everything - all the other events in the universe's history - had been building towards this moment. Through his death and resurrection, Christ makes us fully God's children. He completely reverses Adam's, Eve's, and all of our sins. He sets the course for all of creation - the whole universe - to be redeemed.


Basic Bayesian reasoning: a better way to think (Part 4)

Have you read the last several posts? In those posts we began the tale of Alice and Bob, a pair of murder suspects who recently started dating one another. Through their sordid tale, we'll examine Bayesian reasoning, the scientific method, and the so-called fallacy of "affirming the consequent".

Alice and Bob are going through a rough patch in their relationship. One day, Alice accuses Bob of infidelity, and they have this conversation:
You spent the night at Carol's house last weekend! You're cheating on me with her! 
What?! How do you figure that? I'm innocent! 
If you're cheating on me with her, it makes perfect sense that you'd spend the night at her house! 
Ha! You're "affirming the consequent". You've started from "if [cheating], then [night at Carol's house]", then concluded that "if [night at Carol's house], then [cheating]". This is a logical fallacy, and your argument is invalid. Cheating on you is not the only possible explanation for me spending the night at Carol's. There are other, perfectly innocent explanations - like the fact that Carol threw a party that ran late, and a bunch of us just crashed at her place for the night rather than risk driving home tired and drunk.
Now, let's pause the conversation here for the moment and assess the situation. So far, Bob's logic follows the example in the Wikipedia page on "affirming the consequent". And he certainly seems right - "affirming the consequent" is a fallacy in propositional logic, and Alice can't necessarily conclude that Bob cheated on her just because he spent a night at Carol's house. So, is Alice committing a logical fallacy? And therefore Bob is innocent? Let's continue and see:
That party happened last weekend, when Carol knew I would be out of town! That makes perfect sense if Carol plotted to have you come without me! 
That's ridiculous. You're still just "affirming the consequent". You've started from "if [Carol plotting], then [party on that weekend]", then concluded that "if [party on that weekend], then [Carol plotting]". I told you before, this is a logical fallacy. There are many other explanations for why the party might have happened on that weekend. 
Whenever I ask Carol whether she's seeing anyone, she avoids the question! 
That's more flawed reasoning. Do I have to explain it to you again? There are other possible explanations why Carol avoids that topic with you. That doesn't mean I'm cheating on you with her.
But she was always very forthcoming about her dating life before. What would make her so reluctant to talk about it now? 
I don't know. I suggest you ask her. Many innocent explanations are possible. You're still "affirming the consequent". You've started with a hypothesis - that I'm cheating on you with Carol - and then produced observations that fit with that hypothesis, then used those observations to justify the original hypothesis. That's like saying "The Bible is true because God wrote it, and it says that God exists. Therefore God exists". That kind of silly, circular reasoning is what happens when you "affirm the consequent", and you're using this logical fallacy over and over again to try to say that I've cheated on you.
Well, Bob certainly seems to be right. Alice can't, and shouldn't, conclude that Bob is cheating on her just because Carol is not talking about her dating life, or because the party happened on a certain weekend. After all, "affirming the consequent" is a logical fallacy, isn't it?
Someone at the party saw you go into Carol's bedroom with her. 
You're still "affirming the consequent", and it invalidates your conclusion. Your logic is flawed. There are perfectly innocent reasons to go into someone's bedroom. 
With two bottles of wine. And you closed the door afterwards. 
Still just "affirming the consequent". What, we're not allowed to drink wine at a party? We're not allowed to close the door when the music is loud out in the living room?
This was at 10 pm, far before your normal bedtime, far before you're normally tired. 
It was a crazy party. It wore me out fast. Why do you continue to "affirm the consequent"? Don't you see that you're still just starting with the idea that I'm cheating on you, then using that idea to interpret the events to justify itself? That's circular reasoning. You're saying that just because these are how things would play out IF I were cheating on you, therefore then I MUST be cheating on you. It's a logical fallacy, like I said many times, and you're just using it repeatedly.
And you didn't come out from the bedroom until the next day. 
Like I said, we got tired and decided to just sleep off the party rather than risk driving home drunk and exhausted.
Okay... hmm... I mean, "affirming the consequent" is still a logical fallacy, right? It's got a Wikipedia page and everything. How could Alice be right when she's committing this fallacy over and over? I mean... if you heard that your significant other went into a bedroom with someone else, along with two bottles of wine, and stayed behind closed doors until the next day, you wouldn't jump to any conclusions, right? Because you're a logical thinker and you don't want to commit a fallacy?
My source from the party also tells me that it looked like you and Carol were making out before you went into her room.
You know that eyewitnesses are unreliable. The living room was dark and your "source" was probably drunk as well. Or maybe your "source" is lying to break us up for his or her own ends. There's lots of possibilities, you can't conclude that I cheated on you from this, and you're only still "affirming the consequent" by bringing this up to say that I did.
I also have this shopping receipt, dated the day before the party, for things that Carol bought. She purchased scented candles, those wine bottles we mentioned, and "sexy" lingerie. 
What Carol does with her money and what lingerie she wears is none of my business. You're still using flawed logic, by starting from the idea that I'm cheating on you, to explain what Carol bought, then using that explanation to justify your initial assumption. 
She also bought condoms. 
I didn't know, I don't care, and it's not relevant. There are many reasons that Carol would buy condoms that have nothing to do with me cheating on you. 
I found condoms of the same brand in Carol's trash dumpster after the party. They were used. 
Are you crazy?! That's disgusting! Completely apart from your gross dumpster-diving, that doesn't prove anything. Those could have come from anywhere, thrown away by anyone. You're still trying to make everything fit into your preconceived notion that I've cheated on you. That's "affirming the consequent"! It's a logical fallacy! You're just repeating this fallacy over and over again!
I had them tested at the lab. The DNA on the outside is a decently good match with samples from Carol's hair. 
DNA matching is imperfect. There's thousands of people in this city that would also be a "good match" with that DNA sample. Even if it were an "excellent" match there's still lots of people who would fit that criteria. Any one of them could have used the condom. And even if it WAS Carol, you can't conclude that I've cheated on you with her from just that. That would be "affirming the consequent"!
And the DNA on the inside is an excellent match with you. 
LOGICAL FALLACY! Over and over again! Your reasoning is invalid! You're trying to go from "if A then B", to "if B then A"! That's circular logic! It's "affirming the consequent"! I did not cheat on you with Carol!
If you still believe Bob, then I have a bridge to sell you. The weight of evidence is overwhelming at this point. Bob did almost certainly cheat on Alice with Carol.

But what about "affirming the consequent"? Isn't Bob right that it's a logical fallacy? Isn't Alice's argument based entirely on using it over and over? What does Bayesian reasoning say about all this?

Now, Bayesian reasoning mirrors human common sense. It will never lead to a result that "normal" reasoning says is impossible. As I mentioned earlier, you don't actually need formal training to use it in your daily life, because its rules are just the rules of good thinking, refined to mathematical precision. However, because of its precision and power over propositional logic, Bayesian reasoning can sometimes lead to surprising results for someone who's only versed in propositional logic. "Affirming the consequent" is one such result.

In Bayesian logic, "Affirming the consequent" is allowed in a mathematically precise way. You CAN relate "if A then B" to "if B then A". In Bayesian terms, where we assign probability values - P(A), P(B), P(A|B), et cetera - to all statements, "if A then B" can be expressed as P(B|A), and "if B then A" becomes P(A|B). And these two probabilities are directly related to one another, as it is plainly written out in Bayes' theorem:

P(A|B) = P(B|A)  * P(A)/P(B)

Essentially, the two factors grow together. As P(B|A) gets bigger, so does P(A|B). As B becomes better explained by A, A becomes more likely given B. The more strongly the consequences of a hypothesis are affirmed, the more likely the hypothesis is to be true. As more events around Carol's party are explained by Bob cheating on Alice, it becomes more certain that Bob cheated based on these events. So each event - each instance of "affirming the consequent" - actually strengthens the hypothesis that Bob cheated on Alice with Carol. Far from dooming Alice's hypothesis because of its status as a "logical fallacy", it actually serves as evidence for Alice's accusation.
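As a rough illustration of how the two quantities grow together, the sketch below evaluates Bayes' theorem for three values of P(B|A). It assumes that P(A) and P(B|not A) stay fixed while P(B|A) changes, and all the numbers are made up for the example rather than taken from Alice and Bob's story.

```python
def posterior(p_b_given_a, p_a, p_b_given_not_a):
    """P(A|B) from Bayes' theorem, with P(B) expanded over A and not-A."""
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    return p_b_given_a * p_a / p_b

# Hypothetical values: P(A) = 0.1 and P(B|not A) = 0.2 are held fixed.
results = [posterior(x, p_a=0.1, p_b_given_not_a=0.2) for x in (0.2, 0.5, 0.9)]

for p_b_given_a, p_a_given_b in zip((0.2, 0.5, 0.9), results):
    print(f"P(B|A) = {p_b_given_a:.1f}  ->  P(A|B) = {p_a_given_b:.3f}")
```

When P(B|A) equals P(B|not A), the observation does not move the probability at all. The better A explains B compared with the alternative, the larger P(A|B) becomes.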

That's right: "affirming the consequent" does not invalidate its conclusion; instead, it actually serves as evidence FOR that conclusion.

It is the very fact that Alice used "affirming the consequent" OVER AND OVER that made her case so strong. It's crucial to note that if she had made only one such argument, even if that argument was the one from DNA on the condom, her case would have been weak and she would have been wrong to come to her conclusion. But with each instance of "affirming the consequent" - each time Alice successfully showed that the events around Carol's party fit with Bob cheating on her - her case grew stronger. Therefore, "affirming the consequent" is a "logical fallacy" only insofar as it's not being used enough.

So if you see someone say that Bill Gates must own Fort Knox because he's rich, you can legitimately say that this is flawed reasoning, and call him out on "affirming the consequent". In this case, you'd be using that term as a proper logical fallacy and saying that this person's conclusion is invalid. But if this person repeated similar arguments over and over - if he showed that Bill Gates was part of a secret cabal that controlled the U.S. government, and that Gates had regularly been inside Fort Knox, and that there were mysterious changes to his net wealth that matched perfectly with mysterious changes in the amount of gold in Fort Knox, and that a highly ranked government official anonymously said "Bill Gates owns Fort Knox.", then we might be getting somewhere. Each of these things by itself could be dismissed by citing "affirming the consequent", but together, each instance of "affirming the consequent" counts as evidence, and adds up to a strong case.

So "affirming the consequent" can both serve as evidence and be a mistake. But, in Bayesian terms, how can you tell when it's a mistake? What is the genuine blunder in logic when that happens? As a mistake, "affirming the consequent" is the act of coming to the conclusion without enough evidence. It's coming to the conclusion without affirming enough consequents. Or more properly, it's concluding that P(hypothesis|evidence) is high, when P(evidence|hypothesis) is not yet large enough to compensate for P(hypothesis)/P(evidence). The solution to this issue, in part, is not to stop "affirming the consequent", but to do it more - to look for more evidence.
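To make the mistake concrete, here is a small sketch of the Fort Knox argument with invented numbers. The prior P(hypothesis) and the value of P(evidence) are pure assumptions chosen for illustration; the point is only that a perfect P(evidence|hypothesis) cannot make up for a tiny P(hypothesis)/P(evidence).

```python
def posterior(p_evidence_given_h, p_h, p_evidence):
    """Bayes' theorem: P(hypothesis | evidence)."""
    return p_evidence_given_h * p_h / p_evidence

# Hypothetical numbers for "Gates owns Fort Knox" (hypothesis) and "Gates is rich" (evidence).
p = posterior(
    p_evidence_given_h=1.0,   # if he owned Fort Knox, he would certainly be rich
    p_h=1e-9,                 # assumed prior that any given person owns Fort Knox
    p_evidence=1e-3,          # assumed fraction of people who count as rich
)

print(f"{p:.1e}")  # 1.0e-06
```

The consequent is affirmed as strongly as it can be, yet the conclusion remains wildly improbable. One affirmed consequent is simply not enough here.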

The reason that propositional logic doesn't, and can't, follow this reasoning is because it cannot distinguish between probability values of 1% or 99%. In propositional logic, a statement can only be true, false, or undecided. But "affirming the consequent" works in Bayesian reasoning by moving the probability value: it perhaps starts at 1% (very unlikely to be true), but then slides to 20% (unlikely to be true), then to 70% (likely to be true), and to 99% (very likely to be true) as you affirm more consequents, over and over. Propositional logic sees this and says "all I recognize in all these cases are undecided statements", and since 99% is not 100%, it will not let you say that the conclusion is true. This is why "affirming the consequent" is always a logical fallacy in propositional logic. But this really says more about the limits of propositional logic rather than reflecting true rationality.
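The sliding probability described above can be sketched with the odds form of Bayes' theorem, comparing a hypothesis against its negation. The likelihood ratios below are hypothetical, and the sketch assumes that each piece of evidence is independent of the others given the hypothesis.

```python
from fractions import Fraction

def odds_to_probability(odds):
    """Convert odds (for : against) into a probability."""
    return odds / (1 + odds)

odds = Fraction(1, 99)               # a 1% starting probability
likelihood_ratios = [5, 5, 10, 40]   # hypothetical strengths of four affirmed consequents

trajectory = [odds_to_probability(odds)]
for ratio in likelihood_ratios:
    odds *= ratio                    # posterior odds = prior odds * likelihood ratio
    trajectory.append(odds_to_probability(odds))

print([f"{float(p):.1%}" for p in trajectory])
# ['1.0%', '4.8%', '20.2%', '71.6%', '99.0%']
```

No single step carries the probability from 1% to 99%. It is the accumulation that does it, and the probability never reaches exactly 100%.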

How do you know when you've affirmed enough consequents? How many times do you have to "affirm the consequent" to be sure of your conclusion? Due to the difficulties associated with using Bayes' theorem in a real-world context, it may be hard or impossible to get actual numbers. But you have to at least walk through the equations to vaguely answer the question.

In particular, when you work through the equation it turns out that the most effective kind of evidence is that which could be affirmed by your hypothesis, but not by a rival hypothesis. "Affirming the consequent" is better than not affirming. Circular reasoning is better than contradictory reasoning. This is the essence of the odds form of Bayes' theorem, which shows the importance of comparing the hypotheses against one another. It has many important applications:

One such application is the scientific method. Bayesian reasoning is the logical framework that underlies the scientific method. Science, in part, relies on "affirming the consequent". Experimental verification of theoretical predictions serves as evidence for that theory. On the flip side, theories are falsified based on experiments as well. Both sides of that statement are together expressed in the odds form of Bayes' theorem. Between two competing theories or hypotheses, "affirming the consequent" is better than not affirming, and circular reasoning is better than contradictory reasoning.

Bayesian reasoning is also at the heart of presuppositional apologetics, which starts with the idea that the God of the Bible - who is the basis for all rational thought - exists. It then "affirms the consequent" by verifying that the world does indeed bear the image of its Creator. Rival non-Christian worldviews cannot make the same affirmation, and therefore must borrow from the Christian worldview even in attacking it, thereby contradicting themselves. Of course, its critics have said that this approach is invalid because it "affirms the consequent", but I hope you now know better.

This reasoning is also the logical foundation for my blog here. I start with this fundamental postulate: God as revealed in Jesus Christ. I then "prove" that God exists by demonstrating that this postulate generates the universe - that is, by affirming the consequent.

This Bayesian reasoning is also the logical framework for my series of posts on how science itself - its axioms and long-term traits and properties - serves as strong evidence for Christianity. Because a hypothesis should be measured against its rivals, I said that science is evidence for Christianity and against atheism. Of course, its critics accused me of "affirming the consequent" over and over again. By now, you should recognize this as the mark of a strong argument, one with a great deal of evidence behind it. After all, "affirming the consequent" is a hallmark of science itself.

In all these areas, beware those who only cry "fallacy!", who will not state or test their hypothesis against yours, who only want to tear down arguments instead of building them. They pretend that their ignorance is strength, because they think that knowing nothing means they never have to affirm any consequents. They do not realize that this is actually the mark of profound weakness, and that such a know-nothing hypothesis can only survive by parasitically attaching itself to more established theories. But you should actively seek to find, build, critique, and refine your hypothesis. Rejecting a hypothesis is never an end in itself, but a step towards a better hypothesis. Remember that the devil comes to steal and to kill and to destroy. But it is God who creates.

We can now conclude by answering the questions I raised at the end of my last post. Yes, Bayesian reasoning allows for "affirming the consequent", and this actually serves as evidence FOR your conclusion. There is still a sense in which "affirming the consequent" is a fallacy, which happens when you give a hypothesis too much credit based on a single instance of "affirming the consequent". But this only means that you haven't affirmed enough consequents. To escape this fallacy, you need to affirm more consequents with your hypothesis, while comparing it with its rival hypotheses. "Affirming the consequent" is a fallacy in propositional logic, but that's more indicative of propositional logic's inflexible limits than a reflection of actual rationality. In fact, "affirming the consequent" forms half of Bayes' theorem in odds form, which is the logical basis for the scientific method, presuppositional apologetics, and this very blog and the theories I put forth in it.


Basic Bayesian reasoning: a better way to think (Part 3)

In my last post, I introduced Bayes' theorem:

P(hypothesis|observation) = P(observation|hypothesis)/P(observation) * P(hypothesis)

Now, this is a powerful equation that tells us how to use observed evidence to update our beliefs about a hypothesis. But as I mentioned, it has two difficulties with its use: first, the probability prior to the observation - P(hypothesis) - is famously difficult to compute in a clear, objective manner, and it changes based on the background information that each person has. For these reasons it's often said to be a personal, subjective probability, reflecting a particular person's degree of belief based on his or her unique set of background information.

And second, things get even worse for P(observation): this is the probability of making the observation, averaged over the complete set of competing hypotheses. Because this is an average over the complete set, we have to know all P(hypothesis) values for every competing hypothesis. But as we said just in the previous paragraph, computing even one of these values is difficult. If that wasn't hard enough, in real-life situations we may not even be able to enumerate the complete set of competing hypotheses. And then, even if we somehow got through all these difficulties, we still have to calculate P(observation|hypothesis) values for each of these hypotheses, which itself is no trivial task, then calculate their average across all the hypotheses. This step often requires more computation than the rest of Bayes' theorem put together, even for well-defined problems with fixed values for all other probabilities.
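To show what this averaging involves, the sketch below computes P(observation) for a deliberately tiny, hypothetical problem: a coin that is either "normal" or "always heads", with invented priors of 80% and 20%, and a single observed "heads". Treating these two as the complete set of hypotheses is itself an assumption, and it is exactly the step that is hard to justify in real problems.

```python
from fractions import Fraction

# Hypothetical complete set of hypotheses: a prior and P(heads | hypothesis) for each.
hypotheses = {
    "always heads": {"prior": Fraction(1, 5), "p_heads": Fraction(1)},
    "normal":       {"prior": Fraction(4, 5), "p_heads": Fraction(1, 2)},
}

# P(observation): the likelihood of "heads" averaged over every hypothesis, weighted by its prior.
p_observation = sum(h["prior"] * h["p_heads"] for h in hypotheses.values())

# Bayes' theorem in its standard form, applied to each hypothesis.
posteriors = {
    name: h["p_heads"] * h["prior"] / p_observation
    for name, h in hypotheses.items()
}

print(p_observation, posteriors["always heads"], posteriors["normal"])  # 3/5 1/3 2/3
```

Even in this toy case, every prior and every likelihood has to be known before P(observation) can be written down. Adding a third hypothesis would change P(observation) and every posterior along with it.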

For these reasons I often like to use Bayes' theorem in odds form: simply write down the equations for two different hypotheses and divide one by the other, and you get:

P(hypothesis A|observation)/P(hypothesis B|observation) =
P(hypothesis A)/P(hypothesis B) * P(observation|hypothesis A)/P(observation|hypothesis B)

This can be summarized as "posterior odds = prior odds * likelihood ratio (of the observation being made from each hypothesis)", where:

P(hypothesis A|observation)/P(hypothesis B|observation) = posterior odds,
P(hypothesis A)/P(hypothesis B) = prior odds,
P(observation|hypothesis A)/P(observation|hypothesis B) = likelihood ratio.

Let's go through an example: say you're investigating a murder. You think that Alice is twice as likely to be guilty compared to Bob - this is your prior odds. You then observe fingerprints on the murder weapon that are 3000 times more likely to have come from Alice than from Bob - this is the likelihood ratio. You multiply these ratios to calculate your new opinion, the posterior odds: Alice is now 6000 times more likely to be guilty than Bob. Posterior odds is prior odds times likelihood ratio.

This is still Bayes' theorem, just in a different algebraic form. The intuition captured by this equation is the same: an observation counts as evidence towards the hypothesis that better predicts, anticipates, explains, or agrees with that observation. But notice that in this form, P(observation) - which was difficult or impossible to calculate - has been cancelled out. Also, P(hypothesis) - another troublesome number - only appears in a ratio of two competing hypotheses, which I think is a more reasonable way to think of it: it's easier to say how much more likely one hypothesis is than another, instead of assigning absolute probabilities to both of them. In short, this form makes the math easier, and allows you to think of just two hypotheses at a time, rather than having to account for the complete set of competing hypotheses all at once. You don't have to worry about Carol and her fingerprints for the time being in the above murder investigation example.
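The cancellation of P(observation) can be checked numerically. The sketch below applies the standard form of Bayes' theorem to Alice and to Bob and divides the results. The absolute probabilities are hypothetical, picked only so that the prior odds are 2:1 and the likelihood ratio is 3000:1 as in the murder example, and the two values tried for P(observation) are arbitrary guesses.

```python
from fractions import Fraction

def posterior(p_obs_given_h, p_h, p_obs):
    """Bayes' theorem in its standard form."""
    return p_obs_given_h * p_h / p_obs

# Hypothetical absolute values matching the ratios in the murder example.
p_alice, p_bob = Fraction(2, 10), Fraction(1, 10)                    # prior odds 2:1
p_prints_alice, p_prints_bob = Fraction(3, 10), Fraction(1, 10000)   # likelihood ratio 3000:1

ratios = []
for p_obs in (Fraction(1, 10), Fraction(1, 2)):   # two arbitrary guesses at P(observation)
    ratio = posterior(p_prints_alice, p_alice, p_obs) / posterior(p_prints_bob, p_bob, p_obs)
    ratios.append(ratio)

print([int(r) for r in ratios])  # [6000, 6000]
```

Whatever value P(observation) takes, it divides out, so the posterior odds depend only on the prior odds and the likelihood ratio.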

Let's go through a couple more examples:

Say that your friend claims that he has a trick coin: he says it lands "heads" all the time, rather than the 50% of the time that you'd normally expect. You're somewhat skeptical, and based on his general trustworthiness and the previous similar claims he's made, you only think that there's a 1:4 odds that this is a 100% "heads" coin, versus it being a normal coin. This is your P(always heads)/P(normal), the prior odds.

When you express your skepticism, your friend says, "well then, let me just show you!" and flips the coin. It lands "heads". "See!" says your friend. "I told you it'll always land heads!" Now, obviously a single flip doesn't prove anything. But it certainly is evidence - not very strong evidence, but some evidence. Since the coin will land "heads" 100% of the time if your friend is right, but only 50% of the time if it's a normal coin, their ratio - the likelihood ratio - is 100%:50%, or 2:1.

Now, according to the odds form of Bayes' theorem, posterior odds is prior odds times likelihood ratio. 1:4 * 2:1  = 1:2, so you should now believe that there's a 1:2 odds that this is a trick coin like your friend claimed, versus it being a normal coin. You're still skeptical of the claim, but you're now less skeptical.

Noting your remaining skepticism, your friend then flips the coin again. "Ha, another heads!" he says as he calls out the result. Now, to calculate your new opinion, simply repeat the calculation above, with the previous answer - the old posterior odds of 1:2 - serving as the new prior odds. The likelihood ratio remains 2:1. Posterior odds is prior odds times likelihood ratio, so our new posterior odds is 1:2*2:1 = 1:1. You should now be completely uncertain as to whether this coin in fact is a trick coin. You say to your friend, "well, you may have something there".

"Okay, fine then." says your friend. "Let's flip this thing ten more times." And behold, it comes up "heads" all ten times. Your posterior odds get multiplied by 2:1 for each of the ten flips, and it's now 1:1 * (2:1)^10 = 1024:1. You should now believe that the chance of this being an "always heads" coin is 1024 times greater than it being a normal coin. If you're willing to consider "normal" and "always heads" as the complete set of competing hypotheses, this would give you over 99.9% certainty that your friend is right that this coin will always land heads.

"Wow, amazing." you tell your friend, as you're now pretty much convinced. "I've never actually seen one of these before", you say, as you idly grab the coin and flip it again, fully expecting it to land "heads" once more. But this time, it lands "tails".

What now? The likelihood ratio for the coin to land "tails" - P(tails|always heads)/P(tails|normal) - is 0%:50%, or 0:1. Our new posterior odds is 1024:1 * 0:1 = 0:1. There is now absolutely no chance that this coin is one that will land heads 100% of the time. But at the same time, it also seems unlikely that it's just a normal coin, given that it landed "heads" 12 times in a row just before this. A new possibility suggests itself: that this coin has something like a 90% chance of landing heads.
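The whole sequence of updates so far can be replayed in a few lines. This is only a sketch: it stores the odds of "always heads" against "normal" as an exact fraction, and the final comparison with a 90% "heads" coin is an added illustration whose likelihood ratio carries no prior odds, since none were given for that hypothesis.

```python
from fractions import Fraction

# Odds are "always heads" : "normal", stored as a single exact fraction.
odds = Fraction(1, 4)                          # prior odds from the example
heads_ratio = Fraction(1) / Fraction(1, 2)     # P(heads|always heads) / P(heads|normal) = 2
tails_ratio = Fraction(0) / Fraction(1, 2)     # P(tails|always heads) / P(tails|normal) = 0

history = [odds]
for _ in range(12):                            # twelve "heads" in a row
    odds *= heads_ratio
    history.append(odds)

# Only meaningful if these two hypotheses are treated as the complete set.
probability_after_heads = odds / (1 + odds)

# A single "tails" rules out "always heads" entirely.
odds_after_tails = odds * tails_ratio

# Hypothetical third hypothesis: a coin that lands "heads" 90% of the time.
# Likelihood ratio of the full sequence (12 heads, 1 tails) against a normal coin.
ratio_90_vs_normal = (0.9**12 * 0.1) / (0.5**13)

print(history[1], history[2], history[12])            # 1/2 1 1024
print(f"{float(probability_after_heads):.4f}")        # 0.9990
print(odds_after_tails, round(ratio_90_vs_normal))    # 0 231
```

The thirteen flips together fit a 90% "heads" coin a couple of hundred times better than they fit a normal coin, which is why that new possibility suggests itself.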

This illustrates one of the major advantages of the odds form of Bayes' theorem. Before this, you hadn't even considered that the chance for this coin to land "heads" was anything other than 50% or 100%. All of the other hypotheses - such as the coin landing "heads" 90% or 80% or 20% of the time - you had ignored. And yet, even without considering the complete set of competing hypotheses, you were still able to carry out valid calculations and make statistical inferences, reaching sound conclusions.

You both stare at the coin that landed "tails". You ask your friend, "What just happened?" He replies, "well, the magician I bought it from said that it would always land heads. And it seemed to be working fine up 'til now. Maybe he just meant that it'll land heads most of the time?" Being naturally suspicious, you respond, "Looks like he lied to you then. He probably just sold you a normal coin".  But your friend comes back with, "C'mon, you know that's not fair. Human language doesn't work like that. It's imprecise by its very nature. When someone says 'always' in casual conversation, they don't necessarily mean '100.000000...% of the time' with an infinite number of significant figures. Even 'normal' coins don't land heads exactly 50.000000...% of the time". Struck by your friend's rare moment of lucid articulation, you become temporarily speechless. "Besides", your friend continues, "the magician might have said that the coin 'nearly always lands heads'. I don't remember exactly".

With this new insight, you realize that you had set your priors on the wrong hypotheses at the beginning of the problem. Instead of the hypotheses that the coin lands "heads" exactly 100% of the time, or exactly 50% of the time, you should have set them to 'close to 100% of the time' and 'close to 50% of the time'. Taking the prior odds to be P(close to 100%)/P(close to 50%) = 1:4 as before, and interpreting "close to" as a flat distribution within 2% of the given value, we get that the likelihood ratio for the coin landing "heads" is P(heads|close to 100%)/P(heads|close to 50%) = 99%:50% = 1.98:1, and for the coin landing "tails" is P(tails|close to 100%)/P(tails|close to 50%) = 1%:50% = 0.02:1. Then the value for the posterior odds after 12 heads and 1 tails is given by prior odds times likelihood ratio, and it is roughly:

1:4 * (1.98:1)^12 * 0.02:1 = 18.15:1

(This is an approximation, made by assuming that each probability distribution is entirely concentrated at the center of its interval. The actual value, 16.97:1, can be obtained by a straightforward integration over the probability distributions, but that calculation lies beyond the scope of this introductory post.)
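For readers who want to check both numbers, here is a rough sketch in Python. It assumes, as described above, that "close to 100%" means a flat distribution on 98%-100% and "close to 50%" means a flat distribution on 48%-52%. The function names are made up for this illustration, and the integral is done with the closed-form antiderivative of p^12 * (1 - p) rather than with any numerical library.

```python
def point_likelihood(p, heads):
    """Probability of `heads` heads followed by one tails, for a fixed p."""
    return p**heads * (1 - p)

def flat_likelihood(lo, hi, heads):
    """The same probability, averaged over a flat distribution of p on [lo, hi]."""
    def antiderivative(p):
        # Integral of p**heads * (1 - p) with respect to p.
        return p**(heads + 1) / (heads + 1) - p**(heads + 2) / (heads + 2)
    return (antiderivative(hi) - antiderivative(lo)) / (hi - lo)

prior_odds = 1 / 4  # "close to 100%" : "close to 50%"
heads = 12

# Approximation: each distribution is concentrated at the center of its interval.
approx_odds = prior_odds * point_likelihood(0.99, heads) / point_likelihood(0.50, heads)

# Integration over the flat distributions themselves.
exact_odds = prior_odds * flat_likelihood(0.98, 1.00, heads) / flat_likelihood(0.48, 0.52, heads)

print(f"{approx_odds:.2f}")  # 18.15
print(f"{exact_odds:.2f}")   # 16.97
```

The two results differ because the likelihood is not linear in p, so averaging it over an interval is not the same as evaluating it at the center of the interval.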

So you don't have to abandon the "close to 100%" hypothesis along with the "exactly 100%" hypothesis. The odds are still roughly 18:1 in favor of the coin landing "heads" more than 98% of the time, against it being a "normal" coin - enough for you to be reasonably confident in believing as your friend does.

This illustrates again the advantages of using the odds form. Firstly, we again didn't have to consider other probability values for the coin landing "heads", such as 75%. We were still able to come to a reasonable conclusion without having to specify the complete set of competing hypotheses and their probability distribution. Secondly, we were able to completely switch the class of hypotheses under consideration, without losing consistency. If we had stuck to the original form of Bayes' theorem, then we would have had to specify our prior probabilities for P(heads exactly 100% of the time) and P(heads exactly 50% of the time). To maintain our 1:4 ratio, we would assign them as 20% and 80%, taking up all 100% of our probability, because we were not thinking about other possibilities. But then, upon realizing our mistake, we would have no choice but to contradict our previous priors, and assign P(heads close to 100% of the time) and P(heads close to 50% of the time) some values, while going back and admitting that the chances of the coin giving exactly 50% or 100% "heads" are nearly zero. This is a problem created entirely by being unaware of the complete set of competing hypotheses.

But with the odds form, we don't have to have complete awareness. All the conclusions that we came to are still perfectly consistent with the data: there is zero chance for the coin to land "heads" exactly 100% of the time, yet it is much more likely that the "heads" probability is close to 100% than it being a normal coin. Our two sets of priors do not contradict each other either: it's quite reasonable for our prior odds to be 1:4 in both cases, because we have not specified how much of the total probability they take up. In general, I feel that it's easier to say how likely two hypotheses are relative to one another, rather than specifying the absolute probability value for a hypothesis.

I hope this convinces you of the virtues of the odds form of Bayes' theorem. This is how I use Bayes' theorem in everyday situations to sharpen my thinking: I didn't know if this one movie was going to be any good (prior odds), but upon its recommendation from a friend (likelihood ratio), I revise my opinion and am now more likely to see it (posterior odds). I didn't know whether Argentina or Germany was more likely to win the World Cup (prior odds), but upon watching Germany slaughter Brazil (likelihood ratio), I now consider Germany more likely than Argentina to win the World Cup (posterior odds). So on and so forth. Posterior odds is prior odds times likelihood ratio.

Let's consider a couple of last examples:

I don't know if Bill Gates owns Fort Knox (prior odds). But I know that he's rich, and he's more likely to be rich if he owns Fort Knox than if he does not (likelihood ratio). Therefore, given that Bill Gates is rich, he's more likely to own Fort Knox (posterior odds).

Does that reasoning sound suspicious? It should. I took it straight from the Wikipedia page on "affirming the consequent", which is a logical fallacy. But the structure of the above argument is correct according to Bayes' theorem. It follows the same structure as all of my other examples. So, has Bayesian reasoning led to a logical fallacy? Oh no! What shall we do?

Hold that thought, while we consider our last example:

I don't know whether Einstein's theory of general relativity or Newton's theory of gravity is correct (prior odds). But upon considering the experimental evidence of bending of starlight observed during the 1919 solar eclipse (likelihood ratio), I now consider general relativity much more likely to be correct than Newtonian gravity (posterior odds).

You should recognize that as the event that actually "proved" general relativity to the public, and the epitome of the scientific method at work: hypotheses are judged according to their agreement with experimental observations. But this is nothing more than just straightforward Bayesian reasoning, following the same structure as all of my other examples. So, it turns out that Bayesian reasoning underlies the scientific method, by providing the logical framework for it.

What are we to make of these two last examples? Does Bayesian reasoning allow for affirming the consequent? But isn't that a logical fallacy? But doesn't Bayesian reasoning also underlie the scientific method? Does that mean that science follows a logically flawed system? What are we to make of this?

I will address these issues in my next post.

Basic Bayesian reasoning: a better way to think (Part 2)

Image: portrait of Thomas Bayes, public domain
In my previous post, I explained that instead of thinking of logical statements as only being "true" or "false", we should assign probability values for their chance of being true. This is the fundamental tenet of Bayesian reasoning. This allows us to employ the entire mathematical field of probability theory in our thinking and expands the rules of logic far beyond their limited forms in propositional logic.

Essentially, we can now use any valid equation in probability theory as a rule of logic. We saw a useful example in the last post: P(C|A) = P(C|BA)P(B|A) + P(C|~BA)P(~B|A). This captures the intuitive idea that if A is likely to lead to B, and B is likely to lead to C, then A is likely to lead to C. But it also does more - it tells us precisely how to calculate the probability for our conclusion, while simultaneously sharpening, guiding, and correcting our thinking. In particular, it tells us that in the second step, it's not enough that B is likely to lead to C, instead requiring that BA is likely to lead to C. (Incidentally, this is why a blind person is not likely to get traffic tickets, even though a blind person is likely to be a bad driver, and bad drivers are likely to get tickets.)
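To make the blind-driver point concrete, here is a small Python sketch. Every number in it is invented purely for illustration, and the function name is hypothetical. It takes A = "is blind", B = "is a bad driver", and C = "gets traffic tickets".

```python
def prob_c_given_a(p_b_given_a, p_c_given_ba, p_c_given_not_ba):
    """P(C|A) = P(C|BA)P(B|A) + P(C|~BA)P(~B|A)."""
    return p_c_given_ba * p_b_given_a + p_c_given_not_ba * (1 - p_b_given_a)

# Made-up values, for illustration only.
p_bad_given_blind = 0.99               # P(B|A): a blind person is likely a bad driver
p_ticket_given_bad = 0.60              # P(C|B): bad drivers in general often get tickets
p_ticket_given_bad_and_blind = 0.01    # P(C|BA): but blind bad drivers rarely drive at all
p_ticket_given_good_and_blind = 0.01   # P(C|~BA)

# Tempting but wrong: chaining P(B|A) with P(C|B).
naive = p_bad_given_blind * p_ticket_given_bad

# What the equation actually asks for: P(C|BA), not P(C|B).
correct = prob_c_given_a(
    p_bad_given_blind,
    p_ticket_given_bad_and_blind,
    p_ticket_given_good_and_blind,
)

print(round(naive, 3))    # 0.594
print(round(correct, 3))  # 0.01
```

The only difference between the two calculations is whether the second step is conditioned on A as well as B, which is exactly the correction the equation demands.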

In this post, I will introduce another such equation in probability theory:

P(A|B) = P(B|A)/P(B) * P(A)

This is Bayes' Theorem. Named after Reverend Thomas Bayes, this unassuming little equation, which can be derived immediately from the definition that P(A|B) = P(AB)/P(B), is so important that its use is nearly synonymous with Bayesian logic, and its interpretation is the logical basis for the scientific method. At its heart, this equation tells you how to update your beliefs based on the evidence. To see how that works, set A = "hypothesis", and B = "observation" in the formula. The equation then becomes:

P(hypothesis|observation) = P(observation|hypothesis)/P(observation) * P(hypothesis)

Each factor can then be translated into words as:

P(hypothesis): probability that the hypothesis is true, before considering the observation.
P(hypothesis|observation): probability that the hypothesis is true, after considering the observation.
P(observation|hypothesis): probability for the observation, as predicted by the hypothesis.
P(observation): probability for the observation, averaged over the predictions from every hypothesis.

This equation tells us how we should update our opinion on a hypothesis after we make a relevant observation. That is, it tells us how to go from P(hypothesis) to P(hypothesis|observation). It says that a hypothesis becomes more likely to be true if it's able to predict an observation better than the "average" hypothesis: the bigger the ratio of P(observation|hypothesis)/P(observation), the more likely the hypothesis becomes. Conversely it becomes less likely to be true if it could not beat the "average" hypothesis in its predictions. In short, it says that an observation counts as evidence for the hypothesis that better predicted it. We already intuitively knew this to be true - but Bayes' theorem states it in a mathematically rigorous fashion, and allows us to put firm numbers to some of these factors.

Let's consider an example: Alice and Bob go out on a date. Bob likes Alice and wants to ask her on a second date, but he's not sure how she'll respond. So he hypothesizes two possible outcomes: Alice will say "yes" to a second date, or she will say "no". Based on all the information he has - how Alice acted before and during the date, how they communicated afterwards, etc. - he thinks that there's a 50-50 chance between Alice saying "yes" or "no". That is to say:

P(Alice will say "yes") = P(Alice will say "no") = 0.5

For the sake of simplicity, we will not consider other possibilities, such as Alice saying some form of "maybe". These two "yes" and "no" will serve as our complete set of possible hypotheses.

While Bob is agonizing over this second date, he runs into Carol, who is a mutual friend of Alice and Bob. She tells Bob, "Alice absolutely loved it last night! She can't wait to go out with you again!". Carol's affirmation serves as evidence that Alice will say "yes" to a second date. We already knew this intuitively: Carol's affirmation is obviously good news for Bob. But Bayes' theorem allows us to calculate the probability explicitly from some starting probabilities. To see this, we need to evaluate two probability values: P(Carol's affirmation|Alice will say "yes"), and P(Carol's affirmation|Alice will say "no").

What value should we assign to P(Carol's affirmation|Alice will say "yes")? That is, if Alice would say "yes" to a second date, what is the probability that Carol would have given Bob her affirmation? Not particularly high - after all, Carol could have simply forgotten to mention Alice's reaction, or Alice and Carol might not have had a chance to discuss the first date, or Alice could have had a terrible time, but she might still give Bob a second chance. All these are ways that the "yes" hypothesis might not lead to Carol's affirmation. Taking these things into account, let's say that P(Carol's affirmation|Alice will say "yes") = 0.2.

What about P(Carol's affirmation|Alice will say "no")? This is the probability that Carol would still communicate her affirmation to Bob, even though Alice would say "no" to a second date. Now, it could be that Alice hated her first date with Bob, but Carol deliberately lied to him. Or maybe Carol simply wanted to encourage Bob even though she didn't really know how Alice felt. Or Alice did really enjoy her time with Bob, but she'll be suddenly struck by amnesia before Bob asks her out again. But assuming that Alice and Carol are honest people, and that nothing particularly strange happens, it's very unlikely that Carol gives Bob her affirmation if Alice is going to say "no". So let's say that P(Carol's affirmation|Alice will say "no") = 0.02.

Now, what about P(Carol's affirmation)? This is the last factor we need to apply Bayes' theorem. This is the probability that Carol gives Bob her affirmation, averaged over both the "yes" and "no" hypotheses. Since there's a 50-50 chance that Alice will say "yes" or "no", this is simply the average of the two probabilities mentioned above: 0.5*0.2 + 0.5*0.02 = 0.11. This step can get complicated, but because of the 50-50 chance for our two hypotheses, it is mercifully short in this simple example. So P(Carol's affirmation)=0.11.

This now gives Bob enough information to compute P(Alice will say "yes"|Carol's affirmation). That is, given that Carol told Bob that Alice wants to go out again, what is the probability that Alice will answer "yes" to a second date? According to Bayes' theorem:

P(Alice will say "yes"|Carol's affirmation) =
P(Carol's affirmation|Alice will say "yes")/P(Carol's affirmation) * P(Alice will say "yes") =
0.2/0.11 * 0.5 = 0.909090... = 10/11

Carol's affirmation, upon considering it as evidence, has pulled the probability from 50% to 91%. That is, if Bob thought before that there was only a 50% chance that Alice will agree to a second date, he should now think that there is a 91% chance. That is what evidence does: it pulls the probability for a hypothesis in one direction or another. A strong piece of evidence might pull it all the way from 0.1% to 99.9%, whereas a weak piece of evidence might only pull it from 50% to 60%. An opposing piece of evidence will pull the probability in the other direction, as in a tug-of-war. This is why we commonly speak of "weighing" the evidence. This exemplifies how Bayesian reasoning corresponds to the common sense we use in everyday life, except that it's mathematically precise.

So there is a 10 out of 11 probability, or about a 91% chance, that Alice will say "yes" to Bob's request for a second date. Things are looking good for Bob! Of course, there is the remaining 1/11 probability that Alice will say "no". Bob will have to live with that chance of rejection. That's the nature of Bayesian reasoning - you can't ever be 100% certain, but you can be certain enough to act. Bob should definitely ask Alice out again.
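The whole calculation for Bob fits in a few lines of Python. This is just a sketch of the arithmetic above, using the made-up probabilities from this example. The function name `posterior` and the dictionary layout are my own choices for illustration.

```python
def posterior(priors, likelihoods, hypothesis):
    """P(hypothesis|observation) = P(observation|hypothesis)/P(observation) * P(hypothesis)."""
    # P(observation): the prediction of each hypothesis, weighted by its prior.
    p_observation = sum(priors[h] * likelihoods[h] for h in priors)
    return likelihoods[hypothesis] / p_observation * priors[hypothesis]

# P(Alice will say "yes") and P(Alice will say "no"), before hearing from Carol.
priors = {"yes": 0.5, "no": 0.5}

# P(Carol's affirmation|Alice will say "yes") and P(Carol's affirmation|Alice will say "no").
likelihoods = {"yes": 0.2, "no": 0.02}

p_yes = posterior(priors, likelihoods, "yes")
p_no = posterior(priors, likelihoods, "no")

print(round(p_yes, 4))  # 0.9091, or 10/11
print(round(p_no, 4))   # 0.0909, or 1/11
```

Changing the entries in `priors` or `likelihoods` is an easy way to see how sensitive Bob's conclusion is to each of his starting guesses.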

Note that Bayes' theorem, as with all Bayesian reasoning, compels you to accept its conclusions: you cannot simply say "I don't buy this argument" or "I don't find this convincing". If you accept its premises, you must accept its conclusion: otherwise you're violating the rules of mathematical logic.

But where do the premises come from? How did I assign, for example, that P(Alice will say "yes")=0.5, or P(Carol's affirmation|Alice will say "no")=0.02? Well, for the sake of this problem, I just made up some reasonable values. In real life, computing these values would be far more difficult than the example problem itself. For instance, to calculate P(Alice will say "yes"), Bob would have to consider all the relevant background information he has. This would include how Alice interacted with him during the first date, his knowledge of human mating behaviors (it's a good sign if she laughs at your jokes, it's bad if she calls the cops on you, etc), and any other relevant information. Based on this total information, he would calculate how often a woman like Alice would agree to a second date, and that would be his P(Alice will say "yes"). That's why this probability can be thought of as a personal, subjective degree of belief: because nobody else has the exact set of background information that Bob has.

What about calculating P(Carol's affirmation|Alice will say "no")? This is the probability that Carol will convey Alice's approval to Bob, even though Alice will say "no" to the second date. This number might be obtained through some sociological studies, by asking questions like "How often do women tell their friends that they enjoyed a date even if they didn't?" The nature of the relationship between Alice, Bob, and Carol also needs to be taken into account, along with their personalities. Are Alice and Carol very close friends? Is Carol generally reliable, or is she prone to hyperbole? The value of P(Carol's affirmation|Alice will say "no") is all this information condensed into a single number.

You may be disappointed that these probabilities are not simple to calculate. This is often the case in real-life scenarios. It turns out that humans and human relationships cannot be reduced down to a simple calculation, even with Bayes' theorem. Real life is complicated: this should not surprise anyone. Often, the relevant starting probabilities can only be guessed at from intuition. Being able to do that well is a large part of what it means to be a reasonable, logical person in the real world.

So those are the strengths and weaknesses of Bayes' theorem. On the one hand, it provides a firm, computationally exact way of updating your beliefs based on the evidence. On the other hand, the probabilities needed to perform the calculations can be difficult or impossible to assign. In particular, the assignment of prior probabilities - the degree of belief in the hypothesis before considering the observation - is a famously contentious issue within Bayesian reasoning, and there is no way to assign these numbers that's been established as being correct. This was the value of P(Alice will say "yes") in our example above, and I have described it as a personal, subjective probability based on the unique set of background information that a person has. This gives us a ballpark number that we can immediately use, but its imprecise and subjective nature, combined with the human capacity for self-deception, is a cause for concern.

Is Bayesian reasoning still useful in light of these weaknesses? Definitely. It is still far more applicable than propositional logic, and it still tells us, in a mathematically precise way, how to logically incorporate evidence into our beliefs. And often, when there is enough evidence, the specific values of these questionable probabilities turn out to be irrelevant. This is why we often look for overwhelming evidence, beyond any reasonable doubt, before we decide to take action based on a hypothesis. So while Bayes' theorem cannot tell us everything (nothing in this world can), it is a very useful tool for sharpening our thinking and processing evidence to update our beliefs.
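Here is a quick Python sketch of how enough evidence can make the starting probability nearly irrelevant. The setup is hypothetical: it imagines that Bob receives three independent pieces of evidence, each exactly as strong as Carol's affirmation (0.2 versus 0.02), and it tries three very different starting values for P(Alice will say "yes"). The function name is made up for this illustration.

```python
def update(prior_yes, likelihood_yes, likelihood_no):
    """One application of Bayes' theorem for a yes/no pair of hypotheses."""
    p_observation = prior_yes * likelihood_yes + (1 - prior_yes) * likelihood_no
    return likelihood_yes / p_observation * prior_yes

results = {}
for starting_prior in (0.1, 0.5, 0.9):
    p = starting_prior
    # Three independent observations of equal strength (an assumption).
    for _ in range(3):
        p = update(p, 0.2, 0.02)
    results[starting_prior] = p

for starting_prior, final in results.items():
    print(starting_prior, "->", round(final, 3))
```

Under these assumptions, a pessimist who started at 10% and an optimist who started at 90% both end up above 99%, so their disagreement about the prior no longer affects what they should do.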

In my next post, I will re-cast Bayes' theorem into a different mathematical equation - the odds form - which eases some of the difficulties of Bayesian reasoning. I will use this new form to discuss more examples in Bayesian reasoning.
