October 2009 Archives

Bayes in the Courtroom


New Scientist has just put out an article on Bayesian probability and the justice system.

The article itself is pretty good, and I'll come on to it in a moment, but first I wish to take issue with the quiz at the start. If you haven't already, take time out to do the quiz and read the article, and come back here when you're ready.

Done? Good.

Now, I've previously written a Very Long Blog on Bayes. I like to think I'm pretty on the ball with the subject, and in my opinion the only really hard thing you have to do is to be completely sure of what question you are asking, and what information you need to answer it. After that, it's mere number crunching, often of the variety that doesn't even need explicit use of Bayes' Theorem.

The New Scientist article is fuel to my fire. It is written by a journalist, presumably not a statistician, mathematician or anyone else who may have sat through lectures on the subject of probability, but someone who has spent time researching the subject and thinking in depth about it. And yet they cannot correctly author a quiz on the subject.

The article now has a couple of comments, one from me, on issues with the quiz. I must say I don't agree with the earlier commenter, broadly speaking. However, the quiz certainly is flawed.

On a practical matter, take question 3.
A man has been murdered, and various pieces of evidence mean that we can be certain that the murderer had a particular disease. The disease is rare; only 1 in 10,000 people have it. The suspect has been tested for the disease, using a test that is 99 per cent accurate, and the test was positive. What is the probability that the suspect really has the disease?
There are three options really in play - 1 in 100, 1 in 101 and less than 1 in 101. There may be issues of slight rounding errors that the earlier commenter mentioned, but my issue here is that all the answers are basically the same - you don't convict someone on evidence as flimsy as 1 in 100 or 1 in 101, and that tiny difference is never a practical one. I answer that question by making approximations in my head that get me to 1 in 100 - marking me down for that in the same way as someone who ends up with the 99 in 100 or even 100 in 100 answer is absurd. But I can live with a computer marking me down for approximating.
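
The head-maths for question 3 can be done by counting people rather than by invoking Bayes' Theorem explicitly. Here is a minimal sketch of that counting. It assumes that "99 per cent accurate" means the test gives the right answer 99% of the time whether or not you have the disease, and that before the test the suspect is no more likely than anyone else to have it. Neither assumption is spelled out in the quiz, and the population size is arbitrary.

```python
from fractions import Fraction

# Assumed reading of the quiz: 1 in 10,000 people have the disease, and the
# test is right 99% of the time for both the ill and the healthy.
population = 1_000_000          # arbitrary, chosen so the counts are whole numbers
prevalence = Fraction(1, 10_000)
accuracy = Fraction(99, 100)

with_disease = population * prevalence              # 100 people
without_disease = population - with_disease         # 999,900 people

true_positives = with_disease * accuracy            # 99 people
false_positives = without_disease * (1 - accuracy)  # 9,999 people

# Of everyone who tests positive, what fraction really has the disease?
p_disease_given_positive = true_positives / (true_positives + false_positives)
print(p_disease_given_positive)         # 1/102
print(float(p_disease_given_positive))  # a little under 0.01
```

Under those assumptions the exact answer is 1 in 102, so the quick approximation of 1 in 100 differs only in the third significant figure of an already tiny probability.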

On a more significant matter, take question 5.
Two children have died in the same family. Their parents are on trial, accused of murdering them. The defence claims that the children both died of Sudden Infant Death Syndrome, or "cot death". An expert witness testifies that the odds of one child dying of cot death, in a family like the one on trial, is 1 in 8500. Hence, he argues, the probability that both died of cot death is that probability multiplied by itself: 1 in 73 million. What is the probability that the two children did both die of cot death, and thus were not murdered?
What is the question being asked? It is, I would say, "Given two dead children that either died through murder or cot death, which of the two actually killed them?"
I don't know if that's actually what the author meant to ask, as their solution answers "What is the chance of two children dying of cot death?"
This does not give you an odds ratio for cot death over murder. It can't - not without you knowing how common murder is. To take extreme and absurd examples, if murder never happens, it was cot death. If every child is murdered before their first night's sleep at home, it was murder. It's only with the knowledge of murder probabilities and cot death probabilities that you can answer the question.
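
To make that dependence concrete, here is a small sketch. It assumes, as the question does, that cot death and murder are the only two explanations on the table, and it takes the expert witness's 1 in 73 million figure at face value. The murder rates are invented purely for illustration; they are not real statistics.

```python
def probability_cot_death(p_two_cot_deaths, p_two_murders):
    """P(cot death | two children died), assuming cot death and murder are
    the only two possible explanations."""
    return p_two_cot_deaths / (p_two_cot_deaths + p_two_murders)

# The expert witness's figure from the quiz, taken at face value.
p_cot = 1 / 73_000_000

# Hypothetical double-murder rates, made up to show how much the answer moves.
for p_murder in (0.0, 1 / 1_000_000_000, 1 / 73_000_000, 1 / 1_000_000):
    print(p_murder, probability_cot_death(p_cot, p_murder))
```

The same 1 in 73 million figure gives anything from certainty of cot death to near certainty of murder, depending entirely on the number the quiz never supplies.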

If the journalist writing the article can make these errors after researching it, what hope does the average person have when sitting as a juror in a serious case?

The linked Fenton and Neil paper is excellent though.

It quickly focuses on the fundamental point that two similar sounding questions - that of the probability of the evidence given the hypothesis, and that of the probability of the hypothesis given the evidence - can have very different answers. It goes on to discuss the range of errors that can result from this and related difficulties (and this range of errors is an impressive and scary range). It even covers the good old Birthday Paradox, and shows how that can mislead the court. The technical details of these fallacies the paper covers are clearly presented, and well worth going through.

It's important because, even though many readers will recall the issues surrounding the cot death cases and other high profile cases, the authors point out that this is still happening, despite well publicised open letters from the President of the Royal Statistical Society. It's a persistent problem that clearly hasn't been put paid to.

The meat of the paper is perhaps the solution offered - that jurors be presented essentially with a black box, or a calculator, that does the complex stuff for them. You just feed in the offered probabilities for the evidence, and see what comes out. I'd like to be optimistic that this would catch on but it seems from the discussion in the paper that this stuff is old technology and it just isn't making inroads. What can be done about that?

I finish with these terrifying quotations of the judge from the Adams trial reported in the paper:
"The introduction of Bayes' theorem into a criminal trial plunges the jury into inappropriate and unnecessary realms of theory and complexity deflecting them from their proper task"
and
"The task of the jury is ... to evaluate evidence and reach a conclusion not by means of a formula, mathematical or otherwise, but by the joint application of their individual common sense and knowledge of the world to the evidence before them"
What kind of judge insists that jurors should use common sense in a situation where common sense is commonly known to be nonsense? And Bayes' theorem is not inappropriate and unnecessary for the jury's proper task; it is precisely the opposite (however poorly it may have been presented at the trial). It's absurd, and I would think jurors should have a right to expert assistance in dealing with evidence. I hope that becomes commonplace in the future, and that miscarriages of justice resulting from the incorrect thinking we are all prone to are avoided as much as possible.


edit: New Scientist have now reworded question 5 to correct it to better reflect the intended answer.

Superstition in the World of Warcraft

I've been considering writing this for a while, but a recent jibe at a fellow skeptical blogger made me think about bringing this forward.

I have to confess, I'm a player of World of Warcraft (WoW). For those not in the know, it's a fantasy game played with many other people sharing the game world with you. It has a total population of players rivalling that of a not inconsiderable country - approaching 12 million paying subscribers.

There's an incredible variety of people playing. Naturally, many geeks play, but I've played alongside an elven hunter who has a secret identity of a church minister, and a gnome who when not firing large fireballs into the face of trolls is a civil servant of some seniority. There are doctors, and nurses, journalists, atheist scientists who for some reason have ended up playing priests of all things, and a very large number of obnoxious teenagers.

As I said, it does have a large number of geeks playing. The game exposes a lot of its mathematics to the player, and a number of them engage in significant amounts of number-crunching to improve their play, and in scientific experiments to determine how best to melt the faces of their enemies.

However, a lot of the underlying programming is hidden, and not everyone is quite so nerdy. This leads to what I think is an interesting situation.

What we have is an alternative world whose laws of physics are essentially precisely determined - they're embedded in the programming. They're not determinable by the average user but they absolutely exist and are known to the creators. There is also heavy use of random numbers which, as it is all running on computers, come from pseudo-random number generators (PRNGs), probably with seeds drawn from suitable entropy pools. Those pools, as a result of the servers running the world having multiple users engaging in many actions per second, are no doubt quite well-fed.

To explain that in more detail, a PRNG takes some number to start with, and uses it to generate a sequence that jumps around in a way that appears random, but is precisely determined by that first value - the seed. To prevent 'attacks' (where you don't want someone to know what your PRNG generates) you need to find a random source for these seeds, which is the entropy pool. A good modern OS will use keyboard pushes and their timings, mouse movements and clicks and so on to produce that first number - it relies on the human interaction to make that first number unpredictable, after which it becomes very hard indeed to predict the others.
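
The seed-determines-everything behaviour is easy to see for yourself. This is a minimal sketch using Python's built-in generator, which is only a stand-in: I have no idea which PRNG or seeding scheme WoW's servers actually use, and the seed values here are arbitrary.

```python
import random

def rolls(seed, count=5):
    """Return `count` rolls between 1 and 100 from a PRNG started at `seed`."""
    rng = random.Random(seed)   # a private generator, fully determined by the seed
    return [rng.randint(1, 100) for _ in range(count)]

print(rolls(12345))
print(rolls(12345))   # identical to the line above: same seed, same sequence
print(rolls(12346))   # a neighbouring seed gives an unrelated-looking sequence

# Drawing the seed from the operating system's entropy pool makes the
# sequence unpredictable to anyone who cannot see that seed.
unpredictable = random.Random(random.SystemRandom().getrandbits(64))
print(unpredictable.randint(1, 100))
```

Nothing a player does in the game appears anywhere in that last seed unless a programmer deliberately goes out of their way to put it there.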

Essentially, in WoW, all the luck that you experience is embedded within that PRNG. It's an extremely reasonable proposition that your luck in WoW is entirely determined by these well-established laws of physics combined with the PRNGs and their seeds in some distant server deep in a server room in a far off city. It's quite obviously implausible that anyone could influence their luck in this situation.

But superstitions about this world arise just as they do with the real world. Here's a post from the Daedalus Project, which studies the psychology of these so-called MMORPGs. It discusses the idea that (essentially) PRNGs are seeded by things players can easily control - such as the class of the player that first enters a particular dungeon. This is a very odd idea from a computer science standpoint, as basically you would have to write your own code to seed your PRNG rather than rely on the seeding already built into the operating system. Doing this is not only stupid, but it's a kind of stupid that requires extra work. And this belief is sometimes held quite fervently, even when basically God comes along and tells you that's not how he built the universe.

You'll see from the Daedalus Project article that this is by no means restricted to WoW, and some truly elaborate ideas can spring up.

You'll also see a lot of humour about the inevitable coincidences that come up when people are constantly rolling a random number between 1 and 100 to determine who wins some prize or another, but usually this is taken as the actual coincidence it really is, rather than as some supernatural force at work (note, many other pages on the above linked site contain NSFW text).
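
Those coincidences really are inevitable. As a rough sketch, here is the chance that at least two people in a group roll the same number on a fair 1 to 100 roll. The group sizes are just ones I picked for illustration, and the calculation assumes every roll is independent and uniform.

```python
def chance_of_matching_rolls(players, sides=100):
    """Probability that at least two of `players` fair rolls on 1..sides match."""
    p_all_different = 1.0
    for already_rolled in range(players):
        # The next player must avoid every number rolled so far.
        p_all_different *= (sides - already_rolled) / sides
    return 1.0 - p_all_different

for group_size in (5, 10, 25, 40):
    print(group_size, round(chance_of_matching_rolls(group_size), 3))
```

With only ten people rolling there is already better than a one in three chance of a tie, so a matching pair of rolls needs no supernatural explanation.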

Here's another excellent blog on the matter. The matter of the dragon Onyxia casting her devastating Deep Breath is a well-known source of peculiar ideas.

However, the idea that there are people who hold superstitions about a world in which the laws of physics are genuinely and absolutely known - at least by someone if not the general population - and where there fundamentally is no mystery (unless some somehow leaks in from our world in such a way as to give them individually some particularly good or bad luck) is to me pretty surprising. I guess it goes to show that peculiar ideas persist even when there is no room for the mystery to support them.

Recycled content: Comets and the LHC

I work on the Galaxy Zoo project, part of which involves a forum with daily 'Object of the Day' posts. I don't find so much time to post these days, and the users do an amazing job of posting their own content there, but I thought I would reuse a couple of my older posts as blogs. This one seems like suitable content for the themes I tend to touch on here. This one is from 16th September 2008, 6 days after the first proton beam circulation in the LHC.

Today I've gone for a comet:
587731514218577929 is a comet posted by Arralen
 


I've picked this comet as, while it's fairly small, it shows very nicely how the tail of the comet is pointing in a different direction from the comet's motion - you can see the 'trail' of coloured points at the head moves in a different direction to the tail.

As to why I've posted a comet? Well, it's all to do with the LHC. "What have comets got to do with particle accelerators?" I hear you cry.

Well, I was reading a book last night called "The Varieties of Scientific Experience", by the well known astronomer Carl Sagan (who sadly passed on in 1996). It's an excellent read which I'd thoroughly recommend, and has more than a few gorgeous astronomical photos in it to boot.

In it, he tells the tale of Sir William Huggins, an astronomer in 1910. Huggins was amongst the earliest astronomers taking advantage of spectroscopy, and he'd taken spectra of several comets and had found the signatures of a number of organic molecules in them. Amongst those signatures was that of cyanide.

In 1910, it was looking like Earth might just pass through the tail of Halley's Comet. Despite reassurances from astronomers that the amount of cyanide in the tail would make no difference to people on Earth (if the Earth passed through the tail at all!), there was some degree of panic. The Pope issued a statement criticising people who hoarded oxygen cylinders in Rome. Quacks sold anti-cyanide comet pills. A number of people committed suicide.

Needless to say, no one died from being poisoned by Halley's Comet.

People today thought that switching on the LHC would spell the end of the world as well, despite reassurances that this would not happen. While I've heard of no instances of quacks selling anti-LHC pills (I doubt anyone would be gullible enough to think a pill could help there...) there are more than a few parallels.

And needless to say, no one died when the LHC got switched on.

Distances across the universe

I promised at some point I'd blog on space, and now I am.

I'm going to talk a bit about how big the universe is - it's what's in my blog title, and if Simon Singh hadn't got more important things on his plate we might still be enjoying his tale about Katie Melua, which I'll come on to later.

So, how big is the universe? First off, to answer this question we need to know what we mean by the universe. The universe - all the things that exist that are physically located in the same connected piece of space as us (so I'm ignoring parallel universes or anything along those lines) - might be enormous. It might even be infinite - we don't have a way of telling.

A more sensible question than simply 'how big is the universe?' (which is basically unanswerable) is 'how big is the observable universe?' - broadly speaking, this is how far away the most distant thing we can, or could, see is.

Let's go back to Simon Singh's little tale of Katie Melua. As Simon explains here, in a somewhat less famous Guardian article, Katie Melua sang
We are 12 billion light-years from
the edge,
That's a guess,
No one can ever say it's true,
But I know that I will always be
with you.
Simon rightly pointed out that it's not a guess, and it's not 12 billion light years. The distance to the edge (which is in no way a physical edge, just purely a matter of perspective in the way the horizon you see out of your window is) is determined by the age of the universe - it's literally as far as light has had the chance to travel since the universe began. Nothing can go faster than light, and light can't have been going anywhere before the Big Bang. The age of the universe is not 12 billion years, but 13.7 billion years.

There's actually an extra caveat that for the first 370,000 years the universe was something like a thick fog - light couldn't get anywhere without hitting something. So the distance to the edge is a little further than the most distant thing we can see (which is the CMB, currently being observed in obscene detail by the Planck satellite).

The problem is that when Simon rewrote the lyrics he said:
We are 13.7 billion light-years from
the edge of the observable universe,
That's a good estimate with
well-defined error bars,
Scientists say it's true, but
acknowledge that it may be refined,
And with the available information,
I predict that I will always be
with you

Now the tricky thing is, and this may confuse you, the fact that light has travelled for 13.7 billion years does not mean it has covered 13.7 billion light years. The thing is, distance in cosmology is confusing. The sole reason it's confusing is that the universe is an ever-changing place which has been expanding since... well.. the year dot.

Since the universe first became transparent to light, those 370,000 years after the beginning, the universe has become about a thousand times larger. That means that the very first light year the light crossed on its way to us is now a thousand times larger too. While the light has only been travelling for 13.7 billion years, the distance between us and where it left from is now about 46 billion light years. This is what cosmologists call the line-of-sight comoving distance. There's also a transverse comoving distance which is kind of the same but for separations across the sky from our point of view, rather than along our line-of-sight. It sounds crazy that they can be different, but trust me, this is only going to get worse, and on the upside it turns out our universe seems to be flat - a technical term meaning that the universe is flat. This means that the two comoving distances are luckily the same and we can ignore the distinction for the most part.
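
For the curious, the gap between light travel time and comoving distance can be checked numerically. This is a rough sketch only: it assumes a flat universe with round-number ingredients (a Hubble constant of 70 km/s/Mpc, 30% matter, a trace of radiation, the rest dark energy) and that the universe was 1,100 times smaller when it became transparent. None of those numbers come from this post, so the output lands near, not exactly on, the figures quoted above.

```python
import math

# Assumed round-number parameters for a flat universe (not from the post).
H0 = 70.0                      # Hubble constant, km/s/Mpc
OMEGA_MATTER = 0.3
OMEGA_RADIATION = 8.5e-5
OMEGA_LAMBDA = 1.0 - OMEGA_MATTER - OMEGA_RADIATION
HUBBLE_TIME_GYR = 977.8 / H0   # 1/H0 expressed in billions of years
SIZE_THEN = 1.0 / 1100.0       # relative size of the universe at transparency

def expansion_rate(a):
    """H(a)/H0 when the universe is a fraction `a` of its present size."""
    return math.sqrt(OMEGA_MATTER / a**3 + OMEGA_RADIATION / a**4 + OMEGA_LAMBDA)

def integrate(f, a_start, a_end=1.0, steps=20_000):
    """Trapezoid rule for the integral of f(a) da, stepping evenly in ln(a)."""
    u_start, u_end = math.log(a_start), math.log(a_end)
    h = (u_end - u_start) / steps
    total = 0.0
    for i in range(steps + 1):
        a = math.exp(u_start + i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(a) * a      # da = a d(ln a)
    return total * h

# Light travel time adds up dt = da / (a H).
# Comoving distance adds up c dt / a, so early stretches of the path count for more.
travel_time_gyr = HUBBLE_TIME_GYR * integrate(
    lambda a: 1 / (a * expansion_rate(a)), SIZE_THEN)
comoving_gly = HUBBLE_TIME_GYR * integrate(
    lambda a: 1 / (a**2 * expansion_rate(a)), SIZE_THEN)

print(f"light travel time: {travel_time_gyr:.1f} billion years")
print(f"comoving distance: {comoving_gly:.1f} billion light years")
```

The only difference between the two integrals is the extra division by the size of the universe at the time, which is exactly the "that first light year is now a thousand times larger" effect.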

Now that's all fine and dandy, but we can't really measure comoving distances. We can only measure things about our universe, formulate a model of it, and work out given that model what the comoving distances are.

So let's think about the ways in which we might figure out distances. The obvious way, for an astronomer, is to use a luminosity distance. Luminosity distances basically involve looking at a source emitting a known amount of light, and from the amount of light you actually see figuring out how far away it is. Obviously, further away things are fainter. The problem is, as the universe expands, it stretches light and makes it lose energy (this may cause you to jump up and shout "Conservation of Energy!" but if you do this, Einstein magically appears and subjects you to a six month PhD level lecture course at the end of which you find it was all ok after all). Also, it means that as a source emits photons of light, the time between each photon gets stretched out, and the source looks that much fainter. This means luminosity distances are much bigger than 'real' comoving distances by quite a chunk. The universe in luminosity distance is not 46 billion light years from here to the edge but a boggling 50 trillion light years.
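
Here is where the 50 trillion comes from, as a sketch. Each of the two dimming effects (photons losing energy, photons arriving less often) cuts the brightness by the factor the universe has stretched by, so the brightness drops by that factor squared, and by the inverse square law the inferred distance goes up by the factor itself. I'm assuming a flat universe and a stretch of 1,100 since the light set off; the post only says "about a thousand".

```python
def luminosity_distance(comoving_distance, stretch):
    """Luminosity distance in a flat universe.

    `stretch` is how many times larger the universe is now than when the
    light set off (cosmologists write it as 1 + z).
    """
    return comoving_distance * stretch

comoving_gly = 46.0     # billion light years, the figure quoted in the post
stretch = 1100.0        # assumed value for the oldest light we can see

print(luminosity_distance(comoving_gly, stretch) / 1000, "trillion light years")
```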

But this isn't the only way to measure distance. One we all naturally learn when we're extremely tiny is that distant things look smaller. If we do things this way we get the angular diameter distance.

Except they don't. Not cosmologically. The problem is that the distant universe is the universe in the distant past, when the universe was small. In some sense, the universe is smaller on the outside than it is on the inside. But it still has to go round us all the way. This screws completely with how things get smaller as they get more distant, and above a certain distance, which is not actually tremendously far on the cosmological scale, things start getting bigger as they get further away. This happens for things where light has been travelling for about 10 billion years to get here.
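
You can watch that turnaround happen numerically. In a flat universe the angular diameter distance is the comoving distance divided by the stretch factor (1 + z), so it grows at first and then shrinks again. This sketch steps outwards and finds where it peaks, assuming the same kind of round-number flat universe as before (Hubble constant 70 km/s/Mpc, 30% matter, radiation ignored). Those parameters are my assumptions, not measurements quoted in this post.

```python
import math

H0 = 70.0                       # km/s/Mpc, assumed
OMEGA_MATTER = 0.3              # assumed; flat universe, radiation ignored
OMEGA_LAMBDA = 1.0 - OMEGA_MATTER
HUBBLE_TIME_GYR = 977.8 / H0    # 1/H0 in billions of years

def expansion_rate(z):
    """H(z)/H0 at redshift z."""
    return math.sqrt(OMEGA_MATTER * (1 + z)**3 + OMEGA_LAMBDA)

def scan(z_max=5.0, steps=5000):
    """Step outwards in redshift, returning (z, light travel time in Gyr,
    angular diameter distance in Gly) at each step."""
    dz = z_max / steps
    comoving = travel_time = 0.0
    rows = []
    for i in range(steps):
        z_mid = (i + 0.5) * dz                      # midpoint rule
        comoving += HUBBLE_TIME_GYR * dz / expansion_rate(z_mid)
        travel_time += HUBBLE_TIME_GYR * dz / ((1 + z_mid) * expansion_rate(z_mid))
        z = (i + 1) * dz
        rows.append((z, travel_time, comoving / (1 + z)))
    return rows

# The row with the largest angular diameter distance is where objects of a
# fixed size look smallest; beyond it they start to look bigger again.
z_peak, time_peak, distance_peak = max(scan(), key=lambda row: row[2])
print(f"objects look smallest at z = {z_peak:.2f}, "
      f"light travel time {time_peak:.1f} billion years")
```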

We can of course use light travel time as a distance measure too, as Simon did, but it's not one we'd generally use, and it misrepresents the true scale of the universe today. It's really, really big - 90 billion light years from one side to another.

Now I'll finish up with a bit of media misreporting. Fortunately, this bit of media misreporting was an honest mistake, not politically motivated, and no one stopped taking their antivirals as a result. You may, if you are unlucky, see claims that the universe is 156 billion light years across. This is wrong, and this is how I think the story came about:

There's a paper by Key et al. that looked at the patterns of fluctuations in the cosmic microwave background - that most distant of things we can observe - and tried to see if patterns on one side of the universe matched up with patterns on the other. This is the sort of thing that might happen if the universe was joined up on opposite sides somehow - that if you went in one direction long enough you'd end up back where you started, or at least a hell of a lot closer than you should have done. They found there were no such patterns larger than 20 degrees across (and couldn't make strong statements about smaller ones, essentially). You can use this maximum size to work out a scale for the universe repeating, which will generally be a bit smaller than the 46 billion light year scale I mentioned above. This slightly smaller scale, I think, got doubled to go from a radius to a diameter, and then a journalist accidentally thought this diameter was a radius, and doubled it again.

The universe, basically, isn't as big as 160 billion light years across, but it's not as small as 13.7 billion light years. It certainly is big though, and it's without a doubt astonishing.

And lastly, don't worry too much about the different ways to measure distance. The chemist is sufficiently close that you won't notice the difference in day to day life!

Westminster Skeptics in the Pub

I will talk about space or something soon and let this blog live up to its title a bit, but right now I'm going to talk about last night. And briefly the night before, and this morning while I'm at it.

Skeptics in the Pub (SITP) is an event where a bunch of likeminded people get together in the pub to listen to a speaker on, obviously, skepticism. It's grown and spread to multiple cities in the UK (newest to come along is Cambridge, so new it hasn't yet happened) and is now international too. Keep an eye out for one happening near you (or do what my friend Andy did and hold one yourself).

Last night was the inaugural Westminster SITP, begun in order to focus more specifically on the media, policy and the law, and it was a cracking event. Such matters are not those that most immediately interest me (except media perhaps) but there's no doubt that they're amongst the most important.

There are current events like the ongoing efforts to reform libel law (and if you haven't signed that appeal yet, do that now) and Simon Singh's libel case (congratulations to him for this morning's success - he's been granted leave to appeal). There are past efforts to silence speech on important issues of health (for example the Rath/Goldacre libel case). And a couple of nights ago we saw (on an issue beyond what I would normally consider the domain of the skeptic) the remarkable story of Trafigura's injunction against the Guardian. All of these highlight the timeliness of Westminster SITP. Newsnight clearly agreed, reporting on it as part of their report on Trafigura and the associated legal issues - catch it on the iPlayer if you can (available for about a week only, 37 minutes or so in). As has been said by others many times, it's hard to judge the impact of bad laws on the stifling of important debate - not only do people get sued, but people have to edit articles or even choose not to risk saying things in the first place out of fear of legal action.

Nick Cohen told us we have to keep banging on about this stuff, even if we end up boring people silly. Only by grinding away, campaigning against laws that stifle debate, can we actually end up changing the law.

The Houses of Parliament were an inspiring sight as I walked back to the station past them last night, but the crowded little upstairs bar I'd spent the evening in was full of people far more inspiring than the most impressive and historical piece of architecture.

Keep an eye on events like these - I think they're going to prove very important indeed - for everyone.

Ethics and Science

Few things limit science, but those things that do limit it are significant.
First off, science can only ask questions about the observable. It can answer any question that is formulated sufficiently clearly in terms of the observable, but it can't answer things outside that. It can tell you if there's a unicorn at the bottom of your garden, but only if it's a tangible unicorn with a physical presence.
Secondly, science can only answer those questions we're technologically able to. If something is in principle able to be answered but we're not technologically able, we're out of luck. We can't answer some questions about string theory perhaps, because even the LHC doesn't reach high enough energies.
Thirdly, someone has to pay for it. If we don't have the money or are unwilling to invest the money, research doesn't get done.
Lastly, science is constrained by ethics. We might get interesting research out of conducting horrifying tests on other people, but we don't do that research because the ethics and morality of it are just plainly wrong.

This is something that Ricky Gervais, as reviewed here recently, touched upon. Science can tell you if you can build an atom bomb, and if you have the money and the technology you can build an atom bomb, but only ethics can tell you if you should drop an atom bomb (usual answer: no).

I'd like to talk about a controversial crossover between science and ethics. I'm not, however, going to discuss the ethics. The area in question is animal rights.

A lot of medical research especially is conducted on animals. There are two questions that arise from this: is medical research using animals effective, and is medical research on animals ethical? These questions are often conflated in ways both confusing and downright incorrect.

The latter is a question I will not touch upon, except to say that I think if you feel a certain kind of medical research is unethical then that's the end of the story - you should not support it and you should campaign against it (ethically - you should be writing to your MP, holding peaceful protests but not sending letter bombs - for most people this is common sense).

So, what about whether medical research on animals is effective?

First off, I want to make it clear that this is not a question for government. I would be bemused, to say the least, if a government came into power and, instead of saying "Here's £x million for cosmology research", said "Here's £x million for cosmology research, but don't spend it on looking at galaxy clusters, just look at supernovae and weak lensing and the CMB. Galaxy cluster methods are not a good research technique".

I don't think galaxy cluster methods are a bad research technique, I'm just picking them out at random. My point is that it's crazy to have a government legislate on the method of research. Legislating on funding in general areas is one thing, but there's a reason we have research councils that operate by peer review to allocate funding - it's just daft to have laws passed on how to do science at that level. The fact that your peers say "Professor Blogs can't get good value for money using this research technique" is a perfectly adequate control. It's quite appropriate that government places ethical and other controls on science but the appropriate way to pick between good and bad research approaches is at a more finely grained level than the legislature.

I really think it's important that, when someone comes along saying we should have a law against animal testing in medicine because animal testing doesn't work, we ignore them. The peer review process works just fine for throwing out bad research techniques. Generally speaking, the inexpert public need not get involved unless there's clear reason to think this method is failing.

If, however, we think animal testing is unethical, we shouldn't do it. End of story.

Why am I launching into this? The Safer Medicines campaign is why. This group was formerly Europeans for Medical Progress. They say they are an "independent patient safety organisation of doctors and scientists whose concern is whether animal testing, today, is more harmful than helpful to public health and safety". If they're doctors and scientists and they're supporting what this group does, they need to take a look at how they present their arguments.

Look at this from their front page - this is the first three of five bullet points:
  • Six young men at Northwick Park hospital were nearly killed by a drug which they were given because it had been 'proved safe' in monkeys
  • Arthritis drug Vioxx - the greatest drug catastrophe in history - killed up to 140,000 people after being 'proved safe' in animals, including monkeys
  • 92% of new drugs successful in animal studies go on to fail in clinical trials, as at Northwick Park - sometimes injuring or killing volunteers and patients
Do you see a problem? This ties in somewhat to my earlier epically long blog post on Bayes. Six young men at Northwick Park hospital were nearly killed by a drug. Are we told how many took the drug? No. So how do we know how safe the drug is? If six took it, it's bloody scary. If ten million took it (unrealistic at one hospital, but for the sake of argument), it really isn't so scary after all.
140,000 people killed by Vioxx after being proven safe in animals. Do we know how many other people took Vioxx? No. Do we know anything about the animal studies? No.
92% of new drugs successful in animal studies go on to fail in clinical trials. So what? We don't know anything about how many drugs fail animal trials that would fail clinical trials, or how many drugs pass animal trials that pass clinical trials, or how many fail animal trials that might pass clinical trials. We have one number whose sole apparent purpose is to make you think animal studies are not good scientific evidence while providing absolutely none of the requisite other information to tell you if this is the case.
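
To show why one number on its own tells you nothing, here is a sketch with two completely made-up sets of drug counts. Both produce exactly the same "92% of drugs that pass animal studies fail in clinical trials" headline, yet in one the animal studies are doing a lot of useful work and in the other they are doing none. The counts are hypothetical, and in real life the "failed animal studies but would have passed in humans" box is precisely the one we rarely get to fill in.

```python
def summarise(name, pass_pass, pass_fail, fail_pass, fail_fail):
    """Counts of drugs, named as (animal result)_(clinical result).

    Returns (headline failure rate, clinical success rate after an animal
    pass, clinical success rate after an animal fail)."""
    headline = pass_fail / (pass_pass + pass_fail)
    success_after_pass = pass_pass / (pass_pass + pass_fail)
    success_after_fail = fail_pass / (fail_pass + fail_fail)
    print(f"{name}: headline {headline:.0%}, "
          f"success after animal pass {success_after_pass:.1%}, "
          f"success after animal fail {success_after_fail:.1%}")
    return headline, success_after_pass, success_after_fail

# Hypothetical world A: drugs that fail in animals would never have worked in people.
useful = summarise("animal studies informative", 8, 92, 0, 900)

# Hypothetical world B: drugs that fail in animals work in people just as often.
useless = summarise("animal studies uninformative", 8, 92, 72, 828)
```

Same headline figure, opposite conclusions, which is why the other three boxes of the table matter.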


Here's another example from the same group back in 2005, writing in New Scientist. Note the same kind of errors and the statement:
"I am the director of Europeans for Medical Progress (www.curedisease.net), an independent organisation of scientists whose concern is patient safety, not animal welfare."
I suppose I have to take them at their word. They say they're not interested in animal welfare, and they're an organisation of scientists. One that, I believe, keeps failing to present a convincing scientific case.

Let me be clear about how I feel about animal testing - there are two independent questions:
  • does it work?
  • is it ethical?
They're independent. They're also not questions with yes or no answers - it depends on the precise nature of testing.

But believe me on this - if you ever see someone claiming to be only talking science but who overgeneralises on the divide between animal testing and alternative methods - take a second look at what they're saying. I'd put money on it being bad science.

It's important to have the arguments over animal testing clearly laid out. It's a deeply ethical issue, and if you can make the ethical case you have it totally won, and I would far sooner see the ethics debated up front as the important issue it is, and the efficacy debated clearly and not misrepresented.

I'd like to talk about the Million Dollar Challenge, the prize offered by the James Randi Educational Foundation for demonstrating supernatural abilities under controlled conditions, and use this as an excuse to discuss statistical means of measuring belief (by which I mean how you should translate evidence to belief, not the unjustified belief which would be called faith). I'll show a mathematical foundation for certain skeptical maxims as we go forward.

The Million Dollar Challenge (MDC) requires, as I understand it, only a few things from an applicant. A statement of what you can do, a statement of under what conditions you can do it, and a statement of how well you can do it. From there, a decision can be made as to whether you are actually claiming something suitably outside the realms of the accepted to be eligible for the prize, and between the applicant and the JREF a scheme for a controlled test can be devised (I may be slightly out in minor details there, but I think that covers the important points for this blog post).

What interests me here are the statement of how well you can do something, and the decision on how well you actually have to perform in the test to win the prize, and what impact the actual winning of a prize should have on our skeptical worldviews.

Let's take a step back, and consider some probability theory. This is going to be a brief derivation and discussion of Bayes' Theorem. We start from a result of conditional probabilities -
$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$
and (just by switching B and A)
$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)}.$$
This says simply that the probability of one thing (A) happening given that another thing (B) has, or will, happen (which is the P(A|B) term) is the probability that both happen (the P(A and B) term) divided by the probability that the other thing B will happen regardless (P(B)). If we rearrange the second and insert it into the first to get rid of the P(A and B), we get Bayes' Theorem:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.$$
This is a tremendously useful result, and you can easily find tutorials such as this one that explain how it works in settings of conditional probability such as medical tests for a disease (I recommend taking the time out to read that, at this point). In that case, you might have a medical test which returns some result, and you want to know the probability that you have the disease given that your test came back positive. You can use Bayes' Theorem to work that out from overall probabilities that you have the disease, that the test comes back positive regardless of whether you have the disease, and the probability that the test comes back positive given that you do actually have the disease. That's a bit of a hefty bunch of probabilities and 'given's flying around, so, as I said above, I recommend again the previously linked tutorial.
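
As a small worked version of the medical-test case, here is a sketch in Python using the numbers from the New Scientist quiz question quoted in the Bayes in the Courtroom post (1 in 10,000 people have the disease, the test is "99 per cent accurate"). It assumes that "99 per cent accurate" means the test is right 99% of the time both for people with the disease and for people without it; the function and parameter names are just illustrative.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' Theorem.

    prevalence  = P(disease)
    sensitivity = P(positive | disease)
    specificity = P(negative | no disease)
    """
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    # P(positive) is the sum over both ways of getting a positive result.
    return true_positive / (true_positive + false_positive)

# Assumption: "99 per cent accurate" applies equally to both kinds of error.
p = posterior_given_positive(prevalence=1 / 10_000,
                             sensitivity=0.99,
                             specificity=0.99)
print(f"P(disease | positive) = {p:.4f}, i.e. about 1 in {1 / p:.0f}")
```

Under that assumption the answer comes out at about 1 in 102, which is why the positive test alone is such flimsy evidence.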

This sort of thing is often linked to when someone comes along asking what all the fuss is about Bayes' Theorem. That's fine, it explains simple cases when Bayes' Theorem can be applied, but it doesn't at all explain why something so simple should cause a fuss.

To explain that, let's move on to a discussion about hypothesis testing. You may have studied hypothesis testing at school, if you happened to do some relatively advanced courses in mathematics. The standard procedure runs broadly along these lines - you set up two hypotheses. Let's say they are like this:
The Null Hypothesis: A dowser (who we will call Fred) cannot determine which of two boxes a bowl of water is in better than chance.
The Alternative Hypothesis: The dowser can determine which of two boxes a bowl of water is in.
You then go and collect some evidence by performing suitably controlled trials on Fred, you find he succeeded in x trials out of a total of N and you calculate this quantity:
The probability that the dowser could succeed x or more times out of N purely by chance.
If this quantity is too small (commonly 5% or 1% or some similarly small number) you reject the Null Hypothesis and accept the Alternative Hypothesis.
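
For a sense of the numbers, the quantity above can be computed directly from the binomial distribution. This is a minimal Python sketch; the example of 15 successes in 20 trials is made up for illustration, and `p_value_at_least` is a hypothetical helper name.

```python
from math import comb

def p_value_at_least(x, n, chance=0.5):
    """Probability of x or more successes in n trials if each trial
    succeeds independently with probability `chance`."""
    return sum(comb(n, k) * chance**k * (1 - chance)**(n - k)
               for k in range(x, n + 1))

# Hypothetical result: Fred picks the right box 15 times out of 20.
p = p_value_at_least(15, 20)
print(f"P(15 or more out of 20 purely by chance) = {p:.4f}")
```

With these made-up figures the probability is about 2%, so the Null Hypothesis would be rejected at the 5% level but not at the 1% level.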

The problem with this is slightly subtle, but crucial. It's asking the wrong question. It's asking "If the dowser has no ability, what is the probability that he would do at least as well as he did?"

If you think about that for a moment, you'll see that's really actually a bit of a dull question and has rather an uninteresting answer. What we actually want to know is "Does he have a supernatural ability to find bowls of water that are obscured from his conventional senses?"

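In other words, we want to know
$$P(\text{no dowsing ability} \mid \text{success})$$
but we've calculated from our hypothesis testing
$$P(\text{success} \mid \text{no dowsing ability})$$
(note that I'm phrasing it as 'no dowsing ability' in both cases to make things simpler, but P(no dowsing ability | success) = 1 - P(dowsing ability | success), as he can either dowse or he can't, so translating between the two isn't too difficult).

If we look at Bayes' Theorem, we can see how we change from one to the other:
$$P(\text{no dowsing ability} \mid \text{success}) = \frac{P(\text{success} \mid \text{no dowsing ability})\,P(\text{no dowsing ability})}{P(\text{success})}.$$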

More generally when applying Bayes' Theorem to this sort of problem, we talk about this:
$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$
- the probability of a hypothesis H being true given some evidence E (P(H|E)) is equal to the probability of getting the evidence given that the hypothesis is true (P(E|H)) times the probability the hypothesis is true (P(H)) divided by the probability of getting the evidence regardless (P(E)).

This is great - we have a way of going from the answer to the wrong question ("could he succeed if he weren't a dowser?") to the answer to the right question ("is it likely he has a paranormal ability?").

However, there are a couple of catches when you do this, and it's these catches that are the source of the fuss surrounding Bayes' Theorem, but also the source of some interesting points about tests like the MDC.

Firstly, in our original example from the tutorial, with tests that sometimes work and sometimes don't, and people who may or may not have a disease, it's quite clear what the probability means. It fits naturally in with everyone's quite uncontroversial ideas of the actual meaning of the word 'probability' - you randomly select something from a population and the probability tells you the proportion of times you do this that you get a certain result.

However, in our case, we've now gone to a different thing - we've got a probability for "Fred has a paranormal ability". This wasn't drawn from a population about which we can say those kinds of things - Fred either definitely is or definitely isn't paranormal. It's not immediately clear that we can use probabilities when we talk about this kind of statement. This comes up in my own field of cosmology where you might actually be asking questions about the entire universe, and then it's really not clear that there are multiple universes with population distributions from which we can draw results (now that's an understatement).

It's actually possible, however, to demonstrate that we can use probability as an expression of our degree of belief in something, in such a way that the mathematics is completely consistent with the mathematics of ordinary probabilities that deal with the frequency of events, and that it operates in exactly the same way when we do deal with those situations. This, fundamentally, is the difference between two schools of thought - the frequentist and the Bayesian. Frequentists think probabilities only work when you deal with frequencies of events drawn from a population. Bayesians hold that probabilities can be used considerably more generally. Note that frequentists do not claim that Bayes' Theorem is wrong - it clearly works just fine in our medical testing example - but that Bayesians (such as myself) misapply it. This is a philosophical, very interesting, and often heated discussion, but for the purposes of this I'm just going to assume that being a Bayesian is right, and we can use probabilities to express our degree of belief in something.

Once we accept that, we can go on to our second big problem - what about those two other terms in Bayes' Theorem - P(H) and P(E)?

Let's start with P(E) first, as while at first sight it is hard to calculate a probability of getting some evidence regardless of the actual fact of the situation, it turns out to be easy to do away with. To do this, we note that either H is true or it isn't.
$$P(H \mid E) + P(\text{not } H \mid E) = 1.$$
If we expand those two out using Bayes' Theorem we get to
$$P(E) = P(E \mid H)\,P(H) + P(E \mid \text{not } H)\,P(\text{not } H).$$

This is a very vague statement. It might mean that Fred's dowsing ability is so weak that he spots bowls of water in boxes with a 50.1% rate of success, or a 70% rate of success or even 100%. It's a complete and continuous range of hypotheses which cover the full range of Fred's ability to exceed chance at his task. Fortunately, the framework we've been building up allows us to deal with this. We can basically use calculus to deal with this complete range of possibilities and compare them to a hypothesis like "Fred has a 50% chance of success": we can calculate overall probabilities for the two and look at the ratio, in a process called Bayesian model comparison. Broadly speaking, we take the ratio of the two probabilities, one for H (in all its possibilities) and one for our other idea that Fred is deluded and he can't do better than chance.

Two very interesting things emerge from this, two principles which are very well-known to the skeptic. Firstly, because Fred might be succeeding with almost any rate of success we have to work out the odds for all of the possible rates of success and kind of average over them. This weakens the relative strength of this hypothesis - essentially because it has a free parameter - just how good Fred is. This is Occam's Razor - the theory is penalised because it is more complex. In other situations where the hypothesis might have other complicating factors it would be penalised even more. In contrast, the idea that Fred succeeds at a rate of 50% is a strong hypothesis - it's easily falsified as with sufficient evidence pointing at, say, 51% success we'd have to throw it out, but our original idea for Fred being paranormally endowed would cover this possibility. It's a simple and completely natural consequence of Bayesian ideas. It's a bit more rigorous and mathematically framed than many examples of Occam's Razor (how many free parameters does an invisible unicorn introduce?) but in this circumstance it's a powerful version of it that is fundamentally set out in a way that allows us to quantify how simple we should keep a theory in the face of evidence for complexity.
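
The averaging described here can be sketched numerically. The Python below is only an illustration under stated assumptions: the result of 15 successes in 20 trials is made up, the vague hypothesis is given a flat prior over success rates from 50% to 100% (one choice among many), the sharp claim of exactly 75% is hypothetical, and the integral is done with a simple midpoint rule rather than anything clever.

```python
from math import comb

def likelihood(x, n, p):
    """Probability of exactly x successes in n trials at success rate p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def marginal_likelihood(x, n, lo, hi, steps=10_000):
    """Likelihood averaged over a flat prior on the success rate in [lo, hi],
    approximated with the midpoint rule."""
    width = (hi - lo) / steps
    total = sum(likelihood(x, n, lo + (i + 0.5) * width) for i in range(steps))
    return total / steps

x, n = 15, 20                      # hypothetical test result
chance = likelihood(x, n, 0.5)     # "Fred can't do better than chance"

# Ratio of probabilities of the evidence: each hypothesis against chance.
bf_sharp = likelihood(x, n, 0.75) / chance                 # "I succeed 75% of the time"
bf_vague = marginal_likelihood(x, n, 0.5, 1.0) / chance    # "I beat chance, somehow"

print(f"sharp claim vs chance: {bf_sharp:.1f}")
print(f"vague claim vs chance: {bf_vague:.1f}")
```

With these made-up numbers both hypotheses beat chance, but the vague one gets noticeably less support from the same data than the sharp one that happened to be right, which is the penalty for the free parameter.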

Secondly, let's look at P(H) - our prior probability that a hypothesis is true. This expresses our prior belief in an idea. If an idea is an extraordinary claim then P(H) is naturally a very small number, and to succeed against our more conventional idea it needs to have a P(E|H) that is really really big compared with the probability of the same evidence under the conventional idea - it needs extraordinary evidence. Hence a simple and completely natural route to the idea that extraordinary claims require extraordinary evidence.
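
A short Python sketch of how the prior feeds through. It uses the odds form of Bayes' Theorem (posterior odds = prior odds times the ratio P(E|H) / P(E|not H)); the priors and the evidence ratio of 1000 below are invented for illustration, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, evidence_ratio):
    """P(H | E) from the prior P(H) and the ratio P(E|H) / P(E|not H)."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * evidence_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical: the evidence is 1000 times more probable if H is true.
evidence_ratio = 1000

for prior in (0.5, 0.01, 1e-9):
    print(f"prior {prior:g} -> posterior {posterior(prior, evidence_ratio):.3g}")
```

The same evidence that all but settles an even-odds question barely dents a one-in-a-billion prior, so the more extraordinary the claim, the larger the evidence ratio has to be.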

Now let's go back to the MDC and what all this means for it.

Suppose Fred comes along and applies for the MDC. He fills in the application form and he reaches the question "What is your success rate?". He now has a number of options available to him. He might think from his previous experience that he succeeds 90% of the time, or maybe he doesn't really have a good idea of what he thinks. What we should be doing is encouraging him to make as strong a statement on this front as he can. Why? Because it reduces the amount of evidence he needs to produce to demonstrate his claim is true. It makes for a stronger statement. However, we should be clear that if he isn't quite sure he should suggest a range of success rates and we can marginalise over these - he'll need to produce more evidence to demonstrate his claim as a result, but he's more likely (in his opinion) to have covered the actual level of his ability. Note that this means that if he says he succeeds 90% of the time, but he succeeds at 70% in the test, he doesn't win. He made a claim and it wasn't true, even if it turns out that he apparently defied the laws of nature at the time.
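
The cost of overclaiming can be seen in a couple of lines of Python. This is a sketch with made-up numbers: Fred is assumed to score 14 out of 20 (70%), and each claimed success rate is compared with pure chance by the ratio of how probable that score is under each.

```python
from math import comb

def likelihood(x, n, p):
    """Probability of exactly x successes in n trials at success rate p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

x, n = 14, 20  # hypothetical test result: 70% success

# How much better than chance does each claim explain the result?
ratio_claim_90 = likelihood(x, n, 0.9) / likelihood(x, n, 0.5)
ratio_claim_70 = likelihood(x, n, 0.7) / likelihood(x, n, 0.5)

print(f"claimed 90%: {ratio_claim_90:.2f} times as probable as under chance")
print(f"claimed 70%: {ratio_claim_70:.2f} times as probable as under chance")
```

In this illustration the 90% claim explains the result worse than chance does (a ratio below 1), while a 70% claim would have been supported by exactly the same performance.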

One might compare this in a more extreme example to someone claiming to be able to dowse, and promptly proceeding to undergo the test while flying over the boxes Superman-style. He's demonstrated a paranormal ability but outside the remit of the test. Of course if he did this, he'd be in a much better situation to reapply having adjusted his claim to his actual dowsing success rate of 70%, or having submitted a claim not to be able to dowse but to be able to fly through the air faster than a speeding bullet. Similarly but less extremely, suppose he consistently performed worse than chance? In that case it's certainly more likely that he was simply unlucky than that he has some negative dowsing ability (which would be just as against our expectations as an effective dowsing ability) so we shouldn't be prepared to hand over a million dollars for that, even if it's contrary to our expectations from chance.

On top of this question, we need to decide what level of evidence should be expected from him, and whether this level of evidence should be the same to win a million dollars as it is to actually convince the unbeliever that he really can do something amazing.

I would argue that these should differ. For one thing, it would give a terribly negative appearance to the skeptical community if we announced that someone had to provide absolutely astonishing levels of evidence to win the million dollars. They could, probably rightly, claim that we've set the bar unrealistically high. It becomes especially problematic if a frequentist comes along and explains the nature of more conventional hypothesis testing, and then they might claim that by being Bayesian we're making their life harder. And we would be.

So we should set the bar for winning the million dollars to be pretty low (in effect adopting a fairly generous P(H) for the purposes of the test), and we might even take the approach of throwing this Bayesian approach out altogether - even though we may lose the benefits of encouraging the claimant to make a strong claim from the start.

However, we (perhaps individually) should consider what P(H) we should set in advance - what evidence we demand to actually change our mind. I would argue that this should be much much beyond that needed to win a million dollars. We might consider it much more likely that something else happened - Fred got exceptionally lucky, or Fred managed to outwit the Amazing Randi and his colleagues (practically impossible, but arguably far more likely than really being able to dowse). From this point of view winning the MDC is not something that should convince you that paranormal abilities exist. It's a strong indication that scientists need to jump on the case and find out exactly what's going on, but it is very much justifiable to be far harder to persuade than the non-skeptical community might like.

The MDC clearly has great value beyond simply assessing claims - it's about highlighting the lack of evidence, and highlighting the importance of scientific testing, and highlighting the unwillingness of many people claiming unusual abilities to subject themselves to it, and I think it makes a stronger point when we are less demanding.

But for more significant decisions we should be more willing to throw out weak claims or weak evidence. We shouldn't need to argue about a tiny but statistically significant effect above chance for a homeopathy study because it's tiny, too small to account for any effect a homeopath might claim to see in their clinic and so small as to be medically worthless. And we should be pushing people to make strong claims from the outset. If a strong claim is true, its strength makes it easier to find the evidence, and if a strong claim is false, its strength makes it more easily falsified. It's ultimately of benefit to both sides.

Science - Ricky Gervais

"I said this would be about Science. I lied."

Science, the latest stand-up show from Ricky Gervais, really doesn't have much science in it. Since this blog is of a science-y bent, this means my review will be relatively short.

Ricky Gervais is of course a world-famous comedian. His shows are award winning and successful in an international context. It's very unlikely that you don't already know if Ricky Gervais is your kind of comedy. This means my review will be relatively short.

Also, Ricky Gervais is a world-famous comedian and his shows are award winning and successful in an international context. It's very unlikely that if you want tickets and don't already have them that you'll be able to get them. This means my review will be relatively short.

It's certainly true to say that there's little science in Science. There is perhaps a brief run on morality and science and that's pretty much it until the encore, which began with the above listed quotation. But from the point of view of most readers of this blog, the interesting thing follows immediately on from that - it's not so much about science as rationality and non-rationality, and there's plenty that he launches into from there, which is certainly welcome and is as funny as you might expect it to be from Ricky.

Those readers who made it to Robin Ince's Nine Lessons and Carols for Godless People last Christmas and caught Ricky's material then will certainly recognise much of the material in this latest tour, so you need not feel guilty about missing out too much.

Ricky's comedy is quite focussed on mildly unpleasant characters with whom you might sympathise, and this continues in this tour. It's hilarious, but frequently uncomfortable, and ultimately thoroughly brilliant. It's not all based around such routines however - the destruction of the story of Noah's Ark being a notable example. Noah's Ark is a ridiculously easy target for comedy, being plausible in almost exactly zero respects. However, the take based upon an aged Sunday School text is highly enjoyable, even when we already know the implausibility of the story and the sheer malevolence of God around that part of the Bible.

It's first rate comedy, and has a rationalist tinge that will appeal to many of us, but it's not one to see hoping to find pure science or rationalist comedy - for that you'll have to go back to Ince's events I'm afraid.

Highly recommended, but if you haven't got your ticket you'll probably need to wait for the DVD.
There are, unfortunately, times in life when one is faced with evidence that is of a pretty shoddy nature. Evidence that superficially appears to support one claim, but is of such a low standard that it is of extremely limited value in assessing whether that claim is true.

I'd like to talk about perhaps the most extreme examples of that, and show that it's remarkably easy to come up with evidence for really nutty claims as a demonstration of just how bad evidence can get and still count technically as evidence.

In the case of the Raven Paradox, or Hempel's Ravens (don't click there just yet - it'll defeat the point of the explanation I'm about to launch into), one considers the claim:
'All crows are black'
This is a fairly strong claim, but not a terribly remarkable one. In order to determine if this is the case, consider this Venn diagram of all objects in the universe:
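[Figure crowvenn.png: Venn diagram of all objects in the universe, with a dark left-hand circle of black things overlapping a pale blue circle of crows]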
If we wish to determine if all crows are black, we can do one of two things. Firstly, we can check all crows to see if they are black - checking everything in the pale blue circle to make sure that the pale blue non-overlapping region of the diagram is empty. Secondly, we can check all not-black things and make sure they are not crows - this examines every object not in the dark left hand circle and makes sure that the pale blue non-overlapping region is empty that way.

The first method generally has advantages over the second in this case, as there are many fewer crows than things that are not-black, and we might expect to more rapidly find the falsifying case of the not-black crow that way than by looking through all the not-black things in existence. However, the two methods do check the same logically equivalent statements:
'All crows are black' and 'All not-black things are not-crows'

In many cases, when we set out to test a claim, we set out with the hope to perform a similar procedure. Usually we don't deal with things as 100% one way or the other as 'All crows are black', we might deal with something more like '95% of crows are black'. In that case we can either check all crows to see if 95% of them are black, or check all not-black things and see if the total population of things within that set that are crows is larger than 5% of the population of all crows. Or more realistically, we go and sample the populations in such a way that we can be reasonably confident that we're doing a good test of things, and use statistics to put a measure of how confident we are on that.

There happens to be a peculiar problem when you consider what happens as you're working through the process of checking and classifying objects to determine the truth of a statement. Consider if I'm checking our original statement 'All crows are black'. I start by picking up a crow and inspecting it, and saying "It's black, therefore I can be more confident in the statement that 'All crows are black'". In other words, a black crow is a confirming instance of the hypothesis 'All crows are black'.

However, what if I'd been looking at it from the other way of doing things? Suppose I find a purple cow. Well, a purple cow is a not-black thing, and it's a not-crow. That lends credence to the idea that 'All crows are black' because if I keep doing that and checking not-black things I can find out if the hypothesis is true. A purple cow is also a confirming instance of 'All crows are black'.

But wait - a purple cow by exactly the same logic is also a confirming instance of 'All crows are white'. The same evidence is lending support to a completely contradictory conclusion. How can this be?

Well, essentially, I take the perspective that the purple cow is lending a tiny but positive amount of credence to both statements, mainly because the population of not-black not-crows is so overwhelmingly large and shares a substantial overlap with not-white not-crows. The purple cow really is actual evidence for a huge array of statements about the universe. But no one would ever say in day to day speech that it's evidence. It's evidence in homeopathic quantities. You'd be nuts to use it.

So, what's my point? My point is that it's possible to obtain evidence for something that, while technically evidence, is of such overwhelmingly poor quality that it does not constitute evidence in any meaningful way at all. And it doesn't have to be quite as shockingly obviously bad as Hempel's construction in the field of ornithology, and it's perfectly reasonable to assess a real positive amount of evidence in favour of something and declare it effectively zero.

By way of an introduction

Welcome to my brand-spanking new blog! In fact, this blog is so new that only one other person knows the address at the time of writing, and I haven't finished the layout or given it a name yet. However, after battling with various blogging systems and technical issues, I need some time out from the hacking and felt that writing the first post would make for a pleasant change.
So, for a first post, I thought I should tell people that have stumbled here blindly who I am, what interests me and what this blog will be about. I'll also tell you a bit about why I started a blog, why I started it this month, and what you might expect to find here in coming weeks.
I'm Edd, and I tend not to bother being terribly anonymous on the internet these days. A Google of my name will no doubt tell you that I work in cosmology, although having trained as an astronomer I am increasingly working on the computing side of things. Science is a passion, and so is telling everyone why it should be their passion too.
I'm also a skeptic. And that is why I have started blogging now. I'm a supporter of the James Randi Educational Foundation (JREF) amongst other related groups, and amongst the things the JREF does is the organisation of the Amazing Meetings (TAMs), about which I shall blog in a future planned post.
JREF's new president, Phil Plait (of Bad Astronomy) has said how TAMs reinvigorate - this was true for me at the first TAM I attended (TAM 6), but it is even more so after the most recent one - TAM London, just this weekend. Not only did I have certain very kind people tell me I would be an excellent blogger, but right now, efforts to support science are needed more than ever in the UK and not just in the area I linked there, and I'd hope maybe one or two people might find out about these important issues as a result of this. I've managed to build up a short list of things I do wish to write about in much more detail than the 140 characters that Twitter provides - some related to bad science, many related to good science and maybe the odd more light-hearted post as well. Also, if I ever have another more practical idea then I may choose to blog that too, with an already established place to put it.
I will write more on TAMs and the marvellous people I have met through this area, alongside other topics of science, astronomy and frankly anything that takes my fancy, but for now, thanks for reading and I hope you'll be back for my next blog post.

Flattery


"neat blog"
"Don't ever stop"
  - Rita

"you are totally ill informed"
"u r probably ignorant on most things"
"yr blog cant be worth reading"
  - @angelneptustar

"The really nice thing about your blog is it always stretches my brain and makes me feel I am still learning & on the up :)"
  - Alice

"More frightfully interesting stuff... Any self-respecting geek should be reading this."
  - Lenny

"I'm not always sure I totally get what you're on about, but I like reading your blogs anyway - mainly because I know it's stuff that needs to be said."
  - Hanny

About this Archive

This page is an archive of entries from October 2009 listed from newest to oldest.

About the author

Edd works somewhere between astronomy and computing and has a general interest in science, skepticism and other related topics.

Opinions expressed in this blog are my own and not those of my employer.