

<?xml version="1.0"?>

<!DOCTYPE TEI.2 SYSTEM "base.dtd">




<title>Sources of Variation I</title></titleStmt>

<publicationStmt><distributor>BASE and Oxford Text Archive</distributor>


<availability><p>The British Academic Spoken English (BASE) corpus was developed at the

Universities of Warwick and Reading, under the directorship of Hilary Nesi

(Centre for English Language Teacher Education, Warwick) and Paul Thompson

(Department of Applied Linguistics, Reading), with funding from BALEAP,

EURALEX, the British Academy and the Arts and Humanities Research Board. The

original recordings are held at the Universities of Warwick and Reading, and

at the Oxford Text Archive and may be consulted by bona fide researchers

upon written application to any of the holding bodies.

The BASE corpus is freely available to researchers who agree to the

following conditions:</p>

<p>1. The recordings and transcriptions should not be modified in any way</p>


<p>2. The recordings and transcriptions should be used for research purposes

only; they should not be reproduced in teaching materials</p>

<p>3. The recordings and transcriptions should not be reproduced in full for

a wider audience/readership, although researchers are free to quote short

passages of text (up to 200 running words from any given speech event)</p>

<p>4. The corpus developers should be informed of all presentations or

publications arising from analysis of the corpus</p><p>

Researchers should acknowledge their use of the corpus using the following

form of words:

The recordings and transcriptions used in this study come from the British

Academic Spoken English (BASE) corpus, which was developed at the

Universities of Warwick and Reading under the directorship of Hilary Nesi

(Warwick) and Paul Thompson (Reading). Corpus development was assisted by

funding from the Universities of Warwick and Reading, BALEAP, EURALEX, the

British Academy and the Arts and Humanities Research Board. </p></availability>




<recording dur="00:36:34" n="5384">


<respStmt><name>BASE team</name>



<langUsage><language id="en">English</language>



<person id="nf0273" role="main speaker" n="n" sex="f"><p>nf0273, main speaker, non-student, female</p></person>

<personGrp id="ss" role="audience" size="l"><p>ss, audience, large group </p></personGrp>

<personGrp id="sl" role="all" size="l"><p>sl, all, large group</p></personGrp>

<personGrp role="speakers" size="3"><p>number of speakers: 3</p></personGrp>





<item n="speechevent">Lecture</item>

<item n="acaddept">Statistics</item>

<item n="acaddiv">ls</item>

<item n="partlevel">UG/PG</item>

<item n="module">Health and Disease in Populations</item>




<u who="nf0273"><kinesic desc="projector is on showing slide" iterated="n"/> okay you might notice a slight difference between <pause dur="0.2"/> # this week and last week that's because i am not <gap reason="name" extent="2 words"/> <pause dur="0.8"/> i'm <gap reason="name" extent="2 words"/> i'm a lecturer in medical statistics and i'm doing today's lecture and the next Sources of Variation lecture <pause dur="1.7"/> # first of all a couple of things i realize you've got a lecture at quarter past one <pause dur="0.5"/> so i will be trying to keep to time <pause dur="0.2"/> # <pause dur="0.7"/> bear with me <pause dur="0.7"/> the other thing to say <pause dur="0.2"/> is that <pause dur="0.2"/> the # guest lecture from <gap reason="name" extent="2 words"/> <pause dur="0.2"/> is being swapped with Sources of Variation Three <pause dur="0.8"/> so <gap reason="name" extent="2 words"/> was on the twenty-seventh and <trunc>sorc</trunc> Sources of Variation was on the twentieth <pause dur="0.4"/> they're now going to be swapped so <gap reason="name" extent="2 words"/> will be on the twentieth and Sources of Variation Three will be on the twenty-seventh <pause dur="2.3"/> okay so <pause dur="0.5"/> sources of variation <pause dur="1.9"/><kinesic desc="changes slide" iterated="n"/> hurray <pause dur="0.4"/> the slide changed over <pause dur="1.0"/> the informal <pause dur="0.7"/> objectives of this lecture are to enable you to distinguish between <pause dur="0.2"/> observed data and underlying tendencies which give rise to observed data <pause dur="0.9"/> and to understand the concept of variation

and randomness <pause dur="1.5"/> # you have some examples in your lecture notes on page a hundred <pause dur="0.6"/> # for example we might observe the proportion of people with diabetes in a sample <pause dur="0.6"/> and that would give us an idea of the underlying <trunc>prev</trunc> prevalence of diabetes in a particular population <pause dur="1.7"/> another example would be breast cancer survival we might observe the proportion surviving who were treated with tamoxifen <pause dur="0.6"/> whereas what we're actually interested in is the effect of survival on treating everybody <pause dur="0.2"/> with tamoxifen <pause dur="0.4"/> if they have breast cancer <pause dur="2.7"/> # <pause dur="0.5"/> so that gives you an idea <pause dur="0.6"/> quickly of the difference between observed data and underlying tendencies <pause dur="0.3"/> which give rise to data <pause dur="0.8"/> objective two of understanding concepts of sources of variation and randomness <pause dur="0.6"/> i would hope that <pause dur="0.2"/> most of us have a <pause dur="0.7"/> fairly good appreciation that we're all different without really thinking about it <pause dur="1.1"/> the reason this kind of thing is important to take into account <pause dur="0.2"/> # <pause dur="0.5"/> is basically when we're planning for # <pause dur="0.9"/> predicting for

the future say for example for providing flu jabs <pause dur="1.2"/> we can observe the number of cases of flu per year in the last five years <pause dur="0.7"/> and we wouldn't be surprised <pause dur="0.2"/> to see that those numbers in the last five years were different year on year <pause dur="0.7"/> wouldn't surprise us at all <pause dur="0.9"/> # we we should all be <pause dur="0.6"/> fairly fairly competent at realizing that <pause dur="0.6"/> the number of cases of flu would depend on various factors in a very complex manner <pause dur="0.5"/> and simply because of the <pause dur="0.2"/> the fact that we're all different anyway <pause dur="0.4"/> there'd be a natural variation <pause dur="0.2"/> component in that <pause dur="5.5"/> so <pause dur="0.9"/><kinesic desc="changes slide" iterated="n"/> the formal objectives of this lecture <pause dur="0.4"/> <vocal desc="sniff" iterated="n"/><pause dur="0.3"/> is first that you should be able to distinguish between observed epidemiological quantities such as incidence prevalence incident rate ratio things like that <pause dur="0.7"/> and their true or underlying values <pause dur="1.5"/> and you ought to be able to discuss how observed epidemiological quantities depart from true values <pause dur="0.2"/> because of random

variation <pause dur="1.5"/> unless we have large resources and can measure absolutely everybody <pause dur="0.2"/> in a particular population we're interested in <pause dur="0.7"/> we'll only ever see <pause dur="0.2"/> an observed proportion of people with diabetes say <pause dur="1.6"/> and that may or may not be equal to the true prevalence of diabetes in our <trunc>sa</trunc> in our population <pause dur="0.8"/> but if we selected our sample properly <pause dur="0.3"/> then that ought to give us a fairly good idea <pause dur="0.2"/> of the basic prevalence of diabetes in the population <pause dur="1.7"/> but that basic idea will vary <pause dur="0.2"/> because of natural variation <pause dur="0.7"/> so consequently we want to be able to say something about <pause dur="0.4"/> how our basic idea of prevalence <pause dur="0.4"/> will vary in reality <pause dur="0.7"/> an idea of the scale of the variation will help us with that <pause dur="4.8"/><kinesic desc="changes slide" iterated="n"/> and statistical theory will help us to do that <pause dur="0.7"/> objective three we want to be able to describe how observed values help us towards a knowledge of the true values <pause dur="0.9"/> and there are two basic statistical ways of doing that certainly in this module at least <pause dur="0.8"/> the first is to test a hypothesis about a

true value <pause dur="0.6"/> and that's what we'll be dealing with in this lecture <pause dur="0.9"/> and the second is to calculate a range <pause dur="0.2"/> in which that true value <pause dur="0.3"/> probably lies <pause dur="3.2"/> so <pause dur="0.7"/> today we'll just be talking about hypothesis tests <pause dur="1.7"/><kinesic desc="changes slide" iterated="n"/> just have a quick drink <pause dur="4.5"/> <event desc="drinks" iterated="n"/> so say we're interested for some reason <pause dur="0.7"/> in the probability of getting a head when we flip a coin <pause dur="1.3"/> so the obvious thing to do do a quick experiment flip the coin ten times <pause dur="0.7"/> see what happens <pause dur="0.8"/> suppose we observe seven heads and three tails <pause dur="1.0"/> then informally we could <trunc>m</trunc> draw several conclusions from that observation given our prior belief <pause dur="0.2"/> about <pause dur="0.3"/> the probability of the coin <pause dur="0.4"/> falling on heads <pause dur="1.8"/> first of all <pause dur="0.4"/> we might suspect that our data was wrong <pause dur="0.3"/> it happens <pause dur="0.7"/> # censuses get miscounted for <pause dur="0.2"/> various reasons <pause dur="1.5"/> # <pause dur="0.2"/> another thing could be that we could have artefact which <pause dur="0.7"/> isn't very easy to illustrate in the example of tossing a coin so i'll give you another one <pause dur="1.0"/> if we look at how # deaths from diabetes change with time i believe

we were discussing that in a <pause dur="0.3"/> couple of lectures ago <pause dur="1.6"/> one of the things that altered <pause dur="0.2"/> the number of deaths <pause dur="0.2"/> <trunc>o</trunc> from diabetes with time <pause dur="0.5"/> was altering the definition of diabetes <pause dur="1.6"/> that had an # an effect on the conclusions that we made about the change in deaths <pause dur="0.6"/> and that is generally known as artefact <pause dur="2.1"/> # another conclusion we might draw is it's just chance <pause dur="0.2"/> the coin's fair we're expecting five heads we've seen seven <pause dur="0.3"/> it's not all that surprising <pause dur="0.6"/> we just put it down to chance and <pause dur="0.4"/> conclude that our coin is fair <pause dur="1.5"/> on the other hand if we're feeling particularly cynical <pause dur="0.5"/> we might conclude that the coin is biased <pause dur="0.5"/> it's difficult to tell seven is that different from five or not <pause dur="1.2"/> we don't know <pause dur="7.1"/><kinesic desc="changes slide" iterated="n"/> so that provides a simple example <pause dur="0.3"/> of what we observe not being exactly what we expected <pause dur="0.2"/> if we toss a coin ten times we expect five heads given that it's fair <pause dur="0.7"/> but we observe seven <pause dur="1.5"/> the coin will tend to produce an equal number of heads and an equal number of tails <pause dur="0.8"/> but <pause dur="0.2"/> we're not surprised when

random variation means that we observe something slightly different <pause dur="1.3"/> and <pause dur="0.4"/> similarly it's no surprising that the health of people varies <pause dur="0.7"/> on average four cases of meningitis per month in Leicester some some months we observe ten other months we observe none <pause dur="0.7"/> nobody's terribly surprised about that <pause dur="1.4"/> again smokers tend to be less healthy than non-smokers but <pause dur="0.4"/> if we pick a small sample <pause dur="0.5"/> then we might for some reason have ended up picking healthy smokers <pause dur="0.4"/> just down to chance <pause dur="6.4"/><kinesic desc="changes slide" iterated="n"/> so tendency versus observations <pause dur="1.5"/> what we practically want to know <pause dur="0.6"/> <vocal desc="sniff" iterated="n"/><pause dur="0.8"/> is <pause dur="0.3"/> what is going to happen in the future <pause dur="0.4"/> what are the underlying tendencies of health <pause dur="0.3"/> in our population <pause dur="1.5"/> for example # providing for our flu jabs we want to plan <pause dur="0.7"/> to buy enough flu jabs <pause dur="0.2"/> to vaccinate at least most people at risk in our population <pause dur="1.1"/> we need the underlying tendency of that population <pause dur="0.2"/> to <pause dur="0.2"/> being at risk at flu <pause dur="0.8"/> of flu <pause dur="0.3"/> from flu <pause dur="0.2"/> even <pause dur="1.1"/> so we might take the number of <pause dur="0.2"/> # cases of

flu <pause dur="0.2"/> in the previous years <pause dur="1.0"/> and logically we might also use any other information that we know to have a bearing on the number of cases of flu <pause dur="0.2"/> that we observe so <pause dur="0.4"/> temperature would be an obvious one <pause dur="0.9"/> # <pause dur="0.3"/> the underlying health of the general population <pause dur="0.7"/> but that's slightly more difficult to to quantify <pause dur="0.9"/> so we would take what we've observed in the past <pause dur="0.5"/> and what we know to have a bearing <pause dur="0.2"/> on our probability of someone having flu <pause dur="0.6"/> and try and use it to predict the future <pause dur="4.3"/><kinesic desc="changes slide" iterated="n"/> so <pause dur="1.0"/> some further examples of <pause dur="0.6"/> attempts <pause dur="0.3"/> # of of the differences between <pause dur="0.4"/> the underlying tendency being related to the observed data <pause dur="0.9"/> if we're <sic corr="interested">interesteded</sic> for some bizarre artificial reason of the proportion of red marbles in a bag with a thousand red and black ones <pause dur="0.8"/> then we could count all thousand marbles <pause dur="0.2"/> and <pause dur="0.3"/> we would know exactly <pause dur="0.2"/> what the underlying proportion was <pause dur="1.1"/> our underlying tendency <pause dur="0.9"/> # <pause dur="0.5"/> but obviously we don't have all day and

we're not particularly interested in counting marbles <pause dur="0.4"/> so we could just take a sample <pause dur="0.3"/> and measure the proportion of reds <pause dur="0.4"/> in <pause dur="0.2"/> that sample that we pick at random <pause dur="0.9"/> if we pick the sample sufficiently well <pause dur="0.5"/> and sufficiently large we'll have a fairly good idea <pause dur="0.2"/> of what the proportion of reds is in the bag <pause dur="1.2"/> similarly we can't ask everybody how they voted in the general election <pause dur="0.8"/> but <pause dur="0.4"/> we ought to be intuitively <pause dur="0.2"/> # confident <pause dur="0.4"/> that asking a thousand people how they voted assuming they didn't lie to us <pause dur="0.7"/> # that we have a fairly good idea <pause dur="0.2"/> of the result of the election <pause dur="1.3"/> and again <pause dur="0.5"/> we're interested in the total number of Leicester diabetic patients who have foot problems <pause dur="0.7"/> so instead of asking all Leicester diabetic <trunc>pr</trunc> patients how their feet are <pause dur="0.4"/> we would just take # a random sample we don't have all day <pause dur="0.4"/> we don't have infinite time we don't have infinite money <pause dur="3.5"/> so if we have an idea of the underlying tendency of diabetes in a population <pause dur="0.6"/> then

we can predict <pause dur="0.2"/> what we may reasonably observe <pause dur="0.2"/> using probability theory <pause dur="3.0"/><kinesic desc="changes slide" iterated="n"/> a further example <pause dur="0.4"/> # working out the provision of neonatal intensive care cots <pause dur="1.5"/> we know from the past <pause dur="0.3"/> data <pause dur="0.6"/> that the true requirement in nineteen-ninety-two <pause dur="0.7"/> was about one cot per thousand live births per year <pause dur="1.7"/> and we also know from the past <pause dur="0.5"/> that we observe about twelve-thousand live births per year <pause dur="0.8"/> so on average we'll need about <sic corr="twelve">twelve-thousand</sic> neonatal intensive cots per year <pause dur="0.6"/> that's the true tendency <pause dur="0.7"/> we've taken a lot of data <pause dur="0.8"/> and <pause dur="0.4"/> measured what we're interested in <pause dur="0.9"/> and that's what we've ended up <pause dur="0.6"/> with <pause dur="1.2"/><kinesic desc="changes slide" iterated="n"/> however just knowing the average <pause dur="0.5"/> isn't enough we need to know an idea of how it all varies <pause dur="1.0"/> the # slide shows the <pause dur="0.5"/> # <pause dur="0.8"/> requirement of neonatal intensive care <pause dur="0.3"/> <sic corr="cots">costs</sic> in the past <pause dur="0.7"/> you can see that it varies quite a lot <pause dur="1.0"/> it gives us an idea of the variation in the need for a <trunc>neo</trunc> neonatal intensive care cots in the past <pause dur="0.8"/> this is what we've observed in the past not where it what we're <pause dur="0.4"/>

expecting in the future yet <pause dur="2.0"/> and it has quite a large range <pause dur="0.2"/> # <pause dur="0.2"/> we've in the past we've required between two and twenty-four neonatal intensive care cots <pause dur="1.2"/> and most of the time <pause dur="0.2"/> we needed between about eight and sixteen cots <pause dur="1.1"/> so if we provided twelve if we'd just gone with the average and ignored the variation <pause dur="0.5"/> then quite a lot of the time <pause dur="0.4"/> we'd be up to about four cots short <pause dur="1.6"/> so we need an appreciation of the variation <pause dur="2.8"/><vocal desc="sniff" iterated="n"/><pause dur="1.0"/><kinesic desc="changes slide" iterated="n"/> slide eleven <pause dur="0.2"/> a slight repeat <pause dur="1.4"/> neonatal intensive care cots we often observe eight to six <trunc>c</trunc> # <pause dur="0.2"/> eight to sixteen cots <pause dur="0.2"/> being used <pause dur="0.8"/> and on one day per month more having done some <pause dur="0.2"/> mildly complex calculations using the data in that histogram <pause dur="1.4"/> we needed nineteen or more cots <pause dur="0.8"/> and on one per cent of the days we needed twenty-one <pause dur="0.5"/> # <pause dur="0.5"/> cots <pause dur="0.7"/> hardly ever did we need more than twenty-four <pause dur="1.7"/> so logically we provided <pause dur="0.2"/> we <trunc>w</trunc> we looked at <pause dur="0.2"/> that data and thought right let's provide nineteen cots <pause dur="0.7"/> and on average about twelve were occupied so we had sixty-three per

cent <pause dur="0.4"/> of those nineteen cots occupied <pause dur="0.2"/> usually <pause dur="1.7"/> now that was taking <pause dur="0.4"/> data from our true distribution <pause dur="0.4"/> which we'd observed over a certain period of time in the past <pause dur="0.8"/> and used it to # work out what we would expect to see <pause dur="1.4"/> but in practice what we want to do is entirely the other way round we want to observe something <pause dur="0.3"/> and make an inference about what we expect to see in the future <pause dur="1.3"/> we want to reverse the direction of inference from the observed distribution <pause dur="0.4"/> to the true tendency <pause dur="1.8"/> given what we observe what we might <pause dur="0.2"/> what might we expect to happen <pause dur="0.2"/> in the future <pause dur="5.4"/><kinesic desc="changes slide" iterated="n"/> and hypothesis tests <pause dur="0.2"/> allows us to do this in a formal way <pause dur="1.5"/> we can take the observed data <pause dur="0.4"/> and make an objective statement about <pause dur="0.4"/> the # true situation <pause dur="0.5"/> we can use we can describe how the observed values will help us towards a knowledge of the true values by testing our hypothesis <pause dur="7.1"/><kinesic desc="changes slide" iterated="n"/> so formally <pause dur="0.4"/> a hypothesis is a statement <pause dur="0.4"/> that an underlying tendency of scientific interest <pause dur="0.3"/> takes a particular quantitative value <pause dur="3.0"/> we have

to state our beliefs in a quantitative way <pause dur="0.3"/> in order to use <pause dur="0.2"/> quantitative methods <pause dur="1.5"/> and on the slide are some examples of hypotheses that we might test <pause dur="0.9"/> so first of all we might say that the coin is fair but to put that in a quantitative way <pause dur="0.4"/> we have to put a value on the probability of a head <pause dur="1.3"/> so if the coin is fair <pause dur="0.3"/> we'd expect to see # heads about half the time <pause dur="0.7"/> and that is equivalent to saying that the probability of a head is a half <pause dur="2.0"/> if we want to say that a new drug is no better than a standard treatment then we would compare the survival rates by calculating the ratio <pause dur="1.1"/> if the new drug is no better then the survival rates we would expect to be equal and consequently the ratio would be equal to one <pause dur="2.1"/> and again <pause dur="0.2"/> # <pause dur="0.2"/> if we want to make a statement about the true prevalence of tuberculosis in a given population <pause dur="1.5"/> then we have to put a value on that we may observe from the past that it would be two in ten-thousand <pause dur="0.5"/> and use that as our <trunc>help</trunc> <pause dur="0.2"/> <trunc>h</trunc> our hypothesis to test <pause dur="0.2"/> so

we're stating our beliefs <pause dur="0.4"/> which may be <pause dur="0.3"/> possibly informal <pause dur="0.5"/> in a formal quantitative way <pause dur="0.4"/> in order to use quantitative methods <pause dur="5.6"/><kinesic desc="changes slide" iterated="n"/> so now we have our hypothesis what can we do with it <pause dur="1.3"/> say # we have the hypothesis that our success rate for aneurysm repair <pause dur="0.2"/> is eighty per cent <pause dur="0.9"/> and we observe what happens to say six patients who have <pause dur="0.2"/> an aneurysm repaired <pause dur="1.5"/> we need to use what we observe about those six patients to test that hypothesis <pause dur="0.4"/> that the success rate is about eighty per cent <pause dur="0.5"/> is eighty per cent <pause dur="1.6"/> now informally if we had observed <pause dur="0.2"/> # one death in a in those six patients <pause dur="0.9"/> then # <pause dur="0.3"/> we could be reasonably confident <pause dur="0.3"/> of a difference from eighty per cent <pause dur="0.4"/> because that's quite an <trunc>obstre</trunc> <pause dur="0.2"/> extreme observation <pause dur="1.1"/> if we observe four or five in six <pause dur="0.7"/> then we would be unsure what to conclude because the proportion of four or five out of six <pause dur="0.4"/> is <pause dur="0.2"/> quite close to eighty per cent <pause dur="0.2"/> we're not totally sure <pause dur="1.8"/> so that would give us an informal idea <pause dur="0.7"/> but we want <pause dur="0.2"/> a way of objectively

distinguishing <pause dur="0.8"/> the instances where our # our observed data is <pause dur="0.2"/> slightly different from our expected data <pause dur="0.2"/> our our null hypothesis <pause dur="0.8"/> from the situations where we have <pause dur="0.3"/> # data which is different from our null hypothesis and constant <pause dur="0.2"/> consequently quite extreme <pause dur="0.6"/> and hypothesis testing <pause dur="0.4"/> allows us to do that objectively <pause dur="2.6"/> so it allows us to compare consistently what we observe <pause dur="0.4"/> with what is actually happening what we think is happening <pause dur="9.6"/><kinesic desc="changes slide" iterated="n"/> so formally <pause dur="1.2"/> in a hypothesis test <pause dur="0.3"/> we <reading>calculate the probability of getting an <trunc>ar</trunc> an observation as as extreme as <pause dur="0.5"/> or more extreme <pause dur="0.6"/> than the one observed <pause dur="0.2"/> if the stated hypothesis was true</reading> <pause dur="1.4"/> we have our stated hypothesis <pause dur="0.5"/> in a quantitative <pause dur="0.5"/> fashion <pause dur="0.8"/> and we can make some probability statement about that <pause dur="0.7"/> which we can then use to calculate the probability <pause dur="0.2"/> of our observed data <pause dur="1.3"/> the idea is that <pause dur="0.2"/> if what we observe is very unlikely <pause dur="0.8"/> then <pause dur="0.2"/> the probability will be very small <pause dur="1.3"/> so if the probability is very small <pause dur="1.1"/> then either <pause dur="0.5"/> under the null hypothesis something very

unlikely has occurred <pause dur="0.9"/> or <pause dur="0.2"/> the hypothesis is wrong <pause dur="1.8"/> so then we conclude that the data are <trunc>inca</trunc> <pause dur="0.2"/> incompatible <pause dur="0.2"/> with our null hypothesis <pause dur="1.0"/> and that probability is called a P-value <pause dur="1.2"/> # another example <pause dur="0.2"/> of <pause dur="0.2"/> how <pause dur="0.5"/> # you might remember a P-value which is # a slightly more <pause dur="0.5"/> medical interpretation <pause dur="0.4"/> would be to consider how likely a patient having a blood pressure of <pause dur="0.5"/> one-forty over ninety and being healthy <pause dur="0.2"/> would be <pause dur="0.9"/> healthy patients don't <trunc>nen</trunc> generally have blood pressures that extreme <pause dur="0.7"/> so either <pause dur="0.2"/> it's highly unlikely the patient has <pause dur="0.5"/> # is healthy and has an extreme blood pressure reading <pause dur="0.8"/> or the patient is not healthy <pause dur="1.8"/> # so <trunc>tha</trunc> that that probability is a P-value <pause dur="5.5"/><kinesic desc="changes slide" iterated="n"/> so take our extreme value <pause dur="0.7"/> we have a hypothesis that a coin is fair and we've tossed it ten times <pause dur="0.8"/> we've observed ten heads and zero tails <pause dur="1.3"/> now under the hypothesis that the coin is fair <pause dur="0.3"/> the probability of a head <pause dur="0.5"/> is <pause dur="0.2"/> point-five a half <pause dur="0.4"/> one in two <pause dur="2.1"/> then <pause dur="0.5"/> assuming that the probability of a head is one if <trunc>fi</trunc> # one in

two even <pause dur="0.2"/> a half <pause dur="1.0"/> we can calculate the probability of getting ten heads each with a probability of a half <pause dur="1.1"/> and that translates to about point-zero-zero-two one in five-hundred <pause dur="0.4"/> exactly <pause dur="0.7"/> two in one-over-one-thousand-and-<pause dur="0.2"/>twenty-four <pause dur="0.6"/> two times one-thousand one-over-one-thousand-and-twenty-four <pause dur="1.1"/> that's our P-value <pause dur="0.7"/> our probability <pause dur="0.2"/> of observing ten heads <pause dur="0.5"/> given the probability of a head is a half <pause dur="1.1"/> the probability of observing the data <pause dur="0.3"/> given that the null hypothesis is true <pause dur="1.9"/> now that's really unlikely <pause dur="0.3"/> one in five-hundred <pause dur="0.9"/> so <pause dur="0.8"/> either we've got an outstanding <pause dur="0.2"/> chance result <pause dur="0.8"/> or the data <pause dur="0.2"/> <trunc>o</trunc> or the <trunc>hy</trunc> the hypothesis <pause dur="0.3"/> <trunc>i</trunc> # can be rejected <pause dur="0.9"/> the data we've observed is inconsistent with the hypothesis that we're testing <pause dur="0.3"/> that the coin is true <pause dur="0.9"/> and therefore <pause dur="0.2"/> we've got strong evidence against that hypothesis <pause dur="3.2"/> we've <trunc>un</trunc> we've observed something very unlikely <pause dur="0.3"/> so we've concluded that the hypothesis we were testing <pause dur="0.3"/> is false <pause dur="1.6"/> # <pause dur="0.4"/> yeah <pause dur="0.6"/> <vocal desc="sniff" iterated="n"/> <pause dur="0.3"/> we've rejected

that that null hypothesis <pause dur="2.2"/> prior beliefs are relevant here <pause dur="0.5"/> # they help us to set up the null hypothesis <pause dur="1.2"/> # <trunc>i</trunc> <trunc>i</trunc> in in this example our prior belief was that the the coin was fair so <pause dur="0.4"/> we assume that the probability of a head <pause dur="0.2"/> was a half <pause dur="0.8"/> and <trunc>coc</trunc> calculated the probability of our <trunc>o</trunc> <pause dur="0.3"/> observed data <pause dur="0.4"/> in those circumstances <pause dur="1.6"/> the last example there where we have <pause dur="0.5"/> # ten patients treated on # <trunc>u</trunc> using new treatment X <pause dur="0.4"/> and ten of them surviving <pause dur="1.0"/> # <pause dur="0.3"/> is exactly the same as tossing a coin ten times where instead of tossing a coin we wait and see whether the patient lives or dies <pause dur="0.3"/> same as head or tail <pause dur="0.8"/> and historically if we've seen that fifty per cent die that's the same as expecting <pause dur="0.2"/> a head with probability point-five <pause dur="1.5"/> so that might help put it in context for you <pause dur="12.2"/><event desc="drinks" iterated="n"/><kinesic desc="changes slide" iterated="n"/> so <pause dur="1.1"/> we've set up our null hypothesis <pause dur="1.2"/> we've made some probability statements about the <pause dur="0.2"/> observed data <pause dur="0.5"/> given that our <trunc>n</trunc> our null hypothesis is true we've got our P-value <pause dur="1.2"/> if that P-value <pause dur="0.3"/> is less than or equal to

point-zero-five <pause dur="0.9"/> then <pause dur="0.5"/> we reject our hypothesis we say <pause dur="0.3"/> one of <pause dur="0.2"/> one of several things <pause dur="0.6"/> we could say that the data is inconsistent with the hypothesis <pause dur="1.1"/> we've assumed something is true we've observed something <pause dur="0.3"/> which is very <trunc>like</trunc> <pause dur="0.2"/> very unlikely if it's true <pause dur="0.5"/> therefore what we're seeing is inconsistent with what we think <pause dur="1.8"/> we could also put that as saying that we have substantive evidence against the hypothesis <pause dur="0.8"/> # that it's reasonable to reject the hypothesis and that it's a statistically significant result <pause dur="1.3"/> at five per cent in this particular example <pause dur="1.5"/> if the P-value is greater than point-nought-five then we can't say any of the above <pause dur="1.9"/> # <pause dur="0.5"/> what we can't say is that the null hypothesis is false <pause dur="0.6"/> absence of evidence against the null hypothesis <pause dur="0.3"/> isn't evidence of absence <pause dur="0.9"/> we can't say that that gives us evidence to conclude that the hypothesis is false for example <pause dur="0.7"/> if # <pause dur="1.0"/> the probability under our null hypothesis <pause dur="0.2"/> that the mean surface temperature <pause dur="0.6"/> # of the earth has

increased by only one centigrade over the last fifty years <pause dur="0.3"/> <vocal desc="sniff" iterated="n"/><pause dur="0.5"/> # our observed data has a probability of point-one <pause dur="1.6"/> then that's greater than point-nought-five so we <sic corr="don't reject">reject</sic> that null hypothesis <pause dur="0.6"/> it doesn't prove that there is no global warming <pause dur="0.2"/> it simply proves that what we've observed is <sic corr="not inconsistent">inconsistent</sic> <pause dur="0.3"/> with what we believe that the temperature of the earth <pause dur="0.2"/> has increased by <pause dur="0.4"/> one per cent over the last # one degree-C by <pause dur="0.4"/> the last <pause dur="0.3"/> in the last fifty years <pause dur="2.3"/> another example which might be particularly illuminating on this absence of evidence not <trunc>evi</trunc> <pause dur="0.2"/> is not evidence of absence would be the U-S's stance on Iraqi weapons <pause dur="0.6"/> at the moment <pause dur="1.0"/> they're trying to say that <pause dur="0.4"/> the absence of evidence <pause dur="0.8"/> of # weapons doesn't mean that that is evidence that there are no weapons <pause dur="1.1"/> that may help <pause dur="0.2"/> to <pause dur="0.2"/> illuminate for you <pause dur="5.8"/><kinesic desc="changes slide" iterated="n"/> # further examples <pause dur="0.2"/> of <trunc>i</trunc> <trunc>h</trunc> hypothesis tests and P-values <pause dur="1.3"/> the first example <pause dur="0.7"/> the incidence of disease X in Warwickshire significantly lower than the

rest of the U-K <pause dur="0.5"/> P equals nought-point-nought-one <pause dur="1.6"/> this means that we've tested the hypothesis <pause dur="0.4"/> that the incidence of disease in Warwickshire <pause dur="0.4"/> is equal to the incidence of disease <pause dur="0.4"/> in the rest of the U-K <pause dur="1.2"/> and what we've observed <pause dur="0.3"/> about the incidence of disease in Warwickshire <pause dur="0.3"/> is very unlikely <pause dur="0.2"/> under that null hypothesis <pause dur="0.8"/> if the two incidences were the same then what we'd observe would have a probability <pause dur="0.3"/> of point-nought-one <pause dur="0.8"/> that's very unlikely it's less than point-nought-five so we've rejected that null hypothesis <pause dur="0.2"/> and we can say <pause dur="0.5"/> that the incidence of disease X in Warwickshire is significantly lower than in the rest of the U-K <pause dur="1.9"/> second example death rate from disease Y <pause dur="0.5"/> is significantly higher in Barnsley than in Leicester with P equals point-nought-five <pause dur="0.8"/> we've tested the null hypothesis that the two death rates are equal <pause dur="0.7"/> we've observed something about the death rates of both of them <pause dur="0.6"/> and we've concluded that what we've observed is very unlikely <pause dur="0.3"/> under that null

hypothesis that they are the same <pause dur="2.1"/> that's # <pause dur="1.3"/> that particular example what we've observed under our null hypothesis <pause dur="0.4"/> has about a five per cent chance of occurring <pause dur="1.2"/> i'll talk a little bit more about <pause dur="0.4"/> how we choose a cut-off point P-values a bit later on <pause dur="1.8"/> third example patients on the new drug did not live significantly longer than those on the standard drug <pause dur="0.8"/> we've taken patients on the new drug and patients on the standard drug <pause dur="0.8"/> tested the hypothesis that they both lived the same amount of time <pause dur="1.5"/> and calculated under that hypothesis <pause dur="0.6"/> the probability of the data we've observed being about point-four <pause dur="0.8"/> in other words about forty per cent of the time we would observe data that extreme <pause dur="0.7"/> that's not that unlikely <pause dur="0.2"/> so we've rejected the null <trunc>hypothesi</trunc> # we've accepted the null hypothesis in that case <pause dur="12.1"/><kinesic desc="changes slide" iterated="n"/> so the null hypothesis <pause dur="0.2"/> the hypothesis to be tested <pause dur="0.2"/> is often called the null hypothesis oh i'm glad we've got H-nought on the slides i occasionally call it that without

really thinking <pause dur="1.0"/> # <pause dur="2.0"/> this is <pause dur="1.5"/> the <trunc>pr</trunc> the quantitative statement about our <trunc>tr</trunc> our prior beliefs <pause dur="1.1"/> so for example <pause dur="0.3"/> if we're supposing that death rates from # <pause dur="0.4"/> a disease on treatment A and treatment B <pause dur="0.6"/> were the same <pause dur="0.6"/> then we would calculate the ratio of the death rates <pause dur="0.5"/> to be <pause dur="0.3"/> # one <pause dur="1.3"/> that would be our null hypothesis we would then observe data and <pause dur="0.3"/> calculate the <pause dur="0.2"/> probability of what we observed occurring <pause dur="1.7"/> for example again the prevalence # in in Warwickshire of a particular disease is the same in Leicestershire another example of a null hypothesis <pause dur="2.3"/> and <pause dur="1.2"/> P being less than or equal to point-nought-five <pause dur="0.4"/> <trunc>s</trunc> is substantial evidence against the hypothesis being tested <pause dur="0.8"/> not that it's definitely false <pause dur="0.4"/> it means what we've observed is unlikely <pause dur="0.2"/> given what we think <pause dur="0.4"/> not that the hypothesis is untrue <pause dur="2.1"/> again by the same token <pause dur="0.4"/> P being greater than point-nought-five <pause dur="0.7"/> is that the data is not inconsistent <pause dur="0.6"/> with the # <pause dur="0.5"/> hypothesis <pause dur="0.4"/> that means that there's not much evidence against

the <trunc>hy</trunc> the hypothesis being tested <pause dur="0.5"/> but not that it's definitely true <pause dur="0.6"/> meaning that what we've observed is reasonably likely <pause dur="0.2"/> given what we believe <pause dur="6.5"/><kinesic desc="changes slide" iterated="n"/> as a further experiment again flipping a coin ten times <pause dur="0.5"/> and having our observed results being seven heads three tails <pause dur="1.2"/> we suppose that our null hypothesis is that the coin is is # <pause dur="0.2"/> fair <pause dur="0.9"/> so we make the probability statement that the probability of a head is point-five as before <pause dur="1.0"/> what we're interested in is whether or not the coin is biased <pause dur="1.0"/> what you're seeing there on the slide <pause dur="0.9"/> is the probabilities of observing various different events the first column <pause dur="1.3"/> is the number of heads just let me get the pointer up <pause dur="0.5"/><event desc="finds pointer" iterated="y" dur="2"/> oh <pause dur="0.2"/> where's it gone <pause dur="0.2"/> there we are <pause dur="1.5"/> so <pause dur="0.4"/> the first column is the number of heads we may observe from zero to ten <pause dur="0.2"/> obviously <pause dur="1.5"/> and the second column <pause dur="0.6"/> is the probability of that number of heads occurring <pause dur="0.5"/> under our null hypothesis <pause dur="0.7"/> so if our coin is unbiased if our coin is fair and our probability of a head <pause dur="0.5"/>

is <pause dur="0.2"/> point-five <pause dur="0.8"/> then <pause dur="0.6"/> the probability of observing no heads <pause dur="0.5"/> is point-zero-zero-one <pause dur="1.2"/> the probability of observing one head <pause dur="0.5"/> is point-zero-one-zero <pause dur="0.8"/> and <pause dur="0.2"/> right the way up to <pause dur="0.5"/> ten <pause dur="2.2"/> now what we want to know is how likely <pause dur="0.2"/> is it that we observe <pause dur="0.2"/> seven heads and three tails <pause dur="1.7"/> and we do that by adding up the relevant probabilities and multiplying by two because this is a two-sided test <pause dur="0.7"/> we don't know whether the coin is biased in favour of heads or in favour of tails <pause dur="1.1"/> and that gives us a P-value of point-three-four-four <pause dur="1.2"/> so <pause dur="0.3"/> if the coin were fair <pause dur="0.4"/> about thirty-four per cent of the time <pause dur="0.4"/> we would expect to see <pause dur="0.4"/> seven heads three tails <pause dur="0.7"/> that's not particularly unlikely <pause dur="0.6"/> it's certainly not <pause dur="0.2"/> five per cent unlikely <pause dur="0.7"/> and so <pause dur="0.6"/> we don't reject our null hypothesis that the coin <pause dur="0.2"/> is unbiased <pause dur="4.7"/><kinesic desc="changes slide" iterated="n"/> ah <pause dur="0.9"/> and there we are <pause dur="0.8"/> we flipped a coin ten times <pause dur="0.3"/> observed the results seven heads three tails <pause dur="0.5"/> calculated the probability <pause dur="0.3"/> of what we've observed <pause dur="0.7"/> it's reasonably consistent with what we believe that the coin

is unbiased <pause dur="0.8"/> and that's fairly weak evidence against because it's consistent with the <trunc>nun</trunc> null hypothesis <pause dur="0.9"/> so we don't have <trunc>an</trunc> evidence <pause dur="0.2"/> that the coin is <trunc>unb</trunc> is biased <pause dur="0.6"/> but it doesn't prove <pause dur="0.3"/> that the coin is not unbiased <pause dur="0.3"/> # # <trunc>th</trunc> <trunc>th</trunc> <pause dur="0.3"/> that the coin is unbiased <pause dur="0.6"/> all it does is provide evidence <pause dur="0.5"/> in favour of that hypothesis <pause dur="7.0"/><kinesic desc="changes slide" iterated="n"/> so <pause dur="3.1"/> just let me <pause dur="0.2"/> collect my thoughts <pause dur="1.4"/> <vocal desc="sniff" iterated="n"/><pause dur="1.7"/> <vocal desc="clears throat" iterated="n"/><pause dur="4.9"/> now rejecting H-nought <pause dur="0.2"/> is not <pause dur="0.2"/> always much use this is <pause dur="0.5"/> this is what i was <pause dur="0.4"/> said i'd get back to you about the P-equals-point-nought-five business <pause dur="0.9"/> we simply choose that as an arbitrary cut-off point <pause dur="0.5"/> there there is nothing amazing happens <pause dur="0.3"/> between <pause dur="0.5"/> # point-zero-four-nine and point-zero-five-one <pause dur="6.9"/> and <pause dur="3.2"/> <vocal desc="sigh" iterated="n"/> <pause dur="1.1"/> hang on a second <pause dur="6.5"/> <event desc="drinks" iterated="n"/><vocal desc="clears throat" iterated="n"/> <pause dur="0.4"/> <vocal desc="sniff" iterated="n"/> <pause dur="0.2"/> so <pause dur="0.6"/> yeah <pause dur="0.2"/> arbitrary P-values <pause dur="1.4"/> we're not all that interested <pause dur="0.9"/> in <pause dur="0.2"/> exact # differences

between <pause dur="0.5"/> point-nought-five-<pause dur="0.4"/>one and <pause dur="0.6"/> point-nought-four-nine <pause dur="0.8"/> it largely depends on the context of what we're thinking of <pause dur="1.0"/> it it's an arbitrary cut-off rule <pause dur="0.4"/> which we'll use but it depends on our situation <pause dur="1.1"/> if we're testing a hypothesis that a treatment for the common cold <pause dur="0.7"/> # is effective <pause dur="0.9"/> and we observed # <pause dur="0.6"/> a P-value of point-nought-five-one <pause dur="0.7"/> in that particular hypothesis test <pause dur="0.6"/> then <pause dur="0.3"/> because this isn't a particularly you know <pause dur="0.4"/> groundbreaking thing to be testing <pause dur="1.5"/> the fact that we've observed something fairly unlikely probably means that our <pause dur="0.8"/> cure for the common cold isn't <pause dur="0.2"/> isn't all that effective <pause dur="0.6"/> and so we're not all that excited <pause dur="0.7"/> however if we're looking at a cure for AIDS <pause dur="1.2"/> and we observe a P-value of point-nought-five-one <pause dur="0.8"/> then because this is quite an <trunc>im</trunc> important and expensive problem <pause dur="0.7"/> we've observed a fairly unlikely result <pause dur="0.2"/> and <pause dur="0.3"/> we're really very interested in finding a cure for AIDS <pause dur="0.5"/> so even though it's not a significant result <pause dur="0.7"/> it's still an interesting thing <pause dur="0.2"/> and we would want to

investigate further <pause dur="2.1"/> # <pause dur="3.4"/> false positive results <pause dur="0.2"/> # <pause dur="1.2"/> it's a very strange slide i i i <pause dur="0.4"/> can't quite see the connection between <pause dur="0.3"/> rejecting H-nought and and all the other points on the slide anyway <pause dur="0.7"/> # the P-value <pause dur="1.0"/> gives us an idea of <trunc>i</trunc> <trunc>i</trunc> a probability of interpretation of <pause dur="0.5"/> how unlikely what we observe is given what we believe <pause dur="0.9"/> # <trunc>i</trunc> <trunc>i</trunc> it's a it's a simple <pause dur="0.2"/> interpretation <pause dur="0.6"/> # that that we can talk about <pause dur="0.9"/> it also has the nice probability interpretation <pause dur="0.5"/> that it is the probability of getting a false positive result <pause dur="0.8"/> so in other words the P-value is also <pause dur="0.4"/> the probability of rejecting the null hypothesis when it's true <pause dur="1.1"/> which is quite a handy interpretation <pause dur="1.6"/> you should also note that significance depends on the sample size <pause dur="0.9"/> if we flipped a coin three times <pause dur="0.4"/> then the minimum P-value we could observe <pause dur="0.3"/> would be # a quarter <pause dur="0.4"/> point-two-five <pause dur="1.3"/> which <pause dur="0.2"/> means that <pause dur="0.4"/> we're never going to observe a significant result in that test of whether or not that coin is

unbiased <pause dur="1.8"/> # and so # what we'd <trunc>w</trunc> obviously need there is is a larger sample size for that test <pause dur="2.3"/> # last point to note is that a <trunc>stig</trunc> a statistically significant result is not necessarily a clinically important one <pause dur="1.2"/> # <pause dur="0.2"/> again this depends on the context of the problem that we're we're dealing with <pause dur="0.8"/> one that i've # consulted on recently was about A and E admissions <pause dur="1.0"/> # <trunc>alth</trunc> although the result <pause dur="0.4"/> the the reduction in # <pause dur="1.4"/> A and E admissions <pause dur="0.5"/> was really quite small <pause dur="0.8"/> this was actually very very interesting <pause dur="0.5"/> because even a tiny one per cent reduction in A and E admissions rate <pause dur="0.5"/> translated to quite a large money saving <pause dur="0.6"/> and so <pause dur="1.1"/> we were actually very interested in a very small difference <pause dur="1.2"/> however in <pause dur="0.2"/> other situations we might only be interested in a fairly large <pause dur="0.3"/> # <pause dur="0.6"/> change in say diabetes prevalence <pause dur="0.4"/> for for practical purposes <pause dur="0.7"/> it's # it's rather down to down to context <pause dur="0.5"/> another example would be looking at <trunc>a</trunc> aneurysm repair # abdominal aortic aneurysm <pause dur="0.8"/> if we have a

fairly rare <pause dur="0.6"/> problem <pause dur="0.3"/> # <trunc>s</trunc> # say abdominal <trunc>or</trunc> aortic aneurysm having quite a low success rate of repair <pause dur="1.7"/> then <pause dur="0.9"/> sorry # # <pause dur="0.2"/> quite a low death rate of repair and we want to reduce that then if it's low to start with <pause dur="0.6"/> we can only really reduce a low rate <pause dur="0.2"/> by a very small amount simply because of the <pause dur="0.2"/> amount we start with <pause dur="0.4"/> if we start with a five per cent death rate and we want to reduce that <pause dur="0.5"/> for whatever reason <pause dur="0.7"/> # economic or whatever <pause dur="1.4"/> then we can only # reduce a five per cent death rate <pause dur="0.3"/> by a maximum of five per cent <pause dur="0.6"/> which may in other contexts be quite a small reduction <pause dur="2.2"/> so statistically significant does not necessarily mean clinically important <pause dur="0.5"/> but it largely depends on the context of the problem <pause dur="0.3"/> at the time <pause dur="2.1"/> nevertheless P-values <pause dur="0.2"/> are used a lot <pause dur="0.8"/> # most people i i have consulting me at # the Walsgrave Hospital sorry the hospital formerly known as Walsgrave <pause dur="0.8"/> # <pause dur="1.1"/> get very excited when they see P-values in papers most most people <pause dur="0.4"/> are very interested in seeing significant

results <pause dur="0.4"/> but that does not necessarily mean that a significant result in a hypothesis test <pause dur="0.3"/> translates to something <pause dur="0.2"/> which is clinically useful or interesting <pause dur="6.9"/><kinesic desc="changes slide" iterated="n"/> so to sum up <pause dur="1.3"/> hypothesis tests allow us to describe how our observed values <pause dur="0.7"/> help us towards a knowledge of true values <pause dur="0.7"/> by testing <pause dur="0.2"/> # <pause dur="0.6"/> the probability of observing given what we believe <pause dur="2.0"/> and in the next lecture <pause dur="0.3"/> we'll look at how we calculate a range <pause dur="0.6"/> of # <pause dur="1.2"/> in in which the true value probably lies <pause dur="2.2"/><kinesic desc="changes slide" iterated="n"/> so <pause dur="0.9"/> key points to note in this lecture <pause dur="1.0"/> are that variation exists that that people differ we should all have a fairly good appreciation <pause dur="0.4"/> that <pause dur="0.3"/> such is life <pause dur="0.2"/> that is the way it is <pause dur="1.4"/> # our observed data because of that natural variation <pause dur="0.8"/> is often different from our underlying tendency <pause dur="0.6"/> the observed proportion of people with diabetes in a in a general practice <pause dur="0.7"/> is often different from the prevalence of diabetes in the area that that general practice covers <pause dur="0.6"/> just because of natural variation <pause dur="2.0"/> # <pause dur="0.9"/> various sources of

variation <pause dur="0.4"/> natural is is the most obvious one to think about <pause dur="0.6"/> but our <pause dur="0.2"/> estimate of <pause dur="0.3"/> # the proportion of people with diabetes in our general practice <pause dur="0.6"/> will depend on how we choose our sample <pause dur="1.0"/> which is another source of variation <pause dur="1.1"/> and we may <pause dur="0.2"/> test hypothesis about <pause dur="0.3"/> hypotheses <pause dur="0.6"/> about our true value <pause dur="0.2"/> of prevalence of diabetes in our population <pause dur="0.2"/> from our general practice area <pause dur="0.7"/> by using what we observe <pause dur="0.3"/> given what we believe <pause dur="1.0"/> and calculating the probability of what we observe <pause dur="0.3"/> given what we believe <pause dur="1.8"/> and after next week's lecture you'll be able to see how confidence intervals <pause dur="0.3"/> can be calculated <pause dur="0.8"/> those are <pause dur="0.2"/> an <trunc>e</trunc> give us an idea of where our true value may lie <pause dur="0.5"/> with a specific probability <pause dur="1.4"/> and that's it for today so <pause dur="0.5"/> you'll be pleased you have a slightly longer break than usual
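
Editorial sketch, not part of the recording: the coin-flipping calculation described in the lecture can be reproduced in a few lines of Python. This assumes a fair coin under the null hypothesis and uses the "add up the relevant probabilities and multiply by two" rule the lecturer describes. The function names binomial_pmf and two_sided_p_value are illustrative and do not come from the lecture or its slides.

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n flips when P(head) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(heads, n):
    """Add up the tail at least as extreme as the observed count, then double it.

    Doubling relies on the fair-coin null hypothesis (a symmetric
    distribution); the result is capped at 1.
    """
    extreme = max(heads, n - heads)
    tail = sum(binomial_pmf(k, n) for k in range(extreme, n + 1))
    return min(1.0, 2 * tail)


# Probability of each number of heads in ten flips of a fair coin,
# the kind of table the lecturer points to on the slide.
for k in range(11):
    print(k, round(binomial_pmf(k, 10), 3))

# Seven heads and three tails in ten flips.
print(two_sided_p_value(7, 10))  # 0.34375

# Most extreme outcome possible with only three flips.
print(two_sided_p_value(3, 3))  # 0.25
```

The first result rounds to the point-three-four-four quoted in the lecture, and the second matches the remark that three flips can never give a P-value below a quarter.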