

<?xml version="1.0"?>

<!DOCTYPE TEI.2 SYSTEM "base.dtd">

<TEI.2><teiHeader>

<fileDesc>

<titleStmt>

<title>"Log-rank test, Kaplan-Meier Estimator"</title></titleStmt>

<publicationStmt><distributor>BASE and Oxford Text Archive</distributor>

<idno>pslct039</idno>

<availability><p>The British Academic Spoken English (BASE) corpus was developed at the

Universities of Warwick and Reading, under the directorship of Hilary Nesi

(Centre for English Language Teacher Education, Warwick) and Paul Thompson

(Department of Applied Linguistics, Reading), with funding from BALEAP,

EURALEX, the British Academy and the Arts and Humanities Research Board. The

original recordings are held at the Universities of Warwick and Reading, and

at the Oxford Text Archive and may be consulted by bona fide researchers

upon written application to any of the holding bodies.

The BASE corpus is freely available to researchers who agree to the

following conditions:</p>

<p>1. The recordings and transcriptions should not be modified in any

way</p>

<p>2. The recordings and transcriptions should be used for research purposes

only; they should not be reproduced in teaching materials</p>

<p>3. The recordings and transcriptions should not be reproduced in full for

a wider audience/readership, although researchers are free to quote short

passages of text (up to 200 running words from any given speech event)</p>

<p>4. The corpus developers should be informed of all presentations or

publications arising from analysis of the corpus</p><p>

Researchers should acknowledge their use of the corpus using the following

form of words:

The recordings and transcriptions used in this study come from the British

Academic Spoken English (BASE) corpus, which was developed at the

Universities of Warwick and Reading under the directorship of Hilary Nesi

(Warwick) and Paul Thompson (Reading). Corpus development was assisted by

funding from the Universities of Warwick and Reading, BALEAP, EURALEX, the

British Academy and the Arts and Humanities Research Board. </p></availability>

</publicationStmt>

<sourceDesc>

<recordingStmt>

<recording dur="00:47:11" n="4825">

<date>21/02/2003</date><equipment><p>video</p></equipment>

<respStmt><name>BASE team</name>

</respStmt></recording></recordingStmt></sourceDesc></fileDesc>

<profileDesc>

<langUsage><language id="en">English</language>

</langUsage>

<particDesc>

<person id="nf0955" role="main speaker" n="n" sex="f"><p>nf0955, main speaker, non-student, female</p></person>

<person id="sf0956" role="participant" n="s" sex="f"><p>sf0956, participant, student, female</p></person>

<person id="sm0957" role="participant" n="s" sex="m"><p>sm0957, participant, student, male</p></person>

<person id="sm0958" role="participant" n="s" sex="m"><p>sm0958, participant, student, male</p></person>

<person id="sm0959" role="participant" n="s" sex="m"><p>sm0959, participant, student, male</p></person>

<person id="sm0960" role="participant" n="s" sex="m"><p>sm0960, participant, student, male</p></person>

<person id="sm0961" role="participant" n="s" sex="m"><p>sm0961, participant, student, male</p></person>

<person id="sf0962" role="participant" n="s" sex="f"><p>sf0962, participant, student, female</p></person>

<person id="sf0963" role="participant" n="s" sex="f"><p>sf0963, participant, student, female</p></person>

<person id="sm0964" role="participant" n="s" sex="m"><p>sm0964, participant, student, male</p></person>

<person id="sm0965" role="participant" n="s" sex="m"><p>sm0965, participant, student, male</p></person>

<personGrp id="ss" role="audience" size="s"><p>ss, audience, small group </p></personGrp>

<personGrp id="sl" role="all" size="s"><p>sl, all, small group</p></personGrp>

<personGrp role="speakers" size="13"><p>number of speakers: 13</p></personGrp>

</particDesc>

<textClass>

<keywords>

<list>

<item n="speechevent">Lecture</item>

<item n="acaddept">Statistics</item>

<item n="acaddiv">ps</item>

<item n="partlevel">UG3/PG</item>

<item n="module">Medical Statistics</item>

</list></keywords>

</textClass>

</profileDesc></teiHeader>

<text><body>

<u who="nf0955"> i wanted to start by saying that at the end <trunc>o</trunc> <trunc>o</trunc> well not quite at the end <pause dur="0.2"/> of the lecture i'd made a correction but somebody came and asked me at the end about the correction <pause dur="0.5"/> so can you just check in your notes where i was <pause dur="2.4"/> writing the <pause dur="1.3"/> <kinesic desc="writes on board" iterated="y" dur="14"/> variance of G <trunc>da</trunc> <pause dur="0.3"/> G-of-X <pause dur="1.4"/> approximately equal to <pause dur="0.5"/> G-dashed at mu <pause dur="3.9"/> right i forgot the squared initially can you check you've all i think you've probably all checked that <pause dur="0.3"/> i'd put the squared into all the other formulae but just not in here <pause dur="1.4"/><kinesic desc="writes on board" iterated="y" dur="2"/> so can you just check that you've got that correct in your notes or you might get <pause dur="0.8"/> slightly puzzled when you come back to them <pause dur="15.7"/> right i was then going on to talk about the Kaplan-Meier estimate <pause dur="3.6"/> <kinesic desc="writes on board" iterated="y" dur="14"/> so Kaplan-<pause dur="2.0"/>Meier <pause dur="2.2"/> estimator <pause dur="2.2"/> and <pause dur="2.0"/> it just incidentally was a paper published by two people

called Kaplan and Meier in nineteen-fifty-eight <pause dur="0.7"/> # <pause dur="1.1"/> just to show you that quite a lot of the stuff <pause dur="1.0"/> we do isn't <pause dur="0.4"/> late well second half of the twentieth century <pause dur="0.4"/> sometimes if you try to explain to people you're doing research in statistics they look at <trunc>b</trunc> you blankly and say what is there to do they seem to think all these things just <pause dur="0.6"/> appeared instantly <pause dur="0.9"/> # and nobody ever thought about them <pause dur="4.4"/> right and the notation i was <pause dur="0.2"/> using <pause dur="0.4"/> <kinesic desc="writes on board" iterated="y" dur="4"/> was <pause dur="4.9"/> you should just about be at this stage in your notes anyway this should already be written down <pause dur="0.9"/> so the notation was we're looking at <pause dur="0.8"/><kinesic desc="writes on board" iterated="y" dur="3"/> T-zero T-one <pause dur="0.9"/> up to <pause dur="0.2"/> did i say T-N or T-K <pause dur="1.5"/> N </u><u who="sf0956" trans="overlap"> N </u><pause dur="0.9"/> <u who="nf0955" trans="pause"> <vocal desc="laugh" iterated="n"/><kinesic desc="writes on board" iterated="y" dur="4"/> # <pause dur="0.8"/> and we <pause dur="0.2"/> as usual have <pause dur="0.8"/> T-nought equal to zero <pause dur="4.7"/> and then i gave you a description of the

intervals which i did <pause dur="0.2"/> correctly verbally and not correctly <pause dur="0.5"/> in the written form <pause dur="0.4"/> we think about the intervals <pause dur="0.6"/> <kinesic desc="writes on board" iterated="y" dur="8"/> I-I <pause dur="0.3"/> being <pause dur="1.3"/> zero <pause dur="0.3"/> T-one <pause dur="4.6"/> sorry i knew <unclear>i just</unclear> <pause dur="1.7"/> that's what got me confused you have to have that <pause dur="0.2"/> square bracket <kinesic desc="writes on board" iterated="y" dur="5"/> of the initial zero <pause dur="1.6"/> T-one <pause dur="0.2"/> to T-two <pause dur="0.8"/> and so on so the intervals are <pause dur="0.8"/><kinesic desc="writes on board" iterated="y" dur="13"/> <vocal desc="clears throat" iterated="n"/> with the exception of at zero <pause dur="2.3"/> T-I <pause dur="1.9"/> <vocal desc="clears throat" iterated="n"/><pause dur="1.8"/> T-I-plus <pause dur="1.7"/> T-I-plus-one <pause dur="0.8"/> with the square bracket <pause dur="3.2"/> so that you're actually including <pause dur="2.8"/> the interval <pause dur="8.7"/><kinesic desc="writes on board" iterated="y" dur="8"/> doesn't really make sense <trunc>t</trunc> <pause dur="1.2"/> to put that down as an equation <pause dur="2.4"/> okay then we need exactly the same thing as in an actuarial life table <pause dur="1.2"/> except that instead of looking at intervals <pause dur="0.5"/> <vocal desc="clears throat" iterated="n"/> <pause dur="0.4"/> we're looking at point estimates so that <pause dur="4.0"/> <kinesic desc="writes on board" iterated="y" dur="22"/> for <pause dur="1.9"/> all those intervals I equals <pause dur="1.8"/> one <pause dur="1.0"/> up to N <pause dur="0.6"/> we let <pause dur="0.8"/> D-of-T-<pause dur="0.3"/>I <pause dur="0.9"/> be equal to the number <pause dur="0.8"/> who died <pause dur="2.0"/> at <pause dur="0.5"/> T-I <pause dur="0.7"/> and now we are requiring

that we have the <trunc>exec</trunc> exact death time so it's not in an interval it's at a particular time <pause dur="6.3"/> and we're going to need again we're going to need to know how many people <kinesic desc="writes on board" iterated="y" dur="27"/> there are <pause dur="0.3"/> so <pause dur="1.7"/> for I-equals-one up to <pause dur="1.0"/> N <pause dur="2.6"/> <gap reason="inaudible" extent="1 sec"/> <pause dur="1.9"/> and zero <pause dur="0.5"/> # <pause dur="0.5"/> N-of-T-I is <pause dur="0.7"/> the number <pause dur="1.7"/> alive and at risk <pause dur="2.3"/> at <pause dur="1.8"/> time I <pause dur="11.1"/> so those two are <trunc>r</trunc> <pause dur="0.3"/> really no different from <pause dur="1.4"/> the actuarial life tables <pause dur="0.4"/> the difference comes in <pause dur="0.4"/> in the censoring <pause dur="2.1"/><kinesic desc="writes on board" iterated="y" dur="13"/> which <trunc>i</trunc> so again <pause dur="2.1"/> for any time we want to know <pause dur="1.6"/> C ah <trunc>s</trunc> <pause dur="0.4"/> C-of-T-I <pause dur="1.6"/> but this time <pause dur="2.1"/> it's in a different interval so <pause dur="0.3"/> <vocal desc="clears throat" iterated="n"/> it's the number censored <pause dur="2.6"/> <kinesic desc="writes on board" iterated="y" dur="18"/> sorry the number <pause dur="1.6"/> censored <pause dur="3.2"/> in <pause dur="4.8"/> T-I-minus-one <pause dur="0.4"/> T-<pause dur="0.2"/>I <pause dur="1.6"/> and the <pause dur="1.3"/> equality and inequality changed round <pause dur="4.2"/> <vocal desc="clears throat" iterated="n"/> <pause dur="0.4"/> we defined the time points to be the point at which deaths occur not at which points at which censoring occur <pause dur="0.8"/> so we can actually get things happening in an

interval for censorings unlike for deaths <pause dur="0.8"/> <vocal desc="clears throat" iterated="n"/><pause dur="0.2"/> and this is the point i was making that if we actually had a death <pause dur="3.3"/> <kinesic desc="writes on board" iterated="y" dur="20"/> at <pause dur="0.3"/> some value say <pause dur="1.1"/> eight doesn't matter what the eight is <pause dur="1.2"/> then <pause dur="1.4"/> the <pause dur="0.5"/> number of deaths at point eight would be equal to one <pause dur="1.0"/> but if there was a censored observation <pause dur="2.4"/> at <pause dur="0.9"/> time eight as well <pause dur="0.9"/> we regard that as happening <pause dur="0.2"/> after that death <pause dur="0.3"/> in other words <pause dur="0.4"/> it has to go <pause dur="0.4"/> <vocal desc="clears throat" iterated="n"/> <pause dur="0.9"/> if the because the interval will be <pause dur="1.0"/><kinesic desc="writes on board" iterated="y" dur="6"/> something <pause dur="2.1"/> eight for the censoring where that's strictly <pause dur="1.3"/> values less than eight <pause dur="0.2"/> the censoring goes into the interval above this <pause dur="3.7"/> it's actually much easier to do this often than to <pause dur="0.5"/> # <pause dur="0.6"/> <vocal desc="clears throat" iterated="n"/> <pause dur="0.3"/> think about the <pause dur="1.4"/> worry too much about the formulae <pause dur="0.7"/> what it does

mean of course is that you can also say <pause dur="2.2"/> <kinesic desc="writes on board" iterated="y" dur="11"/> therefore <pause dur="2.2"/> the number at time <pause dur="1.0"/> I-plus-one <pause dur="2.1"/> <vocal desc="clears throat" iterated="n"/> well what's going to be there at the number at time I-plus-one <pause dur="2.1"/> <kinesic desc="writes on board" iterated="y" dur="3"/> it'll start off with a number <pause dur="1.7"/> at the beginning <pause dur="1.0"/> at the previous interval <pause dur="2.0"/><kinesic desc="writes on board" iterated="y" dur="1"/> and we're going to have to subtract <pause dur="0.9"/> the deaths <pause dur="4.6"/> that happened at that time <pause dur="1.1"/> and then we're going to subtract the censoring <pause dur="0.5"/> from <pause dur="0.7"/> the next time <pause dur="9.2"/> <kinesic desc="writes on board" iterated="y" dur="3"/> # oh <pause dur="1.1"/> sorry <pause dur="0.8"/> just <kinesic desc="writes on board" iterated="y" dur="8"/> used an abbreviated notation sorry <pause dur="1.7"/> minus <pause dur="0.2"/> # <pause dur="4.8"/> sorry you quite often do land up writing D-I it's putting in the T-I just to make it explicit that it's on the <pause dur="0.3"/> original <pause dur="1.2"/> all the times <pause dur="0.7"/> the other thing obviously we <kinesic desc="writes on board" iterated="y" dur="20"/> need is <pause dur="2.8"/> that <pause dur="0.8"/> the number <pause dur="1.5"/> at time zero is <pause dur="0.5"/> the total number in the study <pause dur="9.3"/> okay i've said obviously that in fact yesterday we were

discussing how you modify this <pause dur="1.1"/> in the occasions when <vocal desc="clears throat" iterated="n"/><pause dur="1.1"/> when that doesn't actually happen there's some interesting situations <pause dur="3.1"/> so this is really by way of setting up definitions which actually give us <pause dur="3.2"/> <kinesic desc="writes on board" iterated="y" dur="3"/> unsurprisingly an estimator that looks the same <pause dur="0.6"/><kinesic desc="writes on board" iterated="y" dur="2"/> just make that T <pause dur="1.5"/> and <pause dur="0.2"/> once again we're going to have a product <pause dur="1.4"/><kinesic desc="writes on board" iterated="y" dur="5"/> this time i'm going to write the product as all T-I <pause dur="0.9"/> such that <pause dur="1.5"/> T-I is less than or equal to T <pause dur="4.8"/> and <pause dur="2.4"/> <kinesic desc="writes on board" iterated="y" dur="10"/> not surprisingly that's the product over <pause dur="0.7"/> all those <pause dur="0.4"/> D-T-Is <pause dur="1.4"/> N-T-Is <pause dur="11.8"/> what we can also notice from this is that <pause dur="1.6"/> you can <trunc>r</trunc> <pause dur="0.2"/><kinesic desc="writes on board" iterated="y" dur="1"/> write this more directly <pause dur="0.2"/> as a recursion <pause dur="1.0"/> by <pause dur="1.4"/> <kinesic desc="writes on board" iterated="y" dur="4"/> using the same <pause dur="3.7"/> product <pause dur="0.2"/> but let's write each of these as <pause dur="1.3"/> <kinesic desc="writes on board" iterated="y" dur="3"/> S-of-<pause dur="1.5"/>T-I <pause dur="14.2"/>
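<!-- Editorial sketch, not part of the recorded lecture.
The product-limit form described above can be sketched in code. This is a minimal illustration, assuming the usual form S(t) = product over all t_i <= t of (1 - d(t_i)/n(t_i)). The function name kaplan_meier, its arguments, and the six observations in the usage note are hypothetical; they are not the data from the lecture handout. A subject censored at a death time is counted as still at risk at that time, matching the convention that a tied censoring is treated as happening after the death.

```python
def kaplan_meier(times, died):
    """Return [(t_i, S(t_i))] for each distinct death time t_i.

    times: observed time for each subject (hypothetical input format)
    died:  True if the subject died at that time, False if censored
    """
    # Time points are defined by deaths only, never by censorings.
    death_times = sorted({t for t, d in zip(times, died) if d})
    survival = 1.0
    curve = []
    for t_i in death_times:
        # n(t_i): alive and at risk at t_i, including anyone censored exactly at t_i
        n_i = sum(1 for t in times if t >= t_i)
        # d(t_i): deaths observed exactly at t_i
        d_i = sum(1 for t, d in zip(times, died) if d and t == t_i)
        # One factor of the product, applied recursively to the running estimate
        survival *= 1 - d_i / n_i
        curve.append((t_i, survival))
    return curve
```

For example, kaplan_meier([2, 3, 3, 5, 8, 8], [True, True, False, True, True, False]) returns one (time, estimate) pair for each of the death times 2, 3, 5 and 8.
-->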

and writing it in this product form <pause dur="0.6"/> is the background to the other name that this is given <pause dur="0.7"/> <kinesic desc="writes on board" iterated="y" dur="26"/> which is this <pause dur="2.5"/> is also <pause dur="2.2"/> known <pause dur="1.1"/> as the <pause dur="1.4"/> product <pause dur="3.8"/> limit <pause dur="1.4"/> estimator <pause dur="15.6"/> one of the other things that can be established on this is that <pause dur="0.3"/> again it can be <kinesic desc="writes on board" iterated="y" dur="13"/> shown to be <pause dur="2.5"/> the maximum likelihood estimator <pause dur="9.2"/> okay <pause dur="0.3"/> well you may need to <pause dur="1.0"/> essentially specify <pause dur="0.2"/> conditions on <pause dur="0.2"/> nothing <pause dur="0.6"/> the <unclear>how is it</unclear> not changing when there's no information <pause dur="1.5"/> and <pause dur="0.7"/> the other thing we're going to want to know is the formula <pause dur="0.6"/> and <pause dur="0.7"/> sorry the the variance <kinesic desc="writes on board" iterated="y" dur="28"/> and <pause dur="3.0"/> we don't need to derive the variance of it again because the <pause dur="0.9"/> variance <pause dur="2.8"/> is <pause dur="0.8"/> <vocal desc="clears throat" iterated="n"/> <pause dur="1.2"/> similar <pause dur="2.1"/> in the obvious way <pause dur="0.7"/> to Greenwood's formula for the <trunc>l</trunc> # <pause dur="0.2"/> life table <pause dur="29.6"/> if you do do some reading around on the survival <pause dur="0.2"/> analysis books you might find a couple of other <pause dur="0.8"/>

<trunc>approxim</trunc> <pause dur="0.2"/> <vocal desc="clears throat" iterated="n"/> approximations to the variance <pause dur="0.5"/> that you can use particularly if there's no censoring <pause dur="1.6"/> if there is no censoring then at any point you're just using a simple binomial estimate <pause dur="0.8"/> so you can just use <pause dur="0.2"/> the binomial formula <pause dur="0.9"/> # <pause dur="0.9"/> in general <pause dur="0.2"/> you're going to be doing this kind of thing where you've got enough data to want to use a package which will calculate the variance for you <pause dur="0.4"/> so the simple approximations are not that critical <pause dur="1.1"/> okay i put some data up <pause dur="0.2"/> on <pause dur="1.6"/> Wednesday <pause dur="0.8"/> and <pause dur="0.6"/> # <pause dur="1.5"/> i have got an estimator Kaplan-Meier curve for that <pause dur="0.6"/> which i shall hand out in due course the main reason i'm not handing it out yet is i'm going to ask you to do something <pause dur="0.3"/> which has a solution on the back of the sheet <pause dur="1.2"/> # <pause dur="0.5"/> and that just goes through the calculation <pause dur="0.9"/> <vocal desc="clears throat" iterated="n"/> <pause dur="0.5"/> for you so <pause dur="0.9"/> you've got the list of times <pause dur="0.2"/> and for practice you can try doing it yourselves and then <pause dur="0.3"/> see whether you get the same answers <pause dur="1.1"/> one of

the other things that's useful to do <pause dur="0.4"/> which may or may not endear me entirely to <pause dur="1.6"/> # people fiddling around watching what i'm up to <pause dur="0.3"/> # <pause dur="26.6"/> <event desc="turns on overhead projector" iterated="n"/><kinesic desc="puts on transparency" iterated="n"/> right i should get these lights out <pause dur="3.6"/><event desc="turns off lights" iterated="n"/> okay what's quite often <pause dur="0.4"/> you'll do with an actuarial life test like a life table <pause dur="0.5"/> is draw a survival curve <pause dur="1.3"/> and <pause dur="6.3"/> you can see the fact that we have estimates only at times in which events occur <pause dur="1.3"/> by the jumps whenever there's an event <pause dur="0.4"/> and you can see that the jumps don't <pause dur="0.8"/> always happen at <pause dur="0.4"/> identical intervals <pause dur="1.6"/> you can also see in this group this is actually gastric cancer with a couple of comparisons <pause dur="1.2"/> with time and days <pause dur="1.5"/> you can see a <pause dur="0.4"/> a very dramatic <pause dur="0.6"/> drop in <pause dur="0.7"/> survival <pause dur="0.5"/> so that <pause dur="0.3"/> five-hundred days you've only got about a third of the people still alive <pause dur="0.9"/> and then <pause dur="0.6"/> the survival <pause dur="0.2"/> levels off a little bit <pause dur="3.0"/> but the other reason for showing this <pause dur="0.2"/> curve again there's a there's a printed one in the handout i'll give you so <pause dur="0.2"/> you can <pause dur="0.2"/> have a look at that being drawn up <pause dur="0.6"/>

the other reason for showing the curve is that of course generally what we're interested in <pause dur="0.4"/> is not just the survival <pause dur="0.2"/> i mean we may well want to know what the median survival is <pause dur="0.4"/> looks like about a <pause dur="0.6"/> a year in the radiation group with chemotherapy <pause dur="0.7"/> and maybe about <pause dur="0.5"/> a year and a half to two years in the other group <pause dur="0.8"/> but we actually would want to do a comparison of those <pause dur="0.7"/> in that group it looks as though the dotted line <pause dur="1.2"/> has got a better survival <pause dur="0.9"/> which is the chemotherapy-only group <pause dur="0.7"/> but how would we think about formally testing that <pause dur="2.4"/> well that's where we get <pause dur="0.5"/> back to a log-rank test <pause dur="1.3"/> which i'll talk about <pause dur="0.3"/> in formulae for in a minute i'm just trying to see if i can pick up <pause dur="0.6"/> let's <kinesic desc="writes on transparency" iterated="y" dur="4"/> concentrate on <pause dur="3.1"/> just that little bit of the <pause dur="0.3"/> graph at the moment <pause dur="0.9"/> what we could say at <kinesic desc="indicates point on transparency" iterated="n"/> that point is that <pause dur="0.6"/><kinesic desc="writes on transparency" iterated="y" dur="15"/> in the one group <pause dur="0.2"/> # <pause dur="0.7"/> there will be some number <pause dur="0.3"/> at whatever the

time is <pause dur="1.1"/> and that's going to be the number from the chemotherapy and radiation group <pause dur="1.3"/> and <pause dur="0.4"/> of those <pause dur="0.9"/> chemotherapy and radiation group <pause dur="0.5"/> there was a death at that <pause dur="0.4"/> we cannot count the number of deaths at that time <pause dur="0.9"/> in this case <pause dur="0.5"/><kinesic desc="writes on transparency" iterated="y" dur="1"/> just here <kinesic desc="indicates point on transparency" iterated="n"/> there are no deaths <pause dur="1.6"/> whereas <kinesic desc="indicates point on transparency" iterated="n"/> here <pause dur="0.2"/> there'll be <kinesic desc="writes on transparency" iterated="y" dur="12"/> some fixed number in the chemotherapy-only group <pause dur="1.3"/> and the death at that time <pause dur="1.0"/> we can see there is <pause dur="0.9"/> a jump so there must have been a death <pause dur="1.5"/> and we can think of that <pause dur="0.7"/> as then being a comparison of two binomials <pause dur="0.5"/> we could even set it up <pause dur="1.0"/> as a small <pause dur="0.2"/> two by two table <pause dur="0.6"/> and do a formal comparison of <pause dur="1.1"/> given the numbers in the two groups so if there were equal numbers in the two groups at this point at risk <pause dur="1.0"/> and we saw one death <pause dur="1.0"/> then we would divide that one death in terms of expectation <pause dur="0.5"/> into half in one group half in another <pause dur="1.6"/>
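<!-- Editorial sketch, not part of the recorded lecture.
The splitting of deaths by expectation described above can be written as a one-line calculation. This is a minimal sketch, assuming that under the null hypothesis the deaths at a single time point are shared between the two groups in proportion to the numbers at risk. The function name expected_deaths and its arguments are hypothetical.

```python
def expected_deaths(n1, n2, d):
    """Expected deaths in each group at one time point.

    n1, n2: numbers at risk in groups one and two at that time
    d:      total deaths observed at that time across both groups
    """
    total = n1 + n2
    # Each group's share of the deaths is its share of those at risk.
    return d * n1 / total, d * n2 / total
```

With equal numbers at risk and one death, expected_deaths(10, 10, 1) assigns half a death to each group, which is the case described in the lecture.
-->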

and provided we can assume <trunc>th</trunc> independence we can actually do that <pause dur="0.7"/> along the whole curve and that's the basis of the log-rank test <pause dur="2.8"/> so to come back to the <pause dur="0.8"/> blackboard <pause dur="2.0"/> and <pause dur="0.3"/> i'll leave that a minute 'cause i can see a couple of pens <pause dur="2.1"/> blackboard and chalk in fact i think we're coming back to <pause dur="1.0"/> a couple of minutes of you <pause dur="0.2"/> talking about what i've just said and whether it makes sense while i clean the blackboard <pause dur="6.6"/><event desc="wipes board" iterated="y" dur="unknown"/> or anything else you want to talk about </u><gap reason="break in recording" extent="uncertain"/> <u who="nf0955" trans="pause"> okay have you got any questions on that <pause dur="1.4"/> no okay <pause dur="0.4"/> log-rank test <pause dur="11.1"/> <kinesic desc="writes on board" iterated="y" dur="8"/> i'm not really sure where the word log comes into <pause dur="0.7"/> this <pause dur="0.8"/> but the reason we talk about ranks is that we simply look at the order <trunc>inven</trunc> <pause dur="0.2"/> which the events occur <pause dur="0.6"/> and we don't look at how far apart they are in other words we've got the rank times <pause dur="0.6"/> but not <pause dur="0.2"/> the actual

values of the times <pause dur="0.7"/> and the point of a log-rank test is <pause dur="2.0"/> <kinesic desc="writes on board" iterated="y" dur="32"/> to compare two survival <trunc>gr</trunc> <pause dur="0.4"/> two or more survival curves so if we <pause dur="2.1"/> wish to <pause dur="2.1"/> test <pause dur="1.5"/> whether <pause dur="1.6"/> two survival curves are equal <pause dur="14.5"/> in a non-parametric <pause dur="0.6"/> context so non-parametrically <pause dur="8.7"/><kinesic desc="writes on board" iterated="y" dur="13"/> we <pause dur="2.6"/> can use <pause dur="2.1"/> the log-rank test <pause dur="10.4"/> okay as an aside that you don't need to write down but it's probably <pause dur="0.4"/> quite useful general knowledge <pause dur="1.1"/> non-parametric tests are essentially tests based on things like ranks that don't take <pause dur="0.2"/> the actual values into account <pause dur="1.7"/> we don't happen to teach very much about them at all on MORSE i think in fact this course is the only one that <pause dur="0.5"/> probably mentions them <pause dur="0.5"/> although i know at least some of you have discovered them in doing your reading of the literature <pause dur="1.1"/> # <pause dur="1.6"/> they tend to be very popular in the social sciences <pause dur="0.2"/> probably historically as much as anything else <pause dur="1.3"/> i've said we can use the log-rank test because

there is also <pause dur="0.3"/> a Wilcoxon <pause dur="4.6"/> i've been asked about the Wilcoxon test this morning and the name of the Wilcoxon test <pause dur="1.3"/> that actually relates to survival has just escaped me there is a Wilcoxon test that's related to survival <pause dur="0.4"/> so if you're reading any of the survival texts <pause dur="0.4"/> you might find reference to both of them <pause dur="0.8"/> # <pause dur="0.4"/> anything based on ranks tends to <trunc>in</trunc> require a lot more hard work which is why i'm not going to describe it but it does exist <pause dur="0.5"/> <sic corr="non-parametric">non-paracmetric</sic> tests are around <pause dur="0.5"/> # <pause dur="0.2"/> so if somebody at an interview or anything asks you about them <pause dur="0.4"/> you've heard of them <pause dur="0.3"/> you just haven't studied them <pause dur="1.0"/> # <pause dur="2.1"/> so <trunc>e</trunc> end of that aside we're being non-parametric we're comparing two groups <pause dur="0.4"/> and as i said what we're really wanting to do <pause dur="0.9"/> is <pause dur="0.8"/> what we're going to do is look at each point but let's think about if we're comparing and doing a significance test we need a null hypothesis <pause dur="0.8"/> and it's slightly complicated in this <pause dur="0.2"/> <kinesic desc="writes on board" iterated="y" dur="48"/> context <pause dur="1.6"/> to

write it down <pause dur="0.6"/> it's not quite as trivial as some other things so that the null hypothesis <pause dur="4.4"/> is <pause dur="2.0"/> that <pause dur="0.3"/> the <trunc>cur</trunc> <pause dur="0.2"/> <vocal desc="clears throat" iterated="n"/> the <pause dur="1.4"/> curves <pause dur="1.5"/> are <pause dur="3.0"/> identical <pause dur="4.4"/> I-E <pause dur="2.9"/> S for group one <pause dur="0.2"/> of T <pause dur="1.2"/> equals S for group two <pause dur="0.2"/> of T <pause dur="0.8"/> that's a form we <pause dur="0.2"/> usually can write hypotheses in but here <pause dur="0.6"/> we have to add in the comment for all T <pause dur="0.7"/> T greater than or equal to <pause dur="0.2"/> zero <pause dur="6.3"/> and <pause dur="0.5"/> you probably <vocal desc="clears throat" iterated="n"/> want to write what i just said <pause dur="0.7"/> <kinesic desc="writes on board" iterated="y" dur="18"/> where <pause dur="0.8"/> S-one <pause dur="1.6"/> S-two <pause dur="3.0"/> are the curves <pause dur="3.1"/> for <pause dur="2.3"/> groups one and two <pause dur="10.0"/> the for all T <pause dur="0.5"/> is <pause dur="0.8"/> the most powerful way of doing things in other words we want to compare what's happening on the whole survival <pause dur="0.7"/> it is actually very common in medicine to look at <pause dur="0.5"/> survival up to thirty days after an operation <pause dur="0.7"/> or survival up to one year <pause dur="0.4"/> that's certainly convenient as a summary <pause dur="0.3"/> it's just not as <pause dur="0.4"/> informative as it could be <pause dur="0.5"/> as a test <pause dur="0.6"/> so yesterday # # <pause dur="0.2"/> one of the medical professors was talking

about survival after <pause dur="0.2"/> a difficult operation <pause dur="0.5"/> and that was expressed in terms <pause dur="0.5"/> of survival to thirty days for <pause dur="0.2"/> general discussion <pause dur="0.7"/> but the formal tests were done in terms of complete survival curves <pause dur="1.1"/> what we actually do is we again think of at about each point <pause dur="1.3"/> <kinesic desc="writes on board" iterated="y" dur="24"/> so at each <pause dur="2.2"/> time <pause dur="1.4"/> T-I <pause dur="1.5"/> at which a death occurs <pause dur="0.8"/> at which at least i should say <pause dur="2.3"/> one death occurs <pause dur="7.2"/> okay and that's one death occurs in either group <pause dur="4.8"/> what we do is we <pause dur="0.2"/> <kinesic desc="writes on board" iterated="y" dur="7"/> form <pause dur="1.8"/> a two by two <pause dur="0.3"/> table <pause dur="3.4"/> okay i mean in reality we don't form the whole table but that's what we're doing conceptually <pause dur="1.7"/> and what does that table look like <pause dur="2.2"/> <kinesic desc="writes on board" iterated="y" dur="6"/> died <pause dur="2.6"/> not dead <pause dur="12.0"/> <kinesic desc="writes on board" iterated="y" dur="7"/> group one <pause dur="2.9"/> group two <pause dur="3.2"/> and i'm <pause dur="0.2"/> probably going to swap to a yeah i'm going to swap to a shorthand notion <pause dur="0.6"/> drop the T so let's just make that <pause dur="0.5"/><kinesic desc="writes on board" iterated="y" dur="1"/> D-one-I <pause dur="1.1"/> which

can of course now be zero because just 'cause somebody's died in group one they don't have to have died in group <pause dur="0.2"/> two <pause dur="0.2"/> as i showed <pause dur="1.2"/><kinesic desc="writes on board" iterated="y" dur="13"/> D-one-I D-two-I <pause dur="1.6"/> and this is total <pause dur="1.8"/> at risk at that point so <pause dur="1.6"/> N-<pause dur="0.3"/>one-I <pause dur="0.7"/> N-<pause dur="0.3"/>two-I <pause dur="12.5"/> two by two tables you've seen before <pause dur="0.8"/> and the standard way of dealing with those is look <pause dur="0.4"/> doing an observed minus expected comparison <pause dur="0.8"/> so <pause dur="2.5"/> the <pause dur="0.4"/> expected <pause dur="3.7"/> number <pause dur="2.2"/> of deaths <pause dur="2.6"/> in <pause dur="2.5"/> group <pause dur="1.5"/> one <pause dur="0.3"/> at time I <pause dur="6.0"/> <kinesic desc="writes on board" iterated="y" dur="50"/> is <pause dur="2.1"/> expected value of the random variable <pause dur="0.5"/> D-one-I <pause dur="1.4"/> we will put down as <pause dur="1.0"/> N-one-I <pause dur="1.6"/> N-one-I plus <pause dur="0.6"/> N-two-I <pause dur="1.5"/> times the total number of deaths that occur <pause dur="0.4"/> at that particular point <pause dur="10.1"/> okay that's nothing new to probably <pause dur="2.2"/> first year <pause dur="3.2"/> what i'm not sure is whether you're going to remember the variance expression for that <pause dur="2.9"/> anybody remember the variance <pause dur="9.5"/> <kinesic desc="writes on board" iterated="y" dur="4"/> might just

about regret starting writing it on that bit of the board </u><pause dur="2.7"/> <u who="sm0957" trans="pause"> is the not dead column the same as the <pause dur="0.5"/> total risk column or </u><pause dur="0.4"/> <u who="nf0955" trans="pause"> oh sorry i've <pause dur="3.9"/> <kinesic desc="writes on board" iterated="y" dur="5"/> i was just being lazy and not filling in the <trunc>to</trunc> <pause dur="0.2"/> the <pause dur="0.2"/> the whole thing <pause dur="1.2"/> <kinesic desc="indicates point on board" iterated="n"/> those were all the all those that are at risk <pause dur="0.2"/> of whom <pause dur="0.3"/> <kinesic desc="indicates point on board" iterated="n"/> those died and <kinesic desc="indicates point on board" iterated="n"/> these ones didn't die at that point <pause dur="0.6"/> you can <pause dur="9.7"/> <kinesic desc="writes on board" iterated="y" dur="15"/> could also write <pause dur="1.7"/> D-I <pause dur="0.3"/> N-I minus D-I <pause dur="0.8"/> where you're summing up over the <pause dur="0.3"/> subscripts <pause dur="15.6"/> right <pause dur="2.7"/> the variance term <pause dur="1.5"/> involves <pause dur="0.5"/> <vocal desc="clears throat" iterated="n"/> <pause dur="0.4"/> quite a lot of elements <pause dur="0.5"/> if you really want to <trunc>s</trunc> <pause dur="0.2"/> think about these tables in detail you actually land up with a

hypergeometric distribution <pause dur="0.8"/> # which would be <pause dur="0.3"/> a nice thing to set as a exam question for second year <pause dur="1.0"/> but for this year <pause dur="3.0"/> <kinesic desc="writes on board" iterated="y" dur="5"/> we're talking about N-one-I N-two-I <pause dur="1.3"/> so <pause dur="0.3"/> <kinesic desc="indicates point on board" iterated="n"/> those two <pause dur="1.6"/> multiplied by <pause dur="0.9"/> <kinesic desc="writes on board" iterated="y" dur="3"/> D-I <pause dur="0.7"/> N-I minus D-I <pause dur="1.1"/> so you're multiplying <pause dur="0.4"/> the four margins <pause dur="1.4"/> sorry <pause dur="0.2"/> yeah the four marginals together <pause dur="1.1"/> row margins <pause dur="0.5"/> and column margins <pause dur="2.0"/> <kinesic desc="writes on board" iterated="y" dur="1"/> and then what you divide by <pause dur="0.5"/> is a function of the total <pause dur="0.5"/> <kinesic desc="writes on board" iterated="y" dur="4"/> it's actually N-I-squared <pause dur="0.7"/> N-I-minus-one <pause dur="0.7"/> so if any decent sample size <pause dur="0.5"/> is just N-I-cubed <pause dur="2.3"/> and <pause dur="0.4"/> let's write that one again <pause dur="1.4"/><kinesic desc="writes on board" iterated="y" dur="1"/> think it should be fairly clear that # <pause dur="1.5"/> again it's the kind of thing you write a program for rather than <pause dur="0.6"/> enjoy doing on your calculator <pause dur="0.6"/> because this <kinesic desc="indicates point on board" iterated="n"/> is for only one time <pause dur="1.0"/> and we're going to need to think about it for a whole lot of times <pause dur="4.5"/> so in <pause dur="1.5"/> think i'll just got to

clean the board anyway i'll clean these two while you <pause dur="3.0"/> or at least one of them <pause dur="0.9"/> while you decide if you've got any questions on that <event desc="wipes board" iterated="y" dur="unknown"/></u> <gap reason="break in recording" extent="uncertain"/> <u who="nf0955" trans="pause"> okay so what we do <pause dur="0.5"/> that's the expected value for a single <pause dur="0.5"/> time point <pause dur="1.6"/> what we want to do is to <kinesic desc="writes on board" iterated="y" dur="28"/> let <pause dur="1.5"/> E-<pause dur="0.8"/>one expected for group one <pause dur="0.9"/> well it's the fairly obvious thing you're going to do you're going to sum up <pause dur="0.9"/> over <pause dur="0.9"/> all your times <pause dur="1.1"/> and <pause dur="2.1"/> use E <pause dur="1.1"/> # E-one-I <pause dur="1.3"/> where <pause dur="2.2"/> E-one-I <pause dur="0.2"/> is precisely the expected value of E <pause dur="1.3"/> D-I the <trunc>n</trunc> the expected number of deaths at each interval <pause dur="8.7"/> and similarly for <pause dur="0.7"/> E-two-I <pause dur="5.1"/> obvious thing we're going to want to do as well is have the <pause dur="0.5"/> an observed so we're going to have <pause dur="0.4"/> <kinesic desc="writes on board" iterated="y" dur="9"/> the same notation <pause dur="0.8"/> observed simply <pause dur="0.4"/> equals the <pause dur="1.6"/> number of deaths <pause dur="0.9"/> actually occurring in group one at each of those times <pause dur="4.1"/> and the other thing we're going to want is the variance <pause dur="2.8"/> <kinesic desc="writes on board" iterated="y" dur="7"/> which shouldn't come as a surprise either <pause dur="1.5"/> the variance <pause dur="1.8"/> as i said there's an independence

assumption that we make <pause dur="0.5"/> the variance is <kinesic desc="writes on board" iterated="y" dur="4"/> the sum of the <pause dur="2.3"/> variances at each point <pause dur="2.9"/> and we could of course <pause dur="1.4"/> <kinesic desc="writes on board" iterated="y" dur="6"/> use a shorthand notation <pause dur="1.2"/> just calling it <trunc>v</trunc> <pause dur="0.6"/> V-I <pause dur="0.2"/> at each point <pause dur="2.9"/> why am i calling it V-I and not V-one-I <pause dur="9.4"/> yes </u><pause dur="4.7"/> <u who="sm0958" trans="pause"> <gap reason="inaudible" extent="1 sec"/> </u><pause dur="0.6"/> <u who="nf0955" trans="pause"> 'cause if you look at <kinesic desc="indicates point on board" iterated="n"/> that <pause dur="0.2"/> if i swap those round all i'd be doing is swapping the <pause dur="0.3"/> position of the N-one and N-two it would make no difference <pause dur="1.4"/> okay so we've got <pause dur="0.7"/> an expected value an observed value <pause dur="0.2"/> a variance <pause dur="0.7"/> so the log-rank test comes back to something that <pause dur="1.3"/> <vocal desc="clears throat" iterated="n"/> <pause dur="0.7"/> would often be denoted by a Z statistic <pause dur="2.2"/> <kinesic desc="writes on board" iterated="y" dur="21"/> so the log-rank <pause dur="3.0"/> test <pause dur="0.6"/> uses <pause dur="1.2"/> Z equals <pause dur="2.0"/> observed <pause dur="0.2"/> minus expected <pause dur="1.6"/> divided by <pause dur="1.4"/> the square root <pause dur="0.2"/> of the variance <pause dur="4.6"/> and it uses that <pause dur="2.7"/><kinesic desc="writes on board" iterated="y" dur="16"/>

that's why we tend to use Z <pause dur="0.2"/> compared <pause dur="2.7"/> to the standard normal <pause dur="9.7"/> there's another way in which it's quite commonly done <pause dur="1.0"/><kinesic desc="writes on board" iterated="y" dur="19"/> alternatively <pause dur="6.4"/> Z-squared <pause dur="2.3"/> in other words something that's <pause dur="1.4"/> obviously looks like a chi-squared term <pause dur="0.9"/> O-minus-E squared <pause dur="1.8"/> over V <pause dur="1.8"/><kinesic desc="writes on board" iterated="y" dur="4"/> is compared to a <pause dur="0.8"/> chi-squared on one <pause dur="0.2"/> degree of freedom <pause dur="2.2"/> and there's a reason for mentioning <pause dur="0.3"/> that </u><gap reason="break in recording" extent="uncertain"/> <u who="nf0955" trans="pause"> it's not immediately obvious how we're going to generalize a two by two table which is quite a nice thing to say a two by three table <pause dur="1.4"/> which is why one can think of a simpler version <pause dur="0.7"/> than the log-rank <pause dur="0.7"/> the log-rank is what you would use if you've only got two groups <kinesic desc="writes on board" iterated="y" dur="10"/> but if you want to think about a generalization <pause dur="1.3"/> alternatively <pause dur="2.3"/> we can <pause dur="0.7"/> use <pause dur="2.5"/> and as i don't use this very often i definitely don't remember it <pause dur="0.8"/> we use something that requires us <kinesic desc="writes on board" iterated="y" dur="15"/> to think about E-two <pause dur="1.3"/>

which is <pause dur="1.5"/> pretty simple that's just the <pause dur="0.5"/> sum of the <pause dur="3.2"/> expected value sorry of D-<pause dur="0.9"/>two-I <pause dur="0.2"/> in the obvious notation <pause dur="5.7"/> and <pause dur="3.4"/><kinesic desc="writes on board" iterated="y" dur="8"/> if you want to write it explicitly so that's <pause dur="0.7"/> N-two-I <pause dur="0.9"/> D-I <pause dur="1.4"/> over N-I <pause dur="7.0"/><kinesic desc="writes on board" iterated="y" dur="8"/> # similarly O-two is the <pause dur="1.2"/> should really have memorized this <pause dur="0.5"/> the # <pause dur="0.4"/> obvious definition is just the sum of all the deaths <pause dur="8.5"/> the one thing about <kinesic desc="indicates point on board" iterated="n"/> that Z-<pause dur="0.4"/>squared on one degree of freedom that doesn't look completely standard is to being divided by the variance <pause dur="1.5"/> so for this one <kinesic desc="writes on board" iterated="y" dur="23"/> we just use that completely standard form use <pause dur="1.3"/> X-squared equal to <pause dur="1.3"/> E-one-minus-<pause dur="0.5"/>O-one <pause dur="0.2"/> squared <pause dur="0.8"/> over <pause dur="0.3"/> E-one <pause dur="0.2"/> plus <pause dur="1.9"/> E-two-minus-<pause dur="3.0"/>O-two squared <pause dur="1.1"/> over <pause dur="1.5"/> E-two <pause dur="2.9"/> and anybody who feels really energetic can start playing with the formulae and seeing just how different they might get <pause dur="2.6"/> we're again <kinesic desc="writes on board" iterated="y" dur="4"/> referring to <pause dur="1.2"/> a chi-squared just on one degree of freedom <pause dur="4.0"/> but what can we say about this well <pause dur="1.7"/> the disadvantage <pause dur="0.6"/> why don't we use the

simpler one <kinesic desc="writes on board" iterated="y" dur="16"/> well the disadvantage <pause dur="2.1"/> is <pause dur="1.0"/> that <pause dur="1.7"/> X-squared is <pause dur="0.5"/> conservative <pause dur="4.9"/> and by conservative <pause dur="2.2"/> i mean that if you had something that was at the borderline say over five per cent significance level <pause dur="0.8"/> if you did the log-rank test it would show as significant <pause dur="0.8"/> if you did the <pause dur="0.4"/> simpler test it would tend to show not as significant so that's what we mean by conservative <pause dur="1.0"/> but <pause dur="0.2"/> looking at that formula <pause dur="2.9"/> the advantage is well if you think of the question i asked you how do you generalize this to three groups <pause dur="4.0"/> if instead of having <pause dur="0.3"/> chemotherapy and <trunc>radio</trunc> <pause dur="0.6"/> versus chemotherapy plus radiotherapy we'd had a third group <pause dur="0.3"/> radiotherapy only <pause dur="1.5"/> how would you generalize <pause dur="0.3"/> this what's the obvious generalization </u><pause dur="5.4"/> <u who="sm0959" trans="pause"> <gap reason="inaudible" extent="1 sec"/> the term for <pause dur="0.8"/> E-three </u><pause dur="0.5"/> <u who="nf0955" trans="pause"> you just put in an E-three term or however many you like because <pause dur="0.5"/><kinesic desc="indicates point on board" iterated="n"/> these terms are now <pause dur="0.4"/> pretty obvious so the <pause dur="0.7"/> <kinesic desc="writes on board" iterated="y" dur="21"/> advantage <pause dur="4.4"/> is <pause dur="1.0"/> ease <pause dur="0.6"/> of <pause dur="1.7"/>

generalizing <pause dur="3.9"/> to <pause dur="0.7"/> N groups <pause dur="10.8"/> okay <pause dur="1.1"/> what i thought i would do i thought it might have been slightly nearer the middle rather than nearer the end of the lecture <pause dur="0.8"/> # <pause dur="1.7"/> i take it you've all got the dataset from <pause dur="0.2"/> that i <trunc>m</trunc> <trunc>m</trunc> <pause dur="0.2"/> listed last week 'cause i haven't written it down <pause dur="2.2"/> for the control group <pause dur="5.4"/><kinesic desc="writes on board" iterated="y" dur="4"/> i'll give you the first few observations not necessarily all of them i mean i'll write them all down <pause dur="0.6"/> but <pause dur="0.2"/> you don't need to copy them all down what i want you to do is to try to write down <pause dur="0.4"/> the first couple of lines of a table <pause dur="0.4"/> to calculate what you need <pause dur="0.6"/> for <pause dur="0.9"/> a log-rank test <pause dur="0.6"/> # the table i've got has got <pause dur="3.9"/> ten columns so have a think about which columns and what you're going to put into those columns <pause dur="2.3"/> 'cause that way <pause dur="0.8"/> you're more likely to remember it if i decide to put this into an exam <pause dur="4.6"/> which is the other advantage of a simple procedure <pause dur="0.6"/> put it into an exam more easily <pause dur="0.7"/> okay so in the control group the <pause dur="0.8"/> times were <pause dur="0.3"/> two <pause dur="0.5"/> three <kinesic desc="writes on board" iterated="y" dur="22"/> </u><gap reason="break in recording" extent="uncertain"/> <u who="nf0955" trans="pause">
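<!-- Editorial sketch, not part of the recording: the quantities the lecturer has just written on the board can be collected in one place. The symbols d, n, E, O, V follow the spoken notation (D-one-I, N-one-I and so on); the lower-case typesetting and the index i over distinct death times are assumptions of this note.

```latex
% Per time point i, with n_i = n_{1i} + n_{2i} at risk and d_i = d_{1i} + d_{2i} deaths
\[
  E_{1i} = \frac{n_{1i}}{n_{1i} + n_{2i}}\, d_i ,
  \qquad
  V_i = \frac{n_{1i}\, n_{2i}\, d_i\, (n_i - d_i)}{n_i^{2}\,(n_i - 1)}
\]
% Summed over all death times
\[
  E_1 = \sum_i E_{1i}, \qquad O_1 = \sum_i d_{1i}, \qquad V = \sum_i V_i
\]
% Log-rank statistic, compared to the standard normal (or its square to chi-squared on 1 d.f.)
\[
  Z = \frac{O_1 - E_1}{\sqrt{V}}, \qquad Z^{2} = \frac{(O_1 - E_1)^{2}}{V} \sim \chi^{2}_{1}
\]
% Simpler, conservative alternative for two groups, also referred to chi-squared on 1 d.f.
\[
  X^{2} = \frac{(E_1 - O_1)^{2}}{E_1} + \frac{(E_2 - O_2)^{2}}{E_2}
\]
```

V_i is unchanged when the two groups are swapped, which is why it carries no group subscript. -->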

<kinesic desc="writes on board" iterated="y" dur="16"/> and i'm <pause dur="0.3"/> planning to ask somebody to come and write up what they've thought on the board so <pause dur="0.8"/> as a just a gentle aid to actually addressing the problem <kinesic desc="writes on board" iterated="y" dur="8"/></u><gap reason="break in recording" extent="uncertain"/> <u who="nf0955" trans="pause"> <unclear>probably</unclear> give you another <pause dur="0.4"/> at least three or four minutes if not more <pause dur="1.7"/> and i probably won't ask for a volunteer <pause dur="0.3"/> probably just ask someone <pause dur="4.6"/> if i ask for a volunteer i know who's hard-working and who will probably have an answer so <pause dur="1.6"/> those of you who are quiet i've no idea how good you are </u><gap reason="break in recording" extent="uncertain"/> <u who="nf0955" trans="pause"> okay i'm going to resist the temptation to use the <trunc>advant</trunc> the fact that i know some people's names and not others <pause dur="14.9"/> okay <pause dur="0.4"/> # but what i'm going to do is go for colour so <pause dur="0.8"/> gentleman in the nice red sweatshirt <pause dur="0.8"/> <vocal desc="laugh" iterated="n"/> <pause dur="0.3"/> i think you guessed that one was <shift feature="voice" new="laugh"/>coming <shift feature="voice" new="normal"/>when i said colour <pause dur="1.3"/> <vocal desc="laugh" iterated="n" n="sm0960"/>

come on come and write down what i don't <pause dur="0.3"/> doesn't matter whether you've got it right or wrong </u><u who="sm0960" trans="overlap"> <gap reason="inaudible" extent="1 sec"/> down </u><pause dur="0.5"/> <u who="nf0955" trans="pause"> yes but you've had a discussion 'cause i can see that </u><pause dur="1.2"/> <u who="sm0960" trans="pause"> <gap reason="inaudible" extent="1 sec"/><vocal desc="laughter" iterated="y" n="ss" dur="1"/> </u><pause dur="0.8"/> <u who="nf0955" trans="pause"> <vocal desc="laugh" iterated="n"/> <pause dur="0.6"/> did anyone come up with anything <pause dur="0.3"/> about how you're going to tackle it <pause dur="3.6"/> i can start working systematically through all of you <pause dur="0.7"/> <gap reason="name" extent="1 word"/> <pause dur="2.3"/> what column headings would you have had </u><pause dur="1.3"/> <u who="sm0961" trans="pause"> you've got to look at the <pause dur="0.9"/> it has something to do with two sets of <pause dur="0.2"/> trials you have the control group and the <pause dur="2.3"/> the </u><u who="nf0955" trans="overlap"> drug group </u><u who="sm0961" trans="latching"> the drug group from last time </u><u who="nf0955" trans="pause"> yeah <pause dur="5.0"/> <kinesic desc="writes on board" iterated="y" dur="4"/> okay so what do we need to have in those <pause dur="2.5"/> # let's go to the back row <pause dur="1.2"/> what what information are we going to need to <kinesic desc="indicates point on board" iterated="n"/> have on

those <pause dur="0.7"/> to fulfil the formulae </u><pause dur="1.3"/> <u who="sf0962" trans="pause"> <gap reason="inaudible" extent="1 sec"/> </u><pause dur="2.5"/> <u who="nf0955" trans="pause"> so we're going to have the time so we're going to have the number <pause dur="0.6"/><kinesic desc="writes on board" iterated="y" dur="1"/> dead at a particular time <pause dur="1.3"/> and </u><pause dur="2.1"/> <u who="sf0962" trans="pause"> <gap reason="inaudible" extent="1 sec"/> </u><pause dur="1.1"/> <u who="nf0955" trans="pause"> total number at risk <pause dur="1.2"/><kinesic desc="writes on board" iterated="y" dur="7"/> and i should <pause dur="0.2"/> probably put in <pause dur="2.2"/> that T is one for <pause dur="3.6"/> control <pause dur="0.2"/> drug <pause dur="0.2"/> yeah <pause dur="0.2"/> so most of you worked that much out yep <pause dur="2.8"/> how are you going to work out the <trunc>ex</trunc> what do you need to work out the expected numbers in either of those groups <pause dur="2.6"/> oh <pause dur="0.8"/> i rubbed the formulae off but you've got in your notes </u><pause dur="3.9"/> <u who="sf0963" trans="pause"> total number of deaths </u><pause dur="2.7"/> <u who="nf0955" trans="pause"> <kinesic desc="writes on board" iterated="y" dur="4"/> D-T <pause dur="0.5"/> think i'll <trunc>sw</trunc> <pause dur="0.2"/> swap between T and I <pause dur="1.0"/> and therefore <unclear>so the</unclear> total number <pause dur="2.1"/><kinesic desc="writes on board" iterated="y" dur="9"/> and that will allow you to go for <pause dur="0.8"/> E-one <pause dur="0.3"/> at time T <pause dur="0.4"/> E-two at time T <pause dur="1.3"/> and the variance at time T <pause dur="1.9"/> so you're all now going to be able to remember that <pause dur="0.2"/> without

having to be told it <pause dur="0.7"/> she says cheerfully <pause dur="1.4"/> # <pause dur="1.2"/> what's the very first time you've got <pause dur="0.3"/> in the datasets we're talking about <pause dur="0.5"/> and i should actually say <pause dur="0.2"/> # <pause dur="3.6"/> <kinesic desc="writes on board" iterated="y" dur="3"/> we're starting with twenty-two people in each group what's the first time <pause dur="0.3"/> between those two datasets <pause dur="1.3"/> middle of the threesome <pause dur="1.8"/> what's the first time </u><pause dur="0.7"/> <u who="sm0964" trans="pause"> # <pause dur="0.4"/> T </u><pause dur="0.5"/> <u who="nf0955" trans="pause"> right <pause dur="0.7"/> and what do we need to fill in for the rest of that column </u><pause dur="1.2"/> <u who="sm0964" trans="pause"> sorry </u><pause dur="0.2"/> <u who="nf0955" trans="pause"> what do we need to fill in for the rest of that row </u><pause dur="1.0"/> <u who="sm0964" trans="pause"> # <pause dur="3.0"/> the deaths </u><pause dur="0.4"/> <u who="nf0955" trans="pause"> yes </u><pause dur="0.6"/> <u who="sm0964" trans="pause"> i # <pause dur="0.4"/> # </u><u who="nf0955" trans="overlap"> <shift feature="voice" new="laugh"/>which are <shift feature="voice" new="normal"/> <vocal desc="laughter" iterated="y" n="ss" dur="1"/></u><pause dur="0.2"/> <u who="sm0964" trans="pause"> # <pause dur="0.5"/> two <pause dur="0.7"/> well one in each <pause dur="2.8"/> <kinesic desc="writes on board" iterated="y" dur="1" n="nf0955"/> so two for the <pause dur="0.5"/><kinesic desc="writes on board" iterated="y" dur="1" n="nf0955"/> next one <pause dur="1.6"/> # <pause dur="0.6"/> yeah <pause dur="2.0"/> <vocal desc="laughter" iterated="y" n="ss" dur="1"/><kinesic desc="writes on board" iterated="y" dur="2" n="nf0955"/> forty-<pause dur="0.6"/>two </u><pause dur="0.3"/> <u who="nf0955" trans="pause"> it's actually forty-four it's the <trunc>tot</trunc> it's the the two together </u><u who="sm0964" 
trans="overlap"> <gap reason="inaudible" extent="1 sec"/> </u><u who="nf0955" trans="overlap"> yeah <pause dur="0.3"/>

then goes down to forty-two for here <kinesic desc="indicates point on board" iterated="n"/> <pause dur="2.5"/> <kinesic desc="writes on board" iterated="y" dur="1"/> expected <pause dur="1.0"/> it's really easy <pause dur="0.3"/> let's go to the <pause dur="0.3"/> person sitting on his own <pause dur="0.8"/> expected number of deaths in each group <pause dur="1.4"/> well you've got you've got a <trunc>f</trunc> <pause dur="0.2"/> # formula up there <kinesic desc="indicates point on board" iterated="n"/> <pause dur="1.0"/> what's the expected number of deaths in group two <pause dur="3.7"/> how many deaths in group two <pause dur="1.1"/> at <kinesic desc="indicates point on board" iterated="n"/> this time </u><pause dur="2.7"/> <u who="sm0965" trans="pause"> <gap reason="inaudible" extent="1 sec"/></u><pause dur="1.7"/> <u who="nf0955" trans="pause"> yeah <pause dur="0.5"/> one <pause dur="1.8"/><kinesic desc="writes on board" iterated="y" dur="1"/> how many deaths in total </u><pause dur="1.9"/> <u who="sm0965" trans="pause"> <gap reason="inaudible" extent="1 sec"/></u><pause dur="2.6"/> <u who="nf0955" trans="pause"> <kinesic desc="writes on board" iterated="y" dur="1"/> and how many if it was total number <pause dur="4.8"/> <kinesic desc="writes on board" iterated="y" dur="1"/> oops sorry it was total <pause dur="3.7"/> sorry i'm putting the <pause dur="2.8"/><kinesic desc="writes on board" iterated="y" dur="2"/> forty-four at the bottom and the twenty <pause dur="0.2"/> sorry <pause dur="0.6"/> total <pause dur="0.9"/> total number # <pause dur="0.3"/> <vocal desc="clears throat" iterated="n"/> at risk in one group twenty-two <pause dur="0.2"/> total number forty-four <pause dur="0.6"/> and <pause dur="3.2"/> sorry <pause dur="0.2"/> i'm <pause dur="3.3"/> going to write the answer down much more easily which is what i did rather than writing the formula down <pause dur="1.1"/> let's just

do it <pause dur="1.0"/> the way i was thinking of it <pause dur="0.7"/> the total numbers <pause dur="0.2"/> which is the way you just think <kinesic desc="indicates point on board" iterated="n"/> number in this group is a <trunc>n</trunc> proportion of the number of <kinesic desc="indicates point on board" iterated="n"/> <kinesic desc="writes on board" iterated="y" dur="2"/> that group is a half which is why i was writing a half down <pause dur="0.4"/> and the total number of deaths was <kinesic desc="writes on board" iterated="y" dur="1"/> two <pause dur="0.5"/> so the expected number has to be <pause dur="0.9"/><kinesic desc="writes on board" iterated="y" dur="2"/> equal to one <pause dur="2.6"/> this one's a really easy one 'cause there's one death in each group <pause dur="0.5"/> and the group sizes are equal so one expected one expected <pause dur="0.6"/> and the variance term <pause dur="4.7"/> <kinesic desc="writes on board" iterated="y" dur="4"/> okay the next time is <pause dur="1.1"/><kinesic desc="writes on board" iterated="y" dur="1"/> six <pause dur="3.1"/> there's only one death <pause dur="3.8"/> <kinesic desc="writes on board" iterated="y" dur="3"/> but the group sizes are equal again </u><pause dur="0.6"/> <u who="sm0965" trans="pause"> <gap reason="inaudible" extent="1 sec"/></u><pause dur="0.8"/> <u who="nf0955" trans="pause"> sorry </u><u who="sm0965" trans="overlap">

are they not three and four </u><pause dur="1.3"/> <u who="nf0955" trans="pause"> oops <pause dur="0.2"/> i'm sorry <pause dur="0.9"/> this is time three not time six <pause dur="0.5"/> <kinesic desc="writes on board" iterated="y" dur="1"/><vocal desc="laugh" iterated="n"/> thank you <pause dur="0.5"/><vocal desc="laugh" iterated="n"/> <pause dur="0.7"/> # at time three <pause dur="0.2"/> there is one death in the control group <pause dur="0.4"/> no deaths <kinesic desc="indicates point on board" iterated="n"/> there group sizes are equal <pause dur="0.8"/> <kinesic desc="writes on board" iterated="y" dur="2"/> so we can see the expecteds come in at a half each <pause dur="2.9"/> and <pause dur="0.2"/> for the rest of the table it's all <pause dur="0.2"/> written out in the handout <pause dur="2.9"/> and that <pause dur="0.4"/> wraps up the <pause dur="1.4"/> non-parametric side of survival analysis <pause dur="0.8"/> actuarial life tables <pause dur="0.5"/> Kaplan-Meier or product limit <pause dur="0.8"/> log-rank to compare survival curves so on Monday we'll go over to parametric <pause dur="0.5"/> methods for survival <pause dur="1.0"/> so <pause dur="0.3"/> any questions and the handout is at the front so you don't need to worry about <pause dur="0.6"/> copying down this 'cause it's all on the handout
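<!-- Editorial sketch, not part of the recording: the table the class builds could be computed along the following lines. The function name, the dictionary keys and the constant FIRST_TWO_ROWS are hypothetical; only the first two rows (times two and three, starting from twenty-two at risk in each group) are stated in the lecture, and the remaining rows are on the handout and are not reproduced here.

```python
def logrank_rows(table):
    """Build the per-time-point columns needed for a log-rank test.

    `table` is a list of (time, d1, d2, n1, n2) tuples: deaths and numbers
    at risk in group one (control) and group two (drug) at each death time.
    """
    rows = []
    for time, d1, d2, n1, n2 in table:
        d = d1 + d2          # total deaths at this time
        n = n1 + n2          # total at risk at this time
        e1 = n1 * d / n      # expected deaths in group one
        e2 = n2 * d / n      # expected deaths in group two
        # Hypergeometric variance: product of the four margins over n^2 (n - 1).
        v = n1 * n2 * d * (n - d) / (n ** 2 * (n - 1)) if n > 1 else 0.0
        rows.append({"time": time, "d1": d1, "d2": d2, "n1": n1, "n2": n2,
                     "d": d, "n": n, "e1": e1, "e2": e2, "v": v})
    return rows


# First two rows as worked through in the lecture (assumed layout).
FIRST_TWO_ROWS = [
    (2, 1, 1, 22, 22),   # one death in each group, 44 at risk in total
    (3, 1, 0, 21, 21),   # one control death, 42 at risk in total
]

if __name__ == "__main__":
    for row in logrank_rows(FIRST_TWO_ROWS):
        print(row["time"], row["e1"], row["e2"], round(row["v"], 4))
```

Summing the e1, d1 and v columns over the full handout table would give the E, O and V that go into the Z statistic. -->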

</u></body>

</text></TEI.2>