<?xml version="1.0"?>

<!DOCTYPE TEI.2 SYSTEM "base.dtd">




<title>Analysis of experimental data 5</title></titleStmt>

<publicationStmt><distributor>BASE and Oxford Text Archive</distributor>


<availability><p>The British Academic Spoken English (BASE) corpus was developed at the

Universities of Warwick and Reading, under the directorship of Hilary Nesi

(Centre for English Language Teacher Education, Warwick) and Paul Thompson

(Department of Applied Linguistics, Reading), with funding from BALEAP,

EURALEX, the British Academy and the Arts and Humanities Research Board. The

original recordings are held at the Universities of Warwick and Reading, and

at the Oxford Text Archive and may be consulted by bona fide researchers

upon written application to any of the holding bodies.

The BASE corpus is freely available to researchers who agree to the

following conditions:</p>

<p>1. The recordings and transcriptions should not be modified in any way</p>


<p>2. The recordings and transcriptions should be used for research purposes

only; they should not be reproduced in teaching materials</p>

<p>3. The recordings and transcriptions should not be reproduced in full for

a wider audience/readership, although researchers are free to quote short

passages of text (up to 200 running words from any given speech event)</p>

<p>4. The corpus developers should be informed of all presentations or

publications arising from analysis of the corpus</p><p>

Researchers should acknowledge their use of the corpus using the following

form of words:

The recordings and transcriptions used in this study come from the British

Academic Spoken English (BASE) corpus, which was developed at the

Universities of Warwick and Reading under the directorship of Hilary Nesi

(Warwick) and Paul Thompson (Reading). Corpus development was assisted by

funding from the Universities of Warwick and Reading, BALEAP, EURALEX, the

British Academy and the Arts and Humanities Research Board. </p></availability>




<recording dur="01:30:59" n="14014">


<respStmt><name>BASE team</name>



<langUsage><language id="en">English</language>



<person id="nm0658" role="main speaker" n="n" sex="m"><p>nm0658, main speaker, non-student, male</p></person>

<person id="sm0659" role="participant" n="s" sex="m"><p>sm0659, participant, student, male</p></person>

<person id="sf0660" role="participant" n="s" sex="f"><p>sf0660, participant, student, female</p></person>

<person id="sm0661" role="participant" n="s" sex="m"><p>sm0661, participant, student, male</p></person>

<person id="sm0662" role="participant" n="s" sex="m"><p>sm0662, participant, student, male</p></person>

<person id="sm0663" role="participant" n="s" sex="m"><p>sm0663, participant, student, male</p></person>

<person id="sm0664" role="participant" n="s" sex="m"><p>sm0664, participant, student, male</p></person>

<person id="sm0665" role="participant" n="s" sex="m"><p>sm0665, participant, student, male</p></person>

<person id="sm0666" role="participant" n="s" sex="m"><p>sm0666, participant, student, male</p></person>

<person id="sm0667" role="participant" n="s" sex="m"><p>sm0667, participant, student, male</p></person>

<person id="sf0668" role="participant" n="s" sex="f"><p>sf0668, participant, student, female</p></person>

<person id="sm0669" role="participant" n="s" sex="m"><p>sm0669, participant, student, male</p></person>

<person id="sm0670" role="participant" n="s" sex="m"><p>sm0670, participant, student, male</p></person>

<person id="sm0671" role="participant" n="s" sex="m"><p>sm0671, participant, student, male</p></person>

<person id="sm0672" role="participant" n="s" sex="m"><p>sm0672, participant, student, male</p></person>

<person id="sf0673" role="participant" n="s" sex="f"><p>sf0673, participant, student, female</p></person>

<person id="sf0674" role="participant" n="s" sex="f"><p>sf0674, participant, student, female</p></person>

<person id="sm0675" role="participant" n="s" sex="m"><p>sm0675, participant, student, male</p></person>

<person id="sm0676" role="participant" n="s" sex="m"><p>sm0676, participant, student, male</p></person>

<person id="sf0677" role="participant" n="s" sex="f"><p>sf0677, participant, student, female</p></person>

<person id="sm0678" role="participant" n="s" sex="m"><p>sm0678, participant, student, male</p></person>

<person id="sm0679" role="participant" n="s" sex="m"><p>sm0679, participant, student, male</p></person>

<person id="sm0680" role="participant" n="s" sex="m"><p>sm0680, participant, student, male</p></person>

<person id="sm0681" role="participant" n="s" sex="m"><p>sm0681, participant, student, male</p></person>

<person id="sm0682" role="participant" n="s" sex="m"><p>sm0682, participant, student, male</p></person>

<person id="sf0683" role="participant" n="s" sex="f"><p>sf0683, participant, student, female</p></person>

<person id="sm0684" role="participant" n="s" sex="m"><p>sm0684, participant, student, male</p></person>

<person id="sm0685" role="participant" n="s" sex="m"><p>sm0685, participant, student, male</p></person>

<person id="sm0686" role="participant" n="s" sex="m"><p>sm0686, participant, student, male</p></person>

<person id="sf0687" role="participant" n="s" sex="f"><p>sf0687, participant, student, female</p></person>

<person id="sm0688" role="participant" n="s" sex="m"><p>sm0688, participant, student, male</p></person>

<person id="sm0689" role="participant" n="s" sex="m"><p>sm0689, participant, student, male</p></person>

<personGrp id="ss" role="audience" size="m"><p>ss, audience, medium group </p></personGrp>

<personGrp id="sl" role="all" size="m"><p>sl, all, medium group</p></personGrp>

<personGrp role="speakers" size="34"><p>number of speakers: 34</p></personGrp>





<item n="speechevent">Lecture</item>

<item n="acaddept">Applied Statistics</item>

<item n="acaddiv">ps</item>

<item n="partlevel">PG</item>

<item n="module">Statistics for agricultural researchers</item>




<u who="nm0658"> this is the <pause dur="0.2"/> last of five sessions <pause dur="0.5"/> which are <pause dur="0.3"/> specially for your group <pause dur="0.4"/> to look at the design analysis of experiments <pause dur="1.0"/> # from next week <pause dur="0.6"/> we <pause dur="0.2"/> are going back to the general group <pause dur="0.5"/> and <pause dur="0.2"/> we're going to look at advanced methods <pause dur="0.4"/> which i think are equally applicable <pause dur="0.5"/> for everybody <pause dur="0.3"/> so we're then going to look at advanced modelling ideas <pause dur="0.5"/> which i think are the same sort of models you need to consider <pause dur="0.3"/> whether you're doing a survey or an experiment <pause dur="0.9"/> the two main problems we will tackle there <pause dur="0.3"/> will be <pause dur="0.2"/> what happens if your design is unbalanced <pause dur="1.1"/> and some experiments are unbalanced and most surveys <pause dur="0.5"/> have unbalanced data <pause dur="1.3"/> and the second thing we will tackle <pause dur="0.2"/> is what happens <pause dur="0.3"/> when your data may not come from a normal distribution <pause dur="0.7"/> the traditional statistics <pause dur="0.3"/> says that <pause dur="0.3"/> if your data come from a normal distribution then everything is fine <pause dur="0.4"/> and if they don't come from a normal distribution <pause dur="0.4"/> then <pause dur="0.2"/> you first panic <pause dur="0.4"/> and then you transform your data and then you hope your panic is over <pause dur="0.7"/> and there are

modern methods now <pause dur="0.4"/> where you can analyse data from a non-normal distribution <pause dur="0.5"/> much more flexibly <pause dur="0.3"/> than was possible before <pause dur="0.6"/> so that's what we're going to do together <pause dur="0.4"/> with <pause dur="0.2"/> the # <pause dur="0.2"/> students from wildlife management <pause dur="0.4"/> # and also from vegetation <pause dur="0.3"/> surveys <pause dur="1.3"/> so what i want to do today <pause dur="0.3"/> is to <pause dur="0.2"/> review <pause dur="0.2"/> the ideas <pause dur="0.3"/> # of experimental design <pause dur="0.3"/> and analysis <pause dur="0.2"/> and then go through one more advanced topic <pause dur="0.4"/> which is a very common topic and which is the subject of repeated measures <pause dur="0.4"/> so that's where <pause dur="0.3"/> you measure repeatedly on the same unit <pause dur="0.4"/> be it the same animal <pause dur="0.2"/> the same tree <pause dur="0.3"/> or the same plot <pause dur="0.3"/> you go back to it perhaps repeatedly through the season <pause dur="1.0"/> and what i want you to do <pause dur="0.3"/> with this <pause dur="0.3"/> single example of a more advanced method <pause dur="0.4"/> is to see whether the weapons you have learned through the last <pause dur="0.3"/> four weeks <pause dur="0.4"/> how much can they help <pause dur="1.0"/> because what i would like you to get to by the end of this course <pause dur="0.9"/> is <pause dur="1.1"/> # so that the subject of statistics is not always expanding <pause dur="0.6"/> you've got to give yourself a framework <pause dur="0.6"/> so that when

you have a new problem <pause dur="0.4"/> you say well what parts of the problem are new <pause dur="0.2"/> and what parts can fit into the the subject that i already know <pause dur="0.4"/> and i want to show you <pause dur="0.2"/> that although repeated measures appears to be a new subject <pause dur="0.5"/> once you understand some of the basic ideas we've tried to cover <pause dur="0.5"/> then <pause dur="0.2"/> there are only a few <pause dur="0.2"/> extra <pause dur="0.3"/> changes <pause dur="0.3"/> to be able to follow the ways you could analyse a repeated measures example <pause dur="1.6"/> so <pause dur="1.0"/> what i'm going to do this morning <pause dur="0.8"/> is <pause dur="0.5"/><kinesic desc="changes slide" iterated="n"/> to review some of the ideas of designing experiments <pause dur="1.0"/> bringing in <pause dur="0.2"/> what Genstat can do to help you <pause dur="0.8"/> look very quickly at data management <pause dur="0.7"/> # and then go into <pause dur="0.3"/> as part of the analysis <pause dur="0.2"/> how to analyse a repeated measures experiment <pause dur="3.5"/><vocal desc="cough" iterated="n"/><pause dur="1.6"/><kinesic desc="changes slide" iterated="n"/> you should have just two handouts <pause dur="0.3"/> although it says here you have three <pause dur="0.2"/> i couldn't think of anything extra to write for the worksheet so this is the one week where there isn't a worksheet <pause dur="0.3"/> so you have a lecture note <pause dur="0.2"/> and you have the practical exercise for <pause dur="0.4"/> the practical <pause dur="0.3"/> # <pause dur="0.3"/> which is as usual in the Met department at

eleven o'clock <pause dur="6.6"/><vocal desc="cough" iterated="n"/><pause dur="0.2"/> now we're going to look at <pause dur="0.4"/> the ideas of designing experiment <pause dur="0.7"/><kinesic desc="changes slide" iterated="n"/> and <pause dur="0.4"/> i've kept repeating <pause dur="0.3"/> that you start with your objectives <pause dur="0.2"/> they lead to the treatments <pause dur="1.0"/> the way you're doing your experiment leads to the layout <pause dur="0.2"/> you must decide on your measurements and we've now moved for the last four weeks to do the analysis <pause dur="0.4"/> and we're going to consider these components in turn yet again <pause dur="0.6"/> and see at the same time how Genstat can assist <pause dur="0.3"/> with the randomization of an experiment <pause dur="5.0"/><kinesic desc="changes slide" iterated="n"/> so <pause dur="0.3"/> the menu that you're going to use <pause dur="0.3"/> here we see Genstat it's not very clear but it's probably clearer on your slide <pause dur="0.5"/> # <pause dur="0.2"/> that <pause dur="0.2"/> you're on the usual stats menu <pause dur="0.8"/> but instead of going <pause dur="0.2"/> to analysis of variance you go up one <pause dur="0.8"/> to an option called design <pause dur="1.0"/> and there are a variety of things you can do <pause dur="0.2"/> and all we're going to do <pause dur="0.4"/> is look at one or two standard <pause dur="0.3"/> designs <pause dur="1.3"/> and we're going to use them first to show you how to randomize an experiment <pause dur="0.6"/> but also to review some of the ideas of different experiments <pause dur="5.6"/><kinesic desc="changes slide" iterated="n"/> here's the

sort of menu you get <pause dur="1.6"/> and almost all the experiments that we've been discussing <pause dur="0.4"/> you can randomize <pause dur="0.3"/> using this menu <pause dur="1.0"/> for the very simplest experiment you might consider <pause dur="0.2"/> is a one-way design where you've just got treatments <pause dur="0.6"/> and the <pause dur="0.5"/> experiment is laid out in randomized blocks <pause dur="1.2"/> so then in this menu <pause dur="0.2"/> you have to say <pause dur="0.2"/> what do i call the blocks <pause dur="0.2"/> we've just left it with an A block <pause dur="0.3"/> and how many blocks are there <pause dur="1.3"/> what are we going to call <pause dur="0.5"/> the units well we'll call them plots <pause dur="0.7"/> and what are we going to call the treatment factor which might be variety here i've just called it treat <pause dur="0.5"/> and how many levels </u><pause dur="1.1"/><u who="sm0659" trans="pause"> one question please <pause dur="0.3"/> what is the criteria for one-way or two-way <pause dur="0.2"/> two-ways design </u><pause dur="1.7"/> <u who="nm0658" trans="pause"> you mean this </u><u who="sm0659" trans="overlap"> <gap reason="inaudible" extent="1 word"/> yeah what will be the criteria for this to choose one-way or <pause dur="0.2"/> <gap reason="inaudible" extent="1 sec"/> </u><u who="nm0658" trans="overlap"> two-way designs </u><u who="sm0659" trans="overlap"> yeah i believe there must be a two-way </u><u who="nm0658" trans="latching"> this is where <pause dur="0.3"/> on the second component <pause dur="1.0"/> i've said that <pause dur="0.2"/> you first think of the treatments <pause dur="0.8"/> if you're <pause dur="0.4"/> if you have one treatment factor <pause dur="0.7"/> like variety <pause dur="0.9"/> that is a one-way design <pause dur="1.3"/> if you have two treatment factors <pause dur="0.7"/> let us say <pause dur="0.2"/> three varieties <pause dur="0.4"/> and <pause dur="0.3"/> two levels of fertility <pause dur="1.0"/> 
that would be two treatment factors <pause dur="1.3"/> and then you would have <pause dur="0.3"/> a two-way

design <pause dur="0.3"/> in randomized blocks </u><pause dur="0.6"/> <u who="sm0659" trans="pause"> <gap reason="inaudible" extent="1 sec"/> different ways for the </u><pause dur="0.4"/> <u who="nm0658" trans="pause"> well when you have </u><u who="sm0659" trans="overlap"> bias in the <pause dur="0.3"/> experiment </u><pause dur="0.8"/> <u who="nm0658" trans="pause"> when you have a one-way design it's simple <pause dur="0.5"/> whether you call it a factor or a treatment it's just the same thing <pause dur="0.7"/> when you have a two-way design take the example i've given you <pause dur="0.5"/> where you have <pause dur="0.2"/> three varieties and two levels of fertility <pause dur="0.5"/> you can either think of this <pause dur="0.2"/> as a two-way design <pause dur="0.8"/> for the treatments <pause dur="0.8"/> or you can think of it <pause dur="0.2"/> as a one-way design <pause dur="0.5"/> where that one way has six treatments <pause dur="0.7"/> because it's you can think of it as three by two <pause dur="0.3"/> or you can think of it as the six <pause dur="0.3"/> and it's up to you <pause dur="0.5"/> whether you want to think of it <pause dur="0.3"/> as just six treatments <pause dur="0.4"/> or whether you want to say well i would like to know right from the beginning <pause dur="0.3"/> the layout of my experiment in terms of both factors <pause dur="0.8"/> so when it's more complicated than a one-way design <pause dur="0.3"/> you can choose whether to take all combinations of the treatment factors <pause dur="0.2"/> and call them one big <pause dur="0.8"/> factor with all those levels <pause dur="0.3"/> or <pause dur="0.2"/> to split it up <pause dur="0.3"/> by the <pause dur="0.5"/> components <pause dur="2.7"/><vocal desc="cough" iterated="n"/></u><pause dur="0.3"/><u who="sf0660" trans="pause"> do you ever have three-way design </u><pause dur="0.4"/> <u who="nm0658" trans="pause"> ooh

lots of times </u><pause dur="0.6"/> <u who="sf0660" trans="pause"> so it's like # <pause dur="1.4"/> on the # <pause dur="0.3"/> # dialogues within <gap reason="inaudible" extent="1 sec"/></u><u who="nm0658" trans="overlap"> yes </u><u who="sf0660" trans="overlap"> and you just have one and two-way <pause dur="0.3"/> and then generalize <pause dur="0.8"/> the model </u><pause dur="0.3"/> <u who="nm0658" trans="pause"> this is this parallels here <pause dur="0.2"/> what you would have for the analysis </u><u who="sf0660" trans="latching"> yeah </u><pause dur="0.4"/> <u who="nm0658" trans="pause"> and <pause dur="1.2"/> you can either choose <pause dur="0.2"/> to do the analysis <pause dur="0.2"/> thinking what is my design let me give it a name <pause dur="0.7"/> or and i prefer <pause dur="0.4"/> to say it's a general design </u><u who="sf0660" trans="latching"> mm </u><u who="nm0658" trans="latching"> how many treatment factors do i have <pause dur="0.6"/> and this menu system <pause dur="0.9"/> because it's for simple designs only <pause dur="0.2"/> it only allows you up to five factors but five is probably enough <pause dur="0.2"/> so <pause dur="0.3"/> if you go to the general <pause dur="0.6"/> then it would say how many treatment factors do you have <pause dur="0.4"/> and you could have up to five different treatment factors <pause dur="2.8"/> there's no limit in Genstat on the number of treatment factors <pause dur="0.3"/> but we find most experiments <pause dur="0.3"/> five is enough <pause dur="0.2"/> i would like one of the main bits of work i did when i was working in # <pause dur="0.4"/> in Niger was to try and encourage people who only had <pause dur="0.4"/> experiments with one or two factors <pause dur="0.6"/> # to include more factors <pause dur="6.4"/> once you click on okay <pause dur="1.4"/><kinesic desc="changes slide" iterated="n"/> this is what you will get <pause dur="1.5"/> so you will now get <pause dur="0.3"/> a Genstat spreadsheet <pause 
dur="1.4"/> and <pause dur="0.5"/> you can see here <pause dur="0.4"/> # let me return to

this previous slide <pause dur="0.5"/><kinesic desc="changes slide" iterated="n"/> just to remind you what i did <pause dur="2.4"/> this particular example <pause dur="0.7"/> i have <pause dur="0.5"/> four blocks <pause dur="0.6"/> and <pause dur="0.2"/> three levels of the treatment <pause dur="0.6"/> so it's a simple design i have twelve plots so there's a twelve down at the bottom <pause dur="0.5"/> which Genstat works out automatically <pause dur="0.2"/> just multiplying the four by the three <pause dur="0.2"/> that this was an experiment with twelve plots <pause dur="1.9"/> so when i click on okay <pause dur="1.0"/> it will produce the randomization <pause dur="0.3"/> and automatically <pause dur="0.3"/> you will now get <pause dur="3.7"/><kinesic desc="changes slide" iterated="n"/> i've called it data but it's really just the design <pause dur="0.5"/> you will now get <pause dur="0.9"/> this structure <pause dur="0.2"/> in a Genstat spreadsheet <pause dur="3.1"/> now i want you to notice and we will emphasize this later on <pause dur="0.6"/> that <pause dur="0.5"/> this way of laying out the data <pause dur="1.0"/> is part of the simple data management <pause dur="0.7"/> and i've seen one or two examples this term <pause dur="0.3"/> because <pause dur="0.2"/> you are all addicts of Excel <pause dur="0.8"/> you will find that Excel gives you too much freedom in laying out your data <pause dur="0.3"/> and that can cause lots of problems later on <pause dur="0.7"/> so i emphasize what we discussed in <pause dur="0.3"/> session three <pause dur="0.6"/> that <pause dur="0.2"/> when you have twelve plots in an experiment <pause dur="0.4"/> the

layout of the data for a statistics package <pause dur="0.3"/> is to have <pause dur="0.2"/> twelve rows of data <pause dur="0.2"/> one for each plot <pause dur="0.6"/> you will have the label for the plot <pause dur="0.6"/> and you will then <pause dur="0.3"/> have another column which says which block <pause dur="0.9"/> am i using <pause dur="0.4"/> which <pause dur="0.4"/> plot do i have and which treatment <pause dur="0.3"/> and these treatments have been randomized <pause dur="0.9"/> so there's your randomization <pause dur="0.9"/> so this is your data laid out <pause dur="0.2"/> in the sort of field order <pause dur="0.3"/> ready <pause dur="0.7"/> to <pause dur="0.2"/> add further columns if you like with your measurements once you take them <pause dur="0.9"/> and so <pause dur="0.2"/> this is all ready <pause dur="0.2"/> to be a data collection form <pause dur="1.7"/> and this is in Genstat <pause dur="1.3"/> so the next part of your strategy you see <pause dur="0.3"/> is to say well <pause dur="0.3"/> after <pause dur="0.3"/> i haven't got any measurements yet because i'm just designing my experiment <pause dur="0.3"/> but when i've got my measurements <pause dur="0.2"/> they will make more columns here <pause dur="0.9"/> do i want to enter these measurement data <pause dur="0.5"/> into <pause dur="0.3"/> Genstat or into Excel <pause dur="1.0"/> i can choose <pause dur="0.7"/> so if you choose to save <pause dur="0.2"/> this as an Excel file which i did <pause dur="0.5"/> then <pause dur="0.2"/> here is the same information <pause dur="0.2"/> saved in Excel <pause dur="1.3"/> so now as you see you've got <pause dur="0.3"/> A B C and D in Excel <pause dur="0.3"/> and

once you measure a few things <pause dur="0.2"/> the height of the plants or the mean height of the plants <pause dur="0.3"/> # the yield and various things <pause dur="0.2"/> those just become more columns <pause dur="0.3"/> which you can enter <pause dur="0.4"/> into <pause dur="0.4"/> back into Genstat for the analysis <pause dur="2.5"/> and you will remember from the # <pause dur="0.2"/> session on management <pause dur="0.2"/> we have said <pause dur="0.7"/> please enter your data straight from the field in the randomized order <pause dur="0.5"/> and now you see that's a very easy routine <pause dur="0.7"/> you can start in your stats package <pause dur="0.2"/> to randomize your experiment <pause dur="0.7"/> if you choose to do your data entry in Excel then that's fine <pause dur="0.6"/> you export <pause dur="0.2"/> all the details <pause dur="0.3"/> before the season <pause dur="0.4"/> here <pause dur="0.6"/> you now collect data and as you collect the data <pause dur="0.4"/> you just type in your extra columns of data <pause dur="0.3"/> and your data are ready entered <pause dur="1.1"/>
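<!-- Editorial sketch, not part of the recording. The randomized block layout the speaker describes (four blocks, three treatment levels, twelve plots, one row per plot in field order) could be reproduced roughly as follows. This is plain Python, not Genstat; the function name randomized_blocks, the column names and the seed value are assumptions made for illustration, and numbering plots within each block is also an assumption.

```python
import random


def randomized_blocks(n_blocks, n_treatments, seed=None):
    """Return one row per plot, in field order, as (block, plot, treat)."""
    rng = random.Random(seed)  # a fixed seed regenerates the same layout
    rows = []
    for block in range(1, n_blocks + 1):
        treatments = list(range(1, n_treatments + 1))
        rng.shuffle(treatments)  # randomize treatments separately in each block
        for plot, treat in enumerate(treatments, start=1):
            rows.append((block, plot, treat))
    return rows


# Four blocks by three treatment levels gives twelve rows, one per plot.
layout = randomized_blocks(n_blocks=4, n_treatments=3, seed=12345)
print("block plot treat")
for block, plot, treat in layout:
    print(block, plot, treat)
```

Measurements taken later (height, yield and so on) would simply become further columns alongside these three, which matches the one-row-per-plot layout described in the lecture.
-->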

and in many of the courses that i give <pause dur="0.4"/> i try and encourage scientists <pause dur="0.3"/> not to enter data at the end of the season when they've got everything collected <pause dur="0.5"/> but if <pause dur="0.5"/> after <pause dur="0.5"/> a few weeks <pause dur="0.2"/> you've measured <pause dur="0.3"/> the data fifty per cent flowering <pause dur="0.3"/> then the day you measure it <pause dur="0.2"/> you just type it in <pause dur="0.3"/> it's very quick <pause dur="1.1"/> and so it's there in your field book <pause dur="0.3"/> and in it goes same day <pause dur="0.5"/> and then if there's a problem <pause dur="0.4"/> you can <pause dur="0.3"/> probably have a look very quickly and you've seen how easy it is <pause dur="0.3"/> to having <pause dur="0.5"/> even if you enter in Excel <pause dur="0.3"/> to then get your data into Genstat for the analysis <pause dur="0.4"/> so the very day you collect your data <pause dur="0.3"/> there is no problem doing your first analysis <pause dur="0.6"/> and the procedure works <pause dur="0.4"/> very nicely <pause dur="3.3"/> any <pause dur="0.2"/> questions so far <pause dur="1.3"/><vocal desc="cough" iterated="n"/><pause dur="2.6"/><kinesic desc="changes slide" iterated="n"/> the second example <pause dur="2.8"/> this now can follow up your initial question <pause dur="0.2"/> this is now <pause dur="0.4"/> an experiment <pause dur="0.5"/> where i have got <pause dur="0.2"/> three factors <pause dur="2.1"/> so i've got three treatment factors i'm assuming irrigation with four

levels <pause dur="0.3"/> i'm assuming fertility with two levels <pause dur="0.2"/> i'm assuming variety with three levels <pause dur="1.5"/> so i'm assuming a much more complicated experiment <pause dur="0.5"/> # how many treatments </u><pause dur="2.8"/><u who="sm0661" trans="pause"> twenty-four </u><pause dur="0.3"/> <u who="nm0658" trans="pause"> twenty-four treatments <pause dur="0.4"/> so we're just taking <pause dur="0.5"/> we're going to have an experiment with all levels of these this would be a simple experiment <pause dur="0.3"/> which would mean the total number of treatments is twenty-four <pause dur="0.6"/> so i could enter this as a one-way <pause dur="0.4"/> design <pause dur="0.4"/> with a treatment factor going from one to twenty-four <pause dur="0.3"/> and associate each number <pause dur="0.3"/> with a combination of irrigation fertility or variety <pause dur="0.3"/> or i could choose <pause dur="0.7"/> to enter straight away and say i want to do this and <pause dur="0.4"/> # randomize it <pause dur="0.5"/> straight away for my irrigation fertility and variety <pause dur="0.9"/> i've decided to have four replicates <pause dur="1.1"/> and i assume also i've decided to have a split plot design <pause dur="2.5"/> and in my split plot design i'm going to have my level of irrigation and fertility on the main plots <pause dur="0.8"/> and i'm going to have my variety on the subplots <pause dur="1.0"/> i assume i've decided this <pause dur="0.3"/> maybe you decide this in

conjunction with a statistician <pause dur="1.1"/> before <pause dur="0.3"/> you come to the experiment <pause dur="0.9"/> now you decide <pause dur="0.2"/> that's the design i would like to try <pause dur="1.4"/><vocal desc="cough" iterated="n"/><pause dur="0.2"/> so how many <pause dur="0.5"/> subplots in this experiment </u><pause dur="0.9"/> <u who="sm0662" trans="pause"> <gap reason="inaudible" extent="1 word"/></u><pause dur="1.4"/> <u who="nm0658" trans="pause"> how many how many plots altogether <pause dur="2.6"/> you've already said there are twenty-four treatments </u><pause dur="3.9"/><u who="sm0663" trans="pause"> ninety-six </u><pause dur="0.3"/> <u who="nm0658" trans="pause"> there are ninety-six <pause dur="0.4"/> plots <pause dur="0.4"/> just twenty-four by four <pause dur="0.5"/> because i've got four replicates <pause dur="0.7"/> and how many main plots </u><pause dur="10.2"/><u who="sm0664" trans="pause"> eighteen </u><u who="sm0665" trans="latching"> thirty-two </u><u who="sm0666" trans="overlap"> sixteen </u><pause dur="0.7"/> <u who="nm0658" trans="pause"> sixteen <pause dur="0.3"/> let's have a look <pause dur="6.8"/> well the corresponding design <pause dur="0.2"/> dialogue <pause dur="3.3"/><kinesic desc="changes slide" iterated="n"/> this is an answer to your question now can you have more than two i now go to the general design <pause dur="0.3"/> i've chosen to make it split plot <pause dur="0.3"/> so i'm going to the general split plot design <pause dur="1.0"/> and it will now say to me how many treatment factors do you have <pause dur="0.3"/> so i must understand <pause dur="0.6"/> these questions what do i mean by a treatment factor <pause dur="0.2"/> which i hope <pause dur="0.3"/> you understand quite clearly from the course <pause dur="1.4"/> and how many of those treatment factors <pause dur="0.2"/> are on the subplots <pause dur="1.8"/> well my design that i was considering had two treatment factors on the main plot <pause 
dur="0.2"/> and one on the subplots <pause dur="1.8"/> how many replicates

four <pause dur="0.4"/> on my main i'm going to call the main plot factor M plot i'm going to call the subplot factor subplot <pause dur="0.9"/> whole plot treatment factor one i'm going to say is irrigate <pause dur="0.8"/> whole plot treatment factor two i'm going to say is fertility <pause dur="1.2"/> with those number of levels <pause dur="0.3"/> and subplot treatment factor one variety has three levels </u><pause dur="0.9"/> <u who="sm0659" trans="pause"> why is that </u><pause dur="0.7"/> <u who="nm0658" trans="pause"> sorry </u><pause dur="0.2"/> <u who="sm0659" trans="pause"> why is that </u><pause dur="0.5"/> <u who="nm0658" trans="pause"> i chose that <pause dur="0.6"/> i decided that was my experiment you can have any number <pause dur="2.0"/> so <pause dur="0.2"/> you have to decide how many varieties do you have to compare </u><pause dur="1.4"/><u who="sm0659" trans="pause"> # <pause dur="0.3"/> why did you call those subplots <gap reason="inaudible" extent="1 sec"/> variety </u><pause dur="1.5"/> <u who="nm0658" trans="pause"> because i <pause dur="0.3"/> thought that that would be a factor <pause dur="0.6"/> which <pause dur="0.3"/> i didn't need big plots for <pause dur="0.4"/> that's part of the design process <pause dur="0.3"/> you have to <pause dur="0.9"/> you have to decide <pause dur="0.3"/> do you need the same size plots for all your factors <pause dur="1.8"/> if you don't <pause dur="0.7"/> then <pause dur="0.6"/> are there any factors that need larger plots than the others <pause dur="0.3"/> here irrigation <pause dur="0.2"/> obviously often needs large plots <pause dur="1.3"/> sometimes fertility <pause dur="0.7"/> levels need large plots because you get leaching from one plot to another <pause dur="0.9"/> and often varieties <pause dur="0.4"/> you can have quite small plots <pause dur="0.2"/> breeders have

very small plots <pause dur="0.8"/> so i'm assuming <pause dur="0.2"/> that <pause dur="0.2"/> either because of your expertise or in discussion with a statistician <pause dur="0.3"/> you decide that <pause dur="0.2"/> you can get away with small plots for variety <pause dur="0.3"/> but you need larger plots for here and that's why you chose a split plot design <pause dur="1.1"/> if that had not been the case <pause dur="0.4"/> then as we've said before <pause dur="0.7"/> i would recommend that you don't have a split plot design you've have a randomized plot design <pause dur="0.2"/> and then you <pause dur="0.4"/> you'd just have a different menu <pause dur="0.4"/> and you just say these are the three factors <pause dur="2.3"/> and what i want to show you in a minute is that we'll come and check <pause dur="0.4"/> well was it a good idea <pause dur="0.2"/> our design </u><pause dur="1.5"/> <u who="sf0660" trans="pause"> can i just ask what's the randomization seed is that the way of generating random numbers </u><pause dur="1.2"/> <u who="nm0658" trans="pause"> that's correct # yes i haven't explained these things down at the bottom <pause dur="0.4"/> the randomization seed <pause dur="0.6"/> means that that's the point at which it starts its random number generation <pause dur="1.0"/> so <pause dur="0.6"/> if you were <pause dur="0.3"/> to make a note of this <pause dur="1.6"/> then you could always regenerate the same randomization <pause dur="0.2"/> yet again you don't have to keep everything <pause dur="0.4"/> by <pause dur="0.6"/> typing this number in yourself so usually you allow the

computer <pause dur="0.2"/> to choose this and it's different every time <pause dur="1.2"/> but if you chose <pause dur="0.2"/> to make it the same you'd get the same randomization <pause dur="1.1"/> and that's quite a good way of keeping a record of the randomization you had <pause dur="0.3"/> without noting everything down <pause dur="0.4"/> if somebody would say please could you print me another copy of that <pause dur="0.4"/> well you can always <pause dur="0.3"/> print the resulting <pause dur="0.3"/> spreadsheet <pause dur="0.5"/> but if you said well <pause dur="0.8"/> sending the information all you have to do is fill this in this way and give that random number seed and you'll get the same <pause dur="0.2"/> randomization <pause dur="7.1"/> and this is the randomization that follows <pause dur="1.7"/><kinesic desc="changes slide" iterated="n"/> so this is the output when you press okay <pause dur="3.0"/> it goes on of course <pause dur="0.4"/> down to ninety-six plots we only have fourteen plots here <pause dur="0.4"/> but you see it gives you a column which says which block <pause dur="0.2"/> which main plot <pause dur="0.2"/> which subplot <pause dur="0.3"/> which level of irrigation which level of fertility and which variety <pause dur="0.8"/> these of course are not randomized <pause dur="1.5"/> but <pause dur="0.8"/> the this <pause dur="0.5"/> which is the level of irrigation fertility and variety <pause dur="0.4"/> that is associated <pause dur="0.2"/> is randomized <pause dur="0.6"/> to a

certain extent <pause dur="0.6"/> what you should notice <pause dur="0.4"/> is that the variety <pause dur="0.9"/> is randomized on the little plots <pause dur="0.9"/> so here's the first main plot which goes one one one <pause dur="0.8"/> while the subplots go one two three <pause dur="1.5"/> you will notice <pause dur="0.2"/> that irrigate <pause dur="0.5"/> is at the same level one <pause dur="0.2"/> for all those three <pause dur="0.4"/> because that's at the main plot level <pause dur="1.0"/> and so is fertility <pause dur="0.7"/> but the varieties <pause dur="0.3"/> go <pause dur="0.3"/> one two three <pause dur="1.2"/> because they are randomized on the subplot level <pause dur="0.8"/> and when we go to the next main plot <pause dur="0.6"/> here <pause dur="0.8"/> we have main plot two <pause dur="0.5"/> subplots one two three these two again remain the same because it's a split plot design <pause dur="0.3"/> and these three this is a complete repeat of all the values here <pause dur="0.5"/> so each main plot <pause dur="0.8"/> is a total repeat <pause dur="0.6"/> for all the levels of variety <pause dur="0.6"/> but it just is one plot from here </u><pause dur="7.8"/> <u who="sm0667" trans="pause"> <gap reason="inaudible" extent="1 sec"/> <pause dur="1.0"/> # <pause dur="2.2"/> if they're completely randomized will it be numbers one to ninety-six or </u><pause dur="0.8"/> <u who="nm0658" trans="pause"> # sorry </u><pause dur="0.3"/> <u who="sm0667" trans="pause"> if they're completely randomized </u><pause dur="0.8"/> <u who="nm0658" trans="pause"> if it were completely randomized </u><pause dur="0.2"/> <u who="sm0667" trans="pause"> mm </u><u who="nm0658" trans="overlap"> without even any replicates <pause dur="0.6"/> then it would randomize everything <pause dur="0.3"/> for the numbers one to ninety-six <pause dur="0.3"/> so you can choose <pause dur="0.3"/> do i have replicates <pause dur="0.6"/> which is a sort of blocking idea <pause dur="0.5"/> # 
<pause dur="0.2"/> do i have

treatments <pause dur="0.2"/> and how do i randomize <pause dur="0.5"/> you probably wouldn't use this menu if you just wanted a set of random numbers there's an easier way of doing that <pause dur="0.4"/> but that would be the extreme <pause dur="0.3"/> for the randomization that's right <pause dur="1.1"/> now <pause dur="1.1"/><kinesic desc="changes slide" iterated="n"/> you asked just now about this randomization seed <pause dur="0.4"/> but there are other options down at the bottom <pause dur="0.7"/> that you could also ask for <pause dur="0.9"/> and i've used one or two of these <pause dur="1.2"/> # i've asked could i have a dummy ANOVA table <pause dur="1.2"/> which might answer a bit more of your question <pause dur="0.2"/> why did i choose to put varieties at the subplot level <pause dur="0.8"/> what i would like to see occasionally is <pause dur="0.2"/> is the resulting design a sensible one for me to continue with <pause dur="1.6"/> and so i could learn a little bit about that with a dummy ANOVA table <pause dur="0.3"/> and i will show you in a minute what that is <pause dur="1.3"/> i could ask <pause dur="0.3"/> for trial ANOVA with random data <pause dur="2.7"/> what it does there <pause dur="0.3"/> please don't make too much of that the random data <pause dur="0.2"/> is only there to show you what will the results look like <pause dur="0.8"/> from this sort of experiment <pause dur="0.2"/> when you have

some data <pause dur="0.3"/> how will the results be laid out <pause dur="0.7"/> so it can show you <pause dur="0.3"/> the way in which the results will be laid out and i think that's quite useful ahead of the experiment <pause dur="0.4"/> to say is that a sensible thing so <pause dur="0.2"/> this is the sort of tool that i would like also for discussion <pause dur="0.3"/> with people such as yourself <pause dur="0.3"/> when you're saying i'm thinking of doing this sort of experiment <pause dur="0.4"/> and i can say well <pause dur="0.2"/> this is the sort of result you will get <pause dur="0.8"/> are you happy A with the design <pause dur="0.5"/> and B <pause dur="0.2"/> that you can then understand the results <pause dur="2.1"/> so let's show you what this <pause dur="0.2"/> works out <pause dur="1.3"/> because i ticked the dummy ANOVA table <pause dur="3.2"/><kinesic desc="changes slide" iterated="n"/> here i have <pause dur="0.2"/> the dummy ANOVA <pause dur="0.4"/> from Genstat <pause dur="6.1"/> i asked you how many main plots <pause dur="0.3"/> and you see <pause dur="0.2"/> this is straight print out from Genstat <pause dur="0.3"/> and here you see here is the main plot <pause dur="0.2"/> section <pause dur="2.7"/> so this is all revision <pause dur="0.3"/> but you can see <pause dur="0.2"/> that from the point of view of these two factors <pause dur="0.6"/> this is a simple experiment <pause dur="0.2"/> not with ninety-six <pause dur="0.2"/> plots <pause dur="0.5"/> but with ninety-six divided by three plots <pause dur="0.8"/> so there are <pause dur="0.3"/> thirty-two plots <pause dur="0.8"/> thirty-two main plots <pause dur="0.3"/>

which is my <pause dur="0.2"/> twenty-one plus three plus one plus three <pause dur="0.2"/> plus the three <pause dur="0.2"/> plus the extra one <pause dur="0.3"/> so here we are at the main plot section <pause dur="1.7"/> so when you want to evaluate if this was a good experiment <pause dur="0.5"/> from the point of view <pause dur="0.3"/> of the main plots <pause dur="0.6"/> you can start having a look just up here was it a good experiment to study irrigation and fertility and the interaction <pause dur="1.7"/>
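<!-- Editorial note (not part of the recording): the main plot arithmetic described here can be checked with a short sketch. It assumes the design as described in the lecture: four blocks, irrigation at four levels and fertility at two levels on the main plots, and three varieties on the subplots. The function name split_plot_df and its dictionary keys are hypothetical illustrations; this is not Genstat syntax or Genstat output, only one way of reproducing the degrees-of-freedom skeleton by hand.

```python
from math import prod


def split_plot_df(blocks, main_factors, sub_factors):
    """Degrees-of-freedom skeleton for a split plot laid out in complete blocks.

    main_factors and sub_factors map a factor name to its number of levels.
    (Hypothetical helper written for this note, not a Genstat routine.)
    """
    main_combos = prod(main_factors.values())  # treatment combinations per block
    sub_combos = prod(sub_factors.values())    # subplots inside each main plot
    main_plots = blocks * main_combos
    total_units = main_plots * sub_combos

    block_df = blocks - 1
    main_treatment_df = main_combos - 1
    # What is left among the main plots after blocks and main plot treatments
    main_residual_df = (main_plots - 1) - block_df - main_treatment_df
    # Subplot factor effects plus their interactions with the main plot treatments
    sub_treatment_df = main_combos * (sub_combos - 1)
    sub_residual_df = main_plots * (sub_combos - 1) - sub_treatment_df

    return {
        "main_plots": main_plots,
        "total_units": total_units,
        "block_df": block_df,
        "main_treatment_df": main_treatment_df,
        "main_residual_df": main_residual_df,
        "sub_treatment_df": sub_treatment_df,
        "sub_residual_df": sub_residual_df,
        "total_df": total_units - 1,
    }


# The design as randomized in the lecture (level counts taken from the transcript)
design = split_plot_df(
    blocks=4,
    main_factors={"irrigation": 4, "fertility": 2},
    sub_factors={"variety": 3},
)
for name, value in design.items():
    print(name, value)
```

Under these assumptions the sketch gives thirty-two main plots with twenty-one residual degrees of freedom in the main plot section, and ninety-five degrees of freedom in total. Calling the same function with a factor moved between main_factors and sub_factors shows how the skeleton changes without rerunning the randomization. -->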

and then when you come along to the subplot section <pause dur="1.2"/> you come along here <pause dur="0.4"/> and you have all the degrees of freedom <pause dur="0.5"/> down here <pause dur="0.2"/> ninety-five is the total because that's ninety-six minus one there's your <pause dur="0.3"/> n-minus-one <pause dur="2.1"/><vocal desc="cough" iterated="n"/><pause dur="0.6"/> i haven't given this but you'll try this in the practical <pause dur="0.3"/> Genstat also provides a sample ANOVA with random data to show you how the results will look <pause dur="0.4"/> and i find that quite useful in teaching and we're asking you to get this in the in the practical to have a look at that <pause dur="3.1"/> and the sort of question you can now answer is <pause dur="0.4"/> how would the degrees of freedom change if you moved fertilizer to the subplots <pause dur="0.9"/> if instead of having <pause dur="0.2"/> two main plot factors and one subplot factor <pause dur="0.3"/> you said well fertilizer could go on the little plots i wonder <trunc>w</trunc> how that would look <pause dur="1.1"/> and you have now two ways of doing that <pause dur="0.3"/> you can do that from first principles yourself <pause dur="1.2"/> or <pause dur="0.2"/> you could run the randomization again <pause dur="0.3"/> and move one of the factors from the main plot

level to the subplot level <pause dur="4.2"/> i hope you can see from here if we move <pause dur="0.2"/> fertilizer from here <pause dur="0.4"/> down to here <pause dur="1.3"/> then this would come down here <pause dur="1.0"/> and also this would come down here <pause dur="1.1"/> so we would now be doing an experiment <pause dur="0.5"/> with <pause dur="0.7"/> three <pause dur="0.2"/> degrees of freedom there <pause dur="0.6"/> and <pause dur="0.2"/> just three values here <pause dur="0.5"/> because this stays <pause dur="0.8"/> and so this now <pause dur="0.9"/> the <trunc>intera</trunc> the residual <pause dur="0.2"/> would have nine <pause dur="0.2"/> degrees of freedom <pause dur="0.3"/> we've got ourselves rather a small experiment at the main plot level <pause dur="0.3"/> that's not such a good idea <pause dur="1.4"/> so it's those sorts of things you can now do <pause dur="0.2"/> either by studying this <pause dur="0.3"/> or <pause dur="0.2"/> by running through the design again <pause dur="0.9"/> and that's what we've asked you to do in the practical <pause dur="5.2"/> any more comments or <pause dur="0.4"/> questions <pause dur="1.4"/> on this design part i'm now moving <pause dur="0.3"/> to data management </u><pause dur="1.0"/> <u who="sf0660" trans="pause"> # i'd like to ask something on the number of plots <pause dur="0.3"/> that you have for your <pause dur="0.8"/> # main <pause dur="0.4"/> blocking practice the main treatment practice </u><pause dur="0.3"/> <u who="nm0658" trans="pause"> yep </u><u who="sf0660" trans="latching"> # <pause dur="1.6"/> from <pause dur="2.0"/> <gap reason="inaudible" extent="1 sec"/> <pause dur="0.6"/> where he's given us the three factors </u><pause dur="0.4"/> <u who="nm0658" trans="pause"> yes </u><u who="sf0660" trans="overlap"> # <pause dur="1.6"/> i don't know <pause dur="0.3"/> if you've got irrigation as four levels fertility two levels <pause dur="0.3"/> how do you then <pause dur="0.4"/> get <pause dur="0.7"/> 
can you not say that you had nine <pause dur="1.0"/> plots for your <pause dur="1.2"/> two </u><pause dur="2.0"/> <u who="nm0658" trans="pause"> with the design as randomized <pause dur="0.8"/> that's what you've got </u><pause dur="4.8"/> <u who="sf0660" trans="pause"> so for your main plot from that did you say you've got how many plots </u><pause dur="0.6"/> <u who="nm0658" trans="pause"> for my main plot i said i've got <pause dur="0.3"/> thirty-two <pause dur="0.6"/> main plots <pause dur="0.8"/> which is the ninety-six that i had <pause dur="0.7"/> divided by three </u><pause dur="2.7"/> <u who="sf0660" trans="pause"> but if you didn't go through that ANOVA table <pause dur="0.4"/> # is there any way of working that out just from what you get at the very beginning # <gap reason="inaudible" extent="1 word"/> structure </u><pause dur="0.4"/> <u who="nm0658" trans="pause"> yes <pause dur="1.6"/><kinesic desc="changes slide" iterated="n"/> if i didn't have that at the beginning <pause dur="0.5"/> i would say <pause dur="0.4"/> that

at the main plot level <pause dur="1.0"/> what i've got <pause dur="0.3"/> is <pause dur="1.3"/> four blocks </u><u who="sf0668" trans="overlap"> <gap reason="inaudible" extent="1 sec"/></u><pause dur="0.5"/><u who="nm0658" trans="pause"> and four levels of irrigation which is four times four </u><u who="sf0668" trans="overlap"> <gap reason="inaudible" extent="1 sec"/></u><pause dur="0.5"/> <u who="nm0658" trans="pause"> times two levels of fertility <pause dur="0.3"/> which is four times four is sixteen times two is thirty-two </u><pause dur="1.3"/> <u who="sf0668" trans="pause"> <gap reason="inaudible" extent="1 sec"/></u><pause dur="2.9"/> <u who="nm0658" trans="pause"> any more <pause dur="0.7"/> questions <pause dur="2.6"/> the second topic </u><pause dur="0.5"/> <u who="sm0669" trans="pause"> # excuse me <pause dur="0.2"/> # # did <gap reason="inaudible" extent="2 secs"/> <pause dur="0.3"/> and the design in the in in reality </u><pause dur="0.4"/><u who="nm0658" trans="pause"> yeah </u><u who="sm0669" trans="overlap"> i was thinking how about three factors <pause dur="0.4"/> <gap reason="inaudible" extent="2 secs"/> plot <pause dur="0.5"/> that's to say maybe to have <pause dur="0.2"/> the # irrigation out the way <pause dur="0.6"/> then the fertility as a subplot <pause dur="0.4"/> and then the variety as the sub-subplot <pause dur="0.5"/> in that case we will have been having <pause dur="0.4"/> more <pause dur="0.2"/> of a <pause dur="0.6"/> how will i call it degrees of freedom than <gap reason="inaudible" extent="2 secs"/> irrigation and the fertility <pause dur="0.4"/> as a main plot and then having the variety as a <pause dur="0.5"/> subplot how would you <pause dur="0.5"/> <gap reason="inaudible" extent="2 secs"/> </u><pause dur="0.8"/> <u who="nm0658" trans="pause"> okay <pause dur="0.7"/> can anybody <pause dur="0.4"/> # <pause dur="0.6"/> let me repeat the question <pause dur="0.3"/> and then i'm interested if any of you can now <pause dur="0.2"/> 
provide an answer you are now you've had fifteen weeks of statistics so you're all semi-statisticians <pause dur="2.2"/><vocal desc="laughter" n="ss" iterated="y" dur="2"/> now the question was <pause dur="0.9"/> that this was <pause dur="0.4"/> # deliberately something a little more complicated as an experiment <pause dur="0.2"/> than you've had before it had three factors <pause dur="2.5"/> and the question was well if you have three factors <pause dur="1.0"/> why did we just have a single split <pause dur="1.2"/> or if i understand your question correctly why didn't we have <pause dur="0.3"/> two splits <pause dur="1.0"/> where perhaps we have irrigation on the main plots <pause dur="0.7"/> and then we have <pause dur="0.4"/> # <pause dur="0.2"/> fertility on <pause dur="0.4"/> the sort of middle-size plots <pause dur="0.5"/> and variety on the little plots <pause dur="1.3"/> and that is quite common <pause dur="2.7"/> does anybody have any comments on that

suggestion <pause dur="0.2"/> supposing that <pause dur="0.3"/> when you're back home <pause dur="0.3"/> and starting to work somebody says i'm doing a three factor experiment <pause dur="0.3"/> so i'm doing it on a split-split that's called a split-split plot <pause dur="0.4"/> design <pause dur="0.6"/> very difficult to say if you've had a few drinks <pause dur="0.9"/><vocal desc="laughter" n="ss" iterated="y" dur="2"/> does anybody have any <pause dur="1.3"/> anybody have any comments <pause dur="0.5"/> on <pause dur="0.2"/> whether they would encourage that whether they would like that </u><pause dur="1.2"/> <u who="sm0670" trans="pause"> <gap reason="inaudible" extent="2 secs"/> </u><pause dur="0.4"/> <u who="nm0658" trans="pause"> ah <vocal desc="laugh" iterated="n"/></u><u who="sm0670" trans="latching"> <gap reason="inaudible" extent="1 sec"/> and it was <pause dur="0.5"/> how would we differentiate the effect of <pause dur="1.2"/> the two <gap reason="inaudible" extent="1 word"/> that you combine in that one <pause dur="1.6"/> like the <pause dur="0.4"/> <gap reason="inaudible" extent="1 sec"/> the variety <pause dur="1.2"/> the irrigation in the block </u><pause dur="0.4"/> <u who="nm0658" trans="pause"> yes <pause dur="2.7"/> here </u><u who="sm0670" trans="latching"> <gap reason="inaudible" extent="1 sec"/> combine <pause dur="0.3"/> make the main <gap reason="inaudible" extent="1 word"/> </u><pause dur="1.3"/> <u who="nm0658" trans="pause"> i had <pause dur="0.4"/> here i had irrigation and fertility on the main plots </u><u who="sm0670" trans="overlap"> yes </u><pause dur="0.3"/> <u who="nm0658" trans="pause"> yes </u><pause dur="0.3"/> <u who="sm0670" trans="overlap"> how would you <trunc>s</trunc> </u><pause dur="0.5"/> <u who="nm0658" trans="pause"> how </u><pause dur="0.4"/> <u who="sm0670" trans="pause"> how would you differentiate the effect on your <pause dur="0.3"/> experiment <pause dur="0.6"/> of <gap reason="inaudible" extent="1 
sec"/></u><pause dur="0.9"/> <u who="nm0658" trans="pause"> well <pause dur="0.2"/> if you look at the ANOVA table <pause dur="3.1"/><kinesic desc="changes slide" iterated="n"/> that's the same <pause dur="0.7"/> question as though <pause dur="1.6"/> you just did <pause dur="0.2"/> a randomized block design this is like a randomized block design <pause dur="0.5"/> and you'll have a set of treatment means which give you <pause dur="0.2"/> the mean for each level of irrigation <pause dur="0.8"/> and you will have another one <pause dur="0.2"/> which gives you the mean for each level of fertility <pause dur="0.4"/> just the two means 'cause there's <pause dur="0.4"/> only two levels <pause dur="0.3"/> and then you'll

have another table <pause dur="0.8"/> for the <pause dur="0.3"/> interaction <pause dur="3.8"/> and that is one reason <pause dur="0.3"/> if you're not sure how that's going to look in practice <pause dur="0.5"/> that's one reason <pause dur="0.2"/> why Genstat gives you <pause dur="1.1"/> a dummy analysis <pause dur="0.2"/> not just with the degrees of freedom which you got here <pause dur="0.9"/> but also <pause dur="0.2"/> with <pause dur="0.2"/> random data <pause dur="0.2"/> so you can see how all the means will look and you can see <pause dur="0.3"/> i wonder if that will give me sufficient information <pause dur="0.5"/> to understand <pause dur="0.2"/> all the components <pause dur="0.3"/> of the treatments that i've applied to my experiment <pause dur="2.1"/> i think it's a similar question <pause dur="0.3"/> that <pause dur="0.9"/> most people feel that <pause dur="0.2"/> if they've got many factors <pause dur="0.4"/> they're much happier <pause dur="0.2"/> if each factor's at a different sort of level <pause dur="0.6"/> which leads you <pause dur="0.3"/> towards having two factors in a split plot experiment and three factors in a split-split plot <pause dur="0.6"/> and i hope you never have five factors <pause dur="0.4"/> because then you've got <pause dur="0.5"/> huge plots and <pause dur="0.2"/> and so on <pause dur="1.9"/> does does anybody have any thoughts about would you encourage </u><pause dur="0.6"/> <u who="sm0671" trans="pause"> well <pause dur="0.8"/> just a general stab in the dark </u><pause dur="0.2"/> <u who="nm0658" trans="pause"> yes <pause dur="0.2"/> have a <trunc>s</trunc> </u> <u who="sm0671" trans="overlap"> you split your plot <pause dur="0.6"/> into <pause dur="0.5"/> levels of fertilizer and split it again <pause dur="0.4"/> and they're very small plots <pause dur="0.5"/> the smaller the sample <pause dur="0.6"/> # <pause dur="0.4"/> if you have a larger sample variations <pause dur="0.4"/> tend to be absorbed in a large sample 
rather than a small sample <pause dur="1.7"/> is that </u><u who="nm0658" trans="overlap"> that that's <pause dur="0.2"/> that's almost there </u><pause dur="0.2"/> <u who="sm0659" trans="pause"> <gap reason="inaudible" extent="1 word"/> <pause dur="0.2"/> depends on the way degrees of freedom <pause dur="0.2"/> <gap reason="inaudible" extent="1 sec"/></u><u who="nm0658" trans="overlap"> degrees of freedom will come

in <pause dur="0.7"/> # you would <pause dur="0.4"/> it would be like having the top level it would be like just having irrigation at the top level <pause dur="0.9"/> does anybody have any general feelings <pause dur="0.2"/> about whether <pause dur="0.5"/> they would <pause dur="0.3"/> encourage experiments to be at lots of different levels <pause dur="0.9"/> or whether they would <trunc>re</trunc> prefer information <pause dur="0.3"/> to be at a single level </u><pause dur="0.7"/> <u who="sm0672" trans="pause"> <gap reason="inaudible" extent="1 sec"/> </u><pause dur="0.9"/> <u who="sf0673" trans="pause"> wouldn't it depend on what you're looking <pause dur="0.6"/> </u><u who="nm0658" trans="overlap"> <trunc>i</trunc> </u><u who="sf0673" trans="overlap"> at what you were interested in <pause dur="0.7"/> 'cause if you want to to find <pause dur="0.3"/> # equally <pause dur="0.2"/> the effect on irrigation and part of the effects of fertilizer and variety and on interactions <pause dur="0.6"/> then by splitting it up into several levels you're going to lose a lot of degrees of freedom for your <pause dur="0.5"/> upper levels </u><pause dur="0.9"/> <u who="nm0658" trans="pause"> right </u><pause dur="0.4"/> <u who="sf0673" trans="pause"> # <pause dur="0.2"/> so surely <pause dur="1.4"/> the information that you'll obtain <pause dur="0.2"/> will be # <pause dur="0.7"/> you won't have <pause dur="1.6"/> <gap reason="inaudible" extent="1 sec"/> <pause dur="0.6"/> # <pause dur="3.6"/> i don't know how to finish that sentence </u><pause dur="1.1"/> <u who="nm0658" trans="pause"> # let let me try and finish it for you well maybe try and ask somebody else </u><u who="sf0673" trans="overlap"> mm </u><u who="nm0658" trans="overlap"> because i think you were voting for the side of not having too many levels </u><u who="sf0673" trans="latching"> mm </u><pause dur="0.9"/> <u who="nm0658" trans="pause"> # there were other people who i think instinctively said let's have more 
levels is there anybody who would <pause dur="0.7"/> who approves of the idea of having lots of levels </u><pause dur="2.6"/> <u who="sm0671" trans="pause"> in blocks <gap reason="inaudible" extent="1 sec"/> <pause dur="0.4"/> why didn't we use <pause dur="0.2"/> one of those factors <gap reason="inaudible" extent="3 secs"/> </u><pause dur="0.5"/> <u who="nm0658" trans="pause"> right <trunc>e</trunc> each one going down </u><pause dur="0.2"/> <u who="sm0671" trans="pause"> yeah that's the way they <gap reason="inaudible" extent="2 words"/> </u><pause dur="0.2"/> <u who="nm0658" trans="pause"> right <pause dur="0.5"/> okay <pause dur="1.2"/> having <pause dur="0.5"/> your idea of having <pause dur="0.2"/> two factors in the split block design <pause dur="0.3"/> and three factors therefore in a split-split block design <pause dur="0.3"/> is extremely common <pause dur="2.1"/> my view is <pause dur="0.6"/> your other intervention which is to say <pause dur="0.3"/> let's not have too many levels unless we need to <pause dur="0.8"/> so the general view i have is that <pause dur="0.2"/>

lots of levels <pause dur="0.2"/> causes complication <pause dur="0.7"/> if you can have your experiment at a single level <pause dur="0.7"/> life is simpler <pause dur="0.7"/> all your tables are compared at the same level all your plots are the same size <pause dur="0.5"/> and the analysis the design is simpler <pause dur="0.2"/> and i think the analysis is simpler <pause dur="0.7"/> so the split plot analysis has lots of different levels <pause dur="0.3"/> and you will find when you look at the standard errors <pause dur="0.6"/> that that indicates that the analysis becomes very messy and complicated <pause dur="0.9"/> so i would prefer <pause dur="0.2"/> not to have <pause dur="0.2"/> split plot experiments <pause dur="2.6"/> the only reason i would have a split plot experiment is if <pause dur="0.2"/> some factors need <pause dur="0.2"/> large size plots <pause dur="0.2"/> like irrigation <pause dur="0.6"/> and other factors don't <pause dur="1.3"/> and then <pause dur="0.2"/> if the <trunc>l</trunc> if irrigation needs large size plots and we want to have irrigation and variety <pause dur="0.6"/> if we want to have no <pause dur="0.3"/> different levels <pause dur="0.2"/> then all plots have to be very very large <pause dur="0.9"/> whereas because variety only needs small plots <pause dur="0.2"/> we can choose to split the large plots that we need <pause dur="0.3"/> for irrigation <pause dur="0.2"/> into subdivision <pause dur="0.2"/> that seems a good reason <pause dur="1.0"/> i find

there is no other good reason <pause dur="0.3"/> for having split plot <pause dur="0.4"/> designs <pause dur="1.6"/> we will see later today <pause dur="0.3"/> that <pause dur="0.2"/> the whole nightmare of repeated measures analysis <pause dur="1.4"/> is a similar argument that the repeated measures <pause dur="0.8"/> are <pause dur="0.2"/> repeated within the plot <pause dur="0.5"/> they're like a sort of split plot <pause dur="0.6"/> where time <pause dur="0.3"/> is each measurement <pause dur="0.5"/> and you will find <pause dur="0.2"/> as soon as you have lots of different levels in your data <pause dur="0.6"/> your analysis is getting messier <pause dur="1.0"/> so i'm not very comfortable <pause dur="0.2"/> with the idea <pause dur="0.2"/> that if you increase the number of factors <pause dur="0.5"/> you also increase <pause dur="0.6"/> the layout problems <pause dur="0.3"/> remember <pause dur="0.3"/> at the very beginning <pause dur="0.6"/> we've said that <pause dur="2.0"/> when you look at design <pause dur="7.0"/><kinesic desc="changes slide" iterated="n"/> please think of your treatment structure <pause dur="0.3"/> we said we want three factors <pause dur="0.2"/> because that ties in with our objectives <pause dur="0.9"/> and then please think of your layout <pause dur="1.2"/> now what the people who do split-split plot all the time are doing <pause dur="0.4"/> is they are thinking of these two together <pause dur="1.2"/>

the treatments and the layout <pause dur="1.0"/> and they keep <pause dur="0.2"/> confusing them together <pause dur="0.9"/> and i would like people to think of the treatments first and say <pause dur="0.6"/> i could satisfy my objectives by having three factors and there'd be lots of objectives i could satisfy <pause dur="1.0"/> now can i have a very simple layout <pause dur="0.2"/> a simple layout <pause dur="0.2"/> is a randomized complete block <pause dur="0.6"/> sort of layout <pause dur="0.4"/> with twenty-four plots <pause dur="0.3"/> in this case <pause dur="1.0"/> for each block <pause dur="0.8"/> and so <pause dur="0.5"/> there are no different levels <pause dur="0.3"/> for the treatments <pause dur="1.2"/> and if you can manage that <pause dur="0.2"/> please do it <pause dur="0.8"/> if you can't manage that <pause dur="0.3"/> you say well <pause dur="0.5"/> maybe i'll have to go to a split plot <pause dur="0.2"/> sort of layout <pause dur="0.2"/> with two levels <pause dur="0.8"/> but don't volunteer for it and say because i've got two factors i will automatically go for two levels <pause dur="0.7"/> i think that causes many problems <pause dur="0.4"/> in the analysis of experiments <pause dur="0.2"/> and is part of the reason people

don't exploit their data <pause dur="0.2"/> as much as they could </u><pause dur="0.6"/> <u who="sm0669" trans="pause"> # my last question # would you be able to <pause dur="0.4"/> tell what a <pause dur="0.2"/> reasonable portion is statistically speaking <pause dur="0.7"/> of the effects of say irrigation <pause dur="0.2"/> confidently <pause dur="0.4"/> then <gap reason="inaudible" extent="2 secs"/> confidently <pause dur="0.4"/> <gap reason="inaudible" extent="1 sec"/> them <pause dur="0.6"/> i was thinking <gap reason="inaudible" extent="2 secs"/> <pause dur="0.4"/> when you split them and you could do these things <pause dur="0.4"/> see i would have gone in for irrigation as the main <pause dur="0.3"/> and then the # <pause dur="0.4"/> # fertility i mean the maybe <pause dur="0.4"/> the fertility <pause dur="0.2"/> <gap reason="inaudible" extent="1 word"/> that and then gradually move on <pause dur="0.5"/> to the variety <pause dur="0.4"/> and have an idea whether even there is an interaction between <pause dur="0.4"/> variety i mean irrigation <gap reason="inaudible" extent="1 sec"/> </u><pause dur="1.0"/> <u who="nm0658" trans="pause"> if you <pause dur="0.2"/> lay it out in a randomized block with just one level <pause dur="0.4"/> you can answer all those questions <pause dur="1.2"/> you can <trunc>askn</trunc> answer the questions about irrigation about variety and about the interaction <pause dur="0.7"/> whether it's laid out as a split plot <pause dur="0.2"/> or whether it's laid out as a randomized block <pause dur="0.5"/> so <pause dur="0.2"/> the questions you can answer about the treatments are the same <pause dur="1.8"/> there are some people who would argue <pause dur="0.5"/> that when it's laid out <pause dur="0.5"/> as a split plot <pause dur="0.7"/> compared to random and let me <pause dur="0.2"/> come back a step <pause dur="0.2"/> when it is laid out as a randomized block <pause dur="0.5"/> you are sort of treating each factor equally <pause 
dur="0.4"/> they are all on plots of the same size <pause dur="1.3"/> one of the arguments given in the textbooks <pause dur="0.6"/> for having <pause dur="0.2"/> a split plot <pause dur="0.6"/> is where <pause dur="0.2"/> you want more information <pause dur="0.3"/> in this case let's say on variety <pause dur="0.2"/> so you're going to put them on the little plots <pause dur="0.2"/> and then they're always close together <pause dur="1.2"/> and you <trunc>ye</trunc> you want less information <pause dur="0.2"/> on the irrigation so you put those on big plots <pause dur="0.2"/> that are <pause dur="0.2"/> by

definition further apart on average <pause dur="1.0"/> and <pause dur="0.2"/> so when you compare a randomized block <pause dur="0.3"/> with a split plot <pause dur="0.6"/> in the split plot <pause dur="0.7"/> according to the textbooks <pause dur="0.6"/> you get more information on the subplot factors <pause dur="0.7"/> and less information on the main plot factors <pause dur="1.8"/> the problem i have is that <pause dur="0.7"/> you get much less <pause dur="0.3"/> on the main plot compared with a very little gain <pause dur="0.4"/> on the subplot <pause dur="0.5"/> and also <pause dur="0.2"/> that does not account for the fact you also <pause dur="0.3"/> add in <pause dur="0.3"/> unnecessary complication in interpreting the results <pause dur="0.8"/> which means that <pause dur="0.3"/> i don't like that as a reason <pause dur="0.6"/> for doing a split plot experiment <pause dur="0.2"/> the only reason i like <pause dur="0.4"/> is the fact that you have to for practical purposes <pause dur="0.4"/> and there are many of those <pause dur="5.7"/> okay <pause dur="0.3"/> the next subject <pause dur="2.8"/><kinesic desc="changes slide" iterated="n"/> then we'll have a break <pause dur="0.3"/> and we'll discuss repeated measures after the break <pause dur="2.9"/> now <pause dur="1.4"/> what i hope to show you <pause dur="0.5"/> is <pause dur="0.8"/> that <pause dur="0.2"/> the ideas of data management <pause dur="0.6"/> which we <pause dur="0.3"/> distributed in session three which was <pause dur="0.6"/> very early last term <pause dur="0.7"/> that was data management <pause dur="0.2"/> for any sort of data <pause dur="1.3"/> when we translate those ideas into experimental data <pause dur="0.4"/>

experimental data are quite simple <pause dur="0.3"/> usually <pause dur="0.3"/> and so you shouldn't have any problem <pause dur="0.5"/> if <pause dur="0.2"/> you manage the data sensibly and if you keep the principles <pause dur="0.6"/> so we're going to review the standard ways of entering experimental data <pause dur="0.3"/> which i think follow automatically if you've understood the ideas of design <pause dur="2.1"/> and <pause dur="0.3"/> what we're also going to show you <pause dur="0.3"/> is what happens <pause dur="0.4"/> if your data have been entered differently <pause dur="0.5"/> and this is where we're hitting a new problem now <pause dur="0.2"/> because of all these people as i've said before <pause dur="0.3"/> who are maniacs for Excel <pause dur="0.5"/> and that means you can enter data in all sorts of crazy ways doesn't it <pause dur="0.6"/> # <pause dur="1.0"/> and and then you can have problems reorganizing your data <pause dur="0.3"/> so you can do the sensible analysis <pause dur="0.6"/> and i have to tell you that as statisticians now <pause dur="0.4"/> this is serious <pause dur="0.6"/> because <pause dur="0.5"/> in the olden days we found that we spent all our time helping people on the analysis <pause dur="1.0"/> now <pause dur="0.7"/> most of your time seems to be spent <pause dur="0.3"/> on rescuing your data from poor data management <pause dur="0.3"/> the analyses are very quick <pause dur="0.5"/> you just click on the

ANOVA button <pause dur="0.2"/> in Genstat <pause dur="0.5"/> so the analysis step <pause dur="0.2"/> is very very quick <pause dur="0.3"/> and you've done it many times this term <pause dur="0.7"/> the step which isn't quick <pause dur="1.5"/> is reorganizing your data <pause dur="0.3"/> because they were entered in a funny way <pause dur="1.5"/> so there are two ways to avoid this <pause dur="0.3"/> and i want to <pause dur="0.2"/> indicate both of them <pause dur="0.3"/> the first is please enter your data sensibly <pause dur="0.2"/> and then you won't have this problem <pause dur="1.3"/> and the second <pause dur="0.2"/> is if you haven't entered your data sensibly <pause dur="1.0"/> please don't re-enter <pause dur="0.7"/> please use the computer <pause dur="0.2"/> to reorganize your data <pause dur="0.5"/> but accept <pause dur="0.3"/> that <pause dur="0.2"/> don't get annoyed at either Excel or at Genstat or Minitab <pause dur="0.2"/> because that's taking the time <pause dur="0.3"/> i'm afraid it is the data manipulation <pause dur="0.2"/> that does take the time <pause dur="0.4"/> so <pause dur="0.6"/> allow a little bit of time for that <pause dur="2.0"/> so we're going to show you both <pause dur="0.6"/> and <pause dur="0.7"/> you either need to reorganize the data because you've entered it in an odd way <pause dur="0.9"/> or <pause dur="0.2"/> because alternative analyses <pause dur="0.4"/> require different layouts <pause dur="0.2"/> so <pause dur="0.2"/> sometimes <pause dur="0.2"/> when the data aren't so simple <pause dur="0.4"/> you might have three different analyses <pause dur="0.2"/> and for one analysis it was good to

enter the data across <pause dur="0.3"/> and for another analysis <pause dur="0.3"/> it was good to enter the data down <pause dur="2.5"/> so you need to become a little adept <pause dur="0.2"/> at data manipulation <pause dur="0.7"/> or you're going to waste a lot of time <pause dur="1.3"/> and a lot i do mean it <pause dur="4.3"/> perhaps i'll <pause dur="0.5"/> # tell you one <pause dur="0.7"/> # <pause dur="0.4"/> not horror story but close <pause dur="1.9"/> the subject of data management <pause dur="0.6"/> is not well reported in the textbooks <pause dur="0.9"/> when i arrived back a couple of years ago <pause dur="1.7"/> one of the first advisees i saw <pause dur="0.8"/> was somebody <pause dur="0.2"/> who was just finishing his PhD at Reading <pause dur="1.5"/> and he had been on this course <pause dur="0.9"/> but we hadn't been discussing much about data management in the olden days <pause dur="0.8"/> and he'd followed a little bit of the notes on how to organize the data <pause dur="1.3"/> and it was discussions with him <pause dur="0.4"/> that clarified to me that we must include <pause dur="0.6"/> data manipulation data management in this course <pause dur="0.3"/> which is why we've changed this course to include this <pause dur="1.1"/> he had an experiment which was done here <pause dur="0.8"/> and he'd done his experiment <pause dur="0.3"/> at <pause dur="0.3"/> two different places for each of the three years of his PhD <pause dur="2.6"/> and <pause dur="0.2"/> each experiment

he had measured ten different things <pause dur="1.8"/> so he'd had ten measurements <pause dur="0.8"/> it's all very simple <pause dur="0.3"/> these were very simple experiments <pause dur="1.3"/> he had forty-eight plots in each experiment <pause dur="1.5"/> he had now looked in a textbook <pause dur="0.9"/> and he looked at his notes <pause dur="0.3"/> to see how to enter the data <pause dur="1.6"/> and he had entered the data quite nicely organized <pause dur="0.5"/> with three columns <pause dur="1.3"/> the first column <pause dur="0.3"/> was the yield <pause dur="0.2"/> or the measurement <pause dur="0.9"/> the second column <pause dur="0.3"/> was the block <pause dur="0.5"/> and the third column was the treatment <pause dur="0.4"/> there were twelve treatments <pause dur="0.3"/> and there were four <pause dur="0.5"/> blocks <pause dur="0.6"/> so he had <pause dur="0.2"/> measurement <pause dur="0.2"/> block <pause dur="0.4"/> treatment <pause dur="1.8"/> because he'd looked in a textbook <pause dur="0.6"/> and textbooks <pause dur="0.4"/> only seem to deal with experiments when you have a single measurement <pause dur="1.6"/> which is unlike the real world <pause dur="0.4"/> where you always have lots of measurements <pause dur="1.4"/> he decided to enter his ten measurements <pause dur="0.4"/> in ten different files <pause dur="2.0"/> so he now had <pause dur="0.2"/> ten little files <pause dur="0.2"/> each one with three columns <pause dur="1.0"/> the block and the treatment columns were the same <pause dur="0.3"/> and the measurement was different <pause dur="2.2"/> and then he had six experiments <pause dur="1.9"/>
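For contrast with the ten separate three-column files, here is a minimal sketch of a single table for one experiment, with one row per plot and one column per measurement. The measurement names are invented for illustration; only the four blocks and twelve treatments come from the story.

```python
# Hypothetical layout sketch: one row per plot, one column per measurement.
blocks = range(1, 5)        # four blocks, as in the story
treatments = range(1, 13)   # twelve treatments, as in the story

# Measurement names are invented for illustration.
measurements = ["yield_a", "yield_b", "height"]

rows = []
for block in blocks:
    for treatment in treatments:
        row = {"block": block, "treatment": treatment}
        for name in measurements:
            row[name] = None  # filled in when the plot is measured
        rows.append(row)

print(len(rows))      # 48 plots, so 48 rows
print(list(rows[0]))  # block, treatment, then one column per measurement
```

Adding an eleventh measurement would then mean adding one more column, not one more file.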

so he now had sixty files <pause dur="1.9"/> each one <pause dur="0.2"/> with three columns in <pause dur="1.1"/> and each one followed precisely the method of analysis he'd been taught in his course <pause dur="0.7"/> and he'd finished his analysis <pause dur="0.6"/> and he was three weeks away from submitting his PhD <pause dur="2.3"/> and his supervisor <pause dur="0.6"/> looked at his information and said <pause dur="0.8"/> there are two interesting things i would like you to do in addition <pause dur="1.6"/> the first is <pause dur="0.2"/> that i've noticed that you have measured <pause dur="0.7"/> the yield <pause dur="0.5"/> in two different ways <pause dur="0.2"/> i'd like you to do what's called an analysis of covariance where you adjust one measurement for the values of another measurement <pause dur="1.7"/> could you please do a <trunc>bi</trunc> a simple analysis of covariance <pause dur="0.6"/> for each of your six experiments <pause dur="0.5"/> and he gave him a very simple thing <pause dur="0.2"/> and showed him it was in the textbook <pause dur="2.4"/> the second question <pause dur="1.0"/> was even more perplexing to him he said could you do a simple combined analysis where you combine the information for the six experiments together <pause dur="0.2"/> to see how <pause dur="0.7"/> the <pause dur="0.2"/> results you have reinforce each other <pause dur="2.0"/> and so he came to

statistics <pause dur="0.4"/> to do this <pause dur="0.7"/> he knew no <pause dur="0.4"/> computing he only knew how to turn <pause dur="0.4"/> the computing handle <pause dur="0.7"/> so he was now being asked <pause dur="0.8"/> a simple question of data management <pause dur="0.4"/> well it would have been simple <pause dur="0.3"/> if his data had been organized sensibly <pause dur="0.6"/> but can you see that <pause dur="0.2"/> this idea of covariance <pause dur="0.9"/> which <pause dur="0.2"/> would have been trivial with Genstat <pause dur="0.2"/> normally <pause dur="1.0"/> meant that he had two data files <pause dur="0.2"/> which he had to merge together because the two columns were in different files <pause dur="1.6"/> nobody'd taught him about merging files <pause dur="2.8"/> and <pause dur="0.6"/> the combining of the experiments the six experiment <pause dur="0.2"/> meant not just that you merged the files <pause dur="0.4"/> but you then had to put them end to end <pause dur="1.0"/> because he had forty-eight but now he wanted forty-eight times six <pause dur="0.5"/> with another column which said which experiment it was <pause dur="1.7"/> so when he came with three weeks to go <pause dur="1.4"/> we explained <pause dur="0.3"/> these ideas to him <pause dur="1.0"/> but because he had no concept of data management <pause dur="0.3"/> or of <pause dur="0.3"/> computer ideas such as merging files <pause dur="0.7"/> he never succeeded <pause dur="0.2"/> the only way it could possibly have happened <pause dur="0.7"/> is if somebody else had taken

his data and done it all for him <pause dur="0.9"/> that's not his PhD <pause dur="0.8"/> or if he'd learned a little about data manipulation <pause dur="0.5"/> and these were very very simple tasks <pause dur="1.1"/> i have to say these are much easier tasks <pause dur="0.4"/> now you've got Windows <pause dur="0.3"/> than they were <pause dur="0.2"/> when you had to merge files using DOS commands <pause dur="1.4"/>
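The two tasks in this story, merging two files so that both measurements sit side by side, and then putting the experiments end to end with an extra column naming the experiment, can be sketched roughly as below. This is plain Python with invented values and hypothetical function names. It is not the Genstat or Excel procedure, only an illustration of the idea.

```python
# Sketch only: the values are invented. In practice each list would be
# read from one of the separate three-column files.
def merge_measurements(file_a, file_b, name_a, name_b):
    """Join two three-column tables on (block, treatment)."""
    lookup = {(r["block"], r["treatment"]): r["value"] for r in file_b}
    merged = []
    for r in file_a:
        key = (r["block"], r["treatment"])
        merged.append({
            "block": r["block"],
            "treatment": r["treatment"],
            name_a: r["value"],
            name_b: lookup[key],  # raises KeyError if the plot is missing
        })
    return merged


def append_experiments(experiments):
    """Put merged tables end to end, adding a column naming the experiment."""
    combined = []
    for label, rows in experiments.items():
        for r in rows:
            combined.append({"experiment": label, **r})
    return combined


# Two tiny made-up files for one experiment (2 plots instead of 48).
yield_1 = [{"block": 1, "treatment": 1, "value": 5.2},
           {"block": 1, "treatment": 2, "value": 6.1}]
yield_2 = [{"block": 1, "treatment": 2, "value": 5.9},
           {"block": 1, "treatment": 1, "value": 5.0}]

site_a = merge_measurements(yield_1, yield_2, "yield_1", "yield_2")
# The same table is reused for experiment B only to keep the sketch short.
combined = append_experiments({"A": site_a, "B": site_a})
print(len(combined))  # 2 plots times 2 experiments
print(combined[0])
```

Matching on block and treatment, not on row order, means the merge still works when the two files list the plots in a different order.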

but <pause dur="0.2"/> nevertheless <pause dur="0.7"/> this was not possible for him to do <pause dur="1.5"/> and <pause dur="0.3"/> that's data manipulation <pause dur="0.4"/> taken to its illogical extreme <pause dur="1.2"/> but i have to say <pause dur="0.3"/> that the way he was encouraged to enter the data first <pause dur="1.2"/> was exactly what was recommended <pause dur="0.5"/> in a very popular textbook <pause dur="1.0"/> which we use on our courses <pause dur="0.5"/> says <pause dur="0.2"/> when you're entering your data into the computer this is the sort of layout <pause dur="1.2"/> and it encouraged the layout <pause dur="0.3"/> which caused his disaster <pause dur="0.3"/> because it didn't say of course in practical experiments <pause dur="0.2"/> you will have more than one measurement <pause dur="0.3"/> and just put them end to end <pause dur="0.4"/> don't make them in a different file <pause dur="6.5"/> so let me just remind you <pause dur="1.9"/><kinesic desc="changes slide" iterated="n"/> this was the yield data <pause dur="0.8"/> that we discussed in session three <pause dur="1.8"/> you have nothing new you need to learn if you understood session three <pause dur="0.3"/> notice <pause dur="0.2"/> that the way that we have laid this out <pause dur="0.2"/> with the block numbers <pause dur="1.3"/> is exactly the same <pause dur="0.4"/> as the way Genstat has just randomized your experiment at the beginning <pause dur="0.7"/> so here <pause dur="0.5"/> are the blocks <pause dur="0.2"/> the repetitions <pause dur="0.2"/> and the treatments <pause dur="0.7"/> and here are the

measurements <pause dur="0.3"/> that you just type in when you get them <pause dur="1.3"/> so as long as you don't mess things up <pause dur="0.8"/> from where you started <pause dur="0.8"/> the rules are very simple <pause dur="0.5"/> how many plots do you have <pause dur="1.2"/> each plot <pause dur="0.6"/> becomes a row <pause dur="1.2"/> each column <pause dur="0.3"/> is a measurement <pause dur="1.7"/> i can't think of anything <pause dur="0.4"/> that's more simple <pause dur="2.8"/> and that's what you have to do <pause dur="4.0"/> you will find <pause dur="0.3"/> that there's one complication which we do discuss in session three <pause dur="0.6"/> which is <pause dur="0.2"/> what happens if some of the measurements <pause dur="0.4"/> are made at the plant level or the plot level <pause dur="0.4"/> here sorry the plot level <pause dur="0.2"/> and other measurements are made at the plant level <pause dur="1.9"/> so for example <pause dur="0.2"/> it's very popular to measure the yield by harvesting at the plot level <pause dur="0.4"/> but you might measure the height of twenty plants in each plot <pause dur="0.6"/> i wonder how you would enter that <pause dur="0.3"/> to which the answer is <pause dur="0.2"/> either in Excel or in Genstat <pause dur="0.3"/> you have one sheet <pause dur="1.0"/> for your plot level <pause dur="0.3"/> and you have another sheet for your plant level <pause dur="0.4"/> and we've covered that <pause dur="0.5"/> both in <pause dur="0.4"/> the discussion in session three and in the practical that you did <pause dur="1.3"/> so as long

as you're happy with those <pause dur="0.5"/> you don't have a data management problem <pause dur="0.9"/> you should enter your data in this way <pause dur="2.9"/> i can leap ahead a little bit <pause dur="0.4"/> to <pause dur="0.4"/> these repeated measures ideas <pause dur="0.3"/> supposing that <pause dur="0.3"/><kinesic desc="changes slide" iterated="n"/> here's an example <pause dur="0.2"/> i'm sorry it's a bit in French <pause dur="0.3"/> but this is the weight <pause dur="0.2"/> of small potatoes middling potatoes et cetera <pause dur="1.4"/> we're going later on to consider repeated measures <pause dur="0.6"/> which is <pause dur="0.2"/> measuring things <pause dur="0.3"/> after <pause dur="0.2"/> twenty days twenty-five days thirty days and things like that <pause dur="0.3"/> how should you enter those data <pause dur="0.2"/> answer <pause dur="0.2"/> they are just measurements <pause dur="0.2"/> so enter them across <pause dur="0.8"/> just as though <pause dur="0.6"/> you were entering different measurements <pause dur="0.2"/> so just because you made the measurements which differ in time <pause dur="0.3"/> don't get more complicated <pause dur="0.2"/> just treat them as a measurement <pause dur="0.6"/> and so <pause dur="0.2"/> if you have <pause dur="0.2"/> six repeated measurements <pause dur="0.6"/> just <pause dur="0.6"/> they're six measurements <pause dur="1.0"/> enter them across <pause dur="0.2"/> they haven't changed the number of plots in your experiment <pause dur="0.4"/> so <pause dur="0.2"/> they go across <pause dur="5.8"/><kinesic desc="changes slide" iterated="n"/> this <pause dur="1.6"/> is how not <pause dur="0.2"/> to enter experimental data <pause dur="1.9"/> this is the same data <pause dur="4.6"/> here we have <pause dur="0.9"/> the

treatment information <pause dur="1.1"/> and here we've entered the data <pause dur="0.2"/> for replicate one replicate two replicate three for one measurement <pause dur="4.2"/> that is very popular <pause dur="2.3"/> isn't it <pause dur="0.8"/><vocal desc="laugh" iterated="n"/><pause dur="0.3"/> i i'm sorry i'm i'm looking at you because <pause dur="0.4"/> <trunc>i</trunc> <trunc>i</trunc> you know it's not you that's made this as a <pause dur="0.3"/> this is encouraged in many departments <pause dur="0.8"/> and it seems very obvious <pause dur="0.9"/> and it works quite well for simple problems <pause dur="0.2"/> also <pause dur="0.5"/> notice that if you were doing the analysis by hand <pause dur="1.6"/> in the olden days <pause dur="0.7"/> this is exactly what you would do <pause dur="0.4"/> because you could work out the mean of these <pause dur="1.3"/> and you could get <pause dur="0.3"/> the mean for that treatment <pause dur="0.9"/> so you could get all the treatment means down here <pause dur="0.3"/> and all the block means across here <pause dur="2.3"/> so if you are in the habit <pause dur="0.9"/> of confusing entry with analysis <pause dur="0.5"/> this is wonderful <pause dur="0.4"/> you can use Excel <pause dur="0.5"/> to work out calculated columns <pause dur="0.2"/> and it all seems to work quite well <pause dur="0.2"/> until you try and do a full analysis and then it all falls apart <pause dur="1.2"/> do not confuse analysis <pause dur="0.4"/> with entry <pause dur="1.3"/> if later on you want to analyse your data <pause dur="0.8"/> either with Genstat or Excel <pause dur="0.5"/> you can

enter the data in the proper way <pause dur="2.5"/><kinesic desc="changes slide" iterated="n"/> that's like this <pause dur="2.8"/> if you would now like <pause dur="0.7"/> to look at the data <pause dur="0.3"/> like this <pause dur="1.6"/><kinesic desc="changes slide" iterated="n"/> i've said <pause dur="0.3"/> this is how not to enter experimental data <pause dur="0.5"/> if you want to look at the data like this that's terrific <pause dur="0.2"/> it's called tabulation <pause dur="0.8"/> you enter the data <pause dur="0.2"/> in long columns <pause dur="0.2"/> and then you say <pause dur="0.2"/> please tabulate the columns <pause dur="0.2"/> and across the top i want one factor <pause dur="0.2"/> and down here i want the other factor <pause dur="0.5"/> and then i'll put some summaries i don't have a problem looking at the data like that <pause dur="0.2"/> i have a problem entering the data <pause dur="0.3"/> it's confusing entry <pause dur="0.2"/> with analysis <pause dur="1.1"/> in Genstat <pause dur="0.2"/> that's called tabulation <pause dur="0.9"/> in Excel it's called pivot tables <pause dur="0.2"/> i don't care <trunc>h</trunc> whether you do it in Excel <pause dur="0.3"/> or whether you do it in Genstat <pause dur="0.6"/> so enter your data in Excel <pause dur="0.3"/> the <pause dur="0.2"/> proper way <pause dur="0.3"/> enter your data in Genstat the proper way <pause dur="0.2"/> if you want to look at the data like this <pause dur="1.3"/> tabulate the data present the data very nice you can see what's going on <pause dur="0.8"/> you can see what a nightmare this is for your data entry <pause dur="0.9"/> by looking <pause dur="0.4"/> and explaining to

somebody <pause dur="0.2"/> how you would enter <pause dur="0.2"/> these data <pause dur="0.3"/> in that format <pause dur="0.7"/> because that only works if you've got one measurement <pause dur="0.4"/> but in the real world we have <pause dur="0.2"/> here we have one two three four five six measurements <pause dur="0.3"/> are you going to enter them on six different sheets <pause dur="0.3"/> it's all getting messy <pause dur="2.1"/> this <pause dur="0.4"/> is very simple <pause dur="1.6"/> and if ever you want to transform your data <pause dur="0.2"/> and get <pause dur="0.3"/> the total <pause dur="0.3"/> it just becomes another column <pause dur="0.5"/> down here </u><pause dur="3.6"/> <u who="sf0668" trans="pause"> why doesn't he have repeated measurements of so many different factors <pause dur="0.3"/> <gap reason="inaudible" extent="1 sec"/> </u><pause dur="0.4"/> <u who="nm0658" trans="pause"> if you've got <trunc>repea</trunc> lots of repeated measurements they still keep going across </u><pause dur="1.3"/> <u who="sf0668" trans="pause"> <gap reason="inaudible" extent="1 sec"/> </u><pause dur="0.9"/> <u who="nm0658" trans="pause"> and you might have sets of repeated measurements you might measure <pause dur="0.2"/> lots of things after twenty days another set of things after thirty another set of things after forty <pause dur="0.2"/> they can go across <pause dur="1.1"/> so <pause dur="0.3"/> you can go way across <pause dur="0.3"/> here <pause dur="0.6"/> # </u><pause dur="0.8"/> <u who="sm0669" trans="pause"> what if <gap reason="inaudible" extent="1 sec"/> experiment where you take in data <pause dur="0.3"/> from <trunc>de</trunc> for instance you do maize and <pause dur="0.2"/> pea or whatever it is <pause dur="0.6"/> and then you have all similar columns like this will it make safe <pause dur="0.2"/> to try to box them together or have separate <gap reason="inaudible" extent="1 sec"/> because <pause dur="0.3"/> <gap reason="inaudible" extent="1 sec"/> 
on them really <pause dur="0.3"/> i mean they're independent <gap reason="inaudible" extent="2 secs"/> </u><pause dur="0.2"/> <u who="nm0658" trans="pause"> okay </u><u who="sm0669" trans="overlap"> <gap reason="inaudible" extent="2 secs"/> <pause dur="0.7"/> sort of shoot up into <gap reason="inaudible" extent="1 sec"/> <pause dur="0.4"/> for say <pause dur="0.3"/> maize height <pause dur="0.4"/> <gap reason="inaudible" extent="3 secs"/> </u><u who="nm0658" trans="overlap"> okay </u><u who="sm0669" trans="overlap"> <gap reason="inaudible" extent="1 sec"/> </u><u who="nm0658" trans="latching"> right <trunc>th</trunc> well that can be the last <pause dur="0.2"/> question before we have a quick break <pause dur="0.5"/> 'cause you have such a nice new coffee place downstairs <pause dur="0.6"/> # <pause dur="0.2"/> <trunc>th</trunc> <pause dur="0.3"/> # the question i hope it's understood to everybody <pause dur="0.2"/> there are some experiments <pause dur="0.2"/> which are called mixed cropping experiments

does anybody not know <pause dur="0.2"/> what is a mixed cropping experiment <pause dur="0.9"/> you you all happy that <pause dur="0.7"/> this is a an experiment where you might have <pause dur="0.2"/> maize on <trunc>sub</trunc> <pause dur="0.2"/> some plots <pause dur="0.3"/> and beans on other plots <pause dur="0.4"/> and then <pause dur="0.2"/> the aim of the experiment is to see how the maize and the beans mix together <pause dur="0.3"/> so some plots have both maize and beans together <pause dur="2.2"/> and my simple rule for this <pause dur="0.6"/> is <pause dur="0.2"/> please stay simple <pause dur="1.2"/> so that is <pause dur="0.5"/> that please do exactly the same <pause dur="2.5"/> down here you have all your plots <pause dur="1.0"/> your treatments will say is it maize sole or bean sole or a mixture <pause dur="0.5"/> and then here are all your measurements <pause dur="1.4"/> and in your measurements you will have lots of blanks <pause dur="0.9"/> for the observations <pause dur="0.4"/> that <pause dur="0.3"/> don't the maize observations don't have any beans leave it blank or put it as missing it doesn't matter <pause dur="0.9"/> keep life simple <pause dur="1.0"/> in the past <pause dur="0.4"/> there has been a problem <pause dur="0.2"/> particularly related <pause dur="0.3"/> to <pause dur="0.2"/> a package <pause dur="0.2"/> called Mstat <pause dur="0.3"/> that is very poor at data manipulation <pause dur="0.6"/> and that has caused people to say <pause dur="0.3"/> well maybe i'll enter <pause dur="0.2"/> the maize data in one file and the bean data in

another file <pause dur="1.0"/> or maybe i'll enter <pause dur="0.3"/> the maize sole data in one file and the bean sole data in another and the mixed data in a third <pause dur="1.0"/> and then they get very confused <pause dur="0.6"/> # <pause dur="0.3"/> so to avoid all that confusion <pause dur="1.2"/> keep my simple message <pause dur="0.7"/> don't confuse <pause dur="0.4"/> the entry with the analysis <pause dur="0.3"/> the reason people choose the different files <pause dur="0.2"/> is to simplify the analysis <pause dur="0.6"/> simplify the analysis afterwards <pause dur="0.6"/> but for the entry <pause dur="0.2"/> the entry is simple <pause dur="0.3"/> if you say <pause dur="0.2"/> how many plots did i have <pause dur="0.8"/> and i will enter observations <pause dur="0.2"/> and if there are some observations i don't have <pause dur="0.2"/> they're like missing values <pause dur="0.8"/> the reason is different <pause dur="0.4"/> but you just leave it blank <pause dur="0.5"/> or you put a missing value code in <pause dur="1.7"/> let's have a nice simple life <pause dur="0.2"/> and then you will find <pause dur="0.2"/> that the analysis is also simple <pause dur="0.8"/> after the break <pause dur="0.3"/> # <pause dur="0.5"/> we'll look at <pause dur="0.2"/> how to manipulate the data <pause dur="0.2"/> and then <pause dur="0.2"/> quickly on to repeated measures <pause dur="0.3"/> okay we'll have a break </u><gap reason="break in recording" extent="uncertain"/> <u who="nm0658" trans="pause">
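The one-row-per-plot rule for a mixed cropping experiment, with blanks where a crop was not grown on a plot, can be sketched as follows. The plot values and column names are invented for illustration, and None stands in for a blank cell or a missing-value code.

```python
# Invented values; None stands in for a blank or a missing-value code.
plots = [
    {"plot": 1, "treatment": "maize sole", "maize_yield": 4.1, "bean_yield": None},
    {"plot": 2, "treatment": "bean sole", "maize_yield": None, "bean_yield": 1.8},
    {"plot": 3, "treatment": "mixture", "maize_yield": 3.2, "bean_yield": 1.1},
]


def column_mean(rows, column):
    """Mean of one measurement, skipping plots where it was not recorded."""
    values = [r[column] for r in rows if r[column] is not None]
    return sum(values) / len(values)


print(round(column_mean(plots, "maize_yield"), 2))  # mean over plots 1 and 3
print(round(column_mean(plots, "bean_yield"), 2))   # mean over plots 2 and 3
```

Everything stays in one table with one row per plot, and choosing which plots go into a particular analysis is left until the analysis step.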

i'm realizing as usual that luckily <pause dur="0.4"/> you have the slides for everything <pause dur="0.3"/> because i don't think i'm going to finish yet again and i do want to discuss repeated measures <pause dur="0.5"/> so i may have to leave out a little bit of <pause dur="0.2"/> one of the topics on data management <pause dur="2.8"/> i just want quickly to review the ideas <pause dur="0.6"/> to compare these two formats you've seen <pause dur="1.2"/> our recommended way of data entry this is all revision now <pause dur="0.2"/> has one row for each plot <pause dur="0.2"/> or unit <pause dur="1.0"/> and that's the lowest possible level <pause dur="0.3"/> if it's a subplot it's one row for each subplot <pause dur="0.5"/> so <pause dur="0.2"/> in the example <pause dur="0.2"/> in lecture three there are twenty-seven rows <pause dur="0.3"/> because there were three reps by nine treatments there were twenty-seven plots <pause dur="0.6"/> in the randomized one just now there were ninety-six rows because there were ninety-six plots <pause dur="1.0"/> therefore there's one column for each measurement <pause dur="1.8"/> the other way <pause dur="0.2"/> looks like a textbook example for a hand analysis <pause dur="2.0"/> also unfortunately <pause dur="1.0"/> and they've been written too

many times but because they're all powerful they don't have to listen to anybody <pause dur="0.5"/> # in Excel <pause dur="0.3"/> it's the way you need to lay out the data for Excel to do an analysis of variance <pause dur="0.2"/> we do not recommend you use <pause dur="0.5"/> analysis of variance in Excel <pause dur="0.2"/> if you're going to do analysis of variance <pause dur="0.5"/> use it in Genstat or Minitab or anything <pause dur="0.3"/> but Excel's analysis of variance is not very good </u><pause dur="0.9"/> <u who="sf0674" trans="pause"> statistically it's not very good </u><pause dur="0.7"/> <u who="nm0658" trans="pause"> it's it doesn't do enough it doesn't show you residuals </u><u who="sf0674" trans="overlap"> <gap reason="inaudible" extent="1 sec"/> </u><u who="nm0658" trans="overlap"> it doesn't encourage a critical looking at data <pause dur="0.2"/> it only goes up to two level factorials <pause dur="0.5"/> and it's much better <pause dur="0.3"/> if you're going to use Excel for your statistical analysis <pause dur="0.3"/> you go up to <pause dur="0.2"/> simple description and tabulation <pause dur="0.3"/> maybe you do graphics in Excel <pause dur="0.4"/> but don't get into ANOVA and regression in Excel <pause dur="0.2"/> you've gone off the end of Excel <pause dur="0.4"/> use the proper tool for the task you have <pause dur="0.5"/> when you have complicated <pause dur="0.7"/> analyses of data <pause dur="0.4"/> there are many statistics packages <pause dur="0.3"/> use the one that's most appropriate <pause dur="0.9"/> for you i think for experimental data <pause dur="0.2"/> it's absolutely clear <pause dur="0.2"/> that

Genstat is the most appropriate <pause dur="0.3"/> but the important thing is that you use <pause dur="0.2"/> one that is appropriate <pause dur="0.5"/> and <pause dur="0.2"/> you've fallen off the end <pause dur="0.3"/> of a very general purpose tool <pause dur="0.2"/> which is a spreadsheet <pause dur="1.8"/><vocal desc="cough" iterated="n"/><pause dur="0.7"/> so <pause dur="3.6"/><kinesic desc="changes slide" iterated="n"/> as i've said before this lecture <pause dur="1.0"/> try to use the standard format for your entry it will save time on the analysis later <pause dur="0.9"/> but if somebody has entered the data differently <pause dur="0.3"/> it's usually quicker to reorganize the data <pause dur="0.2"/> than to have to retype <pause dur="1.3"/> so <pause dur="0.2"/> don't say oh gosh you've made a big mistake <pause dur="0.2"/> you'd better start again <pause dur="1.2"/> use the computer to help reorganize the data <pause dur="0.6"/> use either a statistics package or a spreadsheet <pause dur="2.1"/> i find <pause dur="0.5"/> that Genstat <pause dur="0.6"/> the competition between Genstat and Excel <pause dur="0.3"/> for reorganizing the data <pause dur="0.4"/> for you <pause dur="0.3"/> would be quite a good one <pause dur="0.8"/> because <pause dur="0.2"/> i think for reorganizing <pause dur="0.3"/> if you knew neither package it would be quicker <pause dur="0.2"/> to learn and use Genstat <pause dur="0.8"/> because it's built to do that sort of reorganizing <pause dur="0.8"/> than to use Excel <pause dur="1.0"/> but <pause dur="0.2"/> life is not equal <pause dur="0.2"/> most of you know Excel very well and you hardly know Genstat <pause dur="0.4"/> so some of you will

prefer <pause dur="0.3"/> to reorganize your data in Excel <pause dur="0.4"/> and other people would prefer <pause dur="0.4"/> to learn a bit more of Genstat to do the reorganizing <pause dur="0.3"/> you should choose the method that's most appropriate for you <pause dur="0.4"/> and that will depend <pause dur="0.2"/> on your work in the future <pause dur="0.4"/> but if you're thinking of using Genstat seriously <pause dur="0.3"/> in the future <pause dur="0.3"/> then <pause dur="0.4"/> try and learn a bit about its methods <pause dur="0.2"/> for data manipulation and i'll show you <pause dur="0.3"/> a little bit about what that means <pause dur="2.7"/> first you have to understand what you're trying to do <pause dur="2.2"/><kinesic desc="changes slide" iterated="n"/> here is an example <pause dur="0.4"/> where i've taken the one from before <pause dur="0.3"/> where i assume we've entered the data in the wrong way <pause dur="2.0"/> and we want to go <pause dur="0.2"/> from this way <pause dur="0.2"/> to <pause dur="0.2"/> this way <pause dur="0.7"/> that's the way we would have had if we'd entered it correctly <pause dur="1.0"/> with <pause dur="0.2"/> the replicates and the treatments and the dry matter <pause dur="1.0"/> so you ought to see <pause dur="0.2"/> what it is <pause dur="0.2"/> we are trying to do <pause dur="0.3"/> we want to go <pause dur="0.2"/> i've said from this wide format <pause dur="1.3"/> to the long <pause dur="0.2"/> format <pause dur="2.6"/> we are trying <pause dur="1.0"/> to take these and i think you can see i hope you can see <pause dur="0.3"/> we're trying to stack these <pause dur="0.2"/> one below another <pause dur="1.3"/>
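The move from the wide format to the long format can be sketched in a few lines of plain Python. The replicate columns are stacked one below another, the treatment column is repeated alongside them, and a new column records which replicate each value came from. The function and its argument names are hypothetical and the values are invented. This is an illustration of the idea, not Genstat's own stack facility.

```python
def stack(wide_rows, value_columns, source_name, value_name, carry):
    """Stack value_columns one below another (wide to long).

    carry: columns repeated beside each stacked value (the factors).
    source_name: new column recording which original column a value came from.
    """
    long_rows = []
    for number, column in enumerate(value_columns, start=1):
        for row in wide_rows:
            new_row = {source_name: number}
            for factor in carry:
                new_row[factor] = row[factor]
            new_row[value_name] = row[column]
            long_rows.append(new_row)
    return long_rows


# Invented dry-matter values: one row per treatment, one column per replicate.
wide = [
    {"treat": "T1", "rep1": 10.2, "rep2": 11.0, "rep3": 9.8},
    {"treat": "T2", "rep1": 12.5, "rep2": 12.1, "rep3": 13.0},
]

long_format = stack(wide, ["rep1", "rep2", "rep3"], "rep", "dry_matter", ["treat"])
print(len(long_format))  # 2 treatments times 3 replicates
for row in long_format[:3]:
    print(row)
```

The long result has one row per plot, with columns for replicate, treatment and dry matter, which is the layout recommended for entry in the first place.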

so what goes across <pause dur="0.3"/> becomes stacked <pause dur="2.3"/> it's not # only that <pause dur="1.5"/> you could picture that's very easy in Excel <pause dur="0.8"/> but we want to do a little more than that <pause dur="0.5"/> because we want to take this <pause dur="0.2"/> which are already in the right form <pause dur="0.9"/> and we want to repeat that <pause dur="1.5"/> so <pause dur="0.2"/> this one on the left hand side <pause dur="0.4"/> gets repeated <pause dur="0.2"/> three times <pause dur="1.4"/> once <pause dur="0.2"/> for each column <pause dur="1.8"/> so it stays opposite <pause dur="1.5"/> not just this measurement but this measurement <pause dur="1.1"/> and <pause dur="0.3"/> these reps one two and three <pause dur="1.0"/><kinesic desc="indicates point on board" iterated="n"/> we want to have a new column called rep <pause dur="0.3"/> which goes one <pause dur="0.2"/> two <pause dur="0.2"/> three <pause dur="3.1"/> so <pause dur="0.5"/> it's not just stacking <pause dur="2.2"/> it's a bit more <pause dur="0.7"/> what do we mean in practice usually <pause dur="0.3"/> you want to stack <pause dur="0.2"/> the measurements <pause dur="1.3"/> and you want to carry <pause dur="0.2"/> the factors <pause dur="1.8"/> let's have a look at how that works in Genstat <pause dur="2.5"/><kinesic desc="changes slide" iterated="n"/> Genstat <pause dur="0.8"/> has a dialogue <pause dur="0.3"/> called stack <pause dur="3.8"/> you just have to go <pause dur="0.4"/> you should be getting to the stage with Genstat that you become curious and you think well there must be there somewhere <pause dur="0.4"/> and you go to spread <pause dur="0.5"/> manipulate <pause dur="0.4"/> and you will find stack <pause dur="1.3"/> within stack <pause dur="1.6"/> it will say how many columns <pause dur="0.3"/> do you want to

stack together <pause dur="1.3"/> well it should be quite easy for you to say i want three columns because i have rep one rep two rep three i want those <pause dur="0.3"/> stacked <pause dur="0.5"/> one below the other <pause dur="1.4"/> i want to record the column source i want to record where they came from <pause dur="0.3"/> in a column called rep <pause dur="0.3"/> that's a new column <pause dur="1.9"/> and then you put <pause dur="0.2"/> your observations <pause dur="0.5"/> which were called rep one rep two rep three which were actually the yields <pause dur="0.6"/> and you put them there <pause dur="0.2"/> you notice it puts a one <pause dur="0.3"/> beside <pause dur="1.1"/> to say they're all going to be one column <pause dur="1.3"/> you might have had <pause dur="0.2"/> more things as well <pause dur="0.9"/> and then it would have twos beside <pause dur="0.5"/> because that would now be a second column <pause dur="2.3"/> and you have a repeat column <pause dur="0.2"/> which is treat <pause dur="1.6"/> you want treat <pause dur="0.2"/> to be repeated <pause dur="0.2"/> each time that's what we said was needed <pause dur="0.9"/> and when you click on okay <pause dur="3.4"/><kinesic desc="changes slide" iterated="n"/> then you will find <pause dur="0.7"/> that it will produce <pause dur="1.1"/> exactly <pause dur="0.4"/> that <pause dur="0.2"/> spreadsheet <pause dur="2.6"/> so it will take that <pause dur="0.7"/> and produce that <pause dur="4.5"/> i don't think that's too difficult <pause dur="2.2"/> so that is stacking in <pause dur="0.4"/> Genstat <pause dur="8.4"/> once you've got that far and you realize that's quite easy <pause dur="0.5"/> i <pause dur="2.0"/> just

occasionally <pause dur="0.3"/> you want to go the other way round <pause dur="4.9"/> not often <pause dur="0.7"/> but sometimes <pause dur="0.2"/> you have it long <pause dur="0.4"/> and you want to go wide <pause dur="1.6"/> and if you ever need that <pause dur="0.7"/> there is you should now not be surprised if there's a stack dialogue <pause dur="0.3"/> there is also an unstack <pause dur="1.1"/> and <pause dur="0.2"/> this is like stacking things on a shelf <pause dur="0.6"/> you you can either stack them up <pause dur="0.3"/> or you can take them from there <pause dur="0.2"/> and you move them <pause dur="0.6"/> sideways so that they're now unstacked <pause dur="0.7"/> so that also exists <pause dur="6.4"/> that is one sort of data manipulation <pause dur="2.0"/> the second one <pause dur="0.4"/> i'm going to <pause dur="0.2"/> mention that it exists and then leave it <pause dur="0.5"/> because <pause dur="0.6"/> i think there is no not time for the repeated measures <pause dur="0.4"/> so let me just mention <pause dur="0.3"/> that <pause dur="0.2"/> the next it's in your notes <pause dur="0.9"/><kinesic desc="changes slide" iterated="n"/> that the next is that you sometimes have data <pause dur="0.3"/> at two levels <pause dur="0.7"/> and you must summarize one level <pause dur="0.3"/> to move it to another level <pause dur="2.0"/> the example that <pause dur="0.2"/> you have in the notes <pause dur="0.3"/> is data that were measured <pause dur="0.7"/> yields were measured on the plot level <pause dur="0.7"/> but tuber number was measured on the plant level <pause dur="0.4"/> and you want to summarize the tuber number

onto the plot level <pause dur="0.3"/> to analyse with your other data <pause dur="0.7"/> and you'll be doing that in the practical <pause dur="0.3"/> one of the examples does that <pause dur="0.3"/> and that is called <pause dur="0.2"/> data summary <pause dur="0.3"/> before the analysis <pause dur="1.1"/> and that's very common <pause dur="0.3"/> and is usually where measurements are made at a lower level <pause dur="0.4"/> than the one where the treatments were applied <pause dur="1.2"/> so here the varieties were applied at the plot level <pause dur="0.4"/> but you measured something <pause dur="0.2"/> on plants within the same plot <pause dur="2.5"/> and so i just give you the three examples that you've had already <pause dur="0.9"/> in session three <pause dur="0.2"/> there were tuber measurements on twenty plants in each plot <pause dur="2.2"/> in session in the last session <pause dur="0.3"/> you had tree measurements <pause dur="0.8"/> this was what <gap reason="name" extent="1 word"/> <pause dur="0.4"/> described <pause dur="0.3"/> on four trees in each plot <pause dur="0.4"/> so you have <pause dur="0.2"/> a plot was four trees and you measured <pause dur="0.4"/> the girth <pause dur="0.8"/> and other measurements <pause dur="0.3"/> on each tree <pause dur="0.2"/> within the plot <pause dur="0.6"/> but you want to analyse at the plot level <pause dur="0.5"/> because that's where you applied your treatments <pause dur="1.1"/> and in the Genstat guide <pause dur="1.2"/> the very simple example that you will have seen in the

first part of the guide <pause dur="0.2"/> had four replicates and three treatments <pause dur="0.9"/> and there were twelve pens <pause dur="0.7"/> but there were two sections to each pen <pause dur="0.5"/> and that's the example in the Genstat guide <pause dur="1.8"/> so <pause dur="0.2"/> there we have <pause dur="0.2"/> twelve <pause dur="0.2"/> plots <pause dur="0.4"/> and two sections in a pen so we have twenty-four sections <pause dur="0.4"/> and we want to summarize from the twenty-four <pause dur="0.2"/> up to the twelve <pause dur="1.9"/> now <pause dur="0.2"/> i've put at the bottom <pause dur="0.2"/> check you recognize these <pause dur="0.3"/> these three as the same problem <pause dur="0.9"/> because in the Genstat guide <pause dur="0.4"/> we show you in detail how to solve this problem <pause dur="1.0"/> you will meet it very often and if you recognize that this is the same problem as this and this and many others <pause dur="0.6"/> then you can use Genstat <pause dur="0.4"/> to do <pause dur="0.2"/> that initial summary <pause dur="2.0"/> and <pause dur="0.3"/> let me just show you the <pause dur="1.1"/> the dialogue <pause dur="0.2"/> that you get with Genstat <pause dur="2.3"/><kinesic desc="changes slide" iterated="n"/> without giving you <pause dur="2.4"/> there's your data at your low level <pause dur="0.4"/> you want to get the total or the mean <pause dur="0.2"/> at the higher level <pause dur="0.2"/> and you want to carry other things along <pause dur="0.2"/> so your analysis can proceed at the higher level <pause dur="0.4"/> and so again <pause dur="0.4"/> you should find <pause dur="0.3"/> there is a summarize the

spreadsheet <pause dur="0.3"/> which is another dialogue <pause dur="1.0"/> which will help you <pause dur="0.3"/> when you've got very detailed data at one level <pause dur="0.2"/> and you want to move up <pause dur="0.2"/> to another level <pause dur="4.6"/> so that's in your notes <pause dur="0.4"/> and it'll be in the practical <pause dur="2.6"/> does anybody have any <pause dur="0.3"/> i haven't covered it because <pause dur="0.6"/> i'm a little behind and i would like to tackle the repeated measures <pause dur="1.1"/> does anybody have any comments or questions on that <pause dur="2.2"/> yes </u><pause dur="0.3"/> <u who="sm0675" trans="pause"> case we have # four trees # in each plot <pause dur="0.6"/> <gap reason="inaudible" extent="1 sec"/> we know for example the <unclear>directional reach</unclear> of the trees </u><u who="nm0658" trans="latching"> yes </u><pause dur="0.2"/> <u who="sm0675" trans="pause"> so we wanted to know the production for the block <pause dur="0.2"/> <gap reason="inaudible" extent="1 sec"/> separately then <pause dur="0.5"/> we add all the production of the four <pause dur="0.7"/> trees and then # we'd like to know <pause dur="0.5"/> # for the whole plot <pause dur="0.5"/> <gap reason="inaudible" extent="1 sec"/></u> <u who="nm0658" trans="overlap"> that's correct that <pause dur="0.2"/> that that's very common <pause dur="0.5"/> # <pause dur="0.4"/> you you don't have to and in the notes <pause dur="0.2"/> we describe you can do the analysis at the tree level <pause dur="0.7"/> but you must accept that you applied <pause dur="0.2"/> your treatment at the plot level <pause dur="0.5"/> so the usual thing is to say <pause dur="0.2"/> what was the production for the whole plot <pause dur="0.5"/> and then you must say well do i want the mean production per tree <pause dur="0.3"/> or the total production <pause dur="0.2"/> and that depends on the problem <pause dur="1.9"/>

on # some experiments i analysed <pause dur="0.4"/> # on disease <pause dur="0.7"/> there were <pause dur="0.7"/> measurements <pause dur="0.3"/> a an <pause dur="0.6"/> an insecticide <pause dur="0.2"/> was applied <pause dur="0.4"/> at the whole plot <pause dur="0.9"/> and then ten plants were measured <pause dur="0.7"/> to see how diseased they were <pause dur="1.4"/> and <pause dur="0.4"/> the idea was now <pause dur="0.2"/> for each plot you wanted a measure of disease <pause dur="1.8"/> in the olden days people always used <pause dur="0.7"/> the mean <pause dur="0.2"/> disease score <pause dur="0.5"/> as a measure of the disease per plot <pause dur="0.8"/> we decided <pause dur="0.5"/> that because we wanted to see the most effective <pause dur="0.4"/> insecticide <pause dur="0.2"/> we should also calculate <pause dur="0.3"/> the maximum disease score <pause dur="0.2"/> namely the disease score of the worst <pause dur="0.3"/> plant in the plot <pause dur="0.3"/> as a summary number <pause dur="0.4"/> to say <pause dur="0.2"/> that characterizes <pause dur="0.6"/> the worst <pause dur="0.2"/> that could <trunc>possib</trunc> if that worst is pretty good it's a good insecticide <pause dur="0.6"/> so <pause dur="0.2"/> it's not always <pause dur="0.2"/> usually you have the total or the mean in the plot <pause dur="0.3"/> but it can be the maximum or the minimum <pause dur="0.4"/> that is also useful <pause dur="0.3"/> so you calculate a summary statistic for the plot <pause dur="0.6"/> which you then analyse at the plot level <pause dur="6.3"/><kinesic desc="changes slide" iterated="n"/> last subject <pause dur="1.9"/> repeated measures <pause dur="3.3"/> these are very common in designed

experiments <pause dur="1.1"/> where measurements are repeated on the same unit <pause dur="2.6"/> they can be in time <pause dur="0.4"/> or in space <pause dur="1.9"/> and the problem <pause dur="0.3"/> from a statistical point of view is the same <pause dur="1.2"/> so <pause dur="0.5"/> animals weighed each week <pause dur="0.8"/> would be <pause dur="0.3"/> i'm assuming the animal receives a particular treatment <pause dur="0.7"/> at the beginning of the experiment or has having a diet as you go through the experiment <pause dur="0.3"/> and is weighed each week <pause dur="1.0"/> so rather than just an ordinary yield experiment where you measure just once at the end <pause dur="0.3"/> you record successively in time <pause dur="1.3"/> that is a repeated measure in time <pause dur="1.8"/> most <pause dur="0.2"/> repeated measures are measures in time <pause dur="2.1"/> occasionally here's another example tree diameter is recorded every six months <pause dur="0.4"/> to see about the growth <pause dur="1.9"/> in human experiments <pause dur="0.4"/> you often do lots of measurements in time <pause dur="0.3"/> you measure the weight of babies <pause dur="0.3"/> every month <pause dur="0.3"/> for a year <pause dur="1.7"/> you measure <pause dur="0.3"/> the effectiveness of a treatment for cancer <pause dur="0.4"/> by recording <pause dur="0.2"/> every three months <pause dur="1.3"/> the effect on patients and so on <pause dur="0.4"/> so this is very common <pause dur="0.2"/> in many fields of application <pause dur="0.9"/> just

occasionally <pause dur="0.2"/> these repeated measures are in space <pause dur="0.6"/> an example <pause dur="0.3"/> would be if you have a hedge <pause dur="1.0"/> and you want to see <pause dur="0.2"/> the effect of the hedge on the plants <pause dur="0.3"/> that are <pause dur="0.3"/> close to the hedge you might have <pause dur="0.3"/> a hedge <pause dur="0.2"/> # with a certain tree species <pause dur="0.5"/> and then <pause dur="0.2"/> you have some rows of maize <pause dur="0.3"/> which grow <pause dur="0.4"/> and you have six rows <pause dur="0.5"/> going away from the hedge <pause dur="0.6"/> and now you want to measure each row <pause dur="1.3"/> so you applied your plot <pause dur="0.2"/> consists of your hedge <pause dur="0.4"/> with six rows each side <pause dur="0.6"/> and now <pause dur="0.2"/> you want to repeat the measurement namely the yield <pause dur="0.4"/> but not for the whole plot <pause dur="0.3"/> but for each row <pause dur="0.4"/> to see the effect in space <pause dur="0.3"/> as you go further away <pause dur="0.2"/> from the hedge <pause dur="1.1"/> and that is a problem of repeated measures <pause dur="0.3"/> in space <pause dur="3.7"/> those repeated measures introduce problems of data management and analysis which we're going to look at <pause dur="1.1"/> and it reviews many of the ideas <pause dur="0.5"/> from this part of the course <pause dur="3.0"/><kinesic desc="changes slide" iterated="n"/> we'll have a very simple example <pause dur="0.5"/> and so you have more information it's in the Genstat guide <pause dur="1.9"/> there are five <pause dur="0.5"/> replicates <pause dur="0.2"/> this is an example for you because

it's example where there are no blocks <pause dur="0.2"/> one or two <pause dur="0.3"/> # so there are just <pause dur="0.2"/> fifteen petri dishes little dishes <pause dur="1.6"/> and <pause dur="0.5"/> there are three isolates of a fungus <pause dur="0.7"/> and they're repeated <pause dur="0.2"/> five times <pause dur="0.4"/> but there's no <pause dur="0.2"/> group of five here group of five here there are just fifteen of them <pause dur="2.1"/> and there were six measurements made <pause dur="0.6"/> on days three four five six seven and eight <pause dur="2.3"/><vocal desc="cough" iterated="n"/><pause dur="0.3"/> how many plots <pause dur="0.3"/> how many measurements <pause dur="1.0"/> is it clear about the normal data and how many plots <pause dur="4.8"/> how many plots <pause dur="0.3"/> easy question <pause dur="6.3"/> fifteen </u><u who="sm0676" trans="latching"> fifteen </u><u who="nm0658" trans="latching"> it is <pause dur="0.4"/> i'm sorry you you probably were confused 'cause it's too easy <pause dur="0.5"/> # it depends whether you think of your your confuse your measurements with your plots you see if you don't <pause dur="0.3"/> it should be obvious you have a question </u><pause dur="0.3"/> <u who="sf0677" trans="pause"> <gap reason="inaudible" extent="1 sec"/> <pause dur="0.2"/> destructive <pause dur="0.7"/> measurements on the same plot as the <gap reason="inaudible" extent="1 sec"/> </u><u who="nm0658" trans="latching"> the <trunc>des</trunc> <pause dur="0.5"/> <trunc>tha</trunc> that's a very good question <pause dur="1.1"/> when you are measuring on the same plot <pause dur="1.4"/> you have the choice of measuring let's say height <pause dur="1.0"/> which is <pause dur="0.4"/> something you can go back to exactly the same plant <pause dur="0.2"/> and measure <pause dur="0.6"/> or <pause dur="0.8"/> harvesting a few plants in the plot <pause dur="1.1"/> from a stats point of

view <pause dur="0.4"/> it's roughly the same thing <pause dur="0.4"/> because <pause dur="0.5"/> they are <pause dur="0.3"/> they're still within the same plot <pause dur="0.7"/> but from a precision point of view <pause dur="0.4"/> it's much better <pause dur="0.4"/> if you can <pause dur="0.6"/> <trunc>i</trunc> <trunc>i</trunc> it isn't quite the same because <pause dur="0.2"/> if you go back to the same plant <pause dur="0.7"/> then your repeated measure <pause dur="0.6"/> is at the level plant <pause dur="0.4"/> if you're measuring the height <pause dur="1.1"/> whereas if it's destructive <pause dur="0.6"/> and <pause dur="0.2"/> you're measuring let's say the height of four plants and then you throw them away <pause dur="1.0"/> then <pause dur="0.4"/> and you measure the height of four more plants <pause dur="0.2"/> then you're still repeating the measure <pause dur="0.3"/> but the level is the plot level not the plant level because you can't go back to the same plant <pause dur="0.2"/> because it's harvested <pause dur="0.7"/> so where it differs <pause dur="0.2"/> is the level <pause dur="0.2"/> at which you're able to do the repeat <pause dur="0.9"/> usually we find <pause dur="0.3"/> that the lower the level you can do the repeat the better it is <pause dur="0.9"/> so <pause dur="0.2"/> we often find <pause dur="0.3"/> that non-destructive measurements <pause dur="0.4"/> are tremendously useful <pause dur="0.5"/> and last week <pause dur="0.2"/> i <pause dur="0.3"/> was examining somebody whose thesis <pause dur="0.4"/> is on taking aerial photographs <pause dur="0.4"/> of plots <pause dur="0.5"/> where <pause dur="0.2"/> you can measure <pause dur="0.4"/> the <pause dur="0.7"/> the area

roughly of each plant <pause dur="0.5"/> repeatedly <pause dur="0.8"/> very very easily by taking a photograph <pause dur="1.1"/> and that is <pause dur="0.5"/> non-destructive <pause dur="1.0"/> and <pause dur="0.3"/> was shown to be a very good way <pause dur="0.6"/> compared to these small harvests that people often take <pause dur="0.4"/> where you get exactly what you want namely the harvest but it destroys it so you can't measure the same plants later on <pause dur="5.0"/> here's the data <pause dur="0.6"/> well sorry <pause dur="0.4"/> # i haven't <pause dur="0.5"/> i'd asked you how many plots there are <pause dur="0.3"/> but it hasn't <pause dur="0.4"/> answered <pause dur="0.9"/> you should be saying <pause dur="0.4"/> there are six measurements so that's six columns <pause dur="1.1"/> the fact they're on days three four five six seven eight doesn't affect <pause dur="0.2"/> and there are fifteen <pause dur="1.2"/> plots <pause dur="0.4"/> therefore i'm going to have <pause dur="0.2"/> fifteen rows of data <pause dur="1.2"/> and so <pause dur="0.2"/> the data <pause dur="1.1"/> are going to look <pause dur="0.3"/> well here's an example of the way the data could look <pause dur="1.5"/><kinesic desc="changes slide" iterated="n"/> where <pause dur="0.2"/> there's the unit <pause dur="0.2"/> there's the isolate there's the rep <pause dur="0.3"/> and there's my measurements on day <pause dur="0.2"/> three four five six seven eight <pause dur="9.5"/><vocal desc="cough" iterated="n"/><pause dur="3.4"/> now <pause dur="0.6"/> you now have to think of your strategy <pause dur="0.4"/> for the analysis <pause dur="4.3"/> and here we begin <pause dur="0.2"/> with a slight <pause dur="0.3"/> problem <pause dur="2.1"/> that is <pause dur="0.2"/> i would like you to

look constructively at the data <pause dur="1.4"/> exploratory analysis <pause dur="0.2"/> we said <pause dur="0.2"/> is very important <pause dur="0.3"/> i wonder how you would like to explore <pause dur="0.5"/> these data <pause dur="2.7"/> well a very common way that people would like to explore the data <pause dur="0.8"/> is <pause dur="0.2"/> to see <pause dur="0.8"/> what's the change in observation <pause dur="0.4"/> over time <pause dur="2.3"/> sort of <pause dur="0.3"/> notice this goes <pause dur="0.4"/> on the first petri dish i go three-point-seven five six-point-one <pause dur="0.2"/> seven-point-five eight-point-three <trunc>nine-point-ei</trunc> <pause dur="0.2"/> seem to going up with time <pause dur="1.5"/> to understand my data <pause dur="0.5"/> maybe it would be nice <pause dur="0.8"/> to look at that sort of graph <pause dur="0.5"/> as a function of time <pause dur="0.9"/> that would be a nice exploratory method <pause dur="0.3"/> but unfortunately for a stats package <pause dur="0.3"/> exploration works best on columns <pause dur="1.3"/> so you may wish to do that exploration in Excel <pause dur="1.9"/> or <pause dur="1.2"/> you will find Genstat helps <pause dur="1.3"/> so a strategy for the analysis <pause dur="4.4"/><kinesic desc="changes slide" iterated="n"/> nothing changes you've changed the problem but you haven't changed the strategy which is <pause dur="0.2"/> please start by looking critically at your data <pause dur="1.3"/> so start with data exploration and that's usually graphs <pause dur="0.5"/> so you look at all the data <pause dur="0.2"/> are there any odd

observations <pause dur="0.7"/> you could do those Excel or in Genstat <pause dur="0.6"/> and you could get one graph for each plot <pause dur="0.5"/> so there'd be fifteen graphs <pause dur="0.4"/> be one way of exploring the data <pause dur="1.2"/> let's have a look at that first <pause dur="5.4"/> i'll come back to that for the second part <pause dur="0.3"/> so there's a way <pause dur="2.4"/><kinesic desc="changes slide" iterated="n"/> in Genstat <pause dur="1.3"/> of exploring the data <pause dur="4.1"/> and you will find that Genstat has a little menu <pause dur="2.5"/> which <pause dur="1.6"/> puts this out automatically <pause dur="1.0"/> so here's the graphs <pause dur="0.3"/> for <pause dur="0.2"/> # one treatment <pause dur="0.3"/> the second treatment <pause dur="0.2"/> and the third treatment <pause dur="2.8"/> and here is the graph <pause dur="0.3"/> for <pause dur="0.4"/> the means <pause dur="0.4"/> for the three treatments <pause dur="0.3"/> so there's a bit of analysis <pause dur="0.3"/> but here we have all our data <pause dur="0.8"/> this is the ninety observations <pause dur="0.7"/> there are fifteen lines <pause dur="0.2"/> because we have fifteen plots <pause dur="0.8"/> and each line has six points <pause dur="0.2"/> because we have six time points <pause dur="0.4"/> so we actually have all our data <pause dur="0.4"/> here <pause dur="0.8"/> and we could see if there was some odd observations <pause dur="0.7"/> i don't see anything particularly odd </u><pause dur="0.4"/> <u who="sm0678" trans="pause"> when you say odd <pause dur="0.7"/> what exactly do you mean </u><pause dur="8.7"/> <u who="nm0658" trans="pause"> the full set of numbers <pause dur="2.0"/> consists <pause dur="0.2"/> here's all our data <pause dur="0.9"/> these are all the numbers <pause dur="1.4"/> and what we have <pause dur="0.3"/> is we have

fifteen plots so we're actually using all the numbers <pause dur="0.3"/> in those graphs you can see every number <pause dur="0.4"/> somewhere there so we're using <pause dur="0.2"/> all our data <pause dur="0.8"/> in producing those plots </u><u who="sm0678" trans="latching"> my question is by looking at the graphs <pause dur="0.2"/> what <gap reason="inaudible" extent="1 word"/> <pause dur="0.9"/> are we looking for <pause dur="0.7"/> <gap reason="inaudible" extent="1 sec"/> </u><u who="nm0658" trans="overlap"> # okay <pause dur="10.0"/> does anybody have anything they've found without knowing seriously what they're looking for </u> <pause dur="3.3"/> <u who="sm0679" trans="pause"> there is a </u><u who="nm0658" trans="overlap"> any impressions </u><u who="sm0679" trans="latching"> increase # <pause dur="0.9"/> <unclear>to an</unclear> X<pause dur="0.2"/>-axis <pause dur="0.5"/> and </u><u who="nm0658" trans="overlap"> there seems to be an increase <pause dur="1.4"/> any # <trunc>re</trunc> remember your chick experiment <pause dur="0.4"/> and the increases sort of straight lines or curves </u><pause dur="1.4"/> <u who="sm0680" trans="pause"> curves </u><u who="sm0681" trans="latching"> curves </u><pause dur="0.5"/> <u who="nm0658" trans="pause"> curves <pause dur="1.7"/> we we <pause dur="0.4"/> sort of like that </u><u who="sm0682" trans="overlap"> yes </u><u who="nm0658" trans="overlap"> or <pause dur="0.2"/> wiggeldy </u><pause dur="1.2"/> <u who="sf0683" trans="pause"> some of them </u><pause dur="0.5"/> <u who="sm0682" trans="pause"> mm some of them were </u><u who="sf0683" trans="latching"> and some were straight mm </u><pause dur="2.1"/> <u who="nm0658" trans="pause"> i don't notice any that go sort of right like that <pause dur="1.0"/> sort of <pause dur="0.2"/> starting going up to the top and coming down <pause dur="0.4"/> remember <pause dur="0.4"/> we're worrying about statistics there's variation so <pause dur="0.2"/> <trunc>th</trunc> everything 
you can't have things that are exactly straight 'cause we're just connecting the points <pause dur="1.5"/> do you do you think <pause dur="0.3"/> that <pause dur="0.3"/> for some of these <pause dur="0.6"/> it would be sensible to have a straight line model <pause dur="0.3"/> would that be a rough reasonable summary </u><pause dur="0.6"/>

<u who="sm0684" trans="pause"> <gap reason="inaudible" extent="1 word"/></u><pause dur="0.4"/> <u who="nm0658" trans="pause"> for all the plots <pause dur="1.7"/> are there any plots where you think a straight line model isn't going to be sensible <pause dur="2.1"/> i don't actually see many <pause dur="0.4"/> which is surprising usually you find <pause dur="0.4"/> maybe one treatment is curved and the other treatments are straight <pause dur="1.2"/> as you found with the barley and the wheat one was more curved than another you might <pause dur="0.3"/> want to <pause dur="0.2"/> recognize that <pause dur="1.4"/> does anybody see any very surprising observations <pause dur="2.4"/> i don't <pause dur="0.8"/> i don't i don't sort of see a sudden spike like this <pause dur="0.4"/> which might be a recording error <pause dur="1.3"/> so this is exploration <pause dur="0.6"/> and <pause dur="0.4"/> exploration can be positive or negative <pause dur="0.7"/> usually i find <pause dur="0.2"/> you notice one or two very odd observations <pause dur="0.2"/> here i don't see anything very odd <pause dur="0.4"/> and things seem to be increasing <pause dur="0.7"/> so that here <pause dur="0.7"/> where this <pause dur="0.2"/> is the mean <pause dur="0.3"/> of that one <pause dur="2.2"/> i feel <pause dur="0.2"/> reasonable confidence that drawing a straight line <pause dur="0.4"/> which is the average for those points <pause dur="0.5"/> is probably a reasonable summary <pause dur="1.1"/> and i notice now <pause dur="0.2"/> that this straight line so this is analysis and this

is just presentation of the raw data <pause dur="1.9"/> and i now notice in this summary <pause dur="0.5"/> that <pause dur="0.7"/> these <pause dur="0.6"/> all three seem to be going up <pause dur="0.3"/> but this maybe is going up more gradually A it's lower <pause dur="0.2"/> and it's going up more gradually <pause dur="1.1"/> so from these <pause dur="0.6"/> i feel that this is probably a fair summary of the data i don't see any reason <pause dur="0.2"/> to say oh gosh it's not fair because of <pause dur="0.2"/> this <pause dur="0.9"/> and in here i'm starting my analysis <pause dur="1.1"/> and i've done that in a very simple way and visually <pause dur="2.3"/> does that help to answer your question <pause dur="9.6"/><vocal desc="cough" iterated="n"/><pause dur="3.4"/> okay so we have our data <pause dur="1.9"/> and <pause dur="0.4"/> i was looking for the strategy <pause dur="0.5"/> okay <pause dur="2.6"/><kinesic desc="changes slide" iterated="n"/> so <pause dur="1.7"/> i suggest for all analyses you start with a simple summary <pause dur="0.2"/> and then you go on to simple analyses <pause dur="2.1"/> what could the simple analyses be <pause dur="4.0"/><kinesic desc="changes slide" iterated="n"/> well here we have the data <pause dur="4.0"/> one simple analysis <pause dur="0.2"/> could be to analyse the data on day three <pause dur="2.4"/> just take one time point <pause dur="0.9"/> another one <pause dur="0.6"/> day eight <pause dur="0.8"/> so there's a very simple analysis <pause dur="0.3"/> you could analyse <pause dur="0.2"/> each of your observations <pause dur="0.2"/> separately <pause dur="1.8"/> the next simple analysis <pause dur="0.4"/> could be <pause dur="0.6"/> to take a useful summary <pause dur="2.5"/> one summary might be the difference between

day eight and day three <pause dur="0.2"/> has the change <pause dur="0.2"/> been the same for each treatment and each replicate <pause dur="1.7"/> so that's what we suggest <pause dur="4.7"/><event desc="looks through slides" iterated="y" dur="22"/> i keep losing <pause dur="0.5"/> the # <pause dur="17.0"/><kinesic desc="changes slide" iterated="n"/> so i've suggested the first simple analysis could be the data at each time point <pause dur="1.4"/> then we could have simple function <pause dur="0.9"/> like the final minus the initial <pause dur="0.4"/> or <pause dur="0.6"/> we could have the slope <pause dur="0.2"/> of each line <pause dur="1.0"/> now you don't want the slope <pause dur="0.2"/> if too many lines could be very curved <pause dur="0.4"/> but here i think getting the slope of each line <pause dur="0.2"/> might be a sensible summary <pause dur="8.8"/><kinesic desc="changes slide" iterated="n"/> as a strategy what sort of strategy is this <pause dur="1.6"/> well i would claim that the repeated measures <pause dur="0.5"/> are like observations at a lower level <pause dur="0.5"/> they're not exactly the same <pause dur="0.7"/> but they're a little like a split plot analysis they're like taking <pause dur="0.2"/> day <pause dur="0.4"/> as <pause dur="0.2"/> a level within a petri dish <pause dur="2.1"/> and then <pause dur="0.5"/> in this example we have six observations within each petri dish <pause dur="0.2"/> for the six days <pause dur="1.4"/> or we have ten weights within each animal <pause dur="1.1"/> so it's a little like a split plot <pause dur="0.2"/> experiment <pause dur="0.3"/> where the factor time <pause dur="0.7"/> is within <pause dur="0.4"/> the treatment <pause dur="1.1"/> just as <pause dur="0.3"/> an

ordinary split plot <pause dur="0.8"/> but it's not quite a split plot <pause dur="0.2"/> because we don't randomize the times <pause dur="1.0"/> we can't <pause dur="2.4"/> like <pause dur="0.7"/> anything like that <pause dur="0.2"/> the analysis will be simpler if we first get a summary value at the plot level <pause dur="1.2"/> so our analysis is going to be simple if we summarize up <pause dur="0.6"/> to the petri dish <pause dur="1.5"/> whatever we do <pause dur="0.2"/> we've got fifteen petri dishes <pause dur="0.2"/> whatever summary we'd like to get <pause dur="0.5"/> that would be a simple <pause dur="0.2"/> analysis <pause dur="0.2"/> if we can go <pause dur="0.2"/> you were asking about split plots and i was saying <pause dur="1.0"/> simple experiments are at one level <pause dur="0.9"/> well repeated measures <pause dur="0.2"/> bring in a second level <pause dur="0.9"/> there are many methods for analysing repeated measures <pause dur="0.2"/> bringing in the two levels <pause dur="0.7"/> but the simplest is not to have two levels <pause dur="0.3"/> but is to summarize the data <pause dur="0.3"/> from the repeated measures up to the one level <pause dur="2.1"/> because that's where we apply the treatment <pause dur="1.6"/> which summary is appropriate <pause dur="0.5"/> depends on the data and the objectives <pause dur="0.9"/> and the booklet on analysis that i gave out in week eleven <pause dur="0.3"/> gives you some more details <pause dur="3.9"/><kinesic desc="changes slide" iterated="n"/> so <pause dur="0.7"/> <reading>the graphical display <pause dur="0.3"/> indicates <pause dur="0.3"/> that

a useful summary might be the slope of the regression line for each petri dish</reading> <pause dur="3.1"/> so we then have a problem how do we get the fifteen slopes <pause dur="2.0"/> in Excel we could use the data as they stand <pause dur="1.0"/> in Genstat <pause dur="0.2"/> be better to stack the data <pause dur="1.2"/> so now to do those slopes <pause dur="0.5"/> because Genstat works with columns <pause dur="0.2"/> be better to stack <pause dur="0.2"/> the data <pause dur="1.9"/> and that was shown <pause dur="0.3"/> earlier <pause dur="3.5"/><kinesic desc="changes slide" iterated="n"/> and once you analyse <pause dur="0.2"/> with the stack data <pause dur="0.9"/> you will find <pause dur="1.1"/> and the way we'd go through that <pause dur="0.3"/> and get the regression <pause dur="0.2"/> you will find <pause dur="0.3"/> that here are the fifteen <pause dur="0.8"/> observations <pause dur="0.3"/> and here are the fifteen slopes <pause dur="1.4"/> and here is the analysis <pause dur="0.3"/> where we're actually analysing the slopes <pause dur="1.0"/> we're getting the individual slopes <pause dur="0.4"/> and there's fourteen degrees of freedom here <pause dur="0.4"/> because there's fifteen petri dishes <pause dur="1.2"/> and <pause dur="0.3"/> we find <pause dur="0.4"/> that # the effect of slope is statistically significant <pause dur="0.2"/> and these are <pause dur="0.2"/> the three slopes <pause dur="1.6"/> which are the <trunc>lin</trunc> the slopes of those three lines <pause dur="0.3"/> and we find <pause dur="0.2"/> that these first two <pause dur="0.4"/> treatments are about the same <pause dur="0.2"/> but this slope <pause dur="0.3"/> is rather <pause dur="1.2"/> flatter <pause dur="9.1"/><kinesic desc="changes slide" iterated="n"/> there

are many other methods of analysis of repeated measures <pause dur="0.5"/> and Genstat <pause dur="0.4"/> has a whole set of dialogues specially for that <pause dur="1.3"/> they all try and get more levels more information <pause dur="0.3"/> by leaving the data at the two levels <pause dur="0.3"/> rather than summarizing up <pause dur="0.5"/> to one level <pause dur="1.6"/> they're often attractive in principle <pause dur="2.9"/> they're needed that should be an if <pause dur="0.3"/> they're needed if there is not enough data <pause dur="4.8"/> but <pause dur="0.9"/> to me they have a major problem <pause dur="0.4"/> that they are much more complicated <pause dur="0.4"/> and often <pause dur="0.3"/> their real problem is you can't tailor the analysis to the precise objectives of your research <pause dur="0.7"/> so they're like many complicated analyses <pause dur="0.4"/> that they're wonderful <pause dur="0.3"/> in principle <pause dur="0.3"/> but in practice they don't help <pause dur="1.1"/> that they are playing with data <pause dur="0.2"/> often <pause dur="1.1"/> and <pause dur="0.2"/> my conclusion is <pause dur="0.2"/> use the simple methods wherever possible <pause dur="1.4"/> and if you do use the more complex methods which you can because Genstat provides them from menus <pause dur="0.9"/> then don't just use them <pause dur="0.2"/> make sure they add constructively <pause dur="0.4"/> to what you were able to do quite simply <pause dur="0.3"/> with the simple methods <pause dur="6.1"/><kinesic desc="changes slide" iterated="n"/> okay <pause dur="0.5"/> practical work <pause dur="1.7"/>

the practical follows the topics covered in this session so it's useful for you to review those <pause dur="0.3"/> so we're doing some on designing some on managing data and some on repeated measures <pause dur="1.5"/> in each case <pause dur="0.4"/> i've deliberately used examples from the Genstat guides <pause dur="0.5"/> so <pause dur="0.5"/> you don't have to finish <pause dur="0.3"/> just concentrate on those parts you find most interesting <pause dur="0.3"/> and that will help you <pause dur="0.4"/> get more experience <pause dur="0.2"/> in using Genstat <pause dur="5.2"/><kinesic desc="changes slide" iterated="n"/> and <pause dur="1.4"/> two final slides <pause dur="3.4"/> this is now the end of the five sessions <pause dur="0.3"/> which are specifically for the analysis of experimental data <pause dur="1.5"/> you now should have two things <pause dur="0.8"/> the first is <pause dur="0.2"/> the broad picture of the role of statistics in research projects <pause dur="1.0"/> which has come from <pause dur="0.3"/> sessions one to five <pause dur="1.4"/> that was last term <pause dur="0.8"/> so you should have an idea of how you use statistics in design in data management <pause dur="0.8"/> you should have
reviewed <pause dur="0.3"/> basic statistical techniques <pause dur="1.0"/> statistical inference ANOVA simple regression <pause dur="0.2"/> that was the second part last term <pause dur="1.1"/> now you should be familiar with some of the special methods for analysing experimental data <pause dur="2.2"/> and hopefully <pause dur="0.9"/> you are therefore ready <pause dur="0.3"/> for a brief introduction <pause dur="0.4"/> to <pause dur="0.5"/> the role of modern statistical methods <pause dur="0.9"/> to help you <pause dur="0.4"/> in processing your research data <pause dur="7.8"/><kinesic desc="changes slide" iterated="n"/> the lecture room next week <pause dur="1.0"/> is <pause dur="0.2"/> the plant sciences <pause dur="0.3"/> lecture room so we're not here next week <pause dur="0.4"/> we're together with the other group <pause dur="0.3"/> in plant sciences </u><pause dur="0.2"/> <u who="ss" trans="pause"> <gap reason="inaudible, multiple speakers" extent="2 secs"/></u><u who="nm0658" trans="overlap"> in </u><u who="ss" trans="overlap"> <gap reason="inaudible, multiple speakers" extent="1 sec"/></u><pause dur="0.3"/> <u who="nm0658" trans="pause"> sorry </u><u who="ss" trans="overlap"> <gap reason="inaudible, multiple speakers" extent="1 sec"/></u><u who="nm0658" trans="latching"> the ground floor i <trunc>pres</trunc> the ground floor lecture room in plant sciences </u><pause dur="0.5"/> <u who="sm0685" trans="pause"> you mean the small lecture theatre </u><pause dur="0.9"/> <u who="sm0686" trans="pause"> there's one <pause dur="0.3"/> <gap reason="inaudible" extent="1 word"/> </u><u who="nm0658" trans="overlap"> i hope not it's <trunc>w</trunc> </u><u who="sf0687" trans="overlap"> there's one lecture theatre anyway isn't it </u><u who="nm0658" trans="overlap"> sorry </u><u who="sf0687" trans="overlap"> just one there isn't it </u><u who="nm0658" trans="overlap"> there's just one one lecture
theatre i've been told <pause dur="0.4"/> it's got to take seventy of us </u><u who="sf0687" trans="latching"> no it's large <gap reason="inaudible" extent="1 sec"/></u><u who="sm0688" trans="overlap"> <gap reason="inaudible" extent="1 sec"/> </u><u who="nm0658" trans="overlap"> so it's the large lecture theatre </u><u who="ss" trans="overlap"> <gap reason="inaudible, multiple speakers" extent="2 secs"/> </u><u who="nm0658" trans="overlap"> and next week the practical is again for you in the Met department </u><pause dur="0.5"/> <u who="sm0689" trans="pause"> is it just for us or will it be everybody else here as well </u><pause dur="0.3"/> <u who="nm0658" trans="pause"> everybody is in the lecture and the practical is still split <pause dur="0.3"/> into the two groups <pause dur="0.7"/> as you go <pause dur="2.3"/> can you please <pause dur="2.3"/> we want your critical review of <pause dur="0.3"/> these five lectures <pause dur="0.8"/> and so we have another of the evaluations <pause dur="0.2"/> this is on this session <pause dur="0.3"/> any comments <pause dur="0.2"/> remember we've changed the course a lot this year <pause dur="0.3"/> so any comments <pause dur="0.2"/> they don't have to be polite <pause dur="0.4"/> and i'll collect this in the practical <pause dur="0.7"/> so can you take one as you go out <pause dur="0.3"/> and then i'll collect these in the practical