<?xml version="1.0"?>

<!DOCTYPE TEI.2 SYSTEM "base.dtd">





<publicationStmt><distributor>BASE and Oxford Text Archive</distributor>


<availability><p>The British Academic Spoken English (BASE) corpus was developed at the

Universities of Warwick and Reading, under the directorship of Hilary Nesi

(Centre for English Language Teacher Education, Warwick) and Paul Thompson

(Department of Applied Linguistics, Reading), with funding from BALEAP,

EURALEX, the British Academy and the Arts and Humanities Research Board. The

original recordings are held at the Universities of Warwick and Reading, and

at the Oxford Text Archive and may be consulted by bona fide researchers

upon written application to any of the holding bodies.

The BASE corpus is freely available to researchers who agree to the

following conditions:</p>

<p>1. The recordings and transcriptions should not be modified in any way</p>


<p>2. The recordings and transcriptions should be used for research purposes

only; they should not be reproduced in teaching materials</p>

<p>3. The recordings and transcriptions should not be reproduced in full for

a wider audience/readership, although researchers are free to quote short

passages of text (up to 200 running words from any given speech event)</p>

<p>4. The corpus developers should be informed of all presentations or

publications arising from analysis of the corpus</p><p>

Researchers should acknowledge their use of the corpus using the following

form of words:

The recordings and transcriptions used in this study come from the British

Academic Spoken English (BASE) corpus, which was developed at the

Universities of Warwick and Reading under the directorship of Hilary Nesi

(Warwick) and Paul Thompson (Reading). Corpus development was assisted by

funding from the Universities of Warwick and Reading, BALEAP, EURALEX, the

British Academy and the Arts and Humanities Research Board. </p></availability>




<recording dur="00:55:50" n="8375">


<respStmt><name>BASE team</name>



<langUsage><language id="en">English</language>



<person id="nm0211" role="main speaker" n="n" sex="m"><p>nm0211, main speaker, non-student, male</p></person>

<person id="sm0212" role="participant" n="s" sex="m"><p>sm0212, participant, student, male</p></person>

<person id="su0213" role="participant" n="s" sex="u"><p>su0213, participant, student, unknown sex</p></person>

<person id="sm0214" role="participant" n="s" sex="m"><p>sm0214, participant, student, male</p></person>

<person id="sm0215" role="participant" n="s" sex="m"><p>sm0215, participant, student, male</p></person>

<person id="su0216" role="participant" n="s" sex="u"><p>su0216, participant, student, unknown sex</p></person>

<personGrp id="ss" role="audience" size="s"><p>ss, audience, small group </p></personGrp>

<personGrp id="sl" role="all" size="s"><p>sl, all, small group</p></personGrp>

<personGrp role="speakers" size="8"><p>number of speakers: 8</p></personGrp>





<item n="speechevent">Lecture</item>

<item n="acaddept">Agricultural Botany</item>

<item n="acaddiv">ls</item>

<item n="partlevel">UG3</item>

<item n="module">unknown</item>




<u who="nm0211"> right as i <pause dur="0.3"/> said to <gap reason="name" extent="1 word"/> <pause dur="1.3"/> i'm here <pause dur="0.2"/> just for an hour to start this morning <pause dur="0.4"/> to give you some background <pause dur="0.7"/> on experiences that <pause dur="0.4"/> we've had over the last <pause dur="0.3"/> five or six years <pause dur="0.5"/> in the discovery of a particular set of protein families a <pause dur="0.4"/> a new protein <pause dur="0.8"/> superfamily so-called <pause dur="0.6"/> in plants <pause dur="0.5"/> you'll hear and you've heard previously <pause dur="0.6"/> from other people <pause dur="0.4"/> some of the fundamental studies that underlie <pause dur="0.5"/> protein structure and function <pause dur="0.8"/> i thought it would be useful to give you an example <pause dur="0.2"/> a worked example <pause dur="0.9"/> of how <pause dur="0.4"/> a particular practical project with which i was associated <pause dur="0.8"/> # <pause dur="0.5"/> in my former life 'cause i came from a commercial company <pause dur="0.7"/> however that practical project led on <pause dur="0.4"/> over the last few years into a <pause dur="0.4"/> discovery of a completely unexpected set of <pause dur="0.4"/> related proteins <pause dur="0.5"/> and it shows you how <pause dur="0.7"/> the functional activity of a protein <pause dur="0.4"/> can be preserved <pause dur="0.3"/> throughout evolution <pause dur="0.7"/> and something that's <pause dur="0.3"/> can now be uncovered by this combination of <pause dur="0.3"/> bioinformatic <pause dur="0.5"/> and protein structure study <pause dur="0.5"/> relationships <pause dur="0.8"/> some of you might know a

bit of the background but i think you'll find it <pause dur="0.3"/> hopefully an interesting story <pause dur="0.9"/> if you want to stop at any time <pause dur="0.6"/> this is <pause dur="0.9"/> an interactive <vocal desc="laugh" iterated="n"/><pause dur="0.3"/> discussion with a group as small as this so <pause dur="0.4"/> please if you don't understand any of the background just <pause dur="0.8"/> # <pause dur="0.8"/> make yourself heard <pause dur="0.5"/> so this is where it started <pause dur="0.9"/> and this is the <pause dur="0.4"/> the one real plant <pause dur="1.5"/><kinesic desc="turns on overhead projector showing transparency" iterated="n"/> # <pause dur="0.3"/> that i'll show you <pause dur="0.5"/> and this is the <pause dur="1.0"/> the commercial and agricultural <pause dur="0.2"/> basis <pause dur="0.2"/> of this whole academic project <pause dur="0.8"/> you might not recognize the plant but it's an oilseed rape plant <pause dur="1.0"/> and the practical project was represented <pause dur="1.1"/> by this area here <pause dur="0.5"/> which is a <pause dur="0.2"/> completely diseased section <pause dur="0.9"/> of an oilseed rape plant <pause dur="0.8"/> and it's a section of a stem <pause dur="0.6"/> that's been attacked by a particular virulent fungus <pause dur="0.4"/> that's destroyed the stem <pause dur="0.8"/> between these two points and it's gone from <pause dur="0.4"/> green to white <pause dur="0.4"/> and the leaf that's attached to the stem <pause dur="0.4"/> has been destroyed by that process <pause dur="0.9"/> and the fungus itself <pause dur="0.6"/> is called Sclerotinia <pause dur="1.2"/> # there's no kind of real English name for it but <pause dur="1.2"/> that's what it's known as <pause dur="0.2"/> and if

you're a farmer and you have this <pause dur="1.2"/><event desc="student enters room" iterated="n" n="su0216"/> i think we've got <pause dur="0.3"/><vocal desc="laugh" iterated="n"/><pause dur="0.5"/> somebody else <pause dur="0.2"/> okay <pause dur="0.5"/> if you're a farmer and you <pause dur="1.9"/> you have you have this type of fungus in your crop <pause dur="0.7"/> then <pause dur="0.2"/> it's bad news <pause dur="0.6"/> it attacks <pause dur="0.3"/> a whole range of agriculturally important crops <pause dur="0.4"/> particularly oilseed so <pause dur="0.7"/> # <pause dur="0.2"/> oilseed rape sunflower soya beans <pause dur="1.1"/> it's very difficult to control with fungicides <pause dur="0.7"/> the <pause dur="0.2"/> breeding materials that are available <pause dur="0.5"/> are very limited so <pause dur="0.2"/> although plant breeders have tried to <pause dur="0.5"/> introduce resistance <pause dur="0.6"/> they've # <pause dur="0.4"/> really rather failed <pause dur="0.4"/> over the last ten to twenty years <pause dur="1.6"/><kinesic desc="changes transparency" iterated="y" dur="12"/> the biochemistry of the disease is what <pause dur="0.7"/> # <pause dur="0.6"/> is where <pause dur="1.8"/> my interest lies <pause dur="0.9"/> and the biochemistry is represented here <pause dur="2.3"/> this is the same disease fungus growing on a petri dish <pause dur="1.5"/> in the lab <pause dur="0.5"/> and you can see that the colour of this is very different <pause dur="0.4"/> the outside is purple <pause dur="0.4"/> the inside orange <pause dur="0.7"/> and the fungus is growing across the surface <pause dur="0.4"/> and the reason that there's a colour change is that the P-H is changing <pause dur="0.5"/> the acidity of that petri dish is changing <pause dur="0.5"/> as the fungus

grows <pause dur="1.0"/> and it's going from <pause dur="0.8"/> neutral P-H represented by the purple <pause dur="0.6"/> to <pause dur="1.0"/> bright orange which means that the P-H has fallen <pause dur="0.2"/> so acid is being secreted by the fungus <pause dur="0.2"/> and it's the acid <pause dur="0.5"/> that's the major <pause dur="0.4"/> key feature of this fungus <pause dur="0.9"/> and the reason that these sorts of fungi <pause dur="0.5"/> are so <pause dur="0.4"/> successful is that they secrete <pause dur="0.4"/> a lot of acid <pause dur="1.5"/><vocal desc="laugh" iterated="n"/><pause dur="0.5"/> and <pause dur="0.3"/> do we have somebody else <pause dur="1.1"/> # <pause dur="1.6"/> and the reason they secrete acid <pause dur="0.6"/> and the acid that they secrete is the basis of this <pause dur="0.3"/> particular <pause dur="0.7"/> story for the next <pause dur="0.6"/><kinesic desc="changes transparency" iterated="y" dur="8"/> forty-five minutes <pause dur="1.1"/> and this # <pause dur="4.9"/> this is the acid and this is how it works the acid is oxalic acid <pause dur="1.7"/> and do any of you know <pause dur="0.6"/> of any other plants maybe that <pause dur="1.0"/> have oxalic acid in it <pause dur="0.5"/> in them <pause dur="0.5"/> is it an acid that you <pause dur="0.7"/> know anything about in a biological context <pause dur="1.7"/> no <pause dur="0.8"/> okay well the <trunc>m</trunc> <pause dur="0.3"/> the most <pause dur="0.5"/> famous source of oxalic acid are <pause dur="0.5"/> green vegetables and things like rhubarb <pause dur="1.3"/> so <pause dur="0.5"/> this is the method of <pause dur="0.6"/> action <pause dur="1.8"/> of why oxalic acid is such a powerful toxin <pause dur="0.6"/> why the fungi that secrete this acid <pause dur="0.3"/> are so successful <pause dur="2.0"/> that they <pause dur="0.5"/> primarily act by <pause dur="0.6"/>

chelating the calcium <pause dur="0.4"/> so the calcium binds to the acid <pause dur="1.6"/> and it removes the calcium from the cell walls in the plant <pause dur="1.3"/> and cell walls of a plant are <pause dur="0.3"/> maintained <pause dur="0.2"/> in structure <pause dur="0.5"/> and the reason that they're <trunc>me</trunc> <pause dur="0.3"/> structure is maintained is <pause dur="1.0"/> through these calcium containing compounds called pectins and pectic acid <pause dur="0.5"/> it's what you put in jam to make it set <vocal desc="laugh" iterated="n"/><pause dur="0.6"/> it's a thickening agent that comes from plant wall <pause dur="0.3"/> cell walls <pause dur="1.3"/> once you get the calcium out of the plant <pause dur="0.8"/> cell walls <pause dur="0.5"/> particularly out of these specialized cell types <pause dur="0.8"/> then <pause dur="0.2"/> air can come into the plant <pause dur="0.4"/> 'cause of <pause dur="0.5"/> the rigidity of a plant is maintained by the water inside it <pause dur="0.6"/> once you let <pause dur="0.4"/> air into the vessels in the <pause dur="0.2"/> plant <pause dur="0.2"/> then <pause dur="0.6"/> the plant will start to wilt <pause dur="1.0"/> which is what happens <pause dur="0.3"/> next <pause dur="1.2"/> once the acid <pause dur="0.5"/> <trunc>re</trunc> goes into the plant it reduces the P-H <pause dur="0.9"/> all the other enzymes then that are present in the <trunc>f</trunc> <pause dur="0.9"/> that are secreted by the fungi <pause dur="0.3"/> are activated at low P-H <pause dur="0.4"/> and so the plant then starts to rot <pause dur="1.3"/> and once it's started to rot <vocal desc="laughter" iterated="y" dur="1"/><pause dur="0.3"/> that's the end of the

plant <pause dur="2.0"/> okay <pause dur="1.1"/> so that's <pause dur="0.7"/><kinesic desc="changes transparency" iterated="y" dur="15"/> how the oxalic acid works <pause dur="1.3"/> and the question was <pause dur="1.6"/> what could you do about <pause dur="0.2"/> overcoming the <pause dur="0.6"/> the problem <pause dur="0.9"/> so about eight years ago <pause dur="1.4"/> biochemists and the genetics people <pause dur="0.8"/> in <pause dur="0.4"/> what was then Zeneca Plant Sciences at Jealott's Hill <pause dur="1.3"/> were looking for <pause dur="0.3"/> biochemical approaches to reduce the level of acid secreted by the fungus and therefore perhaps to protect the fungus <pause dur="0.5"/> <trunc>pla</trunc> plant from the <trunc>fung</trunc> <pause dur="0.6"/> the fungus <pause dur="1.3"/> and the obvious thing to do was to say <pause dur="0.3"/> could we break down the acid <pause dur="0.5"/> secreted by the fungus <pause dur="1.0"/> and there are two <pause dur="0.4"/> enzymes that are known to <pause dur="0.7"/> to break down <pause dur="0.3"/> oxalic acid <pause dur="0.9"/> and these are <pause dur="0.2"/> oxalate oxidase <pause dur="1.5"/> which converts oxalate to carbon dioxide and hydrogen peroxide <pause dur="2.2"/> and oxalate decarboxylase that converts <pause dur="0.4"/> oxalate to formate <pause dur="0.8"/> and carbon dioxide <pause dur="0.6"/> so those are the two main <pause dur="0.5"/> pathways for breaking down <pause dur="1.0"/> oxalic acid <pause dur="0.8"/> so the immediate question was <pause dur="0.5"/> as a strategic question <pause dur="0.8"/> could we identify isolate <pause dur="1.2"/> and then <pause dur="0.2"/> introduce into a plant <pause dur="1.2"/> one or more of these two enzymes <pause dur="0.6"/> and therefore

allow the plant <pause dur="0.9"/> to protect itself from the fungus by degrading the acid <pause dur="0.8"/> it's a kind of simple <pause dur="0.8"/> idea simple question <pause dur="2.8"/> and the immediate answer was <pause dur="0.3"/> well we had to make a choice and the <pause dur="0.2"/> first choice was <pause dur="0.9"/> first enzyme which is a plant enzyme <pause dur="0.9"/> the second enzyme here is a fungal one <pause dur="0.6"/> so <pause dur="0.4"/> a lot more was known about this enzyme <pause dur="1.6"/> and over a period of a year or two <pause dur="0.6"/> that enzyme was <pause dur="0.9"/> # the gene that encodes that enzyme was isolated <pause dur="1.9"/> was put in back into a plant through genetic <pause dur="0.3"/> manipulation methods <pause dur="0.6"/> and we produced <pause dur="0.5"/> many transgenic <pause dur="0.3"/> genetically modified plants that now <pause dur="0.5"/> had <pause dur="0.6"/> this enzyme <pause dur="0.3"/> and expressed this enzyme <pause dur="0.5"/> its origin originally was from <pause dur="0.3"/> cereals <pause dur="0.2"/> so we were taking a <pause dur="0.5"/> an enzyme that was found in <pause dur="0.4"/> wheat and barley <pause dur="0.6"/> and introducing it into oilseed rape <pause dur="0.5"/> and what you've then got <pause dur="0.3"/> and this is the end of the <kinesic desc="changes transparency" iterated="y" dur="3"/> biology then <pause dur="1.1"/> and go on to the chemistry <pause dur="0.7"/> what you've then got was this these were two leaves of <pause dur="0.2"/> oilseed rape plants <pause dur="0.7"/> the one on the left <pause dur="0.6"/> is the traditional variety <pause dur="0.6"/> infected by <pause dur="0.9"/> # <pause dur="0.2"/> a small sample of the fungus <pause dur="0.6"/> and you can see it's extended <pause dur="0.6"/> and produced this so-called lesion this rotten area <pause dur="0.4"/> in the middle of the leaf <pause dur="1.1"/>

and the acid is spreading out to the edge where it's come becoming crinkled <pause dur="1.5"/> this leaf is a leaf from a transgenic or genetically modified variety <pause dur="0.3"/> where the fungus is <pause dur="0.4"/> no longer growing <pause dur="1.1"/> and this whole idea <pause dur="0.2"/> has been taken <pause dur="0.8"/> and used in over the last five years <pause dur="0.8"/> by companies in North America in particular <pause dur="0.5"/> and this isn't a G-M discussion today but <pause dur="1.4"/> there's a good chance that <pause dur="0.5"/> # <pause dur="0.8"/> a sunflower variety containing this gene <pause dur="0.5"/> will be commercialized over the next few years and will <pause dur="0.3"/> provide <pause dur="0.6"/> very good protection for the first time ever <pause dur="0.3"/> against this very <pause dur="0.4"/> # <pause dur="0.5"/> devastating <pause dur="0.5"/> pathogen of plants <pause dur="0.2"/> so that's the background <pause dur="1.4"/><kinesic desc="changes transparency" iterated="y" dur="12"/> and now we get to the chemistry <pause dur="0.4"/> and the chemistry says <pause dur="1.7"/> okay what is this enzyme <pause dur="0.7"/> # <pause dur="0.4"/> that was where the practical commercial work <pause dur="0.7"/> # was going <pause dur="0.5"/> from an academic perspective <pause dur="0.9"/> what is it about this enzyme that made it worth <pause dur="0.3"/> studying in its own right <pause dur="0.6"/> and these are <trunc>i</trunc> the characteristics that made it so <pause dur="0.5"/> interesting and unusual <pause dur="2.5"/> it was an enzyme <pause dur="0.8"/> in fact that had been <pause dur="0.8"/>

isolated previously <pause dur="0.8"/> and had been given the name germin <pause dur="0.5"/> about <pause dur="0.2"/> twenty years ago <pause dur="0.9"/> and it was isolated from barley embryos <pause dur="0.2"/> and wheat embryos <pause dur="0.7"/> at that time <pause dur="0.7"/> the protein was isolated <pause dur="0.2"/> but it wasn't known to be <pause dur="0.4"/> this enzyme <pause dur="0.2"/> and we were <pause dur="0.4"/> therefore left with a <pause dur="0.4"/> strange situation of <pause dur="0.8"/> a research group in Canada <pause dur="0.3"/> who'd worked on a protein that they'd <pause dur="0.2"/> given the name germin to <pause dur="0.4"/> and they'd found that it had these characteristics <pause dur="0.3"/> but they had no idea about its function <pause dur="1.9"/> they'd worked on it for a long period of time <pause dur="0.4"/> because it was pepsin resistant <pause dur="0.4"/> most enzymes being proteins are broken down by <pause dur="0.4"/> protein degrading enzymes themselves <pause dur="0.5"/> so <pause dur="0.4"/> this particular germin protein <pause dur="0.8"/> was almost completely resistant to <pause dur="0.2"/> <trunc>t</trunc> <pause dur="0.3"/> being broken down <pause dur="0.5"/> by normal <pause dur="0.7"/> # <pause dur="1.0"/> proteases <pause dur="0.6"/> it was also resistant to hydrogen peroxide <pause dur="1.0"/> which again is unusual for a protein <pause dur="0.8"/> usually these two treatments would completely denature a protein <pause dur="0.4"/> and make it unfold <pause dur="1.2"/> it was a glycosylated protein in other words it had sugars attached to it which is <pause dur="0.4"/> quite

usual in plants <pause dur="0.7"/> it was a multimeric one in other words it had lots of subunits <pause dur="1.1"/> and <pause dur="1.2"/> it was considered to be important because there was a lot of it <vocal desc="laugh" iterated="n"/><pause dur="0.5"/> and biologists think <pause dur="0.2"/> well if there's a lot of it <pause dur="0.4"/> it must be important it's a kind of <pause dur="0.4"/> crude analysis of its <pause dur="0.2"/> significance <pause dur="0.9"/> and so <pause dur="0.3"/> the group in Toronto had worked for <pause dur="0.6"/> ten years or more <pause dur="0.3"/> on this germin <pause dur="0.8"/> they had the sequence of it but they'd no way of finding the function <pause dur="0.7"/> we were working on <pause dur="0.9"/> the barley equivalent of this <pause dur="0.9"/> on <pause dur="0.4"/> the idea that it was an enzyme and then we went back to the gene <pause dur="0.4"/> they had the gene and the protein but no function <pause dur="0.6"/> we put the two things together <pause dur="0.4"/> and we were able to tell the group in <pause dur="0.2"/> Canada <pause dur="0.3"/> what they'd been doing for the last ten years was working <pause dur="0.3"/> on an oxalate oxidase <pause dur="1.2"/> and that was quite surprising to them because <pause dur="0.3"/> up to that point <pause dur="0.2"/> nobody had ever isolated an oxalate oxidase from plants <pause dur="0.8"/> # <pause dur="0.4"/><kinesic desc="changes transparency" iterated="y" dur="5"/> <trunc>a</trunc> and isolated the gene from a <pause dur="0.5"/> this enzyme <pause dur="1.6"/> if you look <pause dur="0.7"/> then <pause dur="0.5"/> and using the <pause dur="0.4"/> the # <pause dur="0.4"/> bioinformatics techniques <pause dur="0.4"/>

at the structure of this protein <pause dur="0.7"/> which is what was done <pause dur="0.5"/> very simply <pause dur="0.6"/> it then became <pause dur="0.6"/><kinesic desc="changes transparency" iterated="y" dur="8"/> quite clear that this protein wasn't <pause dur="0.7"/> a unique <pause dur="0.4"/> protein but in fact had <pause dur="0.5"/> lots of <pause dur="0.4"/> quite close relatives <pause dur="1.6"/> and this is <pause dur="0.4"/> the analysis as it was <pause dur="0.2"/><vocal desc="clears throat" iterated="n"/><pause dur="0.4"/> about <pause dur="0.6"/> five years ago <pause dur="0.5"/> and at that time <pause dur="0.5"/> in all the databases in the world <pause dur="1.6"/> wherever you looked you could find a total <pause dur="0.3"/> of ten sequences <pause dur="0.5"/> this is the Prodom <pause dur="0.8"/> # <pause dur="0.2"/> protein <pause dur="0.3"/> the main database <pause dur="0.9"/> that showed <pause dur="0.9"/> this <pause dur="0.5"/> pattern of ten proteins <pause dur="1.7"/> but the eight at the top were from plants <pause dur="0.4"/> the two at the bottom were from slime moulds <pause dur="1.5"/> the <trunc>f</trunc> different colours here represent areas of conservation <pause dur="0.7"/> the different colours at the ends are where they're different so <pause dur="0.4"/> we have this family of ten proteins where there was quite a lot of conservation <pause dur="0.5"/> in the middle of them <pause dur="0.7"/> the the blue area <pause dur="2.3"/> some of them had <trunc>diff</trunc> had the same <pause dur="0.3"/> N-terminus which is <pause dur="0.3"/> labelled here <pause dur="0.8"/> but the <pause dur="0.9"/> they were different at the <pause dur="0.6"/> C-terminus at this end of the protein <pause dur="0.8"/> so the cereal proteins were different from the <pause dur="0.9"/> the brassica <pause dur="0.3"/> and dicot

proteins <pause dur="1.1"/> the slime moulds <pause dur="1.3"/> two on the bottom <pause dur="0.8"/> and the slime mould is one of these <pause dur="0.7"/> # <pause dur="0.7"/> eukaryotes that <pause dur="0.7"/> not as <trunc>m</trunc> <pause dur="0.2"/> much is known about biochemically <pause dur="0.8"/> but it was known that when this slime mould <pause dur="1.0"/> became <pause dur="0.5"/> desiccated became starved <pause dur="0.3"/> it produces very small spores <pause dur="0.4"/> and these spores <pause dur="0.2"/> contained a lot of this protein <pause dur="1.1"/> again there was no particular function assigned to it <pause dur="0.7"/> but it was known to be somehow related to desiccation <pause dur="1.1"/> so there are series of clues here being built up <pause dur="0.5"/> during this story that say <pause dur="1.1"/> we've got a very resistant <pause dur="1.0"/> heat stable pepsin resistant protease resistant <pause dur="0.6"/> hydrogen peroxide resistant <pause dur="0.2"/> protein found in cereals <pause dur="0.8"/> it seems to have some similarity <pause dur="0.5"/> to a desiccation tolerant <pause dur="0.4"/> starvation induced <pause dur="0.9"/> # <pause dur="0.3"/> protein from a very <pause dur="0.5"/> primitive <pause dur="0.5"/> eukaryote a <trunc>pri</trunc> very primitive <pause dur="0.5"/> # mould in this case a slime mould <pause dur="0.9"/> so that <pause dur="0.4"/> was interesting because it said <pause dur="0.6"/> there's a <pause dur="0.6"/> a <pause dur="0.2"/> protein connection between <pause dur="0.3"/> a slime mould <pause dur="0.5"/> a barley plant <pause dur="0.6"/> <trunc>a</trunc> and a wheat embryo <pause dur="0.9"/> so <pause dur="0.4"/> that's a <pause dur="1.3"/> an interesting position <pause dur="0.7"/> and about

that time <pause dur="0.2"/> a group in Germany <pause dur="0.9"/> as well as ourselves <pause dur="0.6"/> started to think more about the evolutionary significance of this <pause dur="0.6"/> and they also <pause dur="0.4"/> found by repeated database searches <pause dur="1.4"/> that this family of proteins which <pause dur="0.4"/> started to become enlarged as time went on from ten to twenty <pause dur="0.7"/> and # i'll tell you what the latest version is at the end of this story <pause dur="2.4"/> they were given the name <pause dur="0.3"/> originally germin from the <pause dur="0.2"/> germination <pause dur="0.6"/> # <pause dur="1.1"/> association of the <pause dur="0.3"/> of the wheat gene <pause dur="0.7"/> these # related proteins were then known as germin-like proteins <pause dur="1.3"/> more and more detailed <pause dur="0.3"/> sequence similarity searches <pause dur="0.4"/> showed that there were particular <pause dur="0.6"/> amino acids <pause dur="0.4"/> within these germins and germin-like proteins <pause dur="0.4"/> that were also found in yet another <pause dur="0.9"/> much more distantly related group of proteins that <pause dur="0.7"/> # <pause dur="0.3"/> are particularly interesting to <pause dur="0.6"/> to plant scientists <pause dur="0.4"/> and those are the proteins found in <trunc>s</trunc> <pause dur="0.2"/> in seeds in storage proteins <pause dur="0.6"/> proteins that are part of <pause dur="0.5"/> all our diets <pause dur="0.3"/> because we <pause dur="0.6"/> indirectly or directly <pause dur="0.4"/> we rely upon eating seeds as a <pause dur="0.5"/> form of

# <trunc>pro</trunc> # vegetable protein at least <pause dur="1.4"/> and if you look at vegetable proteins in seeds <pause dur="0.7"/> particularly the so-called legumin vicilin proteins <pause dur="0.6"/> then you'll see <pause dur="0.4"/> you can identify particular amino acids that are also found <pause dur="0.4"/> in this group of proteins that i've just described <pause dur="1.3"/> and this was a simple attempt to look at <pause dur="0.6"/> the evolutionary significance of these families <pause dur="0.5"/> during time <pause dur="0.3"/> and this was put together <pause dur="0.8"/> by a group in Germany <pause dur="1.2"/> again <pause dur="0.7"/> around this period of <pause dur="0.2"/> five to six years ago <pause dur="0.5"/> and they started <pause dur="0.3"/> with a hypothesis that said <pause dur="1.9"/><kinesic desc="changes transparency" iterated="y" dur="5"/> if you believe in evolution then at the beginning of time there <trunc>sh</trunc> <pause dur="0.5"/> should be some so-called ancestral protein <pause dur="0.6"/> from which all these other proteins <pause dur="0.8"/> # were produced <pause dur="0.3"/> during evolution <pause dur="1.5"/> we're <trunc>i</trunc> in a position that said <pause dur="0.8"/> some of them evolved into these <pause dur="0.3"/> slime mould proteins that are known as spherulins <pause dur="1.5"/> some of them <pause dur="0.2"/> involved <pause dur="0.6"/> evolved rather <pause dur="0.4"/> into <pause dur="1.0"/> what we now know as the germin <pause dur="0.4"/> types of protein <pause dur="1.1"/> some of them <pause dur="0.6"/> evolved into <pause dur="1.5"/> they're described here as C-globulins <pause dur="1.4"/> but another <pause dur="0.2"/>

more important group <pause dur="0.3"/> and these are the ones that <pause dur="0.6"/> found in seeds now <pause dur="0.6"/> went through a duplication event at some stage <pause dur="1.1"/> to form <pause dur="0.6"/> these legumin vicilin <pause dur="0.8"/> # <pause dur="0.3"/> precursor proteins <pause dur="0.3"/> because what i haven't said is that <pause dur="0.6"/> the storage proteins are twice as big in length <pause dur="0.7"/> as the germins <pause dur="0.5"/> so at some time <pause dur="0.5"/> they seem to have doubled in size <pause dur="1.2"/> so they have <pause dur="0.3"/> two halves and within each half <pause dur="0.4"/> you can see <pause dur="0.7"/> # the the letters here represent the conserved <pause dur="0.3"/> amino acid residues <pause dur="0.4"/> during this whole period of evolution <pause dur="0.3"/> from <pause dur="0.4"/> the primitive <pause dur="0.4"/> slime mould up to the <pause dur="0.9"/> the higher plant <pause dur="0.6"/> just a few amino acids particularly glycines prolines glycines <pause dur="1.0"/> we've seen phenylalanines <pause dur="1.5"/> particularly those amino acids were conserved in the same place <pause dur="0.3"/> all the way through this process <pause dur="0.9"/> so although you look at these proteins and you don't see very much <pause dur="0.9"/> <gap reason="inaudible" extent="1 word"/> perhaps superficial <pause dur="0.4"/> significance in similarity terms <pause dur="0.4"/> they do have these conserved residues strictly conserved at particular points <pause dur="2.4"/>

so that was an outline a sort of sketch of what <pause dur="0.4"/> the evolutionary pattern might have been <pause dur="2.0"/> and why was it possible to do that <pause dur="0.3"/> was <pause dur="0.8"/> because this three-dimensional structure <pause dur="0.4"/> of the storage proteins was <pause dur="0.5"/> already known <pause dur="3.1"/><kinesic desc="changes transparency" iterated="y" dur="5"/> and i'll say why that's important in a second <pause dur="1.2"/> the diagrammatic <pause dur="0.3"/> version of that is <pause dur="0.8"/> summarized on here <pause dur="0.5"/> so again we start with <pause dur="0.3"/> something where there's an ancestor <pause dur="0.4"/> you go through duplications you then evolve into <pause dur="0.6"/> a whole series of different families <pause dur="1.2"/> and this covers a whole <pause dur="0.6"/> a large period of time and it covers a lot of different <pause dur="0.3"/> potential functions <pause dur="0.8"/> so <pause dur="0.4"/> that's the <pause dur="0.6"/> the outline kind of cartoon of history <pause dur="0.8"/><kinesic desc="changes transparency" iterated="y" dur="25"/> that <pause dur="0.5"/> # <pause dur="0.4"/> i want to build from <pause dur="1.9"/> okay <pause dur="0.9"/> i said that the three-dimensional structure of <pause dur="0.8"/> of the storage proteins was known <pause dur="3.4"/> and this is what you get if you look at the alignment of different storage proteins <pause dur="1.9"/> and i've marked on here just a <pause dur="0.5"/> a couple of the <pause dur="0.4"/> globally conserved amino acids <pause dur="1.4"/> these are two <pause dur="0.4"/> storage proteins <pause dur="0.3"/> the first from <pause dur="0.6"/>

soya bean <pause dur="0.2"/> the one on the top is this <pause dur="0.6"/> protein called glycinin <pause dur="0.9"/> and the second protein is a <pause dur="0.5"/> storage protein from beans <pause dur="1.1"/> and the <trunc>w</trunc> <pause dur="0.3"/> the <trunc>pro</trunc> the amino acids i've marked in red and blue <pause dur="0.5"/> are two of these globally conserved ones <pause dur="0.3"/> so wherever you <pause dur="0.2"/> find a storage protein from whatever species <pause dur="0.4"/> you'll always find <pause dur="0.5"/> a proline at this position <pause dur="0.3"/> and a glycine here <pause dur="0.4"/> and lots of variations in between but there'll always be a <pause dur="1.0"/> conserved residues another glycine here <pause dur="0.6"/> so these boxes represent these few key residues <pause dur="1.3"/> but the question is <trunc>w</trunc> <pause dur="0.3"/> what function do those few key residues have <pause dur="0.2"/> in determining the structure of a protein because <pause dur="0.7"/> linear sequences of proteins are usually <pause dur="0.8"/> they're useful but they're <pause dur="0.4"/> rather meaningless when you're trying to relate <pause dur="0.5"/> protein sequence to function <pause dur="0.5"/> function depends upon the shape of the protein <pause dur="0.5"/> and the shape of the protein is determined by how it folds up <pause dur="1.1"/> how <pause dur="0.4"/> this linear sequence becomes a three-dimensional structure <pause dur="1.4"/><kinesic desc="changes transparency" iterated="y" dur="9"/> and if you look at the three-dimensional structure <pause dur="1.1"/> of a

storage protein <pause dur="1.4"/> and this is a simple version of it <pause dur="3.7"/> this is <trunc>f</trunc> <pause dur="0.6"/> a sort of diagram of it being folded <pause dur="0.9"/> # <pause dur="0.3"/> it's obviously two-D to <trunc>f</trunc> <pause dur="0.6"/> to flatten out here but <pause dur="1.3"/> it's meant it's composed of a series of strands <pause dur="0.5"/> you <pause dur="0.2"/> probably know about <pause dur="1.2"/> # the <pause dur="0.3"/> components of proteins that <pause dur="0.7"/> are made up out of either <pause dur="0.6"/> <trunc>bet</trunc> so-called beta strands <pause dur="0.3"/> or <pause dur="1.1"/> or helices so there's curved <pause dur="0.2"/> helix <pause dur="0.6"/> # components within a protein <pause dur="0.4"/> or there are <pause dur="0.5"/> there are strands which are just short stretches <pause dur="1.1"/> and these two <pause dur="0.8"/> marked residues <pause dur="0.9"/> that were on the previous <pause dur="0.4"/> # overhead <pause dur="0.4"/> the blue and red ones <pause dur="0.7"/> are in key areas <pause dur="0.3"/> that determine the three-dimensional structure of the protein <pause dur="0.5"/> so the glycine <pause dur="1.3"/> is here <pause dur="0.4"/> where the <pause dur="0.2"/> protein bends <pause dur="0.9"/> and the proline is at a key area here <pause dur="0.8"/> that forms an interaction between this part of the protein and this part <pause dur="1.4"/> so the moral <pause dur="0.2"/> to remember all the time is that <pause dur="0.6"/> just <pause dur="0.5"/> occasional amino acids at key points in a protein <pause dur="0.8"/> can determine the three-dimensional structure <pause dur="0.4"/> it doesn't matter too much <pause dur="1.3"/> in many cases what's between <pause dur="0.7"/> this corner and

this corner <pause dur="0.4"/> as long as in three-dimensional space you can join up one corner another corner <pause dur="0.4"/> over a certain distance <pause dur="1.3"/> so when you're looking at proteins and the evolution of proteins you <trunc>th</trunc> <pause dur="0.5"/> have to think all the time about <pause dur="0.8"/> okay <pause dur="0.2"/> linear sequences are fine <pause dur="0.5"/> but it's three-dimensional shapes which are the real key to biology <pause dur="1.1"/> # <pause dur="0.5"/> and the it's the chemistry and the structure behind that that i'll <pause dur="0.9"/> go on to <pause dur="0.2"/> to mention <pause dur="0.8"/> so <pause dur="0.3"/> that's a just a <pause dur="0.4"/> a representation <pause dur="0.3"/> of these <pause dur="0.2"/> two amino acids and the <pause dur="0.6"/> the <pause dur="0.2"/> key function that they play <pause dur="0.9"/> in determining <pause dur="0.6"/><kinesic desc="changes transparency" iterated="y" dur="13"/> structure <pause dur="1.7"/> so the question <trunc>o</trunc> <pause dur="0.7"/> or the next question was <pause dur="1.2"/> if we looked at a similar alignment <pause dur="0.2"/> of <pause dur="1.4"/> the germins and the germin-like proteins <pause dur="2.2"/> what could we learn by first of all at a <pause dur="0.4"/> <trunc>in</trunc> initial alignment of the different <pause dur="0.4"/> germin <pause dur="0.3"/> family <pause dur="0.4"/> and also <pause dur="0.3"/> could we relate the germin structural <pause dur="0.4"/> sorry <pause dur="0.3"/> could we relate the <pause dur="0.3"/> storage protein structural information <pause dur="0.6"/> to possibly predicting <pause dur="0.4"/> what shape of <pause dur="0.3"/> protein the germin-like proteins might have <pause dur="1.7"/> so the next stage was to do a <pause dur="0.6"/> simple

alignment of the <pause dur="0.9"/> germin <pause dur="1.1"/> proteins <pause dur="0.3"/> # <pause dur="0.6"/> you don't need to read it just kind of look at the colours if you <pause dur="0.7"/> can there <pause dur="1.6"/> and the colours as usual are colour coded according to the type of amino acid <pause dur="0.9"/> so the <pause dur="0.5"/> the yellows are the <pause dur="0.7"/> # <pause dur="0.6"/> the sulphur containing amino acids <pause dur="0.5"/> prolines are the green <pause dur="1.9"/> and <pause dur="0.2"/> the dark blues are the <pause dur="0.5"/> basic amino acids <pause dur="0.5"/> and you'll see if you go along this sequence there are areas where <pause dur="1.1"/> there are quite good stretches of similarity between all these families <pause dur="2.3"/> there are certain regions there where there's more <pause dur="0.2"/> conservation than others <pause dur="0.8"/> and one area where there are series of successive conserved amino acids is <pause dur="0.5"/> is here <pause dur="0.8"/> and particularly <pause dur="0.5"/> of interest to us because <pause dur="1.2"/> again <pause dur="0.7"/> if you're looking at amino acid you have to think <pause dur="0.7"/> okay some amino acids have an importance in structure <pause dur="0.6"/> some amino acids have an importance <pause dur="0.5"/> in <pause dur="0.4"/> say enzyme activity <pause dur="0.4"/> in things like <pause dur="0.4"/> binding of metals inside a protein <pause dur="1.5"/> and all our particular attention was <pause dur="0.4"/> # <pause dur="0.4"/> was drawn to these <pause dur="0.7"/> particularly to these two <pause dur="0.3"/> areas of <pause dur="1.0"/> here

where there are <pause dur="0.4"/> dark blue lines that you see <pause dur="0.9"/> are conserved <pause dur="0.5"/> histidine residues <pause dur="1.3"/> so there were two there <pause dur="1.8"/><kinesic desc="changes transparency" iterated="y" dur="9"/> and there were also <pause dur="0.3"/> if you went on through the protein <pause dur="9.1"/> and this is towards the <pause dur="0.3"/> the end towards the C-terminus of the protein <pause dur="0.2"/> you could see that <pause dur="0.2"/> there was another <pause dur="1.7"/> histidine residue here <pause dur="1.9"/> and the reason why <pause dur="0.4"/> we should always pay attention to <pause dur="0.3"/> things like histidines i say they're <pause dur="0.4"/> well known to being <pause dur="0.3"/> involved in <pause dur="0.3"/> the active site of enzymes <pause dur="1.3"/> so <pause dur="0.3"/> enzymes have a structure <pause dur="0.8"/> but the structure is really only there <pause dur="0.5"/> to form a scaffold <pause dur="1.0"/> in effect <pause dur="0.5"/> and the activity the chemistry in the <pause dur="0.2"/> protein in the enzyme <pause dur="0.4"/> takes place usually in the middle of it <pause dur="0.7"/> where the active site of the enzyme is <pause dur="0.5"/> so you have this <pause dur="0.4"/> rigid structure <pause dur="0.8"/> which provides a shape <pause dur="0.8"/> but the chemistry that goes on in an enzyme <pause dur="0.8"/> takes place in the active site in the usually in the centre <pause dur="0.8"/> where the chemical reactions <pause dur="0.5"/> # <pause dur="0.3"/> occur <pause dur="0.9"/> and many chemical reactions particularly <pause dur="0.4"/> if you remember this is an oxidase enzyme <pause dur="0.3"/> an oxalate oxidase <pause dur="1.0"/> many

oxidases <pause dur="0.6"/> require metals for their activity <pause dur="0.5"/> so there's a <pause dur="0.4"/> they have a metal cofactor <pause dur="1.6"/> so how do metals stick on to <pause dur="1.1"/> on to the insides of proteins <pause dur="0.4"/> they have to be held in position <pause dur="1.7"/> and they have to be held in position through <pause dur="0.6"/> particular amino acids that have a <trunc>sor</trunc> <pause dur="0.6"/> a very rigid geometry <pause dur="1.7"/> so the distance between <pause dur="0.2"/> specific amino acids <pause dur="0.6"/> will hold particular metals <pause dur="0.4"/> and each metal <pause dur="1.2"/> requires <pause dur="0.3"/> different sorts of <pause dur="0.4"/> binding amino acids <pause dur="2.2"/> can anybody tell me what kinds of metals there might be inside oxidase enzymes <pause dur="2.7"/> any guesses </u><pause dur="3.5"/> <u who="sm0212" trans="pause"> iron </u><pause dur="0.8"/> <u who="nm0211" trans="pause"> yep </u><pause dur="0.6"/> <u who="su0213" trans="pause"> <gap reason="inaudible" extent="1 word"/> </u><pause dur="0.3"/> <u who="nm0211" trans="pause"> yep <pause dur="0.4"/> # two <pause dur="0.4"/> good ones </u><u who="sm0214" trans="overlap"> zinc </u><u who="sm0215" trans="overlap"> magnesium </u><pause dur="0.4"/> <u who="nm0211" trans="pause"> sorry </u><pause dur="0.3"/> <u who="sm0214" trans="pause"> zinc </u><pause dur="0.3"/> <u who="nm0211" trans="pause"> yep <pause dur="0.7"/> that's three of the major ones <pause dur="0.2"/> pretty good </u><pause dur="1.4"/> <u who="sm0215" trans="pause"> magnesium </u><pause dur="1.3"/> <u who="nm0211" trans="pause"> # <pause dur="0.5"/> not quite magnesium <pause dur="0.2"/> something a bit like <vocal desc="laughter" iterated="y" dur="1"/><pause dur="0.5"/> you start with the <trunc>s</trunc> first three letters <pause dur="0.3"/> of <pause dur="0.9"/> manganese is <pause dur="0.9"/> magnesium is <pause dur="0.6"/> not usually found in <pause dur="0.6"/>
 in oxidases <pause dur="0.8"/> certainly iron <pause dur="0.7"/> zinc <pause dur="1.0"/> particularly iron and copper <pause dur="0.2"/> i suppose are the two most common <pause dur="0.8"/> zinc <pause dur="0.3"/> not quite so much <pause dur="1.8"/> manganese is often found as an alternative to # <pause dur="0.6"/> to iron <pause dur="0.8"/> so <pause dur="0.4"/> just think about those <pause dur="0.5"/> those

four components <pause dur="0.6"/> things like # <pause dur="0.9"/> amine oxidases are copper oxidases <pause dur="0.8"/> in each case though whether it's <pause dur="0.8"/> iron <pause dur="0.2"/> manganese or copper <pause dur="0.6"/> they're always held in position <pause dur="0.5"/> by <pause dur="0.4"/> histidine <pause dur="0.2"/> amino acids so you get this <pause dur="0.6"/> thing called a histidine cluster <pause dur="1.1"/> and you should always look very carefully if you start to see <pause dur="0.7"/> conservation of histidines <pause dur="0.9"/> in a protein alignment <pause dur="1.1"/> but that's <pause dur="0.2"/> circumstantial evidence for there being a metal binding site <pause dur="0.5"/> and that was our prediction <pause dur="0.4"/> on the basis of this <pause dur="1.2"/>

# <pause dur="0.7"/> initial survey <pause dur="1.3"/> so <pause dur="0.3"/> were we dealing with a <pause dur="0.3"/> copper containing <pause dur="0.4"/> iron containing or manganese containing <pause dur="1.0"/> enzyme <pause dur="2.4"/> right <pause dur="2.2"/> so how could we answer that without doing any <pause dur="0.7"/> real <pause dur="0.6"/> # difficult <pause dur="0.7"/> biochemistry <pause dur="0.8"/> and the power now of <pause dur="0.5"/> of computing facilities in <pause dur="0.5"/> structural biology <pause dur="0.6"/> is so great that you can do a lot of work <pause dur="0.7"/> without actually going into the lab any longer <pause dur="0.5"/> # <pause dur="0.8"/> many labs are being depopulated because <vocal desc="laugh" iterated="n"/><pause dur="0.2"/> computers are taking over <pause dur="0.6"/> and it's a lot easier to <pause dur="0.5"/> run the <pause dur="0.2"/> computer and use the modelling programmes that exist <pause dur="0.6"/> to answer some of these questions <pause dur="0.4"/> rather than saying <pause dur="0.6"/> i want to know <pause dur="0.5"/> whether these three histidines possibly might form a binding site <pause dur="0.3"/> i've got to purify the protein i've got to add metals i've got to do very complex # <pause dur="0.3"/> analytically <pause dur="0.6"/> work <pause dur="0.4"/> and then i might be wrong <pause dur="0.2"/> or i can't get the pure <pause dur="0.3"/> protein pure enough <pause dur="0.5"/> but what you can do in a few days is to <pause dur="0.2"/> say <pause dur="0.8"/> well let's <pause dur="0.7"/> try and produce a model of the protein <pause dur="1.3"/> i mean <pause dur="0.5"/> the best way to produce a model of a protein is if you have an existing structure to

work from <pause dur="1.4"/> and the <pause dur="0.4"/> great benefit for us is that we had the structure of the storage protein <pause dur="0.8"/> we had the idea that <pause dur="0.4"/> the storage protein structure was <pause dur="0.3"/> probably related to the germin structure <pause dur="0.7"/> so we take <pause dur="0.6"/> the crystal and the three-dimensional coordinates of the <pause dur="0.6"/> <trunc>pr</trunc> of the storage protein <pause dur="0.4"/> and we would try and fit the germin <pause dur="1.1"/> sequence on to that <pause dur="0.3"/> backbone <pause dur="0.3"/> and see what we got <pause dur="1.1"/> and this is what we did <pause dur="0.7"/><kinesic desc="changes transparency" iterated="y" dur="4"/> # <pause dur="0.9"/> couple of years ago <pause dur="0.4"/> and this is a summary of the <pause dur="0.4"/> conclusion <pause dur="1.6"/> we used as a model <pause dur="1.3"/> an average between <pause dur="1.1"/> two existing structures and these were the two storage proteins <pause dur="1.0"/> and this case canavalin and phaseolin <pause dur="0.4"/> so we took from the databases <pause dur="0.8"/> three-dimensional structure of the and coordinates of those <pause dur="0.7"/> those proteins <pause dur="0.7"/> if you remember i said that they had two halves to them because they were twice as big as germin <pause dur="0.7"/> so we treated them as independent halves <pause dur="0.7"/> an N-terminal half <pause dur="0.4"/> an N-terminal domain and a C-terminal domain <pause dur="1.1"/> and the red and orange <pause dur="0.3"/> one represents one protein and the other so <pause dur="0.6"/> that's just

to show if you look sideways at these proteins <pause dur="0.8"/> that they have this <pause dur="0.8"/> beta barrel shape <pause dur="1.5"/> and # <pause dur="4.5"/> and we used the average of those <pause dur="0.4"/> shapes <pause dur="0.9"/> to fit <pause dur="0.5"/> the germin sequence on to <pause dur="0.5"/> and you can this is automatic you <pause dur="0.6"/> do the alignment and the computer will come out and it will try and fold the protein according to the <pause dur="0.6"/> coordinates that are <pause dur="0.3"/> it knows exist here <pause dur="0.9"/> and if you do that <pause dur="0.2"/> you get <pause dur="0.5"/> depending on the <pause dur="0.5"/> quality of the alignment you get a prediction then <pause dur="0.5"/> of the three-dimensional structure of your <pause dur="0.5"/> favourite protein <pause dur="0.7"/> so you don't have to crystallize it <pause dur="0.5"/> you can <pause dur="0.2"/> use the model <pause dur="1.5"/> and the model immediately told us something <pause dur="0.4"/> very interesting <pause dur="0.4"/> this is the crude <pause dur="0.4"/> model here <pause dur="0.6"/> that shows it is very similar to these two <pause dur="1.1"/> but it also told us that if we look in the <pause dur="0.4"/> inside of the <pause dur="0.2"/> barrel of this model <pause dur="1.1"/> and we ask <pause dur="0.3"/> where are those three histidines <pause dur="0.5"/> that we <trunc>th</trunc> <pause dur="0.4"/> saw in the alignment <pause dur="0.3"/> they were quite a <trunc>f</trunc> <pause dur="0.5"/> they were two close together <pause dur="0.3"/> and there was one that was quite a long way apart <pause dur="1.0"/> but once you fold up the protein <pause dur="0.9"/> you

find that those three <pause dur="0.5"/> and they're represented here by green <pause dur="0.6"/> those three histidines <pause dur="2.0"/> fold together so they're very <pause dur="0.6"/> close to each other <pause dur="0.7"/> the they're adjacent amino acids then <pause dur="1.0"/> so in folding the protein we've brought the third histidine <pause dur="0.4"/> close to the first two <pause dur="0.5"/> which confirms now that you have <pause dur="0.6"/> three histidines together <pause dur="0.6"/> and it's further strong evidence <pause dur="0.6"/> that that <pause dur="0.5"/> is a histidine cluster <pause dur="0.2"/> so-called <pause dur="0.6"/> and that that could act as the site for binding the metal <pause dur="0.5"/> inside the protein <pause dur="1.8"/> and all of that can be done now with <pause dur="0.6"/> these very powerful modelling programmes that <pause dur="0.5"/> i'm sure <pause dur="0.3"/> if you haven't heard about them <gap reason="name" extent="2 words"/> will <pause dur="0.5"/> be showing you later <pause dur="3.0"/> so that was the <pause dur="1.0"/> the computer based research <pause dur="2.6"/> and # <pause dur="0.3"/> just spend a couple of minutes now with a <pause dur="0.5"/> sort of interlude before i go on to <pause dur="0.8"/> is that <pause dur="0.2"/> predicted research really been <pause dur="0.3"/> proven by <pause dur="0.5"/> what's happened recently <pause dur="1.7"/> and i'll go back a bit <pause dur="0.3"/> to the question <pause dur="1.6"/> that i was interested in personally <pause dur="0.4"/> which was if we could <pause dur="1.1"/> imagine that there is an evolutionary sequence

of these proteins that started <pause dur="1.1"/> somewhere with a <pause dur="0.5"/> ancestral a hypothetical ancestral protein <pause dur="0.6"/> that then evolved through a slime mould protein <pause dur="0.9"/> to things in lower plants and then eventually to seed proteins <pause dur="0.8"/> logic would say <pause dur="0.6"/> well all of those must have had some very ancient ancestry somewhere <pause dur="0.7"/> can can we identify <pause dur="0.7"/> what the oldest surviving member of this protein family is <pause dur="2.0"/> and <pause dur="0.6"/> most people would say well <pause dur="0.6"/> would you go back to the early plants or you go back to the early fungi or <trunc>f</trunc> <pause dur="0.5"/> from which <pause dur="0.4"/> plants are <pause dur="1.0"/> very distantly related <pause dur="1.1"/> but i wanted to push the boundary in time back a bit further <pause dur="0.4"/> and so <pause dur="0.3"/> i started to search for bacterial and <pause dur="0.5"/> primitive <pause dur="0.4"/> archaeal which is a <pause dur="0.5"/> # <pause dur="0.6"/> # a related form of primitive bacteria <pause dur="0.5"/> could you find these sorts of proteins <pause dur="0.9"/> # even further back in evolutionary time <pause dur="0.6"/> because after all the <pause dur="0.5"/> proteins that are found in plants and animals now <pause dur="0.6"/> didn't kind of arrive from outer space <vocal desc="laugh" iterated="n"/> <pause dur="0.4"/> unless you <pause dur="0.3"/> are # <pause dur="0.4"/> are a particularly strange religion they came from <pause dur="0.5"/> some <trunc>a</trunc> existing protein

structures <pause dur="0.8"/> so plants and animals didn't evolve a whole set of new protein structures <pause dur="1.3"/> they took <pause dur="0.4"/> existing ones from more primitive life forms and they <pause dur="0.5"/> amalgamated them they cut and they pasted them and they <pause dur="0.5"/> used them for different things but they didn't <pause dur="0.5"/> in many cases they didn't really invent <pause dur="1.1"/> new <pause dur="0.4"/> three-dimensional structures <pause dur="1.2"/> and if nothing else that's a sort of take home message that <pause dur="1.0"/> every protein in you or i or a <pause dur="0.4"/> vegetable <vocal desc="laugh" iterated="n"/><pause dur="0.6"/> really <pause dur="0.3"/> is made up out of <pause dur="0.4"/> quite a small number of <pause dur="0.2"/> three-dimensional structures <pause dur="0.5"/> there's probably <pause dur="0.6"/> you know you have about a hundred-thousand genes <pause dur="1.7"/> # plants have maybe forty or fifty-thousand genes <pause dur="0.7"/> but # <pause dur="0.2"/> each of which encodes <pause dur="0.5"/> a different protein <pause dur="1.2"/> but there aren't a hundred-thousand different proteins <pause dur="0.6"/> in terms of their structures in a <pause dur="0.3"/> in you and there aren't forty-thousand different proteins in terms of structure in a <pause dur="0.4"/> in a plant <pause dur="0.6"/> there are probably five-hundred to a thousand <pause dur="0.8"/> and all the other <pause dur="0.3"/> variation is just minor <pause dur="0.7"/> sort of tinkering with the

edges or <pause dur="0.4"/> duplications or <pause dur="0.7"/> taking a bit out of a <trunc>s</trunc> existing structure <pause dur="1.4"/> there's a <pause dur="0.2"/> really a <trunc>v</trunc> quite a small number of those underlying structures <pause dur="0.5"/> and now <pause dur="0.6"/> as more <pause dur="1.5"/> sort of organisms are being sequenced it becomes clear that <pause dur="0.8"/> in a <trunc>f</trunc> probably five years we'll know what all those structures are <pause dur="0.5"/> you'd be able to go and say <pause dur="0.7"/> there's <pause dur="0.3"/> you know there's five-hundred structures and they make make life <pause dur="0.2"/> whether it's bacterial or a human <pause dur="1.1"/> and everything else is just <pause dur="0.7"/> rearrangements of those existing <pause dur="0.5"/> it might be less than five-hundred eventually <pause dur="1.4"/> and so <pause dur="0.4"/> the conclusion must be <pause dur="1.0"/> that you will find in bacteria <pause dur="0.8"/> the underlying three-dimensional components of all other proteins that have <pause dur="0.2"/> been produced <pause dur="0.4"/> during evolution <pause dur="1.4"/> and that's in fact what we did do <pause dur="0.5"/> we went back and we <pause dur="0.4"/> looked in all the databases now <pause dur="0.7"/> there are fifty or sixty <pause dur="0.2"/> bacterial species where the complete gene sequence is known <pause dur="0.8"/> therefore you know <pause dur="0.3"/> all the genes therefore if you predict <pause dur="0.3"/> every protein sequence <pause dur="0.5"/> so you know that <pause dur="0.6"/> in E-coli or <pause dur="0.7"/> or B-subtilis

the two best known <pause dur="0.4"/> bacteria <pause dur="0.5"/> you <pause dur="0.2"/> can now predict <pause dur="0.4"/> the at least the primary sequence of every protein <pause dur="1.1"/> and people are now trying to model <pause dur="0.4"/> and predict the three-dimensional structure of every protein in an organism <pause dur="1.2"/> and in the future <pause dur="0.6"/> the idea will be that <pause dur="0.4"/> you will take a different cell from an organism and be able to say <pause dur="1.0"/> a skin cell of a human <pause dur="0.3"/> has this set of proteins and we know the structure of all of them <pause dur="0.6"/> so that <pause dur="0.4"/> that's <pause dur="0.4"/> not far-fetched people are doing that now <pause dur="1.3"/> so what we did was go back and say can we find these <pause dur="0.7"/> ancestors of these <pause dur="0.2"/> storage proteins of these germins in bacteria <pause dur="0.9"/> and <pause dur="1.4"/> as you might expect the similarities become more and more <pause dur="0.3"/> limited as the further <trunc>y</trunc> <pause dur="0.3"/> back in time you go <pause dur="0.4"/> so you have to look for <pause dur="0.2"/> key <pause dur="0.6"/> conserved amino acids <pause dur="0.6"/> and we knew <pause dur="0.3"/> from this analysis that <pause dur="0.4"/> clearly the conserved <pause dur="0.6"/> # <pause dur="0.8"/> histidines in the centre of the protein <pause dur="0.4"/> were <pause dur="0.2"/> some of the most functionally <pause dur="0.5"/> interesting of the amino acids <pause dur="0.3"/> because they're the ones that determine <pause dur="0.5"/> potentially the binding site to the

metals <pause dur="0.4"/> and potentially the <pause dur="0.6"/> the enzyme activity of the protein <pause dur="0.4"/><kinesic desc="changes transparency" iterated="y" dur="7"/> and this is a <pause dur="1.3"/> initially just a brief <pause dur="1.0"/> # <pause dur="0.3"/> outline of that <pause dur="1.4"/> there's lots of letters on here which are just sequences but <pause dur="2.8"/> we <trunc>c</trunc> we categorized this and attempted to categorize it to make the analysis easier <pause dur="0.7"/> and we divided <pause dur="0.5"/> the conserved areas up into two groups <pause dur="0.6"/> we said that there was a conserved <pause dur="1.5"/> # <pause dur="0.6"/> motif here <pause dur="0.5"/> with these two conserved histidines which are the grey boxes <pause dur="0.5"/> and there was a conserved motif here <pause dur="0.5"/> where the histidine was conserved all the way down there <pause dur="1.8"/> all the proteins at the top of this list are from bacteria <pause dur="1.5"/> and <pause dur="1.6"/> what we also have in here is a <pause dur="0.3"/> thing that i haven't mentioned <pause dur="0.4"/> which is a space between this motif and this one <pause dur="3.1"/> so in other words the two motifs <pause dur="1.2"/> were at different distances apart <pause dur="0.8"/> in the plant proteins <pause dur="0.2"/> the germins <pause dur="0.4"/> there are about <pause dur="0.4"/> twenty or twenty-five amino acids <pause dur="0.7"/> from the end of this motif to the beginning of this one <pause dur="1.0"/> in the storage proteins that can vary as well <pause dur="0.2"/> which are the <unclear>book</unclear>

bit on the bottom <pause dur="0.9"/> but in these primitive <pause dur="0.2"/> bacteria <pause dur="0.6"/> that distance was <pause dur="0.6"/> was less <pause dur="0.4"/> so again we have another <pause dur="1.3"/> kind of quantitative way of looking at <pause dur="0.3"/> a protein evolution that we have these two conserved motifs that both had histidines in <pause dur="0.2"/> that we knew when they folded up came together <pause dur="1.9"/>

that in <pause dur="0.2"/> plants were a certain distance apart in the linear sequence <pause dur="0.4"/> but if you look at the bacteria they were closer together <pause dur="1.4"/> so during evolution certain things had happened <pause dur="1.9"/> the two motifs <pause dur="0.5"/> had in effect moved apart in sequence <pause dur="1.8"/> the protein size itself had also changed because <pause dur="0.3"/> the bacterial proteins were only about a hundred amino acids in length in total <pause dur="1.3"/> whereas the plant ones were twice that length and <pause dur="0.3"/> and the protein <pause dur="0.5"/> some storage proteins were double that length <pause dur="1.1"/> so we had a <pause dur="1.1"/> a model now that said <pause dur="0.5"/> in ancient bacteria we had a <pause dur="0.3"/> <trunc>s</trunc> <pause dur="0.7"/> fairly small protein <pause dur="0.7"/> with these conserved amino acids in it <pause dur="1.0"/> as it <pause dur="0.7"/> moved from a bacteria to a fungus to a plant an animal <pause dur="3.0"/><kinesic desc="changes transparency" iterated="y" dur="5"/> and this is a representation of that <pause dur="1.1"/> one

important thing happened <pause dur="1.3"/> and again this is just a <pause dur="0.2"/> look at the shape rather than the detail <pause dur="0.7"/> the two motifs are the <pause dur="0.2"/> blocks <pause dur="0.2"/> where the yellow residues are <pause dur="0.6"/> these two motifs <pause dur="0.8"/> had moved apart <pause dur="1.4"/> and this is represented by the kind of <pause dur="0.4"/> tower in the middle <pause dur="1.3"/> the bacteria <pause dur="0.2"/> at the front at the top here <pause dur="0.9"/> plants and animals towards the bottom <pause dur="1.0"/> these residues are ones which had been inserted <pause dur="0.3"/> into the middle of a protein <pause dur="0.4"/> during the billions of years of evolution <pause dur="0.7"/> there were also residues <pause dur="0.5"/> stuck on at each end that i haven't shown <pause dur="0.3"/> at this end and that end <pause dur="1.9"/> but because we knew the significance of the two motifs and the histidine residues we could trace them <pause dur="0.9"/> but <pause dur="0.9"/> during evolutionary time <pause dur="0.2"/> proteins had become more and more complex <pause dur="0.4"/> they'd <pause dur="0.3"/> had extra residues inserted into them <pause dur="0.4"/> and they'd had extra residues stuck on either end <pause dur="0.4"/> and then eventually the whole protein had doubled <pause dur="0.6"/> and become a storage protein <pause dur="4.0"/><kinesic desc="changes transparency" iterated="y" dur="16"/> so we're getting kind of to the end of the <pause dur="0.7"/> the story now but <pause dur="0.6"/> # <pause dur="4.5"/> i just want to show you # <pause dur="1.2"/>

this which was the next attempt at <pause dur="0.7"/> at our model of <pause dur="1.0"/> what <pause dur="0.7"/> the <pause dur="0.5"/> the germin the oxalate oxidase might look like in real life i've told you what the computer said it would look like <vocal desc="laugh" iterated="n"/><pause dur="0.5"/> i've told you the prediction of how it might have changed during <pause dur="0.4"/> evolutionary time <pause dur="0.8"/> but what did it really look like <pause dur="1.5"/> and i go back to the comment that said <pause dur="0.4"/> this protein was a <pause dur="0.6"/> multimeric protein it had different subunits in it <pause dur="1.7"/> for many years for about ten years <pause dur="1.0"/> the biochemists in Canada had said <pause dur="0.9"/> we think that <pause dur="0.8"/> the germin protein is made out of five subunits because when we <pause dur="0.2"/> separate them which you can do <pause dur="0.3"/> we get kind of five and we <trunc>lo</trunc> if we measure the molecular weight <pause dur="0.5"/> we get something that says <pause dur="0.5"/> the molecular weight of the total protein's five times the weight of the subunit <pause dur="0.6"/> and that was <pause dur="0.4"/> what the computer said would be the model of <pause dur="0.4"/> five subunits stuck together <pause dur="2.4"/> we became a bit doubtful about whether that was valid because <pause dur="0.4"/> we already knew from the <pause dur="0.6"/> storage protein structures <pause dur="0.9"/> that <trunc>s</trunc> <pause dur="0.3"/> storage

proteins in seeds <pause dur="1.3"/> are composed of three subunits <pause dur="1.5"/> and we've <pause dur="0.2"/> if you remember that we said that <pause dur="0.4"/> each subunit is about twice as big as <pause dur="0.5"/> the germin subunit <pause dur="1.0"/> so common sense <vocal desc="laugh" iterated="n"/><pause dur="0.2"/> if you believe in biology having common sense would <trunc>s</trunc> <pause dur="0.3"/> argue that <pause dur="1.4"/> if we know that there's a structure of three subunits in each one is twice the size of the one we're interested in <pause dur="0.5"/> it's kind of obvious that say <pause dur="0.4"/> well wouldn't it <pause dur="0.2"/> make more sense to have six subunits of similar <pause dur="0.4"/> size <pause dur="0.3"/> that would then give an equivalent shape <pause dur="0.9"/> if <pause dur="0.3"/> evolution had conserved shapes and i've argued that <pause dur="0.4"/> evolution does conserve protein shapes <pause dur="1.3"/> and the computer model then said <pause dur="1.0"/> if we had <pause dur="0.6"/> in fact six <pause dur="0.6"/> subunits in our shape <pause dur="1.4"/> we would have then have something that we'd described as a <pause dur="0.2"/> trimer of dimers because <pause dur="0.4"/> you have this triangular <pause dur="0.9"/> shape <pause dur="0.8"/> so there's two here two here and two here <pause dur="0.6"/> so it's not <pause dur="0.7"/> there isn't a sixfold axis of symmetry there's a threefold axis of symmetry <pause dur="0.4"/> so <pause dur="0.6"/> and <pause dur="0.4"/> that shape would look very very similar to what we have in a seed

in a storage protein <pause dur="0.7"/> but here we'd have six bits rather than <pause dur="0.6"/> than three <pause dur="0.2"/> double-sized bits <pause dur="0.7"/> and that's the <pause dur="0.4"/> kind of simple maths of it <pause dur="1.0"/><kinesic desc="changes transparency" iterated="y" dur="12"/> so those were our two working hypotheses <pause dur="0.3"/> and <pause dur="0.4"/> the Canadian group were said <pause dur="0.3"/> oh sniff <pause dur="0.6"/> we've spent ten years and we've said it's a pentamer 'cause if you measure the weight <pause dur="0.6"/> then that tells you it's a pentamer <pause dur="1.7"/> and # <pause dur="1.2"/> what i'll now do is show you <pause dur="0.9"/> how we resolved that <pause dur="1.5"/> and we did it through <pause dur="0.6"/> conventional crystallography <pause dur="1.1"/> we had a <pause dur="0.9"/> student who's just finished <trunc>s</trunc> <pause dur="0.2"/> his PhD successfully <pause dur="0.3"/> who crystallized <pause dur="0.9"/> the germin-like protein the <pause dur="0.4"/> the oxalate oxidase from barley <pause dur="1.3"/> he purified it and purified it and eventually got a <pause dur="0.8"/> a <pause dur="0.5"/> a source of protein that was sufficiently pure to <pause dur="0.2"/> to produce crystals from it <pause dur="0.5"/> and that was a lot harder in this case than <pause dur="0.6"/> than in most cases and i won't go into the biochemistry but <pause dur="0.4"/> eventually <pause dur="0.5"/> he found us <pause dur="0.3"/> a crystal <pause dur="0.3"/> that was good enough to <pause dur="0.6"/> be able to resolve <pause dur="0.4"/> in the <pause dur="0.8"/> in the X-ray beams that you use for this sort of thing <pause dur="1.0"/>

and this is <pause dur="0.3"/> it's not published yet so <pause dur="0.5"/> not many people in the world have ever seen it before but <pause dur="0.5"/> this is his resolution <pause dur="0.9"/> of that <pause dur="0.8"/> now the definitive <pause dur="0.6"/> three-dimensional structure <pause dur="1.0"/> of oxalate oxidase from barley <pause dur="1.9"/> and <pause dur="0.6"/> if we get it the right way round <pause dur="2.0"/> well if there is a right way round <pause dur="1.8"/> you see there are six colours <pause dur="0.6"/> so we've confirmed absolutely that it is a hexamer <pause dur="0.2"/> it's made of six subunits <pause dur="4.1"/> there are some other very <pause dur="1.0"/> unusual or <pause dur="0.2"/> sort of key features about this that help to explain <pause dur="1.1"/> its <pause dur="0.5"/> # its biological <pause dur="0.5"/> and its chemical properties <pause dur="1.7"/> one is that <pause dur="1.2"/> if you <pause dur="0.3"/> just take for example there's three corners here <pause dur="0.9"/> we've got one <pause dur="0.9"/> two and three <pause dur="0.6"/> these are corners where this subunit <pause dur="0.6"/> the light blue one <pause dur="0.6"/> interacts with the dark blue one <pause dur="0.9"/> and they are held together <pause dur="0.8"/> by <pause dur="1.0"/> very <pause dur="0.6"/> tight <pause dur="0.4"/> linking of the <pause dur="0.4"/> of the helical <pause dur="0.4"/> # <pause dur="0.6"/> ends of the each subunit so <pause dur="0.3"/> they're called sort of <pause dur="0.2"/> <trunc>a</trunc> <pause dur="0.6"/> alpha helical clasps they <pause dur="0.4"/> join together <pause dur="0.5"/> very tightly <pause dur="0.9"/> so first of all it's held very tightly at three corners <pause dur="1.3"/> it's also <pause dur="0.2"/> stuck

together in effect by <pause dur="0.7"/> the centre of the protein <pause dur="0.9"/> this is the beginning of it this is the N-terminus of it <pause dur="0.7"/> the C-terminus is the bit down here <pause dur="2.5"/> the N-terminus <pause dur="0.9"/> is held together <pause dur="0.5"/> in this case the dark blue <pause dur="0.9"/> subunit <pause dur="0.4"/> is attached <pause dur="0.2"/> to this # magenta coloured one <pause dur="1.5"/> and it's held together by very strong bonds between <pause dur="0.7"/> the the amino acids in the centre here <pause dur="0.8"/> so you have <pause dur="1.1"/> these subunits <pause dur="0.4"/> at each corner which are <pause dur="0.2"/> holding each other together tightly <pause dur="0.7"/> you also have <pause dur="1.4"/> the other <pause dur="0.4"/> alternative <pause dur="0.3"/> groupings of this subunit this one and this one and this one or this one <pause dur="0.3"/> are holding each other together <pause dur="0.4"/> so you have a <pause dur="0.2"/> a intensively strong relationship that <pause dur="0.4"/> holds these <pause dur="0.2"/> different units together <pause dur="0.4"/> and that's characteristic of <gap reason="inaudible due to background noise" extent="1 word"/> the fact <pause dur="0.3"/> that this is very thermally stable <pause dur="0.7"/> it's withstands eighty degrees and it still survives <pause dur="0.5"/> so it takes a lot of energy to break those <pause dur="0.7"/> there's the <pause dur="0.7"/> links between it <pause dur="1.3"/> also <pause dur="0.3"/> it has the highest amount of <pause dur="0.6"/> hidden <pause dur="0.3"/> surface if you can imagine this is just on a flat surface

but <pause dur="0.2"/> you can look at this in three dimensions but <pause dur="0.6"/> a lot of the <pause dur="0.5"/> the surface of each monomer <pause dur="0.7"/> as you join it to the next monomer <pause dur="0.8"/> is not <pause dur="0.6"/> therefore exposed to the outside world to the solvents around the protein any longer <pause dur="1.3"/> so as you stick things together if you can <pause dur="0.5"/> hide <pause dur="0.3"/> the surface <pause dur="0.2"/> by sticking them together <pause dur="0.4"/> you reduce the exposure of the whole protein to the solvents around it <pause dur="1.0"/> and this has more than half of <pause dur="0.5"/> the area of each <pause dur="0.2"/> subunit hidden <pause dur="0.5"/> by the association <pause dur="0.5"/> and that's <pause dur="1.3"/> # i don't know whether it's the world record but it's close to the world record of proteins of <pause dur="0.8"/> of hiding <pause dur="0.3"/> surface <pause dur="0.5"/> by <pause dur="0.2"/> by assembling into a <pause dur="1.0"/> # into a larger <pause dur="0.7"/> # <pause dur="0.2"/> order protein <pause dur="0.7"/> and because <pause dur="0.2"/> it doesn't have <pause dur="1.0"/> very many surface loops on the surface these are the <pause dur="0.7"/> the strands that <trunc>joi</trunc> sorry the loops that join the strands together <pause dur="0.4"/> because these are not very many or large <pause dur="0.8"/> and where they are large they're hidden in the middle <pause dur="0.7"/> means that if you want to dissolve this protein with a protease <pause dur="1.0"/> you don't have many sites for the proteases to attack <pause dur="0.6"/>

so in other words it's a <pause dur="0.4"/> it becomes resistant to degradation by <pause dur="0.3"/> protein attack <pause dur="0.4"/> so it can withstand all sorts of chemical <pause dur="0.5"/> thermal <pause dur="0.9"/> and and other physical <pause dur="0.3"/> breakdown because it's such a tightly conserved <pause dur="1.0"/> and that helps you understand why it's evolved <pause dur="1.4"/> so successfully during <pause dur="0.3"/> throughout # <pause dur="0.4"/> time <pause dur="1.7"/> that in a seed what you want <pause dur="1.2"/> in <pause dur="0.4"/> the proteins in seeds <pause dur="0.2"/> seeds of the dried up part of the plant they have to withstand dehydration desiccation <pause dur="0.4"/> they have to withstand high temperatures <pause dur="1.2"/> and so the functional characteristics of this <pause dur="0.5"/> whole protein <pause dur="0.4"/> superfamily <pause dur="1.9"/> have these different <pause dur="0.2"/> characteristics that <pause dur="1.4"/> it started off in a primitive bacteria <pause dur="0.3"/> as quite a small protein but it had this probably <pause dur="0.4"/> the same thermally stable <pause dur="0.3"/> structure <pause dur="0.8"/> and during evolution <pause dur="1.4"/> where you'd need <pause dur="0.2"/> a desiccation tolerant thermally stable <pause dur="0.4"/> protein structure <pause dur="0.5"/> it's a lot easier to use one that exists in that organism <pause dur="0.4"/> rather than to invent one <pause dur="0.5"/> and it's a bit teleological but <pause dur="0.6"/> # <pause dur="1.1"/> plants <pause dur="0.7"/> in seeds have taken this <pause dur="0.4"/> desiccation tolerant

protein <pause dur="0.3"/> and multiplied it up enormously <pause dur="1.1"/> and they've used it <pause dur="0.5"/> for a different purpose <pause dur="0.8"/> what i'm <trunc>ju</trunc> <pause dur="0.2"/> going to say now explains two other bits of the <pause dur="0.6"/><kinesic desc="changes transparency" iterated="y" dur="16"/> biology <pause dur="1.8"/> and that is to take <pause dur="3.9"/> in in effect a third of that <pause dur="0.2"/> structure that you saw before <pause dur="0.5"/> and i'm going to compare it exactly <pause dur="0.5"/> with a <pause dur="0.2"/> one unit of a storage protein <pause dur="1.5"/> okay <pause dur="0.6"/> so if you can imagine the top here is oxalate oxidase but it's <pause dur="0.6"/> it's a third of the hexamer it's two <pause dur="0.2"/> subunits <pause dur="0.6"/> and we're comparing it directly <pause dur="0.7"/> with one subunit from the storage protein <pause dur="0.7"/> and you can see and you can superimpose this on this <pause dur="0.3"/> and they're almost indistinguishable <pause dur="0.7"/> so although in primary sequence <pause dur="0.5"/> if you match this to this you'd have less than <pause dur="0.5"/> twenty per cent similarity <pause dur="0.6"/> we know that the conservation is <pause dur="0.5"/> # <pause dur="0.3"/> important areas <pause dur="1.5"/> we we've got the helices here <pause dur="1.0"/> in the same place <pause dur="1.1"/> so we've got absolute now structural confirmation that our hypothesis that storage proteins were related to this is confirmed <pause dur="0.4"/> by real measurement <pause dur="0.4"/> in space <pause dur="1.0"/> and the two other <pause dur="0.3"/>

bits that i haven't mentioned are <pause dur="1.0"/> if you can see the green blobs in the middle here <pause dur="0.4"/> that is our metal <pause dur="2.6"/> there's one metal in each subunit <pause dur="0.5"/> this is the <pause dur="0.2"/> the metal that's held together by the histidine residues <pause dur="0.7"/> so manganese that's why i was <trunc>k</trunc> <vocal desc="laugh" iterated="n"/><pause dur="0.4"/> keen on mentioning manganese at the beginning <pause dur="0.5"/> so manganese containing oxidase <pause dur="1.0"/> it's a unique manganese containing oxidase 'cause there'd never been any described like it before <pause dur="2.9"/> storage proteins have one histidine <pause dur="0.4"/> i didn't stress that but you if you'd counted the number you might have seen that <pause dur="0.9"/> storage proteins have one histidine <pause dur="0.2"/> they have no metal <pause dur="0.5"/> so they've lost the two other histidines during evolution <pause dur="0.7"/> they've <pause dur="0.3"/> preserved the structure <pause dur="0.7"/> they've lost the metal <pause dur="0.4"/> they're no longer an enzyme <pause dur="1.0"/> so they don't have a <trunc>f</trunc> <pause dur="0.4"/> a chemical function <pause dur="0.4"/> they act as a store of amino acids in a seed <pause dur="1.0"/> so <pause dur="0.3"/> what you eat <pause dur="0.2"/> your diet <pause dur="1.2"/> is made up out of <pause dur="0.4"/> in effect <pause dur="0.2"/> deactivated enzymes <pause dur="1.7"/> that have gone through evolution <pause dur="0.5"/> by maintaining a structure

that <pause dur="0.5"/> can withstand heat <pause dur="0.4"/> and temperature <pause dur="0.7"/> but it's lost its enzyme activity by losing the <pause dur="0.4"/> histidines that bind the metal <pause dur="1.8"/> the other point i said <pause dur="0.5"/> was if you imagine the <pause dur="0.4"/> two motifs i said they moved apart in evolution <pause dur="1.6"/> they did move apart <pause dur="1.2"/> and that's represented by this loop here <pause dur="1.7"/> the loop here <pause dur="0.2"/> is the distance between the conserved areas <pause dur="0.3"/> and this loop <pause dur="0.4"/> can really be quite large <pause dur="0.3"/> without disturbing at all <pause dur="0.4"/> the structure of the protein <pause dur="1.0"/> so <pause dur="0.4"/> this loop and some of the other loops <pause dur="0.3"/> have changed in size <pause dur="0.3"/> but they haven't altered <pause dur="0.5"/> the structure <pause dur="0.7"/> and as an aside <pause dur="0.6"/> if you're interested in <pause dur="1.1"/> in food studies at all <pause dur="0.8"/> then you know that <pause dur="0.2"/> some storage proteins in seeds are powerful <pause dur="0.2"/> allergens <pause dur="1.0"/> and the best known of those is the peanut allergen <pause dur="0.3"/> if you <pause dur="0.7"/> if any of you are allergic to nuts <pause dur="1.0"/> very dangerous for some people <pause dur="1.1"/> part of the reason for that is that the peanut allergen <pause dur="0.5"/> has a very large loop in this position <pause dur="0.6"/> and that the allergenic <pause dur="0.5"/> amino acids <pause dur="0.4"/> are <pause dur="0.7"/> in these loopy areas <pause dur="1.1"/> so during evolution some

subset of storage proteins have become allergenic <pause dur="0.5"/> by putting in unfortunately for humans <vocal desc="laugh" iterated="n"/><pause dur="1.0"/> rather unpleasant amino acids here <pause dur="1.1"/> that can be toxic to people <pause dur="0.4"/> but now we understand the structure <pause dur="0.7"/> there are <pause dur="0.4"/> # <pause dur="0.2"/> G-M people <pause dur="0.7"/> who are modifying peanut proteins to remove those loops <pause dur="0.3"/> and therefore remove the <kinesic desc="changes transparency" iterated="y" dur="19"/> allergic potential <pause dur="0.7"/> of peanuts <pause dur="0.5"/> so <pause dur="0.5"/> the summary now says <pause dur="4.6"/> that <pause dur="2.3"/> # <pause dur="2.0"/> which of these shall i show you <pause dur="0.5"/> is that something like this happened in time <pause dur="1.2"/> # <pause dur="2.4"/> this is a <trunc>f</trunc> <pause dur="0.4"/> brief phylogeny then of the whole story <pause dur="1.8"/> but you had <pause dur="0.5"/> archaeal species you had <pause dur="0.4"/> bacterial species <pause dur="0.7"/> green bacteria <pause dur="2.1"/> fungi plants ferns <pause dur="0.5"/> the it doesn't have animals on here <pause dur="0.4"/> animals also have <pause dur="0.5"/> proteins that are related to C-storage proteins <pause dur="0.5"/> nobody knows what they do yet <vocal desc="laugh" iterated="n"/><pause dur="0.5"/> but if you look in a <pause dur="0.9"/> in a human or in a <pause dur="0.2"/> nematode worm <pause dur="0.5"/> they have a <trunc>s</trunc> protein sequence that's quite like the storage proteins <pause dur="0.5"/> we haven't got a clue what it does in an animal <vocal desc="laugh" iterated="n"/><pause dur="0.4"/> 'cause # <pause dur="1.0"/> we we suspect it's #

something to do with <pause dur="0.3"/> with desiccation tolerance but <pause dur="0.6"/> we don't know yet <pause dur="0.9"/> the other thing is that at certain times in evolution <pause dur="0.6"/> we had a duplication event <pause dur="0.6"/> this duplication event led to C-storage proteins there was another one i haven't had time to talk about <pause dur="0.5"/> at the beginning here <pause dur="0.4"/> that led to a different group of proteins <pause dur="0.8"/> and amongst the ones that <pause dur="0.7"/> this duplication led to <pause dur="0.5"/> were <pause dur="0.3"/> the other oxalate oxidase sorry <pause dur="0.3"/> the other oxalate degrading enzyme i showed you <pause dur="0.3"/> right back at the beginning <pause dur="0.6"/> so although we started ten years ago <pause dur="0.2"/> nearly <pause dur="0.9"/> with the choice between <pause dur="0.9"/> should we use oxalate oxidase or oxalate decarboxylase <pause dur="0.5"/> what we didn't have was <pause dur="0.4"/> any clue that in fact <pause dur="0.4"/> the two enzymes <pause dur="0.4"/> are probably very closely related <pause dur="0.7"/> but <pause dur="0.7"/> we go back to here <pause dur="0.8"/> we now know through this evolutionary analysis <pause dur="0.7"/> that oxalate decarboxylase <pause dur="0.4"/> is a duplicated version <pause dur="0.6"/> of oxalate oxidase <pause dur="0.8"/> it's very limited <pause dur="0.3"/> in its <trunc>co</trunc> <pause dur="0.2"/> conservation <pause dur="0.5"/> but we now know that this <pause dur="0.3"/> is twice of <pause dur="0.9"/> of the size of that <pause dur="0.3"/> and it's a member of the same superfamily so there's a certain <pause dur="0.4"/> kind of symmetry in the story <pause dur="0.7"/> that says <pause dur="1.1"/> throughout all of

this <pause dur="0.7"/> we followed a <pause dur="0.3"/> kind of academic analysis but it's led to an understanding of <pause dur="1.0"/> of conservation of function <pause dur="0.6"/> of conservation of <pause dur="0.8"/> # <pause dur="1.6"/> in some cases conservation of <trunc>s</trunc> sorry conservation of structure <pause dur="1.3"/> but at <pause dur="0.4"/> of a rather broad diversification of function <pause dur="0.6"/> and that's the <pause dur="0.5"/> a message in all of these evolutionary studies <pause dur="0.9"/> that you can start from very <pause dur="0.8"/> apparently very different and distantly related proteins and <pause dur="0.7"/> if you know the structure <pause dur="0.7"/> that's the key thing <pause dur="0.9"/> then you can find that <pause dur="0.5"/> the diversification isn't very that great <pause dur="0.3"/> and that lots of proteins <pause dur="0.5"/> are really members of this small subset of families <pause dur="0.7"/> so # <pause dur="0.8"/> i should finish there <pause dur="0.7"/> i'm sorry about the confusion for the <vocal desc="laugh" iterated="n"/> at the beginning <pause dur="0.9"/> # <pause dur="0.6"/><vocal desc="laugh" iterated="n"/><pause dur="0.9"/> <gap reason="name" extent="2 words"/> is <pause dur="0.2"/> obviously the expert in the modelling and i'm sure he's going to <pause dur="0.6"/> tell you and and show you how some of the <pause dur="1.0"/> # these techniques can be used <pause dur="1.1"/> but this is a <pause dur="0.2"/> i think an <pause dur="0.5"/> an interesting framework to build from <pause dur="1.1"/> and <pause dur="0.5"/> i should finish there <pause dur="0.3"/> anyway <pause dur="0.9"/> thank you <gap reason="name" extent="1 word"/> i hope you have every word of that <vocal desc="laughter" iterated="y" dur="1"/>