Skip to main content

pslct008

<?xml version="1.0"?>

<!DOCTYPE TEI.2 SYSTEM "base.dtd">

<TEI.2><teiHeader>

<fileDesc>

<titleStmt>

<title>Searching</title></titleStmt>

<publicationStmt><distributor>BASE and Oxford Text Archive</distributor>

<idno>pslct008</idno>

<availability><p>The British Academic Spoken English (BASE) corpus was developed at the

Universities of Warwick and Reading, under the directorship of Hilary Nesi

(Centre for English Language Teacher Education, Warwick) and Paul Thompson

(Department of Applied Linguistics, Reading), with funding from BALEAP,

EURALEX, the British Academy and the Arts and Humanities Research Board. The

original recordings are held at the Universities of Warwick and Reading, and

at the Oxford Text Archive and may be consulted by bona fide researchers

upon written application to any of the holding bodies.

The BASE corpus is freely available to researchers who agree to the

following conditions:</p>

<p>1. The recordings and transcriptions should not be modified in any

way</p>

<p>2. The recordings and transcriptions should be used for research purposes

only; they should not be reproduced in teaching materials</p>

<p>3. The recordings and transcriptions should not be reproduced in full for

a wider audience/readership, although researchers are free to quote short

passages of text (up to 200 running words from any given speech event)</p>

<p>4. The corpus developers should be informed of all presentations or

publications arising from analysis of the corpus</p><p>

Researchers should acknowledge their use of the corpus using the following

form of words:

The recordings and transcriptions used in this study come from the British

Academic Spoken English (BASE) corpus, which was developed at the

Universities of Warwick and Reading under the directorship of Hilary Nesi

(Warwick) and Paul Thompson (Reading). Corpus development was assisted by

funding from the Universities of Warwick and Reading, BALEAP, EURALEX, the

British Academy and the Arts and Humanities Research Board. </p></availability>

</publicationStmt>

<sourceDesc>

<recordingStmt>

<recording dur="00:49:46" n="6461">

<date>31/01/2001</date><equipment><p>video</p></equipment>

<respStmt><name>BASE team</name>

</respStmt></recording></recordingStmt></sourceDesc></fileDesc>

<profileDesc>

<langUsage><language id="en">English</language>

</langUsage>

<particDesc>

<person id="nf0704" role="main speaker" n="n" sex="f"><p>nf0704, main speaker, non-student, female</p></person>

<person id="sm0705" role="participant" n="s" sex="m"><p>sm0705, participant, student, male</p></person>

<person id="sm0706" role="participant" n="s" sex="m"><p>sm0706, participant, student, male</p></person>

<person id="sm0707" role="participant" n="s" sex="m"><p>sm0707, participant, student, male</p></person>

<person id="sm0708" role="participant" n="s" sex="m"><p>sm0708, participant, student, male</p></person>

<person id="sm0709" role="participant" n="s" sex="m"><p>sm0709, participant, student, male</p></person>

<person id="sm0710" role="participant" n="s" sex="m"><p>sm0710, participant, student, male</p></person>

<person id="sm0711" role="participant" n="s" sex="m"><p>sm0711, participant, student, male</p></person>

<person id="sm0712" role="participant" n="s" sex="m"><p>sm0712, participant, student, male</p></person>

<person id="sm0713" role="participant" n="s" sex="m"><p>sm0713, participant, student, male</p></person>

<person id="sm0714" role="participant" n="s" sex="m"><p>sm0714, participant, student, male</p></person>

<person id="sm0715" role="participant" n="s" sex="m"><p>sm0715, participant, student, male</p></person>

<personGrp id="ss" role="audience" size="m"><p>ss, audience, medium group </p></personGrp>

<personGrp id="sl" role="all" size="m"><p>sl, all, medium group</p></personGrp>

<personGrp role="speakers" size="14"><p>number of speakers: 14</p></personGrp>

</particDesc>

<textClass>

<keywords>

<list>

<item n="speechevent">Lecture</item>

<item n="acaddept">Computer Science</item>

<item n="acaddiv">ps</item>

<item n="partlevel">UG2</item>

<item n="module">Design methods</item>

</list></keywords>

</textClass>

</profileDesc></teiHeader><text><body>

<u who="nf0704"> yep <pause dur="7.3"/> okay <pause dur="0.2"/> today we're going to talk about searching over the last few weeks i've talked to you about sorting <pause dur="0.4"/> and # <pause dur="0.5"/> having sorted your list <pause dur="0.2"/> or having <pause dur="0.4"/> not bothered to sort it you might want to look for a particular item <pause dur="0.8"/> so i'm going to talk through the subject today <pause dur="0.4"/> what is searching and i'll give you some examples of searching algorithms <pause dur="1.8"/> next Wednesday i'll give you some examples of <pause dur="0.2"/> even cleverer techniques to find what you want quickly <pause dur="1.2"/> so we've done sorting we'll get on to searching <pause dur="1.3"/> <gap reason="name" extent="1 word"/> is sort of keeping up with this in the formal methods stream <pause dur="0.3"/> in the fact that he's talking about some things and then he's going to be talking about the formal specifications <pause dur="0.2"/> of sorting <pause dur="0.6"/> so hopefully it's coming together at least in our minds and hopefully in your minds <pause dur="0.9"/> so <pause dur="0.3"/> searching <pause dur="4.4"/><kinesic desc="changes transparency" iterated="y" dur="12"/> computer's still <pause dur="0.2"/> not working <pause dur="6.4"/> okay i guess you know what searching is really <pause dur="0.5"/> we're looking at locating a particular item in a list <pause dur="1.3"/> we're looking <pause dur="2.3"/> if you've

got a list of items you've got a pack of playing cards and you're looking for a particular one <pause dur="1.2"/> then <pause dur="0.5"/> you can perhaps look at every one <pause dur="0.2"/> until you find <pause dur="0.2"/> the one you want <pause dur="0.2"/> oh look there's the ace of spades <pause dur="2.3"/> if the list is already sorted <pause dur="0.9"/> so you've already got your cards in suit order and rank order <pause dur="0.8"/> # <pause dur="0.6"/> you can look through the entire list until you find either the one you want <pause dur="0.5"/> or something that's bigger <pause dur="0.5"/> oh we've found the two of spades and <pause dur="0.2"/> it's the ace mustn't be here <pause dur="1.3"/> so that's always the problem with searching <pause dur="0.2"/> you hope what you want is there <pause dur="0.3"/> but it might not be <pause dur="2.9"/> and you can get into really clever ideas because if you know it's sorted <pause dur="0.9"/> you can just flick through saying oh there's the hearts there's the clubs oh here's the spades let's look for the one i want <pause dur="0.9"/> so <pause dur="0.2"/> using the knowledge of the order <pause dur="0.2"/> to find what you want quicker <pause dur="1.6"/> so that's where we're going <pause dur="7.1"/><kinesic desc="changes transparency" iterated="y" dur="5"/> when we did sorting we <pause dur="0.2"/> looked at T-string list sort <pause dur="0.3"/> and we sorted on <pause dur="1.1"/> strings mainly didn't we <pause dur="0.2"/> so that <pause dur="0.5"/> we were looking to get the strings in

order <pause dur="1.1"/> today i'm going to look at <pause dur="0.2"/> numbers in particular integers so we're going to be searching for integers <pause dur="0.6"/> # i wondered whether i should have <pause dur="0.2"/> gone with the same idea as whether we should have been looking for strings and i got caught out <pause dur="0.6"/> # <pause dur="1.4"/> i had a random generator which was generating <pause dur="1.9"/> numbers between <pause dur="0.3"/> one <pause dur="0.2"/> and twenty <pause dur="0.9"/> # <pause dur="0.2"/> eighteen <pause dur="0.5"/> twenty <pause dur="1.2"/> five <pause dur="0.7"/> twelve <pause dur="0.8"/> and <pause dur="0.2"/> when i used string list sort on it <pause dur="0.8"/> it <pause dur="0.2"/> gave me <pause dur="0.2"/> an ordering <pause dur="0.2"/> like this one eleven <pause dur="0.3"/> twelve <pause dur="1.2"/> eighteen <pause dur="1.6"/> # <pause dur="0.5"/> two <pause dur="0.3"/> twenty <pause dur="0.8"/> because when you're handling strings <pause dur="0.4"/> it's looking at the strings and <pause dur="0.6"/> <trunc>i</trunc> <pause dur="0.2"/> it sees one as being followed by one followed by something else <pause dur="0.4"/> so don't get caught out by that <pause dur="0.8"/> # if you're going to use a sort make sure you're using the right sort of sort <pause dur="0.7"/> # <pause dur="1.0"/> i've put my searching algorithm to work working on my sort puzzles <pause dur="0.5"/> so <pause dur="0.2"/> i i hacked together a new sort routine that sorts integers into the proper <pause dur="1.7"/> range #<pause dur="0.3"/> and <pause dur="0.2"/> you can use that in the examples that we have <pause dur="0.6"/> today </u><pause dur="1.0"/> <u who="sm0705" trans="pause"> <event desc="student enters room" iterated="n"/> sorry </u><pause dur="2.9"/> <u who="nf0704" trans="pause"> there's some notes there <pause dur="0.5"/> okay <pause dur="1.9"/> two of the techniques that we

look at <pause dur="0.4"/> you can generalize to string you can change <pause dur="0.2"/> whatever string comparison you're doing to it and see compare text and you can do the same <pause dur="3.5"/> now yes <pause dur="0.2"/> little message in there about case sensitivity <pause dur="0.5"/> # you got to remember if you're sorting strings <pause dur="0.8"/> how have you handled case sensitivity have you put capital-A <pause dur="0.6"/> way up front and small-A <pause dur="0.6"/> after capital-Z <pause dur="0.5"/> have you got small-A <pause dur="0.2"/> and capital-A next to each other <pause dur="0.7"/> if you have <pause dur="0.2"/> then obviously your sorting technique has to work in the same way <pause dur="0.7"/> as your searching technique <pause dur="0.2"/> 'cause if they're different your algorithms will be up the chute <pause dur="1.9"/> so searching for numbers <pause dur="0.2"/> exhaustive and binary will generalize to string <pause dur="0.5"/> interpolation which is <pause dur="0.2"/> tricky <pause dur="0.2"/> but fast <pause dur="0.5"/> works with numbers and it doesn't work well with strings <pause dur="0.3"/> so <pause dur="0.6"/> if you're looking to search for strings <pause dur="0.4"/> generalize exhaustive or binary <pause dur="0.2"/> don't try with interpolation <pause dur="0.4"/> unless you're very clever <pause dur="0.2"/> in which case you can probably will do it <pause dur="1.9"/><kinesic desc="changes transparency" iterated="y" dur="9"/> i tried i couldn't do it <pause dur="5.6"/> okay in the same way as

we've done over the last few weeks what i've <pause dur="0.2"/> done <pause dur="0.4"/> in the hope we'd <trunc>na</trunc> have a nice computer to show them working <pause dur="0.5"/> is <pause dur="0.3"/> i've taken <pause dur="0.2"/> T-string list <pause dur="0.4"/> and i've inherited a new class in this example it's called <pause dur="0.4"/> T-S-string list <pause dur="1.0"/> i'm using a different class to the one we were using for sorting <pause dur="0.8"/> and <pause dur="0.6"/> i'm doing this <pause dur="0.3"/> in a vaguely object oriented manner <pause dur="0.9"/> okay <pause dur="0.4"/> # <pause dur="0.3"/> the function name's a bit long and it's falling off the end of the slide <pause dur="0.6"/> # <pause dur="0.4"/> but <pause dur="0.2"/> we we can go through what it is <pause dur="0.4"/> it's a function <pause dur="0.3"/> exhaustive so it's a method of this class <pause dur="0.5"/> it's taking a single parameter target <pause dur="0.2"/> which is what we're searching for <pause dur="0.9"/> and it's returning an integer <pause dur="1.5"/> okay <pause dur="1.3"/> little bit of comment telling you what it does returns the position of the first occurrence of target in the list <pause dur="0.8"/> if it's not present <pause dur="0.2"/> a negative value is returned <pause dur="0.8"/> my error handling <pause dur="1.1"/> okay <pause dur="0.4"/> local variable here <pause dur="0.5"/> integer <pause dur="0.6"/> begin end <pause dur="0.5"/> code on the next slide <pause dur="11.6"/><kinesic desc="changes transparency" iterated="y" dur="10"/> okay degree of <pause dur="0.2"/> inconsistency with my <pause dur="0.3"/> coding remember with functions in Pascal <pause dur="0.5"/> you either say the function name <pause dur="1.2"/> and the return value <pause dur="0.4"/>

or you say result <pause dur="0.7"/> on the return value <pause dur="0.7"/> good programmers wouldn't mix it <pause dur="0.2"/> in <pause dur="0.2"/> one's <pause dur="0.6"/> bit of <pause dur="0.2"/> program so we probably <pause dur="0.4"/> should change that one to say result <pause dur="0.5"/> although <pause dur="0.2"/> the effect is exactly the same <pause dur="3.1"/> so result takes the value <pause dur="0.2"/> minus-count <pause dur="0.8"/> so if we get to the end of a this function and we haven't changed the result of a function it's going to return a negative value <pause dur="1.1"/> i've said minus-count remember count is the number of items in your string list <pause dur="1.9"/> okay <pause dur="1.2"/> a for loop we're going to go through the loop going to go zero to count-minus-one <pause dur="1.4"/> list start at zero <pause dur="0.3"/> they go to count-minus one so they've got count elements <pause dur="1.0"/> okay <pause dur="0.7"/> and we say <pause dur="0.5"/> if <pause dur="0.9"/> we're working with integers so i'm converting this string to an integer <pause dur="0.5"/> is equal to a target <pause dur="0.3"/> what we're looking for <pause dur="0.5"/> if this item's <pause dur="0.2"/> equal to the target <pause dur="0.5"/> put the result <pause dur="0.6"/> equal to I and exit <pause dur="0.6"/> someone said what's exit mean <pause dur="0.8"/> in Pascal <pause dur="0.3"/> exit <pause dur="0.3"/> takes you to the end of the function the end of the procedure <pause dur="0.7"/> so <pause dur="0.4"/> it saves you # <pause dur="0.2"/> having <pause dur="0.2"/> a more complicated if <pause dur="1.9"/> good programming practice <pause dur="0.6"/>

possibly <pause dur="0.2"/> not the best of programming practice <pause dur="0.5"/> # <pause dur="0.2"/> exiting out <pause dur="0.5"/> the trouble is that <pause dur="0.6"/> otherwise you're into a complicated while loop here <pause dur="0.5"/> which is more difficult to understand so yeah it's a good programming practice 'cause you can read that and say <pause dur="0.3"/> oh when i've found the value i've finished i'm off home <pause dur="0.8"/> or i've finished the function anyway <pause dur="0.8"/> okay so <pause dur="0.3"/> this is exhaustive search <pause dur="0.3"/> i look at every item <pause dur="0.9"/> if <pause dur="1.0"/> i find the item that i want <pause dur="0.6"/> i said the result equal to that's position <pause dur="0.5"/> and i exit and that's the end of it <pause dur="0.7"/> if the item i want isn't in the list <pause dur="0.2"/> i get right down to the end of the list <pause dur="0.6"/> i've never done this bit of code here <pause dur="0.8"/> and i'm out of the bottom here <pause dur="0.3"/> result still equal to count-minus-one <pause dur="0.7"/> and the function will return having executed count times <pause dur="0.8"/> found nothing and say oh <pause dur="0.3"/> minus-count <pause dur="1.4"/> okay you happy with that </u><pause dur="0.9"/> <u who="sm0706" trans="pause"> list <pause dur="0.3"/> is <pause dur="0.6"/> if there's nothing in your list if you have an empty <pause dur="0.8"/> list <pause dur="0.3"/> you will <pause dur="0.2"/> still going to return zero <pause dur="1.2"/> if your count's going to be zero </u><pause dur="0.4"/> <u who="nf0704" trans="pause"> woo-hoo <pause dur="0.3"/> yes <pause dur="0.5"/> okay the point is that <pause dur="0.3"/> if <pause dur="0.2"/> count is zero and my list <pause dur="0.2"/> is <pause dur="0.6"/> empty count will be zero <pause dur="0.5"/> and so i'm going to return zero <pause dur="0.3"/> when you do testing you're supposed to

test with no elements aren't you <pause dur="0.3"/> yes <pause dur="0.2"/> well spotted <pause dur="0.6"/> okay so # <pause dur="1.0"/> we probably <pause dur="0.2"/> want to <pause dur="0.9"/> add one <pause dur="0.6"/> add one <pause dur="0.6"/> subtract probably subtract yes <pause dur="7.1"/> i i had in mind when i was writing these bits of code <pause dur="0.3"/> that i'd return different negative values <pause dur="0.3"/> depending whereabouts it didn't find it and which method 'cause the other methods have different ways of doing it <pause dur="0.7"/> # <pause dur="0.9"/> the same <pause dur="0.2"/> mistake is in all the programs <pause dur="0.3"/> i i tell you guys to test <pause dur="0.5"/> # i i i test by generating little lists and seeing if it looks right # <pause dur="0.9"/> okay <pause dur="0.3"/> # <pause dur="11.1"/><kinesic desc="changes transparency" iterated="y" dur="7"/> okay <pause dur="0.6"/> the rest of it's mostly true anyway <pause dur="0.8"/> okay <pause dur="0.2"/> exhaustive search because it examines each item in turn <pause dur="0.6"/> items that are at the front of the list <pause dur="0.8"/> # are found more quickly than those at the rear <pause dur="0.4"/> so <pause dur="0.4"/> if you had the whole of the telephone directory <pause dur="0.3"/> sorted on your telephone numbers and you were looking for a particular number <pause dur="1.2"/> # <pause dur="0.2"/> if it was near the beginning of the list you'd find it in <pause dur="0.2"/> seconds milliseconds <pause dur="0.4"/> if it was near the back you'd find it in hours or days <pause dur="1.0"/>

so <pause dur="0.9"/> it's not altogether <pause dur="0.2"/> the best <pause dur="0.4"/> and of course if it's missing <pause dur="0.3"/> you've got to look at everything <pause dur="0.2"/> before you find it <pause dur="0.6"/> so that's your worst case <pause dur="0.4"/> and this is <pause dur="0.2"/> in our big-O notation it's O-<pause dur="0.6"/>N <pause dur="0.2"/> so however many items you've got <pause dur="1.2"/> you have to look at each of them <pause dur="1.2"/> but for small lists it's adequate <pause dur="0.2"/> especially if you write it correctly <pause dur="0.7"/> okay <pause dur="0.3"/> easy to write and debug <pause dur="2.6"/> and test <pause dur="10.9"/><kinesic desc="changes transparency" iterated="y" dur="9"/> okay same function <pause dur="0.2"/> but this time <pause dur="0.2"/> we're assuming <pause dur="0.2"/> that we've already sorted the list <pause dur="0.7"/> # <pause dur="0.2"/> T-string list sort wouldn't be enough 'cause what i explained it sorts strings not integers <pause dur="0.6"/> but we can <pause dur="0.5"/> write a little sort routine or we can find one elsewhere <pause dur="0.7"/> # <pause dur="0.3"/> that would do the sort correctly so assume our list of integers is sorted into <pause dur="0.6"/> numeric order <pause dur="0.4"/> okay <pause dur="2.0"/> much the same as before <pause dur="0.7"/> the <pause dur="0.3"/> comment <pause dur="0.3"/> cut and pasted from before has the word <pause dur="0.2"/> sorted inserted <pause dur="0.4"/> here <pause dur="2.0"/> in a sorted list <pause dur="1.2"/> returns a negative value <pause dur="0.4"/> which we'll need to change because it's not quite right <pause dur="0.9"/> okay <pause dur="0.8"/> two local variables <pause dur="0.2"/> I and J they're not important what they're called <pause dur="0.6"/> # <pause dur="1.0"/> code on next slide <pause dur="14.1"/><kinesic desc="changes transparency" iterated="y" dur="13"/> okay <pause dur="0.6"/>

exhaustive searching on sorted list <pause dur="0.2"/> we start off putting the result equal to a negative number <pause dur="0.5"/> more negative than anything that we will have seen before <pause dur="1.2"/> and <pause dur="0.2"/> we've <pause dur="0.4"/> got <pause dur="1.3"/> here <pause dur="0.3"/> # <pause dur="1.7"/> we're using a local variable which <pause dur="0.7"/> i'm putting to the difference between <pause dur="0.4"/> the item i'm looking at and target <pause dur="0.4"/> so i'm looking to see <pause dur="0.5"/> whether <pause dur="0.2"/> i've passed the target or not because <pause dur="0.9"/> if we're <pause dur="0.2"/> earlier than target <pause dur="0.7"/> this value will be positive <pause dur="0.7"/> if we're passed the target it'll be negative <pause dur="0.5"/> and if we're actually on the target if this value here <pause dur="0.3"/> equals our target <pause dur="0.4"/> we've found the item <pause dur="0.3"/> is that clear <pause dur="0.9"/> yeah <pause dur="0.4"/> yeah <pause dur="0.9"/> okay <pause dur="0.3"/> so we're looking to find the target <pause dur="0.6"/> # <pause dur="0.6"/> it just saves having a complicated bit of code <pause dur="0.5"/> where i've got <pause dur="0.6"/> J here <pause dur="0.2"/> and J here <pause dur="0.3"/> i'd have to say is this greater than that <pause dur="0.9"/> or equal to that <pause dur="0.3"/> so <pause dur="0.7"/> just trying to save time <pause dur="0.8"/> # an optimizing compiler would do it for us anyway <pause dur="1.0"/> so <pause dur="0.7"/> if J is equal to zero <pause dur="0.4"/> then this is equal to this <pause dur="0.2"/> and we've found it <pause dur="1.2"/> so <pause dur="0.8"/> if we come in here <pause dur="0.3"/> we've found it <pause dur="2.1"/> we can put result equal to I <pause dur="0.7"/>

because that's the position in the list it is <pause dur="1.6"/> and exit <pause dur="0.4"/> out the end of the function without doing anything else <pause dur="1.5"/> okay <pause dur="0.8"/> if <pause dur="0.2"/> J is not equal to zero <pause dur="1.3"/> but J is still positive <pause dur="2.3"/> J is still positive J is positive <pause dur="0.4"/> if J is positive then this value <pause dur="0.4"/> is <pause dur="0.2"/> bigger than this value <pause dur="0.2"/> so this is bigger <pause dur="0.2"/> than that means we've passed the point in the list where our item should be <pause dur="1.2"/> so if J's positive <pause dur="0.2"/> because <pause dur="0.2"/> string <pause dur="0.6"/> I is bigger than target <pause dur="0.6"/> we've passed the point <pause dur="0.2"/> and we can also exit <pause dur="1.0"/> so we go round our loop <pause dur="0.4"/> round and round and round <pause dur="0.6"/> comparing the current item <pause dur="0.2"/> against target <pause dur="0.7"/> if <pause dur="0.2"/> target <pause dur="2.9"/> is less than what we're looking at we'll go round again <pause dur="2.9"/> target is less than <pause dur="2.1"/> is that the right way round <pause dur="2.8"/> are you certain it's the right way round do we want to look at an example <pause dur="4.4"/> okay we've got a list here <pause dur="0.6"/> suppose in this list here that we're looking at <pause dur="0.2"/> we've got thirteen as the target <pause dur="5.3"/><kinesic desc="writes on board" iterated="y" dur="4"/> what we do is we've got a target of thirteen <pause dur="0.6"/> we compare it to one <pause dur="1.0"/> we say <pause dur="0.5"/> one <pause dur="0.3"/> minus thirteen <pause dur="0.8"/> minus-twelve <pause dur="0.9"/> that difference is

negative <pause dur="0.3"/> so we're going to <pause dur="0.5"/> not be saying J is nought we're not going to be saying J is positive <pause dur="0.5"/> we're going to go round again so we're going to look at the next item <pause dur="0.5"/> so we're going to say one <pause dur="0.2"/> minus eleven <pause dur="0.3"/> which is minus-ten <pause dur="0.2"/> on a good day <pause dur="0.8"/> and <pause dur="0.5"/> we'll say <pause dur="0.2"/> hey that's still negative it's not equal to it it's not greater than <pause dur="0.2"/> so we'll go round again <pause dur="1.1"/> come to this item <pause dur="0.4"/> one <pause dur="0.2"/> minus twelve <pause dur="6.1"/><vocal desc="sneeze" iterated="n"/><pause dur="1.9"/> are you are you watching what i'm doing or are trying to # confuse me here <pause dur="3.3"/> we start by saying <pause dur="0.2"/> what's in strings I <pause dur="0.2"/> minus thirteen <pause dur="1.7"/> then we say what's in strings I <pause dur="0.3"/> minus thirteen <pause dur="1.7"/> yeah <pause dur="1.3"/> and that's minus-two <pause dur="5.6"/> and when we get to this point so Is nought one two three <pause dur="1.1"/> eighteen <pause dur="0.2"/> minus thirteen is <pause dur="0.2"/> positive <pause dur="2.2"/> okay <pause dur="0.5"/> and at that point <pause dur="0.3"/> we're saying <pause dur="1.8"/> J's not equal to zero <pause dur="1.5"/> J is greater than zero so we passed it so we'll go to exit <pause dur="1.6"/> yeah <pause dur="1.9"/> so we work out <pause dur="0.2"/> this quantity each time <pause dur="0.7"/> and <pause dur="0.5"/> if it's equal to it we've found it <pause dur="0.5"/> if it's gone positive we've gone past it so it's not there <pause dur="2.3"/> now are you happy with that <pause dur="5.5"/> mostly <pause dur="9.3"/><kinesic desc="changes transparency" iterated="y" dur="9"/> okay <pause dur="0.4"/>

exhaustive search <pause dur="0.7"/> still <pause dur="1.0"/> big-O-<pause dur="0.2"/>N <pause dur="0.9"/> worst sort of case is we've got to go round looking at every item <pause dur="1.3"/> could be O-N divided by two but that still is O-N <pause dur="1.5"/> it is faster <pause dur="1.5"/> usually <pause dur="0.5"/> when the item's not there <pause dur="0.6"/> because exhaustive searching without it being sorted you're going to go past <pause dur="0.5"/> everything and then say oh it <pause dur="0.2"/> isn't here <pause dur="0.7"/> if it's sorted <pause dur="0.4"/> normally <pause dur="1.1"/> it'd be somewhere between <pause dur="0.3"/> the beginning and the end <pause dur="0.2"/> on average in the middle <pause dur="0.6"/> you'd only look at half the list and say oh it's not there <pause dur="0.2"/> i'm going home <pause dur="2.0"/> but <pause dur="0.2"/> there are two things which will make it slower and you got to be aware of <pause dur="0.8"/> one <pause dur="0.2"/> there is <pause dur="0.2"/> a little bit more code <pause dur="0.2"/> we've got a few more statements to look at <pause dur="0.4"/> each time we're doing this comparison this and that <pause dur="0.7"/> is it equal <pause dur="0.2"/> is it greater than <pause dur="0.9"/> and you've got the overhead of sorting <pause dur="0.2"/> how long does your sort algorithm take <pause dur="0.8"/> so <pause dur="0.3"/> you've got both those things to contend with <pause dur="1.1"/> it's faster normally <pause dur="0.5"/> but make sure when you're saying it's faster <pause dur="0.5"/> that you've remembered sorting takes some time <pause dur="9.2"/><kinesic desc="changes transparency" iterated="y" dur="8"/> okay <pause dur="1.0"/> that's exhaustive search which effectively looks at everything <pause dur="1.1"/> binary search <pause dur="0.4"/> works on

sorted lists your list has to be sorted for it to work <pause dur="0.8"/> and what it does <pause dur="0.3"/> is <pause dur="0.2"/> it starts <pause dur="0.2"/> at the middle of the list <pause dur="2.7"/> back to drawing up the list on the board <pause dur="1.3"/> it starts at the middle <pause dur="0.3"/> can say that one's the middle <pause dur="1.2"/> and <pause dur="0.2"/> it says is our target <pause dur="0.6"/> if it was thirteen <pause dur="1.2"/> before or after this <pause dur="0.9"/> thirteen <pause dur="0.3"/> is bigger than twelve so we know <pause dur="0.3"/> we only need to look in this part of the list <pause dur="0.7"/> if our target was three <pause dur="0.3"/> we'd know <pause dur="0.4"/> we only needed to look before it <pause dur="0.4"/> oh <pause dur="0.2"/> and if our target was twelve it's a hey we've found it we can go home now <pause dur="2.1"/> so <pause dur="0.2"/> if it's equal we've found it <pause dur="0.2"/> smaller <pause dur="2.0"/> up <pause dur="1.7"/> bigger <pause dur="0.2"/> look down <pause dur="2.5"/> and if there aren't any items left in the bit of list that we're looking at <pause dur="0.4"/> it's not there <pause dur="0.3"/> so we're finished <pause dur="1.8"/> so it's doing the chop we've seen sort routines that do this we <pause dur="0.3"/> did with the <pause dur="0.5"/> # <pause dur="0.2"/> quick sort we were effectively chopping the list in half and sorting that <pause dur="13.4"/><kinesic desc="changes transparency" iterated="y" dur="9"/> okay so here's a bit of code that's going to do much the same <pause dur="0.5"/> # <pause dur="1.1"/> this is going to be a method on this object here <pause dur="0.6"/> called binary <pause dur="0.6"/> bit of comment which is exactly the same as the comment that we had before <pause dur="1.2"/> a few local variables here that we're going to

use <pause dur="0.4"/> min mid max and J <pause dur="0.7"/> # <pause dur="2.9"/> min's going to tell us where's the top of the bit of list we're looking at max is going to tell us the bottom <pause dur="0.2"/> and mid's where we're going to compute the middle point <pause dur="0.5"/> 'cause each time we'll go for the chunks of list <pause dur="0.5"/> and J is just the local variable <pause dur="0.7"/> same as we had before <pause dur="1.6"/> could have used more imaginative names <pause dur="0.4"/> one of the things we say about programs use <pause dur="0.4"/> meaningful names <pause dur="0.4"/> # <pause dur="0.4"/> the trouble with trying to fit everything onto a slide is that if your names get very meaningful when you get if statements they fall off the end <pause dur="8.8"/><kinesic desc="changes transparency" iterated="y" dur="7"/>

# <pause dur="4.5"/> okay <pause dur="0.2"/> it's <pause dur="0.2"/> definitely a bit more complicated than the code for the others it it takes more space <pause dur="0.5"/> and i've shunted some things onto two lines so that they fit <pause dur="0.4"/> so <pause dur="0.2"/> it's more complicated code we've got here <pause dur="3.1"/> but let's talk it through okay we're starting here <pause dur="0.6"/> min equal to zero max equal to count-minus-one <pause dur="0.2"/> the top and bottom of our list <pause dur="1.6"/> and we got a while loop we're going to go through while min is less than equal to max <pause dur="0.2"/> we're changing those values in here <pause dur="0.8"/> and <pause dur="0.2"/> if they're equal then our sublist has got nothing in <pause dur="0.2"/> and we said <pause dur="0.2"/> sublist nothing in <pause dur="0.3"/> it's empty <pause dur="3.5"/> okay J is this difference between <pause dur="1.0"/> the item we're looking at and target <pause dur="0.8"/> so once that becomes positive <pause dur="0.5"/> we're passed it <pause dur="0.4"/> when it's negative we're earlier <pause dur="0.4"/> and when that's zero we've found it <pause dur="1.5"/> and that's what the code says <pause dur="0.8"/> if <pause dur="0.3"/> J equals zero <pause dur="0.3"/> then we've found it <pause dur="0.4"/> and

we can return <pause dur="0.6"/> mid <pause dur="0.2"/> which is our index for that middle point <pause dur="3.6"/> did we miss a line of code there <pause dur="0.2"/> yeah we missed that line of code there <pause dur="0.9"/> this one calculates mid <pause dur="0.2"/> and we just add max and min and divide by two <pause dur="0.6"/> we could have <pause dur="0.8"/> copied the example from string list and done a shift right two <pause dur="0.6"/> # <pause dur="0.9"/> but i hesitate to try that sort of thing in my code <pause dur="2.3"/> okay so we've found it <pause dur="2.0"/> if <pause dur="0.2"/> J is greater than zero <pause dur="0.2"/> we're later in the list than the item we're looking for <pause dur="0.3"/> so we need to search earlier <pause dur="1.7"/> so <pause dur="0.5"/> if we're <pause dur="1.3"/> down here in our list <pause dur="1.3"/> and we're looking for thirteen we know we need to look earlier <pause dur="4.1"/> okay <pause dur="0.2"/> and we look earlier by changing our value <pause dur="0.2"/> of <pause dur="0.4"/> max <pause dur="0.5"/> we make it equal to <pause dur="0.6"/> not the middle point 'cause we know that's not the item <pause dur="0.2"/> the one before <pause dur="4.6"/> if it isn't earlier <pause dur="0.2"/> and it isn't this one <pause dur="0.2"/> it must be later <pause dur="0.8"/> if it's there <pause dur="1.0"/> and we handle that in the final else clause here which puts min <pause dur="0.8"/> equal to mid-plus-one <pause dur="0.3"/> again we don't need to include mid 'cause we know it's not that <pause dur="0.6"/> so <pause dur="0.4"/> search <pause dur="0.3"/> earlier <pause dur="0.3"/> search later <pause dur="1.2"/> # <pause dur="0.5"/> if it's not there <pause dur="0.2"/> we

make it equal to count-#-<pause dur="0.6"/>minus-one <pause dur="2.9"/> so we'll return a negative number <pause dur="6.6"/> okay <pause dur="0.2"/> are you happy with that as an algorithm <pause dur="0.9"/> yeah </u><pause dur="0.3"/> <u who="sm0707" trans="pause"> could we <pause dur="0.3"/> write it a lot easier if we use recursion instead </u><pause dur="1.5"/> <u who="nf0704" trans="pause"> couldn't we use it write it a lot easier if we used recursion instead </u><u who="sm0707" trans="overlap"> then you've got <gap reason="inaudible" extent="3 secs"/><pause dur="0.5"/> until you get down to the <gap reason="inaudible" extent="1 sec"/></u><pause dur="0.8"/> <u who="nf0704" trans="pause"> yeah i i wondered about <pause dur="0.2"/> writing a recursive version of this <pause dur="0.5"/> # <pause dur="4.5"/> i am sure that you could write this as a recursive routine <pause dur="0.7"/> # </u><pause dur="0.4"/> <u who="sm0707" trans="pause"> <gap reason="inaudible" extent="2 secs"/> <pause dur="0.6"/> <gap reason="inaudible" extent="1 sec"/><pause dur="0.4"/> well no this it would make sense to use them <gap reason="inaudible" extent="1 sec"/> but <pause dur="0.4"/> you can use recursion to take a lot out and make it more simple </u><pause dur="0.8"/> <u who="nf0704" trans="pause"> i'll tell you what <pause dur="0.5"/> we'll have <pause dur="0.8"/> three minutes till half past <pause dur="0.5"/> and if you're feeling adventurous <vocal desc="laughter" n="sm0707" iterated="y" dur="1"/> sketch out what we think is a recursive one <pause dur="0.3"/> i believe that you could make it recursive <pause dur="0.4"/> # <pause dur="0.2"/> my my book <pause dur="0.2"/> didn't make it recursive <pause dur="0.4"/> we'll <trunc>tr</trunc> try and sketch one out <pause dur="0.2"/> as i say three minutes <pause dur="0.4"/> # and we'll compare and contrast the two <pause dur="0.4"/>

okay </u> <u who="ss" trans="latching"> <gap reason="inaudible, multiple speakers" extent="3 mins"/><event desc="discussing task" iterated="y" dur="3:20"/></u><u who="nf0704" trans="overlap"> anyway i'm hoping we're going to have a recursive algorithm soon </u><u who="ss" trans="overlap"> <gap reason="inaudible, multiple speakers" extent="1 min 15 secs"/></u><u who="nf0704" trans="overlap"> okay have we have we got a a version that's recursive </u><pause dur="0.5"/> <u who="sm0708" trans="pause"> # sort of </u><u who="nf0704" trans="latching"> sort of <pause dur="0.6"/> do you want to write it up or do you want to dictate it to me </u><u who="sm0708" trans="latching"> neither </u><u who="sm0709" trans="overlap"> <gap reason="inaudible" extent="1 word"/></u><pause dur="0.3"/> <u who="nf0704" trans="pause"> <vocal desc="laughter" n="ss" iterated="y" dur="1"/> neither <pause dur="0.3"/> okay </u><u who="sm0710" trans="overlap"> it's not very <gap reason="inaudible" extent="1 sec"/> </u><u who="sm0711" trans="overlap"> we decided it's it because Delphi isn't <pause dur="0.3"/> functional language it wasn't actually very easy to do <pause dur="0.6"/><vocal desc="laughter" n="ss" iterated="y" dur="1"/> if it was list if it <pause dur="0.2"/> was a list based <pause dur="0.5"/> like <pause dur="0.3"/> <gap reason="name" extent="1 word"/> <pause dur="0.3"/> it would be a lot easier to do than Delphi </u><pause dur="1.2"/> <u who="nf0704" trans="pause"> okay the the the statement we've got

here <pause dur="0.2"/> is <pause dur="0.2"/> it's Delphi's fault for <vocal desc="laughter" n="ss" iterated="y" dur="2"/> not being a functional <shift feature="voice" new="laugh"/> language <shift feature="voice" new="normal"/><pause dur="0.5"/> and and if it was a functional language it'd be a lot easier to write <pause dur="0.4"/> # <pause dur="0.2"/> i've got a version that i'll i'll try on you <pause dur="0.4"/> # <pause dur="1.5"/> what would the execution time be like if we compared MIRANDA with Delphi <pause dur="0.7"/> answers on the back of a postcard to <pause dur="1.4"/> whoever you want to send it to <pause dur="0.8"/> # <pause dur="0.8"/> i think that the <pause dur="0.8"/> basic algorithm <pause dur="0.7"/> but what would bother me was that the sort of function level <pause dur="0.2"/> of this binary <pause dur="0.9"/> like the quick sort i'd need quite a few <pause dur="0.4"/> parameters into it <pause dur="1.2"/> and i think what i'd need is to be calling something well <pause dur="0.4"/> like our min and max as parameters in there <pause dur="1.1"/> and our target <pause dur="1.3"/> and make those all integers and things like that <pause dur="1.6"/> but basically the code would be <pause dur="0.2"/> if this thing i've called J <pause dur="0.7"/> is greater than zero <pause dur="1.5"/> so <pause dur="0.7"/> i'd need to look earlier in the list <pause dur="0.4"/> i'd call binary <pause dur="1.6"/><kinesic desc="writes on board" iterated="y" dur="45"/> again <pause dur="0.4"/> with min <pause dur="1.1"/> # <pause dur="0.2"/> min-<pause dur="0.3"/>minus-<pause dur="0.2"/>one <pause dur="0.3"/> i need to calculate min and J up here <pause dur="0.6"/> calculate them <pause dur="1.2"/> past target <pause dur="0.7"/> sorry my writing's falling to bits here

# <pause dur="0.2"/> else <pause dur="1.7"/> if it was <pause dur="0.7"/> bigger <pause dur="1.3"/> then i'd call binary <pause dur="2.2"/> with <pause dur="0.3"/> mid-<pause dur="1.3"/>plus-one <pause dur="0.5"/> max <pause dur="1.1"/> and <pause dur="0.3"/> target <pause dur="2.4"/> else <pause dur="1.6"/> i've found it so result <pause dur="2.3"/> will be equal to mid <pause dur="2.0"/> i don't think that's the whole code but i think the body is that what the sort of thing you were aiming at </u><u who="sm0712" trans="latching"> yeah </u><pause dur="0.2"/> <u who="sm0713" trans="pause"> yeah think we were trying </u><u who="nf0704" trans="overlap"> i think you <trunc>st</trunc> you still <pause dur="0.2"/> need the decorative bits </u><u who="sm0713" trans="overlap"> we were trying to pass <pause dur="0.2"/> we were trying to pass the list as a parameter which didn't really work </u><pause dur="0.8"/> <u who="nf0704" trans="pause"> ah yeah <pause dur="0.2"/> now you see <pause dur="0.2"/> that's where we had problems with the quick sort algorithm and # and we made a a scratch list as well didn't we <pause dur="0.6"/> # <pause dur="0.2"/> the conventional algorithms that just use arrays <pause dur="0.9"/> they just sort of index into <trunc>bis</trunc> <pause dur="0.2"/> different bits of the array and i think that helps <pause dur="0.7"/> i think you'd do it easier with link lists as well actually <pause dur="1.1"/> # <pause dur="0.6"/> so i think <pause dur="0.2"/> i think you're right i think you could write a double Basic algorithm <pause dur="0.9"/> i think the algorithm itself that i've given you is flawed from what <gap reason="name" extent="1 word"/> was saying here <pause dur="0.5"/> # <pause dur="0.8"/> if we only have one item <pause dur="1.4"/> and it's the item <pause dur="1.4"/> fall <pause dur="0.2"/> for the same problem that you were

telling me about earlier <pause dur="0.5"/> if we have only one item <pause dur="0.2"/> and it's the item <pause dur="1.5"/> then <pause dur="0.8"/> min's zero max is zero so this while falls out <pause dur="0.4"/> and we pop down here <pause dur="0.9"/> and we say result equals minus-count <pause dur="0.8"/> now this algorithm i i <trunc>hi</trunc> hijacked from Delphi three algorithms and only changed into objects so i think he's wrong as well </u><u who="sm0714" trans="overlap"> yeah it does <gap reason="inaudible" extent="1 sec"/> </u><pause dur="1.2"/> <u who="nf0704" trans="pause"> so let's <trunc>choo</trunc> </u><u who="sm0714" trans="overlap"> i bet it does work <gap reason="inaudible" extent="2 secs"/></u><u who="sm0715" trans="overlap"> <gap reason="inaudible" extent="2 secs"/> nought is equal to nought <pause dur="1.2"/> 'cause it would still run the middle section </u><pause dur="0.8"/> <u who="nf0704" trans="pause"> so it gives a right answer </u><pause dur="0.3"/> <u who="sm0715" trans="pause"> yeah </u><u who="nf0704" trans="latching"> oh so Mr Delphi algorithms is right <pause dur="0.2"/> oh i'm relieved <pause dur="1.1"/> okay <pause dur="0.2"/> # <pause dur="0.2"/> but the bit of error checking we would put in to make everything easier is we'd check if our list was empty and if it was empty we wouldn't bother trying to search <pause dur="0.2"/> or sort <pause dur="1.4"/><kinesic desc="changes transparency" iterated="y" dur="11"/> okay <pause dur="0.3"/> let's push on <pause dur="1.6"/> that's about half the slides we've done <pause dur="4.8"/> okay <pause dur="0.7"/> # <pause dur="0.5"/> the next few slides are just comparing the approaches <pause dur="0.4"/> i decided that i wasn't # <pause dur="0.2"/> doing very well at drawing examples on the board and i'd do some <pause dur="0.5"/> at home <pause dur="0.2"/> in advance <pause dur="1.2"/> okay <pause dur="0.5"/> binary search <pause dur="0.7"/> at each <trunc>s</trunc> step <pause dur="0.3"/> whether <trunc>re</trunc> recursive or not <pause dur="0.9"/> we're cutting the list in half <pause dur="2.3"/> and because we're cutting the <pause dur="0.2"/> list in half at each step <pause dur="0.4"/> we're <pause dur="1.0"/> big-O log-N <pause dur="0.8"/>

if you had <pause dur="0.5"/> a million items <pause dur="0.6"/> sorry i i spelled out a million because when people say million i never quite see <pause dur="0.2"/> how many zeros there are <pause dur="0.7"/> one followed by six <pause dur="0.2"/> zeros <pause dur="0.3"/> the most steps will be twenty <pause dur="1.1"/> with exhaustive search <pause dur="0.2"/> it would have been a million steps <pause dur="1.2"/> yeah <pause dur="0.2"/> so <pause dur="0.7"/> a lot <pause dur="0.2"/> faster <pause dur="0.2"/> the most you're going to do is twenty 'cause you're <pause dur="0.6"/> cutting it in half cutting it in a half <pause dur="0.4"/> and very quickly <pause dur="0.2"/> you're down to having either found it <pause dur="0.5"/> oh <pause dur="0.2"/> or it's not there <pause dur="3.7"/> okay <pause dur="5.7"/><kinesic desc="changes transparency" iterated="y" dur="6"/> as i said i i i i felt <pause dur="0.2"/> last <pause dur="0.2"/> time when i was talking about sorting <pause dur="0.4"/> and i started playing this game let's write a few lists on the board <pause dur="0.4"/> that i was always getting to a point where <pause dur="1.1"/> the example became more confusing so i thought i'll <pause dur="0.3"/> knock one up and i'm try it with this <pause dur="0.6"/> # <pause dur="0.8"/> as i implied earlier <pause dur="0.3"/> i've got a random number generator that just generates numbers in this instance <pause dur="0.4"/> # <pause dur="1.8"/> they've got to be between one and nine don't they <pause dur="0.6"/> think they're between one and ten and we just didn't get ten <pause dur="0.8"/> okay <pause dur="0.2"/> so this <pause dur="0.3"/> is just a random list i've

generated for this example <pause dur="0.7"/> this is the original list <pause dur="0.6"/> we can't use binary sort on that original list <pause dur="0.2"/> because it's not sorted <pause dur="0.4"/> binary search sorry <pause dur="0.2"/> binary search needs it to be sorted <pause dur="0.8"/> if we were to use exhaustive search <pause dur="1.3"/> we would look at each item in turn <pause dur="1.0"/> suppose we were looking for the value five <pause dur="0.7"/> we'd look first of <pause dur="0.8"/> item zero which is two <pause dur="0.3"/> then we'd see <pause dur="0.4"/> four <pause dur="0.2"/> then we'd see nine then we'd see five and say hey we've got it <pause dur="2.9"/> okay <pause dur="0.6"/> of course there's a few more fives in there but # <pause dur="0.8"/> this stops as soon as it's found one <pause dur="0.2"/> so it hasn't found these down here <pause dur="1.7"/> okay so you're happy with that <pause dur="1.1"/> if we were to use exhaustive search on this <pause dur="0.7"/> but <pause dur="0.2"/> to take the sorted list so we'd run our sorting algorithm <pause dur="2.6"/> we'd look at this one we'd look at element <pause dur="1.1"/> zero <pause dur="1.4"/> and say oh there's a one <pause dur="0.3"/> then we'd look at element one and there's a two <pause dur="0.4"/> another two another two <pause dur="0.5"/> and then we'd <pause dur="0.2"/> four's got a four in it element five's got a five <pause dur="0.3"/> oh we've found it <pause dur="0.9"/> in this example <pause dur="1.1"/> # <pause dur="0.2"/> the exhaustive search on the non-sorted list <pause dur="0.2"/> found it

faster <pause dur="1.1"/> because of the repeated instances <pause dur="2.8"/> binary search would cut the list into half <pause dur="0.9"/> and it would see a four <pause dur="0.2"/> there <pause dur="0.9"/> and it would then cut the remaining list <pause dur="0.4"/> in half <pause dur="0.2"/> which would be at this point here seven so it cuts it <pause dur="0.6"/> here <pause dur="1.1"/> and then here <pause dur="0.3"/> and it says <pause dur="0.3"/> oh i've found <trunc>s</trunc> five at position seven <pause dur="0.2"/> so it finds <pause dur="0.4"/> a different instance <pause dur="0.2"/> but since i'd asked for item five <pause dur="0.8"/> it doesn't matter <pause dur="2.9"/> okay are you happy with that as an example <pause dur="0.9"/> that's <pause dur="0.5"/> just how it works <pause dur="0.6"/><kinesic desc="changes transparency" iterated="y" dur="8"/> if we took the same thing <pause dur="0.2"/> and we're looking for six <pause dur="4.3"/> if we did the exhaustive <pause dur="0.4"/> non-sorted <pause dur="3.5"/> we'd look at all the items saying is this six is it six where's that six gone <pause dur="0.6"/> and we wouldn't find it <pause dur="0.3"/> so we'd look at everything <pause dur="0.6"/> and not find it <pause dur="1.6"/> this sorted list we've got here <pause dur="0.4"/> exhaustive search <pause dur="0.6"/> again we'd start at item zero <pause dur="0.3"/> look at one <pause dur="0.3"/> two <pause dur="0.7"/> three <pause dur="1.8"/> and would eventually find it here at position <pause dur="0.5"/> oh sorry i'm looking for six aren't i look at <pause dur="0.5"/> five and it's not that one you'd look at six and it's not that one you'd look at seven and it's not that

one <pause dur="1.0"/> and eventually <pause dur="0.2"/> here <pause dur="2.7"/> when it got <pause dur="0.2"/> to <pause dur="0.4"/> index eight there <pause dur="0.7"/> it'd say ooh this is too big it can't be here <pause dur="0.7"/> so looking for six it'd look all the way down to nearly the bottom of the list and then say oh it's not here anyway <pause dur="2.1"/> the binary search would again split it <pause dur="0.2"/> here <pause dur="0.2"/> at the middle say <pause dur="0.6"/> that's item four is that it <pause dur="0.9"/> no <pause dur="2.2"/> split it here <pause dur="0.9"/> is that it <pause dur="0.3"/> no <pause dur="2.9"/> split it <pause dur="1.2"/> probably here <pause dur="2.3"/> think the middle <trunc>li</trunc> the it's rounding down 'cause it's a div operation <pause dur="0.9"/> that's too big <pause dur="0.6"/> the sub<pause dur="0.3"/>list <pause dur="0.2"/> between here and here <pause dur="0.2"/> it's got no elements in it <pause dur="0.2"/> it'll fall out in that <pause dur="0.5"/> min max less than or equal to <pause dur="0.6"/> empty result equals minus-count <pause dur="0.4"/> minus one <pause dur="1.4"/> so <pause dur="0.2"/> it's just different approaches <pause dur="0.4"/> the binary is looking at fewer items 'cause it finds the middle one <pause dur="1.4"/> and then it says shall i look at that half or this half <pause dur="0.6"/> finds the middle one shall i look at this half or that half <pause dur="0.5"/> and so there's many less operations for it to do <pause dur="6.4"/><kinesic desc="changes transparency" iterated="y" dur="8"/> okay <pause dur="0.4"/> the next one we're doing is <pause dur="0.8"/> definitely <pause dur="0.4"/> tricky <pause dur="0.6"/> i don't think we're going to <pause dur="0.2"/> cover <pause dur="0.2"/> all the code

today what i'll do <pause dur="0.6"/> i'll tell you about the method <pause dur="0.4"/> and vaguely how it works <pause dur="0.6"/> then i'll go on to talk about hunting which is another approach <pause dur="0.4"/> and then next week when we talk about hashing <pause dur="0.2"/> i'll talk about interpolation as well so if you can remember next Wednesday to bring these notes with you as well <pause dur="0.7"/> then <pause dur="0.4"/> i won't need to copy more of them for you <pause dur="0.2"/> so i'll talk about the approach but i won't go into detail of the algorithm <kinesic desc="changes transparency" iterated="y" dur="8"/> just yet <pause dur="7.5"/> okay <pause dur="0.3"/> the approach of interpolation is similar to binary <pause dur="0.5"/> we might be able to write it recursively maybe i'll give you a week to try that <pause dur="0.6"/> okay <pause dur="1.2"/> it's using known values to guess where an unknown value lies <pause dur="0.5"/> so <pause dur="0.2"/> i've got a list of numbers between one and a million <pause dur="0.7"/> if they're actually evenly distributed <pause dur="0.5"/> and i'm looking for five-hundred-thousand <pause dur="0.5"/> i'd guess it was in the middle <pause dur="1.3"/> if i was looking for two-hundred-and-fifty-thousand i'd guess it was quarter of the way down <pause dur="0.8"/> if it was evenly distributed <pause dur="0.4"/> and that is the approach <pause dur="0.6"/> that we <pause dur="0.8"/> use <pause dur="0.4"/> what we know

about the top and the bottom value <pause dur="0.5"/> to guess what <pause dur="1.0"/> where the one we want is <pause dur="1.8"/> okay <pause dur="0.4"/> # <pause dur="0.2"/> if we then look in the middle and find that it's actually seven-hundred-and-fifty-thousand in there <pause dur="1.0"/> we'll change our process <pause dur="1.1"/> so <pause dur="1.1"/> in searching it uses the indexes of known values <pause dur="0.4"/> to guess what the index of the one we want might be <pause dur="1.5"/> it works well if your list is evenly distributed <pause dur="0.5"/> if you have all numbers from one to a million then your guess is going to be <pause dur="0.5"/> brilliantly right in that sort of list <pause dur="1.2"/> if your list <pause dur="0.2"/> isn't evenly distributed but it sort of is <pause dur="1.2"/> # <pause dur="0.3"/> it'll work <pause dur="0.5"/> quite well <pause dur="0.9"/> if your list is <pause dur="0.5"/> skewed and you've got a lot of big numbers and hardly any small numbers it's telephone numbers sort of <pause dur="0.9"/> territory <pause dur="0.5"/> # <pause dur="0.2"/> it isn't so good <pause dur="1.2"/> and it only works with numbers <pause dur="1.1"/> well i guess again if we're clever we could probably predict strings we could probably if we had a list of surnames <pause dur="0.5"/> we could probably say if they're evenly distributed or beginning with A and B and C <pause dur="0.6"/> we could use this technique <pause dur="0.2"/> but the

algorithm i'm going to show you <pause dur="0.4"/> relies <pause dur="0.2"/> on there being <pause dur="0.3"/> integer values in there <pause dur="1.3"/> okay <pause dur="0.8"/><kinesic desc="changes transparency" iterated="y" dur="12"/> i'll show you the outline of the code but i know we're not going to <pause dur="2.1"/> finish it <pause dur="2.8"/> that's just the outside of the function <pause dur="0.5"/> # few variables <pause dur="0.5"/> bits of comment <pause dur="0.8"/> the next slide <pause dur="3.7"/><kinesic desc="changes transparency" iterated="y" dur="8"/> gives a flavour but as i say we'll cover this next week so we won't go into the details <pause dur="5.6"/>

but it's a much like the other codes to start with in the fact that we're looking <pause dur="0.5"/> to see <pause dur="2.1"/> if <pause dur="0.3"/> we're on target <pause dur="1.7"/> if we're on target <pause dur="0.4"/> then we're done <pause dur="1.6"/> and if it's absolutely not there we're going to do a minus-count minus one thing there <pause dur="1.3"/> so that's if it's there <pause dur="2.8"/> but before we can reach this point where this is true <pause dur="0.9"/> we've got <pause dur="0.7"/> three subcalculations we'll need to do we will be going round the loop lots of times <pause dur="0.6"/> one we're going to calculate where the dividing point is <pause dur="0.6"/> so <pause dur="0.2"/> how many items are there in the list <pause dur="0.7"/> what are we actually looking for are we looking for a value that we believe is halfway through <pause dur="0.2"/> or quarter of the way <pause dur="0.3"/> through <pause dur="0.6"/> so we're

going to work out <pause dur="0.2"/> where we think <pause dur="0.2"/> it'll be in the list <pause dur="0.6"/> and there's a calculation for doing that <pause dur="0.9"/> then we're going to check that it's not out of bounds 'cause there's no point in looking <pause dur="0.5"/> if we've already <pause dur="0.3"/> fallen off the end of our list <pause dur="2.1"/> and when we've done that <pause dur="0.2"/> we'll create a new middle value we'll look and say <pause dur="0.8"/> oh <pause dur="0.3"/> is that it <pause dur="0.4"/> if it isn't it <pause dur="0.2"/> we'll go round again <pause dur="0.9"/> oh it sounds like recursion to me <pause dur="1.2"/> but we'll cover that in detail <pause dur="0.6"/><kinesic desc="changes transparency" iterated="y" dur="14"/> next week <pause dur="11.2"/> in theory <pause dur="0.8"/> it's <pause dur="0.2"/> the same order of magnitude as binary <pause dur="0.2"/> O-log-N <pause dur="1.4"/> but in actual tests with real data <pause dur="0.5"/> it runs <pause dur="0.3"/> on integer lists of a hundred-thousand items <pause dur="0.5"/> three times faster than binary <pause dur="0.2"/> it finds it quicker <pause dur="3.4"/> and that's particularly useful <pause dur="0.5"/> # <pause dur="0.8"/> the way that it works if you're having to fetch your data from a slow device 'cause if you've got it on a hard disk <pause dur="0.6"/> or i don't know <pause dur="0.4"/> something else that's relatively slow to fetch it from <pause dur="1.8"/> this is fetching what it thinks is the right area <pause dur="0.6"/> whereas binary you're finding the middle <pause dur="0.3"/> when you're finding the middle of the <pause dur="0.4"/>

rest of it you'll <pause dur="0.2"/> probably be rushing off somewhere to fetch it <pause dur="0.8"/> okay but more details of that <kinesic desc="changes transparency" iterated="y" dur="8"/> as i say <pause dur="0.5"/> next time <pause dur="1.5"/> two more slides <pause dur="4.5"/> hunting <pause dur="1.4"/> if you need to locate many items within a list <pause dur="0.9"/> a hunting technique will make it better for you <pause dur="0.7"/> # <pause dur="0.3"/> in the fact that <pause dur="1.3"/> if you assume that the items you're looking for <pause dur="0.9"/> are <pause dur="0.4"/> either close together <pause dur="0.9"/> or even better that you've <trunc>sil</trunc> sorted the list of items that you're looking for so they're in order <pause dur="1.0"/> you can start your hunt from the item that you've found so you find one item <pause dur="1.3"/> and if the next item on the list <pause dur="0.6"/> that you're looking for <pause dur="0.8"/> is smaller than the one you've found you know to look in that bit of the list if it's bigger you know to look in that <pause dur="0.2"/> bit of the list <pause dur="1.5"/> does that make sense so you just get your list <pause dur="0.7"/> you got all your items in your list <pause dur="0.8"/> and <pause dur="0.6"/> you've got a a list up here with <pause dur="3.8"/> # <pause dur="0.2"/> lots and lots of items in <pause dur="6.3"/> and then going down to a hundred got an <trunc>i</trunc> a large list there <pause dur="0.7"/> and you've got a smaller list of things that you <pause dur="0.4"/> want to <pause dur="0.6"/> find from

there i think perhaps some of these are actually in the list <pause dur="1.5"/> so if you look for three first <pause dur="1.4"/> having found it's not at that point <pause dur="0.4"/> you can look in the remainder <pause dur="0.2"/> for the next one <pause dur="1.2"/> so it's just <pause dur="0.2"/> bringing together <pause dur="0.5"/> what are you using your list for why are you doing the searching <pause dur="0.6"/> is this the reason that directory enquiries lets you look for two <pause dur="0.2"/> phone numbers is it quicker for them <pause dur="1.1"/> or is it 'cause they want to charge you money <pause dur="9.4"/><kinesic desc="changes transparency" iterated="y" dur="14"/> okay here we go <pause dur="0.4"/> # <pause dur="1.7"/> searching <pause dur="0.3"/> is <pause dur="1.3"/> something that you need to do a lot of in your databases you've created large amounts of data <pause dur="0.5"/> and then you need to access it <pause dur="1.3"/> # <pause dur="0.6"/> why do you need to access it <pause dur="0.4"/> what sort of algorithms <pause dur="0.6"/> well <pause dur="0.7"/> if all you're doing is looking at short lists and doing it occasionally <pause dur="0.5"/> an exhaustive sort <pause dur="0.2"/> search whether on a sorted or unsorted list <pause dur="0.3"/> is good <pause dur="0.2"/> you can write it easily you can test it you can debug it <pause dur="0.5"/> and why do more <pause dur="2.6"/> a binary search is good <pause dur="1.5"/> on a large list <pause dur="2.4"/> the other positive

thing about it is it's not dependent on data distribution <pause dur="0.5"/> interpolation works well <pause dur="0.2"/> if we've got <pause dur="0.5"/> a list <pause dur="0.3"/> evenly distributed <pause dur="0.4"/> if it's not binary search will be faster <pause dur="3.4"/> these two <pause dur="0.4"/> work <pause dur="0.3"/> well <pause dur="0.2"/> for numbers <pause dur="0.7"/> and for strings you can work with # <pause dur="0.8"/> names as well <pause dur="1.4"/> interpolation <pause dur="0.7"/> is very fast <pause dur="1.1"/> it's not good on unevenly distributed <pause dur="0.2"/> and it's not good on strings <pause dur="0.5"/> but remember three times faster on a hundred-thousand items <pause dur="0.7"/> it is <pause dur="0.2"/> good <pause dur="1.4"/> the trouble is <pause dur="1.0"/> you've seen two bits of <pause dur="0.3"/> slide that represent it there are three more slides on that <pause dur="0.4"/> the code is longer <pause dur="0.2"/> there's <trunc>la</trunc> many more bugs that can come in there <pause dur="0.5"/> while you're writing it <pause dur="0.5"/> do you need it <pause dur="1.4"/> okay <pause dur="0.8"/> right <pause dur="0.2"/> so that's the end of today <pause dur="0.5"/> on Monday <gap reason="name" extent="1 word"/> will be talking to you about formal methods <pause dur="0.4"/> and on Wednesday i want to talk to you about hashing and we'll include interpolation <pause dur="0.6"/> so i think that covers it thank you

</u></body>

</text></TEI.2>