CS211 Spring 2002 Lecture 14: Searching 3/6/2003 ------------------------------------------------------------------------------- 0) Announcements + Prelim 1 (T1) tonight (see website for rooms) (bring ID) + A4 due 3/12 (see online material) Overview: + more about big oh (oh no!) + searching: - linear - binary + intro to sorting - insert - selection + analysis of searching/sorting ------------------------------------------------------------------------------- 1) Searching and Sorting and Data Structures Programing (and OOP): + information + action Data structures: + stores information + provides information Providing of information: + need ways to find and analyze data + common actions are searching and sorting Searching + look for something in a collection of data + not too many algorithms to worry about Sorting + reorganize data in a collection (or into another collection) + many, many algorithms Comparable interface + want generic way to compare data + might want to use same algorithms (action) for variety of structures (information) + hint of generic programming! more on that in a couple lectures ------------------------------------------------------------------------------- 2) Chaotic Search + Randomly look inside a structure + Not really discussed in textbooks, but it *is* an approach + big-Oh of this approach...maybe infinite + then again, maybe luck strikes! ------------------------------------------------------------------------------- 3) Linear Search Standard search: + easy to implement + easy to learn (taught at intro levels) Linear: + search inside collection one element at a time from start to end of collection + if item (KEY) found, stop looking + ordering not important unless looking for something that is ordered Somewhat generic code (could be better...how?): public static boolean linearSearch(Comparable[] c, Object key) { int i=0; while(i < c.length) if (c[i].compareTo(key)==0) return true; else i++; return false; } Asymptotic complexity: + reminder: worst-case scenario + amount of data in c is n + key could be last element + so, could feasibly not find key until have done n comparisons + comparison is the active operation + so, linear search is O(n) Other scenarios? + average? see DS&SD Conclusion: + compare to chaotic approaches + you might get lucky if key appears earlier, so O(n) won't happen often + about best you can do for unordered data is linear search Applications: + continuous data - find a value (usu. root) of an equation - pick an epsilon (tolerance) and march forward until equation works + discrete: - number guessing - pick a number from 1-100 -- could take 100 guess if start from 1 ------------------------------------------------------------------------------- 4) Binary Search Applications (continued): + continuous: bisection method + discrete: number guessing revisted Number guessing: + strategies for guessing a number between 1 and 100 - linear: start from 1 or 100 and work up/down - binary: start from 50 and split in halves takes 6 or 7 moves (depends on you split integers) + tallies for different size games: range linear binary pattern of guesses 1-1 1 1 1-10 10 5 (5-10,7-10,8-10,9-10,10) 1-100 100 8 (1-50,1-25,1-13,1-7,1-4,1-3,1-2,1) 1-1000 1000 11 (500,250,125,63,32,16,8,4,3,2,1) + binary better! + why does it work here? numbers (the range) are sorted! General algorithm: + sort the collection c (0..n-1 elements) + find middle of c (m) (n/2 element) - if m==key, return true - if m < key, search right of c (n/2+1..n-1) - if m > key, search left of c (0..n/2-1) Asymptotic complexity? + for given size n, what's worst case? + count comparisons of m==key (active operation) + approximate analysis: n comps pattern of search (assume last elem is the one we want) (o=not inspected,x=inspected) 1 1 o->x 2 2 oo->xo->xx 4 3 oooo->oxoo->oxxo->oxxx 8 4 oooooooo->oooxoooo->oooxoxoo->oooxoxxo->oooxoxxx 16 5 oooooooxoooooooo->see pattern for 8 (other values of n? need to ceiling operation... we're trying to find a pattern of worst cases first!) so, it seems that n = 2^(comps-1) so, log[2](n) = (comps-1) so, comps = log[2](n)+1 (see DS&SD, pg 114) + "comps" is really a count of operations, which is a function of n T(n) seems to be log[2](n)+constants (see TAmax from DS&SD) + big-Oh says to ditch constants, so binear search = O(log[2](n)) (CS people appreviate as log n) + note: anything less than O(n) means that not all values in the collections are being inspected What about unsorted collection? ex) {1, 4, 3, 5, 2} - look for the 2 with binary search - pattern: 3, 4, 1 because 4 is less than 5 - crap! it doesn't work - need sorting.... What about continuous data? (like equns) + already sorted! no problem Recursive binary search (alternative code): + see pg 460 DS&A Handy way to sort: + http://java.sun.com/j2se/1.4/docs/api/java/util/Arrays.html -------------------------------------------------------------------------------