JMU CS 240: Intro to Asymptotic Complexity Classes

Published: Aug 31, 2024 · Duration: 00:18:16

In previous videos we worked on understanding the running time of algorithms by developing functions that map from an algorithm's input size to the number of steps it takes to complete. The goal now is to categorize those functions using asymptotic analysis. What we mean by asymptotic analysis is that we're going to disregard the behavior of our growth functions for small inputs and focus on how those functions grow as the input size approaches infinity. That's how we're going to develop our categories. We'll eventually provide mathematical definitions of those categories, but let's start with some informal rules of thumb.

The informal idea is that our functions are going to be categorized according to their dominant, or fastest-growing, term. What that means is that we're going to throw away constants and lower-order terms in order to determine the category we're working with.

Let's see some examples. The notation 10n ∈ Θ(n) can be read as "10n is in big Theta of n": the category is Θ(n), and we're saying that 10n is in that category. We could also say that 10n is order n. How did we end up putting it in that category? As we said, we throw out constants, so 10n just becomes n.

Let's do another example: 5n² + 2n + 3. Maybe we worked really hard to analyze some algorithm, it had a set of nested loops, and we ended up with this growth function; that's exactly how many steps the algorithm performs. To find the big Theta category for this algorithm, we throw out the lower-order terms. If your function is composed of a number of elements added together, those elements are the terms, and the slower-growing terms (here, n versus n²) just get thrown away, along with the constant 3. That leaves 5n², and again we throw out the constant, so we're left with n². The category is Θ(n²).

One more example: n log n + n. Here the slower-growing term is the n. You might be momentarily tempted to throw out the log n as well: if it's n times log n, and log n is the slower-growing factor, shouldn't it go too? But dropping factors only applies to constants. For 5n² we threw out the 5, but log n is not a constant, so it doesn't get thrown out. The category is Θ(n log n).
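A minimal Python sketch of that rule of thumb (an illustration added here, not the instructor's code): it compares the full growth function 5n² + 2n + 3 from the example above against the bare dominant term n².

    # Compare the exact step count against its big-Theta category.
    def f(n):
        return 5 * n**2 + 2 * n + 3   # exact growth function

    def g(n):
        return n**2                   # dominant term, constants dropped

    for n in [10, 100, 1000, 10000]:
        # The ratio settles toward the constant 5: n^2 captures the
        # shape of the growth even though the exact count differs.
        print(n, f(n) / g(n))

The ratio approaches a constant rather than growing or shrinking without bound, which is exactly what it means for the two functions to land in the same category.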
Let's take a second to figure out why that rule intuitively makes sense, and to do that let me show a couple of quick demos. First, let's compare the function 10n to n². If we just look at this figure, we might be tempted to prefer n² as an algorithm: it takes fewer steps for all of the inputs being plotted, and we want faster algorithms, algorithms that take fewer steps. But of course you're smart enough to recognize that if we look a little farther along the x-axis, at larger inputs, eventually n² ends up with larger values, becoming a slower algorithm than 10n.

What if we compare n² to 20n, using a larger constant on our n? Once again we might say that n² is looking pretty good; it's faster for all of these values of n, so maybe we should prefer it. But you know what's going to happen: if we look a little farther out, eventually n² grows faster, it dominates 20n, and we actually prefer the 20n algorithm if the input is large enough. That same story repeats for any constant value we select to multiply by n. No matter how big that constant is, eventually we prefer the order-n algorithm over n². That's one justification for throwing out the constants: they may change the story in the short term, but in the long term they're never going to make a function with an n² in it better than a function that is linear.

I said "faster" and "slower" a lot while working through that example, so here is the summary: despite constants, functions from slower-growing classes will always be faster eventually. We have to be a little careful with this wording. When I talk about a slow-growing function, I mean a function whose value grows slowly as the input size increases. Log n is an extremely slow-growing function: even when n is really big, log n doesn't get very big. What that means is that if an algorithm runs in log n time, it is a fast algorithm. That's the potential confusion; just keep it clear.

Okay, so that's one reason we might want to drop the constants. How about lower-order terms? Let's look at another example. Now we're comparing n² + 3n to n², and what we can see is that n² + 3n is obviously larger; there's a gap between them. This isn't a case where that gap is going to go away: if I look farther along the x-axis, it's not like n² will eventually end up larger than n² + 3n. That isn't going to happen; n² + 3n is always the larger of the two. And as we can see, the gap between them is actually getting bigger: over here 3n is not very big, but farther along 3n has grown, so the gap between these two functions grows as n increases. We might be thinking that the 3n difference really is important, and that throwing it out would discard useful information.

But let's see what happens when we look a little farther out on the x-axis. We were going up to 20 before; let's multiply that by a factor of 10 and go out to 200. Now it looks like n² + 3n is landing almost right on top of n²: we can hardly see the difference. You can tell by a tiny bit that n² + 3n is above, a little bit larger, but the difference between them looks very insignificant. And if we go out farther still, increasing n by another factor of 10, they're indistinguishable: they look like exactly the same function.

How can that be, given that we know the gap between them keeps growing as n increases? That's true, but if we consider the two parts of this function, the n² part and the 3n part, the n² part is growing so much faster that it's overwhelming the 3n part. Once we're up at 4 million on the y-axis, the 3n is very small in comparison to the overall magnitude of the function; it's essentially getting washed out by the faster-growing term. At this point the 3n seems like an irrelevant detail: if what we're interested in is the shape of this function as it grows, it's really not having any impact. In the long term these two functions look exactly the same, and that's the informal justification for dropping lower-order terms.
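Here's a small sketch (again my own Python, standing in for the plotted demos) that reproduces both observations numerically: the crossover point where n² overtakes c·n, and the relative gap between n² + 3n and n² shrinking toward zero even as the absolute gap 3n grows.

    # Find the smallest n where n^2 finally exceeds c*n.
    # Algebraically this happens at n = c + 1, but we search
    # numerically to mirror the plots.
    def crossover(c):
        n = 1
        while n**2 <= c * n:
            n += 1
        return n

    for c in [10, 20, 1000]:
        print(f"n^2 overtakes {c}n at n = {crossover(c)}")

    # The same n values the demo used: 20, then 200, then 2000
    # (where n^2 reaches 4 million). The absolute gap 3n keeps
    # growing, but the ratio falls toward 1.
    for n in [20, 200, 2000]:
        print(n, 3 * n, (n**2 + 3 * n) / n**2)

No matter how large the constant c, the crossover always arrives; and by n = 2000 the ratio is 1.0015, which is why the two curves look like the same function.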
Okay, one more quick example. You might be thinking that all makes sense, but why should we drop constants for functions that are in the same category, the same order of growth? If we compare 10n and n, for example, clearly 10n is larger; it's always going to be larger, and no matter how far I go out on the x-axis, that gap is not going to shrink or disappear. With larger values of n, 10n is always going to be 10 times as large.

It's actually true that in practice we do care about constant factors. If I have an algorithm and I work really hard and make it 10 times faster, I've only increased its speed by a constant factor, but I'm probably going to be pretty happy about that. So constant factors matter, but there are some reasons they're not going to be our priority for analysis.

One is that, more than the absolute speed of an algorithm, what this asymptotic analysis is about is understanding the impact of increasing the input size. If I have two linear-time algorithms, 10n and n for example, and I ask what happens when I double the input size, in some sense the answer is the same for both of them: each takes twice as long. Increase the input size by a factor of 10 and both take 10 times as long; by a factor of 100 and both take 100 times as long. That story about the impact of increasing the input size doesn't change at all as a result of the constant values, and that's one reason we may not be interested in them.

Another side of that same story is what happens if we use a faster computer. If I get a computer that's 10 times as fast and I'm running a linear algorithm, the story is that I can process 10 times as many inputs, and the constant factor multiplying n doesn't change that story. It's the same whether the running time was 10n or 100n or n: buying a computer that's 10 times as fast allows me to process 10 times as many elements as I could with my slower computer. If the algorithm is an n² algorithm, on the other hand, getting a computer that's twice as fast doesn't allow me to process twice as many elements (only about √2 times as many), so it's a different story in that case.

The other reason we might not want to focus on constant factors is that they're influenced by the kinds of distractions we talked about when we introduced the choice of basic operation. The basic operation we pick changes the constants even though it isn't fundamentally changing the way the algorithm operates, and constants can also be affected by programming languages and lots of other irrelevant details.
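Both of those stories are easy to check numerically. Here's a hypothetical Python sketch (my own, with illustrative growth functions): it asks how the running time responds to a 10x larger input, and how many elements a 10x faster machine can handle.

    import math

    def linear(n):       # a linear algorithm: 10n steps
        return 10 * n

    def quadratic(n):    # a quadratic algorithm: n^2 steps
        return n**2

    n = 1000

    # Story 1: multiply the input size by 10. Every linear function
    # takes 10x longer regardless of its constant; the quadratic
    # one takes 100x longer.
    print(linear(10 * n) / linear(n))        # 10.0
    print(quadratic(10 * n) / quadratic(n))  # 100.0

    # Story 2: a 10x faster machine gives a step budget of
    # 10 * quadratic(n). A linear algorithm would handle 10x as
    # many elements; the quadratic one only sqrt(10) ~ 3.16x.
    budget = 10 * quadratic(n)
    print(math.isqrt(budget) / n)            # ~3.162

Swap 10n for 100n or plain n in the linear case and the printed ratios don't change, which is the sense in which constants drop out of the story.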
Okay, so let's recap all of these ideas. Our process is that we're going to drop constants and lower-order terms, and that's going to give us a set of categories, a set of classes. Let's draw some of those out. For example, Θ(log n) is a category, and in it we have things like 10 log n and 15 log n + 12: an infinite number of different functions that, once we apply our rule of dropping constants and lower-order terms, all end up in this log n category. Similarly, we have a category Θ(n), and in there we have things like 12n, n + 26, and 200n, again an infinite number of functions, but they're all considered linear functions; they all land in this Θ(n) category. All of the function families from our previous discussion will get a category of their own: Θ(n log n), with members like 2n log n, etc., plus a category for n², a category for n³, a category for 2^n, and a category for n!. There are an infinite number of these categories, but these are a few of the usual suspects, the cases that are going to come up again and again.

There's no overlap between these categories: a function lands in exactly one category. And they have the property that any function in a faster-growing category is eventually going to be worse than any function in a slower-growing category, so this provides an ordering on all of these growth functions, a categorical ordering. All of our linear functions are, in some sense, eventually better (slower growing) than all of our n log n functions; all of our n log n functions are, in the long run at least, better than all of our Θ(n²) functions. These categories are ordered in that sense.

We can think of this with an analogy to biology. In biology it's very important to categorize life forms: you have kingdom, phylum, class, order, family, genus, species, the fundamental categories of life, with individual species categorized into sets of sets. For computer science, these are our fundamental categories. I guess we could say that the individual functions, the running times of individual algorithms, could be considered a species, and these big Theta categories are like a genus. All of the different algorithms in a genus hang together: those in the Θ(n) genus, say, are all in some sense better than all of the species in the n log n genus and worse than all of the species in the log n genus. So this is our most fundamental categorization of algorithms and algorithm runtimes in computer science.

Notice also that this provides another justification for our flexibility in picking basic operations when we're doing algorithm analysis. When we introduced the notion of a basic operation, we said that the exact operation you pick doesn't really matter that much, as long as it's something that takes a constant amount of time and happens in the innermost loop. Now we can see why it doesn't matter: any reasonable choice of basic operation is going to leave you in the same big Theta category.
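To watch the categorical ordering emerge, here's one last illustrative sketch (one made-up representative per category, with constants deliberately chosen to make the slower-growing classes look bad at small n):

    import math

    classes = [
        ("Theta(log n)",   lambda n: 100 * math.log2(n)),
        ("Theta(n)",       lambda n: 50 * n),
        ("Theta(n log n)", lambda n: 10 * n * math.log2(n)),
        ("Theta(n^2)",     lambda n: n**2),
    ]

    for n in [10, 100, 10000]:
        print(f"n = {n}")
        for name, fn in classes:
            print(f"  {name:<16} {fn(n):>15,.0f}")

At n = 10 the quadratic representative is actually the smallest, but by n = 100 the rows already print in category order, and from there the gaps only widen: exactly the eventual ordering described above.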
Okay. This has all been a little hand-wavy and informal. What we're going to do as we move forward is come up with mathematical definitions of these big Theta categories, and also introduce notions of upper bounds and lower bounds when we introduce big O and big Omega notation.
