Speed up your Code!!!

One way to speed up your applications is to implement caching.

Caching not only speeds up response times for users, it also reduces resource usage on the system, making it more efficient.

But what is caching?

In simple terms, caching means storing frequently demanded things closer to those asking for them.

Let's take a look at a simple analogy, beautifully depicted in this YouTube video about caching.

Suppose you want to write an article about a certain topic and want to access books from the library for the content.

Now, you could go to the library every time you want to look up some piece of information from the books, but that would be a very expensive operation: travelling from your home to the library, searching for the book, noting down the content, travelling back home, and repeating this whole process again and again.

What you can do instead is borrow these books from the library and take them home with you.

So now, every time you need to look up some content from the books, they are right there within an arm's reach at your house.

In this analogy, the library acts like a hard drive, which has a lot of content (books) saved in it but is expensive to access frequently. And your house acts like a cache, where you keep only the books you'll need to look up frequently, and where access is relatively cheap.

Python comes with a built-in module called functools, which contains higher order functions: functions that either take other functions as arguments or return other functions.
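As a quick illustration of what "higher order" means here (my own example, not from the original article), reduce takes a function as an argument, while partial returns a new function:

from functools import reduce, partial

# reduce takes a function as an argument and folds it over a sequence
total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4])
print(total)  # 10

# partial returns a new function with some arguments pre-filled
def add(a, b):
    return a + b

add_five = partial(add, 5)
print(add_five(10))  # 15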

The functools module also contains some higher order functions for caching, like lru_cache, which we'll take a look at later in the article.

For now, let's try to implement a simple cache all by ourselves in Python!

Save the code below in a .py file and try executing it. Let's see what output we get.

import requests

cache = dict()  # a simple in-memory cache: url -> article text

def get_article_from_server(url):
    print("Fetching article from server...")
    response = requests.get(url)
    return response.text

def get_article(url):
    print("Getting article...")
    if url not in cache:
        # only hit the server when the article isn't cached yet
        cache[url] = get_article_from_server(url)
    return cache[url]

get_article("https://realpython.com/sorting-algorithms-python/")
get_article("https://realpython.com/sorting-algorithms-python/")
get_article("https://realpython.com/sorting-algorithms-python/")
get_article("https://realpython.com/sorting-algorithms-python/")

When we run the above code, we get the following output.

Getting article...
Fetching article from server...
Getting article...
Getting article...
Getting article...

Notice how

Fetching article from server...

is printed only once, the first time?

That's because once we have fetched the article, we save it in the cache we created, before returning it from the function get_article.

Hence, for subsequent calls to the same article we don't have to fetch it from the server; we can just read the contents we saved in the cache earlier and return the result.
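To make the speed-up visible, you could time the two kinds of calls. This is a small sketch that reuses the get_article function defined above; the URL is my own pick, any article not yet in the cache works:

import time

url = "https://realpython.com/python-timer-functions/"  # any URL not in the cache yet

start = time.perf_counter()
get_article(url)  # first call: fetched from the server
print(f"first call:  {time.perf_counter() - start:.3f}s")

start = time.perf_counter()
get_article(url)  # second call: served from the cache
print(f"second call: {time.perf_counter() - start:.3f}s")

On a typical connection the first call takes a noticeable fraction of a second, while the second returns almost instantly.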

But we can't just build a cache from scratch every time we need one. The cache we implemented above was relatively simple, but what if we need more complex caching mechanisms? Thankfully, Python comes with some built-in functions and decorators that will help us do just that!

Let's take the problem of calculating the factorial of a number.
Below is what a simple piece of code to calculate a factorial looks like.

def factorial(n):
    return n * factorial(n-1) if n else 1

But the problem with this code is that, for each new value of n, it recalculates everything from scratch.

What I mean by that is: if I calculate the factorial of 7, I get 5040 as the output, calculated as 7 × 6 × 5 × 4 × 3 × 2 × 1.

But now, if I calculate the factorial of 5, it will recalculate 5 × 4 × 3 × 2 × 1, which it had already calculated earlier while calculating the factorial of 7.

These calculations can be stored in a cache and retrieved without having to be recalculated, thereby saving computing resources and time, making the overall system more efficient!

Let's now modify the code to use a cache instead.

from functools import cache  # functools.cache was added in Python 3.9

@cache
def factorial(n):
    return n * factorial(n-1) if n else 1

If we call factorial(10), the function will make 11 recursive calls, since the cache is empty. But if we call factorial(5) after this, the code won't do any calculations at all; it will just look up the value in the cache and return it.

And if we call factorial(12), the code will only make two recursive calls, for 11 and 12, since the values up to 10 are already stored in the cache.
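As a quick sanity check of those call counts, here is a small sketch (the calls counter is my own addition, not part of the original code) that counts how often the function body actually runs:

from functools import cache

calls = 0  # incremented only when the function body actually runs, i.e. on a cache miss

@cache
def factorial(n):
    global calls
    calls += 1
    return n * factorial(n-1) if n else 1

factorial(10)
print(calls)  # 11 -- one computation each for n = 10 down to 0

calls = 0
factorial(5)
print(calls)  # 0 -- answered entirely from the cache

calls = 0
factorial(12)
print(calls)  # 2 -- only n = 12 and n = 11 are new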

Great! Now we know how to create a cache and apply it to a function that does some expensive operations, so that we can speed up our code and make it more computationally efficient.

Let's take a little detour and revisit the library example...

You kept researching topics one after the other and, as a result, have brought home more books than you can keep!

In terms of caching, what has happened here is that your cache has grown without bound.

The purpose of a cache is to be small and store only some of the data that is frequently accessed. Having your cache grow as big as your hard disk defeats its purpose, as maintaining such a cache becomes significantly expensive.

So what we need now is a mechanism, or policy, that removes content from the cache once it has lost its relevance and is not accessed as frequently anymore. That is, we need some policy to help us off-load some of the books stored in our house back to the library. This is where cache eviction policies come in.

Cache Eviction Policies

There are many such strategies that will clear out your cache and keep it from growing indefinitely. Let's look briefly at some of those policies (with a small sketch of one of them after the list).

  1. First-In First-Out (FIFO) - The entries that were added to the cache first get deleted first. That is, it evicts the oldest entries.
  2. Last-In First-Out (LIFO) - The entries that were added to the cache last, i.e. the newest entries, get deleted first. That is, it evicts the newest entries.
  3. Least Recently Used (LRU) - The entries that haven't been used in a long time get evicted from the cache.
  4. Most Recently Used (MRU) - The entry that was used most recently gets evicted from the cache.
  5. Least Frequently Used (LFU) - The entries that don't get accessed often get evicted from the cache.
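Here is the promised sketch of one of these policies (my own illustration, not from the original article): FIFO eviction built on a plain dict, relying on the fact that Python dicts preserve insertion order:

MAXSIZE = 3
fifo_cache = {}

def put_fifo(key, value):
    if len(fifo_cache) >= MAXSIZE:
        oldest = next(iter(fifo_cache))  # dicts preserve insertion order, so this is the oldest key
        del fifo_cache[oldest]           # FIFO: evict the oldest entry
    fifo_cache[key] = value

for k in "abcd":
    put_fifo(k, k.upper())

print(list(fifo_cache))  # ['b', 'c', 'd'] -- 'a' went in first, so it was evicted first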

Let's take a look at how LRU works in a bit more detail...

When you implement a cache with the LRU strategy, it orders its entries by how recently they were used.

Every time you access an entry in the cache, the LRU algorithm moves that entry to the top (most recently used position) of the cache.

Now, when the algorithm looks at the cache, it knows that the entries at the bottom are stale and no longer being used, so they can safely be removed from the cache to make room for newer entries.

Take a look at the image below.

[Image: the user fetches article 1 and the cache places it at the top as the most recently used entry]

Here, when the user fetches article 1, the cache stores the article as the most recent entry and then serves it to the user.

Now, when the user requests another article...

[Image: article 2 is placed at the top of the cache, pushing article 1 down]

The cache marks article 2 as the most recent entry and pushes article 1 down the list.

In this way, the LRU strategy identifies which entries need to be evicted, so that the size of the cache can be kept under control.
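To make that bookkeeping concrete, here is a minimal sketch of an LRU cache built on collections.OrderedDict (my own illustration; Python's built-in lru_cache, shown next, does all of this for you):

from collections import OrderedDict

class LRUCache:
    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.entries = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self.entries:
            return None  # a cache miss
        self.entries.move_to_end(key)  # mark the entry as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.maxsize:
            self.entries.popitem(last=False)  # evict the least recently used entry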

Let's try implementing the LRU strategy with the example of calculating Fibonacci numbers.

Below is what a simple piece of code to calculate Fibonacci numbers looks like.

def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

Since this is a recursive function in which each call spawns two more calls, it becomes computationally heavy very quickly for large values of n.

Let's implement a cache on this function with the LRU strategy.

from functools import lru_cache

@lru_cache(maxsize=16)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

Similar to what we saw earlier, we can use the lru_cache decorator from the built-in functools module to implement a cache with the LRU eviction strategy.

Here, the maxsize parameter defines the maximum size the cache can grow to.

Now, if we call this function and then ask for its cache info, we get output something like this.

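Here's a sketch; the call fibonacci(30) is my own choice (the original post doesn't say which arguments it used), and cache_info() is the method lru_cache adds to the decorated function:

fibonacci(30)
print(fibonacci.cache_info())
# should print: CacheInfo(hits=28, misses=31, maxsize=16, currsize=16)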

Here,

hits - the number of calls that were answered from the cache

misses - the number of calls that weren't in the cache and had to be calculated

maxsize - the maximum size of the cache. Since we defined our maxsize to be 16, it shows the same in the output

currsize - the current size of the cache. Here we can see the cache is full

To sum up

Caching not only reduces the computing load on your system but also makes your code run faster.

Implementing caching in your code can drastically improve the user experience by returning results quickly.

Sources

  1. https://www.youtube.com/watch?v=6FyXURRVmR0
  2. https://docs.python.org/3/library/functools.html
  3. https://realpython.com/lru-cache-python/
