Sources Contact Advanced Search Tutorials

An Interest In:

Web News this Week

Search Archive

Some of Our Sources

View All Sources

Help Webnuz

Referal links:

May 1, 2020 07:11 am GMT

Let's build a Scalable system

I previously wrote about:

In this article, we'll get to know the preliminary steps you can take as a Software Engineer for building a scalable system.

Let's see how we can decrease loadtest time from 187s to 33s

Note: I'll be using Node.js but don't skip reading, try to absorb the concept, especially if you're a beginner.

Here is the task:

Build a server with only one GET request to return the highest Prime number between 0 - N

My Setup

I've used pure Node.js (Not express.js) for the creation of my server and routes as well, you are free to use express.js
You can use this idea with any language, so don't skip reading but you can skip the code/code repo.

Let's Start!

I used this as one of my assignments for hiring (experienced) devs. The session used to be a pair-programming setup where the candidate was free to use the Internet and tools of his/her choice. Considering the kind of my routine work, such assignments are really helpful.

When you wrote a brute-force approach

Let's assume you created your server with the basic algorithm to find a Prime Number. Here is an brute force approach example:

// just trying the first thought in mindfunction isPrime(n) {  for(let i = 2; i <= Math.sqrt(n); i += 1) {    if (n % i === 0){      return false;    }  }  return true;}function calculateGreatestPrimeInRange(num) {    const primes = [];    for (let i = 2; i <= num; i += 1) {      if (this.isPrime(i)) primes.push(i);    }    return primes.length ? primes.pop() : -1;  }

You'll try to use it in your GET route say like this https:localhost:9090/prime?num=20, it will work fine and you'll feel good. You tried it with some numbers like ?num=10, 55, 101, 1099 you will get instant response and life feels good :)

Hold On!

As soon as you try a large number say num=10101091 you will feel the lag (I've tried it in the browser, you can use Postman)

Since we are not using PM2 right now (which does a ton of things many of the beginners are not aware of), you'll notice that when you try to open a new tab and try for a smaller number, your tab will be waiting for the result of the previous tab.

What you can do now?

Let's bring in concurrency!

Cluster mode at the rescue!

Here is the block of code showing Cluster mode in action. If you are not aware of Cluster Module please read about it.

const http = require('http');const cluster = require('cluster');const os = require('os');const routes = require('./routes');const cpuCount = os.cpus().length;// check if the process is the master processif (cluster.isMaster) {  // print the number of CPUs  console.log(`Total CPUs are: ${cpuCount}`);  for (let i = 0; i < cpuCount; i += 1) cluster.fork();  // when a new worker is started  cluster.on('online', worker => console.log(`Worker started with Worker Id: ${worker.id} having Process Id: ${worker.process.pid}`));  // when the worker exits  cluster.on('exit', worker => {    // log    console.log(`Worker with Worker Id: ${worker.id} having Process Id: ${worker.process.pid} went offline`);    // let's fork another worker    cluster.fork();  });} else {  // when the process is not a master process, run the app status  const server = http.createServer(routes.handleRequests).listen(9090, () => console.log('App running at http://localhost:9090'));}

Voila!

After implementing the Cluster Module, you'll see a drastic change!

You can notice after this using threads, the browser tab with a smaller number will get the response quickly meanwhile the other tab is busy doing the calculations (you can try it out in Postman as well)

For those who are not using Node.js, cluster mode means running your app in concurrent mode using the available threads in the CPU.

Now we have a bit of relaxation but what else we can do to make it even more performant because out single requests with large numbers are still lagging?

Algorithms at your rescue!

I know this is a haunting word but it is really an essential tool you cannot ignore and in the end, after implementing a new algorithm you'll get to realize the worth of Algorithms.

So for prime numbers, we have a Sieve of Eratosthenes
We have to tweak it a bit so as to fit this in our use-case. You can find the complete code in the repo inside the class Prime.

Let's have a look at the Loadtesting Results

Brute force approach for num=20234456

Command passed to the loadtest module:

loadtest -n 10 -c 10 --rps 200 "http://localhost:9090/prime?num=20234456"

Result:

INFO Total time:          187.492294273 sINFO Requests per second: 0INFO Mean latency:        97231.6 msINFO INFO Percentage of the requests served within a certain timeINFO   50%      108942 msINFO   90%      187258 msINFO   95%      187258 msINFO   99%      187258 msINFO  100%      187258 ms (longest request)

Using SOE with modifications for num=20234456

Command passed to the loadtest module:

loadtest -n 10 -c 10 --rps 200 "http://localhost:9090/prime?num=20234456"

Result:

INFO Total time:          32.284605092999996 sINFO Requests per second: 0INFO Mean latency:        19377.3 msINFO INFO Percentage of the requests served within a certain timeINFO   50%      22603 msINFO   90%      32035 msINFO   95%      32035 msINFO   99%      32035 msINFO  100%      32035 ms (longest request)

You can compare both the results above and can see SOE is a clear winner here.

Can we improve it further?

Yes, we can, we can add a cache, a plain Object in Javascript which can be used as a HashMap.

Using a cache will store the result for a given number N, if we get a request again for N, we can simply return it from the store instead of doing the calculations.

Let's see the results

Brute force approach with cache for num=20234456

INFO Target URL:          http://localhost:9090/prime?num=20234456INFO Max requests:        10INFO Concurrency level:   10INFO Agent:               noneINFO Requests per second: 200INFO INFO Completed requests:  10INFO Total errors:        0INFO Total time:          47.291413455000004 sINFO Requests per second: 0INFO Mean latency:        28059.6 msINFO INFO Percentage of the requests served within a certain timeINFO   50%      46656 msINFO   90%      46943 msINFO   95%      46943 msINFO   99%      46943 msINFO  100%      46943 ms (longest request)

Using SOE with modifications & cache for num=20234456

INFO Target URL:          http://localhost:9090/prime-enhanced?num=20234456INFO Max requests:        10INFO Concurrency level:   10INFO Agent:               noneINFO Requests per second: 200INFO INFO Completed requests:  10INFO Total errors:        0INFO Total time:          33.047955697999996 sINFO Requests per second: 0INFO Mean latency:        19081.8 msINFO INFO Percentage of the requests served within a certain timeINFO   50%      23192 msINFO   90%      32657 msINFO   95%      32657 msINFO   99%      32657 msINFO  100%      32657 ms (longest request)

Time Analysis

Conditions	Time
With basic algo	187.492294273 s
With SOE	32.284605092999996 s
With Cache	47.291413455000004 s
With SOE & Cache	33.047955697999996 s

Finally

I hope you understood the benefits of the following:

Multithreading
Algorithms
Caching a.k.a Memoization

I hope you liked this short note, your suggestions are welcome. Hre is the code repo: find-highest-prime

You can find me on Github, LinkedIn, and, Twitter

Original Link: https://dev.to/ashokdey_/building-a-scalable-system-4l41

Share this article:

View Full Article

Dev To

An online community for sharing and discovering great ideas, having debates, and making friends

More About this Source Visit Dev To