Grokking System Design Interview- Steps to Follow


The challenges a software engineer faces, across a myriad of duties and responsibilities, range from mild to critical. A prospective hire needs ample knowledge of more than a handful of system-level and high-level languages, development and diagnostic tools, and other related technologies. This article is all about grokking the system design interview.

The job also requires constant maintenance, troubleshooting, or upgrading of existing systems. Over time, the position may call for creating a new and untried system from scratch. System design and development is one of many specialization routes for a computer professional.

A problem, though, exists in the real world. Many software engineers find the interview for a system design position hard, for several valid reasons.

First, they are not confident they have enough experience doing system design for a large-scale project. Second, they are terrified of the system design interview because of its vast and sometimes ambiguous questions, which have no standard answers. Third, they bring the wrong attitude of “come what may” and arrive at the interview not even half prepared.

An applicant who gets scheduled for an interview by a mega-corporation like Amazon, Google, or even FedEx has no reason to go unprepared, unless a serious, unexpected medical need arises.

If, however, the applicant comes in bright and fresh after lengthy preparation and shows both the capability to handle an intricate system and a zest for sorting out trouble, then expect an irresistible offer to be at hand.

To help the candidate succeed in a system design interview, we have come up with a step-by-step guide and a sample system design procedure for a theoretical application. It is assumed that the candidate already knows something of the tools and techniques required in system design.

Steps to Follow in Designing a System During an Interview.

    1. Clear and definite system goals, scope, parameters, and requirements;

    2. Start your work (but ask along the way, if need or can be);

    • Estimates on constraints and capacity

    • API

    • System designing

    • Design of database

    • Cache

    • Replication and partitioning of data

    • Load balancer

    • Telemetry

    • DB cleaning

    • Security and permissions

    • Why would we need URL shorteners?

    3. Present your model.

Step One. Clear and Definite System Goals, Scope, and Requirements

After the interviewer describes what they want, you should ask about and clarify the goals, system parameters, requirements and functions, and the scope of the system. This is important because the interviewer might leave and come back only at the end of the hour to discuss your model.

It is never a good idea to assume anything when putting up a system, because so many things need to be worked out in system design. You build it up from scratch, test and debug, run again, test and debug. This is the attitude to have unless the interviewer tells you that everything is left up to you.

If that is the case, start working immediately so you have time to iron out the kinks in your model system. If not, ask about everything that the company needs to decide.

Ask about system parameters such as storage, memory, database, servers, interfaces, cache, scaling, bandwidth, data model, security, and whether to use SQL or not.

Determine from the interviewer the requirements of the intended system as well. These are the primary, secondary, and extended functions: what the system should and should not do. The primary requirement is whatever would attract users to the application or website.

What does the system actually do? Is there any requirement before the system can do anything? Will the product of the operation be stored or given and not stored? How many operations does the system perform? Do we install a redundancy algorithm on all functions or operations?

These are but a few of the questions a candidate must ask the interviewer to be able to design a viable system. Although it may not be a perfect and running system under a limited time, the essence of the system will be seen and the potential of the candidate will then be known. 

Again, this all depends on the time available and the intricacy of the system they demand of you. Whatever that may be, before you take that seat it is best to be prepared for all eventualities and to have the tools and techniques you need to come through.

From here on, we will assume that the interviewer has told you to create an application that shortens a long URL, one that is prone to copying mistakes, down to fifty percent of its length or less. When a user opens the “shortened” URL, they will be directed to our system, and our system will redirect them to the “long” URL so they can browse to it.

Do note that this is only a theoretical setting.

Step Two. Start Your Work (but ask along the way, if need or can be)

    • Estimates on Constraints and Capacity

A system that shortens a long URL to less than fifty percent of its length will need the URL itself, of course, and the user’s information. That data must be stored somewhere for the user and for those with whom he or she has shared the shortened URL.

The creation of appropriate components will be discussed as we go along, but running this function on a worldwide scale first requires estimating, as closely as possible, the capacities and constraints that affect every program within the system.

Traffic, storage, memory, and bandwidth estimates must be determined or approximated by the system designer from the inputs given by the interviewer. This will help to see what the backbone of the system will look like to support the next activities that must be done.

For example, in the first year of operation, assuming there would be ten million new users a month, each creating one “shortened” URL:

Traffic: 10M new users/month (assuming a 100:1 read/write ratio)

1B redirects/month (10M * 100)

About 3.8 QPS (queries/sec) [write] = 10M / (60 sec/min * 60 min/hr * 24 hr/day * 30 days/month)

About 380 redirects/sec [read] = 3.8 * 100

Storage: 1 new user = 100 bytes

10M new users/month * 12 months = 120M new users/yr

10M new users * 100 bytes = 1 GB (gigabyte)/month = 12 GB/yr

Bandwidth: (write) 3.8 QPS * 100 bytes = 380 bytes/sec

(read) 380 redirects/sec * 100 bytes = 38 KB/sec

Memory: The URLs behind frequently requested redirects need to be cached to speed up the service, so:

380 redirects/sec * (3600 sec/hr * 24 hr/day) = about 33 million redirects/day

But only about 20% of these redirects are repeated, so:

33 million * 0.20 * 100 bytes = 660 MB/day
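The same arithmetic can be sketched in a few lines of Python, which makes it easy to rerun when the interviewer changes an input. Every number below is an assumption carried over from the example above, and the function name `estimate_capacity` is ours. Because the script does not round intermediate values, its results come out slightly higher than the hand-rounded figures (about 3.86 writes/sec instead of 3.8).

```python
# Back-of-the-envelope capacity estimate. All inputs are assumptions
# taken from the worked example, not measurements.
SECONDS_PER_DAY = 60 * 60 * 24
SECONDS_PER_MONTH = SECONDS_PER_DAY * 30


def estimate_capacity(new_urls_per_month=10_000_000, read_write_ratio=100,
                      record_bytes=100, cached_fraction=0.20):
    write_qps = new_urls_per_month / SECONDS_PER_MONTH
    read_qps = write_qps * read_write_ratio
    redirects_per_day = read_qps * SECONDS_PER_DAY
    return {
        "write_qps": write_qps,
        "read_qps": read_qps,
        "storage_bytes_per_month": new_urls_per_month * record_bytes,
        "storage_bytes_per_year": new_urls_per_month * record_bytes * 12,
        "write_bytes_per_sec": write_qps * record_bytes,
        "read_bytes_per_sec": read_qps * record_bytes,
        "redirects_per_day": redirects_per_day,
        "cache_bytes_per_day": redirects_per_day * cached_fraction * record_bytes,
    }


if __name__ == "__main__":
    for name, value in estimate_capacity().items():
        print(f"{name}: {value:,.1f}")
```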

    • Application Programming Interface (API)

Knowing the boundaries the projected system will operate within, it is good practice to define the interfaces that set the rules between machines and users. Web services are commonly exposed through SOAP or REST APIs; REST is usually preferable for a service like this because it is simpler and lighter-weight than SOAP.

Define the APIs for creating the “shortened” URL, with required and optional parameters. Do the same for returning the “shortened” URL to the user’s machine and for deleting a created short URL.
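As a rough sketch, the three operations could look like the Python functions below. The function names, the parameter names (`api_key`, `custom_alias`, `expire_date`), and the in-memory dictionary standing in for the database are all our assumptions for illustration; a real service would expose these over HTTP.

```python
import secrets
from typing import Dict, Optional

# Hypothetical in-memory store standing in for the real database.
_store: Dict[str, dict] = {}


def create_url(api_key: str, original_url: str,
               custom_alias: Optional[str] = None,
               expire_date: Optional[str] = None) -> str:
    """Create a short key for original_url and return it."""
    # Required: api_key, original_url. Optional: custom_alias, expire_date.
    key = custom_alias or secrets.token_urlsafe(6)  # 8 URL-safe characters
    if key in _store:
        raise ValueError("short key already in use")
    _store[key] = {"owner": api_key, "url": original_url, "expires": expire_date}
    return key


def get_url(short_key: str) -> Optional[str]:
    """Return the long URL for a short key, or None if it is unknown."""
    record = _store.get(short_key)
    return record["url"] if record else None


def delete_url(api_key: str, short_key: str) -> bool:
    """Delete a short URL; only the key's owner may do so."""
    record = _store.get(short_key)
    if record is None or record["owner"] != api_key:
        return False
    del _store[short_key]
    return True
```

Passing an `api_key` on every call is one simple way to tie each short URL to the account that created it.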

Malicious users can be dealt with this early by limiting each user to a certain number of created “shortened” URLs within a certain period of time. This matters for the survival of the service, which might otherwise be flooded with created URLs.
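One simple way to enforce such a limit is a fixed-window counter per user. The sketch below is only an illustration: the class name `CreationLimiter` and its parameters are hypothetical, and it never discards old windows, which a real service would need to do.

```python
import time
from collections import defaultdict


class CreationLimiter:
    """Allow at most max_per_window creations per user in each time window."""

    def __init__(self, max_per_window: int, window_seconds: int):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        # (user_id, window index) -> number of URLs created in that window
        self._counts = defaultdict(int)

    def allow(self, user_id: str, now: float = None) -> bool:
        now = time.time() if now is None else now
        bucket = (user_id, int(now // self.window_seconds))
        if self._counts[bucket] >= self.max_per_window:
            return False  # quota used up for this window
        self._counts[bucket] += 1
        return True
```

The create API would call `allow(user_id)` first and reject the request when it returns `False`.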

    • System Designing

This is the part of the system where an algorithm creates the shortened URL. There are two possible ways to get the job done: by deriving the key from the actual URL with a cryptographic hash function, or by using a standalone Key Generation Service (KGS).

If done by the first method, base64 encoding pairs well with a cryptographic hash function such as SHA-256. If done by the second method, returning the “shortened” URL will be faster, but concurrency problems may arise depending on which KGS is used.
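A minimal sketch of the first method, under our own assumptions: hash the URL with SHA-256, base64-encode the digest with the URL-safe alphabet, and keep the first eight characters. Since truncation can make two different URLs collide, the sketch retries with an incrementing salt. The function names, the key length, and the salt scheme are illustrative choices, not the only way to do it.

```python
import base64
import hashlib


def short_key(long_url: str, length: int = 8, salt: int = 0) -> str:
    """Derive a short key from the SHA-256 digest of the (salted) URL."""
    digest = hashlib.sha256(f"{salt}:{long_url}".encode("utf-8")).digest()
    # URL-safe base64 uses '-' and '_' instead of '+' and '/'.
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]


def create_unique_key(long_url: str, taken: set) -> str:
    """Return a key not yet in `taken`, bumping the salt on each collision."""
    salt = 0
    while True:
        key = short_key(long_url, salt=salt)
        if key not in taken:
            taken.add(key)
            return key
        salt += 1
```

Note that with this scheme the same long URL submitted twice receives two different keys, because the second attempt collides with the first and moves on to the next salt.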

Either method has its own issues. A workaround for them is necessary, since handing out an already-used “shortened” URL will ruin everything.
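For the KGS route, the key point is that no key is ever handed out twice, even when several requests arrive at once. The sketch below guards a pre-generated pool with a lock; the class name, the method names, and the tiny default key length are hypothetical and chosen only to keep the example small.

```python
import string
import threading
from itertools import product

ALPHABET = string.ascii_letters + string.digits + "-_"  # 64 symbols


class KeyGenerationService:
    """Hands out pre-generated keys, never giving the same key twice."""

    def __init__(self, key_length: int = 2):
        # A real service would keep these pools in a key database.
        self._unused = ["".join(p) for p in product(ALPHABET, repeat=key_length)]
        self._used = set()
        self._lock = threading.Lock()

    def take_key(self) -> str:
        with self._lock:  # only one caller may move a key at a time
            if not self._unused:
                raise RuntimeError("key pool exhausted")
            key = self._unused.pop()
            self._used.add(key)
            return key

    def release_key(self, key: str) -> None:
        """Return a key to the pool, for example after its URL is deleted."""
        with self._lock:
            if key in self._used:
                self._used.remove(key)
                self._unused.append(key)
```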

    • Design of Databases

This part of the system design helps the flow of data between the system’s individual components. Here we decide on the type of database we need, based on some important information that must be known before an appropriate type is selected. Information such as:

1. How big is one record? (We know this already at 100 bytes, in assumption.)

2. How many records will there be? (We also know this already at 10M new users/mo & 120M/yr.)

3. The system will have more read operations than write operations.

4. Does a relationship exist between records?

The last question is an important aspect in determining which database to use. Fortunately, there is no relationship between records, which lessens the stress and workload of system maintenance as a whole.

With that known, a non-relational (NoSQL) database may be chosen. It is also a much easier type of database to scale when the service sees an increase in new users.
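Since the records are independent of each other, each one can be stored as a single value under its short key. The sketch below shows one possible record layout and a dictionary standing in for the key-value database; the field names and the classes `UrlRecord` and `KeyValueStore` are our assumptions.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class UrlRecord:
    short_key: str
    original_url: str
    user_id: str
    created_at: str                   # ISO 8601 date string
    expires_at: Optional[str] = None  # optional expiration date


class KeyValueStore:
    """In-memory stand-in for a non-relational key-value database."""

    def __init__(self):
        self._rows: Dict[str, str] = {}

    def put(self, record: UrlRecord) -> None:
        # One self-contained document per key; no joins are ever needed.
        self._rows[record.short_key] = json.dumps(asdict(record))

    def get(self, short_key: str) -> Optional[UrlRecord]:
        raw = self._rows.get(short_key)
        return UrlRecord(**json.loads(raw)) if raw is not None else None
```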

    • Cache

As mentioned earlier, for the sake of speed, we may cache “shortened” URLs that are retrieved more frequently than others. We may do this with an off-the-shelf product. There are many out there, so choose the one that best fits the use of the system model.

There are still many decisions to make beyond procuring a ready-to-use product. One is the size of the cache memory, based on the estimates done earlier. Another is the eviction policy to adopt when a server is full: which entry do we evict first, the least recently used, or do we follow some other policy?

Which would be more efficient to use: a single big server, or two or three smaller servers for caching? What data structure is favorable for keeping track of all those URLs and the hashes that go with them? Do we replicate if multiple servers are used?

A flowchart and/or pseudo-code will help get the idea across the table. It will also help put flesh and bones on the model system.
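To make the eviction question concrete, here is a small sketch of a least-recently-used (LRU) cache built on Python's `OrderedDict`. The class name and the capacity are illustrative; an off-the-shelf cache would provide this policy for us.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Maps short keys to long URLs, evicting the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, short_key: str) -> Optional[str]:
        if short_key not in self._items:
            return None  # cache miss: the caller falls back to the database
        self._items.move_to_end(short_key)  # mark as most recently used
        return self._items[short_key]

    def put(self, short_key: str, long_url: str) -> None:
        self._items[short_key] = long_url
        self._items.move_to_end(short_key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

On a redirect, the service would call `get` first and, on a miss, read the database and `put` the result into the cache.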

    • Replication and Partitioning of Data

Data partitioning is preferred when a large volume of data is anticipated. It makes the system more efficient, lets a query work against a single partition instead of the whole data set, improves query management and performance, and simplifies common tasks on the data.

If data partitioning is not undertaken, database maintenance will take ages to finish, old data will be hard to find, indexes will be difficult and time-consuming to maintain, and data queries will be difficult.

The advantages of data partitioning far outweigh the other side of the coin. This, however, must be done with utmost care and in the correct fashion to achieve all the advantages mentioned.

There are two methods of data partitioning: by Range, and by Hash.

1. Range Partitioning: the hash key of the URL is the identifying mark, and its first letter determines where the URL goes. The URL is stored in the partition exclusive to that letter: if the letter is “A” or “a”, it goes to partition “A”/“a”, and so on. This method has a drawback that the systems engineer must anticipate: some letters may be far more common than others, which leaves the partitions unevenly loaded.

2. Hash Partitioning: this method uses the hash function we chose earlier, that is, if it was chosen over the Key Generation Service (KGS). If the KGS was chosen instead, the range method is the natural fit. If the hash function was chosen, it will distribute the URLs according to the generated key (1-256). Likewise, this method has its own drawback.
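The two methods can be sketched as two small functions. Both are illustrations under our own assumptions: the range version partitions on the upper-cased first character, and the hash version hashes the short key with SHA-256 and maps it onto partitions numbered 1 to 256.

```python
import hashlib


def range_partition(short_key: str) -> str:
    """Range partitioning: the first character picks the partition."""
    return short_key[0].upper()  # "a..." and "A..." share partition "A"


def hash_partition(short_key: str, partitions: int = 256) -> int:
    """Hash partitioning: a hash of the key picks a partition in 1..partitions."""
    digest = hashlib.sha256(short_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions + 1
```

Because `hash_partition` depends on the number of partitions, changing that number sends most keys to a different partition, which is worth mentioning if the interviewer asks about growth.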

    • Load Balancer (LB)

The importance of a load balancer may not be evident to a common web or internet user, but it is to a computer professional. Picture a website that grows in popularity on only one medium-sized server and springs into action by adding two more medium-sized servers. The sheer volume of growing traffic, however, is so huge that it overwhelms the website and the hardware behind it.

Load balancing would have helped, along with a couple more medium-sized servers to form a farm or pool of servers. Load balancing is a technique employed to ensure the website will not go into a figurative “firestorm” once the happy event of a tremendous rise in traffic happens. Dire consequences are likely without it.

It is the efficient distribution of traffic across the farm of servers. A load balancer may come as special hardware from a vendor, or as software, which makes it cheaper and more flexible.

The LB can be placed anywhere in the system: between the different types of servers or between the client and the servers. Three methods of traffic distribution are available: Round Robin (which happens to be popular among the three), IP Hash, and Least Connections. 

With session persistence (which IP Hash provides, for example), a single user’s traffic stays on one server until the session is over. This is usually the method for websites that use virtual carts when a user shops. It is not without drawbacks that the system designer must be aware of.
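The three distribution methods can be sketched in one small class. This is only an illustration: the class name, the server names, and the `active` connection counter are hypothetical, and a real load balancer would update the connection counts itself as requests start and finish.

```python
import hashlib
from itertools import cycle


class LoadBalancer:
    def __init__(self, servers):
        self.servers = list(servers)
        self._rotation = cycle(self.servers)
        self.active = {s: 0 for s in self.servers}  # open connections per server

    def round_robin(self) -> str:
        """Hand requests to each server in turn."""
        return next(self._rotation)

    def ip_hash(self, client_ip: str) -> str:
        """Always map the same client IP to the same server."""
        digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
        return self.servers[int.from_bytes(digest[:8], "big") % len(self.servers)]

    def least_connections(self) -> str:
        """Pick the server with the fewest open connections."""
        return min(self.servers, key=lambda s: self.active[s])
```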

    • Telemetry

Telemetry is analytics. It may be an added cost on the system developer’s end, but the data gathered is invaluable for improving the product or service. A relatively new technology, telemetry has leaped to the “cloud” and has embedded itself into chips.

Imagine knowing the country a request for a “shortened” URL came from, the time of day, the browser it came from, the number of times a particular URL was retrieved, and perhaps the name of the user, even partially, to infer gender.

Telemetry, however, has a rather unique challenge. It is not a technical problem in itself but a social one: privacy. Many users who notice the presence of telemetry in a product immediately turn it off. They do not like, to say the least, that someone is “looking” at them.
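A minimal sketch of such counters, assuming each redirect request carries a country, a browser name, and an hour of the day, and that a user who has turned telemetry off is simply not recorded. The class name, the field names, and the `opted_out` flag are our own illustration.

```python
from collections import Counter


class Telemetry:
    def __init__(self):
        self.by_country = Counter()
        self.by_browser = Counter()
        self.by_hour = Counter()
        self.by_key = Counter()  # how often each short URL was retrieved

    def record(self, short_key: str, country: str, browser: str,
               hour: int, opted_out: bool = False) -> bool:
        """Count one redirect; return False if the user opted out."""
        if opted_out:
            return False  # respect the user's choice: store nothing
        self.by_country[country] += 1
        self.by_browser[browser] += 1
        self.by_hour[hour] += 1
        self.by_key[short_key] += 1
        return True
```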

    • DB Cleaning

Purging data is equally important, because when storage is nearly full, performance suffers. The same happens with databases. The good thing about purging is that it comes with the option of archiving data, especially when losing historical data is frowned upon.

In the case of our system, cleaning the database is necessary. That is why, right from a new user’s first request, the user must fill in an expiration date option. At the end of that time, e.g., one year, the link for that user is deleted, and the key behind the “shortened” URL is sent back to the key DB, available for use again by a new user.

This is not the only case that happens in the real world. The candidate must anticipate questions from the interviewer with regard to DB cleaning.
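The cleanup described above can be sketched as a single function that deletes expired links and returns their keys to the pool of free keys. The function name, the record layout (a dictionary with an `expires_at` datetime), and the list standing in for the key DB are assumptions for illustration.

```python
from datetime import datetime, timedelta


def purge_expired(records: dict, free_keys: list, now: datetime) -> list:
    """Delete expired links and recycle their keys; return the purged keys."""
    expired = [key for key, rec in records.items() if rec["expires_at"] <= now]
    for key in expired:
        del records[key]       # the link no longer resolves
        free_keys.append(key)  # the key can be given to a new user
    return expired


if __name__ == "__main__":
    today = datetime(2024, 1, 1)
    records = {
        "old": {"url": "https://example.com/old",
                "expires_at": today - timedelta(days=1)},
        "new": {"url": "https://example.com/new",
                "expires_at": today + timedelta(days=365)},
    }
    free_keys = []
    print(purge_expired(records, free_keys, today), free_keys)
```

A job like this could run on a schedule during quiet hours, or expired links could be removed lazily when someone tries to open them.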

    • Security and Permissions

There is the perception that security is always excessive; but, when a breach happens, security then is not enough.

NoSQL databases generally do not come with iron-clad security built in. NoSQL is also the new kid on the block (not a kid anymore, though). Admittedly, there are situations when outside security needs to be brought in.

For our system, however, permissions are the issue for the moment. Do we allow users, new or current (as we know from the telemetry on their requests), to create their own unique URLs? If we do, those URLs will be limited to the characters of the number base we use (base64, remember?).

Shall we allow a set of users to access a particular URL at the same time or not? We can create an option called permission level, which can only be either public or private, and store it with the URL itself.

Also, right from the first request, and as a premium-account feature, the user can fill in a user-ID column, which will be stored in a separate table. The users in that table are the only ones who can see a specific URL.

Certainly, if the user does not have permission, an error message will be sent.
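The permission check can be sketched as follows, assuming each URL record carries a `permission` field that is either "public" or "private", and that a separate table lists the user IDs allowed to see each private URL. The function name and both dictionary layouts are hypothetical.

```python
def resolve(short_key: str, user_id: str, urls: dict, allowed_users: dict) -> str:
    """Return the long URL, or raise if the user may not see it."""
    record = urls.get(short_key)
    if record is None:
        raise KeyError(f"unknown short URL: {short_key}")
    if record["permission"] == "private":
        # Separate table: short key -> set of user IDs with access.
        if user_id not in allowed_users.get(short_key, set()):
            raise PermissionError("user has no permission for this URL")
    return record["url"]
```

In an HTTP service, the `PermissionError` would typically be translated into an error response such as 401 or 403.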

    • Why Would We Need URL Shorteners?

The primary reason many love URL shorteners is that they make life easier. A shorter URL saves space, is faster to copy, is less prone to copying mistakes (especially on a mobile device while on the move), and is easier to display, tweet, or print.

That is definitely an advantage everyone feels. Let us then go a notch higher than everybody: the techies, the geeks, the digital marketers, the online sellers. These are the people who wake up and sleep with a computer as a pillow.

They would not bother to bathe because it is a waste of time. Time.

For these guys, URL shorteners and the short links they give are a giant time-saving tool. The more emails they can send in a day, with the shortened URL embedded in them, the closer they get to that yellow Lamborghini or that black Bugatti or Paris Hilton.

The web has created a mountain of opportunities and passing the interview is another opportunity.

Step Three. Present Your Model.

In essence, the system design for the URL-shortening project is done. With a candidate’s stock knowledge of computer science or information technology, and of the tools and techniques for creating and maintaining systems, the interview would be a cinch.

That distinction of being a Google administrator can and will be a great boost to a candidate’s character and personality for the better. Good luck.
