Software Analysis Process that Could Save Your Web

A few months ago, I was asked about building a search feature for the web. The feature was to be used on an e-commerce store, and it had to be fast and efficient. It was, of course, an interview question; if I did not answer it well, how could I get the job? I just wanted to share my answer here.

Search... It seems like a simple thing that most software must do, but actually it is not. Each search function can have requirements that are completely different from the others. The question was about an e-commerce website, so we have to focus on what the customers would like to find... a product. The feature is for searching products or any other searchable items in the store. The API could be something like /search/suggest.json?q=XXX&resources[type]=XXX&resources[options][unavailable_products]=hide if this were a Shopify store, and it can return data of type products, collections, pages, and articles in JSON format. Anyway, if I simply answered with Shopify's built-in feature, no one could really see my potential, so I answered something else.
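For illustration, here is a minimal sketch of calling that endpoint from the storefront. The query value, the resource types, and the response shape are assumptions based on Shopify's Ajax predictive search API, not part of my original answer.

// Minimal sketch: calling Shopify's predictive search endpoint.
// The query "shirt" and the resource types are example values.
fetch('/search/suggest.json?q=shirt&resources[type]=product,collection,page,article')
  .then((response) => response.json())
  .then((data) => {
    // The response groups results by resource type (assumed shape).
    console.log(data.resources.results.products);
  });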

SERVER-SIDE

I assumed that we have our own server. We can cache data in an in-memory cache to reduce the need to access the database, and this cache must be invalidated whenever the master value in the database changes. Moreover, there is also a read-only replica database that syncs itself with the master database periodically.
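As a rough sketch of that idea, here is what the read and write paths could look like. The cache and db objects are hypothetical placeholders, not a specific library.

// Cache-aside sketch: `cache` and `db` are hypothetical placeholders.
async function getProduct(id) {
  const cached = await cache.get(`product:${id}`);
  if (cached) return cached; // serve from memory, skip the database
  const product = await db.query('SELECT * FROM products WHERE id = ?', [id]);
  await cache.set(`product:${id}`, product);
  return product;
}

async function updateProduct(id, fields) {
  await db.update('products', id, fields); // writes go to the master database
  await cache.del(`product:${id}`); // invalidate so the next read refetches
}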

The first thing to handle is the predictive search (the autocomplete feature). Normally, predictive search returns a small sample of results, so it does not need to go directly to the database unless the cache yields close to zero results; only in that case might we really search the database. We need to understand that accessing the database is slow and could hold up many other tasks that are more important than quickly grabbing a few suggestions. The number of results to return is also low. We have to look at the frontend too: some predictive search results include images, which take a large area on the screen, so we end up showing only a few results anyway; hence there is no need to fetch many results for this kind of search. With images, the screen can often accommodate only around 3-4 results; for text-only results, it could probably be 8-10. If the user cannot find what they want, they will hit return/enter to run a full search anyway. Limiting the predictive search also saves a lot of the customers' mobile data.
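To make this concrete, here is a hedged sketch of such a handler, assuming an Express-style server; cache.search and replicaDb.search are hypothetical helpers.

// Predictive search handler sketch (Express-style routing assumed).
// `cache.search` and `replicaDb.search` are hypothetical helpers.
const SUGGEST_LIMIT = 4; // with thumbnails, a screen fits only ~3-4 suggestions

app.get('/search/suggest', async (req, res) => {
  const q = req.query.q;
  let results = await cache.search(q, SUGGEST_LIMIT); // cheap in-memory lookup first
  if (results.length === 0) {
    // Only when the cache yields nothing do we pay the cost of the database.
    results = await replicaDb.search(q, SUGGEST_LIMIT);
  }
  res.json({ results });
});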

After the predictive search comes the real search: once the customer hits the return/enter button, the full search is fired, and this time the replica database is accessed. We do not need to touch the master database because this is a read-only operation; the master is reserved for critical tasks that insert or update data. The frontend is also very important here: setting up pagination or infinite scroll reduces a lot of workload on the server's side and also decreases the time the customer has to wait.
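A sketch of the paginated full search against the replica could look like this; the table and column names are made up for illustration.

// Paginated full search against the read-only replica (sketch).
// Table and column names are illustrative only.
app.get('/search', async (req, res) => {
  const page = Math.max(1, parseInt(req.query.page, 10) || 1);
  const perPage = 20;
  const rows = await replicaDb.query(
    'SELECT * FROM products WHERE title LIKE ? LIMIT ? OFFSET ?',
    [`%${req.query.q}%`, perPage, (page - 1) * perPage]
  );
  res.json({ page, results: rows }); // only one page travels over the wire per request
});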

CLIENT-SIDE

JavaScript is one of the most popular languages for this kind of feature on the web: it handles the API calls to the server without refreshing the whole page (AJAX calls). Here is the most important part. If we fire the autocomplete after every key-up, an n-character search term costs the customer and the server n calls. To reduce that number, we can look at how fast humans type on average and delay the calls accordingly.

On average, a person types 40 words per minute (typists type faster, of course). At roughly five characters per word, that is about 200 characters per minute, or 3.3 characters per second, so a user spends roughly 300ms on each keystroke. If we delay each API call by 400ms, the next keystroke usually arrives before the pending call fires, and we can cancel that call. Debounced this way alone, however, fast typists might never see any predictive results at all, so we cap the delay: fire a call after every third character regardless. This still reduces the number of API calls by about 67%.

So we can (1) reduce the number of API calls being fired, (2) reduce the load in the browser, since it no longer fires a request and renders results for every keystroke (an interval too short for a human to process anyway), and (3) reduce the load on the server. We also reduce the load on the network: even on a fast connection, the user cannot read every intermediate result while typing, and on a slow connection, fewer calls keep the network from being overwhelmed.

Here is a sketch of the search box as a ReactJS component.

class SearchBox extends React.Component {
  timeoutId = null;

  performPredictiveSearch = (query) => {
    // Fire the predictive search API and render the suggestions.
  };

  completeSearch = (query) => {
    // Fire the complete search API and render the results in the main body.
  };

  handleKeyUp = (event) => {
    const query = event.target.value;
    // Clear the pending predictive call: either a new keystroke arrived
    // within the 400ms delay, or a full search is about to supersede it.
    clearTimeout(this.timeoutId);
    if (event.key === 'Enter') {
      this.completeSearch(query);
      return;
    }
    if (query.length > 0 && query.length % 3 === 0) {
      // Fire anyway after every third character so fast typists still get results.
      this.performPredictiveSearch(query);
    } else {
      this.timeoutId = setTimeout(() => this.performPredictiveSearch(query), 400);
    }
  };

  render() {
    return (
      <div>
        <input type="search" onKeyUp={this.handleKeyUp} />
      </div>
    );
  }
}
