Scalable Web Development

Developing a simple web application with scalability in mind

Anuradha Wickramarachchi
Towards Data Science

--

Nowadays, web services are becoming more and more popular with the growth of mobile access and e-commerce. Anyone who builds a website naturally expects it to grow and attract more and more visitors.

But to start with, are we going to use an entire server farm anticipating 1 million users, which may be the target in 5 years' time? Well, the answer is no. But are we going to completely ignore our sincere expectations and build a static web page? Another big no! Let's see what we can do about it.

Considerations to start with…

To keep it simple, let's go ahead with the 5W framework (What, Why, When, Where, Who), a well-accepted approach to information gathering and problem solving. Let's apply the framework with scalability in mind and develop on top of it.

What?

For our scenario this is a web application. But the first question to ask should be: is it really a web application? Maybe not. For now, it is.

Why?

The purpose of the web application. Following are some common use cases these days.

  • E-commerce — needs security, SSL and other certification.
  • Interactive applications — social networking, education, blogging, where a large number of users access the site concurrently and stream content.
  • Information display — just showcasing content, without much serious computation.
  • Analytical platforms — accept requests and serve them asynchronously/synchronously; publish/subscribe-based processing (PubSub); APIs that provide interfaces to perform functions. A few expensive computations may run for a longer period of time.

When?

When are we going to deploy the product? Time is crucial, as “time is money”. The more time we take, the more we have to pay for the software process. With modern agile practice, more attention is given to coding than to documentation. Therefore a clear architecture must be followed and communicated regularly. Otherwise this could happen.

A failed design

Starting with a highly scaled solution would definitely take more time, starting from the design phase itself. Starting too simple would add extra pressure from rework. Therefore a tradeoff has to be agreed on before actual development. We’ll see how…

Where?

“Where” comes in a few forms.

  • Where is our target market?
  • Where are we?
  • Where are we going to deploy the product/project?

These questions are mostly addressed by business analysis. Yet as engineers we prefer to have services deployed closer to the target customer/user base for latency, security and load-balancing reasons.

Who?

Who are we targeting? What are their access patterns? Time zones, etc., come under this section. Such information directly shows the nature of the workload distribution over time. Working across two time zones sometimes makes it easy to deploy releases, as we can expect lower loads during working hours at the development centers. Or we might need to provide redundant servers to serve requests during maintenance hours for a better QoS (quality of service).

A Scalable Solution

Let us now consider the scenario of the web application to build a scalable solution. It is always good to think ahead, but not beyond foreseeable future.

Maintainability and scalability

Anticipating some growth, whatever we make must be maintainable. Otherwise there would be huge rework in every release. Therefore proper separation of concerns has to be practiced across the entire project.

Layering

The diagram represents the layering of content for 2 database instances. Load balancing is used either to spread load across both instances or to use one at a time as a fail-safe mode.
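The two modes above can be sketched at the proxy level. This is an illustration only, assuming NGINX's stream module and placeholder addresses; in practice, replication between the instances is handled by the database layer itself.

```nginx
stream {
    # Both instances share the load (round-robin by default).
    upstream db_active_active {
        server 10.0.0.1:5432;    # instance 1
        server 10.0.0.2:5432;    # instance 2
    }

    # Fail-safe mode: the second instance is used only when the first is down.
    upstream db_failover {
        server 10.0.0.1:5432;
        server 10.0.0.2:5432 backup;
    }

    server {
        listen 5432;
        proxy_pass db_failover;  # swap to db_active_active to share load
    }
}
```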

Connecting Components

It is always better to have separate components and pass messages between them for communication. This adds communication overhead, yet it scales very well and is easy to maintain too. The following diagram represents the organization of components in the implementation. This is different from the diagram above: it does not demonstrate the actual flow of information, but the separation of components in technical terms.

Arrangement of components in actual implementation

Usually NGINX is used to route requests (Apache was common in earlier times; Microsoft has IIS, and Passenger serves Python and Ruby). The containers usually run on different ports, but we usually expose only port 80 to the world (for security reasons), so that no one from outside can connect to our databases.
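The routing described above could look roughly like this. The ports and server name are illustrative assumptions, not part of the original setup:

```nginx
server {
    listen 80;                            # the only port exposed to the world
    server_name www.mydomain.com;

    location / {
        proxy_pass http://127.0.0.1:3000; # static front-end container
    }

    location /api/ {
        proxy_pass http://127.0.0.1:4000; # API container
    }
}
```

The database containers have no `location` block at all, so they remain unreachable from outside.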

Sample flow

  • The user requests www.mydomain.com and this request is routed to the HTML static content.
  • The static content is loaded into the user’s browser. This is mostly an Angular or a ReactJS application. Unlike in the old days, when we used PHP, JSP, JADE, Twig or Blade to make templates, we don’t do that anymore; it kept our applications coupled to the API and request controllers.
  • Once the content loads, all the other work usually gets done by calling the web API. For example, the login request would look like this:
"method": "POST",
"body": {
"username": "anuradha",
"password": "password1234"
},
"headers": {
"content-type": "application-json"
}
  • This request is sent to the URL www.mydomain.com/login as a POST request and is directed to the authentication server by NGINX. Upon success, a token known as a JWT (JSON Web Token) is returned to the web application. It is used to authenticate the user afterwards.
  • All subsequent requests are sent to www.mydomain.com/api/somepath. NGINX routes the /api/ requests to the API container. The JWT must be sent as a header in the form authorization: bearer &lt;token&gt;. Obviously, to prevent someone hijacking your token, an HTTPS connection must be used.
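The flow above can be sketched as two small helpers that assemble the requests. This is a minimal illustration: the endpoint paths and field names mirror the example payload, and the token value is a placeholder, not a real JWT.

```python
import json

def build_login_request(username: str, password: str) -> dict:
    """Assemble the POST request sent to /login (illustrative structure)."""
    return {
        "method": "POST",
        "url": "https://www.mydomain.com/login",
        "headers": {"content-type": "application/json"},
        "body": json.dumps({"username": username, "password": password}),
    }

def build_api_request(path: str, token: str) -> dict:
    """Assemble an authenticated request to /api/<path> carrying the JWT."""
    return {
        "method": "GET",
        "url": f"https://www.mydomain.com/api/{path}",
        # NGINX routes /api/ to the API container; the JWT goes in this header.
        "headers": {"authorization": f"bearer {token}"},
    }

login = build_login_request("anuradha", "password1234")
api = build_api_request("profile", "<token-from-login-response>")
print(login["url"])
print(api["headers"]["authorization"])
```

Any HTTP client (the browser's fetch, axios, requests, …) would then send these over an HTTPS connection.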

Why containers

Containers are lightweight virtualization layers, mostly sharing the host’s Linux kernel. They are used because they can be started and terminated much more quickly than a virtual machine, and they do not consume as many resources as a VM would.

Containers also enable safe deployment of content. For example, a vendor can ship a Docker image for an API, configured for a particular environment, without having to send the codebase.
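As a rough sketch, packaging such an API as an image might look like the Dockerfile below. The base image, file names and port are assumptions for illustration:

```dockerfile
# Illustrative image for a Node.js API; adapt base image and paths as needed.
FROM node:18-alpine
WORKDIR /app

# Install dependencies first so they are cached between builds.
COPY package*.json ./
RUN npm ci --omit=dev

# Copy the application code (the vendor ships only the built image).
COPY . .

# Reachable by other containers; the host exposes only NGINX on port 80.
EXPOSE 4000
CMD ["node", "server.js"]
```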

In some cases NGINX itself runs as a container, to handle a large number of requests and route them to a large number of other containers.

Glimpse at Elastic Beanstalk

Elastic Beanstalk

This is a platform that provides web servers that can scale out as the load increases. It monitors resource utilization, keeps adding resources automatically, and charges accordingly. It has become popular due to its ease of deployment. In their own terms,
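For a sense of that ease of deployment, a typical workflow with the EB command-line interface looks roughly like this (the application and environment names are placeholders):

```shell
# Initialize an Elastic Beanstalk application in the project directory
eb init my-app

# Create an environment; EB provisions servers, load balancer and scaling
eb create my-env

# Ship a new version of the code
eb deploy
```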

There is no additional charge for Elastic Beanstalk — you pay only for the AWS resources needed to store and run your applications.

Advantages

  • Simple
  • Automatic scaling
  • Resource monitoring
  • Provides all the infrastructure components needed
  • Secure
