Scale a Cloud-Hosted Single-Tenant Deployment

Introduction

This article discusses how multi-tenant designs affect the scalability of software architecture. Security, load handling, flexibility, and control are all influenced by your decision to use a single-tenant or multi-tenant host.

What Is Scaling In Web Applications?

Once you’ve created an application and released it to the public, you want to be sure that it will still perform well as your number of users increases. As traffic grows, the app needs to be able to handle more HTTP and database requests per minute, higher volumes of data being transferred across the internet, and more transactions with complex business logic. Not only do you need to serve more requests, but you need to do so quickly (which is called having low latency), while keeping costs low by using as few resources (CPU, RAM, disk, network) as possible. How well your app handles large numbers of users is known as its ability to scale.

If you’re already using efficient code and technology stack choices to build your app, there are only two ways to scale:

Vertical Scaling (Scaling Up): Add more power (CPU, RAM, storage) to existing servers.
Horizontal Scaling (Scaling Out): Add more servers to distribute the load.

What Is the Worst That Could Happen?

If your user base grows slowly and consistently, you might feel comfortable gradually adding RAM and CPU to the cloud server that hosts your app. But there is danger in this approach, as your servers may become overloaded due to:

Temporary spikes related to specific events: For example, when ticket purchase sites crash as bookings for a concert open or when a chat platform crashes as everyone tries to send a chat message on New Year’s Eve.
Spikes due to growth: For example, when a site changes its terms of service and thousands of customers register on a competitor’s site as an alternative.

In either case, if you aren’t prepared to scale, you could lose both revenue and reputation. You need to be able to scale your resources up and down rapidly to accommodate spikes in usage, and to scale them up permanently to accommodate spikes in growth.

Apps That Fail To Scale

Popular online games are notorious for failing to handle large numbers of users. Users will grudgingly wait to play when they encounter errors in purchased games, such as Diablo III with its Error 37 that prevented players from logging in, and business won’t be greatly affected. However, when it comes to free-to-play or monthly subscription games, login errors or hour-long waits often result in players refusing to give companies or games a second chance. The launch of Pokémon GO in 2016 was a disaster, as millions of players review-bombed the augmented reality mobile game when they couldn’t log in, download the game, or even find game locations that weren’t empty.

Recent examples of apps that struggled to scale include both ChatGPT and its Chinese competitor, DeepSeek. ChatGPT is a chatbot built on a large language model (LLM) whose heavy memory and processing requirements lead to throttled use at peak times. Although DeepSeek was the most downloaded free app on Apple’s App Store in early 2025, users experienced login and chat-response outages soon after its launch, despite it reportedly being far more efficient than ChatGPT.

Scaling concerns affect even programming language choices. PHP’s single-process-per-HTTP-request design is one of the reasons programmers flocked to Node.js when it was released. With Node’s non-blocking I/O, even a junior programmer could write an asynchronous web server that handled thousands of simultaneous requests with little extra work and very little RAM.

What Is Multi-Tenant Architecture?

One important decision you need to make when selecting scalable infrastructure is whether to use a single-tenant or multi-tenant architecture.

The accepted definition of multi-tenancy is the use of a single instance of software, which runs on a single server, to serve multiple tenants. A tenant is a group of users with common access to their own data, isolated from other users and data.

Consider a project management software website like Trello or Redmine. Such a website is described as multi-tenant if it can serve multiple customers (companies running projects) from one server. For example, if a designer used the same project management software to work on two different projects at two different companies, she would be seen as a completely different user under each company (tenant), even if she logged in to both companies with the same email address.
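
One way to picture the designer example is a user store keyed by the pair of tenant and email address rather than by email address alone. The following is a minimal sketch under that assumption. The `UserStore` class, the tenant names, and the email address are hypothetical and are not taken from Trello, Redmine, or any other product.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class UserStore:
    """Toy in-memory user store that keys users by (tenant, email)."""

    _users: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def register(self, tenant: str, email: str) -> int:
        """Create a user inside one tenant and return the new user ID."""
        key = (tenant, email.lower())
        if key in self._users:
            raise ValueError(f"{email} is already registered in tenant {tenant}")
        self._users[key] = next(self._ids)
        return self._users[key]

    def lookup(self, tenant: str, email: str):
        """Return the user ID for this tenant, or None. Never crosses tenants."""
        return self._users.get((tenant, email.lower()))


store = UserStore()
id_a = store.register("company-a", "designer@example.com")
id_b = store.register("company-b", "designer@example.com")
print(id_a, id_b)  # two distinct users despite the shared email address
```

Because the tenant is part of the key, the same email address yields two unrelated accounts, and a lookup in one tenant cannot return a user from another.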

But what exactly is a single instance of software? If an app has one web server process but multiple databases, is it still multi-tenant? Are multiple WordPress sites that all store their data in the same database classified as multi-tenant? Would a load balancer be multi-tenant if it evenly distributed customer requests to multiple identical instances of a web application that all run on the same server and access the same database?

As you can see, the difference between multi-tenant and single-tenant systems is continuous, not discrete. And infrastructure can be single-tenant or multi-tenant too — consider the difference between an app on a dedicated physical server and many Docker containers for different customers that run on a shared server. An instance of software can involve multiple URLs, processes, databases, files, virtual machines, containers, and servers. However, for the purposes of this discussion, let’s at least define the two clear ends of the spectrum.

A single-tenant application:

Serves one customer using a single database or file system.
Needs a new URL, server (or virtual machine or container), web application, or database before it can serve another customer.

A multi-tenant application:

Serves many customers using a single database or shared file system.
Can serve many customers from the same URL, server (or virtual machine or container), web application, or database.

The important differences between these ends of the spectrum are as follows:

A multi-tenant design serves multiple customers.
A single-tenant design isolates data in databases and files, which affects security and privacy.
A single-tenant design isolates servers, which affects the resource load on CPU, RAM, and disk (although the distinction between virtual and physical servers can get blurry with virtual machines and containers).

The Advantages And Risks Of Multi-Tenancy

Multi-tenancy has a lot of advantages. It’s simple to serve multiple customers with one set of infrastructure. Application and database version upgrades need to be done only in one place.

Shared infrastructure is much cheaper than a dedicated physical server. At the time of writing, the cheapest dedicated Hetzner server is ten times the price of the cheapest shared server. Since shared infrastructure is generally hosted inside containers or VMs, you can easily scale your infrastructure up or down by adjusting your share of the physical machine’s resources. You can instantly increase the power of your “server” from a dashboard or by emailing support.

One danger of multi-tenant systems is the risk of exposing customer data. If you use a single database, you must be careful that none of the SQL queries you write expose another customer’s data. An attacker who gains access to your database gains access to all your customers’ data. If you’re using a shared WordPress web host, a single website with a malicious plugin could threaten the data of all other sites on the server.
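
To illustrate the single-database risk, the sketch below stores every tenant’s rows in one table and relies on a `tenant_id` filter in each query to keep customers apart. The schema, table name, and tenant names are hypothetical. The point is only that omitting the `WHERE tenant_id = ?` clause in any one query would return every customer’s rows.

```python
import sqlite3

# One shared in-memory database holding rows for several tenants.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects ("
    "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO projects (tenant_id, name) VALUES (?, ?)",
    [
        ("company-a", "Website redesign"),
        ("company-a", "Mobile app"),
        ("company-b", "Rebrand"),
    ],
)


def projects_for_tenant(connection, tenant_id):
    """Return the project names visible to a single tenant."""
    rows = connection.execute(
        # The tenant filter is bound as a parameter, never concatenated.
        "SELECT name FROM projects WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return [name for (name,) in rows]


print(projects_for_tenant(conn, "company-a"))
print(projects_for_tenant(conn, "company-b"))
```

Routing all data access through a small set of tenant-aware helpers like this one is a common way to reduce the chance of a forgotten filter, although it does nothing to protect against an attacker who reaches the database directly.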

For example, the online authentication gateway, Okta, suffered major attacks in 2022 and 2023. The credentials of ten thousand Okta customers were stolen in 2022, and in 2023, the hackers gained access to a number of Okta customer support accounts. Gaining access to customer support gave the attackers access to all the customers that the compromised accounts could access on the infrastructure. Multi-tenant systems involve more customers per server than single-tenant ones, and thus expose a larger attack surface.

These attacks provide an incentive for considering a self-hosted solution instead of a cloud-hosted one. Cloud solutions are convenient, but they present easier targets because hackers can:

Learn exactly what software is running on popular service providers’ infrastructure.
Gain access to multiple customers’ data if they gain access to the cloud server.
Exploit social engineering and phishing attacks between customers and cloud providers.

If you’re in an industry that has data privacy regulations or a country that disallows the storing of data overseas, single-tenant systems and local dedicated servers might be your only option.

Scaling And Tenancy

Multi-tenant systems are especially problematic when it comes to scaling. While, in general, you can quickly scale these systems up and down because they use a slice of shared hosting, multi-tenancy poses other dangers.

You have to worry not only about load spikes in your application, but about the load spikes of noisy neighbors on your shared server. If another app on the same server experiences a sudden spike in requests or needs to run several CPU-intensive calculations, your app might slow or become unusable. If your app uses the same range of IP addresses for sending emails as another app on the email provider uses for sending spam, your emails might be flagged as spam too. These problems are difficult for hosts to manage — generally, providers warn customers to stop harmful behaviors in the short term and require them to upgrade their resource tier, or leave, in the long term.

While a shared server might be vastly cheaper than a dedicated physical one, it is actually proportionally more expensive. For example, while the aforementioned Hetzner physical server may cost ten times the price of joining a shared server, it offers 25 times more disk space, 16 times more RAM, and six exclusive-use CPU cores instead of the intermittent use of two threads. If you can fully use the dedicated resources, they can be less expensive over time.
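
The proportional cost can be worked out directly from the ratios quoted above. The short sketch below divides the price ratio by each resource ratio to get the dedicated server’s cost per unit relative to the shared server. It uses only the 10x, 25x, and 16x figures from this article, and the function name is illustrative. CPU is left out because exclusive cores and intermittently shared threads are not directly comparable.

```python
PRICE_RATIO = 10  # dedicated price divided by shared price
RESOURCE_RATIOS = {"disk": 25, "ram": 16}  # dedicated amount divided by shared amount


def relative_unit_cost(price_ratio, resource_ratio):
    """Cost per unit of a resource on the dedicated server, relative to shared."""
    return price_ratio / resource_ratio


unit_costs = {
    name: relative_unit_cost(PRICE_RATIO, ratio)
    for name, ratio in RESOURCE_RATIOS.items()
}
for name, cost in unit_costs.items():
    print(f"{name}: dedicated costs {cost:.3f}x the shared price per unit")
```

On these figures, each unit of disk on the dedicated server costs 0.4 times its shared-server price, and each unit of RAM costs 0.625 times, which is why the dedicated server only pays off if you can actually use the extra capacity.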

The same expense effect is true for cloud services like Amazon Web Services (AWS). Hosting your app with multiple Amazon databases, virtual instances, serverless functions, storage buckets, and other trendy techniques is usually cheap for tiny businesses, but for large applications, using these multiple services can result in far more expensive fees than managing a private server.

Single-tenant systems and hosts allow you absolute control over your infrastructure. You can choose the exact server specifications, operating system, and software tools that you want — while excluding any unnecessary software that might otherwise slow your system.

The disadvantage of single-tenant systems and dedicated-server hosting is wasted resources. You need to buy or rent a server that can handle your peak loads, which, by definition, means that the server will sit mostly idle on average. Scaling these servers up is also trickier because you need to buy a new server, install and configure an operating system, and install and configure your application and dependencies. You also need to maintain the servers over time. While these efforts can be simplified using scripts and DevOps tools, they still involve more work than requesting more resources on a shared host.

Scaling And Authentication

Authentication needs special attention when designing your infrastructure because hashing passwords is CPU-intensive. Hashing is an intentional bottleneck in any authentication system, whether for registration or login: it is deliberately slow so that attacks on users’ passwords take longer. As a result, authentication is particularly susceptible to load spikes that degrade or interrupt service.
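
The effect of deliberately slow hashing on throughput can be measured with a few lines of code. The sketch below times PBKDF2-HMAC-SHA256 from the Python standard library at three iteration counts. PBKDF2 is used here only as a representative slow hash and is an assumption, since this article does not state which algorithm or iteration count was used in the tests that follow. The password and iteration counts are arbitrary.

```python
import hashlib
import os
import time


def hash_password(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def time_hash(iterations: int, repeats: int = 3) -> float:
    """Return the average seconds taken per hash at this iteration count."""
    salt = os.urandom(16)
    start = time.perf_counter()
    for _ in range(repeats):
        hash_password("correct horse battery staple", salt, iterations)
    return (time.perf_counter() - start) / repeats


timings = {n: time_hash(n) for n in (1_000, 10_000, 100_000)}
for n, seconds in timings.items():
    print(
        f"{n:>7} iterations: {seconds * 1000:.2f} ms per hash, "
        f"about {1 / seconds:.0f} hashes per second on one core"
    )
```

The absolute numbers depend on the machine, but the time per hash grows roughly in proportion to the iteration count, so every login or registration consumes a predictable slice of CPU that cannot be optimized away without weakening the hash.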

Let’s examine a real example by load testing an authentication gateway: FusionAuth. FusionAuth is an authentication gateway that can be cloud-hosted or self-hosted (using a free tier which offers most features websites commonly need). Although FusionAuth uses a single-tenant design, it allows you to partition users and customers into distinct logical tenants within the same database. Using multiple tenants is not as secure as using a single tenant, but can prove useful if you want to host DEV, QA, and PROD environments on the same server or if you need to white-label and theme an application for a new client that wants to use a customized and isolated version of your service. Learn more about the use cases for multi-tenancy in single-tenant architectures.

Load Testing Authentication With Different Numbers Of Tenants

Since FusionAuth is stateless, you can scale it horizontally by adding more instances that all access the same PostgreSQL database instance. Even in 2019, FusionAuth could support 100 million users on one database.

The question is: How does using multiple tenants on the same instance impact load?

Load tests were run on local FusionAuth instances that had databases with one, two, and three tenants. The default FusionAuth configuration for Docker was used. During each test, 2,000 users were registered, logged in, and then deleted. The users were split equally between tenants in the multi-tenant tests, so the total workload remained the same. The tests were run from a multi-threaded Go script against one PostgreSQL instance and one FusionAuth instance, each using two cores of a twelve-core 3.4 GHz CPU.

The results of the load tests, which were each run five times, are presented in the tables below. The columns show the duration of the registration, login, and deletion steps in milliseconds, and the overall number of users processed per second for each step (the load capacity of the system) is provided in the last line of each table.

One Tenant:

	Register	Log In	Delete
	21994	14811	3213
	38241	14818	3402
	17640	14791	3301
	17075	14861	3528
	17379	15002	3304
Median	17640	14818	3304
Users Per Second	113	135	605

Two Tenants:

	Register	Log In	Delete
	21556	17075	5310
	26933	15863	4413
	18928	15490	3916
	18297	15199	3401
	17895	15208	331
Median	18928	15490	3916
Users Per Second	106	129	511

Three Tenants:

	Register	Log In	Delete
	17280	14881	3204
	17078	14801	3132
	17054	14697	3313
	17277	14600	3497
	17687	14798	3488
Median	17277	14798	3313
Users Per Second	116	135	604

You can see that the number of users processed per second did not change significantly across tests, meaning that the total throughput of the server is independent of the number of tenants. Because that total is fixed and shared, it follows that if a server supports more than one tenant, each tenant gets a proportionally smaller share of the processing capacity. So when resources are fixed (such as on a shared server) and your needs could bump up against the processing limits, you should aim for your application to be the only one on the server.
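
The last two rows of each table can be reproduced from the five run durations. The sketch below does this for the one-tenant table, assuming that the throughput figure is the 2,000 users per test divided by the median duration in seconds. That assumption matches the published numbers, but the calculation is a reconstruction rather than the original analysis script.

```python
from statistics import median

USERS_PER_TEST = 2_000

# Durations in milliseconds for the five one-tenant runs, copied from the table.
one_tenant_ms = {
    "register": [21994, 38241, 17640, 17075, 17379],
    "login": [14811, 14818, 14791, 14861, 15002],
    "delete": [3213, 3402, 3301, 3528, 3304],
}


def users_per_second(durations_ms, users=USERS_PER_TEST):
    """Throughput based on the median run duration."""
    return users / (median(durations_ms) / 1000)


throughput = {
    step: round(users_per_second(runs)) for step, runs in one_tenant_ms.items()
}
print(throughput)  # {'register': 113, 'login': 135, 'delete': 605}
```

Using the median rather than the mean keeps a single slow run, such as the 38,241 ms registration run, from distorting the result.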

Another interesting observation from the tests concerns how the number of simultaneous requests affected performance. All the tests ran without error when using 200 simultaneous requests (goroutines in Go). However, when this number increased to 400 simultaneous requests, FusionAuth encountered a few timeout errors. When the number of simultaneous requests surpassed 500, the excess requests began to fail rapidly. This demonstrates why load spikes are dangerous and why you should always provision enough infrastructure to handle your peak usage, not your average usage.
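
A load generator usually caps the number of requests in flight so that the level of concurrency is a controlled variable. The sketch below shows one way to do that with a semaphore. It is written in Python for illustration and is not the Go script used in the tests. The `fake_request` coroutine is a hypothetical stand-in that sleeps instead of calling a real server.

```python
import asyncio


async def run_with_limit(total_requests: int, max_in_flight: int):
    """Run simulated requests, never allowing more than max_in_flight at once."""
    semaphore = asyncio.Semaphore(max_in_flight)
    in_flight = 0
    peak = 0
    completed = 0

    async def fake_request():
        nonlocal in_flight, peak, completed
        async with semaphore:  # waits here once the limit is reached
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.001)  # stand-in for a real HTTP call
            in_flight -= 1
            completed += 1

    await asyncio.gather(*(fake_request() for _ in range(total_requests)))
    return completed, peak


completed, peak = asyncio.run(run_with_limit(total_requests=2_000, max_in_flight=200))
print(f"completed={completed}, peak requests in flight={peak}")
```

Raising `max_in_flight` in a harness like this is how you would probe for the point at which a server begins to time out, as happened between 200 and 400 simultaneous requests in the tests above.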

On average, during the tests, the database used 200 MB of RAM and FusionAuth used 700 MB. As expected, FusionAuth made full use of its two CPU cores and the database used only 36% of one. When the FusionAuth instance was configured to use all twelve cores, it could log in 550 users per second.

Conclusions And Recommendations

Consider the following points when designing your system to scale:

If your business is growing, you should increase your infrastructure capacity to ensure that your user experience does not degrade due to increased latency and dropped requests.
Do you have enough CPU, RAM, disk space, TCP connections, and internet bandwidth to handle usage spikes? Estimate the worst-case scenario and prepare in advance.
Conversely, if your usage is much lower than expected, you should be able to swap to a less powerful server to save money.
If you’re a small business, you can use shared hosts and tools, but you should have an exit plan ready to swap to single-tenant solutions and dedicated servers as you grow.
If you’re using multi-tenant infrastructure, monitor your application to check how performance varies over time. If you have noisy neighbors, you need to know if your application will continue to perform adequately when other apps on the server experience load spikes.
Consider the worst-case scenario if your infrastructure host were to be attacked, causing denial of service or exposing customer data. If your service going down or the loss of private data would cause financial or personal harm, switch to a more secure infrastructure.
You can trade security for performance by using a faster hashing algorithm or decreased iterations in your authentication system; this will save CPU time and increase throughput.
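
As a rough aid for the worst-case estimate suggested above, the sketch below converts an expected peak login rate into a number of instances. The 135 logins per second per instance comes from the two-core test results in this article. The peak rate of 400 logins per second and the 30% headroom are hypothetical inputs that you would replace with your own measurements, and the formula assumes that capacity scales linearly as instances are added.

```python
import math


def instances_needed(peak_logins_per_second, per_instance_capacity, headroom=0.3):
    """Estimate how many instances are needed to absorb a peak with spare headroom."""
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    required = peak_logins_per_second * (1 + headroom)
    return math.ceil(required / per_instance_capacity)


# Hypothetical peak of 400 logins per second against the measured 135 per instance.
print(instances_needed(400, 135))  # 4
```

Linear scaling is an optimistic assumption: in the tests above, moving from two cores to twelve raised login throughput from about 135 to 550 users per second, which is well short of six times, so treat the result as a lower bound and confirm it with your own load test.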

To explore the benefits of single-tenant solutions yourself, use the FusionAuth pricing calculator to compare the fees for different loads and hosting types, and read about how one client saved thousands of dollars by self-hosting their own authentication gateway.

Further Reading

More on tenants

How To Use FusionAuth's Multi-Tenant Feature To Create A Private Label Offering
