Supabase Part 3: Multi-Tenancy

This is the third article in a series on Supabase, focusing on solutions to common problems, tips, and recommendations.

In this post, I will share tips and tricks about multi-tenancy and multi-tenant user sessions.

Database Model

There are two common approaches to implementing multi-tenancy in a SaaS database. One method is to give each customer a separate database or schema. The second method is to store all data in shared tables, isolating customers by a single tenant_id column. The decision is not purely technical, as compliance and regulatory requirements can play a role. For example, a schema-per-tenant model may be mandatory to provide more data isolation. While most of the concepts in this article apply to both approaches, I will focus on the schema-per-tenant model.

Assigning each customer a separate schema can be more challenging to maintain, but when designed properly, it works very well. This approach adds an extra layer of safety by ensuring data isolation and allows for easier migration of specific tenants to different nodes later on. However, there are a few serious considerations to keep in mind.

Performance

First, let’s consider the number of tables. A large number of tables in the same database instance can become a performance bottleneck at scale. While the impact may not be immediate, it’s wiser to design efficiently from the start to keep the number of tables manageable. Although normalization is generally recommended in an RDBMS, it can sometimes be more efficient to denormalize similar data to improve storage efficiency and query performance, and to reduce maintenance overhead.

For example, if you have 10 tables holding configuration options for different parts of the application, they can often be merged into a single table without any issue: a config jsonb column stores the values, and a config_type enum column distinguishes between the different configurations. At first glance, this may seem unnecessary, but it becomes significant with a schema-per-tenant approach. Adding 50 extra tables to a single instance is manageable, but repeating them in each schema means 5,000 more tables for 100 tenants. And if 100 is a very small number for your business, you can do the math; the overhead can be massive. At some point, you may even hit operating system limits on a single server instance.
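As a rough sketch of this consolidation (the table, enum, and column names here are illustrative, not from any real schema), many small config tables could collapse into one:

```sql
-- Hypothetical merged configuration table replacing many small ones
CREATE TYPE config_type AS ENUM ('billing', 'notifications', 'ui');

CREATE TABLE configs (
  id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  config_type config_type NOT NULL UNIQUE,  -- which part of the app this row configures
  config      jsonb NOT NULL DEFAULT '{}'   -- the actual configuration values
);
```

The jsonb column trades strict column-level typing for a drastically smaller table count, which is exactly the trade-off that matters in a schema-per-tenant model.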

An efficient solution to table bloat is to keep these tables in a shared schema, avoiding duplication across tenant schemas. The data can be separated by a tenant_id column and RLS policies. This approach significantly reduces the number of tables and the maintenance overhead. While it may seem like a hybrid of the two multi-tenancy models, in my experience it’s the most effective strategy.
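As a minimal sketch of this hybrid (the table name is hypothetical, and the policy assumes a tenant_id claim in app_metadata as discussed later in this article), a shared table could look like:

```sql
-- Hypothetical shared table in "public", isolated per tenant via RLS
CREATE TABLE public.tenant_settings (
  tenant_id   uuid  NOT NULL,
  config_type text  NOT NULL,
  config      jsonb NOT NULL DEFAULT '{}',
  PRIMARY KEY (tenant_id, config_type)
);

ALTER TABLE public.tenant_settings ENABLE ROW LEVEL SECURITY;

-- authenticated users may only read rows belonging to their own tenant
CREATE POLICY tenant_isolation ON public.tenant_settings
  FOR SELECT TO authenticated
  USING (tenant_id = ((SELECT auth.jwt())->'app_metadata'->>'tenant_id')::uuid);
```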

💡
Linking tables in different schemas through cross-schema foreign key constraints should be avoided. As the number of tenant schemas grows, foreign keys will bloat just as tables do, which hurts performance and increases maintenance complexity.

Security

The shared schema can be the default “public” schema or a custom “private” schema. In Supabase, the “public” schema is exposed to PostgREST by default, allowing clients to query tables within it. Therefore, RLS policies must be carefully configured to ensure data security. I recommend moving all tables that aren’t needed or are infrequently accessed on the frontend to the “private” schema, so only backend systems can access them. This is especially important for tables containing highly sensitive data. I recommend the following architecture:

  • public

    • contains app-specific data, shared data, and tenant-based normalized metadata

    • must not contain sensitive data

    • all tables must use RLS policies; access should only be allowed by “authenticated” role

    • policies in tenant tables must also check tenant_id in policies

    • updates and deletes should be completely restricted for critical paths (only backend should manipulate data)

    • must not contain any destructive stored procedures such as maintenance functions etc.

    • procedures must have SECURITY INVOKER instead of SECURITY DEFINER to prevent bypassing RLS

  • private

    • same as public, but not exposed to clients with anonymous key

    • only secure clients (with service_role key) and direct database/pooler connections can access

    • should contain sensitive procedures, functions and tables (can’t be called from supabase clients)

    • administration jobs and scripts that span multiple schemas should use SECURITY DEFINER to bypass RLS

  • t_XXXXXXXXXX (tenant schemas)
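To illustrate the SECURITY INVOKER rule above, a function exposed in “public” might be declared like this (the profiles table and function name are hypothetical):

```sql
-- Runs with the caller's privileges, so RLS on public.profiles still applies
CREATE OR REPLACE FUNCTION public.get_my_profile()
RETURNS SETOF public.profiles
LANGUAGE sql STABLE
SECURITY INVOKER
SET search_path = ''
AS $$
  SELECT * FROM public.profiles WHERE user_id = (SELECT auth.uid());
$$;
```

Had this been SECURITY DEFINER, it would run with the owner’s privileges and silently bypass the RLS policies on the table.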

By default, only the “public” schema is exposed to Supabase clients (PostgREST). In most cases, this should suffice, and it’s safer to keep other schemas unexposed. While you can configure this in the Supabase dashboard, there’s no built-in way to change it dynamically. If you want to access custom tenant schemas with the Supabase client, you need a method to update this setting each time a new tenant signs up. The following function can override the settings to accomplish this, as documented in PostgREST.

CREATE OR REPLACE FUNCTION "private"."expose_schemas"() RETURNS "pg_catalog"."text" AS $BODY$
DECLARE
  schemas text;
BEGIN
  -- get all tenant schemas in addition to "public" and "storage"
  -- ("\_" matches a literal underscore; a bare "_" is a single-character wildcard)
  SELECT 'public,storage' || coalesce(',' || string_agg(nspname, ','), '') INTO schemas
  FROM pg_namespace
  WHERE nspname LIKE 't\_%';
  -- expose the schemas to PostgREST
  EXECUTE 'ALTER ROLE authenticator SET pgrst.db_schemas = ' || quote_literal(schemas);
  -- notify PostgREST to reload its config and schema cache
  NOTIFY pgrst, 'reload config';
  NOTIFY pgrst, 'reload schema';

  RETURN schemas;
END
$BODY$
LANGUAGE plpgsql VOLATILE SECURITY DEFINER;
💡
Exposing everything is not recommended, as it reveals internal information to all clients. It’s better to limit exposure as much as possible and use views instead of tables for common queries. In a schema-per-tenant model, this becomes even more critical, as exposing each tenant schema can create unnecessary overhead in PostgREST and increase the risk of security leaks.

The recommended approach for writing queries in stored procedures and functions is to use fully qualified names for database objects, such as table names and data types, while setting search_path to an empty string. A misconfigured search path or reliance on non-fully-qualified names can lead to data leaks in multi-tenant environments. This best practice is also documented in the Supabase advisor, where you can identify risky functions in the database.
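For instance, the search path can be pinned retroactively on an existing function (the function name here is hypothetical):

```sql
-- Pin an empty search_path so that only fully qualified names
-- (schema.table, schema.type) resolve inside the function
ALTER FUNCTION private.do_maintenance() SET search_path = '';
```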

Access Control

Access to the database can be controlled using app_metadata parameters in the Supabase API token. A common practice is to set app_metadata.tenant_id for each user and utilize it in RLS policies for authorization. Examples are provided in the previous article.

Multi-Tenant Login

So far, I’ve successfully built a multi-tenant architecture, but one potentially important feature is missing. In Supabase, users can log in concurrently from multiple devices or browsers, but they can’t log in to multiple tenants simultaneously. This limitation exists due to Supabase’s authentication architecture, which handles users and authentication primarily with three tables.

  1. auth.users: Stores users uniquely by their email address and holds authentication metadata like raw_app_meta_data (which corresponds to app_metadata in the token payload). Since each email must be unique, it’s not possible to have multiple distinct app_metadata entries for a single user. This limitation prevents multi-tenant logins for users with the same email.

  2. auth.sessions: Stores each active session for a logged-in user. Does not impact multi-tenancy.

  3. auth.identities: Stores linked identities per user from different providers. Does not impact multi-tenancy.

If the authentication system were designed slightly differently, with raw_app_meta_data stored in the auth.sessions table, handling multi-tenant logins would be easier. However, given the current architecture, a workaround is needed.

Fortunately, the Supabase JS client allows you to configure its fetch function, giving you control over what gets sent to PostgREST. By overriding this function, you can add custom headers dynamically. Although there is a global headers parameter for custom headers, it’s set during client initialization and cannot be modified afterward, making it insufficient for more complex use cases. Below is an example of an overridden fetch method:

const customFetch = async function(input: RequestInfo | URL, init: RequestInit = {}) {
  // get the active tenant id from somewhere (e.g. sessionStorage)
  const currentTenantId = sessionStorage.getItem("currentTenantId");
  if (currentTenantId) {
    // normalize init.headers (plain object, array, or Headers)
    // so the tenant header can be set uniformly
    const headers = new Headers(init.headers);
    headers.set("X-Tenant-Id", currentTenantId);
    init.headers = headers;
  }
  return await fetch(input, init);
};

const client = supabase.createClient(
  "SUPABASE_URL",
  "SUPABASE_ANON_KEY",
  {
    global: {
      fetch: async (input: RequestInfo | URL, init: RequestInit = {}) => {
        return await customFetch(input, init);
      }
    }
  }
);
💡
The createClient function also has an auth.storage parameter, allowing you to customize how the access token is stored and cached. By default, it’s set to localStorage, making the token persistent across browser tabs. Setting it to sessionStorage, for example, would limit each login session to a single browser tab.
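Any object with getItem/setItem/removeItem works as a storage adapter. As a sketch (assuming the supabase-js v2 option shape), here is an in-memory adapter; a Map stands in for sessionStorage so it can run outside the browser:

```typescript
// Minimal storage adapter compatible with supabase-js `auth.storage`.
// In a real app you would simply pass window.sessionStorage instead.
const memoryStore = new Map<string, string>();

const memoryStorage = {
  getItem: (key: string): string | null => memoryStore.get(key) ?? null,
  setItem: (key: string, value: string): void => { memoryStore.set(key, value); },
  removeItem: (key: string): void => { memoryStore.delete(key); },
};

// usage (hypothetical): createClient(url, anonKey, { auth: { storage: memoryStorage } })
```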

The code example demonstrates how each Supabase API query includes the X-Tenant-Id header, which is sent to PostgREST. This allows you to retrieve the actual tenant ID in RLS policies, instead of relying on app_metadata. Here’s the crucial part: as discussed earlier, app_metadata remains the same for a user across all active sessions, regardless of how many tenants the user is logged into. In the case of multi-tenant sessions, app_metadata.tenant_id could differ from the tenant ID of the active session. This is why the tenant header must override app_metadata, as shown below:

CREATE OR REPLACE FUNCTION "auth"."tenant_id"()
RETURNS "pg_catalog"."uuid" AS $BODY$
DECLARE
  "tenantId" uuid;
  "headerTenantId" uuid;
BEGIN
  -- default: tenant id from the token's app_metadata
  SELECT (nullif(current_setting('request.jwt.claims', true), '')::jsonb
          ->'app_metadata'->>'tenant_id')::uuid
  INTO "tenantId";
  -- tenant id sent by the client in the X-Tenant-Id header
  SELECT (nullif(current_setting('request.headers', true), '')::jsonb
          ->>'x-tenant-id')::uuid
  INTO "headerTenantId";

  -- IMPORTANT: ensure that the user is a member of this tenant: "headerTenantId"
  -- otherwise ignore the header and use app_metadata.tenant_id
  IF "headerTenantId" = ANY(auth.allowed_tenants()) THEN
    "tenantId" := "headerTenantId";
  END IF;

  RETURN "tenantId";
END;
$BODY$
LANGUAGE plpgsql STABLE
COST 100;

Using auth.tenant_id() in RLS policies can now support multiple tenants per user login. The key aspect of this function is validating the header value (IF "headerTenantId" = ANY(auth.allowed_tenants())). Without this check, there would be a critical security vulnerability allowing malicious users to access other tenants’ data; remember that anyone can manipulate HTTP request headers. This validation ensures that the tenant ID sent in the header belongs to a tenant the user is a member of. The function is of course open to improvement, for example by raising an error when there is a mismatch.

A boilerplate for this function is shown below. It fetches the active user’s tenant IDs from a custom my_custom_users_table table and returns them as an array of UUIDs. Another possibility is to save the allowed tenant IDs into app_metadata.allowed_tenants and validate if allowed_tenants includes the tenant ID from the header, in the auth.tenant_id function. Both methods are valid, with the latter being more verbose but faster because it eliminates the need for an extra query. As long as you implement a method that ensures the user can only access their own tenants, any method is acceptable.

CREATE OR REPLACE FUNCTION "auth"."allowed_tenants"()
RETURNS "pg_catalog"."_uuid" AS $BODY$
  -- return the allowed tenant ids for the current user
  -- must have INDEX(user_id, tenant_id) so it can be served fully from RAM
  SELECT ARRAY_AGG(tenant_id) FROM private.my_custom_users_table
  WHERE user_id = (SELECT auth.uid());
$BODY$
LANGUAGE sql STABLE
COST 100;

It is assumed that the private.my_custom_users_table contains a list of users similar to auth.users, but it enforces uniqueness through the (user_id, tenant_id) tuple rather than the email address. Data should not be duplicated in this table; instead, it should store additional user information that is specific to the application.
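As a sketch of the app_metadata-based alternative mentioned above (the allowed_tenants claim name and its JSON shape are assumptions, not part of the original design), the lookup could be rewritten without any table access:

```sql
-- Hypothetical variant of auth.allowed_tenants() that reads the allowed
-- tenant ids from app_metadata.allowed_tenants in the JWT claims
-- instead of querying a table
CREATE OR REPLACE FUNCTION "auth"."allowed_tenants"()
RETURNS "pg_catalog"."_uuid" AS $BODY$
  SELECT coalesce(
    (SELECT array_agg(value::uuid)
     FROM jsonb_array_elements_text(
       nullif(current_setting('request.jwt.claims', true), '')::jsonb
         -> 'app_metadata' -> 'allowed_tenants')),
    '{}'::uuid[]);
$BODY$
LANGUAGE sql STABLE;
```

The trade-off: no extra query per policy check, but the claim must be kept in sync whenever a user’s tenant memberships change.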

Keep in mind that this header modification must be implemented across all clients connecting to your application. If you are using edge functions or a custom backend service, you will need to initialize the Supabase client with the correct X-Tenant-Id header in those as well. Otherwise, your frontend client and backend service could query different schemas using the same access token, potentially leading to data leaks.
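For example, a backend service might pin the tenant header when constructing its own client. This helper only builds the options object (the helper name is illustrative; it assumes the supabase-js v2 option shape), to be passed as the third argument to createClient along with the service_role key:

```typescript
// Build supabase-js client options with the tenant header pinned, so every
// PostgREST request from this client carries the same X-Tenant-Id value
// that the frontend sends.
function tenantClientOptions(tenantId: string) {
  return {
    global: {
      headers: { "X-Tenant-Id": tenantId },
    },
  };
}

// usage (hypothetical): createClient(SUPABASE_URL, SERVICE_ROLE_KEY, tenantClientOptions(tid))
```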

Conclusion

Building a multi-tenant application requires a different mindset, especially when each customer has their own schema. The performance and security trade-offs of each database model should be taken into account while developing the software.

In the next article, I’ll cover the interoperability between edge functions and a custom backend service.
