Global Multi-Tenant SaaS Transformation Using Amazon EKS and AWS Managed Services

2026-01-08
Cloud Modernisation

1. EXECUTIVE SUMMARY

Consint AI is a global B2B SaaS provider that delivers workflow automation, collaboration, and API integration tools to mid-market and enterprise clients across North America, Europe, and APAC. The platform currently supports over 2,500 organizations and roughly 1.2 million active users.

As their user base grew rapidly, Consint AI hit a wall. Their hybrid infrastructure, a mix of on-premises hardware and unmanaged cloud instances, became a major bottleneck. Because they used a single-tenant deployment model, scaling meant duplicating infrastructure. This drove costs up linearly, slowed deployment cycles, and made global expansion a logistical nightmare.

Leveraging the AWS Migration Acceleration Program (MAP), our team led a full-scale architectural transformation. We moved Consint AI out of their legacy environment and rebuilt the platform as a cloud-native, multi-tenant SaaS architecture using managed AWS services, containerized microservices, and aggressive DevOps automation.

Key Outcomes:

·       52% reduction in overall infrastructure costs.

·       99.99% availability achieved.

·       10x faster deployment cycles.

·       Global latency slashed to <120 ms.

·       5x increase in platform scalability without requiring redesigns.

2.1 Company Profile

Attribute | Details
Industry | SaaS (Workflow Automation & Collaboration)
Products | Automation Engine, API Integrations, Collaboration Tools
Customers | 2,500+ organizations
Active Users | ~1.2 million
Deployment Model | Single-tenant (pre-migration)
Regions | US, EU, APAC
Employees | ~300 (120 engineering)
Prior Infrastructure | Hybrid (on-prem + unmanaged cloud)
Migration Completion | 2025

 

2.2 Business Context and Strategic Objectives

Consint AI was at a tipping point. They had outgrown the architecture that got them off the ground, and those technical limitations were starting to directly impact revenue and customer retention. They could not sign enterprise clients without being able to guarantee enterprise-grade SLAs and low latency globally.

Strategic Objectives:

1.    Pivot from a rigid single-tenant setup to a scalable multi-tenant SaaS model.

2.    Deliver a globally optimized, low-latency experience for non-US users.

3.    Hit strict enterprise-grade availability targets (99.99%).

4.    Implement rigorous CI/CD and DevOps automation to free up engineering cycles.

5.    Decouple infrastructure costs from customer growth to improve profit margins.

6.    Build a foundation capable of handling 5x to 10x user growth seamlessly.

 

3. UNDERSTANDING THE PAIN POINTS

3.1 Detailed Pain Points

Pain Point 1: Single-Tenant Architecture Causing Linear Cost Scaling

Because every new customer required a dedicated infrastructure stack, Consint AI was essentially penalized for growing. While this isolated customer data, it created massive operational bloat. Infrastructure utilization hovered below 30%, yet they were managing over 2,500 distinct environments at an average cost of $180/month per tenant. Onboarding took 2-3 days, slowing sales velocity and eroding profit margins as the user base grew.

Pain Point 2: Global Latency and Inconsistent User Experience

The legacy app was anchored in a single region. As they sold into European and APAC markets, performance tanked. US users experienced a reasonable 120-180 ms latency, but EU users faced 350-600 ms, and APAC users suffered 600-900 ms delays, with page load times dragging up to 5 seconds. This drove a 35% spike in complaints from international users, severely hurting product adoption abroad and risking enterprise churn.

Pain Point 3: Scalability Limitations and Peak Failures

The system was statically provisioned. When concurrent users pushed past the 15,000 mark, the platform started buckling, resulting in a 6.2% peak failure rate. Because scaling took hours (sometimes days), transactions and API calls frequently timed out during heavy usage, damaging the brand's reputation for reliability.

Pain Point 4: Deployment Inefficiencies and Lack of Automation

Without a proper CI/CD pipeline, deployments were a biweekly manual chore. Releases took 2-4 hours, and if something broke, rollbacks took just as long. The lack of automated testing meant a higher risk of production incidents, forcing engineers to spend time firefighting instead of building features.

Pain Point 5: Availability Risks and Single-Region Dependency

The legacy setup had no multi-AZ or disaster recovery (DR) mechanisms in place, even for highly critical databases. Uptime sat at 98.2% (roughly 157 hours of downtime a year), with Mean Time To Recovery (MTTR) spanning 2-3 hours. These frequent outages violated SLAs and flooded the support desk with escalations.
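For reference, the downtime figure follows directly from the uptime percentage. The short Python sketch below converts an availability percentage into annual downtime, assuming a 365-day year; the function name is ours and is used purely for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_pct: float) -> float:
    """Return the hours of downtime per year implied by an availability percentage."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (100 - availability_pct) / 100


legacy = annual_downtime_hours(98.2)    # legacy uptime quoted above
target = annual_downtime_hours(99.99)   # availability target from the strategic objectives

print(f"98.2%  -> {legacy:.1f} hours/year")
print(f"99.99% -> {target * 60:.1f} minutes/year")
```

At 98.2% the result is just under 158 hours a year, while 99.99% leaves a budget of under one hour.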

 

4. OUR SOLUTION: WHAT WE OFFERED AND WHY

4.1 Solution Design Philosophy

We built our technical strategy around six non-negotiable principles tailored strictly for a high-growth SaaS provider.

Principle 1: SaaS-Native Multi-Tenant Architecture

Moving away from isolated stacks was priority number one. We engineered a logically isolated, shared infrastructure model. Multiple tenants now share the same compute resources, but data remains strictly segregated via tenant-aware schemas and centralized Amazon Cognito authentication. This cut onboarding times from days to hours and broke the linear relationship between cost and growth.

Principle 2: Global-First Performance and User Experience

To fix the international latency issues, we decentralized content delivery. We implemented a globally distributed architecture using CloudFront edge locations, combined with Route 53 latency-based routing. By moving to stateless application designs, we brought worst-case latency down from as much as 900 ms (APAC) to under 120 ms globally, standardizing the UX worldwide.
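Conceptually, latency-based routing sends each client to whichever regional endpoint answers fastest. A minimal Python sketch of that selection rule follows. The function name and the latency figures are hypothetical, and Route 53 makes this decision from its own latency measurements rather than from application code.

```python
def pick_region(latency_ms_by_region: dict[str, float]) -> str:
    """Return the region with the lowest measured latency for a client."""
    if not latency_ms_by_region:
        raise ValueError("no healthy regions to route to")
    return min(latency_ms_by_region, key=latency_ms_by_region.get)


# Hypothetical measurements for a client located in Europe.
measurements = {"us-east-1": 210.0, "eu-west-1": 35.0}
chosen = pick_region(measurements)
print(chosen)
```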

Principle 3: Automation-Driven Operations and DevOps Enablement

We treated infrastructure as code (IaC) from day one. Using Terraform alongside robust CI/CD pipelines, we automated builds, testing, deployments, and rollbacks. This shortened the release cycle from weeks to same-day releases and eliminated manual configuration errors.

Principle 4: Elastic Scalability and Demand-Based Resource Allocation

We abandoned static provisioning entirely. The new architecture leans heavily on Amazon EKS for auto-scaling containerized workloads, supported by serverless Lambda functions for asynchronous tasks. The platform now scales dynamically with traffic, eliminating capacity planning guesswork and ending peak-hour outages.
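One common way to scale EKS workloads on demand is the Kubernetes Horizontal Pod Autoscaler (HPA), whose documented core rule is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below illustrates that rule in Python. Treating the HPA as the scaling mechanism is our assumption for illustration, and the replica bounds and metric values are hypothetical. The real controller also applies a tolerance band and stabilization windows, which are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float,
                     min_replicas: int = 2, max_replicas: int = 50) -> int:
    """Apply the core HPA scaling rule, clamped to the configured bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical example: 4 pods averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60))
```

When load drops, the same rule scales the workload back in, which is what removes the need for static capacity planning.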

Principle 5: Cost Optimization with Measurable ROI

Every architectural choice had to justify its cost. By moving to a pay-as-you-go model, aggressively rightsizing compute, and leveraging MAP funding, we drove down infrastructure costs by 52%. We traded heavy CapEx for predictable, usage-based OpEx.

Principle 6: Built-In Reliability and High Availability

To hit enterprise SLAs, we baked fault tolerance into the foundation. Everything was deployed across multiple Availability Zones with automated failover mechanisms and self-healing health checks, ultimately pushing availability to 99.99%.

 

4.2 Migration Strategy: The 7Rs Framework Applied

We applied the AWS 7Rs migration strategy to Consint AI’s application portfolio, categorizing workloads based on business criticality, complexity, and modernization potential.

Strategy | No. of Workloads | Applications / Components | Rationale
Rehost | 10 | Internal admin tools, monitoring, reporting. | Low-risk, non-customer-facing. Allowed rapid migration for immediate AWS stability while focusing effort elsewhere.
Replatform | 15 | Relational DBs (to Aurora), basic app services. | Minimal code changes but massive operational upside by shifting to managed AWS services.
Refactor | 12 | Core SaaS platform, workflow engine, APIs. | The heavy lifting. We broke the monolith into microservices to enable multi-tenancy and independent scaling.
Repurchase | 4 | CRM, email, analytics. | Dropped legacy self-hosted tools for native SaaS offerings to eliminate maintenance overhead.
Retire | 3 | Legacy modules, unused reports. | Dead weight. Decommissioning these saved money and simplified the landscape.
Retain | 0 | N/A | Everything moved; nothing was left on the legacy hybrid setup.

 

4.3 Key AWS Services Recommended

Compute Layer:

·       Amazon EKS: Chosen over self-managed EC2 because the SaaS rewrite demanded microservices. EKS gave us managed Kubernetes orchestration, standardizing deployments and allowing the platform to scale horizontally with ease.

Database Layer:

·       Amazon Aurora PostgreSQL (Multi-AZ): Replaced self-managed EC2 databases. It handles multi-tenant workloads well, offering automatic failover and performance gains over the self-managed Postgres it replaced.

·       Amazon DynamoDB: Used for high-throughput, low-latency tenant metadata and session storage where serverless scaling is critical.

Caching & Performance:

·       Amazon ElastiCache (Redis): Absorbed database load for frequently accessed configurations, dropping response times for cached reads to the sub-millisecond range.

API & Integration:

·       Amazon API Gateway: Provided centralized governance, throttling, and secure exposure for the multi-tenant API layer.

·       AWS Lambda: Took over background jobs and event-driven notifications to keep the core compute layer lean.

Delivery & Networking:

·       Amazon CloudFront & Route 53: The backbone of our global latency fix, utilizing edge caching and smart DNS routing.

Security & Storage:

·       Amazon Cognito: Replaced fragmented identity solutions with secure, scalable user authentication.

·       AWS KMS & Secrets Manager: Ensured strict encryption and eliminated hardcoded credentials.

·       Amazon S3: Provided durable, low-cost storage for user uploads, backed by S3 Glacier for compliance archiving.

4.4 Alternatives Rejected

·       Pure Lift-and-Shift: Rejected. Moving a flawed single-tenant monolith to the cloud would just mean hosting a flawed monolith on AWS. It wouldn't fix the core cost or scaling issues.

·       Multi-Cloud: Rejected. Splitting between AWS and Azure would introduce unnecessary operational complexity and IAM overhead for an engineering team that needed to move fast.

·       Serverless-Only: Evaluated and rejected for the core, because stateful workflows required the consistent low latency that EKS provides. We kept serverless for event-driven edges.

 

5. TARGET ARCHITECTURE ON AWS

5.1 Architecture Overview

The final state is a cloud-native, multi-tenant SaaS platform built in us-east-1 (with eu-west-1 ready for DR). We used a defense-in-depth model across logical layers to provide security without compromising speed.

 

Attribute | Implementation Details
Deployment Model | Multi-tenant SaaS with logical isolation
Region Strategy | Primary: us-east-1, Secondary (DR-ready): eu-west-1
Availability | Multi-AZ for all core workloads
Scalability | Horizontal auto-scaling
Security Model | Zero Trust, least privilege
Data Protection | KMS (at rest), TLS 1.2+ (in transit)

 

5.2 Architecture Component Reference

Public / Edge Layer: CloudFront (CDN), AWS WAF (Security), Route 53 (DNS), ALB (Layer 7 Routing).

Application Layer: Amazon EKS (Microservices), API Gateway (Traffic control), AWS Lambda (Event processing), Step Functions (Orchestration).

Data Layer: Aurora PostgreSQL (Transactional), DynamoDB (Metadata), ElastiCache (Sessions), S3/Glacier (Object/Archive).

Security Layer: IAM/Cognito (Identity), KMS/Secrets Manager (Keys), GuardDuty (Threats), Security Hub (Posture).

Observability: CloudWatch (Metrics), X-Ray (Tracing), CloudTrail (Audits), AWS Config (Compliance).

5.3 Network Architecture

We built a single VPC spanning 3 AZs. Public subnets only house the ALBs and NAT Gateways. EKS nodes and Lambdas live strictly in private application subnets with no public IP exposure. Data stores (Aurora, Redis) sit even deeper in private data subnets, utilizing VPC Endpoints to talk to S3 and Secrets Manager without traversing the public internet.
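The three-tier, three-AZ layout described above can be sketched with Python's standard `ipaddress` module. This is an illustration only: the VPC CIDR, the /20 subnet size, the tier labels, and the AZ suffixes are all assumptions, since the source does not state the actual address plan.

```python
import ipaddress
from itertools import islice

# Hypothetical values for illustration; the real address plan is not stated.
VPC_CIDR = "10.0.0.0/16"
TIERS = ("public", "private-app", "private-data")
AZS = ("a", "b", "c")


def plan_subnets(vpc_cidr, tiers, azs, new_prefix=20):
    """Assign one non-overlapping subnet to every (tier, AZ) pair."""
    network = ipaddress.ip_network(vpc_cidr)
    needed = len(tiers) * len(azs)
    blocks = list(islice(network.subnets(new_prefix=new_prefix), needed))
    if len(blocks) < needed:
        raise ValueError("VPC CIDR is too small for the requested layout")
    pairs = [(tier, az) for tier in tiers for az in azs]
    return dict(zip(pairs, blocks))


plan = plan_subnets(VPC_CIDR, TIERS, AZS)
for (tier, az), subnet in plan.items():
    print(f"{tier:<13} AZ-{az}  {subnet}")
```

Keeping each tier in its own set of subnets is what allows route tables and security groups to differ per tier, so only the public tier has a route to an internet gateway.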

5.4 Security Architecture

We enforced a Zero Trust model. The perimeter is guarded by WAF rules against OWASP Top 10 threats. Internal access relies on strictly scoped IAM roles. All sensitive data is locked down with KMS, and we use GuardDuty alongside Security Hub to constantly monitor for anomalous API behavior or compliance drift.

5.5 Data Architecture and Isolation

We shifted to a shared database model using tenant-aware partitioning. Logical isolation is strictly enforced at the application level using tenant IDs. DynamoDB handles lightweight metadata routing, while Aurora manages heavy transactional loads, all backed by point-in-time recovery.
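Application-level isolation by tenant ID usually means that no query can run without a tenant filter. The sketch below shows one way to enforce that, using Python's built-in `sqlite3` as a stand-in for Aurora PostgreSQL so the example runs anywhere. The class, table, and tenant names are hypothetical and do not reflect the actual schema.

```python
import sqlite3


class TenantScopedStore:
    """Data-access wrapper that binds every query to a single tenant ID."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str):
        self._conn = conn
        self._tenant_id = tenant_id

    def add_workflow(self, name: str) -> None:
        self._conn.execute(
            "INSERT INTO workflows (tenant_id, name) VALUES (?, ?)",
            (self._tenant_id, name),
        )

    def list_workflows(self) -> list[str]:
        # The tenant filter is applied here, never left to the caller.
        rows = self._conn.execute(
            "SELECT name FROM workflows WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        )
        return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workflows ("
    "id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, name TEXT NOT NULL)"
)

acme = TenantScopedStore(conn, "tenant-acme")      # hypothetical tenants
globex = TenantScopedStore(conn, "tenant-globex")
acme.add_workflow("invoice-approval")
globex.add_workflow("ticket-triage")

print(acme.list_workflows())
print(globex.list_workflows())
```

Both tenants share one table, yet each store instance can only read and write rows carrying its own tenant ID.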

5.6 High Availability and DR

The entire stack is multi-AZ with automatic failover for the database and load balancers. We instituted a pilot-light DR strategy in a secondary region, achieving an RTO of <1 hour and an RPO of <15 minutes.

 

 

6. MIGRATION EXECUTION: THE MAP JOURNEY

6.1 MAP Phase Overview

We executed this using the standard AWS MAP framework, balancing speed with risk mitigation.

Phase | Duration | Key Activities & Deliverables
Phase 1: Assess | Weeks 1-3 | MRA, dependency mapping, SaaS readiness evaluation, TCO modeling. Delivered the 7Rs classification and migration roadmap.
Phase 2: Mobilize | Weeks 4-10 | Built the AWS Landing Zone, configured VPC/IAM, and established CI/CD and IaC pipelines to prepare for automated deployments.
Phase 3: Migrate (Wave 1) | Weeks 11-16 | Rehosted low-risk internal tools to stabilize the AWS environment and test operational runbooks.
Phase 4: Migrate (Wave 2) | Weeks 17-26 | Replatformed databases to Aurora and began containerizing core backend services.
Phase 5: Migrate (Wave 3) | Weeks 27-34 | The heavy refactoring phase. Transitioned core modules into microservices, cut over to the multi-tenant architecture, and went live.
Phase 6: Optimize | Ongoing | Post-migration rightsizing, implementing Savings Plans, and conducting Well-Architected Reviews.

 

6.2 Cutover Strategy

For a globally active SaaS platform, downtime wasn't an option. We used a blue-green deployment strategy. AWS DMS (with Change Data Capture) kept the legacy database and Aurora synced in near real time. We used Route 53 weighted routing to shift traffic over gradually (10% → 25% → 50% → 100%), monitoring CloudWatch metrics for latency spikes. The cutover window took under 60 minutes with no visible downtime for end users.
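The staged traffic shift can be sketched as a simple loop with a health gate between stages. The weights come from the text above; everything else is an assumption for illustration, including the function names, the health-check callback (a stand-in for reviewing CloudWatch metrics), and the choice to send all traffic back to the legacy stack if a stage fails.

```python
from typing import Callable, Iterable

STAGES = (10, 25, 50, 100)  # percent of traffic sent to the new stack


def run_cutover(stages: Iterable[int], is_healthy: Callable[[int], bool]) -> int:
    """Step through the traffic weights and return the final weight on the new stack.

    If any stage fails its health check, all traffic goes back to the
    legacy stack (weight 0). That rollback rule is an assumption.
    """
    current = 0
    for weight in stages:
        # A real cutover would update the Route 53 weighted records here
        # and wait for the metrics to settle before checking health.
        current = weight
        if not is_healthy(current):
            return 0
    return current


# Hypothetical health checks: one that always passes, one that fails at 50%.
print(run_cutover(STAGES, lambda weight: True))
print(run_cutover(STAGES, lambda weight: weight < 50))
```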

 

7. BENEFITS REALISED: BEFORE AND AFTER COMPARISON

7.1 Quantitative Benefits

Metric | Before (Legacy) | After (AWS Native SaaS) | Impact
Infrastructure Cost | ~$1.0M/year | ~$480K/year | 52% reduction; eliminated idle capacity
System Uptime | 98.2% (~157 hrs downtime) | 99.99% (< 1 hr downtime) | Hit enterprise SLAs; no single point of failure
Deployment Cycle | 1-2 weeks | Same-day CI/CD | 8-10x faster releases
Provisioning Time | 2-3 days per tenant | < 2 hours | Massive boost to sales velocity
Scalability Capacity | ~15,000 concurrent users | >75,000 concurrent users | 5x scale without redesign
Global Latency | 600-900 ms (APAC) | <120 ms globally | 5-8x performance boost
Failure Rate | ~6.2% at peak | <0.5% | Reliable transaction processing
DR RTO | 12-24 hours | < 1 hour | Rapid recovery capabilities
DR RPO | 24 hours | < 15 minutes | Near-zero data loss exposure
Eng. Productivity | 35-40% time on infra | <10% time on infra | Freed ~30% capacity for actual R&D
Support Tickets | High (outage-related) | Reduced by ~60% | Dramatically improved stability
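The headline ratios in the table can be checked from its own before and after figures. The Python sketch below does that arithmetic; the helper names are ours, and because several "after" values are bounds (for example <120 ms), the resulting factors are minimums rather than exact figures.

```python
def pct_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


def factor(larger: float, smaller: float) -> float:
    """How many times larger one figure is than the other."""
    return larger / smaller


cost_cut = pct_reduction(1_000_000, 480_000)          # annual infrastructure cost, USD
scale_gain = factor(75_000, 15_000)                   # concurrent users, after vs. before
latency_gain = (factor(600, 120), factor(900, 120))   # APAC range vs. the 120 ms ceiling

print(f"Cost reduction: {cost_cut:.0f}%")
print(f"Scalability:    {scale_gain:.0f}x")
print(f"Latency:        {latency_gain[0]:.1f}x to {latency_gain[1]:.1f}x")
```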

 

7.2 Qualitative and Strategic Benefits

Faster Time to Revenue: Moving to a true multi-tenant model stripped the friction out of onboarding. Activating new enterprise clients dropped from days to hours, accelerating revenue realization and unblocking the sales pipeline.

Unlocking Global Markets: By caching at the edge with CloudFront, we erased the latency tax that was hurting international expansion. The platform now feels just as fast in Europe and APAC as it does in the US, directly reducing churn in those target markets.

Engineering Velocity: Automating everything via IaC and CI/CD changed the culture of the engineering team. They stopped acting as sysadmins fighting infrastructure fires and shifted back to product development. This agility is allowing Consint AI to outpace their competitors in feature delivery.

Enterprise Readiness: Reaching 99.99% availability wasn't just a technical win; it was a commercial requirement. Consint AI can now confidently sign aggressive SLAs with major enterprise clients without fear of financial penalties or reputational damage.

 

8. CONCLUSION AND RECOMMENDATION

This engagement was significantly more than a standard infrastructure lift-and-shift; it was a fundamental SaaS modernization. By transitioning Consint AI off a restrictive, single-tenant hybrid setup and onto a cloud-native, multi-tenant AWS architecture, we resolved their most critical business bottlenecks: linear cost scaling, geographic latency, and deployment friction.

Financially, the shift to a consumption-based model cut infrastructure costs by roughly half (52%). Technically, the adoption of managed services (EKS, Aurora) and rigorous CI/CD practices brought their deployment cadence and system resilience up to enterprise standards.

The resulting architecture aligns with the AWS Well-Architected Framework, providing a highly automated, secure, and elastic foundation that can support Consint AI's projected user growth without further architectural overhauls.
