Overview
A big thanks to my mentor Mr. Samrat Priyadarshi, who helped me build this project. It focuses on building a globally scalable web server using Google Cloud Platform (GCP). The key components include network configuration, multi-region deployment, autoscaling, health checks, load balancing, DNS configuration, monitoring, and load testing. Together, these ensure high availability, reliability, and performance across multiple geographical regions.
Architecture
The architecture consists of:
Cloud DNS: Resolves domain names to IP addresses for users accessing the application.
Cloud Load Balancer: Distributes traffic across multiple regions to ensure high availability and scalability.
Managed Instance Groups ([MIGs](https://cloud.google.com/compute/docs/instance-groups)): Deployed across multiple regions (US, Europe, Asia) for redundancy and low latency.
Autoscaling: Automatically adjusts the number of VM instances based on traffic.
Health Checks: Monitors the availability of instances to ensure they are serving traffic.
Steps to Build the Globally Scalable Web Server
Step 1: Configuring Network and Firewalls
1.1. Network Creation
Create a Virtual Private Cloud (VPC):
Define a network to isolate your resources. I used the default network here.
Subnet Creation:
Use the automated subnet creation mode; subnets are generated automatically for your chosen regions.
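If you prefer a dedicated VPC over the default network, a minimal sketch (webserver-vpc matches the name used in the load balancer firewall rule below):
gcloud compute networks create webserver-vpc \
  --subnet-mode=auto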
1.2. Firewall Rules
Basic Firewall Rules (not meant for a production scenario)
Allow Remote Access:
TCP:22 (SSH)
TCP:3389 (RDP)
Allow Web Traffic:
TCP:80 (HTTP)
TCP:443 (HTTPS)
Ingress/Egress:
Deny all other ingress traffic
Allow all egress traffic
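For reference, the basic rules above could be created with gcloud roughly as follows (a sketch; the rule names are placeholders, and webserver-vpc is the VPC assumed earlier, so substitute default if you kept the default network). Note that VPC networks already deny all ingress and allow all egress by default, so those two defaults need no explicit rules:
gcloud compute firewall-rules create allow-remote-access \
  --network=webserver-vpc \
  --allow=tcp:22,tcp:3389

gcloud compute firewall-rules create allow-web-traffic \
  --network=webserver-vpc \
  --allow=tcp:80,tcp:443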
Specialized Firewall for Load Balancer: Create a firewall rule to allow traffic from Google’s load balancing IP ranges so that health probes can verify VM status:
gcloud compute firewall-rules create allow-load-balancer \
  --network=webserver-vpc \
  --source-ranges=130.211.0.0/22,35.191.0.0/16 \
  --allow=tcp:80
This rule ensures that the load balancer can send probes to instances to check their health and only direct traffic to healthy VMs.
Step 2: Configuring the Multi-Region Application with Autoscaling and Health Checks
2.1. Instance Template
Location: Global
Subnet: Auto-assigned
IP Address: Ephemeral
Machine Type: e2-small
OS Image: Debian 12 (Bookworm)
- Startup Script:
Use a startup script to install and configure the web server (e.g., Apache)
apt update && apt -y install apache2
EQUIVALENT REST:
{
"creationTimestamp": "2025-01-28T06:17:56.310-08:00",
"description": "",
"id": "7787731650216286715",
"kind": "compute#instanceTemplate",
"name": "ha-web-server",
"properties": {
"confidentialInstanceConfig": {
"enableConfidentialCompute": false
},
"description": "",
"scheduling": {
"onHostMaintenance": "MIGRATE",
"provisioningModel": "STANDARD",
"automaticRestart": true,
"preemptible": false
},
"tags": {
"items": [
"http-server",
"https-server",
"lb-health-check"
]
},
"disks": [
{
"type": "PERSISTENT",
"deviceName": "ha-web-server",
"autoDelete": true,
"index": 0,
"boot": true,
"kind": "compute#attachedDisk",
"mode": "READ_WRITE",
"initializeParams": {
"sourceImage": "projects/debian-cloud/global/images/debian-12-bookworm-v20250113",
"diskType": "pd-balanced",
"diskSizeGb": "10"
}
}
],
"networkInterfaces": [
{
"stackType": "IPV4_ONLY",
"name": "nic0",
"network": "projects/gke-migration-446808/global/networks/default",
"accessConfigs": [
{
"name": "External NAT",
"type": "ONE_TO_ONE_NAT",
"kind": "compute#accessConfig",
"networkTier": "PREMIUM"
}
],
"kind": "compute#networkInterface"
}
],
"reservationAffinity": {
"consumeReservationType": "ANY_RESERVATION"
},
"canIpForward": false,
"keyRevocationActionType": "NONE",
"machineType": "e2-small",
"metadata": {
"fingerprint": "Q1UCB9rGe1I=",
"kind": "compute#metadata",
"items": [
{
"value": "apt update && apt -y install apache2",
"key": "startup-script"
}
]
},
"shieldedVmConfig": {
"enableSecureBoot": false,
"enableVtpm": true,
"enableIntegrityMonitoring": true
},
"shieldedInstanceConfig": {
"enableSecureBoot": false,
"enableVtpm": true,
"enableIntegrityMonitoring": true
},
"serviceAccounts": [
{
"email": "679731798265-compute@developer.gserviceaccount.com",
"scopes": [
"https://www.googleapis.com/auth/devstorage.read_only",
"https://www.googleapis.com/auth/logging.write",
"https://www.googleapis.com/auth/monitoring.write",
"https://www.googleapis.com/auth/service.management.readonly",
"https://www.googleapis.com/auth/servicecontrol",
"https://www.googleapis.com/auth/trace.append"
]
}
],
"displayDevice": {
"enableDisplay": false
}
},
"selfLink": "projects/gke-migration-446808/global/instanceTemplates/ha-web-server"
}
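For reference, an equivalent template can be created with gcloud along these lines (a sketch based on the settings above; the project's default network and service account are applied automatically):
gcloud compute instance-templates create ha-web-server \
  --machine-type=e2-small \
  --image-family=debian-12 \
  --image-project=debian-cloud \
  --boot-disk-size=10GB \
  --boot-disk-type=pd-balanced \
  --tags=http-server,https-server,lb-health-check \
  --metadata=startup-script='apt update && apt -y install apache2'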
2.2. Managed Instance Groups (MIG)
- Deployment:
Create instance groups in multiple zones (e.g., US, Europe, Asia) to achieve global distribution.
Now create each instance group using the instance template defined above.
Autoscaling Configuration:
Minimum Instances: 1
Maximum Instances: 3
Autoscaling Signal: Based on HTTP load balancing utilization (e.g., target of 10% utilization)
Initialization & Cool-Down:
- Set an initial delay (e.g., 60 seconds) and a cool-down period for scaling decisions.
Health Checks:
Configure health checks to monitor VM health.
Initial Delay: 300 seconds to allow proper startup.
Auto-Healing: Enable to replace unhealthy VMs automatically.
MIGs for each Zone
Note: Before configuring the Cloud Load Balancer, you might see a warning indicating that the instance groups are not yet associated with any backend pool, which is fine for now.
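A gcloud sketch for one region (the group name and zone are placeholders; repeat per region, and my-health-check is the health check defined in Step 3):
gcloud compute instance-groups managed create web-mig-us \
  --zone=us-central1-a \
  --template=ha-web-server \
  --size=1 \
  --health-check=my-health-check \
  --initial-delay=300

gcloud compute instance-groups managed set-autoscaling web-mig-us \
  --zone=us-central1-a \
  --min-num-replicas=1 \
  --max-num-replicas=3 \
  --scale-based-on-load-balancing \
  --target-load-balancing-utilization=0.1 \
  --cool-down-period=60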
Step 3: Setting Up Load Balancing and SSL Certificates
3.1. Load Balancer Configuration
- Type: Application Load Balancer (public facing)
Frontend Configuration:
Protocol: HTTPS
Port: 443
- Certificate: I used Google-managed SSL certificates to secure communication between users and the web server, which significantly reduced the administrative burden of managing certificates manually. Automated issuance, renewal, and security compliance maintain a robust security posture with minimal effort. Provisioning was also remarkably swift, typically completing within a few minutes, so secure communications could be integrated without delaying the overall project timeline.
EQUIVALENT REST:
{
"certificate": "-----BEGIN CERTIFICATE-----\nYk2AGhxkmcdmSz/yilrbCgj8=******************************************\n-----END CERTIFICATE-----\n",
"creationTimestamp": "2025-01-28T07:02:54.963-08:00",
"description": "",
"expireTime": "2025-04-28T09:19:42.000-07:00",
"id": "12547031058238216**",
"kind": "compute#sslCertificate",
"managed": {
"status": "ACTIVE",
"domains": [
"demo-project.live"
],
"domainStatus": {
"demo-project.live": "ACTIVE"
}
},
"name": "my-certificate",
"selfLink": "projects/gke-migration-446808/global/sslCertificates/my-certificate",
"subjectAlternativeNames": [
"demo-project.live"
],
"type": "MANAGED"
}
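For reference, a Google-managed certificate like the one above can be requested with gcloud (domain as configured in this project):
gcloud compute ssl-certificates create my-certificate \
  --domains=demo-project.live \
  --global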
Backend Configuration:
For the load balancer’s backend configuration, I created three backend services, each mapped to a separate managed instance group deployed in a different region. This setup provides high availability, fault tolerance, and optimized traffic distribution by directing requests to the nearest healthy instance group. Each backend service is configured with a health check on port 80 (a TCP check; see the REST output below) that removes unresponsive instances from the load balancer’s routing. Using three backend services for three instance groups gives regional redundancy, reducing latency for users worldwide while improving scalability and resilience against failures in any single region.
Backend type: Instance Group
Protocol: HTTP
Named Port: http
Timeout: 30 seconds
Backends
Health Check: Creating a new health check is crucial because it continuously monitors the status of your instances, ensuring that only healthy servers handle traffic. The load balancer relies on these checks to route traffic to instances that pass, preventing unresponsive or faulty servers from affecting overall performance and reliability. Additionally, health checks provide essential metrics that trigger autoscaling and auto-healing processes, automatically replacing or adding instances to maintain optimal performance during peak demand or unexpected failures. In essence, a robust health check mechanism is key to maintaining system reliability, performance, and security by proactively identifying and mitigating potential issues.
EQUIVALENT REST:
{
"checkIntervalSec": 10,
"creationTimestamp": "2025-01-28T06:45:56.340-08:00",
"description": "",
"healthyThreshold": 3,
"id": "5716819761501131595",
"kind": "compute#healthCheck",
"logConfig": {
"enable": false
},
"name": "my-health-check",
"selfLink": "projects/gke-migration-446808/global/healthChecks/my-health-check",
"tcpHealthCheck": {
"request": "",
"port": 80,
"response": "",
"proxyHeader": "NONE"
},
"timeoutSec": 10,
"type": "TCP",
"unhealthyThreshold": 3
}
Load Balancing Mode: Rate-based (e.g., a maximum of 100 requests per second per instance)
Locality Policy: Round-robin to evenly distribute traffic.
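For reference, the health check and one backend service could be created with gcloud along these lines (a sketch; web-backend-us and the instance group name/zone are placeholders):
gcloud compute health-checks create tcp my-health-check \
  --global \
  --port=80 \
  --check-interval=10s \
  --timeout=10s \
  --healthy-threshold=3 \
  --unhealthy-threshold=3

gcloud compute backend-services create web-backend-us \
  --global \
  --protocol=HTTP \
  --port-name=http \
  --timeout=30s \
  --health-checks=my-health-check

gcloud compute backend-services add-backend web-backend-us \
  --global \
  --instance-group=web-mig-us \
  --instance-group-zone=us-central1-a \
  --balancing-mode=RATE \
  --max-rate-per-instance=100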
4. Configuring Cloud DNS and A Records
Creating a Cloud DNS Zone is essential for managing domain name resolution within Google Cloud. A Cloud DNS Zone acts as a container for DNS records, allowing you to map domain names to IP addresses efficiently. After setting up the DNS zone, I added an A Record, which links the domain name to the external IP address of the load balancer. This ensures that when users access the domain, traffic is routed to the load balancer, which then distributes requests to the backend instances. This setup improves accessibility, enables global reach, and ensures users always connect to a healthy backend.
After configuring the SSL certificate, we needed to add a third record, an A record, to correctly map the load balancer’s IP to the domain name. While NS (Name Server) and SOA (Start of Authority) records are automatically generated when a Cloud DNS Zone is created, they only define the authoritative DNS settings and do not resolve domain queries to an IP address. The A record explicitly assigns the load balancer’s external IP to the domain, ensuring proper DNS resolution. This step enables secure HTTPS traffic after the certificate configuration, allowing users to access the web application seamlessly through the custom domain.
Create a Cloud DNS Zone
Add an A Record pointing the domain to the load balancer’s IP.
EQUIVALENT REST for Zone Details
GET https://dns.googleapis.com/dns/v1/projects/gke-migration-446808/managedZones/my-dns-zone
{
"cloudLoggingConfig": {
"enableLogging": false
},
"creationTime": "2025-01-28T15:37:09.186Z",
"description": "",
"dnsName": "demo-project.live.",
"fingerprint": "37908d7cdd695ad000000194ad8f2d42",
"id": 4003855636162501328,
"location": "global",
"name": "my-dns-zone",
"nameServers": [
"ns-cloud-d1.googledomains.com.",
"ns-cloud-d2.googledomains.com.",
"ns-cloud-d3.googledomains.com.",
"ns-cloud-d4.googledomains.com."
],
"visibility": "public"
}
Zone details and Record Sets
EQUIVALENT REST for A records
GET https://dns.googleapis.com/dns/v1/projects/gke-migration-446808/managedZones/my-dns-zone/rrsets/demo-project.live./A
{
"name": "demo-project.live.",
"rrdatas": [
"34.54.1.253"
],
"ttl": 18000,
"type": "A"
}
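For reference, the zone and A record above could be created with gcloud (values as in this project):
gcloud dns managed-zones create my-dns-zone \
  --dns-name=demo-project.live. \
  --description="Zone for demo-project.live"

gcloud dns record-sets create demo-project.live. \
  --zone=my-dns-zone \
  --type=A \
  --ttl=18000 \
  --rrdatas=34.54.1.253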
5. Load Testing with Locust
There are several ways to simulate traffic from different regions for a load test, instead of manually provisioning spot VMs in different GCP regions and running the test from each of them.
Here I used Locust, an open-source, Python-based distributed load testing tool that can simulate thousands of requests from different locations.
Install Locust (pip install locust): see the [documentation](https://docs.locust.io/).
Allow Traffic to Port 8089 (Default Locust Web UI Port)
By default, Locust runs on port 8089. To access it, you need to open this port:
Allow Port in Firewall Rules
Run this gcloud command to open port 8089:
gcloud compute firewall-rules create allow-locust \
--allow tcp:8089 \
--source-ranges=0.0.0.0/0 \
--target-tags=locust-server \
--description="Allow Locust web UI access"
Create a Python script (use nano/vim/vi):
nano load_test.py
# Sample script to load test from multiple locations
from locust import HttpUser, task, between

class LoadTestUser(HttpUser):
    abstract = True  # base class only; Locust will not instantiate it directly
    wait_time = between(1, 3)

    @task
    def test_backend_service(self):
        # Test the backend service through the load balancer
        self.client.get("/backend-endpoint")  # adjust to the correct path for your service

# US Region simulation
class USLoadTestUser(LoadTestUser):
    host = "http://your-loadbalancer-us.com"  # load balancer's endpoint

# Africa Region simulation
class AfricaLoadTestUser(LoadTestUser):
    host = "http://your-loadbalancer-africa.com"

# Asia Region simulation
class AsiaLoadTestUser(LoadTestUser):
    host = "http://your-loadbalancer-asia.com"

# Europe Region simulation
class EuropeLoadTestUser(LoadTestUser):
    host = "http://your-loadbalancer-europe.com"
- Run the test:
locust -f load_test.py
- This will start Locust; you can monitor the load test via the Locust web interface (by default at [http://localhost:8089](http://localhost:8089)).
locust -f load_test.py
[2025-02-02 13:50:24,081] cs-558123390422-default/INFO/locust.main: Starting Locust 2.32.8
[2025-02-02 13:50:24,083] cs-558123390422-default/INFO/locust.main: Starting web interface at http://0.0.0.0:8089
Access the Web Interface
Locust listens on http://0.0.0.0:8089; from outside the VM, use the VM’s external IP on port 8089.
- Start the load test
Navigate to Load Balancer Monitoring
- Before the load test
- After the load test:
Backend services autoscaled
Traffic Distribution Across multiple regions
I successfully performed a load test across multiple regions using the GCP load balancing service, simulating traffic from different geographic locations. Traffic was distributed from the simulated regions (US, Europe, Africa) to the backend services, and the load balancer routed requests to the appropriate backend instances, demonstrating high availability and fault tolerance. The test validated that the backend can serve traffic from diverse regions with consistent performance and confirmed that the load balancer distributes traffic effectively, preventing bottlenecks in a scalable architecture.
- Measure response times, throughput, and failure rates.
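Locust can also run headless for unattended tests and export stats to CSV; a sketch (the user count, spawn rate, and duration are arbitrary values):
locust -f load_test.py --headless -u 500 -r 25 --run-time 5m --csv results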
6. Monitoring Scale-Up and Scale-Down Behavior
Scaled down after load test
Use Cloud Monitoring and Logging to track:
CPU utilization
Memory usage
Autoscaling events
Ensure instances scale up and down based on traffic demands.
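To watch scaling from the CLI, you can poll the MIG directly; a sketch (group name and zone are the placeholders used earlier):
# Current target size as set by the autoscaler
gcloud compute instance-groups managed describe web-mig-us \
  --zone=us-central1-a \
  --format="value(targetSize)"

# Instances currently in the group and their status
gcloud compute instance-groups managed list-instances web-mig-us \
  --zone=us-central1-a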
Apache web server
Observations
Scalability: The system scaled seamlessly to handle varying traffic loads.
High Availability: The load balancer ensured uninterrupted service during instance failures.
Performance: Load testing confirmed low latency and high throughput under peak loads.
Cost Efficiency: Autoscaling optimized resource utilization, minimizing costs during low traffic periods.
Tools Used
Locust: Load testing.
Cloud Monitoring: Resource tracking.
Cloud DNS: Domain name resolution.
Cloud Load Balancing: Traffic distribution.
Future Enhancements
Integrate CDN for faster content delivery.
Implement multi-region database replication (Cloud SQL or Firestore).
Automate provisioning with Terraform.
Conclusion
This global web server setup ensures the service is always available, resilient to failures, and performs well across different regions. By using tools like Cloud DNS, Load Balancing, and Autoscaling, the system can automatically adjust to changes in traffic, providing a smooth and fast experience for users. The setup is built to handle increased traffic, efficiently distributing it across regions and making sure resources are used effectively for the best performance.