AWS Lambda for Beginners: What Is Serverless and How Do You Get Started?

Opening story: why does “serverless” exist?

Imagine you are opening a coffee shop. There are two ways to run it:

Option 1 - Rent a fixed space (like EC2):

You pay rent every month whether or not customers show up
You manage electricity, water, and maintenance yourself
When it gets busy, you have to rent more space (scale)
When it is quiet, you still pay the rent

Option 2 - Rent by the hour (like Lambda):

You pay only when customers arrive
No customers = $0
Capacity expands automatically when it gets busy
No maintenance to worry about

Lambda works the second way. You write the code, and AWS takes care of the servers underneath it.

“Serverless” does not mean there are no servers

This is the most common misconception.

Serverless means you do not manage the servers. They still exist, but AWS operates them entirely: updating the OS, applying security patches, and scaling when needed.

You only need to:

Write the code (a function)
Upload it to Lambda
Define when the function runs (the trigger)

AWS takes care of:

Starting an execution environment when a request arrives
Running your code
Shutting the environment down once it is idle (you are not billed while nothing runs)
Scaling out automatically when many requests arrive at once

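When should you use Lambda, and when EC2?

Newcomers ask this often. The comparison table below summarizes the differences:

Criterion	Lambda	EC2
Run time	Max 15 minutes per invocation	Unlimited
Billing	Per request and per millisecond of execution	Per second or per hour while the instance runs, depending on instance and OS
Cold start	Yes (roughly 100 ms - 3 s)	No (once the instance is running)
Management	Fully managed by AWS	You manage it
Scaling	Automatic, up to the account concurrency quota	Requires configuring an Auto Scaling group (ASG)
State	Stateless	Can be stateful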

Use Lambda for:

✅ API backends with uneven traffic (peaks and quiet periods)
✅ Processing files as they are uploaded to S3
✅ Scheduled tasks (cron jobs)
✅ Real-time data processing
✅ Chatbots, webhooks
✅ Workloads where paying nothing while idle matters most

Use EC2 for:

✅ Applications that run continuously, 24/7
✅ Tasks that run longer than 15 minutes
✅ Workloads that need GPUs or special hardware
✅ Stateful applications (data stored on local disk)
✅ Legacy applications that cannot be containerized

Understanding the Lambda runtime

When you create a Lambda function, you choose a runtime: the environment that executes your code.

Common runtimes (the versions listed are examples; AWS adds and retires versions regularly, so check the current list of supported runtimes):

Runtime	Version	Use case
Node.js	18.x, 20.x	APIs, web backends
Python	3.9, 3.10, 3.11, 3.12	Data processing, ML
Java	11, 17, 21	Enterprise applications
Go	OS-only runtime (provided.al2023); the go1.x runtime is deprecated	High-performance
.NET	6, 8	Microsoft ecosystem
Ruby	3.2, 3.3	Scripting, prototypes

💡 Tip for beginners: start with Python or Node.js. Both have simple syntax and the most documentation and examples.

Anatomy of a Lambda function

Every Lambda function has this basic structure:

Python

def lambda_handler(event, context):
    """
    event: Input data (from the trigger)
    context: Information about the execution environment
    """
    
    # 1. Read data from the event
    name = event.get('name', 'World')
    
    # 2. Apply the logic
    message = f"Hello, {name}!"
    
    # 3. Return the result
    return {
        'statusCode': 200,
        'body': message
    }

Node.js

export const handler = async (event, context) => {
    // 1. Read data from the event
    const name = event.name || 'World';
    
    // 2. Apply the logic
    const message = `Hello, ${name}!`;
    
    // 3. Return the result
    return {
        statusCode: 200,
        body: message
    };
};

Details:

event - The function's input. Its structure depends on the trigger source:

From API Gateway: contains the HTTP request details
From S3: contains information about the newly uploaded object
From CloudWatch Events (now Amazon EventBridge): contains the scheduled event data

context - Information about the execution environment (attribute names as exposed by the Python runtime):

function_name: The function's name
memory_limit_in_mb: The amount of memory allocated
aws_request_id: The request ID (useful for debugging)
get_remaining_time_in_millis(): Time left before the function times out
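To make the event and context description concrete, here is a minimal sketch of a handler that guesses the trigger source from the event's shape and reads the context fields listed above. The `detect_source` helper and the `fake_context` object are illustrative assumptions for local runs, not part of the Lambda API; real events carry many more fields.

```python
from types import SimpleNamespace


def detect_source(event):
    """Guess which service produced the event by looking at its shape."""
    records = event.get("Records") or []
    if records and records[0].get("eventSource") == "aws:s3":
        return "s3"
    if records and records[0].get("eventSource") == "aws:sqs":
        return "sqs"
    if event.get("source") == "aws.events":
        return "schedule"
    if "requestContext" in event:
        return "api-gateway"
    return "unknown"


def lambda_handler(event, context):
    # Report where the event came from plus a few context fields.
    return {
        "source": detect_source(event),
        "request_id": context.aws_request_id,
        "function": context.function_name,
        "memory_mb": context.memory_limit_in_mb,
        "ms_left": context.get_remaining_time_in_millis(),
    }


# Stand-in for the real context object, for local runs only.
fake_context = SimpleNamespace(
    function_name="demo",
    memory_limit_in_mb=128,
    aws_request_id="local-1",
    get_remaining_time_in_millis=lambda: 10_000,
)

result = lambda_handler({"Records": [{"eventSource": "aws:s3"}]}, fake_context)
print(result["source"])  # s3
```

In a real function you would normally know the trigger in advance; shape detection like this is mostly useful when one function is wired to several sources.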

Hands-on: create your first Lambda function

This is the most important part. We will create a simple Lambda function that handles HTTP requests.

Step 1: Prepare the code

Create the file lambda/hello/index.py:

import json
from datetime import datetime

def lambda_handler(event, context):
    """
    Simple Hello World Lambda function
    Demonstrates basic request/response handling
    """
    
    # Log for debugging (visible in CloudWatch Logs)
    print(f"Received event: {json.dumps(event)}")
    
    # Read parameters from the request
    # When invoked through API Gateway, query parameters are in the event
    # (the key is null or absent when the request has no query string)
    query_params = event.get('queryStringParameters', {}) or {}
    name = query_params.get('name', 'World')
    
    # Build the response
    response_body = {
        'message': f'Hello, {name}!',
        'timestamp': datetime.now().isoformat(),
        'function_name': context.function_name if context else 'local',
    }
    
    # Return in the format API Gateway expects
    return {
        'statusCode': 200,
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*'  # CORS
        },
        'body': json.dumps(response_body)
    }
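Before deploying, it can help to call the handler locally with a hand-built event. The sketch below assumes a condensed copy of the handler above (so it runs on its own) and a hypothetical `make_event` helper that fills in only the one field the handler reads; a real API Gateway event contains far more.

```python
import json
from types import SimpleNamespace


def lambda_handler(event, context):
    # Condensed copy of the handler above, so this sketch is self-contained.
    params = event.get("queryStringParameters") or {}
    body = {
        "message": f"Hello, {params.get('name', 'World')}!",
        "function_name": context.function_name if context else "local",
    }
    return {"statusCode": 200, "body": json.dumps(body)}


def make_event(name=None):
    """Build a minimal API Gateway-style event (hypothetical helper)."""
    return {"queryStringParameters": {"name": name} if name else None}


# Stand-in for the Lambda context object.
fake_context = SimpleNamespace(function_name="myapp-hello")

response = lambda_handler(make_event("AWS"), fake_context)
print(json.loads(response["body"])["message"])  # Hello, AWS!
```

Passing `None` as the context exercises the `'local'` fallback, which is why the handler guards the attribute access.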

Step 2: Terraform configuration

# main.tf
# Assumes var.project_name and var.environment are declared elsewhere (for example in variables.tf)

# Package the code as a zip archive
data "archive_file" "hello_lambda" {
  type        = "zip"
  source_dir  = "${path.module}/lambda/hello"
  output_path = "${path.module}/lambda/hello.zip"
}

# IAM role for Lambda
# Lambda needs an IAM role to get permission to run and write logs
resource "aws_iam_role" "lambda_exec" {
  name = "${var.project_name}-lambda-role"

  # Trust policy: allows the Lambda service to assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "lambda.amazonaws.com"
      }
    }]
  })
}

# Attach the managed policy that allows writing to CloudWatch Logs
resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = aws_iam_role.lambda_exec.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

# Lambda Function
resource "aws_lambda_function" "hello" {
  filename         = data.archive_file.hello_lambda.output_path
  function_name    = "${var.project_name}-hello"
  role             = aws_iam_role.lambda_exec.arn
  handler          = "index.lambda_handler"  # file.function_name
  runtime          = "python3.12"
  source_code_hash = data.archive_file.hello_lambda.output_base64sha256

  # Resource configuration
  memory_size = 128  # MB (128 - 10240)
  timeout     = 10   # seconds (1 - 900)

  # Environment variables
  environment {
    variables = {
      ENVIRONMENT = var.environment
      LOG_LEVEL   = "INFO"
    }
  }

  tags = {
    Name = "${var.project_name}-hello"
  }
}

# CloudWatch log group (Lambda would create it automatically; declaring it here lets us set retention)
resource "aws_cloudwatch_log_group" "hello" {
  name              = "/aws/lambda/${aws_lambda_function.hello.function_name}"
  retention_in_days = 14  # Keep logs for 14 days to save cost
}

Step 3: Create an API Gateway to invoke Lambda

# api_gateway.tf

# API Gateway HTTP API (v2: simpler and cheaper than REST API)
resource "aws_apigatewayv2_api" "main" {
  name          = "${var.project_name}-api"
  protocol_type = "HTTP"

  cors_configuration {
    allow_origins = ["*"]
    allow_methods = ["GET", "POST", "OPTIONS"]
    allow_headers = ["Content-Type"]
    max_age       = 300
  }
}

# Integration: connects API Gateway to Lambda
resource "aws_apigatewayv2_integration" "hello" {
  api_id             = aws_apigatewayv2_api.main.id
  integration_type   = "AWS_PROXY"
  integration_uri    = aws_lambda_function.hello.invoke_arn
  integration_method = "POST"
}

# Route: /hello
resource "aws_apigatewayv2_route" "hello" {
  api_id    = aws_apigatewayv2_api.main.id
  route_key = "GET /hello"
  target    = "integrations/${aws_apigatewayv2_integration.hello.id}"
}

# Stage: the deployment environment
resource "aws_apigatewayv2_stage" "default" {
  api_id      = aws_apigatewayv2_api.main.id
  name        = "$default"
  auto_deploy = true

  access_log_settings {
    destination_arn = aws_cloudwatch_log_group.api_gw.arn
    format = jsonencode({
      requestId      = "$context.requestId"
      ip             = "$context.identity.sourceIp"
      requestTime    = "$context.requestTime"
      httpMethod     = "$context.httpMethod"
      routeKey       = "$context.routeKey"
      status         = "$context.status"
      responseLength = "$context.responseLength"
    })
  }
}

resource "aws_cloudwatch_log_group" "api_gw" {
  name              = "/aws/apigateway/${var.project_name}"
  retention_in_days = 7
}

# Permission: allows API Gateway to invoke the Lambda function
resource "aws_lambda_permission" "apigw" {
  statement_id  = "AllowAPIGatewayInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.hello.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${aws_apigatewayv2_api.main.execution_arn}/*/*"
}

Step 4: Outputs

# outputs.tf
output "api_endpoint" {
  description = "API Gateway endpoint URL"
  value       = "${aws_apigatewayv2_api.main.api_endpoint}/hello"
}

output "lambda_function_name" {
  value = aws_lambda_function.hello.function_name
}

Step 5: Deploy and test

# Deploy
terraform init
terraform apply

# Test API
curl "https://xxx.execute-api.ap-southeast-1.amazonaws.com/hello?name=AWS"

# Response
{
  "message": "Hello, AWS!",
  "timestamp": "2025-01-23T10:30:00.123456",
  "function_name": "myapp-hello"
}

Common event triggers

Lambda does not run by itself; it has to be triggered by an event:

1. API Gateway (HTTP requests)

# Already demonstrated above
resource "aws_apigatewayv2_route" "hello" {
  route_key = "GET /hello"
  # ...
}

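2. S3 (file upload)

# Trigger Lambda when a new file lands in S3
resource "aws_s3_bucket_notification" "lambda_trigger" {
  bucket = aws_s3_bucket.uploads.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.process_file.arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "uploads/"
    filter_suffix       = ".jpg"
  }

  # S3 must be allowed to invoke the function before the notification is created
  depends_on = [aws_lambda_permission.allow_s3]
}

resource "aws_lambda_permission" "allow_s3" {
  statement_id  = "AllowS3Invoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.process_file.function_name
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.uploads.arn
}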
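On the function side, an S3 notification arrives as a list of records. The following is a minimal sketch of reading the bucket and object key from each record; the bucket name and key in `sample_event` are made up, and the sample contains only the fields this code touches.

```python
from urllib.parse import unquote_plus


def extract_objects(event):
    """Return (bucket, key) pairs from an S3 notification event."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded (a space becomes '+').
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    for bucket, key in extract_objects(event):
        print(f"New object: s3://{bucket}/{key}")
    return {"processed": len(event.get("Records", []))}


# Trimmed-down example event with a hypothetical bucket and key.
sample_event = {
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "my-uploads"},
            "object": {"key": "uploads/holiday+photo.jpg", "size": 1024},
        },
    }]
}
print(extract_objects(sample_event))  # [('my-uploads', 'uploads/holiday photo.jpg')]
```

Decoding the key matters in practice: using the raw value to fetch the object fails for names that contain spaces or special characters.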

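3. CloudWatch Events / EventBridge (scheduled, cron)

# Run Lambda every 5 minutes
resource "aws_cloudwatch_event_rule" "every_5_minutes" {
  name                = "every-5-minutes"
  schedule_expression = "rate(5 minutes)"
}

resource "aws_cloudwatch_event_target" "lambda" {
  rule      = aws_cloudwatch_event_rule.every_5_minutes.name
  target_id = "TriggerLambda"
  arn       = aws_lambda_function.cleanup.arn
}

# Allow the rule to invoke the function; without this the schedule fires but Lambda is never called
resource "aws_lambda_permission" "allow_events" {
  statement_id  = "AllowEventBridgeInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.cleanup.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.every_5_minutes.arn
}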

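4. SQS (message queue)

# The function's role also needs permission to read and delete queue messages
# (for example the AWSLambdaSQSQueueExecutionRole managed policy)
resource "aws_lambda_event_source_mapping" "sqs" {
  event_source_arn = aws_sqs_queue.tasks.arn
  function_name    = aws_lambda_function.worker.arn
  batch_size       = 10  # Deliver up to 10 messages per invocation
}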
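A worker for this queue receives the batch in `event["Records"]`. The sketch below shows one possible shape, reporting only the failed messages back to Lambda. It assumes the event source mapping additionally sets `function_response_types = ["ReportBatchItemFailures"]` (not present in the configuration above); without that setting the return value is ignored, and raising an exception retries the whole batch. `process_message` is a hypothetical placeholder for real business logic.

```python
import json


def process_message(body):
    """Hypothetical business logic: reject messages without a 'task' field."""
    payload = json.loads(body)
    if "task" not in payload:
        raise ValueError("missing task")
    return payload["task"]


def lambda_handler(event, context):
    failures = []
    for record in event.get("Records", []):
        try:
            process_message(record["body"])
        except Exception as exc:
            print(f"Message {record['messageId']} failed: {exc}")
            failures.append({"itemIdentifier": record["messageId"]})
    # Only the listed messages go back to the queue; the rest are deleted.
    return {"batchItemFailures": failures}


# Two-message sample batch: the second message is malformed on purpose.
sample_event = {"Records": [
    {"messageId": "m-1", "body": json.dumps({"task": "resize"})},
    {"messageId": "m-2", "body": json.dumps({"other": 1})},
]}
outcome = lambda_handler(sample_event, None)
print(outcome)  # {'batchItemFailures': [{'itemIdentifier': 'm-2'}]}
```

Reporting failures per message avoids reprocessing the nine good messages in a batch because one was bad.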

Understanding cold starts

This is the best-known drawback of Lambda, and one you need to be aware of.

What is a cold start?

When a function receives no requests for a while, AWS shuts down its execution environment to free resources. When the next request arrives, AWS has to:

Create a new execution environment
Download and extract your code
Load the runtime (Python, Node.js…)
Run your initialization code (imports, connections…)
Only then run the handler function

This typically takes somewhere between 100 ms and 3 seconds, depending on:

Runtime (Java cold starts are slower than Python)
Code size (more dependencies = slower)
VPC configuration (this used to add 1-2 s or more; since AWS changed how Lambda attaches to VPCs in 2019, the extra delay is usually much smaller)
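The split between initialization code and the handler can be seen in a small sketch. Module-level statements run once per execution environment (during the cold start), while the handler runs on every invocation, so expensive setup belongs at module level. `EXPENSIVE_CONFIG` is a stand-in for whatever a real function would create there, such as SDK clients or database connections.

```python
import time

# Module-level code runs once per execution environment (the cold start).
INIT_STARTED = time.time()
EXPENSIVE_CONFIG = {"loaded_at": INIT_STARTED}  # stand-in for clients, connections
_invocations = 0


def lambda_handler(event, context):
    global _invocations
    _invocations += 1
    return {
        # True only for the first call in this environment.
        "cold_start": _invocations == 1,
        "invocations_in_this_environment": _invocations,
        "config_loaded_at": EXPENSIVE_CONFIG["loaded_at"],
    }


# Simulate two invocations served by the same (warm) environment.
first = lambda_handler({}, None)
second = lambda_handler({}, None)
print(first["cold_start"], second["cold_start"])  # True False
```

Logging a flag like `cold_start` is a simple way to measure how often your own traffic pattern actually hits cold starts before paying for a mitigation.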

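Reducing cold starts

# Provisioned Concurrency: keeps execution environments initialized ("warm")
# Assumes the function publishes versions (publish = true) and that an alias
# aws_lambda_alias.live pointing at a published version is defined elsewhere
resource "aws_lambda_provisioned_concurrency_config" "hello" {
  function_name                     = aws_lambda_function.hello.function_name
  provisioned_concurrent_executions = 5  # 5 environments always ready (billed even when idle)
  qualifier                         = aws_lambda_alias.live.name
}

# Or use SnapStart (introduced for Java; newer Python and .NET runtimes are also supported)
resource "aws_lambda_function" "java_app" {
  # ...
  snap_start {
    apply_on = "PublishedVersions"
  }
}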

Best practices

1. Keep functions small and focused

# ❌ Wrong - does too many things
# (the helper functions and `response` are placeholders)
def lambda_handler(event, context):
    validate_input()
    process_data()
    save_to_database()
    send_email()
    update_cache()
    return response

# ✅ Right - one function does one job
def lambda_handler(event, context):
    validate_input()
    process_data()
    # Trigger a separate function to send the email
    trigger_email_function()
    return response
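One way to implement a helper like `trigger_email_function` is an asynchronous invoke through boto3. This is a sketch under assumptions: the target function name `myapp-send-email` and the payload fields are hypothetical, and the calling function's role needs `lambda:InvokeFunction` on the target. The client is injectable so the example runs locally with a fake; in Lambda, boto3 is available in the Python runtime.

```python
import json


def trigger_email_function(recipient, subject, client=None,
                           function_name="myapp-send-email"):
    """Fire-and-forget invocation of a second (hypothetical) Lambda function."""
    if client is None:
        import boto3  # imported lazily so the sketch runs without AWS access
        client = boto3.client("lambda")
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: do not wait for the result
        Payload=json.dumps({"to": recipient, "subject": subject}).encode("utf-8"),
    )
    return response["StatusCode"]  # 202 means the event was accepted


class FakeLambdaClient:
    """Local stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def invoke(self, **kwargs):
        self.calls.append(kwargs)
        return {"StatusCode": 202}


fake = FakeLambdaClient()
status = trigger_email_function("user@example.com", "Welcome", client=fake)
print(status)  # 202
```

Publishing to a queue or topic (SQS, SNS) instead of invoking directly is a common alternative when the email step needs retries or buffering.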

2. Keep secrets out of environment variables

resource "aws_lambda_function" "app" {
  environment {
    variables = {
      # ❌ Wrong - hardcoded secret
      # API_KEY = "sk-1234567890"
      
      # ✅ Right - pass only the Secrets Manager ARN; the code reads the value at runtime
      SECRET_ARN = aws_secretsmanager_secret.api_key.arn
    }
  }
}
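With only the ARN in the environment, the function has to fetch the secret value itself. The sketch below shows one way to do that with boto3, caching the value for the lifetime of the execution environment so warm invocations skip the network call. It assumes the function's role allows `secretsmanager:GetSecretValue` on the secret; the `get_secret` helper, the fake client, and the sample ARN are illustrative.

```python
import os

_secret_cache = {}


def get_secret(secret_arn=None, client=None):
    """Fetch a secret string once per execution environment and cache it."""
    secret_arn = secret_arn or os.environ["SECRET_ARN"]
    if secret_arn not in _secret_cache:
        if client is None:
            import boto3  # imported lazily so the sketch runs without AWS access
            client = boto3.client("secretsmanager")
        response = client.get_secret_value(SecretId=secret_arn)
        _secret_cache[secret_arn] = response["SecretString"]
    return _secret_cache[secret_arn]


class FakeSecretsClient:
    """Local stand-in that counts how often the secret is requested."""

    def __init__(self):
        self.calls = 0

    def get_secret_value(self, SecretId):
        self.calls += 1
        return {"SecretString": "example-value"}


fake = FakeSecretsClient()
arn = "arn:aws:secretsmanager:region:123456789012:secret:demo"  # made-up ARN
value = get_secret(arn, client=fake)
get_secret(arn, client=fake)  # served from the cache
print(fake.calls)  # 1
```

A cached value will not notice a rotated secret until the environment is recycled, so functions that depend on rotation may want a time-based expiry on the cache.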

3. Set appropriate timeout and memory

# Rule of thumb:
# - API responses: timeout 10-30s, memory 128-512MB
# - Data processing: timeout 60-300s, memory 512-2048MB
# - ML inference: memory 2048-10240MB

resource "aws_lambda_function" "api" {
  timeout     = 30
  memory_size = 256
}

Common errors and how to fix them

1. “Task timed out”

Cause: the function ran longer than its configured timeout.

Fix:

resource "aws_lambda_function" "slow" {
  timeout = 60  # Raise the timeout (max 900 s)
}
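Raising the timeout is not always enough for batch-style work. A complementary approach, sketched below, is to check `context.get_remaining_time_in_millis()` inside the loop and stop cleanly before the limit, returning whatever is left so it can be retried. The 2-second `SAFETY_MARGIN_MS`, the doubling "work", and the `FakeContext` class are assumptions made for the example.

```python
SAFETY_MARGIN_MS = 2_000  # assumed buffer for cleanup and returning a response


def lambda_handler(event, context):
    items = list(event.get("items", []))
    done = []
    while items:
        # Stop early instead of being killed mid-item by the timeout.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        done.append(items.pop(0) * 2)  # stand-in for real work
    return {"done": done, "remaining": items}


class FakeContext:
    """Local stand-in whose remaining time drops by 1 s on every check."""

    def __init__(self, ms):
        self.ms = ms

    def get_remaining_time_in_millis(self):
        self.ms -= 1_000
        return self.ms


result = lambda_handler({"items": [1, 2, 3, 4, 5]}, FakeContext(5_000))
print(result)  # {'done': [2, 4, 6], 'remaining': [4, 5]}
```

The unprocessed items can then be handed to another invocation, for example by sending them back to a queue.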

2. “Unable to import module”

Cause: dependencies were not packaged with the function.

Fix: use Lambda Layers, or bundle the dependencies into the deployment package:

pip install -r requirements.txt -t ./package
cd package && zip -r ../lambda.zip .
cd .. && zip -g lambda.zip index.py

3. “AccessDeniedException”

Cause: the IAM role is missing permissions.

Fix: add the policy the function needs:

resource "aws_iam_role_policy" "s3_access" {
  role = aws_iam_role.lambda_exec.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = ["arn:aws:s3:::my-bucket/*"]
    }]
  })
}

Lambda pricing

Lambda is very cheap for uneven workloads (the prices shown are x86 rates at the time of writing; they vary by region and architecture):

Metric	Free Tier	Price beyond that
Requests	1 million/month	$0.20 per 1M
Duration	400,000 GB-seconds/month	$0.0000166667 per GB-second

Worked example:

1 million requests/month
Each request runs for 200 ms with 128 MB of memory
Duration = 1M × 0.2 s × 0.125 GB = 25,000 GB-seconds
Cost = $0 (within the Free Tier)
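The same arithmetic can be wrapped in a small calculator to try other workloads. This is a rough sketch that uses only the rates and Free Tier figures from the table above; it ignores data transfer, API Gateway, and other services, and the `monthly_cost` function name is an illustrative choice.

```python
PRICE_PER_MILLION_REQUESTS = 0.20    # USD, from the table above
PRICE_PER_GB_SECOND = 0.0000166667   # USD, from the table above
FREE_REQUESTS = 1_000_000            # per month
FREE_GB_SECONDS = 400_000            # per month


def monthly_cost(requests, avg_duration_ms, memory_mb):
    """Return (GB-seconds used, estimated monthly cost in USD)."""
    gb_seconds = requests * avg_duration_ms * memory_mb / (1000 * 1024)
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    cost = (billable_requests / 1_000_000) * PRICE_PER_MILLION_REQUESTS \
        + billable_gb_seconds * PRICE_PER_GB_SECOND
    return gb_seconds, round(cost, 2)


# The worked example: 1M requests, 200 ms each, 128 MB.
small = monthly_cost(1_000_000, 200, 128)
print(small)  # (25000.0, 0.0)

# A heavier workload: 10M requests, 200 ms each, 512 MB.
large = monthly_cost(10_000_000, 200, 512)
print(large)
```

For the heavier workload the Free Tier covers only part of the usage: 9 million billable requests plus 600,000 billable GB-seconds come to roughly $11.80 per month at these rates.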

Next steps

Once you are comfortable with the Lambda basics, continue with:

Lambda Layers - share code and dependencies
Step Functions - orchestrate multiple Lambda functions
SAM/Serverless Framework - IaC tooling built for serverless
X-Ray - distributed tracing

Next article: Auto Scaling - automatically scale EC2 based on demand.