Bo Bo Han

Cloud & System Engineer

1x AWS Certified | 1x Microsoft Security Certified | 1x GitHub Certified | Managed Service

About Me

Site Reliability Engineer (SRE) and Senior System Engineer with 10+ years of experience managing large-scale infrastructure (700+ servers) and designing resilient cloud platforms. Specialized in IaaS (AWS/GCP/Huawei Cloud), PaaS (ECS/Kubernetes/EKS), and SaaS integration (IaC). Proven track record of reducing operational costs by 20% and improving deployment efficiency by 60% through automation, backed by deep expertise in Linux system internals and high-availability architecture.

Languages

English (Fluent), German (Learning)

Experience

Jun 2025 - Present
Cloud Platform Engineer
Trivyst (HumbleLab Pte.Ltd), Singapore (Remote)
  • Architect and operate highly resilient AWS cloud infrastructure, achieving 99.99% uptime for global client workloads.
  • Define and monitor SLIs/SLOs to ensure service reliability, using error budgets to make data-driven decisions on release velocity versus stability.
  • Drive Infrastructure as Code (IaC) adoption using Terraform and Ansible, fully automating environment provisioning and configuration management.
  • Implement trunk-based CI/CD pipelines with GitHub Actions, standardizing automated testing and zero-downtime releases.
  • Develop Python and Bash automation scripts that reduced manual operational workload by over 60%.
  • Containerize applications using Docker, ensuring consistency across development, staging, and production environments.
  • Modernize infrastructure by deploying and scaling containerized applications on AWS ECS and EKS (Kubernetes), improving system reliability and deployment frequency.
  • Develop autonomous AI agents that automate complex multi-step workflows, integrating LLMs with internal APIs to streamline operations.
Feb 2024 - Mar 2025
Site Reliability Engineer (SRE)
Onenex (Atlas Digi Myanmar Ltd.), Myanmar
  • Optimized AWS infrastructure to reduce monthly operational costs by 20% through resource rightsizing and reserved instance strategies.
  • Managed AWS and Huawei Cloud infrastructure and hybrid environments, supporting scalable enterprise solutions.
  • Enhanced system reliability and achieved 99.99% uptime by automating configuration management across Ubuntu fleets using Ansible.
  • Implemented robust Backup & Disaster Recovery strategies using AWS Backup, ensuring data integrity and compliance with RPO/RTO targets.
  • Automated environment provisioning for multiple projects using Ansible, reducing setup time from days to hours.
  • Resolved interactive incidents monthly by implementing proactive alerting via CloudWatch and New Relic.
  • Established proactive monitoring frameworks using Prometheus, Grafana, New Relic, AWS CloudWatch, and Uptime Robot to maintain high availability and 99.9% uptime, aligning technical metrics (SLIs) with business agreements (SLAs).
  • Managed cloud budget and cost estimation, delivering monthly optimization reports that identified and eliminated waste in underutilized resources.
  • Streamlined software delivery by migrating legacy deployment scripts to modular, reusable GitLab CI templates, improving pipeline maintainability across multiple projects.
Jun 2019 - Feb 2024
Senior System Engineer (VAS Product)
ZTE Corporation, Myanmar
  • Led engineering for a 700+ server production environment, ensuring stability for critical Value-Added Services (VAS).
  • Executed complex infrastructure projects including rack installation and data commissioning, reducing migration downtime by 30%.
  • Managed ZTE Private Cloud (TECS) with 99.9% availability, supporting critical VAS platforms (SMSC, SDP, USSDGW) for GSM/CDMA/PSTN networks.
  • Reduced migration downtime by 30% across 100+ server transitions by running signaling traces with tcpdump to optimize network performance.
  • Improved operational efficiency by 50% through rigorous Linux administration (SUSE, CGSL) and hardware management (Blade Servers, ZTE ZXCloud E9000).
  • Enhanced Backup & DR processes by 25%, managing Oracle, PostgreSQL, and Sybase databases and ensuring data integrity across DR sites.
  • Secured operations by mitigating 40+ threats annually through vulnerability scans, security agent deployment, and network hardening.
  • Monitored infrastructure performance using NetNumen U31, Zabbix, and alarm systems to ensure uninterrupted service delivery.
  • Integrated new products and services into VAS platforms, managing VMware vCenter, VMware ESXi and ZTE cloud platforms (NFVO, VNFM).
  • Led physical and application migrations, including hardware installation, cabling, and data commissioning while ensuring compliance with HSE rules.
Feb 2014 - Sep 2018
Infrastructure Managed Service Engineer
Huawei Technologies, Myanmar
  • Maintained 99.95% uptime for NGBSS mission-critical systems.
  • Administered large-scale Linux (SUSE) infrastructure (500+ systems) and storage systems (OceanStor), improving storage performance by 15%.
  • Streamlined Linux OS administration (SUSE) for server maintenance, boosting issue resolution speed and improving uptime by 25% through proactive monitoring.
  • Managed FortiGate firewalls and McAfee IPS/IDS to safeguard infrastructure, ensuring compliance with ITIL and ITSM processes.
  • Administered the Commvault Simpana backup system and Huawei OceanStor 9000 storage, enhancing data backup efficiency by 80% for seamless disaster recovery.
  • Monitored system health using Huawei iManager (U2000) and HP tools (SiteScope, NNMi), optimizing NGBSS and PCRF systems.
  • Led NOC team operations, overseeing 24/7 monitoring and resolving incidents via HP Service Manager ticketing (ITSM) and HP TeMIP tools.
  • Maintained Huawei ATAE/USAU servers and Universal Server Manager (USM), ensuring reliable infrastructure performance.
  • Applied Oracle SQL database skills to assist in database maintenance and backup/recovery strategies.
  • Collaborated with vendors for procurement and maintained detailed documentation, ensuring compliance with HSE standards.
Sep 2009 - May 2014
Network & System Engineer (Early Career)
Net City, Cyber City, ACE Data Systems & COM
  • Increased internet uptime to 98% at Net City by configuring load balancing while managing WAN/LAN networks.
  • Supported secure operations at Cyber City and COM/ACE, reducing hardware failures by 30% through repairs and monitoring.

Skills

Cloud & Virtualization

AWS, GCP, Huawei Cloud, ZTE TECS, VMware, Kubernetes

System Administration

Linux (RedHat/Ubuntu/CentOS/SUSE/CGSL), Windows Server, Server Management (ATAC/ATAE Blade Server/ZTE R5300), IAM, Active Directory

Network & Infrastructure

TCP Dump, Brocade SW300, Load Balancing, Rack Installation, Server Mounting, Cabling, Data Commissioning

Monitoring & SRE

Prometheus, Grafana, Zabbix, ZTE NetNumen U31, AWS CloudWatch, New Relic, Datadog, Huawei I2000, Uptime Robot, SLO/SLI/SLA, Incident Management

Databases

Oracle SQL, PostgreSQL, Sybase, MySQL, MariaDB, Vector DB (AlloyDB)

DevOps & Automation

Ansible, Terraform (IaC), Docker (Containerization), Git (VCS), GitHub Actions, CI/CD (GitLab/CircleCI), Jenkins, SonarQube

Programming

Bash, YAML, JSON, Python

AI-Assisted Operations

Model Context Protocol (MCP), Autonomous Agents, Multi-Agent Orchestration, n8n Workflow Automation, LLM Integration, Prompt Engineering, Self-Healing Infrastructure, Agentic Workflows, AI Infrastructure Analysis, Workflow Optimization

Management & Soft Skills

ITSM, ITIL, SCRUM (Agile), Problem-Solving, Adaptability, Self-Learning, Time Management, Debugging, Collaboration

Projects

Autonomous AWS SysAdmin Agent (AI/MCP)

Designed a secure agentic control plane using AWS Fargate, Terraform, and Model Context Protocol (MCP). Built a headless AI agent for autonomous server diagnostics and remediation.

High-Availability Web App on AWS ECS

Provisioned resilient infrastructure using Terraform, deploying a Dockerized app on AWS ECS Fargate with an ALB, strict IAM policies, and an automated GitHub Actions pipeline.

AWS Cost Saver Bot

Developed a serverless bot using Python (boto3) and Lambda to analyze daily AWS spend, detect anomalies, and send real-time cost reports via SNS and the Telegram API.

Production-Grade EKS Cluster

Architected a highly available Kubernetes cluster on AWS EKS using Terraform, with VPC networking, IAM security, automated pipelines, and OIDC providers for secure pod identity.

Frontend ToDo App & Slack Alerts

Deployed a frontend ToDo application using AWS CloudFront and S3, with real-time Slack alerts for bucket changes.

WordPress on EC2

Deployed a WordPress site on AWS EC2 with automated setup.

CI/CD Pipeline

Implemented a CI/CD pipeline using GitHub Actions and AWS.

Certifications & Badges

  • AWS Certified: AWS Solutions Architect Associate
  • Microsoft Certified: Microsoft Security Operations Analyst
  • GitHub Certified: GitHub Foundations
  • Docker Essentials
  • ISO/IEC 27001 Information Security
  • Kubernetes Fundamentals (LFS258)
  • KodeKloud 100 Days of DevOps (Level 1)
  • KodeKloud Docker (Level 1)
  • KodeKloud Ansible (Level 1)
  • Cybersecurity Essentials (LFC108)
  • KEDA: Scaling Cloud Native Applications
  • Introduction to GitOps
  • Introduction to Zero Trust

Education

2010 - 2012
Diploma in Network Engineering
2006 - 2008
B.Sc (Zoology)