Hi 👋, I am Zhihong (Cerek)

Zhihong (Cerek) Shi

DevOps Engineer

Seasoned Linux System Administrator with 5+ years of experience managing and optimizing Linux-based infrastructure. Seeking a challenging role as a DevOps Engineer where I can leverage my expertise in Linux systems and cloud services to contribute to a dynamic team in a fast-paced environment. Dedicated to implementing robust, efficient systems and improving operational efficiency.

Operating System - Linux
Containerization - Kubernetes, Docker
Coding - Python, Bash
Database - MySQL, MongoDB, Redis
Teamwork
Multitasking

Experiences

KZ Kitchen Cabinet & Stone, Inc.

Aug 2021 - Jun 2024

San Jose, California

IT Technician

Responsibilities:
  • Building Internal Services (solo project): 1. Developed and maintained the IT ticket system as a full-stack developer using Django REST Framework (DRF), React, and MySQL (more details under the personal project, WorkPortal System). 2. Developed and deployed the notification service using the Python Flask framework. 3. Built the internal wiki service with Material for MkDocs.
  • Infrastructure Management: 1. Used Ansible to initialize bare-metal servers, build the Proxmox cluster, and manage the VMs. 2. Built the Kubernetes cluster with kubeadm. 3. Built the image registry with Harbor. 4. Built the GitLab server and GitLab Runner for source code management and CI/CD.
  • Automation: 1. Designed workflows and implemented GitLab CI/CD pipelines to deploy internal applications to the Kubernetes cluster, reducing deployment time by 40%. The pipeline stages include compiling the code, building the Docker image, pushing the image to Harbor, and updating deployments with the new image via ArgoCD. 2. Created a cron job that backs up the database and uploads the backup files to NAS storage.
  • Network Design and Maintenance: 1. Planned and designed the network infrastructure to support the company’s operational needs. 2. Drew the layout and configured switches, routers, and access points. 3. Installed and configured all devices, ensuring secure and reliable connectivity. 4. Monitored network performance via Zabbix and troubleshot issues.
  • Help Desk: 1. End-user device management: worked closely with fleet management tools (ABM, Jamf, JumpCloud) to maintain devices including desktops, laptops, and mobile devices. 2. Ticket management: created and managed tickets for reported issues and offered technical support to end users. 3. Login and permission management: integrated SSO with G Suite, NetSuite, and 8x8 via the JumpCloud platform.

NetEase Games

Apr 2020 - Apr 2021

Guangzhou, China

Senior Site Reliability Engineer

Responsibilities:
  • Automation and Scripting: Provided technical support for the production environments of gaming projects. 1. Evaluated resource demand based on stress-testing results; 2. Integrated new projects into internal SRE systems; 3. Built tools (using Python and shell scripts) to reduce the time of bi-weekly maintenance; 4. Planned auto scaling to deliver the best performance during peak hours and save costs during off-peak hours.
  • Troubleshooting and Performance Tuning: Experience and knowledge in areas such as identifying big-key problems and rebalancing keys in Redis, and resolving deadlocks and optimizing SQL statements in MySQL. Familiar with high-availability architectures using MySQL replication, Redis Cluster, and MongoDB replication.
  • Basics of Containerization: Combined GitLab with Docker to build images and run containers automatically in the development environment, improving productivity. Entry-level usage of Kubernetes in an internal environment.

BAIOO Family

May 2017 - Mar 2020

Guangzhou, China

Site Reliability Engineer

Responsibilities:
  • Cloud Migration: Managed the migration of on-premises applications and services from a traditional data center to the Aliyun and Qcloud cloud platforms. Successfully migrated 200+ servers to the cloud, resulting in a 35% reduction in operational costs.
  • Implementation of Monitoring Solutions: Deployed Zabbix for network and server monitoring, resulting in a 30% decrease in downtime. Configured alerts and automated reports, enhancing the IT team’s ability to address potential issues proactively.

Skills

Education

B.Sc. in Network Engineering
GPA: 3.4 out of 4
Taken Courses:
Course Name                     | Total Credit | Obtained Credit
Data Structures and Algorithm   | 4            | 3.75
Network Security                | 4            | 3.5
Operating System                | 4            | 3.75
Cloud Infrastructure Essentials | 4            | 3.75

Projects

Work Portal Server
Contributor Nov 2023 - Present

Work Portal Server is the back end of the WorkPortal project, an all-in-one system offering solutions for HR, IT, and project management.

Work Portal Web
Contributor Nov 2023 - Present

Work Portal Server is the back end of the WorkPortal project, an all-in-one system offering solutions for HR, IT, and project management.

WildSRE Wiki
Owner Oct 2022 - Present

The WildSRE Wiki is a comprehensive resource designed for learning and mastering DevOps skills. It serves as a detailed notebook, offering structured materials and in-depth tutorials on various DevOps concepts and practices.