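CI/CD: Continuous Integration and Deployment in Practice
CI/CD pipeline design and practice: writing pipelines for GitLab CI and GitHub Actions, multi-environment deployment strategies, and putting DevOps engineering into practice.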
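Learning Objectives
- Understand the design principles and best practices of CI/CD pipelines
- Become proficient with mainstream CI/CD tools (Jenkins, GitLab CI/CD, GitHub Actions)
- Understand and apply core DevOps principles and methods
- Implement infrastructure as code and configuration management
- Design an enterprise-grade CI/CD system architecture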
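Core Concepts
CI/CD Pipeline Architecture Design
Pipeline stage design:
A complete enterprise-grade CI/CD pipeline typically includes the following core stages:
- Code checkout and preparation
  - Pull the latest code from the repository
  - Determine the scope of the change and build only what is affected
  - Prepare the build environment and resolve dependencies
- Build and code quality
  - Compile and package the source code
  - Static code analysis (SonarQube)
  - Security scanning (SAST, dependency checks)
- Automated testing
  - Unit tests and coverage reports
  - Integration tests
  - Contract tests
  - Performance benchmarks
- Artifact management
  - Versioned storage of build outputs
  - Container image build and scanning
  - Artifact signing and verification
- Deployment
  - Deploy to the test environment
  - Deploy to the acceptance-test environment
  - Deploy to the pre-production environment
  - Deploy to the production environment
- Verification and monitoring
  - Automated post-deployment verification
  - Application performance monitoring
  - User-experience monitoring
  - Alerting and notification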
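The "build only what is affected" step in the checkout stage can be sketched in Python as follows. This is a minimal illustration, not part of any CI tool: the directory layout in `MODULE_ROOTS`, the dependency table `DEPENDENTS`, and the function name `affected_modules` are all hypothetical, and the list of changed files is assumed to come from something like `git diff --name-only`.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from a source directory to the module built from it.
MODULE_ROOTS = {
    "services/orders": "orders",
    "services/payments": "payments",
    "libs/common": "common",
}

# Hypothetical reverse dependencies: module -> modules that depend on it.
DEPENDENTS = {
    "common": {"orders", "payments"},
}


def affected_modules(changed_files):
    """Return the sorted list of modules that need to be rebuilt."""
    affected = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        for root, module in MODULE_ROOTS.items():
            root_parts = PurePosixPath(root).parts
            # A file belongs to a module when its path starts with the module root.
            if parts[:len(root_parts)] == root_parts:
                affected.add(module)
    # Also rebuild direct dependents of every changed module (one level deep).
    for module in list(affected):
        affected |= DEPENDENTS.get(module, set())
    return sorted(affected)


if __name__ == "__main__":
    print(affected_modules(["services/orders/src/App.java", "README.md"]))
```

Files outside every module root (such as `README.md` here) trigger no build, which is what keeps documentation-only changes cheap.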
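Pipeline architecture example:
[Code repository] → [Build service] → [Artifact repository] → [Deploy service] → [Environment clusters]
        ↑                  ↑                    ↑                     ↑                     ↓
        └──────────────────┴────────────────────┴─────────────────────┴─────────────────────┘
                                    Monitoring and feedback loop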
Mainstream CI/CD Tools in Practice
Advanced Jenkins Configuration
Dynamic Jenkins agents on Kubernetes:
# Jenkins Kubernetes cloud configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: jenkins-kubernetes-config
data:
  config.yaml: |
    clouds:
      - kubernetes:
          name: "kubernetes-cloud"
          serverUrl: "https://kubernetes.default.svc.cluster.local"
          namespace: "jenkins"
          credentialsId: "kubernetes-service-account"
          jenkinsUrl: "http://jenkins-master:8080"
          jenkinsTunnel: "jenkins-agent:50000"
          containerCapStr: "10"
          retentionTimeout: 300
          connectTimeout: 30
          readTimeout: 60
          templates:
            - name: "maven"
              namespace: "jenkins"
              label: "kubernetes-maven"
              containers:
                - name: "maven"
                  image: "maven:3.8.6-openjdk-11"
                  command: "cat"
                  ttyEnabled: true
                  envVars:
                    - envVar:
                        key: "JENKINS_URL"
                        value: "http://jenkins-master:8080"
Declarative Pipeline example:
pipeline {
    agent {
        kubernetes {
            yaml """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: maven
    image: maven:3.8.6-openjdk-11
    command: ['cat']
    tty: true
  - name: docker
    image: docker:20.10
    command: ['cat']
    tty: true
    volumeMounts:
    - name: docker-sock
      mountPath: /var/run/docker.sock
  volumes:
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock
"""
        }
    }
    environment {
        ARTIFACT_NAME = 'microservice'
        ARTIFACT_VERSION = "${BUILD_NUMBER}-${GIT_COMMIT.take(7)}"
        DOCKER_REGISTRY = 'harbor.example.com'
        DOCKER_IMAGE = "${DOCKER_REGISTRY}/devops/${ARTIFACT_NAME}:${ARTIFACT_VERSION}"
        // Read secrets from the Jenkins credentials store.
        // A username/password credential also exposes DOCKER_CREDS_USR and DOCKER_CREDS_PSW;
        // a secret-file credential sets the variable to the path of a temporary file.
        DOCKER_CREDS = credentials('docker-registry-creds')
        K8S_CONFIG = credentials('k8s-config')
    }
    stages {
        stage('Checkout') {
            steps {
                checkout scm
            }
        }
        stage('Code quality') {
            steps {
                container('maven') {
                    withSonarQubeEnv('SonarQube') {
                        // Triple-quoted so the command can span several lines;
                        // ${ARTIFACT_NAME} is expanded by the shell from the environment.
                        sh '''
                            mvn sonar:sonar \
                              -Dsonar.projectKey=${ARTIFACT_NAME} \
                              -Dsonar.qualitygate.wait=true
                        '''
                    }
                }
            }
        }
        stage('Build') {
            steps {
                container('maven') {
                    sh 'mvn clean package -DskipTests'
                    archiveArtifacts artifacts: 'target/*.jar', fingerprint: true
                }
            }
        }
        stage('Unit tests') {
            steps {
                container('maven') {
                    sh 'mvn test'
                }
            }
            post {
                always {
                    junit 'target/surefire-reports/*.xml'
                    jacoco(execPattern: 'target/jacoco.exec')
                }
            }
        }
        stage('Build Docker image') {
            steps {
                container('docker') {
                    // Single quotes: the shell, not Groovy, expands the secrets,
                    // so they are not interpolated into the command string.
                    sh 'echo "$DOCKER_CREDS_PSW" | docker login "$DOCKER_REGISTRY" -u "$DOCKER_CREDS_USR" --password-stdin'
                    sh 'docker build -t "$DOCKER_IMAGE" .'
                    // "docker scan" has been deprecated in newer Docker releases; "|| true" keeps the scan advisory.
                    sh 'docker scan --severity HIGH,CRITICAL "$DOCKER_IMAGE" || true'
                    sh 'docker push "$DOCKER_IMAGE"'
                }
            }
        }
        stage('Deploy to development') {
            steps {
                // Assumes kubectl is available in this container.
                container('docker') {
                    sh 'mkdir -p ~/.kube'
                    // K8S_CONFIG holds the path of the kubeconfig secret file.
                    sh 'cp "$K8S_CONFIG" ~/.kube/config'
                    sh "sed -i 's|IMAGE_TAG|${ARTIFACT_VERSION}|g' k8s/dev/deployment.yaml"
                    sh "kubectl apply -f k8s/dev/ --namespace=development"
                    sh "kubectl rollout status deployment/${ARTIFACT_NAME} --namespace=development --timeout=180s"
                }
            }
        }
        stage('Deploy to test') {
            steps {
                // Manual approval gate; the build aborts if nobody answers within an hour.
                timeout(time: 1, unit: 'HOURS') {
                    input message: 'Deploy to the test environment?', ok: 'Deploy'
                }
                container('docker') {
                    sh "sed -i 's|IMAGE_TAG|${ARTIFACT_VERSION}|g' k8s/test/deployment.yaml"
                    sh "kubectl apply -f k8s/test/ --namespace=testing"
                    sh "kubectl rollout status deployment/${ARTIFACT_NAME} --namespace=testing --timeout=180s"
                }
            }
        }
        stage('Integration tests') {
            steps {
                container('maven') {
                    // skipUnitTests is a project-defined Maven property, not a built-in one.
                    sh 'mvn verify -DskipUnitTests -Dapi.url=http://${ARTIFACT_NAME}.testing.svc.cluster.local'
                }
            }
        }
    }
    post {
        success {
            slackSend channel: '#deployments',
                      color: 'good',
                      message: "Deployment succeeded: ${ARTIFACT_NAME} v${ARTIFACT_VERSION}"
        }
        failure {
            slackSend channel: '#deployments',
                      color: 'danger',
                      message: "Deployment failed: ${ARTIFACT_NAME} v${ARTIFACT_VERSION}"
        }
        always {
            cleanWs()
        }
    }
}
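The naming scheme used in the pipeline's environment block (build number plus a 7-character commit prefix, then registry/project/name:version) can be checked outside Jenkins with a small Python sketch. The function names `artifact_version` and `image_reference` are hypothetical helpers introduced here for illustration only.

```python
def artifact_version(build_number, git_commit):
    """Combine the build number with the first 7 characters of the commit hash."""
    if len(git_commit) < 7:
        raise ValueError("expected a commit hash of at least 7 characters")
    return f"{build_number}-{git_commit[:7]}"


def image_reference(registry, project, name, version):
    """Build the full image reference in the form registry/project/name:version."""
    return f"{registry}/{project}/{name}:{version}"


if __name__ == "__main__":
    version = artifact_version(42, "0123456789abcdef0123456789abcdef01234567")
    print(image_reference("harbor.example.com", "devops", "microservice", version))
```

Because the tag embeds both the build number and the commit, two builds of the same commit still produce distinct, traceable image tags.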
Advanced GitLab CI/CD Configuration
Multi-environment deployment with GitLab CI/CD:
stages:
  - code-quality
  - build
  - test
  - build-image
  - image-scan
  - deploy-dev
  - deploy-test
  - deploy-staging
  - deploy-prod

variables:
  ARTIFACT_NAME: "microservice"
  DOCKER_REGISTRY: "registry.example.com"
  DOCKER_IMAGE: "${DOCKER_REGISTRY}/${CI_PROJECT_PATH}:${CI_COMMIT_SHORT_SHA}"
  KUBE_CONTEXT_DEV: "dev-cluster"
  KUBE_CONTEXT_TEST: "test-cluster"
  KUBE_CONTEXT_PROD: "prod-cluster"

default:
  image: docker:20.10
  services:
    - docker:20.10-dind
  tags:
    - docker

code-quality-check:
  stage: code-quality
  image: maven:3.8.6-openjdk-11
  script:
    # Folded scalar: the lines below are joined into one command.
    - >
      mvn sonar:sonar
      -Dsonar.projectKey=${CI_PROJECT_NAME}
      -Dsonar.host.url=${SONAR_URL}
      -Dsonar.login=${SONAR_TOKEN}
      -Dsonar.qualitygate.wait=true
  rules:
    - if: $CI_PIPELINE_SOURCE == 'merge_request_event'
    - if: $CI_COMMIT_BRANCH == 'main'

build:
  stage: build
  image: maven:3.8.6-openjdk-11
  script:
    - mvn clean package -DskipTests
  artifacts:
    paths:
      - target/*.jar
    expire_in: 1 week
  rules:
    - if: $CI_PIPELINE_SOURCE == 'push'

unit-test:
  stage: test
  image: maven:3.8.6-openjdk-11
  script:
    - mvn test
  artifacts:
    reports:
      junit: target/surefire-reports/*.xml
      coverage_report:
        coverage_format: cobertura
        path: target/site/cobertura/coverage.xml
  rules:
    - if: $CI_PIPELINE_SOURCE == 'push'

build-docker-image:
  stage: build-image
  script:
    - echo "${CI_REGISTRY_PASSWORD}" | docker login "${DOCKER_REGISTRY}" -u "${CI_REGISTRY_USER}" --password-stdin
    - docker build -t "${DOCKER_IMAGE}" .
    - docker push "${DOCKER_IMAGE}"
  rules:
    - if: $CI_COMMIT_BRANCH == 'develop'
    - if: $CI_COMMIT_BRANCH == 'main'

image-scan:
  stage: image-scan
  image:
    name: aquasec/trivy:latest
    # The image's entrypoint is the trivy binary; clear it so the runner can start a shell.
    entrypoint: [""]
  script:
    - trivy image --severity HIGH,CRITICAL --exit-code 1 "${DOCKER_IMAGE}"
  rules:
    - if: $CI_COMMIT_BRANCH == 'main'

deploy-dev:
  stage: deploy-dev
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  script:
    - mkdir -p ~/.kube
    - echo "${KUBE_CONFIG_DEV}" | base64 -d > ~/.kube/config
    - export KUBECONFIG=~/.kube/config
    - sed -i "s|IMAGE_TAG|${CI_COMMIT_SHORT_SHA}|g" k8s/dev/deployment.yaml
    - kubectl apply -f k8s/dev/ --namespace=development
    - kubectl rollout status deployment/${ARTIFACT_NAME} --namespace=development --timeout=180s
  rules:
    - if: $CI_COMMIT_BRANCH == 'develop'

deploy-test:
  stage: deploy-test
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  script:
    - mkdir -p ~/.kube
    - echo "${KUBE_CONFIG_TEST}" | base64 -d > ~/.kube/config
    - export KUBECONFIG=~/.kube/config
    - sed -i "s|IMAGE_TAG|${CI_COMMIT_SHORT_SHA}|g" k8s/test/deployment.yaml
    - kubectl apply -f k8s/test/ --namespace=testing
    - kubectl rollout status deployment/${ARTIFACT_NAME} --namespace=testing --timeout=180s
  rules:
    - if: $CI_COMMIT_BRANCH == 'main'
      when: manual

deploy-staging:
  stage: deploy-staging
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  script:
    - mkdir -p ~/.kube
    - echo "${KUBE_CONFIG_PROD}" | base64 -d > ~/.kube/config
    - export KUBECONFIG=~/.kube/config
    - sed -i "s|IMAGE_TAG|${CI_COMMIT_SHORT_SHA}|g" k8s/staging/deployment.yaml
    - kubectl apply -f k8s/staging/ --namespace=staging
    - kubectl rollout status deployment/${ARTIFACT_NAME} --namespace=staging --timeout=180s
  rules:
    - if: $CI_COMMIT_BRANCH == 'main'
      when: manual
  environment:
    name: staging
    url: https://staging.example.com

deploy-prod:
  stage: deploy-prod
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  script:
    - mkdir -p ~/.kube
    - echo "${KUBE_CONFIG_PROD}" | base64 -d > ~/.kube/config
    - export KUBECONFIG=~/.kube/config
    # Blue-green deployment: deploy the color that is not currently running.
    # This check only tests whether the blue Deployment exists, so once both
    # colors exist it always picks green; a more robust setup would read the
    # color from the active Service's selector.
    - |
      if kubectl get deployment ${ARTIFACT_NAME}-blue --namespace=production >/dev/null 2>&1; then
        NEW_COLOR=green
        OLD_COLOR=blue
      else
        NEW_COLOR=blue
        OLD_COLOR=green
      fi
    - cp k8s/prod/deployment.yaml k8s/prod/deployment-${NEW_COLOR}.yaml
    - sed -i "s|IMAGE_TAG|${CI_COMMIT_SHORT_SHA}|g" k8s/prod/deployment-${NEW_COLOR}.yaml
    - sed -i "s|${ARTIFACT_NAME}|${ARTIFACT_NAME}-${NEW_COLOR}|g" k8s/prod/deployment-${NEW_COLOR}.yaml
    - kubectl apply -f k8s/prod/deployment-${NEW_COLOR}.yaml --namespace=production
    - kubectl rollout status deployment/${ARTIFACT_NAME}-${NEW_COLOR} --namespace=production --timeout=300s
    - kubectl apply -f k8s/prod/service-${NEW_COLOR}.yaml --namespace=production
    - echo "Deployment finished; waiting for verification..."
  rules:
    - if: $CI_COMMIT_BRANCH == 'main'
      when: manual
  environment:
    name: production
    url: https://api.example.com
Infrastructure as Code (IaC) in Practice
Enterprise-Grade Terraform Configuration Management
Modular Terraform configuration:
# main.tf
provider "aws" {
  region = var.aws_region
}

# VPC module
module "vpc" {
  source             = "./modules/vpc"
  vpc_cidr           = var.vpc_cidr
  vpc_name           = var.project_name
  availability_zones = var.availability_zones
  # Public subnets
  public_subnets = var.public_subnets
  # Private subnets
  private_subnets = var.private_subnets
  # NAT gateways: one per availability zone rather than a single shared one
  enable_nat_gateway = true
  single_nat_gateway = false
}

# Security group module
module "security_groups" {
  source = "./modules/security_groups"
  vpc_id = module.vpc.vpc_id
  # Security group rules
  ssh_allowed_cidrs   = var.ssh_allowed_cidrs
  http_allowed_cidrs  = var.http_allowed_cidrs
  https_allowed_cidrs = var.https_allowed_cidrs
}

# EC2 module
module "ec2_instances" {
  source             = "./modules/ec2"
  instance_count     = var.web_instance_count
  instance_type      = var.web_instance_type
  ami                = var.web_ami
  vpc_id             = module.vpc.vpc_id
  subnet_ids         = module.vpc.private_subnets
  security_group_ids = [module.security_groups.web_sg_id]
  # Tags
  tags = {
    Name        = "${var.project_name}-web"
    Environment = var.environment
  }
}
Terraform deployment flow in CI/CD:
# .gitlab-ci.yml, Terraform section
stages:
  - terraform-plan
  - terraform-apply

variables:
  TF_ROOT: ${CI_PROJECT_DIR}/terraform
  TF_ADDRESS: ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${CI_PROJECT_NAME}

terraform-plan:
  stage: terraform-plan
  image:
    name: hashicorp/terraform:latest
    # The image's entrypoint is the terraform binary; clear it so the runner can start a shell.
    entrypoint: [""]
  script:
    - cd ${TF_ROOT}
    # With CI_JOB_TOKEN as the password, the username must be gitlab-ci-token.
    - >
      terraform init
      -backend-config="address=${TF_ADDRESS}"
      -backend-config="lock_address=${TF_ADDRESS}/lock"
      -backend-config="unlock_address=${TF_ADDRESS}/lock"
      -backend-config="username=gitlab-ci-token"
      -backend-config="password=${CI_JOB_TOKEN}"
      -backend-config="lock_method=POST"
      -backend-config="unlock_method=DELETE"
      -backend-config="retry_wait_min=5"
    - terraform validate
    - terraform plan -out=tfplan
    # Assumes jq is available in the job image.
    - terraform show -json tfplan | jq -r '(.resource_changes[] | [.change.actions[], .type, .change.after.id]) | @tsv' > plan_summary.txt
  artifacts:
    paths:
      - ${TF_ROOT}/tfplan
      - ${TF_ROOT}/plan_summary.txt
  rules:
    - if: $CI_COMMIT_BRANCH == 'main'

terraform-apply:
  stage: terraform-apply
  image:
    name: hashicorp/terraform:latest
    entrypoint: [""]
  script:
    - cd ${TF_ROOT}
    - >
      terraform init
      -backend-config="address=${TF_ADDRESS}"
      -backend-config="lock_address=${TF_ADDRESS}/lock"
      -backend-config="unlock_address=${TF_ADDRESS}/lock"
      -backend-config="username=gitlab-ci-token"
      -backend-config="password=${CI_JOB_TOKEN}"
      -backend-config="lock_method=POST"
      -backend-config="unlock_method=DELETE"
      -backend-config="retry_wait_min=5"
    # Applies exactly the plan file produced and reviewed in the previous stage.
    - terraform apply -auto-approve tfplan
  rules:
    - if: $CI_COMMIT_BRANCH == 'main'
      when: manual
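The plan-summary step can also be done in Python instead of jq. The sketch below assumes the input is the output of `terraform show -json tfplan` and reads only the `resource_changes` entries (`address`, `type`, `change.actions`); the function name `summarize_plan` is hypothetical.

```python
import json
from collections import Counter


def summarize_plan(plan_json):
    """Return (rows, counts) describing the changes in a JSON plan.

    rows is a list of (action, resource_type, address) tuples;
    counts maps each action to the number of resources it affects.
    """
    plan = json.loads(plan_json)
    rows = []
    counts = Counter()
    for change in plan.get("resource_changes", []):
        actions = change["change"]["actions"]
        if actions == ["no-op"]:
            continue  # Unchanged resources are not interesting for review.
        # A replacement is reported as a delete/create pair, in either order.
        if set(actions) == {"create", "delete"}:
            action = "replace"
        else:
            action = actions[0]
        rows.append((action, change["type"], change["address"]))
        counts[action] += 1
    return rows, dict(counts)


if __name__ == "__main__":
    sample = json.dumps({"resource_changes": [
        {"address": "aws_instance.web[0]", "type": "aws_instance",
         "change": {"actions": ["create"]}},
        {"address": "aws_vpc.main", "type": "aws_vpc",
         "change": {"actions": ["no-op"]}},
    ]})
    for row in summarize_plan(sample)[0]:
        print("\t".join(row))
```

A reviewer approving the manual apply job could use the counts to spot unexpected deletes or replacements before anything changes.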
Deployment Strategies and Best Practices
Implementing Blue-Green Deployment
Blue-green deployment on Kubernetes:
# k8s/prod/service-blue.yaml
apiVersion: v1
kind: Service
metadata:
  name: api-service-blue
  namespace: production
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  selector:
    app: api-service
    color: blue
  ports:
    - port: 80
      targetPort: 8080
  type: LoadBalancer
# k8s/prod/service-green.yaml
apiVersion: v1
kind: Service
metadata:
  name: api-service-green
  namespace: production
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  selector:
    app: api-service
    color: green
  ports:
    - port: 80
      targetPort: 8080
  type: LoadBalancer
# k8s/prod/service-active.yaml (points at the currently active version)
apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: production
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  selector:
    app: api-service
    color: blue  # Change this value to switch traffic to the other color
  ports:
    - port: 80
      targetPort: 8080
  type: LoadBalancer
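Switching traffic means changing the `color` value in the active Service's selector. A minimal Python sketch of that step is shown below; `next_color` and `selector_patch` are hypothetical helper names, and the generated JSON is assumed to be passed to something like `kubectl patch service api-service -n production -p '<patch>'`.

```python
import json

COLORS = ("blue", "green")


def next_color(active_color):
    """Return the idle color, i.e. the one new releases should be deployed to."""
    if active_color not in COLORS:
        raise ValueError(f"unknown color: {active_color!r}")
    return "green" if active_color == "blue" else "blue"


def selector_patch(app, color):
    """Build a JSON patch body that points the active Service at the given color."""
    if color not in COLORS:
        raise ValueError(f"unknown color: {color!r}")
    return json.dumps({"spec": {"selector": {"app": app, "color": color}}},
                      sort_keys=True)


if __name__ == "__main__":
    target = next_color("blue")
    print(selector_patch("api-service", target))
```

Rolling back uses the same patch with the previous color, which is the main operational appeal of the blue-green approach.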
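Canary Deployment Strategy
#!/bin/bash
# Canary deployment script (traffic split by replica ratio).
# Assumes two Deployments behind the same Service: the stable one
# (${DEPLOYMENT_NAME}) and a canary one (${DEPLOYMENT_NAME}-canary, a
# hypothetical name) whose Pods match the same Service selector.
# Updating and scaling a single Deployment would replace every Pod at once
# and would not leave an old version to compare against.
set -euo pipefail
DEPLOYMENT_NAME="api-service"
CANARY_NAME="${DEPLOYMENT_NAME}-canary"
NEW_VERSION="1.2.3"
NAMESPACE="production"
IMAGE="registry.example.com/${DEPLOYMENT_NAME}:${NEW_VERSION}"
TOTAL_REPLICAS=6
# 1. Roll the new image out to the canary Deployment with a single replica
kubectl set image deployment/${CANARY_NAME} ${DEPLOYMENT_NAME}=${IMAGE} -n ${NAMESPACE}
kubectl scale deployment/${CANARY_NAME} --replicas=1 -n ${NAMESPACE}
kubectl scale deployment/${DEPLOYMENT_NAME} --replicas=$((TOTAL_REPLICAS - 1)) -n ${NAMESPACE}
# 2. Wait for the canary to become ready
kubectl rollout status deployment/${CANARY_NAME} -n ${NAMESPACE}
# 3. Observe the canary (simulated here with a fixed wait)
echo "Observing the canary for 10 minutes..."
sleep 600
# 4. Raise the canary share to 50% (3 of 6 replicas)
kubectl scale deployment/${CANARY_NAME} --replicas=3 -n ${NAMESPACE}
kubectl scale deployment/${DEPLOYMENT_NAME} --replicas=3 -n ${NAMESPACE}
# 5. Observe again
echo "Observing the 50% rollout for 10 minutes..."
sleep 600
# 6. Promote: move the stable Deployment to the new image, then retire the canary
kubectl set image deployment/${DEPLOYMENT_NAME} ${DEPLOYMENT_NAME}=${IMAGE} -n ${NAMESPACE}
kubectl scale deployment/${DEPLOYMENT_NAME} --replicas=${TOTAL_REPLICAS} -n ${NAMESPACE}
kubectl rollout status deployment/${DEPLOYMENT_NAME} -n ${NAMESPACE}
kubectl scale deployment/${CANARY_NAME} --replicas=0 -n ${NAMESPACE}
echo "Canary deployment complete; the new version is fully rolled out"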
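The fixed waits in the script stand in for real monitoring. A hedged Python sketch of the decision that monitoring would feed is shown below: it computes the approximate traffic share from replica counts and compares the canary's error rate with a baseline. The function names, the `tolerance` and `min_requests` parameters, and their default values are assumptions chosen for illustration.

```python
def canary_share(canary_replicas, stable_replicas):
    """Approximate share of traffic reaching the canary, assuming even load balancing."""
    total = canary_replicas + stable_replicas
    if total == 0:
        raise ValueError("no replicas are running")
    return canary_replicas / total


def canary_decision(canary_errors, canary_requests, baseline_error_rate,
                    tolerance=0.01, min_requests=100):
    """Return 'wait', 'promote' or 'rollback' for the current observation window."""
    if canary_requests < min_requests:
        return "wait"  # Not enough traffic yet to judge the canary.
    error_rate = canary_errors / canary_requests
    if error_rate <= baseline_error_rate + tolerance:
        return "promote"
    return "rollback"


if __name__ == "__main__":
    print(canary_share(3, 3))
    print(canary_decision(canary_errors=50, canary_requests=1000,
                          baseline_error_rate=0.002))
```

In a pipeline, a "rollback" result would scale the canary back to zero instead of moving on to the next step.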
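Case Studies
Enterprise Multi-Environment CI/CD System
System architecture:
[Dev team] → [Code repository (GitLab)] → [CI/CD (GitLab CI)] → [Artifact registry (Harbor)]
                                                                           ↓
[Infrastructure team] → [Infrastructure code] → [IaC pipeline] → [Kubernetes cluster]
                                                                           ↑
                  [Ops team] ← [Monitoring (Prometheus + Grafana)] ← [Logging (ELK)]
Multi-environment configuration management: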
# config_manager.py
"""Render per-environment Kubernetes manifests from Jinja2 templates."""
import argparse
import os

import yaml
from jinja2 import Template


def load_environment_config(env_name):
    """Load the configuration for one environment from configs/<env>.yaml."""
    config_path = f"configs/{env_name}.yaml"
    with open(config_path, 'r', encoding='utf-8') as f:
        return yaml.safe_load(f)


def render_kubernetes_manifests(template_dir, output_dir, config):
    """Render every *.j2 template in template_dir into output_dir."""
    os.makedirs(output_dir, exist_ok=True)
    for filename in sorted(os.listdir(template_dir)):
        if not filename.endswith('.j2'):
            continue
        with open(os.path.join(template_dir, filename), 'r', encoding='utf-8') as f:
            template = Template(f.read())
        rendered_content = template.render(**config)
        # Drop the .j2 suffix and add .yaml unless the name already has a YAML
        # extension (deployment.j2 -> deployment.yaml, service.yaml.j2 -> service.yaml).
        output_filename = filename[:-len('.j2')]
        if not output_filename.endswith(('.yaml', '.yml')):
            output_filename += '.yaml'
        with open(os.path.join(output_dir, output_filename), 'w', encoding='utf-8') as f:
            f.write(rendered_content)
        print(f"Generated: {output_filename}")


def main():
    parser = argparse.ArgumentParser(description='Environment configuration manager')
    parser.add_argument('--env', required=True, help='Environment name (dev/test/staging/prod)')
    parser.add_argument('--output', default='output', help='Output directory')
    args = parser.parse_args()
    # Load the environment configuration
    config = load_environment_config(args.env)
    # Render the manifests
    render_kubernetes_manifests('templates/k8s', args.output, config)


if __name__ == '__main__':
    main()
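One common way to keep the per-environment files small is to store shared defaults once and let each environment override only what differs. The sketch below shows such a layered merge; it is an optional extension, not something `config_manager.py` does, and the function name `deep_merge` and the sample keys are hypothetical.

```python
import copy


def deep_merge(base, override):
    """Return a new dict in which values from override replace or extend base.

    Nested dicts are merged key by key; any other value (including lists)
    in override replaces the value in base. Neither input is modified.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    base = {"replicas": 2, "resources": {"cpu": "250m", "memory": "256Mi"}}
    prod = {"replicas": 6, "resources": {"memory": "1Gi"}}
    print(deep_merge(base, prod))
```

The merged dictionary could then be passed to the template renderer in place of a single environment file.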
Automated Test Integration
Integrating automated tests into CI/CD:
# .gitlab-ci.yml, test section
integration-test:
  stage: integration_test
  image: maven:3.8.6-openjdk-11
  services:
    - name: postgres:13
      alias: postgres
    - name: redis:6
      alias: redis
  variables:
    POSTGRES_USER: test_user
    POSTGRES_PASSWORD: test_password
    POSTGRES_DB: test_db
  script:
    # Folded scalar: the lines below are joined into one command.
    - >
      mvn verify -DskipUnitTests
      -Dspring.datasource.url=jdbc:postgresql://postgres:5432/test_db
      -Dspring.datasource.username=test_user
      -Dspring.datasource.password=test_password
      -Dspring.redis.host=redis
  artifacts:
    reports:
      junit: target/failsafe-reports/*.xml

performance-test:
  stage: performance_test
  image:
    name: locustio/locust
    # The image's entrypoint is the locust binary; clear it so the runner can start a shell.
    entrypoint: [""]
  script:
    - pip install -r requirements-test.txt
    - >
      locust -f performance_tests/locustfile.py
      --host=http://api-service.test.svc.cluster.local
      --headless -u 100 -r 10 --run-time 5m --csv=results
  artifacts:
    paths:
      # With --csv=results, Locust writes results_stats.csv and results_stats_history.csv.
      - results_stats.csv
      - results_stats_history.csv
  rules:
    - if: $CI_COMMIT_BRANCH == 'main'
      when: manual
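The performance job only stores the CSV files; turning them into a pass/fail signal needs one more step. The Python sketch below reads a Locust stats CSV and checks the overall failure ratio. It assumes the file has `Name`, `Request Count`, and `Failure Count` columns and a summary row named `Aggregated`; the function name `check_locust_stats` and the 1% threshold are hypothetical.

```python
import csv
import io


def check_locust_stats(csv_text, max_failure_ratio=0.01):
    """Return (passed, failure_ratio) for the aggregated row of a stats CSV."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total = next((row for row in rows if row["Name"] == "Aggregated"), None)
    if total is None:
        raise ValueError("no 'Aggregated' row found in the stats file")
    requests = int(total["Request Count"])
    failures = int(total["Failure Count"])
    # Treat a run without any requests as a failure rather than a pass.
    ratio = failures / requests if requests else 1.0
    return ratio <= max_failure_ratio, ratio


if __name__ == "__main__":
    sample = (
        "Type,Name,Request Count,Failure Count\n"
        "GET,/api/orders,1000,5\n"
        ",Aggregated,1000,5\n"
    )
    print(check_locust_stats(sample))
```

A CI job could read `results_stats.csv`, call this function, and exit with a non-zero status when the check fails.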
Monitoring and Feedback
Monitoring the CI/CD System
Prometheus scrape configuration:
# prometheus-job-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-job-config
  namespace: monitoring
data:
  ci_cd_jobs.yaml: |
    - job_name: 'jenkins'
      metrics_path: /prometheus
      kubernetes_sd_configs:
        - role: service
      relabel_configs:
        # Keep only the Service named "jenkins"
        - source_labels: [__meta_kubernetes_service_name]
          regex: jenkins
          action: keep
    - job_name: 'gitlab-ci'
      metrics_path: /metrics
      kubernetes_sd_configs:
        - role: service
      relabel_configs:
        # Keep only the Service named "gitlab-ci"
        - source_labels: [__meta_kubernetes_service_name]
          regex: gitlab-ci
          action: keep
Grafana dashboard JSON:
{
  "annotations": { "list": [] },
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": null,
  "links": [],
  "panels": [
    {
      "title": "CI/CD pipeline success rate",
      "type": "gauge",
      "gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 },
      "targets": [
        {
          "expr": "sum(rate(jenkins_builds_duration_seconds_count{result=\"SUCCESS\"}[24h])) / sum(rate(jenkins_builds_duration_seconds_count[24h])) * 100",
          "interval": "",
          "legendFormat": "Success rate"
        }
      ]
    },
    {
      "title": "Average build time",
      "type": "graph",
      "gridPos": { "h": 8, "w": 12, "x": 12, "y": 0 },
      "targets": [
        {
          "expr": "avg(rate(jenkins_builds_duration_seconds_sum[24h]) / rate(jenkins_builds_duration_seconds_count[24h]))",
          "interval": "",
          "legendFormat": "Average build time (seconds)"
        }
      ]
    }
  ],
  "refresh": "30s",
  "schemaVersion": 27,
  "style": "dark",
  "tags": ["ci-cd"],
  "templating": { "list": [] },
  "time": { "from": "now-24h", "to": "now" },
  "timepicker": {},
  "timezone": "",
  "title": "CI/CD system monitoring",
  "uid": "ci-cd-monitoring",
  "version": 1
}
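The two panel expressions reduce to simple arithmetic: the success rate is successful builds divided by all builds, times 100, and the average build time is total duration divided by build count. A small Python sketch makes this concrete; the record layout (`result`, `duration_seconds`) and the function names are assumptions for illustration, not a Jenkins or Prometheus API.

```python
def success_rate(builds):
    """Percentage of builds whose result is SUCCESS (0.0 when there are no builds)."""
    if not builds:
        return 0.0
    succeeded = sum(1 for build in builds if build["result"] == "SUCCESS")
    return succeeded / len(builds) * 100


def average_duration(builds):
    """Mean build duration in seconds (0.0 when there are no builds)."""
    if not builds:
        return 0.0
    return sum(build["duration_seconds"] for build in builds) / len(builds)


if __name__ == "__main__":
    builds = [
        {"result": "SUCCESS", "duration_seconds": 100.0},
        {"result": "FAILURE", "duration_seconds": 200.0},
        {"result": "SUCCESS", "duration_seconds": 300.0},
    ]
    print(round(success_rate(builds), 1), average_duration(builds))
```

In the dashboard, `rate(...[24h])` plays the role of the counts over the window, so both ratios are computed over the last 24 hours rather than over all builds.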
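Summary and Best Practices
- CI/CD system design principles:
  - Security first: every stage should include security checks
  - Observability: monitor every stage of the pipeline
  - Extensibility: modular design that supports integrating multiple tools
  - Consistency: keep environments consistent with one another
  - Reliability: self-healing and rollback mechanisms
- Building a DevOps culture:
  - Break down team silos and form cross-functional teams
  - Promote a culture of automation
  - Establish mechanisms for continuous learning and improvement
  - Define responsibilities and delegate authority clearly
- Enterprise rollout strategy:
  - Start with a small pilot and expand gradually
  - Define clear success metrics
  - Invest in training and knowledge sharing
  - Continuously improve processes and tooling
- Future trends:
  - GitOps practices
  - AI-assisted CI/CD
  - Serverless CI/CD
  - Cloud-native CI/CD architectures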
Further Learning Resources
- Advanced topics:
  - Combining SRE practices with CI/CD
  - Applying chaos engineering in CI/CD
  - CI/CD strategies for large-scale microservices