How to Deploy a Standby/Temporary Cluster Based on Percona Operator for PostgreSQL
Having a standby cluster ensures maximum data availability and provides a disaster recovery solution. In this blog post, we will cover how to set up a standby cluster using streaming replication, and also how to create a temporary/standby cluster that uses a remote pgBackRest repository. The source and target clusters can be deployed in different namespaces, regions, or data centers, with no dependency on each other.
Let's dive into each of these procedures.
Building a standby cluster using streaming replication
- Below is the main/primary cluster, which is already set up and running:
kubectl get pods -n postgres-operator
NAME READY STATUS RESTARTS AGE
cluster1-backup-wffk-9lbcf 0/1 Completed 0 2d22h
cluster1-instance1-wltm-0 4/4 Running 1 (6h39m ago) 22h
cluster1-pgbouncer-556659fb94-szvjt 2/2 Running 0 3d21h
cluster1-repo-host-0 2/2 Running 0 2d22h
percona-postgresql-operator-6746bff4c7-729z5 1/1 Running 3 (11h ago) 3d21h
In order for the standby cluster to connect to the primary one, we need to expose the service in the below section of the [cr.yaml] file:
  image: docker.io/percona/percona-postgresql-operator:2.7.0-ppg17.5.2-postgres
  imagePullPolicy: Always
  postgresVersion: 17
#  port: 5432
  expose:
#    annotations:
#      my-annotation: value1
#    labels:
#      my-label: value2
    type: ClusterIP
Apply the change:
kubectl apply -f cr.yaml -n postgres-operator
Then check the services:
kubectl get services -n postgres-operator
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cluster1-ha ClusterIP 10.43.101.40 <none> 5432/TCP 2d15h
cluster1-ha-config ClusterIP None <none> <none> 2d15h
cluster1-pgbouncer ClusterIP 10.43.149.182 <none> 5432/TCP 2d15h
cluster1-pods ClusterIP None <none> <none> 2d15h
cluster1-primary ClusterIP None <none> 5432/TCP 2d15h
cluster1-replicas ClusterIP 10.43.85.169 <none> 5432/TCP 2d15h
The exact endpoint details below will be used later in the standby cluster configuration (note that a ClusterIP service is reachable only from within the same Kubernetes cluster; a standby running in a different cluster or data center would need the primary exposed as LoadBalancer or NodePort instead):
<service-name>.<namespace>.svc.cluster.local
For example:
cluster1-ha.postgres-operator.svc.cluster.local
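Before moving on, it can be worth a quick sanity check that this endpoint answers from the standby side. A minimal sketch, assuming the standby instance pod shown later in this post already exists and that pg_isready is on the PATH of its database container:
# Confirm the exposed primary service resolves and accepts connections on 5432.
kubectl exec -it cluster1-instance1-ft6m-0 -n postgres-operator2 -c database -- \
  pg_isready -h cluster1-ha.postgres-operator.svc.cluster.local -p 5432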
- Next, we need to make sure all the certificates are copied from the primary cluster and that the same certificates are deployed on the standby cluster, which is set up under a different namespace [postgres-operator2]:
kubectl get secret cluster1-cluster-ca-cert -n postgres-operator -o yaml > backup-cluster1-cluster-ca-cert.yaml
kubectl get secret cluster1-cluster-cert -n postgres-operator -o yaml > backup-cluster1-cluster-cert.yaml
kubectl get secret cluster1-replication-cert -n postgres-operator -o yaml > backup-cluster1-replication-cert.yaml
Take a backup of the old certificates on the newly set up/standby cluster and then remove them (if required):
kubectl get secret cluster1-cluster-ca-cert -n postgres-operator2 -o yaml > backup-cluster1-cluster-ca-cert.yaml
kubectl get secret cluster1-cluster-cert -n postgres-operator2 -o yaml > backup-cluster1-cluster-cert.yaml
kubectl get secret cluster1-replication-cert -n postgres-operator2 -o yaml > backup-cluster1-replication-cert.yaml
Then delete the old secrets:
kubectl delete secret cluster1-cluster-ca-cert -n postgres-operator2
kubectl delete secret cluster1-cluster-cert -n postgres-operator2
kubectl delete secret cluster1-replication-cert -n postgres-operator2
Before applying the new secrets, make sure the namespace inside the exported manifests is changed to the one of the new cluster [postgres-operator2]; one way to do that is sketched below.
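A minimal sketch using sed, assuming the manifests were exported with the file names used above and that the namespace field appears exactly as exported; adjust as needed:
# Rewrite the namespace field in the exported secret manifests (GNU sed shown).
sed -i 's/namespace: postgres-operator$/namespace: postgres-operator2/' \
  backup-cluster1-cluster-ca-cert.yaml \
  backup-cluster1-cluster-cert.yaml \
  backup-cluster1-replication-cert.yaml
Then apply the updated manifests in the standby namespace: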
kubectl apply -f backup-cluster1-cluster-ca-cert.yaml -n postgres-operator2
kubectl apply -f backup-cluster1-cluster-cert.yaml -n postgres-operator2
kubectl apply -f backup-cluster1-replication-cert.yaml -n postgres-operator2
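To double-check that the re-applied secrets carry exactly the same certificate material in both namespaces, the data of each secret can be compared. A small sketch, assuming jq is available on the workstation (repeat for the other two secrets):
# The two checksums should be identical if the certificate data matches.
for ns in postgres-operator postgres-operator2; do
  kubectl get secret cluster1-cluster-cert -n "$ns" -o json | jq -S .data | sha256sum
done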
- If we change the certificate names to anything different, the corresponding changes need to be made in the standby [cr.yaml] file and re-applied there:
  secrets:
#    customRootCATLSSecret:
#      name: cluster1-ca-cert
#      items:
#        - key: "tls.crt"
#          path: "root.crt"
#        - key: "tls.key"
#          path: "root.key"
    customTLSSecret:
      name: cluster1-cert
    customReplicationTLSSecret:
      name: replication1-cert
Additionally, we need to enable the standby option and add the primary endpoint details in the standby [cr.yaml]:
  standby:
    enabled: true
    host: cluster1-ha.postgres-operator.svc.cluster.local
- Finally, we can deploy the modified changes:
kubectl apply -f deploy/cr.yaml -n postgres-operator2
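The rollout can then be followed until the standby instance reports ready; the overall state is also visible on the PerconaPGCluster resource itself. A quick sketch:
# Watch the standby pods come up in the new namespace.
kubectl get pods -n postgres-operator2 -w
# Check the cluster state reported by the operator.
kubectl get perconapgclusters -n postgres-operator2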
If the changes are not reflected, make sure to delete the pod and the associated PVC:
kubectl delete pvc cluster1-instance1-ft6m-pgdata -n postgres-operator2
kubectl delete pod cluster1-instance1-ft6m-0 -n postgres-operator2
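Before verifying, some sample data can be created on the primary so there is something to compare. A hedged sketch; the hello database and h1 table simply mirror what appears in the outputs below:
# Create a test database and table on the primary instance.
kubectl exec -it cluster1-instance1-wltm-0 -n postgres-operator -c database -- \
  psql -c "CREATE DATABASE hello;"
kubectl exec -it cluster1-instance1-wltm-0 -n postgres-operator -c database -- \
  psql -d hello -c "CREATE TABLE h1 (id int); INSERT INTO h1 VALUES (1);"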
- Verify the changes on the standby cluster
Primary/leader cluster:
kubectl exec -it cluster1-instance1-wltm-0 -n postgres-operator -- sh
hello=# \dt
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | h1   | table | postgres
(1 row)
Standby cluster:
kubectl exec -it cluster1-instance1-ft6m-0 -n postgres-operator2 -- sh
hello=# \dt
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | h1   | table | postgres
(1 row)
Patroni also reports the instance as a standby leader:
sh-5.1$ patronictl list
+ Cluster: cluster1-ha (7569663519331602522) -------------------------+----------------+---------------------+----+-----------+
| Member                    | Host                                    | Role           | State               | TL | Lag in MB |
+---------------------------+-----------------------------------------+----------------+---------------------+----+-----------+
| cluster1-instance1-ft6m-0 | cluster1-instance1-ft6m-0.cluster1-pods | Standby Leader | in archive recovery |  6 |           |
+---------------------------+-----------------------------------------+----------------+---------------------+----+-----------+
sh-5.1$
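Beyond patronictl, the recovery state can also be checked directly in PostgreSQL. A small sketch, assuming the database container is named database, as in a default operator v2 deployment; note that pg_stat_wal_receiver is only populated while WAL is actually being streamed and stays empty while the node replays from archive only:
# pg_is_in_recovery() should return 't' on the standby leader.
kubectl exec -it cluster1-instance1-ft6m-0 -n postgres-operator2 -c database -- \
  psql -c "SELECT pg_is_in_recovery();" \
       -c "SELECT status, sender_host, sender_port FROM pg_stat_wal_receiver;"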
Building a standby/temporary cluster using a pgBackRest repository
- Consider the below standby cluster:
kubectl get pods -n postgres-operator2
NAME READY STATUS RESTARTS AGE
cluster1-instance1-ft6m-0 4/4 Running 0 36h
cluster1-pgbouncer-556659fb94-qk2ng 2/2 Running 0 2d15h
cluster1-repo-host-0 2/2 Running 0 2d15h
percona-postgresql-operator-6746bff4c7-w7l9h 1/1 Running 0 3d11h
- Next, we need to set up our bucket/S3 credentials in a Secret file:
cat <<EOF | base64 -b 0
[global]
repo1-s3-key=minioadmin
repo1-s3-key-secret=minioadmin
EOF
(The -b 0 option disables line wrapping in the macOS base64; on GNU/Linux, base64 -w 0 is the equivalent.)
Output:
W2dsb2JhbF0KcmVwbzEtczMta2V5PW1pbmlvYWRtaW4KcmVwbzEtczMta2V5LXNlY3JldD1taW5pb2FkbWluCg==
Place the encoded value into a Secret manifest [cluster1-pgbackrest-secrets.yaml]:
apiVersion: v1
kind: Secret
metadata:
  name: cluster1-pgbackrest-secrets
type: Opaque
data:
  s3.conf: W2dsb2JhbF0KcmVwbzEtczMta2V5PW1pbmlvYWRtaW4KcmVwbzEtczMta2V5LXNlY3JldD1taW5pb2FkbWluCg==
Apply the Secret:
kubectl apply -f deploy/cluster1-pgbackrest-secrets.yaml -n postgres-operator2
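As an alternative to hand-encoding the credentials, the same Secret could be created directly from a local configuration file; this sketch is equivalent to the manifest above, so use only one of the two approaches for a given secret name:
# Write the pgBackRest S3 credentials to a local file...
cat > s3.conf <<'EOF'
[global]
repo1-s3-key=minioadmin
repo1-s3-key-secret=minioadmin
EOF
# ...and let kubectl handle the base64 encoding.
kubectl -n postgres-operator2 create secret generic cluster1-pgbackrest-secrets \
  --from-file=s3.conf=s3.conf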
Note - For configuring other storage types (GCS, Azure Blob Storage, etc.), please refer to the manual: https://docs.percona.com/percona-operator-for-postgresql/2.0/backups-storage.html#__tabbed_1_3
- Once the Secret file is deployed, we need to add the remote bucket/endpoint details, along with the above secret [cluster1-pgbackrest-secrets], to the pgBackRest backup section of the [cr.yaml] file. The backups stored in the remote S3 repository are initiated by the main/primary cluster, which runs a similar pgBackRest configuration (see the on-demand backup sketch right after the block below):
  backups:
#    trackLatestRestorableTime: true
    pgbackrest:
#      metadata:
#        labels:
      image: docker.io/percona/percona-pgbackrest:2.55.0
#      initContainer:
#        image: docker.io/percona/percona-postgresql-operator:2.7.0
#        resources:
#          limits:
#            cpu: 2.0
#            memory: 4Gi
#          requests:
#            cpu: 1.0
#            memory: 3Gi
#        containerSecurityContext:
#          runAsUser: 1001
#          runAsGroup: 1001
#          runAsNonRoot: true
#          privileged: false
#          allowPrivilegeEscalation: false
#          readOnlyRootFilesystem: true
#          capabilities:
#            add:
#              - NET_ADMIN
#              - SYS_TIME
#            drop:
#              - ALL
#          seccompProfile:
#            type: Localhost
#            localhostProfile: localhost/profile.json
#          procMount: Default
#          seLinuxOptions:
#            type: spc_t
#            level: s0:c123,c456
#      containers:
#        pgbackrest:
#          resources:
#            limits:
#              cpu: 200m
#              memory: 128Mi
#            requests:
#              cpu: 150m
#              memory: 120Mi
#        pgbackrestConfig:
#          resources:
#            limits:
#              cpu: 200m
#              memory: 128Mi
#            requests:
#              cpu: 150m
#              memory: 120Mi
#
      configuration:
        - secret:
            name: cluster1-pgbackrest-secrets
#      jobs:
#        restartPolicy: OnFailure
#        backoffLimit: 2
#        priorityClassName: high-priority
#        ttlSecondsAfterFinished: 60
#        resources:
#          limits:
#            cpu: 200m
#            memory: 128Mi
#          requests:
#            cpu: 150m
#            memory: 120Mi
#        tolerations:
#          - effect: NoSchedule
#            key: role
#            operator: Equal
#            value: connection-poolers
#
#        securityContext:
#          fsGroup: 1001
#          runAsUser: 1001
#          runAsNonRoot: true
#          fsGroupChangePolicy: "OnRootMismatch"
#          runAsGroup: 1001
#          seLinuxOptions:
#            type: spc_t
#            level: s0:c123,c456
#          seccompProfile:
#            type: Localhost
#            localhostProfile: localhost/profile.json
#          supplementalGroups:
#            - 1001
#          sysctls:
#            - name: net.ipv4.tcp_keepalive_time
#              value: "600"
#            - name: net.ipv4.tcp_keepalive_intvl
#              value: "60"
#
      global:
#        repo1-retention-full: "14"
#        repo1-retention-full-type: time
        repo1-path: /pgbackrest/postgres-operator/cluster1/repo1
#        repo1-cipher-type: aes-256-cbc
        repo1-s3-uri-style: path
        repo1-s3-verify-tls: 'n'
#        repo2-path: /pgbackrest/postgres-operator/cluster1-multi-repo/repo2
#        repo3-path: /pgbackrest/postgres-operator/cluster1-multi-repo/repo3
#        repo4-path: /pgbackrest/postgres-operator/cluster1-multi-repo/repo4
      repoHost:
#        resources:
#          limits:
#            cpu: 200m
#            memory: 128Mi
#          requests:
#            cpu: 150m
#            memory: 120Mi
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 1
                podAffinityTerm:
                  labelSelector:
                    matchLabels:
                      postgres-operator.crunchydata.com/data: pgbackrest
                  topologyKey: kubernetes.io/hostname
#        tolerations:
#          - effect: NoSchedule
#            key: role
#            operator: Equal
#            value: connection-poolers
#        priorityClassName: high-priority
#
#        topologySpreadConstraints:
#          - maxSkew: 1
#            topologyKey: my-node-label
#            whenUnsatisfiable: ScheduleAnyway
#            labelSelector:
#              matchLabels:
#                postgres-operator.crunchydata.com/pgbackrest: ""
#
#        securityContext:
#          fsGroup: 1001
#          runAsUser: 1001
#          runAsNonRoot: true
#          fsGroupChangePolicy: "OnRootMismatch"
#          runAsGroup: 1001
#          seLinuxOptions:
#            type: spc_t
#            level: s0:c123,c456
#          seccompProfile:
#            type: Localhost
#            localhostProfile: localhost/profile.json
#          supplementalGroups:
#            - 1001
#          sysctls:
#            - name: net.ipv4.tcp_keepalive_time
#              value: "600"
#            - name: net.ipv4.tcp_keepalive_intvl
#              value: "60"
#
      manual:
        repoName: repo1
        options:
          - --type=full
#        initialDelaySeconds: 120
      repos:
#        - name: repo1
#          schedules:
#            full: "0 0 * * 6"
#            differential: "0 1 * * 1-6"
#            incremental: "0 1 * * 1-6"
#          volume:
#            volumeClaimSpec:
#              storageClassName: standard
#              accessModes:
#                - ReadWriteOnce
#              resources:
#                requests:
#                  storage: 1Gi
        - name: repo1
          s3:
            bucket: "ajtest"
            endpoint: "https://host.k3d.internal:9000"
            region: "us-east-1"
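On the primary side, a matching pgBackRest configuration points at the same bucket, and that is where the backups in the shared repo come from. If a fresh backup set is needed for the standby to pick up, an on-demand full backup can be triggered there through a PerconaPGBackup resource; a hedged sketch, with an arbitrary resource name:
# Request an on-demand full backup of the primary cluster into repo1.
cat <<EOF | kubectl apply -n postgres-operator -f -
apiVersion: pgv2.percona.com/v2
kind: PerconaPGBackup
metadata:
  name: cluster1-ondemand-full
spec:
  pgCluster: cluster1
  repoName: repo1
  options:
    - --type=full
EOF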
- Also, enable standby in the [cr.yaml] file and mention the target repo name:
  standby:
    enabled: true
    repoName: repo1
Finally, we can apply the modifications:
kubectl apply -f deploy/cr.yaml -n postgres-operator2
- Verify the data synchronization
The existing pgBackRest backups are now listed on the standby side as well:
kubectl exec -it cluster1-repo-host-0 -n postgres-operator -- sh
sh-5.1$ pgbackrest info
stanza: db
    status: ok
    cipher: none

    db (current)
        wal archive min/max (17): 00000002000000000000000B/000000060000000000000022

        full backup: 20251107-164421F
            timestamp start/stop: 2025-11-07 16:44:21+00 / 2025-11-07 16:44:24+00
            wal start/stop: 00000002000000000000000C / 00000002000000000000000C
            database size: 30.7MB, database backup size: 30.7MB
            repo1: backup set size: 4MB, backup size: 4MB

        full backup: 20251107-165613F
            timestamp start/stop: 2025-11-07 16:56:13+00 / 2025-11-07 16:56:17+00
            wal start/stop: 000000020000000000000013 / 000000020000000000000013
            database size: 38.3MB, database backup size: 38.3MB
            repo1: backup set size: 5MB, backup size: 5MB

        full backup: 20251111-070032F
            timestamp start/stop: 2025-11-11 07:00:32+00 / 2025-11-11 07:00:35+00
            wal start/stop: 000000060000000000000025 / 000000060000000000000026
            database size: 38.8MB, database backup size: 38.8MB
            repo1: backup set size: 5.1MB, backup size: 5.1MB
Also, if we access the standby database, the synced data is reflected there:
kubectl exec -it cluster1-instance1-ft6m-0 -n postgres-operator2 -- sh
sh-5.1$ psql
psql (17.5 - Percona Server for PostgreSQL 17.5.2)
Type "help" for help.
postgres=# \c hello
You are now connected to database "hello" as user "postgres".
hello=# \dt
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | h1   | table | postgres
(1 row)
If the changes are not reflected, try deleting the old pod/PVC:
kubectl delete pod <pod_name> -n <namespace>
kubectl delete pvc <pvc_name> -n <namespace>
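If the goal is a one-time clone rather than a continuously synced standby, the cluster can later be promoted to a standalone read-write cluster by turning standby mode off, either by setting standby.enabled to false in cr.yaml and re-applying it, or with a patch along these lines (a sketch):
# Promote the standby: disable standby mode on the PerconaPGCluster resource.
kubectl patch perconapgcluster cluster1 -n postgres-operator2 --type merge \
  -p '{"spec":{"standby":{"enabled":false}}}'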
Summary
The procedures we discussed above basically outline a couple of ways of deploying a new standalone/standby cluster from a source primary cluster in a Kubernetes/Percona Operator based environment. This also gives the flexibility either to serve a continuous data flow or to just build a one-time cluster with an exact data set.