Master not reachable when "enableConnectionPooler"=True #1304

@yfoelling

Description

Hey,

I have just started using the Zalando postgres-operator.
I am using the Helm charts provided in this repository for both the operator and the UI, at the v1.6.0 tag (586b46d).

All the deployments seem to work fine: if I create a cluster, there are no issues at all.
But as soon as I enable the connection pooler, the status in the UI is stuck at "Waiting for master to become available".
I cannot find any relevant log messages about the issue.

I tried a number of different combinations: other Postgres versions, fewer/more replicas, and creating the postgresql resource via kubectl directly. In all cases the same issue occurs as soon as I enable the connection pooler.
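
For reference, this is roughly the kind of manifest I apply via kubectl (a sketch based on the minimal example manifest in this repository; the cluster name, namespace, and example user/database are placeholders):

apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  # placeholder name following the {team}-{cluster} convention
  name: acid-minimal-cluster
  # must match the namespace the operator watches
  namespace: postgres-operator
spec:
  teamId: "acid"
  numberOfInstances: 2
  volume:
    size: 1Gi
  users:
    # placeholder application role
    foo_user: []
  databases:
    # dbname: owner (placeholders)
    foo: foo_user
  postgresql:
    version: "13"
  # this is the switch that triggers the stuck status
  enableConnectionPooler: true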

I am using Helm v3.4.2 with the charts provided in this repository. I reset all the configuration I had made in the values.yaml files; the only remaining change from the original is the (watched) namespace.

The Kubernetes cluster is version v1.19.4 and my kubectl is 1.18.3.
The cluster consists of multiple nodes on virtual machines provisioned via RKE; storage is provided by Longhorn.

Here is my values.yaml for the operator Helm chart:

image:
  registry: registry.opensource.zalan.do
  repository: acid/postgres-operator
  tag: v1.6.0
  pullPolicy: "IfNotPresent"

# Optionally specify an array of imagePullSecrets.
# Secrets must be manually created in the namespace.
# ref: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
# imagePullSecrets:
  # - name: myRegistryKeySecretName

podAnnotations: {}
podLabels: {}

configTarget: "ConfigMap"

# JSON logging format
enableJsonLogging: false

# general configuration parameters
configGeneral:
  # choose if deployment creates/updates CRDs with OpenAPIV3Validation
  enable_crd_validation: "true"
  # update only the statefulsets without immediately doing the rolling update
  enable_lazy_spilo_upgrade: "false"
  # set the PGVERSION env var instead of providing the version via postgresql.bin_dir in SPILO_CONFIGURATION
  enable_pgversion_env_var: "true"
  # start any new database pod without limitations on shm memory
  enable_shm_volume: "true"
  # enables backwards compatible path between Spilo 12 and Spilo 13 images
  enable_spilo_wal_path_compat: "false"
  # etcd connection string for Patroni. Empty uses K8s-native DCS.
  etcd_host: ""
  # Select if setup uses endpoints (default), or configmaps to manage leader (DCS=k8s)
  # kubernetes_use_configmaps: "false"
  # Spilo docker image
  docker_image: registry.opensource.zalan.do/acid/spilo-13:2.0-p2
  # min number of instances in Postgres cluster. -1 = no limit
  min_instances: "-1"
  # max number of instances in Postgres cluster. -1 = no limit
  max_instances: "-1"
  # period between consecutive repair requests
  repair_period: 5m
  # period between consecutive sync requests
  resync_period: 30m
  # can prevent certain cases of memory overcommitment
  # set_memory_request_to_limit: "false"

  # map of sidecar names to docker images
  # sidecar_docker_images: ""

  # number of routines the operator spawns to process requests concurrently
  workers: "8"

# parameters describing Postgres users
configUsers:
  # postgres username used for replication between instances
  replication_username: standby
  # postgres superuser name to be created by initdb
  super_username: postgres

configKubernetes:
  # default DNS domain of K8s cluster where operator is running
  cluster_domain: cluster.local
  # additional labels assigned to the cluster objects
  cluster_labels: application:spilo
  # label assigned to Kubernetes objects created by the operator
  cluster_name_label: cluster-name
  # annotations attached to each database pod
  # custom_pod_annotations: "keya:valuea,keyb:valueb"

  # key name for annotation that compares manifest value with current date
  # delete_annotation_date_key: "delete-date"

  # key name for annotation that compares manifest value with cluster name
  # delete_annotation_name_key: "delete-clustername"

  # list of annotations propagated from cluster manifest to statefulset and deployment
  # downscaler_annotations: "deployment-time,downscaler/*"

  # enables initContainers to run actions before Spilo is started
  enable_init_containers: "true"
  # toggles pod anti affinity on the Postgres pods
  enable_pod_antiaffinity: "false"
  # toggles PDB to set MinAvailable to 0 or 1
  enable_pod_disruption_budget: "true"
  # enables sidecar containers to run alongside Spilo in the same pod
  enable_sidecars: "true"
  # namespaced name of the secret containing infrastructure roles names and passwords
  # infrastructure_roles_secret_name: postgresql-infrastructure-roles

  # list of annotation keys that can be inherited from the cluster manifest
  # inherited_annotations: owned-by

  # list of label keys that can be inherited from the cluster manifest
  # inherited_labels: application,environment

  # timeout for successful migration of master pods from unschedulable node
  # master_pod_move_timeout: 20m

  # set of labels that a running and active node should possess to be considered ready
  # node_readiness_label: ""

  # namespaced name of the secret containing the OAuth2 token to pass to the teams API
  # oauth_token_secret_name: postgresql-operator

  # defines the template for PDB (Pod Disruption Budget) names
  pdb_name_format: "postgres-{cluster}-pdb"
  # override topology key for pod anti affinity
  pod_antiaffinity_topology_key: "kubernetes.io/hostname"
  # namespaced name of the ConfigMap with environment variables to populate on every pod
  # pod_environment_configmap: "default/my-custom-config"
  # name of the Secret (in cluster namespace) with environment variables to populate on every pod
  # pod_environment_secret: "my-custom-secret"

  # specify the pod management policy of stateful sets of Postgres clusters
  pod_management_policy: "ordered_ready"
  # label assigned to the Postgres pods (and services/endpoints)
  pod_role_label: spilo-role
  # service account definition as JSON/YAML string to be used by postgres cluster pods
  # pod_service_account_definition: ""

  # role binding definition as JSON/YAML string to be used by pod service account
  # pod_service_account_role_binding_definition: ""

  # Postgres pods are terminated forcefully after this timeout
  pod_terminate_grace_period: 5m
  # template for database user secrets generated by the operator
  secret_name_template: "{username}.{cluster}.credentials.{tprkind}.{tprgroup}"
  # set user and group for the spilo container (required to run Spilo as non-root process)
  # spilo_runasuser: "101"
  # spilo_runasgroup: "103"
  # group ID with write-access to volumes (required to run Spilo as non-root process)
  # spilo_fsgroup: "103"

  # whether the Spilo container should run in privileged mode
  spilo_privileged: "false"
  # storage resize strategy, available options are: ebs, pvc, off
  storage_resize_mode: pvc
  # operator watches for postgres objects in the given namespace
  watched_namespace: "postgres-operator"  # changed from the default "*" (all namespaces)

# configure resource requests for the Postgres pods
configPostgresPodResources:
  # CPU limits for the postgres containers
  default_cpu_limit: "1"
  # CPU request value for the postgres containers
  default_cpu_request: 100m
  # memory limits for the postgres containers
  default_memory_limit: 500Mi
  # memory request value for the postgres containers
  default_memory_request: 100Mi
  # hard CPU minimum required to properly run a Postgres cluster
  min_cpu_limit: 250m
  # hard memory minimum required to properly run a Postgres cluster
  min_memory_limit: 250Mi

# timeouts related to some operator actions
configTimeouts:
  # timeout when waiting for the Postgres pods to be deleted
  pod_deletion_wait_timeout: 10m
  # timeout when waiting for pod role and cluster labels
  pod_label_wait_timeout: 10m
  # interval between consecutive attempts waiting for postgresql CRD to be created
  ready_wait_interval: 3s
  # timeout for the complete postgres CRD creation
  ready_wait_timeout: 30s
  # interval to wait between consecutive attempts to check for some K8s resources
  resource_check_interval: 3s
  # timeout when waiting for the presence of a certain K8s resource (e.g. Sts, PDB)
  resource_check_timeout: 10m

# configure behavior of load balancers
configLoadBalancer:
  # DNS zone for cluster DNS name when load balancer is configured for cluster
  db_hosted_zone: db.example.com
  # annotations to apply to service when load balancing is enabled
  # custom_service_annotations: "keyx:valuez,keya:valuea"

  # toggles service type load balancer pointing to the master pod of the cluster
  enable_master_load_balancer: "false"
  # toggles service type load balancer pointing to the replica pod of the cluster
  enable_replica_load_balancer: "false"
  # define external traffic policy for the load balancer
  external_traffic_policy: "Cluster"
  # defines the DNS name string template for the master load balancer cluster
  master_dns_name_format: '{cluster}.{team}.{hostedzone}'
  # defines the DNS name string template for the replica load balancer cluster
  replica_dns_name_format: '{cluster}-repl.{team}.{hostedzone}'

# options to aid debugging of the operator itself
configDebug:
  # toggles verbose debug logs from the operator
  debug_logging: "true"
  # toggles operator functionality that require access to the postgres database
  enable_database_access: "true"

# parameters affecting logging and REST API listener
configLoggingRestApi:
  # REST API listener listens to this port
  api_port: "8080"
  # number of entries in the cluster history ring buffer
  cluster_history_entries: "1000"
  # number of lines in the ring buffer used to store cluster logs
  ring_log_lines: "100"

# configure interaction with non-Kubernetes objects from AWS or GCP
configAwsOrGcp:
  # Additional Secret (aws or gcp credentials) to mount in the pod
  # additional_secret_mount: "some-secret-name"

  # Path to mount the above Secret in the filesystem of the container(s)
  # additional_secret_mount_path: "/some/dir"

  # AWS region used to store EBS volumes
  aws_region: eu-central-1

  # enable automatic migration on AWS from gp2 to gp3 volumes
  enable_ebs_gp3_migration: "false"
  # defines maximum volume size in GB until which auto migration happens
  # enable_ebs_gp3_migration_max_size: "1000"

  # GCP credentials for setting the GOOGLE_APPLICATION_CREDENTIALS environment variable
  # gcp_credentials: ""

  # AWS IAM role to supply in the iam.amazonaws.com/role annotation of Postgres pods
  # kube_iam_role: ""

  # S3 bucket to use for shipping postgres daily logs
  # log_s3_bucket: ""

  # S3 bucket to use for shipping WAL segments with WAL-E
  # wal_s3_bucket: ""

  # GCS bucket to use for shipping WAL segments with WAL-E
  # wal_gs_bucket: ""

# configure K8s cron job managed by the operator
configLogicalBackup:
  # image for pods of the logical backup job (example runs pg_dumpall)
  logical_backup_docker_image: "registry.opensource.zalan.do/acid/logical-backup:v1.6.0"
  # path of google cloud service account json file
  # logical_backup_google_application_credentials: ""

  # prefix for the backup job name
  logical_backup_job_prefix: "logical-backup-"
  # storage provider - either "s3" or "gcs"
  logical_backup_provider: "s3"
  # S3 Access Key ID
  logical_backup_s3_access_key_id: ""
  # S3 bucket to store backup results
  logical_backup_s3_bucket: "my-bucket-url"
  # S3 endpoint url when not using AWS
  logical_backup_s3_endpoint: ""
  # S3 region of bucket
  logical_backup_s3_region: ""
  # S3 Secret Access Key
  logical_backup_s3_secret_access_key: ""
  # S3 server side encryption
  logical_backup_s3_sse: "AES256"
  # backup schedule in the cron format
  logical_backup_schedule: "30 00 * * *"


# automate creation of human users with teams API service
configTeamsApi:
  # team_admin_role will have the rights to grant roles coming from PG manifests
  # enable_admin_role_for_users: "true"

  # operator watches for PostgresTeam CRs to assign additional teams and members to clusters
  enable_postgres_team_crd: "false"
  # toggle to create additional superuser teams from PostgresTeam CRs
  # enable_postgres_team_crd_superusers: "false"

  # toggle to grant superuser to team members created from the Teams API
  # enable_team_superuser: "false"

  # toggles usage of the Teams API by the operator
  enable_teams_api: "false"
  # should contain a URL to use for authentication (username and token)
  # pam_configuration: https://info.example.com/oauth2/tokeninfo?access_token= uid realm=/employees

  # operator will add all team member roles to this group and add a pg_hba line
  # pam_role_name: zalandos

  # List of teams which members need the superuser role in each Postgres cluster
  # postgres_superuser_teams: "postgres_superusers"

  # List of roles that cannot be overwritten by an application, team or infrastructure role
  # protected_role_names: "admin"

  # role name to grant to team members created from the Teams API
  # team_admin_role: "admin"

  # postgres config parameters to apply to each team member role
  # team_api_role_configuration: "log_statement:all"

  # URL of the Teams API service
  # teams_api_url: http://fake-teams-api.default.svc.cluster.local

# configure connection pooler deployment created by the operator
configConnectionPooler:
  # db schema to install lookup function into
  connection_pooler_schema: "pooler"
  # db user for pooler to use
  connection_pooler_user: "pooler"
  # docker image
  connection_pooler_image: "registry.opensource.zalan.do/acid/pgbouncer:master-9"
  # max db connections the pooler should hold
  connection_pooler_max_db_connections: "60"
  # default pooling mode
  connection_pooler_mode: "transaction"
  # number of pooler instances
  connection_pooler_number_of_instances: "2"
  # default resources
  connection_pooler_default_cpu_request: 500m
  connection_pooler_default_memory_request: 100Mi
  connection_pooler_default_cpu_limit: "1"
  connection_pooler_default_memory_limit: 100Mi

rbac:
  # Specifies whether RBAC resources should be created
  create: true

crd:
  # Specifies whether custom resource definitions should be created
  # When using helm3, this is ignored; instead use "--skip-crds" to skip.
  create: true

serviceAccount:
  # Specifies whether a ServiceAccount should be created
  create: true
  # The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name:

podServiceAccount:
  # The name of the ServiceAccount to be used by postgres cluster pods
  # If not set a name is generated using the fullname template and "-pod" suffix
  name: "postgres-pod"

# priority class for operator pod
priorityClassName: ""

# priority class for database pods
podPriorityClassName: ""

resources:
  limits:
    cpu: 500m
    memory: 500Mi
  requests:
    cpu: 100m
    memory: 250Mi

# Affinity for pod assignment
# Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinity: {}

# Tolerations for pod assignment
# Ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
tolerations: []

# Node labels for pod assignment
# Ref: https://kubernetes.io/docs/user-guide/node-selection/
nodeSelector: {}

controllerID:
  # Specifies whether a controller ID should be defined for the operator
  # Note: all postgres manifests must then contain the following annotation to be found by this operator
  # "acid.zalan.do/controller": <controller-ID-of-the-operator>
  create: false
  # The name of the controller ID to use.
  # If not set and create is true, a name is generated using the fullname template
  name:
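
For completeness: I did not override any connection pooler settings per cluster either. If I read the docs correctly, the postgresql manifest would accept a connectionPooler section roughly like the sketch below (field names as documented for v1.6.0; the values shown are just the chart defaults from above):

spec:
  # per-cluster overrides for the pooler (sketch; all values are the operator defaults)
  connectionPooler:
    numberOfInstances: 2
    mode: "transaction"
    schema: "pooler"
    user: "pooler"
    resources:
      requests:
        cpu: 500m
        memory: 100Mi
      limits:
        cpu: "1"
        memory: 100Mi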

And here is the values.yaml for the UI Helm chart:

# Default values for postgres-operator-ui.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

replicaCount: 1

# configure ui image
image:
  registry: registry.opensource.zalan.do
  repository: acid/postgres-operator-ui
  tag: v1.6.0
  pullPolicy: "IfNotPresent"

# Optionally specify an array of imagePullSecrets.
# Secrets must be manually created in the namespace.
# ref: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
# imagePullSecrets:
#   - name: 

rbac:
  # Specifies whether RBAC resources should be created
  create: true

serviceAccount:
  # Specifies whether a ServiceAccount should be created
  create: true
  # The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name:

# configure UI pod resources
resources:
  limits:
    cpu: 200m
    memory: 200Mi
  requests:
    cpu: 100m
    memory: 100Mi

# configure UI ENVs
envs:
  # IMPORTANT: While the operator chart and the UI chart are independent, this is the interface between
  # UI and operator API. Insert the service name of the operator API here!
  operatorApiUrl: "http://postgres-operator:8080"
  operatorClusterNameLabel: "cluster-name"
  resourcesVisible: "False"
  targetNamespace: "postgres-operator"

# configure UI service
service:
  type: "ClusterIP"
  port: "80"
  # If the type of the service is NodePort, a port can be specified using the nodePort field
  # If the nodePort field is not specified, or if it has no value, then a random port is used
  # nodePort: 32521

# configure UI ingress. If needed: "enabled: true"
ingress:
  enabled: false
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  hosts:
    - host: ui.example.org
      paths: [""]
  tls: []
  #  - secretName: ui-tls
  #    hosts:
  #      - ui.example.org

I would really appreciate any help. If you need any additional info, feel free to ask :-)

Thanks in advance.
