Cluster controller first iteration #1688
DmytroPI-dev wants to merge 4 commits into feature/database-controllers
Conversation
CLA Assistant Lite bot: I have read the CLA Document and I hereby sign the CLA. You can retrigger this bot by commenting recheck in this Pull Request.
// Storage overrides the storage size from ClusterClass.
// Example: "5Gi"
// +optional
// +kubebuilder:validation:XValidation:rule="self == null || oldSelf == null || quantity(self).compareTo(quantity(oldSelf)) >= 0",message="storage size can only be increased"
We aren't comparing it to the value inherited from ClusterClass; is there any mechanism that would also prohibit the following scenario? Would CNPG block this behaviour?
- Create the cluster without explicitly specifying the Storage (it's inherited from the ClusterClass), e.g. 20Gi
- Cluster created
- Putting 10Gi here
Storage is not downgradable, but good point, will check. Did I understand you correctly, @limak9182, that you are describing a situation where a user created a default cluster and then updated the spec?
api/v4/cluster_types.go
@@ -0,0 +1,111 @@
/*
Copyright 2021.
// Storage overrides the storage size from ClusterClass.
// Example: "5Gi"
// +optional
// +kubebuilder:validation:XValidation:rule="self == null || oldSelf == null || quantity(self).compareTo(quantity(oldSelf)) >= 0",message="storage size can only be increased"
If I understand this rule correctly, self == null allows a situation where the user adds the attribute with a value and then removes it. In that case we would rely on the class-provided value, which might be lower. IMO we should not allow such behaviour; once set, the attribute should not be removed.
But it also checks that the quantity is not lower than what was set previously, so will check, but it should be fine.
I don't think this is the way it works. You put || between the rules, so if either rule succeeds the whole validation passes, i.e.:
- self refers to the current/new value being validated
- oldSelf refers to the previous value (when updating an existing resource)
- self == null || oldSelf == null - if either value is null, the validation passes (this allows creation and deletion)
- quantity(self).compareTo(quantity(oldSelf)) >= 0 - if both values exist, convert them to Kubernetes quantities and compare them; the new value must be greater than or equal to the old value
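For what it's worth, one pattern that can also block removing a once-set value is a transition rule on the parent struct, since a rule scoped to the field itself is not evaluated when the field is absent. A minimal sketch, assuming the field is called storage on ClusterSpec (not the PR's actual definition):

```go
// Sketch only: field and type names are assumptions mirroring the PR.
// The struct-level transition rule rejects updates that drop a previously
// set storage value; the field-level rule keeps the size monotonically
// increasing when both old and new values are present.
// +kubebuilder:validation:XValidation:rule="!has(oldSelf.storage) || has(self.storage)",message="storage size cannot be removed once set"
type ClusterSpec struct {
	// Storage overrides the storage size from ClusterClass.
	// +optional
	// +kubebuilder:validation:XValidation:rule="self == null || oldSelf == null || quantity(self).compareTo(quantity(oldSelf)) >= 0",message="storage size can only be increased"
	Storage *string `json:"storage,omitempty"`
}
```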
I have added separate validation methods to check values inherited from ClusterClass, as, IMO, with CEL I can only check against old/new values from the Cluster, not the ClusterClass.
Will update the logic, you are correct.
And will update for PG version as well
| } | ||
|
|
||
| func init() { | ||
| SchemeBuilder.Register(&Cluster{}, &ClusterList{}) |
I initially named the CRDs ClusterClass, Cluster and Database, but when I look at this right now I think we can do better. Maybe we can rename ClusterClass to DatabaseClusterClass, Cluster to DatabaseCluster, and leave Database as is?
We should discuss it, I have some ideas as well.
…nt copyright year.
// CNPG config status, for reference when getting cluster status.
// +optional
CNPGConfigStatus *CNPGConfig `json:"cnpgConfigStatus,omitempty"`
We probably shouldn't return CNPG-specific config in our Cluster status directly. The aim of this abstraction is to allow swapping the lower-level PG operator, so the Status (and input) should be generic.
I agree, but we'd probably need some more details when describing our cluster, e.g. image version, storage size, etc., which are available in clusters.postgresql.cnpg.io; that's why I put it here (not working yet).
You have most of this info when you describe your Cluster CR, as you should see the merged params (from the class and the cluster CR itself). Here we should keep the status, not static config.
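As an illustration of the generic direction (field names here are made up, not from the PR), the status could report observed, operator-agnostic facts rather than embedding CNPG config:

```go
package v4

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Illustrative sketch only: an operator-agnostic status shape, so the CNPG
// dependency stays out of the public API.
type ClusterStatus struct {
	// Phase summarises the lifecycle, e.g. Pending, Ready, Failed.
	Phase string `json:"phase,omitempty"`
	// ObservedPostgresVersion is the version actually running, as reported
	// by the underlying operator (CNPG today, possibly another one later).
	ObservedPostgresVersion string `json:"observedPostgresVersion,omitempty"`
	// ObservedStorageSize is the currently provisioned storage size.
	ObservedStorageSize string `json:"observedStorageSize,omitempty"`
	// Conditions follow the standard metav1.Condition conventions.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}
```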
@@ -0,0 +1,9 @@
apiVersion: enterprise.splunk.com/v4
Since these are examples, should we also create the class postgresql-dev? Otherwise this example fails.
But you've created it already, in enterprise_v4_clusterclass_dev.yaml
Yes, but examples should probably be atomic?
Name: resourceName,
Namespace: "default",
},
// TODO(user): Specify other spec details if needed.
Do you plan to fill in these comments?
This comes from the boilerplate; I didn't write any tests yet.
OK, nvm, let's leave it for now.
if statusErr := r.updateClusterStatus(ctx, cluster, cnpgCluster, err); statusErr != nil {
	logger.Error(statusErr, "Failed to update Cluster status after ensure up to date error")
}
return ctrl.Result{}, err
Should we return a combined err if statusErr is also present?
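If the answer is yes, a minimal sketch with the standard library (requires Go 1.20+ and `import "errors"`; identifiers follow the snippet above):

```go
if statusErr := r.updateClusterStatus(ctx, cluster, cnpgCluster, err); statusErr != nil {
	logger.Error(statusErr, "Failed to update Cluster status after ensure up to date error")
	// errors.Join returns both failures; callers can still unwrap each one
	// with errors.Is / errors.As.
	return ctrl.Result{}, errors.Join(err, statusErr)
}
return ctrl.Result{}, err
```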
| } | ||
| return ctrl.Result{}, err | ||
| } | ||
| if updated { |
Do we need this variable? Maybe the returned err is enough: if err == nil we return "Cluster is up to date". You can add additional logs to ensureClusterUpToDate if you want, and that will be enough.
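A rough sketch of the suggested simplification (the ensureClusterUpToDate signature shown is an assumption about the refactor, not the current code):

```go
// Rely on the error alone: nil means the CNPG cluster already matches
// the merged configuration.
if err := r.ensureClusterUpToDate(ctx, cluster, cnpgCluster); err != nil {
	return ctrl.Result{}, err
}
logger.Info("Cluster is up to date")
return ctrl.Result{}, nil
```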
resultConfig.PostgresVersion = clusterClassDefaulfConfig.PostgresVersion
}
if resultConfig.Resources == nil {
	resultConfig.Resources = clusterClassDefaulfConfig.Resources
What happens if clusterClassDefaulfConfig.Resources is also nil? We don't specify a default anywhere, and we cast this value to int at https://github.com/splunk/splunk-operator/pull/1688/changes#diff-63aa8a2ed732cb58bc15b9cb714fe27e1f4ba134541b9b0219867d8e9f2b2326R218.
resultConfig.Storage = clusterClassDefaulfConfig.Storage
}
if len(resultConfig.PostgreSQLConfig) == 0 {
	resultConfig.PostgreSQLConfig = clusterClassDefaulfConfig.PostgreSQLConfig
nil: We should consider initializing clusterClassDefaultConfig.PostgreSQLConfig when it is nil. While nil is currently accepted by CNPG, this becomes risky if we later decide to modify the value directly in the controller, for example:
cnpgCluster.Spec.PostgresConfiguration.Parameters["max_connections"] = "100"
This would cause a panic, and the attribute would also be omitted from the kubectl describe output of our CR. Initializing it to an empty map ({}) would be safer.
Alternatively, we can set a default at the class level so the parameter is always initialized to an empty map:
// +kubebuilder:default={}
The same goes for pgHBA.
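If the defaulting stays in the controller instead, a small guard in the merge step could look like this (field types are assumptions: a map for PostgreSQLConfig, a string slice for pgHBA):

```go
// Ensure the merged map/slice are never nil so later writes such as
// Parameters["max_connections"] = "100" cannot panic.
if resultConfig.PostgreSQLConfig == nil {
	resultConfig.PostgreSQLConfig = map[string]string{}
}
if resultConfig.PgHBA == nil {
	resultConfig.PgHBA = []string{}
}
```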
) (*cnpgv1.Cluster, error) {

// Validate that required fields are present in the merged configuration before creating the CNPG Cluster.
if err := validateClusterConfig(mergedConfig, clusterClass); err != nil {
Do we need this? Why not enforce it at the OpenAPI level, i.e. mark the attribute as mandatory via a marker comment? It feels a bit redundant, plus with every change we will need to update this function.
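For illustration, marking an attribute as mandatory at the OpenAPI level is just a kubebuilder marker on the type. A sketch with an assumed field, e.g. on the ClusterClass spec where the defaults must always exist:

```go
// PostgresVersion is the default major version for clusters using this class.
// Marking it Required lets the API server reject the CR before any
// reconciliation runs.
// +kubebuilder:validation:Required
PostgresVersion int `json:"postgresVersion"`
```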
if err := validateClusterConfig(mergedConfig, clusterClass); err != nil {
	return false, err
}
// Validate that storage size is not decreased
this is already checked by CEL
if err := validateStorageSize(mergedConfig.Storage, cnpgCluster.Spec.StorageConfiguration.Size); err != nil {
	return false, err
}
// Validate that PostgresVersion is not decreased
Do we have that requirement? Maybe let's also add it as part of the CEL validation.
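If the requirement holds, the same transition-rule pattern used for storage could cover it. A sketch, assuming PostgresVersion is an optional integer on the Cluster spec (not the PR's actual definition):

```go
// Transition rules are only evaluated when both the old and new values are
// present, so this simply forbids decreasing an already set version.
// +optional
// +kubebuilder:validation:XValidation:rule="self >= oldSelf",message="postgres version can only be increased"
PostgresVersion *int `json:"postgresVersion,omitempty"`
```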
Resources: resources,
}

if !equality.Semantic.DeepEqual(cnpgCluster.Spec, desiredState) {
Can't we simply check whether the previous merged cluster config == the current merged config? We can just fetch the current cluster state and compare it with mergedConfig. If there is any difference, then build the cluster spec. In that case we don't even need a separate function for a new cluster (defineNewCNPGCluster) vs. updating a cluster, as in the end the process is exactly the same.
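A sketch of that flow (buildCNPGClusterSpec is a hypothetical helper; only defineNewCNPGCluster and the DeepEqual call come from the PR):

```go
// Always derive the desired spec from the merged config, then patch only
// when it differs from what is currently deployed.
desired := buildCNPGClusterSpec(mergedConfig, clusterClass)
if equality.Semantic.DeepEqual(cnpgCluster.Spec, desired) {
	return false, nil // nothing to do
}
cnpgCluster.Spec = desired
return true, r.Update(ctx, cnpgCluster)
```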
| } | ||
|
|
||
| // validateClusterConfig checks that all required fields are present in the merged configuration. | ||
| func validateClusterConfig(mergedConfig *enterprisev4.ClusterSpec, clusterClass *enterprisev4.ClusterClass) error { |
As mentioned in one of the comments, I feel like the validate functions are redundant here. We can use Kubernetes mechanisms to validate input and reject the CR apply even before we trigger the reconciliation loop.
Description
Cluster controller for CNPG first iteration
Key Changes
Drafted API and controller for Cluster controller
Testing and Verification
Manual tests only
Related Issues
JIRA: CPI-1883
PR Checklist