A new value "rename" has been added to the Ceph pool spec to allow
pools to be renamed in a brownfield deployment. For a greenfield
deployment the pool will be created and renamed in a single
deployment step, and for a brownfield deployment in which the pool
has already been renamed, no changes will be made to pool names.
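For illustration, with hypothetical pool names, the rename logic
boils down to something like:
---
if ceph osd pool ls | grep -qx "old-pool"; then
  ceph osd pool rename "old-pool" "new-pool"
fi
---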
Change-Id: I3fba88d2f94e1c7102af91f18343346a72872fde
The current pool init job only allows PGs to be found in the
"peering" or "activating" (or active) states, but it should also
allow the other possible states that can occur while the PG
autoscaler is running ("unknown", "creating", and "recover").
The helm test already allows these states, so the pool init
job is being changed to allow them as well, for consistency.
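For example, a rough way to count PGs outside the allowed states
(the JSON layout of "ceph pg ls" may differ between Ceph releases):
---
ceph pg ls -f json | jq -r '.pg_stats[].state' \
  | grep -vE 'active|peering|activating|creating|unknown|recover' \
  | wc -l
---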
Change-Id: Ib2c19a459c6a30988e3348f8d073413ed687f98b
This patchset makes the current ceph-client helm test more specific
about checking each of the PGs that are transitioning through inactive
states during the test. If any single PG spends more than 30 seconds in
any of these inactive states (peering, activating, creating, unknown,
etc), then the test will fail.
Also, once the three-minute PG checking period has expired, the
helm test will no longer fail, as it is very possible that the
autoscaler is still adjusting the PGs for several minutes after a
deployment is done.
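A rough sketch of the per-PG timing idea (not the chart's exact
code; assumes bash 4+, jq, and the Nautilus "ceph pg ls" JSON
layout):
---
declare -A inactive_since   # epoch second each PG was first seen inactive
while true; do
  now=$(date +%s)
  all_active=true
  while read -r pgid state; do
    if [[ "${state}" == *active* ]]; then
      unset "inactive_since[${pgid}]"
    else
      all_active=false
      : "${inactive_since[${pgid}]:=${now}}"
      if (( now - ${inactive_since[${pgid}]} > 30 )); then
        echo "PG ${pgid} stuck in ${state} for more than 30 seconds"
        exit 1
      fi
    fi
  done < <(ceph pg ls -f json | jq -r '.pg_stats[] | "\(.pgid) \(.state)"')
  ${all_active} && break
  sleep 5
done
---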
Change-Id: I7f3209b7b3399feb7bec7598e6e88d7680f825c4
This patchset adds the capability to configure the
Ceph RBD pool job to leave failed pods behind for debugging
purposes, if desired. The default is to not leave them
behind, which is the current behavior.
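When enabled, the leftover pods can then be inspected, e.g.
(the "ceph" namespace and pod name are placeholders):
---
kubectl -n ceph get pods --field-selector=status.phase=Failed
kubectl -n ceph logs <failed-pod-name>
---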
Change-Id: Ife63b73f89996d59b75ec617129818068b060d1c
This patch resolves a helm test problem where the test was failing
if it found a PG state of "activating". It could also potentially
find a number of other states, such as premerge or unknown, that
would likewise fail the test. Note that if these transient PG states are
found for more than 3 minutes, the helm test fails.
Change-Id: I071bcfedf7e4079e085c2f72d2fbab3adc0b027c
When autoscaling is disabled after pools are created, there is an
opportunity for some autoscaling to take place before autoscaling
is disabled. This change checks to see if autoscaling needs to be
disabled before creating pools, then checks to see if it needs to
be enabled after creating pools. This ensures that autoscaling
won't happen when the autoscaler is disabled and that autoscaling
won't start prematurely while pools are being created when it is enabled.
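For example, using the global mgr module toggle (the chart may
instead use the per-pool pg_autoscale_mode setting):
---
ceph mgr module disable pg_autoscaler   # before pool creation, if disabled in values
# ... create and tune pools ...
ceph mgr module enable pg_autoscaler    # after pool creation, if enabled in values
---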
Change-Id: I8803b799b51735ecd3a4878d62be45ec50bbbe19
The autoscaler was introduced in the Nautilus release. This
change only sets the pg_num value for a pool if the autoscaler
is disabled or the Ceph release is earlier than Nautilus.
When pools are created with the autoscaler enabled, a pg_num_min
value specifies the minimum value of pg_num that the autoscaler
will target. That default was recently changed from 8 to 32,
which severely limits the number of pools in a small cluster (see
https://github.com/rook/rook/issues/5091). This change overrides
the default pg_num_min value of 32 with a value of 8 (matching
the default pg_num value of 8) using the optional --pg-num-min
<value> argument at pool creation time and by setting the
pg_num_min value on existing pools.
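For example, with a hypothetical pool name:
---
ceph osd pool create my-pool 8 8 --pg-num-min 8   # new pools
ceph osd pool set my-pool pg_num_min 8            # existing pools
---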
Change-Id: Ie08fb367ec8b1803fcc6e8cd22dc8da43c90e5c4
Currently pool quotas and pg_num calculations are both based on
percent_total_data values. This can be problematic when the amount
of data allowed in a pool doesn't necessarily match the percentage
of the cluster's data expected to be stored in the pool. It is
also more intuitive to define absolute quotas for pools.
This change adds an optional pool_quota value that defines an
explicit value in bytes to be used as a pool quota. If pool_quota
is omitted for a given pool, that pool's quota is set to 0 (no
quota).
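For example, a 10 GiB quota on a hypothetical pool:
---
ceph osd pool set-quota my-pool max_bytes 10737418240
ceph osd pool set-quota my-pool max_bytes 0   # removes the quota again
---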
A check_pool_quota_target() Helm test has also been added to
verify that the sum of all pool quotas does not exceed the target
quota defined for the cluster if present.
Change-Id: I959fb9e95d8f1e03c36e44aba57c552a315867d0
This reverts commit 910ed906d0.
Reason for revert: May be causing upstream multinode gates to fail.
Change-Id: I1ea7349f5821b549d7c9ea88ef0089821eff3ddf
The wait_for_pgs() function in the rbd pool job waits for all PGs
to become active before proceeding, but in the event of an upgrade
that decreases pg_num values on one or more pools, it sees PGs in
the clean+premerge+peered state as peering and waits for "peering"
to complete. Since these PGs are in the process of merging into
active PGs, waiting for the merge to complete is unnecessary. This
change will reduce the wait time in this job significantly in
these cases.
Change-Id: I9a2985855a25cdb98ef6fe011ba473587ea7a4c9
The 'ceph pg dump_stuck' command that looks for PGs that are stuck
inactive doesn't include the 'inactive' keyword, so it also finds
PGs that are active that it believes are stuck. This change adds
the 'inactive' keyword to the command so only inactive PGs are
considered.
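For example:
---
ceph pg dump_stuck            # may also report stuck-but-active PGs
ceph pg dump_stuck inactive   # reports only PGs stuck in inactive states
---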
Change-Id: Id276deb3e5cb8c7e30f5a55140b8dbba52a33900
This commit introduces the following helm test improvements for the
ceph-client chart:
1) Reworks the pg_validation function so that it allows some time for
peering PGs to finish peering, but fails if any other critical errors are
seen. The actual PG validation was split out into a function called
check_pgs(), and the pg_validation function manages the looping aspects.
2) The check_cluster_status function now calls pg_validation if the
cluster status is not OK. This is very similar to what was happening
before, except that now the logic is not repeated.
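Roughly (the retry count and sleep interval here are illustrative,
not the chart's actual values):
---
pg_validation() {
  local retries=0
  until check_pgs; do
    retries=$((retries + 1))
    if [ "${retries}" -ge 10 ]; then
      return 1
    fi
    sleep 30
  done
}
---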
Change-Id: I65906380817441bd2ff9ff9cfbf9586b6fdd2ba7
This PS is to address security best practices concerning running
containers as a non-privileged user and disallowing privilege
escalation. Ceph-client is used for the mgr and mds pods.
Change-Id: Idbd87408c17907eaae9c6398fbc942f203b51515
This fixes the logic that disables the autoscaler on pools, as
it does not consider newly created pools.
Change-Id: I76fe106918d865b6443453b13e3a4bd6fc35206a
Since we introduced the chart version check in gates, requirements are
no longer satisfied by a strict check for 0.1.0.
Change-Id: I15950b735b4f8566bc0018fe4f4ea9ba729235fc
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
This corrects an issue in the create_pool function with checking
if the pg autoscaler should be enabled.
Change-Id: Id9be162fd59cc452477f5cc5c5698de7ae5bb141
Added chart linting in the Zuul CI to enhance the stability of the charts.
Fixed some lint errors in the current charts.
Change-Id: I9df4024c7ccf8b3510e665fc07ba0f38871fcbdb
The PS updates the queries in the wait_for_pgs function (pool init
script) to handle cases where PGs are in the "activating" or
"peered" state.
Change-Id: Ie93797fcb72462f61bca3a007f6649ab46ef4f97
This exports the Ceph cluster name as an environment variable,
since it is referenced by scripts.
It also fixes the query that gets inactive PGs.
Change-Id: I1db5cfbd594c0cc6d54f748f22af5856d9594922
The PS updates queries in the wait_for_pgs function in the ceph-client
and ceph-osd charts. It allows the status of PGs to be checked more
accurately. The output of the "ceph pg ls" command may contain many PG
states, such as "active+clean", "active+undersized+degraded",
"active+recovering", "peering", etc. But along with these there may also
be states such as "stale+active+clean". To avoid misinterpreting the
status of the PGs, the filter was changed from "startswith(active+)" to
"contains(active)".
The PS also adds a delay to the post-apply job after the pods restart.
This reduces the number of unnecessary queries to Kubernetes.
Change-Id: I0eff2ce036ad543bf2554bd586c2a2d3e91c052b
The recently-added crush weight comparison in reweight_osds() that
checks weights for zero isn't working correctly because the
expected weight is being calculated to two decimal places and then
compared against "0" as a string. This updates the comparison
string to "0.00" to match the calculation.
Change-Id: I29387a597a21180bb7fba974b4daeadf6ffc182d
If circumstances are such that the reweight function believes
OSD disks have zero size, refrain from reweighting OSDs to 0.
This can happen if OSDs are deployed with the noup flag set.
Also move the setting and unsetting of flags above this
calculation as an additional precautionary measure.
Change-Id: Ibc23494e0e75cfdd7654f5c0d3b6048b146280f7
This change is to address a memory leak in the ceph-mgr deployment.
The leak has also been noted in:
https://review.opendev.org/#/c/711085
Without this change memory usage for the active ceph-mgr pod will
steadily increase by roughly 100MiB per hour until all available
memory has been exhausted. Reset messages will also be seen in the
active and standby ceph-mgr pod logs.
Sample messages:
---
0 client.0 ms_handle_reset on v2:10.0.0.226:6808/1
0 client.0 ms_handle_reset on v2:10.0.0.226:6808/1
0 client.0 ms_handle_reset on v2:10.0.0.226:6808/1
---
The root cause of the resets and associated memory leak appears to
be due to multiple ceph pods sharing the same IP address (due to
hostNetwork being true) and PID (due to hostPID being false).
In the messages above the "1" at the end of the line is the PID.
Ceph appears to use the Version:IP:Port/PID (v2:10.0.0.226:6808/1)
tuple as a unique identifier. When hostPID is false conflicts arise.
Setting hostPID to true stops the reset messages and memory leak.
Change-Id: I9821637e75e8f89b59cf39842a6eb7e66518fa2c
The PS updates the wait_for_inactive_pgs function:
- Renamed the function to wait_for_pgs
- Added a query for getting the status of PGs
- All PGs must be found in an "active+" state at least three times in a
  row (see the sketch below)
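A minimal sketch of the three-in-a-row check (query details are
illustrative):
---
consecutive=0
while [ "${consecutive}" -lt 3 ]; do
  inactive=$(ceph pg ls -f json | jq -r \
    '[.pg_stats[] | select(.state | startswith("active+") | not)] | length')
  if [ "${inactive}" -eq 0 ]; then
    consecutive=$((consecutive + 1))
  else
    consecutive=0
  fi
  sleep 3
done
---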
Change-Id: Iecc79ebbdfaa74886bca989b23f7741a1c3dca16
The PS adds a check of the target OSD value. The expected number of OSDs
should always be greater than or equal to the number of existing OSDs.
If there are more OSDs than expected, the value is not correct.
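For example (TARGET_OSDS is a placeholder for the configured value):
---
EXISTING_OSDS=$(ceph osd ls | wc -l)
if [ "${TARGET_OSDS}" -lt "${EXISTING_OSDS}" ]; then
  echo "target OSD count ${TARGET_OSDS} is lower than the ${EXISTING_OSDS} already deployed"
  exit 1
fi
---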
Change-Id: I117a189a18dbb740585b343db9ac9b596a34b929
Currently the Ceph helm tests pass when the deployed Ceph cluster
is unhealthy. This change expands the cluster status testing
logic to pass when all PGs are active and fail if any PG is
inactive.
The PG autoscaler is currently causing the deployment to deploy
unhealthy Ceph clusters. This change also disables it. It should
be re-enabled once those issues are resolved.
Change-Id: Iea1ff5006fc00e4570cf67c6af5ef6746a538058
The PS adds a noup flag check to the ceph-client and ceph-osd helm tests.
It allows the tests to pass successfully even if the noup flag is set.
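One way to detect the flag (output format may vary by release):
---
if ceph osd dump | grep ^flags | grep -q noup; then
  echo "noup flag is set"
fi
---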
Change-Id: Ida43d83902d26bef3434c47e71959bb2086ad82a
The PS adds a check of the OSD count. It ensures that the expected
number of OSDs is present at the moment a pool is created.
The expected number of OSDs is calculated from the target number of
OSDs and the required percentage of OSDs.
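For example (variable names are placeholders):
---
EXPECTED_OSDS=$(awk -v t="${TARGET_OSDS}" -v p="${REQUIRED_PERCENT_OF_OSDS}" \
  'BEGIN {print int(t * p / 100)}')
CURRENT_OSDS=$(ceph osd ls | wc -l)
if [ "${CURRENT_OSDS}" -lt "${EXPECTED_OSDS}" ]; then
  echo "only ${CURRENT_OSDS} of ${EXPECTED_OSDS} expected OSDs are present; not creating pools yet"
  exit 1
fi
---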
Change-Id: Iadf36dbeca61c47d9a9db60cf5335e4e1cb7b74b
https://review.opendev.org/733193 removed the reweight_osds()
function from the ceph-client and weighted OSDs as they are added
in the ceph-osd chart instead. Since then some situations have
come up where OSDs were already deployed with incorrect weights
and this function is needed in order to weight them properly later
on. This new version calculates an expected weight for each OSD,
compares it to the OSD's actual weight, and makes an adjustment if
necessary.
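Roughly (osd_id and expected_weight are placeholders; the chart's
actual comparison may round the values, and the "ceph osd df" JSON
field names may vary by release):
---
actual_weight=$(ceph osd df -f json | jq -r --argjson id "${osd_id}" \
  '.nodes[] | select(.id == $id) | .crush_weight')
if [ "${actual_weight}" != "${expected_weight}" ]; then
  ceph osd crush reweight "osd.${osd_id}" "${expected_weight}"
fi
---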
Change-Id: I58bc16fc03b9234a08847d29aa14067bec05f1f1
The PS updates the helm test and replaces the "expected_osds" variable
with the number of OSDs available in the cluster (ceph-client).
The PS also updates the logic for calculating the minimum number of OSDs.
Change-Id: Ic8402d668d672f454f062bed369cac516ed1573e
Unrestrict the octal values rule, since the readability benefit for
file modes outweighs possible issues with YAML 1.2 adoption in future
k8s versions. These issues will be addressed when/if they occur.
Also ensure osh-infra is a required project for the lint job, which
matters when running the job against another project.
Change-Id: Ic5e327cf40c4b09c90738baff56419a6cef132da
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
Currently OSDs are added by the ceph-osd chart with zero weight
and they get reweighted to proper weights in the ceph-client chart
after all OSDs have been deployed. This causes a problem when a
deployment is partially completed and additional OSDs are added
later. In this case the ceph-client chart has already run and the
new OSDs don't ever get weighted correctly. This change weights
OSDs properly as they are deployed instead. As noted in the
script, the noin flag may be set during the deployment to prevent
rebalancing as OSDs are added if necessary.
Added the ability to set and unset Ceph cluster flags in the
ceph-client chart.
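For example, around the OSD deployment:
---
ceph osd set noin     # keep newly added OSDs from triggering rebalancing
# ... deploy and weight OSDs ...
ceph osd unset noin
---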
Change-Id: Ic9a3d8d5625af49b093976a855dd66e5705d2c29
This commit rewrites the lint job to make template linting available.
Currently yamllint is run in warning mode against all templates
rendered with default values. Detected duplicates and issues will be
addressed in subsequent commits.
Also, all y*ml files are added for linting and corresponding code changes
are made. For non-templates, warning rules are disabled to improve
readability. Chart and requirements yamls are also modified in the name
of consistency.
Change-Id: Ife6727c5721a00c65902340d95b7edb0a9c77365
This patch currently breaks the cinder helm test in the OSH cinder jobs,
blocking the gate. Proposing to revert to unblock the jobs.
This reverts commit f59cb11932.
Change-Id: I73012ec6f4c3d751131f1c26eea9266f7abc1809
Currently OSDs are added by the ceph-osd chart with zero weight
and they get reweighted to proper weights in the ceph-client chart
after all OSDs have been deployed. This causes a problem when a
deployment is partially completed and additional OSDs are added
later. In this case the ceph-client chart has already run and the
new OSDs don't ever get weighted correctly. This change weights
OSDs properly as they are deployed instead. As noted in the
script, the noin flag may be set during the deployment to prevent
rebalancing as OSDs are added if necessary.
Added the ability to set and unset Ceph cluster flags in the
ceph-client chart.
Change-Id: Iac50352c857d874f3956776c733d09e0034a0285