* Bump galaxy.yml to next expected version
* Refactor check_galaxy + fix version (#10729)
* Remove checks for docs using exact tags
Instead, use more generic documentation for installing kubespray as a
collection from git.
* Check that we upgraded galaxy.yml to next version
This is only intended to check for human error. The version in galaxy.yml
should be the next one (which means something different depending on
whether we're on master or a release branch).
* Set collection version to KUBESPRAY_NEXT_VERSION
* CI: Document the 'all-in-one' layout + small refactoring (#10725)
* Rename aio to all-in-one and document it
ADTM.
Acronyms don't tell much.
* Refactor vm_count in tests provisioning
* Add test case for calico using etcd datastore (#10722)
* Add multinode ci layout
* Add test case for calico using etcd datastore
* Fix calico-node in etcd mode (#10438)
* Calico : add ETCD endpoints to install-cni container
* Calico : remove nodename from configmap in etcd mode
---------
Co-authored-by: Olivier Levitt <olivier.levitt@gmail.com>
* Disable control plane allocating podCIDR for nodes when using calico
Calico does not use the .spec.podCIDR field for its IP address
management.
Furthermore, it can cause false positives from the kube controller manager if
kube_network_node_prefix and calico_pool_blocksize are unaligned, which
is the case with the defaults shipped by kubespray.
If the subnets obtained from using kube_network_node_prefix are bigger,
this would result at some point in the control plane thinking it does
not have subnets left for a new node, while calico will work without
problems.
Explicitly set a default value of false for calico_ipam_host_local to
facilitate its use in templates.
* Don't default to kube_network_node_prefix for calico_pool_blocksize
They have different semantics: kube_network_node_prefix is intended to
be the size of the subnet for all pods on a node, while there can be
more than one calico block of the specified size (they are allocated on
demand).
Besides, this commit does not actually change anything, because the
current code is buggy: we don't ever default to
kube_network_node_prefix, since the variable is defined in the role
defaults.
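As a rough illustration of why the two settings have different semantics, the sketch below counts how many per-node subnets and how many Calico blocks fit in one pod network. The values used (a /18 pod subnet, a node prefix of 24 and a block size of 26) are assumptions chosen for the example, not taken from this commit; check your own inventory.

```python
import ipaddress

# Assumed example values; adjust to your own inventory.
pods_subnet = ipaddress.ip_network("10.233.64.0/18")
kube_network_node_prefix = 24
calico_pool_blocksize = 26

# Subnets the control plane could hand out through .spec.podCIDR (one per node).
node_subnets = len(list(pods_subnet.subnets(new_prefix=kube_network_node_prefix)))

# Blocks Calico IPAM can allocate on demand from the same pool.
calico_blocks = len(list(pods_subnet.subnets(new_prefix=calico_pool_blocksize)))

print(node_subnets)   # 64
print(calico_blocks)  # 256
```

With these numbers the control plane would run out of node subnets at the 65th node, while Calico would still have unallocated blocks.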
* kubernetes: add hashes for 1.27.8, 1.26.11
Make 1.27.8 default.
* Convert exoscale tf provider to new version (#10646)
This is untested. It passes terraform validate, which unbreaks the CI.
* Update 0040-verify-settings.yml (#10699)
remove embedded template
---------
Co-authored-by: piwinkler <9642809+piwinkler@users.noreply.github.com>
* Migrate node-role.kubernetes.io/master to node-role.kubernetes.io/control-plane
Refactor NRI (Node Resource Interface) activation in CRI-O and
containerd. Introduce a shared variable, nri_enabled, to streamline
the process. Currently, enabling NRI requires a separate update of
defaults for each container runtime independently, without any
verification of NRI support for the specific version of containerd
or CRI-O in use.
With this commit, the previous approach is replaced. Now, a single
variable, nri_enabled, handles this functionality. Also, this commit
separates the responsibility of verifying NRI supported versions of
containerd and CRI-O from cluster administrators, and leaves it to
Ansible.
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
(cherry picked from commit 1fd31ccc28)
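The version gating that this change hands over to Ansible can be sketched roughly as follows, in Python rather than Jinja. The minimum versions (containerd 1.7.0, CRI-O 1.26.0) and the helper names are assumptions for illustration; the actual checks in the roles may differ.

```python
# Assumed minimum runtime versions with NRI support; verify against the roles.
MIN_NRI_VERSION = {"containerd": (1, 7, 0), "crio": (1, 26, 0)}


def parse_version(version):
    """Turn 'v1.7.2' or '1.7.2' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def nri_active(nri_enabled, runtime, runtime_version):
    """NRI is switched on only if requested and the runtime is new enough."""
    return bool(nri_enabled) and parse_version(runtime_version) >= MIN_NRI_VERSION[runtime]


print(nri_active(True, "containerd", "1.7.2"))  # True
print(nri_active(True, "crio", "v1.25.4"))      # False
```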
* [containerd] Add Configuration option for Node Resource Interface
Node Resource Interface (NRI) is a common framework for
plugging domain or vendor-specific custom logic into container
runtimes like containerd. With this commit, we introduce the
containerd_disable_nri configuration flag, providing cluster
administrators the flexibility to opt in or out (defaulted to 'out')
of this feature in containerd. In line with containerd's default
configuration, NRI is disabled by default in the containerd role
defaults.
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
* [cri-o] Add configuration option for Node Resource Interface
Node Resource Interface (NRI) is a common framework for
plugging domain or vendor-specific custom logic into container
runtimes like containerd/crio. With this commit, we introduce the
crio_enable_nri configuration flag, providing cluster
administrators the flexibility to opt in or out (defaulted to 'out')
of this feature in the cri-o runtime. In line with crio's default
configuration, NRI is disabled by default in the cri-o role
defaults.
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
---------
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
(cherry picked from commit f964b3438d)
* Fix containerd_registries in config_path for mirrors and remove nerdctl global insecure_registry setting
* Make containerd hosts.toml mode 0640
* Add containerd_registries_mirrors and keep containerd_registries to pass packet_debian11-calico-upgrade
Set owner/group to root/root when unarchiving kata-containers binary to prevent kata-containers binaries/directories and especially / from getting chowned to 1001:123, the file owner specified in the kata-containers archive
* tests: replace fedora35 with fedora37
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* tests: replace fedora36 with fedora38
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* docs: update fedora version in docs
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* molecule: upgrade fedora version
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* tests: upgrade fedora images for vagrant and kubevirt
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* vagrant: workaround to fix private network ip address in fedora
Fedora stopped supporting sysconfig network scripts, so we added the
workaround described here
https://github.com/hashicorp/vagrant/issues/12762#issuecomment-1535957837
to fix it.
* networkmanager: do not configure dns if using systemd-resolved
We should not configure dns if we point to systemd-resolved.
Systemd-resolved uses NetworkManager to infer the upstream DNS
server, so setting NetworkManager to 127.0.0.53 would prevent
systemd-resolved from getting the correct network DNS server.
Thus, in this case, we simply don't set this setting.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* image-builder: update centos7 image
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* gitlab-ci: mark fedora packet jobs as allow failure
Fedora networking is still broken on Packet, let's mark it as allow
failure for now.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* Update reset.yml
reset confirmation user input fix
* Update reset.yml
added default for non-interactive run in ci/cd
* fix reset_confirmation in reset.yml
* skip reset_confirmation prompt when reset_confirmation is defined via extra-vars option (for tests)
* check both string type and object type with user_input for reset_confirmation var
* reset_confirmation_prompt in conjunction with reset_confirmation
improvement inspired by:
https://github.com/kubernetes-sigs/kubespray/pull/10288#issuecomment-1637056880
* project: update all dependencies including ansible
Upgrade to ansible 7.x and ansible-core 2.14.x. There seem to be issues
with ansible 8/ansible-core 2.15, so we stay on 7.x/2.14.x for now.
It's quite a big bump already anyway.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* tests: install aws galaxy collection
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* ansible-lint: disable various rules after ansible upgrade
Temporarily disable a bunch of linting rules following the ansible upgrade.
Those should be taken care of separately.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: resolve deprecated-module ansible-lint error
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: resolve no-free-form ansible-lint error
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: resolve schema[meta] ansible-lint error
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: resolve schema[playbook] ansible-lint error
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: resolve schema[tasks] ansible-lint error
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: resolve risky-file-permissions ansible-lint error
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: resolve risky-shell-pipe ansible-lint error
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: remove deprecated warn args
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: use fqcn for non builtin tasks
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: resolve syntax-check[missing-file] for contrib playbook
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* project: use arithmetic inside jinja to fix ansible 6 upgrade
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* tests: cleanup stale packet namespace automatically
Cancelled jobs on Gitlab can leave stale VMs behind as the delete playbook
will never be executed. This commit allows removing old VMs by getting
all the namespaces created from the same branch with an older pipeline
id.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* tests: cleanup stale packet namespace after 2 hours
This ensures that we don't have any packet namespace remaining for more
than 2 hours. All the jobs usually complete within 30 minutes to 1 hour,
so 2 hours is enough to detect a stale namespace.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
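The two cleanup criteria described above (an older pipeline on the same branch, or an age of more than 2 hours) can be sketched as below. This is only an illustration: the dictionary keys and the function name are hypothetical and do not reflect how the CI playbooks actually store or query namespaces.

```python
from datetime import datetime, timedelta, timezone


def stale_namespaces(namespaces, branch, pipeline_id, now, max_age=timedelta(hours=2)):
    """Return the names of namespaces considered stale.

    Each namespace is a dict with hypothetical keys:
    name, branch, pipeline_id and created (a timezone-aware datetime).
    """
    stale = []
    for ns in namespaces:
        older_pipeline = ns["branch"] == branch and ns["pipeline_id"] < pipeline_id
        too_old = now - ns["created"] > max_age
        if older_pipeline or too_old:
            stale.append(ns["name"])
    return stale


now = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
example = [
    {"name": "a", "branch": "feat", "pipeline_id": 10, "created": now - timedelta(minutes=30)},
    {"name": "b", "branch": "feat", "pipeline_id": 12, "created": now - timedelta(minutes=15)},
    {"name": "c", "branch": "other", "pipeline_id": 5, "created": now - timedelta(hours=3)},
    {"name": "d", "branch": "other", "pipeline_id": 7, "created": now - timedelta(hours=1)},
]
print(stale_namespaces(example, branch="feat", pipeline_id=12, now=now))  # ['a', 'c']
```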
* tests: ignore vm cleanup failure
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* tests: use pipeline_id var instead of fetching namespace for cleanup packet vm
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
A newline was not inserted between image and imagePullPolicy for some
reason in the jinja template. Simplifying this altogether should fix it.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* Fix upgrade-path for kubelet-csr-approver
Fixes an error when you enable kubelet-csr-approver when upgrading.
It hangs waiting for the certificate to be approved since the
kubelet-csr-approver is not installed yet.
* Add missing package when using helm role
Molecule 5.0 requires ansible-core 2.12.10.
So in this commit we update ansible-core from 2.12.5 to 2.12.10.
We also drop support for two ansible-core versions and now use the "oldest"
still supported ansible-core version, as 2.11 is both EOL and not
supported by molecule.
tests/molecule: remove linting in molecule to support molecule 5
tests/molecule: remove role name check for molecule 5 support
Kubespray doesn't use ansible galaxy style naming so we have to disable
that check.
contrib/inventory_builder: fix tox.ini for tox4
tests/molecule: fix get_playbook in testinfra tests
tests: upgrade most tests requirements
Exclude ansible-lint for now, I will do that in a separate PR.
tests/molecule: force kvm driver option
If we don't do this, it falls back to emulated qemu on our CI for some
reason.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* Fix `The task includes an option with an undefined variable` for 1.27
* delete old flag --container-runtime
Signed-off-by: Victor Login <batazor@evrone.com>
---------
Signed-off-by: Victor Login <batazor@evrone.com>
Add option to configure class as the default class
Add option to disable watching for ingresses without class
Remove redundant if that always evaluates to true
Fix default value missing for ingress_nginx_default
``
"msg": "Failed to template loop_control.label: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'item'. 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'item'", "skip_reason": "Conditional result was False"}
``
fixes case when multus should NOT be included.
Calling bootstrap in facts.yaml so that we can always collect facts even on
new nodes. This is useful when you want to add nodes to an inventory
beforehand and then collect facts and scale the cluster with the scale
playbook and --limits. With dynamic inventory sometimes it might be more
difficult to add the nodes after running the facts playbook in this
specific situation.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
This feature no longer works on Ansible 6 / ansible-core 2.13. We do not
officially support these versions yet, but this will help with the future
upgrade and may help some people running them inadvertently.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
- Test with new version: 37.20230322.3.0. Both containerd and
cri-o are tested
- bugfix: when we use crio and the var bin_dir is changed,
there are errors about the new bin dir.
* remove-debian9-support
* Add six module into openstack-cleanup/requirements.txt (#10099)
This fixes the tf-elastx_cleanup job, which failed with the following error:
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/generic/password.py", line 16, in <module>
from keystoneauth1.identity import v3
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/v3/__init__.py", line 27, in <module>
from keystoneauth1.identity.v3.oauth2_mtls_client_credential import * # noqa
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/v3/oauth2_mtls_client_credential.py", line 17, in <module>
import six
ModuleNotFoundError: No module named 'six'
---------
Co-authored-by: Kenichi Omichi <ken1ohmichi@gmail.com>
According to the canal github[1], the repo has not been maintained for over 5 years.
In addition, the README says
```
Originally, we thought we might more deeply integrate the two projects
(possibly even going as far as a rebranding!). However, over time it
became clear that that wasn't really necessary to fulfil our goal of
making them work well together. Ultimately, we decided to focus on
adding features to both projects rather than doing work just to
combine them.
```
So it is difficult for Kubespray to support canal in this situation.
[1]: https://github.com/projectcalico/canal
This fixes the tf-elastx_cleanup job, which failed with the following error:
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/generic/password.py", line 16, in <module>
from keystoneauth1.identity import v3
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/v3/__init__.py", line 27, in <module>
from keystoneauth1.identity.v3.oauth2_mtls_client_credential import * # noqa
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/v3/oauth2_mtls_client_credential.py", line 17, in <module>
import six
ModuleNotFoundError: No module named 'six'
* Drop CI jobs related to canal
According to the canal github[1], the repo has not been maintained for over 5 years.
In addition, the README says
Originally, we thought we might more deeply integrate the two projects
(possibly even going as far as a rebranding!). However, over time it
became clear that that wasn't really necessary to fulfil our goal of
making them work well together. Ultimately, we decided to focus on
adding features to both projects rather than doing work just to
combine them.
So we don't need to run CI jobs related to canal in this situation.
[1]: https://github.com/projectcalico/canal
* Update ci.md
* chore(helm-apps): fix README example
README shows a non-working example according to the specs for this role.
* Add support for kubelet-csr-approver
Co-Authored-By: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* Add tests for kubelet-csr-approver
Co-Authored-By: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* Add Documentation for Kubelet CSR Approver
Co-Authored-By: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Co-authored-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* Remove deprecated provider and fix flatcar configs
* Refactor for DRYness
* Add missing line endings
* Enable tests for hetzner terraform in CI
* Add missing inventory for CI tests
* Fix: vSphere Error: `Apply a CSI secret manifest`
This PR fixes an issue that you will see on the 2nd deploy when deploying External vSphere
How to reproduce:
1. Set custom `vsphere_csi_namespace: "vmware-system-csi"`
2. Deploy as usual
3. Observe no errors
4. Deploy 2nd time without `reset`
5. Playbook fails with:
```
TASK [kubernetes-apps/csi_driver/vsphere : vSphere CSI Driver | Apply a CSI secret manifest]
fatal: [node-00]: FAILED! => changed=true
censored: 'the output has been hidden due to the fact that ''no_log: true'' was specified for this result'
```
* create namespace if does not exist
* lint fix
* try to fix lint errors
* fix `too few spaces before comment`
* change the order of applied manifests
* typo
* [cilium] fix rbac and upgrade hubble v0.11.0 (#3)
* [cilium] fix rbac for LB bgp ipam
* [cilium] Upgrade Hubble to v0.11.0 and add mTLS between Hubble UI and Hubble Relay
* fix dns domain hubble for tls
---------
Co-authored-by: Thuon Jeremy <d107869@olinfra1.infra.bdm.outscale.c1.dav.fr>
* Fix blank line
---------
Co-authored-by: Thuon Jeremy <d107869@olinfra1.infra.bdm.outscale.c1.dav.fr>
Cilium 1.13.1 changed how the cilium-cni binary gets placed in /opt/cni/bin,
so that it takes place in an init container rather than in the main agent.
This commit removes the variable `use_localhost_as_kubeapi_loadbalancer`
and rather detects that we are in a situation where we can use the
localhost apiserver loadbalancer (meaning that we use the localhost load
balancer and that the same ports are used for both the load balancer and
the kube-apiserver).
This also cleans up the calico code to use `kube_apiserver_global_endpoint`
rather than implementing the same logic all over again.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
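The detection described above boils down to two conditions. A minimal sketch, in Python rather than Jinja, assuming the parameter names below (they are illustrative and may not match the variables the templates actually consult):

```python
def uses_localhost_apiserver_lb(loadbalancer_apiserver_localhost,
                                loadbalancer_apiserver_port,
                                kube_apiserver_port):
    """True when the localhost load balancer is in use AND it listens on the
    same port as kube-apiserver, so localhost can stand in for the API endpoint."""
    return bool(loadbalancer_apiserver_localhost) and (
        loadbalancer_apiserver_port == kube_apiserver_port
    )


print(uses_localhost_apiserver_lb(True, 6443, 6443))   # True
print(uses_localhost_apiserver_lb(True, 8443, 6443))   # False
print(uses_localhost_apiserver_lb(False, 6443, 6443))  # False
```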
* node: fix default kubelet/runtime cgroups when kube_reserved is false (default)
Commit 1c4db6132d introduced the notion of
kube_reserved. This brought a breaking change: the container_manager
and the kubelet defaulted to kube.slice as if kube_reserved
was always enabled, whereas it is disabled by default.
This commit fixes this by bringing back system.slice whenever
kube_reserved is disabled.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
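The intended default can be summarised in a few lines. This is a sketch of the rule stated in the commit message only; the function name is hypothetical and the real defaults are Jinja expressions that build full cgroup paths.

```python
def default_cgroup_slice(kube_reserved):
    """kube.slice only when kube_reserved is enabled, system.slice otherwise."""
    return "kube.slice" if kube_reserved else "system.slice"


print(default_cgroup_slice(False))  # system.slice
print(default_cgroup_slice(True))   # kube.slice
```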
* inventory/sample: set kube_reserved to false as it's the default
Changing the commented value in sample inventory to the actual default
value.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* network_plugin/custom_cni: add CNI to apply provided manifests
Add a new simple custom_cni to install provided Kubernetes manifests.
This can be useful for applying manifests provided directly by a CNI when
they are not supported by Kubespray (e.g. a helm chart or any other manifest
generation method).
Co-authored-by: James Landrein <james.landrein@proton.ch>
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* network_plugin/custom_cni: add test with cilium
Co-authored-by: James Landrein <james.landrein@proton.ch>
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
Co-authored-by: James Landrein <james.landrein@proton.ch>
crun_bin_dir was used to specify the destination of the crun binary during the
download process. This path must match the value provided in the CRI-O
configuration file, so changing its value to bin_dir helps to avoid mismatch errors.
Signed-off-by: Victor Morales <chipahuac@hotmail.com>
requirements-$ANSIBLE_VERSION.yml doesn't exist in the Kubespray repo.
It was there to support ansible 2.10 and older, and Kubespray now supports
2.11+. So this drops that part to avoid confusion.
Reduces the number of layers, improves readability and reduces the size of the image (I can't check by how much, as I am unable to build it while the vagrant repository is unavailable).
* Fix yq install in argocd role: use download_file instead of get_url
* Fix use download_file instead of get_url to download argocd-install manifest in argocd role
* Fix order and add arm64 checksum
* Fix: Failed to template loop_control.label: 'None'
Since Kubespray v2.21.0, commit a98ab40434 accidentally removed the copying
of contrib/. contrib/ contains useful tools such as the offline
tools. This adds contrib/ to the Dockerfile again.
1.5.7 was released Aug 2, 2021 and 1.6.1 came out on Dec 13, 2022.
There's been a good amount of new features, improvements and fixes since
1.5.7 and the changelogs for each version are available in the docs:
https://ara.readthedocs.io/en/latest/changelog-release-notes.html
This fixes the CrashLoopBackoff error that appears because envoy
configuration has changed a lot and upstream removed the envoy proxy to
use nginx only instead. Those changes are based on upstream cilium helm.
It is quite confusing that there's an all-caps, bolded comment that seems to imply that `etcd_download_url` is relevant only when not using host-based deployment. The opposite is true: of course the artifact download URL is relevant and required for host-based etcd.
Perhaps the entire comment can be read in a different way, and should perhaps be reworded entirely, cf. https://github.com/kubernetes-sigs/kubespray/blob/374438a3d631e98b34db4433d972fb4a221c0505/docs/offline-environment.md?plain=1#L38
Removing the "**DON'T**" matches the way the other comments in this file are written and matches my personal interpretation.
- Supported Docker versions are 18.09, 19.03 and 20.10. The *recommended* Docker version is 20.10. `Kubelet` might break on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster look into e.g. the YUM ``versionlock`` plugin or ``apt pin``).
- Supported Docker versions are 18.09, 19.03, 20.10, 23.0 and 24.0. The *recommended* Docker version is 20.10 (except on Debian bookworm which without supporting for 20.10 and below any more). `Kubelet` might break on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster look into e.g. the YUM ``versionlock`` plugin or ``apt pin``).
- The cri-o version should be aligned with the respective kubernetes version (i.e. kube_version=1.20.x, crio_version=1.20)
## Requirements
- **Minimum required version of Kubernetes is v1.23**
- **Ansible v2.11+, Jinja 2.11+ and python-netaddr is installed on the machine that will run Ansible commands**
- **Minimum required version of Kubernetes is v1.25**
- **Ansible v2.14+, Jinja 2.11+ and python-netaddr is installed on the machine that will run Ansible commands**
- The target servers must have **access to the Internet** in order to pull docker images. Otherwise, additional configuration is required (See [Offline Environment](docs/offline-environment.md))
- The target servers are configured to allow **IPv4 forwarding**.
- If using IPv6 for pods and services, the target servers are configured to allow **IPv6 forwarding**.
- The **firewalls are not managed**, you'll need to implement your own rules the way you used to.
in order to avoid any issue during deployment you should disable your firewall.
- If kubespray is ran from non-root user account, correct privilege escalation method
- If kubespray is run from non-root user account, correct privilege escalation method
should be configured in the target servers. Then the `ansible_become` flag
or command parameters `--become or -b` should be specified.
@@ -217,13 +227,11 @@ You can choose among ten network plugins. (default: `calico`, except Vagrant use
- [Calico](https://docs.projectcalico.org/latest/introduction/) is a networking and network policy provider. Calico supports a flexible set of networking options
- [Calico](https://docs.tigera.io/calico/latest/about/) is a networking and network policy provider. Calico supports a flexible set of networking options
designed to give you the most efficient networking across a range of situations, including non-overlay
and overlay networks, with or without BGP. Calico uses the same engine to enforce network policy for hosts,
pods, and (if using Istio and Envoy) applications at the service mesh layer.
- [canal](https://github.com/projectcalico/canal): a composition of calico and flannel plugins.
- [cilium](http://docs.cilium.io/en/latest/): layer 3/4 networking (as well as layer 7 to protect and secure application protocols), supports dynamic insertion of BPF bytecode into the Linux kernel to implement security services, networking and visibility logic.
- [weave](docs/weave.md): Weave is a lightweight container overlay network that doesn't require an external K/V database cluster.
@@ -240,6 +248,9 @@ You can choose among ten network plugins. (default: `calico`, except Vagrant use
- [multus](docs/multus.md): Multus is a meta CNI plugin that provides multiple network interface support to pods. For each interface Multus delegates CNI calls to secondary CNI plugins such as Calico, macvlan, etc.
- [custom_cni](roles/network-plugin/custom_cni/) : You can specify some manifests that will be applied to the clusters to bring you own CNI and use non-supported ones by Kubespray.
See `tests/files/custom_cni/README.md` and `tests/files/custom_cni/values.yaml`for an example with a CNI provided by a Helm Chart.
The network plugin to use is defined by the variable `kube_network_plugin`. There is also an
option to leverage built-in cloud provider networking instead.
See also [Network checker](docs/netcheck.md).
@@ -265,7 +276,7 @@ See also [Network checker](docs/netcheck.md).
If the release note file(/tmp/kubespray-release-note) contains "### Uncategorized" pull requests, those pull requests don't have a valid kind label(`kind/feature`, etc.).
It is necessary to put a valid label on each pull request and run the above release-notes command again to get a better release note)
It is necessary to put a valid label on each pull request and run the above release-notes command again to get a better release note
# Running systemd-machine-id-setup doesn't create a unique id for each node container on Debian,
# handle manually
- name: Re-create unique machine-id (as we may just get what comes in the docker image), needed by some CNIs for mac address seeding (notably weave)  # noqa 301
- name: Re-create unique machine-id (as we may just get what comes in the docker image), needed by some CNIs for mac address seeding (notably weave)
- name: Ensure Gluster brick and mount directories exist.
file: "path={{ item }} state=directory mode=0775"
file:
path: "{{ item }}"
state: directory
mode: 0775
with_items:
- "{{ gluster_brick_dir }}"
- "{{ gluster_mount_dir }}"
- name: Configure Gluster volume with replicas
gluster_volume:
gluster.gluster.gluster_volume:
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
replicas: "{{ groups['gfs-cluster'] | length }}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip']|default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip'] | default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
host: "{{ inventory_hostname }}"
force: yes
run_once: true
when: groups['gfs-cluster']|length > 1
when: groups['gfs-cluster'] | length > 1
- name: Configure Gluster volume without replicas
gluster_volume:
gluster.gluster.gluster_volume:
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip']|default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip'] | default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
@@ -12,7 +12,7 @@ This will install a Kubernetes cluster on Equinix Metal. It should work in all l
The terraform configuration inspects variables found in
[variables.tf](variables.tf) to create resources in your Equinix Metal project.
There is a [python script](../terraform.py) that reads the generated `.tfstate`
file to generate a dynamic inventory that is consumed by [cluster.yml](../../..//cluster.yml)
file to generate a dynamic inventory that is consumed by [cluster.yml](../../../cluster.yml)
to actually install Kubernetes with Kubespray.
### Kubernetes Nodes
@@ -60,16 +60,16 @@ Terraform will be used to provision all of the Equinix Metal resources with base
Create an inventory directory for your cluster by copying the existing sample and linking the `hosts` script (used to build the inventory based on Terraform state):
This will be the base for subsequent Terraform commands.
#### Equinix Metal API access
Your Equinix Metal API key must be available in the `PACKET_AUTH_TOKEN` environment variable.
Your Equinix Metal API key must be available in the `METAL_AUTH_TOKEN` environment variable.
This key is typically stored outside of the code repo since it is considered secret.
If someone gets this key, they can startup/shutdown hosts in your project!
@@ -80,10 +80,12 @@ The Equinix Metal Project ID associated with the key will be set later in `clust
For more information about the API, please see [Equinix Metal API](https://metal.equinix.com/developers/api/).
For more information about terraform provider authentication, please see [the equinix provider documentation](https://registry.terraform.io/providers/equinix/equinix/latest/docs).
Example:
```ShellSession
export PACKET_AUTH_TOKEN="Example-API-Token"
export METAL_AUTH_TOKEN="Example-API-Token"
```
Note that to deploy several clusters within the same project you need to use [terraform workspace](https://www.terraform.io/docs/state/workspaces.html#using-workspaces).
@@ -101,7 +103,7 @@ This helps when identifying which hosts are associated with each cluster.
While the defaults in variables.tf will successfully deploy a cluster, it is recommended to set the following values:
- cluster_name = the name of the inventory directory created above as $CLUSTER
- metal_project_id = the Equinix Metal Project ID associated with the Equinix Metal API token above
- equinix_metal_project_id = the Equinix Metal Project ID associated with the Equinix Metal API token above
#### Enable localhost access
@@ -119,12 +121,13 @@ Once the Kubespray playbooks are run, a Kubernetes configuration file will be wr
In the cluster's inventory folder, the following files might be created (either by Terraform
or manually), to prevent you from pushing them accidentally they are in a
`.gitignore` file in the `terraform/metal` directory :
`.gitignore` file in the `contrib/terraform/equinix` directory :
- `.terraform`
- `.tfvars`
- `.tfstate`
- `.tfstate.backup`
- `.lock.hcl`
You can still add them manually if you want to.
@@ -135,7 +138,7 @@ plugins. This is accomplished as follows: