Architecture Recommendations
If you are installing Rancher on a single node, the main architecture recommendation that applies to your installation is that the node running Rancher should be separate from downstream clusters.
Separation of Rancher and User Clusters
A user cluster is a downstream Kubernetes cluster that runs your apps and services.
If you have a Docker installation of Rancher, the node running the Rancher server should be separate from your downstream clusters.
If Rancher is intended to manage downstream Kubernetes clusters, the Kubernetes cluster that the Rancher server runs on should also be separate from the downstream user clusters.
Why HA is Better for Rancher in Production
We recommend installing the Rancher server on a high-availability Kubernetes cluster, primarily because it protects the Rancher server data. In a high-availability installation, a load balancer serves as the single point of contact for clients, distributing network traffic across multiple servers in the cluster and helping to prevent any one server from becoming a point of failure.
We don't recommend installing Rancher in a single Docker container, because if the node goes down, there is no copy of the cluster data available on other nodes and you could lose the data on your Rancher server.
K3s Kubernetes Cluster Installations
One option for the underlying Kubernetes cluster is to use K3s Kubernetes. K3s is Rancher's CNCF certified Kubernetes distribution. It is easy to install and uses half the memory of Kubernetes, all in a binary of less than 100 MB. Another advantage of K3s is that it allows an external datastore to hold the cluster data, allowing the K3s server nodes to be treated as ephemeral.
RKE Kubernetes Cluster Installations
In an RKE installation, the cluster data is replicated on each of three etcd nodes in the cluster, providing redundancy and data duplication in case one of the nodes fails.
Recommended Load Balancer Configuration for Kubernetes Installations
We recommend the following configurations for the load balancer and Ingress controllers:
- The DNS for Rancher should resolve to a Layer 4 load balancer (TCP).
- The Load Balancer should forward port TCP/80 and TCP/443 to all 3 nodes in the Kubernetes cluster.
- The Ingress controller will redirect HTTP to HTTPS and terminate SSL/TLS on port TCP/443.
- The Ingress controller will forward traffic to port TCP/80 on the pod in the Rancher deployment.
Environment for Kubernetes Installations
It is strongly recommended to install Rancher on a Kubernetes cluster on hosted infrastructure such as Amazon's EC2 or Google Compute Engine.
For the best performance and greater security, we recommend a dedicated Kubernetes cluster for the Rancher management server. Running user workloads on this cluster is not advised. After deploying Rancher, you can create or import clusters for running your workloads.
Recommended Node Roles for Kubernetes Installations
The below recommendations apply when Rancher is installed on a K3s Kubernetes cluster or an RKE Kubernetes cluster.
K3s Cluster Roles
In K3s clusters, there are two types of nodes: server nodes and agent nodes. Both servers and agents can have workloads scheduled on them. Server nodes run the Kubernetes master.
For the cluster running the Rancher management server, we recommend using two server nodes. Agent nodes are not required.
RKE Cluster Roles
If Rancher is installed on an RKE Kubernetes cluster, the cluster should have three nodes, and each node should have all three Kubernetes roles: etcd, controlplane, and worker.
Contrasting RKE Cluster Architecture for Rancher Server and for Downstream Kubernetes Clusters
Our recommendation for RKE node roles on the Rancher server cluster contrasts with our recommendations for the downstream user clusters that run your apps and services.
Rancher uses RKE as a library when provisioning downstream Kubernetes clusters. Note: The capability to provision downstream K3s clusters will be added in a future version of Rancher.
For downstream Kubernetes clusters, we recommend that each node in a user cluster should have a single role for stability and scalability.
RKE only requires at least one node with each role and does not require nodes to be restricted to one role. However, for the clusters that run your apps, we recommend separate roles for each node so that workloads on worker nodes don't interfere with the Kubernetes master or cluster data as your services scale.
We recommend that downstream user clusters should have at least:
- Three nodes with only the etcd role to maintain a quorum if one node is lost, making the state of your cluster highly available
- Two nodes with only the controlplane role to make the master component highly available
- One or more nodes with only the worker role to run the Kubernetes node components, as well as the workloads for your apps and services
With that said, it is safe to use all three roles on three nodes when setting up the Rancher server because:
- It allows one
etcd
node failure. - It maintains multiple instances of the master components by having multiple
controlplane
nodes. - No other workloads than Rancher itself should be created on this cluster.
Because no additional workloads will be deployed on the Rancher server cluster, in most cases it is not necessary to use the same architecture that we recommend for the scalability and reliability of downstream clusters.
For more best practices for downstream clusters, refer to the production checklist or our best practices guide.
Architecture for an Authorized Cluster Endpoint (ACE)
If you are using an authorized cluster endpoint (ACE), we recommend creating an FQDN pointing to a load balancer which balances traffic across your nodes with the controlplane
role.
If you are using private CA signed certificates on the load balancer, you have to supply the CA certificate, which will be included in the generated kubeconfig file to validate the certificate chain. See the documentation on kubeconfig files and API keys for more information.
ACE support is available for registered RKE2 and K3s clusters. To view the manual steps to perform on the downstream cluster to enable the ACE, click here.