This documentation is for Deis v1 PaaS. For Workflow (v2) documentation visit https://deis.com/docs/workflow/.

Production deployments

Many Deis users run Deis successfully in production. The following recommendations are optional, but worth reviewing when readying a Deis deployment for production workloads.

Isolating the Planes

Whether built for evaluation or to host production applications, when managing a small Deis cluster (three to five nodes), it is reasonable to accept the platform’s default behavior wherein the Control Plane, Data Plane, and Router Mesh are not isolated from one another. (See Architecture.) This means Control Plane components such as the Controller or Database will be eligible to run on any node, as will the Router Mesh and the Data Plane components such as Logspout, Publisher, and deployed applications.

In larger clusters, however, nodes are more easily thought of as a commodity. Operators may scale clusters out to meet demand or in to conserve resources. In such cases, it is beneficial to isolate the Control Plane, which has no significant need to scale, and optionally the Router Mesh, to a small, fixed number of nodes that are exempt from such scaling events. This eliminates the possibility that Control Plane components running on a decommissioned node will experience downtime as they are rescheduled. Additionally, this reserves the resources of a large (and possibly dynamic) pool of nodes for the workloads that are most likely to scale: applications.

See Isolating the Planes for further details.

Isolating etcd

The Deis Control Plane, Data Plane, and Router Mesh components all depend on an etcd cluster for service discovery and configuration.

Whether built for evaluation or to host production applications, when managing a small Deis cluster (three to five nodes), it is reasonable to accept the platform’s default behavior wherein etcd runs on every node within the cluster.

In larger Deis clusters, however, running etcd on every node can degrade overall cluster performance, since it increases the time required for nodes to reach consensus on writes and leader elections. In such cases, it is beneficial to isolate etcd to a small, fixed number of nodes. All other nodes in the Deis cluster may run an etcd proxy. Proxies forward read and write requests to active participants in the etcd cluster (leader or followers) without affecting the time required for etcd nodes to reach consensus on writes or leader elections.
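To illustrate why the number of etcd members matters, the sketch below computes the majority quorum that etcd's Raft consensus needs before a write is committed. It is illustrative arithmetic only, not a Deis tool; the function names `quorum` and `tolerated_failures` are made up for this example.

```shell
# Illustrative only: majority-quorum arithmetic for an etcd cluster.
# The function names are hypothetical helpers, not part of Deis or etcd.

# Members that must acknowledge a write before it is committed.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Members that can fail while the cluster still accepts writes.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for members in 3 5 9 50; do
  echo "members=$members quorum=$(quorum "$members") tolerated_failures=$(tolerated_failures "$members")"
done
```

A 50-member cluster needs 26 acknowledgements per write where a 5-member cluster needs 3. Proxies are not voting members, so adding proxy nodes leaves the quorum unchanged.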

Note

The benefit of running an etcd proxy on any node not running a full etcd process is that any container or service depending on etcd can connect to etcd easily via localhost from any node in the Deis cluster.

See Isolating etcd and the CoreOS cluster architecture documentation for further details.

Running Deis without Ceph

The Deis Control Plane makes use of Ceph to provide persistent storage for the Registry, Database, and Logger components. The additional operational complexity of Ceph is tolerated because of the need for persistent storage for platform high availability.

Alternatively, persistent storage can be achieved by running an external S3-compatible blob store, PostgreSQL database, and log service. For users on AWS, the convenience of Amazon S3 and Amazon RDS makes the prospect of running a Ceph-less Deis cluster quite reasonable.

Running a Deis cluster without Ceph provides several advantages:

  • Removal of state from the control plane (etcd is still used for configuration)
  • Reduced resource usage (Ceph can use up to 2GB of RAM per host)
  • Reduced complexity and operational burden of managing Deis

See Running Deis without Ceph for details on removing this operational complexity.

Preseeding containers

When a host in your CoreOS cluster fails or becomes unresponsive, the CoreOS scheduler will relocate any cluster services on that machine to another host. These services come up on the new host just fine, but a component’s first task is to pull the corresponding Docker image from Docker Hub. Depending on factors such as available bandwidth, network latency, and performance of the Docker Hub platform, this can take some time. Failover is not finished until the pull completes and the component starts.

To minimize component downtime should failover occur, it is recommended to preseed the Docker images for Deis on all hosts in a cluster. This will pull all the images to the host’s local Docker graph, so if failover should occur, a component can start quickly.

A preseed script is already loaded on the CoreOS hosts.

On all hosts in the cluster, run:

$ /run/deis/bin/preseed

This will pull all component images for the installed version of Deis.
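One way to run the script across a cluster from a workstation is a small SSH loop. This is a sketch under assumptions: the host names are placeholders, `core` is assumed to be the login user (the CoreOS default), and `preseed_all` and `SSH_CMD` are hypothetical names introduced here, the latter only so the loop can be dry-run.

```shell
# Sketch: run the preseed script on each host given as an argument.
# preseed_all and SSH_CMD are hypothetical names, not part of Deis.
preseed_all() {
  for host in "$@"; do
    # SSH_CMD defaults to ssh; set SSH_CMD=echo to print instead of connecting.
    "${SSH_CMD:-ssh}" "core@$host" /run/deis/bin/preseed \
      || echo "preseed failed on $host" >&2
  done
}
```

For example, `SSH_CMD=echo preseed_all host1.example.com host2.example.com` prints the commands that would run; dropping `SSH_CMD=echo` runs them over SSH.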

Review security considerations

Running Deis in production raises some additional security considerations. For example, users can consider enabling a firewall on the CoreOS hosts as well as on the router component.

See Security considerations for details.

Back up data

Backing up data regularly is recommended. See Backing Up and Restoring Data for steps.

Change Registration Mode

Changing the registration mode is highly recommended in production. By default, registration for a new cluster is open to anyone with the proper URL. Once the admin user has registered with a new cluster, it is recommended that you either turn off registration entirely or enable the admin-only registration feature.

See Customizing controller for details.
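As a rough sketch, the registration mode might be changed with `deisctl config`. The key name `registrationMode` and its values (`enabled`, `disabled`, `admin_only`) are assumed here to match those documented under Customizing controller; confirm them there before use. The wrapper `set_registration_mode` and the `DEISCTL` override are hypothetical names used only for this example.

```shell
# Sketch: set the controller's registration mode.
# ASSUMPTION: the key "registrationMode" and the values below match the
# Customizing controller documentation; verify before relying on them.
# set_registration_mode and DEISCTL are hypothetical names.
set_registration_mode() {
  case "$1" in
    enabled|disabled|admin_only) ;;
    *) echo "unknown registration mode: $1" >&2; return 1 ;;
  esac
  # DEISCTL defaults to deisctl; set DEISCTL=echo to print the command only.
  "${DEISCTL:-deisctl}" config controller set "registrationMode=$1"
}
```

After the admin user has registered, `set_registration_mode admin_only` or `set_registration_mode disabled` would apply one of the two options described above.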

Configure logging and monitoring

Many users already have external monitoring or logging systems, and connecting Deis to these platforms is quite simple. Review Platform logging and Platform monitoring.

Enable TLS

Using TLS to encrypt traffic (including Deis client traffic, such as login credentials) is crucial. See Installing SSL for the Platform.