HashiCorp is in a bit of a purple patch regarding product announcements. Having just released Vault 1.4, which we reviewed here, the company has now announced a public beta of Consul 1.8, its next-generation service mesh product.
What is Consul?
HashiCorp’s Consul is a product designed from the ground up to be friendly to both the DevOps community and application developers, making it a good fit for modern, elastic infrastructures. It is a service mesh solution that provides a fully featured control plane with service discovery, configuration, and segmentation functionality.
Sounds good, but what exactly does that mean? At its simplest, Consul is a clustering technology, but to call it that is to understate its functionality. It is so much more than clustering.
Consul has a separate management plane and service plane. At a high level, it provides resilient, highly available access to data across multiple regions, and it keeps that data consistent through consensus. There are servers and clients, and these converse using a gossip protocol. Gossip carries membership information, including automatic discovery of new potential members, which allows the cluster to scale automatically. It also manages failure detection, and it is much more scalable and information-rich than simple heartbeat schemes. The remaining piece is leader election among the servers; this is handled by the consensus protocol described next rather than by gossip, and it is an important function because the leader is responsible for driving that protocol. Gossip itself is split into two types: LAN gossip and WAN gossip.
The consensus protocol manages data consistency and is based on Raft, a consensus algorithm that is itself based on Paxos. Raft is a complex protocol and well beyond the scope of this post. If you are interested in understanding it at more than a surface level, read this paper, but be warned: you will need plenty to drink, as it is drier than the Gobi desert.
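One part of Raft that is easy to show without the paper is the majority rule: a change is only committed once more than half of the servers have accepted it. The sketch below is plain arithmetic rather than anything taken from Consul's code, and the function names are made up for illustration.

```python
def quorum_size(servers: int) -> int:
    """Smallest majority of a cluster with `servers` voting members."""
    if servers < 1:
        raise ValueError("a cluster needs at least one server")
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can be lost while a majority can still be formed."""
    return servers - quorum_size(servers)


if __name__ == "__main__":
    # Print cluster size, quorum size, and tolerated failures.
    for n in (1, 3, 4, 5, 7):
        print(n, quorum_size(n), failure_tolerance(n))
```

Note that four servers tolerate no more failures than three, which is why odd-sized server clusters are the usual choice.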
Each Consul node runs an agent, whose responsibility is to health check the services on the node and to talk to one or more servers. Agents are not required for service discovery or leadership elections. When a server or component needs to discover a service or node, it can query any of the Consul servers or any of the Consul agents; the agents forward queries to the servers automatically.
Each data center runs a cluster of Consul servers. When a cross-data center service discovery or configuration request is made, the local Consul servers forward the request to the remote data center and return the result.
As a result of this communication, the servers build a catalog by aggregating the information submitted by the agents. This catalog maintains the high-level view of the cluster, including which services are available, which nodes run those services, health information, and more.
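To make the catalog a little more concrete, here is a minimal sketch of how a client might turn a health query result into a list of usable endpoints. It assumes a response shaped like the one returned by Consul's `/v1/health/service/<name>` HTTP endpoint; the sample data, addresses, and the `healthy_endpoints` helper are invented for illustration, and no running Consul is needed to try it.

```python
import json
from typing import List, Tuple


def healthy_endpoints(health_response: str) -> List[Tuple[str, int]]:
    """Return (address, port) pairs for instances whose checks all pass."""
    endpoints = []
    for entry in json.loads(health_response):
        checks = entry.get("Checks", [])
        # Skip any instance with a check that is not in the "passing" state.
        if any(check.get("Status") != "passing" for check in checks):
            continue
        service = entry["Service"]
        # An empty service address means "use the address of the node".
        address = service.get("Address") or entry["Node"]["Address"]
        endpoints.append((address, service["Port"]))
    return endpoints


# Hypothetical sample: one healthy instance and one failing instance.
SAMPLE = json.dumps([
    {
        "Node": {"Node": "node-a", "Address": "10.0.0.11"},
        "Service": {"Service": "web", "Address": "", "Port": 8080},
        "Checks": [{"Name": "http", "Status": "passing"}],
    },
    {
        "Node": {"Node": "node-b", "Address": "10.0.0.12"},
        "Service": {"Service": "web", "Address": "10.0.0.12", "Port": 8080},
        "Checks": [{"Name": "http", "Status": "critical"}],
    },
])

if __name__ == "__main__":
    print(healthy_endpoints(SAMPLE))
```

Only the instance on `node-a` survives the filter, which is the same idea Consul applies when it routes traffic away from unhealthy hosts.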
Key features of Consul
Consul has many features. Let’s look at the key ones.
- Service Discovery: Clients of Consul can register a service, such as an API or MySQL, and other clients can query Consul to discover which members provide a given service. Using either DNS or HTTP, applications can easily find the services they depend upon.
- Middleware Automation: the ability to automate the configuration of network devices and to modify hardware load balancers to point to new applications.
- Health Checking: Consul clients can provide any number of health checks, either associated with a given service (for example, “is the webserver returning 200 OK”) or with the local node (for example, “is memory utilization below 90%”). This information can be used by an operator or a third-party monitoring solution to monitor cluster health. It is also used by the service discovery components to route traffic away from unhealthy hosts, thereby maintaining availability.
- KV Store: Applications can make use of Consul’s hierarchical key/value store for any number of purposes, including dynamic configuration, feature flagging, coordination, leader election, and more. The simple HTTP API makes it easy to use. Vault uses this when you scale up from a single server to a highly available and resilient clustered environment.
- Secure Service Communication: Consul can generate and distribute TLS certificates for services to establish mutual TLS connections. Intentions can be used to define which services are allowed to communicate. Service segmentation can be easily managed with intentions that can be changed in real-time instead of using complex network topologies and static firewall rules.
- Multi Datacenter: Consul supports multiple datacenters out of the box. This means users of Consul do not have to worry about building additional layers of abstraction to grow to multiple regions.
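As a rough illustration of the service discovery, health checking, and KV features above, the sketch below builds (but does not send) a registration request for the local agent's HTTP API and decodes a KV read. It assumes an agent listening on the default `127.0.0.1:8500`; the service name, the `/health` path, the ten-second interval, and both helper functions are placeholders chosen for the example.

```python
import base64
import json
import urllib.request

CONSUL_ADDR = "http://127.0.0.1:8500"  # assumption: default local agent address


def build_register_request(name: str, port: int,
                           health_path: str = "/health") -> urllib.request.Request:
    """Build a PUT request that registers a service with an HTTP health check."""
    payload = {
        "Name": name,
        "ID": f"{name}-1",  # hypothetical instance ID
        "Port": port,
        "Check": {
            "HTTP": f"http://127.0.0.1:{port}{health_path}",
            "Interval": "10s",
        },
    }
    return urllib.request.Request(
        url=f"{CONSUL_ADDR}/v1/agent/service/register",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def decode_kv_value(kv_response: str) -> str:
    """Decode the first entry of a KV read; Consul returns values base64-encoded."""
    entries = json.loads(kv_response)
    return base64.b64decode(entries[0]["Value"]).decode("utf-8")


if __name__ == "__main__":
    request = build_register_request("web", 8080)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually register, with an agent running: urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before pointing it at a real agent.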
Now that we have a good overview of what functionality Consul can provide, let’s move on to what’s new in 1.8. This must be prefaced with a caveat: this is a public beta, and what is revealed here may not be in the GA version of the product or, if it is, it could still carry a status tag of unsupported or experimental. What this means is “well, it will maybe work as designed, but we have not done enough testing on it to guarantee stability or performance.”
What’s new in Consul 1.8
HashiCorp has added the following features to Consul 1.8, which aim to lower the barrier to entry for adopting a service mesh in disparate environments. One of the key advantages of Consul is the ability to scale without increasing manual overhead. Consul 1.8 extends service mesh capability to both Kubernetes and non-Kubernetes environments, in any region and in public or private cloud.
Additionally, Consul 1.8 adds some much-needed features to allow “identity-based authentication” and to help fulfill compliance requirements through operational traceability. Unfortunately, these are only available in the Enterprise version.
This release includes the following features:
- Ingress Gateway: Provides a quick on-ramp path to allow applications that reside outside the service mesh to communicate with services within the mesh.
- Terminating Gateway: Allows applications that reside within the service mesh to communicate with existing services outside of the mesh.
- WAN Federation over Mesh Gateway: Simplifies multi-cluster and multi-datacenter Consul environments (federated datacenters) by sending all cross-environment traffic through the Mesh Gateways instead of requiring all Consul servers to be exposed across networks.
- Audit Logging (Consul Enterprise only): Provides a way for security administrators to capture all user actions made in a Consul cluster to fulfill compliance requirements, and carry out risk assessments.
- Single Sign-On (Consul Enterprise only): Enables operators to delegate identity authentication to browser-enabled SSO providers (e.g. Okta). Allows users to self-serve ACL tokens to manage constructs within Consul (groups of services, namespaces, KV, central config, etc.). Provides a workflow to provision tokens for services and users based on SSO identity.
Let’s have a closer look at these new features:
First, we will look at the features that are included in the open-source version of the product. These feature updates will be free to use for all Consul users.
Ingress Gateway: Makes services inside a service mesh available to applications outside of the mesh. It takes an inbound TCP connection from a legacy device or application, terminates it, and sets up a TLS connection into the service mesh.
Terminating Gateway: This works in the opposite direction and provides an egress solution. In this case, the gateway logically spans all egress points, terminates the service mesh TLS connections, and opens a standard TCP connection to the receiving application or device.
WAN Federation over Mesh Gateway: This provides connectivity between networks. However, and I think this is funky, it also provides a solution to protect against overlapping IP addresses and ranges across different datacenters. This is a common issue in enterprises that use containers and Kubernetes-based deployments: their deployment processes will often reuse the same IP ranges for their namespaces across different datacenters or regions. This caused many issues in traditional infrastructure, and it prevents enterprises from joining these deployments into a single environment when traditional VPNs provide the connectivity.
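For a feel of how the ingress and terminating gateways described above are configured, here is a minimal sketch that builds the two config entries as JSON. The `Kind`, `Listeners`, and `Services` field names follow the gateway config entry format as I understand it from the 1.8 beta documentation, so treat them as an assumption that may change before GA; the gateway names, service names, and port are placeholders.

```python
import json


def ingress_gateway_entry(gateway: str, port: int, service: str) -> dict:
    """Config entry exposing one mesh service to callers outside the mesh."""
    return {
        "Kind": "ingress-gateway",
        "Name": gateway,
        "Listeners": [
            {"Port": port, "Protocol": "tcp", "Services": [{"Name": service}]},
        ],
    }


def terminating_gateway_entry(gateway: str, services: list) -> dict:
    """Config entry letting mesh services reach the named external services."""
    return {
        "Kind": "terminating-gateway",
        "Name": gateway,
        "Services": [{"Name": name} for name in services],
    }


if __name__ == "__main__":
    # Placeholder names: a mesh service "api" and an external "legacy-billing".
    print(json.dumps(ingress_gateway_entry("ingress-gateway", 8080, "api"), indent=2))
    print(json.dumps(terminating_gateway_entry("terminating-gateway", ["legacy-billing"]), indent=2))
```

The printed JSON is the kind of document you would save to a file and apply with `consul config write`, once a gateway proxy with the matching name is running.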
Next, we move on to the features that are only part of the paid Enterprise versions. Whilst it is understandable from a commercial point of view why these are only available there, it is still upsetting from a security point of view that they are not available to those who use the open-source version, as they provide significant value.
Audit Logging (Enterprise only)
Consul is the first Hashicorp product outside of Vault to integrate full audit logging across all the endpoints in Consul.
This is an important move towards proving to your audit teams the viability and security of your product, not just a method of proving who did what and when. This log is distributed across
Single Sign-On (Enterprise only)
Prior to Consul 1.8, access to Consul has been via ACL tokens, and the management of those tokens has been done within Consul. This obviously leads to an identity overhead, because Consul has to be managed as a separate identity domain. As of this release you can bind OpenID Connect identities to Consul roles and policies, so you can now log in with Okta, Auth0 or Ping. This is a massive move forward, but it does ignore the elephant in the room: Active Directory. The vast majority of enterprise clients use AD to manage their identity, access and policies, so there will still need to be a bridge between the two identity domains. Once federation has been configured correctly, though, access should be seamless.
Other miscellaneous improvements and changes
As a result of the new functionality, HashiCorp has also made some changes to the UI to improve clarity.
Another improvement builds upon the layer 7 traffic policy management introduced in 1.7, and there have also been further enhancements to JSON Web Token authentication handling.
What it all means
If all these features are in the final GA version of 1.8, it will be a very impressive update to what is already a feature-rich product. The addition of gateway services will lower the bar to entry. One of the major downsides of a traditional service mesh is that you are either inside the mesh and can communicate freely, or outside looking in, wondering what the cool kids are doing. This sets a high bar for entry, as many services will not be able to communicate with your new cloud-centric services. It is a major blocker to adoption for many enterprises, as they are running many disparate application sets across many generations of technology: from mainframes and mini-computers, to physical and virtual servers running Windows and Linux on traditional virtualization or public and private cloud platforms, to newer applications deployed on containers.
The ability to effectively merge Kubernetes clusters residing on different clouds by the use of a Mesh Gateway is a massive step forward in resilience and scalability. Coupled with the ability to handle overlapping IP ranges and addresses, it will make Consul a go-to solution as a service mesh. Integrating with network devices to automate changes to firewall, routing and load-balancing rules can and will speed up application upgrades in a CI/CD environment. Whether the Networking Operations team will welcome this change is another question entirely.
The only downside I can see is that auditing and single sign-on are reserved for the Enterprise version. Smaller companies that do not have the deep pockets a paid environment requires also need the ability to simplify their identity management posture and prove an audit trail.
All in all, this is a solid upgrade to version 1.7, and Consul is still improving.