TransWikia.com

How to automate the key exchange in WireGuard when you deploy a cluster of machines?

DevOps Asked on December 30, 2021

Let’s assume you want to deploy a cluster of machines on Hetzner Cloud. For simplicity, let’s call them worker1, worker2, worker3. They need to communicate with a server called master, which will be running on a different account than the workers. Ideally, the whole setup should not be open to the internet. Unfortunately, Hetzner supports only private networks within the same account.

To make it work, you can set up your own VPN using WireGuard. Conceptually, it is not hard. You need to set up three connections (one between the master and each worker). The tricky part is how to automate the key exchange. Ideally, it should not be more work if you deploy additional workers (e.g. 100 instead of 3 workers).

Setting up such a VPN cluster sounds like a common problem, but I cannot find any recommendations on how to set up 1-to-n or n-to-m connections, only tutorials on how to peer two machines. I’m thinking of automating the key exchange with Ansible (generate keys, gather them, install them on the master), but wanted to check first whether there is an easier solution to the problem that I missed.

In SSH, workers could share their key, which would simplify the problem. In WireGuard, keys cannot be shared, as far as I understood. How would you automate the setup of a VPN with WireGuard, so each worker can reach the master? Or is WireGuard the wrong choice for the problem?

Clarification:

  • In my scenario, it is not possible to move the workers and master to the same account; otherwise, Hetzner networks would be the straightforward solution for setting up a private network.
  • If you are not familiar with Hetzner Cloud, it is not a problem. You can assume that you get normal Linux machines, but then you are on your own (it does not support VPC peering across accounts as AWS does). Yet you can use all Linux tools available for creating the VPN setup. WireGuard would be my first choice, but I’m open to other techniques.

3 Answers

The githubixx wireguard ansible role might be what you are looking for. The author of the role says:

I use WireGuard to setup a fully meshed VPN (every host can directly connect to every other host) and run my Kubernetes (K8s) cluster at Hetzner Cloud (but you should be able to use any hoster you want). So the important components like the K8s controller and worker nodes (which includes the pods) only communicate via encrypted WireGuard VPN. Also (as already mentioned) I've two clients. Both have kubectl installed and are able to talk to the internal Kubernetes API server by using WireGuard VPN.

I reviewed the tasks and the template, but I haven't tried it out myself yet. It supports different versions of Linux. It installs WireGuard. It handles key generation, distribution, and WireGuard configuration for a full mesh VPN. For each host, it configures an interface with the host's private key as well as a peer configuration for each of the other hosts using their public keys. You can set variables in the Ansible inventory for each host (e.g. addresses) to tailor the configurations to your network.
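As a rough illustration of what such a full-mesh layout amounts to, here is a small Python sketch. It is not the role's actual template; the host names, addresses, port, and key placeholders are made up for the example. Each host gets one interface section with its own private key and one peer section per other host:

```python
# Sketch: derive one WireGuard config per host for a full mesh.
# All names, addresses, and keys below are hypothetical placeholders.

def render_mesh_config(hosts, name, port=51820):
    """Return wg-quick style config text for the host called `name`."""
    me = hosts[name]
    lines = [
        "[Interface]",
        f"Address = {me['vpn_ip']}/24",
        f"PrivateKey = {me['private_key']}",
        f"ListenPort = {port}",
    ]
    for peer_name, peer in sorted(hosts.items()):
        if peer_name == name:
            continue  # a host never lists itself as a peer
        lines += [
            "",
            f"# {peer_name}",
            "[Peer]",
            f"PublicKey = {peer['public_key']}",
            f"AllowedIPs = {peer['vpn_ip']}/32",
            f"Endpoint = {peer['public_ip']}:{port}",
        ]
    return "\n".join(lines) + "\n"


hosts = {
    "master": {"vpn_ip": "10.8.0.1", "public_ip": "203.0.113.10",
               "private_key": "<master-private>", "public_key": "<master-public>"},
    "worker1": {"vpn_ip": "10.8.0.2", "public_ip": "203.0.113.11",
                "private_key": "<worker1-private>", "public_key": "<worker1-public>"},
    "worker2": {"vpn_ip": "10.8.0.3", "public_ip": "203.0.113.12",
                "private_key": "<worker2-private>", "public_key": "<worker2-public>"},
}

print(render_mesh_config(hosts, "worker1"))
```

Adding a host then means adding one inventory entry and re-rendering every config, which is the part the role automates.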

Answered by jzzz on December 30, 2021

I've got an Ansible role that may help you. It was designed to act as a gateway, but I think it may still work for you (hopefully without having to tweak it).

The most helpful aspect is that it can run in either client or server mode. So what you would do in your playbook:

  1. Run it on master in server mode.
  2. Save the generated public key to a variable.
  3. Loop through the workers running it in client mode providing the server's public key.
  4. Collect all of the public keys into a list variable.
  5. Rerun it on master in server mode, this time with all of the worker keys.
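The data flow of these five steps can be sketched in Python. This is only an illustration of the ordering, not the role's interface; the function names, addresses, port, and key placeholders are assumptions made up for the example:

```python
# Sketch of the two-pass flow: workers only need the master's public key,
# and the master is rendered again once all worker public keys are collected.
# Addresses, port, and key strings are hypothetical placeholders.

PORT = 51820


def worker_config(worker, master):
    """Client side: a single peer, the master."""
    return (
        "[Interface]\n"
        f"Address = {worker['vpn_ip']}/24\n"
        f"PrivateKey = {worker['private_key']}\n"
        "\n"
        "[Peer]\n"
        f"PublicKey = {master['public_key']}\n"
        f"AllowedIPs = {master['vpn_ip']}/32\n"
        f"Endpoint = {master['public_ip']}:{PORT}\n"
    )


def master_config(master, collected):
    """Server side: one peer per collected (vpn_ip, public_key) pair."""
    lines = [
        "[Interface]",
        f"Address = {master['vpn_ip']}/24",
        f"PrivateKey = {master['private_key']}",
        f"ListenPort = {PORT}",
    ]
    for vpn_ip, public_key in collected:
        lines += ["", "[Peer]", f"PublicKey = {public_key}",
                  f"AllowedIPs = {vpn_ip}/32"]
    return "\n".join(lines) + "\n"


master = {"vpn_ip": "10.8.0.1", "public_ip": "203.0.113.10",
          "private_key": "<master-private>", "public_key": "<master-public>"}
workers = [{"vpn_ip": f"10.8.0.{i + 1}",
            "private_key": f"<worker{i}-private>",
            "public_key": f"<worker{i}-public>"} for i in range(1, 4)]

# Steps 1-2: after the first run, the master's public key is known.
# Steps 3-4: configure each worker and collect its public key.
worker_configs = [worker_config(w, master) for w in workers]
collected = [(w["vpn_ip"], w["public_key"]) for w in workers]
# Step 5: render the master again, now with every worker as a peer.
final_master = master_config(master, collected)
```

The worker count only affects the length of the loop, so 100 workers are no more manual work than 3.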

Then, you should be good to go. I normally don't put all of this together, just do a single run for each device, but if you want to automate the entire thing (especially if you have many workers), then you should be able to get all of that into a single playbook.

For more information, see the role documentation.

I don't recommend sharing keys across devices for the reason you mentioned.

Answered by colan on December 30, 2021

(Disclaimer: As the question has not received an answer yet, I came to the conclusion that there is currently no out-of-the-box solution. Thus, I'm sharing what I ended up doing, and hope it will help others. Still, I would be interested to learn of a better solution if there is one.)


WireGuard can be used, but every machine needs its own key pair, and each end of a connection must know the other end's public key. Managing that by hand becomes tedious as the number of machines increases. Therefore, it is advisable to automate it:

  1. Generate all keys in advance
  2. Distribute the keys to the machines using Ansible (e.g. by generating the systemd-networkd configuration files wg0.network and wg0.netdev)
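A minimal Python sketch of these two steps, assuming the `wg` command-line tool is installed on the machine that generates the keys. The slot names, the 10.8.0.0/24 addressing, and the port are assumptions for the example, and the rendered text would still have to be copied to the machines (e.g. by Ansible):

```python
import subprocess


def wg_keypair():
    """Generate a (private, public) key pair with the `wg` CLI."""
    private = subprocess.run(["wg", "genkey"], capture_output=True,
                             text=True, check=True).stdout.strip()
    # `wg pubkey` reads the private key from stdin.
    public = subprocess.run(["wg", "pubkey"], input=private,
                            capture_output=True, text=True,
                            check=True).stdout.strip()
    return private, public


def pregenerate(max_workers, keygen=wg_keypair):
    """Step 1: create keys for the master and every worker slot up front."""
    slots = {}
    for i in range(max_workers + 1):  # slot 0 is the master
        name = "master" if i == 0 else f"worker{i}"
        private, public = keygen()
        # Hypothetical addressing; a /24 has room for at most 253 workers.
        slots[name] = {"private": private, "public": public,
                       "ip": f"10.8.0.{i + 1}"}
    return slots


def master_netdev(slots, port=51820):
    """Step 2 (master side): contents of wg0.netdev for systemd-networkd."""
    parts = ["[NetDev]", "Name=wg0", "Kind=wireguard", "",
             "[WireGuard]", f"PrivateKey={slots['master']['private']}",
             f"ListenPort={port}"]
    for name, slot in slots.items():
        if name == "master":
            continue
        parts += ["", "[WireGuardPeer]", f"PublicKey={slot['public']}",
                  f"AllowedIPs={slot['ip']}/32"]
    return "\n".join(parts) + "\n"


def network_file(ip):
    """Contents of wg0.network, which assigns the VPN address."""
    return f"[Match]\nName=wg0\n\n[Network]\nAddress={ip}/24\n"
```

Calling `pregenerate(100)` once and keeping the result in a vault or encrypted file would give every future worker a fixed slot. The worker-side wg0.netdev is analogous, with a single peer section for the master that also carries its `Endpoint`.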

Note that step 1 assumes that the maximum number of machines is known in advance. If you cannot provide an upper bound, this approach will not work. Instead you will need to implement a dynamic key generation and exchange protocol.

If you end up using fewer machines than the configured upper bound, it will still work. Distributing the keys means that WireGuard will allow connections from different machines, but it does not require that all peers are always available.


Scaling: In my setup, the upper bound was in the order of 100. That works without problems, but I did not test how it scales if you need to go higher (e.g. to the order of 1000 or 10,000).

What if there is no upper bound? -- That is a more difficult problem, and I don't know how to solve it best. As said, you would need to generate keys dynamically and exchange them. I was considering opening SSH and using it to exchange WireGuard keys generated on the fly.
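To give an idea of what the master would have to do in that case, here is a small Python sketch of the registration step only. How the worker's public key reaches the master (e.g. over SSH) and how the worker is authenticated are deliberately left out, and the subnet and function names are assumptions for the example. It relies on `wg set`, which adds a peer to a running interface:

```python
import ipaddress


def next_free_ip(network, used):
    """Pick the first host address in `network` that is not in `used`."""
    for host in ipaddress.ip_network(network).hosts():
        if str(host) not in used:
            return str(host)
    raise RuntimeError("VPN subnet exhausted")


def add_peer_command(public_key, vpn_ip, interface="wg0"):
    """Build the `wg set` call that registers a new worker on the master."""
    return ["wg", "set", interface, "peer", public_key,
            "allowed-ips", f"{vpn_ip}/32"]


# Hypothetical state: the master already uses 10.8.0.1.
used = {"10.8.0.1"}
ip = next_free_ip("10.8.0.0/24", used)
command = add_peer_command("<new-worker-public-key>", ip)
# The command would be executed on the master, e.g. with subprocess.run(command).
```

Note that `wg set` only changes the running interface, so the new peer would also have to be written to the configuration files to survive a restart.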

Note that with SSH, you can reuse the same key on different machines. That eliminates the need to exchange dynamically generated keys. But be aware that reusing the same key for all machines is a trade-off between convenience and security. If one machine gets compromised, the attacker will gain access to all other machines, too.

Maybe there is some solution that is more elegant and more secure. Perhaps using WireGuard directly is not a good idea if you need to support an arbitrary number of dynamic hosts. It is easy to introduce subtle security issues when implementing your own key exchange protocol.

Answered by Philipp Claßen on December 30, 2021
