Technical

Overview for cloud platform engineers.

Integration

cloud-init is our driver: it makes everything happen on your cloud.

Several things must happen for this to work. First, at the virtual/hardware level, cloud-init needs to identify the platform. It does this using dmidecode/SMBIOS, looking at the device information presented by the vendor, in particular the system/chassis information. For example, the system UUID should begin with ec2 on AWS.
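The identification step can be sketched as follows. This is a minimal illustration, not cloud-init's own implementation: it assumes a Linux guest where sysfs exposes the SMBIOS system UUID at /sys/class/dmi/id/product_uuid (usually readable only by root), and the function names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumption: sysfs exposes the same SMBIOS system UUID that dmidecode reports.
DMI_UUID_PATH = Path("/sys/class/dmi/id/product_uuid")


def read_product_uuid(path: Path = DMI_UUID_PATH) -> Optional[str]:
    """Return the system UUID, or None if it is missing or unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        # Covers a missing file as well as insufficient permissions.
        return None


def looks_like_ec2(system_uuid: Optional[str]) -> bool:
    """Hypothetical helper: True if the UUID begins with 'ec2' (any case)."""
    if not system_uuid:
        return False
    return system_uuid.strip().lower().startswith("ec2")


if __name__ == "__main__":
    uuid = read_product_uuid()
    print(f"system uuid: {uuid!r}, looks like EC2: {looks_like_ec2(uuid)}")
```

A check this simple only covers the "begins with ec2" rule mentioned above; a real detector would likely consult further fields.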

Once the platform is identified, the matching datasource can be used to retrieve metadata. Although cloud-init can run through all datasources, we explicitly set the datasource on each cloud platform, as this greatly decreases cloud-init’s run time.

We do, however, see runs where the platform is not identified. In this case identify_aws() returns false and metadata requests aren’t performed. What happens next depends on whether a previous instantiation is found or DataSourceNone runs.

The metadata retrieval is critical to actually standing up the instance: it is this that assigns your chosen key-pair to the ssh/authorized_keys of the cloud user (which we nominate and set up via cloud-init) so that you may actually log into your server.

For EC2-style clouds that support it (i.e. AWS, AliCloud), cloud-init always uses the IMDSv2 protocol. It is important to set a positive datasource timeout so that an auth token request is made for subsequent metadata requests to use; otherwise cloud-init can silently fail to retrieve data.
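The IMDSv2 exchange can be sketched as below: a PUT obtains a session token, and each later metadata request presents it in a header. This is a minimal illustration using the AWS endpoint and header names, not cloud-init's code; the function names are hypothetical, and the rejection of a non-positive timeout is an assumption made here to mirror the advice above.

```python
import urllib.request

# AWS instance metadata endpoint; other EC2-style clouds use their own address.
IMDS_BASE = "http://169.254.169.254/latest"
TOKEN_TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds"
TOKEN_HEADER = "X-aws-ec2-metadata-token"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Build the PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={TOKEN_TTL_HEADER: str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Build a GET for a metadata path, authenticated with the token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path.lstrip('/')}",
        headers={TOKEN_HEADER: token},
    )


def fetch_metadata(path: str, timeout: float) -> str:
    """Fetch one metadata value; only works from inside an instance."""
    if timeout <= 0:
        # Assumption for this sketch: refuse to proceed without a usable timeout.
        raise ValueError("timeout must be positive")
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    request = build_metadata_request(path, token)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode()
```

From inside an instance, fetch_metadata("public-keys/0/openssh-key", timeout=5) would be expected to return the public half of the chosen key-pair.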

Our cloud-init is configured with DEBUG-level logging, directed to the console. If your vendor supports viewing the console logs, you should see the results of your cloud-init run. Of particular importance is the part that shows the authorized public keys:

ci-info: +---------+-------------------------------------------------------------------------------------------------+---------+---------+
ci-info: | Keytype |                                       Fingerprint (sha256)                                      | Options | Comment |
ci-info: +---------+-------------------------------------------------------------------------------------------------+---------+---------+
ci-info: | ssh-rsa | 0d:7b:f2:be:3a:9a:41:cd:f2:b6:00:b6:9a:45:26:df:a7:6d:42:ed:71:07:b5:3c:98:2b:d0:ee:f8:f5:77:25 |    -    |  mykey  |
ci-info: +---------+-------------------------------------------------------------------------------------------------+---------+---------+

You may verify this is indeed the fingerprint of your key:

$ awk '{print $2}' ~/.ssh/MYKEY.pub | base64 -d | sha256sum | cut -d ' ' -f 1 | sed 's/../&:/g;s/:$//'
0d:7b:f2:be:3a:9a:41:cd:f2:b6:00:b6:9a:45:26:df:a7:6d:42:ed:71:07:b5:3c:98:2b:d0:ee:f8:f5:77:25
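The same check can be written in Python, which is convenient when comparing many keys against console logs. This is a sketch equivalent to the shell pipeline above; the function name is hypothetical, and it assumes a public key line in the usual "type base64-blob comment" layout.

```python
import base64
import hashlib


def sha256_fingerprint(public_key_line: str) -> str:
    """Return the colon-separated SHA-256 hex fingerprint of a public key line."""
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<keytype> <base64-blob> [comment]'")
    # The second field is the base64-encoded key blob, as picked out by awk '{print $2}'.
    blob = base64.b64decode(fields[1])
    digest = hashlib.sha256(blob).hexdigest()
    # Group the hex digest into byte pairs joined by colons.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


if __name__ == "__main__":
    import sys

    # Usage: python fingerprint.py ~/.ssh/MYKEY.pub
    with open(sys.argv[1]) as handle:
        print(sha256_fingerprint(handle.read()))
```

The printed value should match the Fingerprint (sha256) column that cloud-init logs for the same key.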

We have a strong governance suite which verifies/instruments this on the cloud marketplace instances we ship.

Operating System

Our BastionLinux brand is a Red Hat/Fedora derivative, but we have added it to all of the tools we ship so that it is recognised as its own platform family within Red Hat. We are interested in establishing that operating system/brand on your cloud. Alternatively, we need to know the most appropriate OS to register our AMIs as.

Instance Types

Our strategy for selecting targets is fairly simple: we ship to what we believe is most transparent to a sophisticated user.

On AWS, we ship only to Nitro-based systems that support console logging.

Pricing

Our current pricing strategy is to charge a margin over the vendor’s base price. We do this by looking at the per-region prices of the instance types. This turns out to be non-trivial for vendors like AWS, where we presently use the awspricing third-party library. This library caches ~0.5G of information in order to calculate prices, and we need to warm up this cache so that applications using it do not time out. Building this cache takes ~80 minutes. It is difficult to comprehend how pricing can be so resource-intensive!
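The margin arithmetic itself is simple once the base prices are available. The sketch below assumes the per-region base prices have already been looked up and are passed in as a plain mapping; it does not call awspricing, and the function name, margin, and rounding to four decimal places are all hypothetical choices made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict, Tuple

# Key: (region, instance type). Value: the vendor's hourly base price.
PriceTable = Dict[Tuple[str, str], Decimal]


def apply_margin(base_prices: PriceTable, margin: Decimal) -> PriceTable:
    """Return hourly prices with a fractional margin (e.g. 0.10 = 10%) added."""
    if margin < 0:
        raise ValueError("margin must not be negative")
    factor = Decimal(1) + margin
    return {
        key: (price * factor).quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
        for key, price in base_prices.items()
    }


if __name__ == "__main__":
    # Hypothetical figures, not real vendor prices.
    example = {("us-east-1", "t3.micro"): Decimal("0.0100")}
    for (region, itype), price in apply_margin(example, Decimal("0.10")).items():
        print(region, itype, price)
```

Decimal is used rather than float so that small hourly prices do not accumulate binary rounding error.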