Note to self: DNS naming conventions for infrastructure

Published Aug 6, 2019. 6 minutes to read.
Tagged with Systems, Networking.

The DNS naming scheme I use for infrastructure was designed to:

  • Describe “what”, in “what state”, and “where”; and
  • Be easily queryable and groupable in the inventory management system; and
  • Be translatable to reverse domain name notation (e.g. com.example.infra...) for queries and for documentation, for example as a wiki structure.
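The translation to reverse domain name notation can be sketched in a few lines of Python. This is a minimal illustration, not part of the original scheme: the function names to_reverse_notation and to_wiki_path are hypothetical, and the "/" separator for wiki paths is an assumption about how a wiki might be laid out.

```python
def to_reverse_notation(fqdn: str) -> str:
    """Reverse the labels of a DNS name, e.g. a.b.example.com -> com.example.b.a."""
    labels = fqdn.strip().rstrip(".").lower().split(".")
    return ".".join(reversed(labels))


def to_wiki_path(fqdn: str) -> str:
    """Same reversal, joined with "/" (assumed wiki page hierarchy separator)."""
    return to_reverse_notation(fqdn).replace(".", "/")


if __name__ == "__main__":
    name = "e8.c1.g1.nginx.us-east-1.aws.infra.example.com"
    print(to_reverse_notation(name))
    print(to_wiki_path(name))
```

Because the most general component comes first after reversal, names sort and nest naturally by root, vendor, location, and so on.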

DOs

  • Document things. Especially clusters and generations within the same service root.
  • Document relationships between different services, generations, and clusters.
  • Isolate your DNS. Infrastructure DNS should not be queryable from unauthorized devices (zero trust - remember?).

DON’Ts

  • Don’t include business purposes. A DNS name should work like a street address: it is “Duke's at 1596 2nd Ave, New York”, not “I will have a beer and some wings at Duke's, at 1596 2nd Ave, New York”. The extra detail is superfluous and useless.
  • Don’t reuse identifiers after obsoleting them. Assign a new one; there are plenty to go around. Immutable infrastructure is the exception: there you replace a thing with the same thing in a different state, so reuse is permissible.
  • DO NOT register these DNS entries in a public DNS server (provided by your registrar, Cloudflare, and who knows what else).

Components of infrastructure asset FQDN

<edge> - Identifier conforming to e[1-9][0-9]*. A network “edge” - server, virtual machine, router, anything.

You should not use different prefixes for edge nodes (for example, to identify the purpose of the node). Node identifiers should be meaningless - but documented.

Identifiers of discarded infrastructure should not be reused, with one exception: deliberately replacing infrastructure like for like. Immutable infrastructure deployment is a good example, and replacing identical hardware is another.

<cluster> - Cluster identifier conforming to c[1-9][0-9]*. It identifies a group of nodes somehow linked to one another to deliver the same service within the same generation. Independent nodes that will (supposedly) never be part of a cluster should never be grouped under a single cluster of nodes. For example, 3 independent MySQL servers should be e1.c1, e1.c2 and e1.c3, not e<1-3>.c1.

<gen> - Deployment generation conforming to g[1-9][0-9]*. Bump the generation every time you destructively migrate a thing to another state, for example when you spin up a new version of something, migrate everything there, and discard the old hosts. This is unlikely to happen often. A new generation is also used when the same service is deployed twice with different versions concurrently, for example PostgreSQL 9 and 11 (for some reason). You should not use generation numbers to reflect version numbers and version increments of the service group.

<sgroup> - Service group conforming to [a-z-]+. The service group describes what service the node provides. For nodes with more than one service, come up with your own short name and document it, for example jamy for Java + MySQL. For single-service nodes, use the name of the service. Do not use pointless and generic names like web, vpn, etc. The only case where such naming is permissible is purpose-specific hardware, like ups for UPS or hwfw for hardware firewall, but even then it is better to use the vendor name or similar, e.g. apc or mcrs (MikroTik CRS).

<loc> - Physical location / region / data center - [a-z0-9-]+. Should use the name of an availability zone, region, or data center as used by the vendor. If the vendor provides several location tags, separate them with a dash (-), for example r3-dc2-de for Rack 3, DC 2 in Germany. Invent your own for local assets. City short name + N is a good start, for example nyc1 for New York office 1.

<vendor> - Provider / vendor / co-locator - [a-z-]+. Examples include aws, gcloud, ibm, azure, and so forth. Use local for assets directly owned and controlled by the organization, e.g. office hardware. This is NOT to be used to identify the manufacturer of hardware and such.

<root> - A root domain. For infrastructure used organization-wide, infra.example.com is a good choice. Multiple customers? Use a customer-specific second-level domain (infra.customer.com) or customer-specific third-level domain (infra.customer.example.com). Standalone project? Use a project-specific second-level domain (infra.project.com) or project-specific third-level domain (infra.project.example.com). Office hardware? Same rules. Infrastructure is infrastructure and business purpose is irrelevant.

Full format

<edge>.<cluster>.<gen>.<sgroup>.<loc>.<vendor>.<root>
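To make the format concrete, here is a minimal validation sketch in Python that splits a name into its components using the patterns described above. The names AssetName and parse_asset_name are hypothetical. Two things are assumptions rather than part of the scheme: the location pattern also admits digits, so that examples such as us-east-1 and nyc1 match, and the root is treated as "everything after the vendor label".

```python
import re
from typing import NamedTuple, Optional


class AssetName(NamedTuple):
    edge: str
    cluster: str
    gen: str
    sgroup: str
    loc: str
    vendor: str
    root: str


# One named group per component, in the order of the full format.
_PATTERN = re.compile(
    r"^(?P<edge>e[1-9][0-9]*)\."
    r"(?P<cluster>c[1-9][0-9]*)\."
    r"(?P<gen>g[1-9][0-9]*)\."
    r"(?P<sgroup>[a-z-]+)\."
    r"(?P<loc>[a-z0-9-]+)\."  # assumption: digits allowed (us-east-1, nyc1)
    r"(?P<vendor>[a-z-]+)\."
    r"(?P<root>[a-z0-9-]+(?:\.[a-z0-9-]+)+)$"  # assumption: at least two labels
)


def parse_asset_name(fqdn: str) -> Optional[AssetName]:
    """Return the parsed components, or None if the name does not conform."""
    match = _PATTERN.match(fqdn.strip().rstrip(".").lower())
    return AssetName(**match.groupdict()) if match else None


if __name__ == "__main__":
    print(parse_asset_name("e8.c1.g1.nginx.us-east-1.aws.infra.example.com"))
    print(parse_asset_name("web01.example.com"))  # does not conform
```

A check like this could run wherever names are created, for example in an inventory import, so that non-conforming names are rejected early.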

Common examples for web infrastructure

e8.c1.g1.nginx.us-east-1.aws.infra.example.com

e3.c3.g1.mysql.eu-west-1.aws.infra.example.com

e5.c2.g1.mongo.wdc07.ibm.infra.example.com

e1.c1.g1.mssql.eastus2.azure.infra.example.com

Example for your LAN

e1.c1.g1.unifiap.nyc1.local.infra.example.com

e1.c1.g1.optiplex.nyc1.local.infra.example.com
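Since every name has the same positional layout, grouping an inventory by any component is a simple split. The sketch below groups the example names above; split_name, group_by, and the HOSTS list are hypothetical helpers for illustration, and the code assumes every name already follows the full format.

```python
from collections import defaultdict

FIELDS = ("edge", "cluster", "gen", "sgroup", "loc", "vendor")

HOSTS = [
    "e8.c1.g1.nginx.us-east-1.aws.infra.example.com",
    "e3.c3.g1.mysql.eu-west-1.aws.infra.example.com",
    "e5.c2.g1.mongo.wdc07.ibm.infra.example.com",
    "e1.c1.g1.mssql.eastus2.azure.infra.example.com",
    "e1.c1.g1.unifiap.nyc1.local.infra.example.com",
    "e1.c1.g1.optiplex.nyc1.local.infra.example.com",
]


def split_name(fqdn):
    """Map the first six labels to their component names; the rest is the root."""
    labels = fqdn.lower().rstrip(".").split(".")
    parts = dict(zip(FIELDS, labels[:6]))
    parts["root"] = ".".join(labels[6:])
    return parts


def group_by(hosts, field):
    """Group host names by one component, preserving input order."""
    groups = defaultdict(list)
    for host in hosts:
        groups[split_name(host)[field]].append(host)
    return dict(groups)


if __name__ == "__main__":
    for vendor, names in group_by(HOSTS, "vendor").items():
        print(vendor, len(names))
```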

Edges, clusters, and cluster routing

In addition to resource-specific DNS naming, you might sometimes need to address a group of nodes or route to a specific node in the cluster, for example the current primary of the cluster.

Note that adding these records is on an as-needed basis. You do not need to create virtual records like these unless you intend to use them. Don’t expect these records to be present unless they need to be present.

Addressing all nodes in a cluster or section for load distribution and discovery

There are several scenarios where it might be necessary to be able to discover or even load-distribute between one or more nodes in a cluster or several clusters.

To do this, you can use a DNS record that points to ALL nodes in the given section, simply by omitting the edge label, or both the edge and cluster labels.

For example, to address all nodes in a cluster, you can use this DNS form:

<cluster>.<gen>.<sgroup>.<loc>.<vendor>.<root>

Or to address all nodes in all clusters in one generation:

<gen>.<sgroup>.<loc>.<vendor>.<root>

Use A/AAAA records in this virtual record to point at the member nodes. You should not use level-specific DNS to address nodes that might be split logically at a later date. For example, if all nodes in the cluster are equal now, but there is a technical possibility they might not be eventually, use partitioning instead. This is a pre-optimization worth having.
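As a rough sketch, the cluster-level and generation-level records can be derived from the node names alone by dropping leading labels. The function group_records is hypothetical, and the node addresses are made up from the 192.0.2.0/24 documentation range.

```python
from collections import defaultdict


def group_records(nodes):
    """Build virtual records from a {node_fqdn: [addresses]} mapping.

    Dropping the edge label gives the cluster-level name; dropping the edge
    and cluster labels gives the generation-level name.
    """
    records = defaultdict(set)
    for fqdn, addresses in nodes.items():
        labels = fqdn.rstrip(".").split(".")
        cluster_name = ".".join(labels[1:])  # <cluster>.<gen>.<sgroup>...
        gen_name = ".".join(labels[2:])      # <gen>.<sgroup>...
        records[cluster_name].update(addresses)
        records[gen_name].update(addresses)
    return {name: sorted(ips) for name, ips in records.items()}


if __name__ == "__main__":
    nodes = {
        "e1.c1.g1.nginx.us-east-1.aws.infra.example.com": ["192.0.2.10"],
        "e2.c1.g1.nginx.us-east-1.aws.infra.example.com": ["192.0.2.11"],
        "e1.c2.g1.nginx.us-east-1.aws.infra.example.com": ["192.0.2.20"],
    }
    for name, ips in group_records(nodes).items():
        print(name, ips)
```

Each resulting name would carry one A/AAAA record per listed address.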

Addressing a group of nodes in a cluster - partitioning

To address several special nodes in a cluster that logically differ from the others, you can create a virtual record in this form:

<partition>.<cluster>.<gen>.<sgroup>.<loc>.<vendor>.<root>

<partition> - Identifier conforming to p[1-9][0-9]*. A logical group of nodes in a cluster. The purpose of each group should be documented somewhere.

It is OK to have all nodes listed in a partition. Use A/AAAA records in this virtual record to point at the member nodes.
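A partition record can be sketched as a set of zone-file-style lines, one per member address. The function partition_records is hypothetical, the addresses come from documentation ranges, and the output format (no TTL, trailing dot on the owner name) is only one plausible way to write such records.

```python
import ipaddress
import re

PARTITION_RE = re.compile(r"^p[1-9][0-9]*$")


def partition_records(partition, cluster_name, addresses):
    """Return zone-file-style A/AAAA lines for <partition>.<cluster_name>."""
    if not PARTITION_RE.match(partition):
        raise ValueError(f"invalid partition identifier: {partition!r}")
    owner = f"{partition}.{cluster_name.rstrip('.')}."
    lines = []
    for address in addresses:
        ip = ipaddress.ip_address(address)  # raises ValueError on bad input
        rtype = "A" if ip.version == 4 else "AAAA"
        lines.append(f"{owner} IN {rtype} {ip}")
    return lines


if __name__ == "__main__":
    cluster = "c1.g1.mysql.eu-west-1.aws.infra.example.com"
    for line in partition_records("p1", cluster, ["192.0.2.10", "2001:db8::10"]):
        print(line)
```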

Addressing the current primary node in the cluster

Use the primary label. Simple as that.

primary.<cluster>.<gen>.<sgroup>.<loc>.<vendor>.<root>

If possible, use A/AAAA records pointing to a floating IP that is assigned to the current primary node. If not, a CNAME can be used, as can a plain A/AAAA record that points to the non-floating IP of the primary node.
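The preference order above can be sketched as a small helper: a floating IP yields an A/AAAA record, otherwise the record falls back to a CNAME for the current primary node. The function primary_record and its parameters are hypothetical, and the zone-file-style output is an assumption about how the record would be written.

```python
import ipaddress
from typing import Optional


def primary_record(
    cluster_name: str,
    floating_ip: Optional[str] = None,
    primary_node: Optional[str] = None,
) -> str:
    """Return one zone-file-style line for primary.<cluster_name>."""
    owner = f"primary.{cluster_name.rstrip('.')}."
    if floating_ip is not None:
        # Preferred: the record never changes, the floating IP moves instead.
        ip = ipaddress.ip_address(floating_ip)
        rtype = "A" if ip.version == 4 else "AAAA"
        return f"{owner} IN {rtype} {ip}"
    if primary_node is not None:
        # Fallback: alias the node name; must be updated on every failover.
        return f"{owner} IN CNAME {primary_node.rstrip('.')}."
    raise ValueError("either floating_ip or primary_node is required")


if __name__ == "__main__":
    cluster = "c3.g1.mysql.eu-west-1.aws.infra.example.com"
    print(primary_record(cluster, floating_ip="192.0.2.50"))
    print(primary_record(cluster, primary_node="e3." + cluster))
```

With the CNAME or non-floating variants, whatever handles failover also has to update the record, and clients will see the old primary until the record's TTL expires.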

© Matīss Treinis 2022, all rights, some wrongs and most of the lefts reserved.
Unless explicitly stated otherwise, this article is licensed under a Creative Commons Attribution 4.0 International License.
All software code samples available in this page as part of the article content (code snippets and similar) are licensed under the terms and conditions of Apache License, version 2.0.