Anthony J. Martinez

Quick Look @ netns

A quick and dirty approach to creating a network namespace for an application that is otherwise unable to bind a specific interface. As an added bonus, the setup specifies a distinct DNS server for the namespace itself.

A few details about the example system:

  1. The default network interface is eth0
  2. We want to jail an application's network to VLAN 42
  3. There is a forwarding DNS server on the router managing VLAN 42
  4. We will call the network namespace the_answer
  5. We will use 172.30.242.16/29 as an IPv4 subnet unlikely to immediately collide with anything else

First make sure DNS works:

# as root (directly or with appropriate sudo usage)
mkdir -p /etc/netns/the_answer
echo -n "nameserver 172.30.242.17\nsearch .\n" > /etc/netns/the_answer/resolv.conf

Now it's time to flex iproute2 for all it's worth:

# as root (directly or with appropriate sudo usage)
ip netns add the_answer
ip link add name eth0.42 link eth0 type vlan id 42
ip link set dev eth0.42 netns the_answer
ip netns exec the_answer ip link set dev eth0.42 up
ip netns exec the_answer ip address add 172.30.242.18/29 broadcast 172.30.242.23 dev eth0.42
ip netns exec the_answer ip route add default via 172.30.242.17 dev eth0.42

At this point you have the_answer configured and can launch network applications inside it that will utilize the available interface, route, and DNS without having to know anything about their configuration in advance. Just run:

ip netns exec the_answer [your_application] [args]
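
Before trusting the namespace, a quick sanity check is worthwhile. A minimal sketch, assuming the router at 172.30.242.17 answers pings and serves DNS:

# as root: is the gateway reachable, and is the namespace resolv.conf honored?
ip netns exec the_answer ping -c 1 172.30.242.17
ip netns exec the_answer getent hosts debian.org

Note that ip netns exec bind-mounts the files under /etc/netns/the_answer over /etc for the duration of the command, which is how the namespace gets its own resolv.conf.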

Migration to axum

It has been a long time coming, but I have finally completed the migration to axum on my arse project. The current release, v0.18.1, adds the ability to serve files from sub-directories within the asset routes and fixes a mime-type typo that was preventing the content of arse-served sites from loading correctly in lynx. The focus of this release was the migration itself, and the resulting code could stand to be refactored, so another release will happen when I have time.

Shared Access with GPG

For probably four or five years I have battled with the annoying need to pkill gpg-agent before I could ssh somehost and use the auth key on my smartcard for authentication. Afterwards, I had to kill pcscd, usually with systemctl or just by unplugging my smartcard and plugging it back in again. Today, I fixed that with the few short lines below:

Add the following to ${HOME}/.gnupg/scdaemon.conf:

# On Fedora 41 this is /usr/lib64/libpcsclite.so.1
pcsc-driver {your_path_to_libpcsclite}
card-timeout 5
disable-ccid
pcsc-shared
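
For the new settings to take effect, restart scdaemon (it respawns on demand):

# as your normal user
gpgconf --kill scdaemon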

If you want to go further down this rabbit hole and potentially do something different, I originally found this information here.

Exceptions

This will be a short one - Please Do Not:

try {
  functionWithNoDeclaredThrow();
} catch (final Exception e) {
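  // everything useful about e (type, message, stack trace) is discarded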
  logger.error("Error");
}

In any language... if you could be so kind.

Languages

For much longer than a decade I have limited the programming languages in which I would even attempt to solve problems to a small set in which I felt comfortable, or perhaps even safe. Initially, that was almost exclusively limited to Python. Several years ago the list doubled to include Rust. There were advantages to this in the workplace. Few were as knowledgeable of the latest and greatest Python features, and even fewer knew anything at all about Rust. Naturally, there were also disadvantages in the workplace. No one would accept the risk of having one Rust developer, for example.

My own inability to know something is wrong, broken, or otherwise in need of attention led me to wonder if I had erred in my outright avoidance of Java, or my fear of C and C++. Somewhere around the end of May, I decided I would face C++ head on and learn C++17. Turns out I had little to fear beyond the numerous nonsensical ways in which the language seems to actively try to blow your leg off. Out of necessity, I ended up making production commits in Java as well. Through all of this, I followed the same methods I do in Python and Rust when I don't know the answer: I Read The Documentation.

If there is any point hidden in here it is probably that the languages ultimately do not matter that much. If you have a design and detail oriented mind, and there is a manual to be read, you more than likely can figure out some way forward. It may not be perfect, but that's why reviews exist.

SSH at Scale with OpenSSH Certificates - SmartCard Backed Keys

Many weeks ago, when I prematurely declared a final note on the SSH at Scale topic, I was left wondering if I could go further and loop my GPG Smartcard into the mix. This post details what I found hiding just below the surface of OpenSSH and OpenSC to let one maintain all necessary secrets on a SmartCard and still use OpenSSH Certificates for login.

First find your pkcs11 module (on Linux)

$ PKCS11_MODULE=$(pkcs11-tool | sed -n 's|.*(default:\(.*\))|\1|p;')
$ echo "${PKCS11_MODULE}"
/usr/lib/x86_64-linux-gnu/opensc-pkcs11.so

Using the PKCS15 interface, export the public parts of associated keys for signing and authentication. In the case of my Librem Key, the relevant IDs are 01 and 03:

Signature Pubkey:

$ pkcs15-tool --read-ssh-key 01 -o sig.pub
Using reader with a card: Purism, SPC Librem Key (000000000000000000009BB4) 00 00

$ cat sig.pub
ssh-rsa AAAAB3N...DVUhQ== Signature key

Authentication Pubkey:

$ pkcs15-tool --read-ssh-key 03 -o auth.pub
Using reader with a card: Purism, SPC Librem Key (000000000000000000009BB4) 00 00

$ cat auth.pub 
ssh-rsa AAAAB3N...fuzQ== Authentication key

Now that both public keys are on disk, one can utilize the corresponding SmartCard-backed private keys in a normal OpenSSH Certificate flow. The first, and least intuitive, difference is in the call to ssh-keygen. When the private key is on disk one passes the path after -s and this key is used as the Certificate Authority. In the new case, the path to the public key is given after -s and the path to the pkcs11 module is given after -D. An example follows, signing the exported auth.pub to create a certificate:

$ ssh-keygen -D ${PKCS11_MODULE} -s sig.pub -n ${USER} -I EXAMPLE_CERT -z $(date +%s) -V +15m auth.pub
Enter PIN for 'OpenPGP card (User PIN (sig))': 
Signed user key auth-cert.pub: id "EXAMPLE_CERT" serial 1662515192 for amartinez valid from 2022-09-06T20:45:00 to 2022-09-06T21:01:32

$ ssh-keygen -Lf auth-cert.pub 
auth-cert.pub:
        Type: ssh-rsa-cert-v01@openssh.com user certificate
        Public key: RSA-CERT SHA256:V2KMVlJjPOn86z6a2srEcnMQj78OujEXJ597PJ6+wyY
        Signing CA: RSA SHA256:HoXa4G9gmsln+8gOUPEeKNmcCA0cppiUlmUuEjt8joA (using rsa-sha2-512)
        Key ID: "EXAMPLE_CERT"
        Serial: 1662515192
        Valid: from 2022-09-06T20:45:00 to 2022-09-06T21:01:32
        Principals: 
                amartinez
        Critical Options: (none)
        Extensions: 
                permit-X11-forwarding
                permit-agent-forwarding
                permit-port-forwarding
                permit-pty
                permit-user-rc

Using the generated cert with a private key on the SmartCard requires that the target host is configured to trust the key used as a CA above. An earlier post details how such a configuration could be accomplished. Assuming your target host is properly configured, all that is left is telling ssh where to find the pkcs11 module with -I, and which certificate to utilize with -o CertificateFile=.... Another example follows doing just that:

$ ssh -I ${PKCS11_MODULE} -o CertificateFile=auth-cert.pub beaglebone
Enter PIN for 'OpenPGP card (User PIN)': 

Last login: Wed Sep  7 00:25:47 2022 from ...
amartinez@beaglebone:~$ sudo grep -i cert /var/log/auth.log | tail -3 | grep -v grep
Sep  7 01:52:21 beaglebone sshd[1344]: Accepted publickey for amartinez from ... port 40728 ssh2: RSA-CERT SHA256:V2KMVlJjPOn86z6a2srEcnMQj78OujEXJ597PJ6+wyY ID EXAMPLE_CERT (serial 1662515192) CA RSA SHA256:HoXa4G9gmsln+8gOUPEeKNmcCA0cppiUlmUuEjt8joA

The scenario used in these examples was exceedingly simple; in production one should use an entirely different smartcard (or another pkcs11 device, like a TPM) to act as the CA. If, like me, you tend to use your SmartCard with gpg rather than pcscd, it is worth noting that the certificate works just fine without the -I option to ssh, provided the associated key is available through gpg-agent serving as your ssh-agent. Signing, as far as I can tell, does require first stopping the local user's gpg-agent and then starting pcscd globally to use pkcs11. As a final note, the above works exactly the same way in Windows, with the exception of the format of the paths given.
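
For reference, a minimal sketch of letting gpg-agent serve as your ssh-agent, assuming enable-ssh-support is set in ${HOME}/.gnupg/gpg-agent.conf:

# point ssh at gpg-agent's ssh-agent socket for this shell
export SSH_AUTH_SOCK="$(gpgconf --list-dirs agent-ssh-socket)"
gpg-connect-agent updatestartuptty /bye >/dev/null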

Cross-Compiling with Debian Multiarch

About a year ago, I created a project to keep notes on how I cross-compiled one of my Rust crates for some legacy systems on both ARMv7 and i686. At the time it was sufficient to statically link the entire C-runtime into these binaries. I left it at that and continued along my merry way without a care in the world for cross-compilation of any higher complexity.

Fast-forward to a few weeks ago, and I again had need for cross-compilation of a Rust binary at work targeting ARMv7 with a dependency on the target system's libpcap. Time and time again my attempts, based on my original experiences with cross-compiling Rust for ARMv7, were met with frustrating failure. The linker could not find -lpcap.

Searching the vast series of tubes brought no direct joy, as most guides were missing critical steps, but a distant memory reminded me of dpkg --add-architecture. This was the magic I sought, and if you seek to do similar work the finer points follow:

  1. If you have dependencies on libraries that differ between architectures, as many do, then a Debian-based system may help you get the headers your cross-build toolchain needs
  2. Use dpkg --add-architecture <TARGET_ARCH> to add your target architecture
  3. Run apt update to get the list of available packages for that target.
  4. Once you know which libraries you depend on, make note of the specific version present in your release. libpcap-dev itself has no armhf installation candidate, but apt search libpcap-dev will show you that libpcap0.8-dev (in Debian Stretch) is the package you really want and it does have an installation candidate.
  5. Continuing the libpcap-dev example, run apt -y install libpcap0.8-dev:armhf to install development headers specific to the armhf architecture.

Doing this has saved quite a bit of time as I can now compile on a much more powerful system than the embedded industrial PC that is my actual target (or the Beaglebone Black that has the same processor).
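
Putting the finer points together, a minimal sketch of the whole flow on a Debian host. The package and target names below assume an armhf target, the gcc-arm-linux-gnueabihf cross toolchain, and a rustup-managed toolchain:

# as root: add the target architecture and its libpcap headers
dpkg --add-architecture armhf
apt update
apt -y install gcc-arm-linux-gnueabihf libpcap0.8-dev:armhf

# as the build user: add the Rust target and point cargo at the cross linker
rustup target add armv7-unknown-linux-gnueabihf
cat >> .cargo/config <<'EOF'
[target.armv7-unknown-linux-gnueabihf]
linker = "arm-linux-gnueabihf-gcc"
EOF
cargo build --release --target armv7-unknown-linux-gnueabihf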

SSH at Scale - Revisited

The final note in my series on secure operation of SSH at scale will be brief:

Make sure to pay attention to MaxStartups.

Setting this too high will likely cause major performance issues as the CPUs on your servers peg, and stay pegged. Setting it too low will negatively impact the systems trying to connect to your server. The setting itself controls how many connections can be in a "startup" state - that is, prior to having completed authentication. Be sure to consider all expected use that sshd may answer, including client probes to verify the server is up. If these are driving a need to increase MaxStartups, try running a separate service specifically to handle these probes. Deconflict ports as needed.
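
For reference, MaxStartups accepts either a single count or the start:rate:full form for random early drop. A snippet matching the stock OpenSSH default:

# /etc/ssh/sshd_config
# begin randomly dropping unauthenticated connections once 10 are pending,
# with the drop probability rising from 30% toward 100% at 100 pending
MaxStartups 10:30:100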

PureBoot's Missing Feature

TL;DR

Purism's PureBoot offering on their products misses a critical feature:

Inspection and validation of the EC firmware.

Purism claims this is not a vulnerability, but a feature not yet implemented. According to Purism's CSO, it is on a roadmap, but no dates could be shared.

Read the full report below.

As Reported to Purism

The Full Report
Lack of EC validation risks undetected hardware backdoor in the Purism EC

Summary
--------

The discussion around this vulnerability started when Anthony J. Martinez
forked the Librem EC project and created a branch to add numpad
functionality. Upon successfully building, flashing, and using the newly
crafted EC firmware both Anthony and John Marrett were concerned by the lack
of notification or warning from PureBoot.

The integration of the keyboard firmware into the EC code seemed a promising
avenue for attacks. The keyboard firmware being based on QMK made it easy to
explore the code base and look at prior art.

Considering the possibilities, we identified two promising avenues of attack:

 - Command injection where a common keypress is used to launch an attack
 - A keylogging attack

Command Injection
--------

The QMKhuehuebr [1] project on GitHub implements an attack where, when the
user presses Super+L (the default lock key on PureOS), a sequence of commands is
executed before the lock.

This attack can be used to fetch and launch a shell script or binary from a
web source, using it to gain persistent access to the user account.

Keylogging
--------

Making use of the Dynamic Macro [2] feature of QMK an attacker could program
the firmware to record the first characters typed after boot. This would
include disk and OS passphrases entered during boot up. This information
could easily be persisted in RAM as long as the PC was powered on, possibly
including standby. More concerted development could allow the attacker to
store the recorded keystrokes in the EC firmware space, though this would be
more complex to implement.

The recorded keystrokes could be recovered by the attacker through physical
access to the device, or transmitted to a remote server by combining the
keylogging attack with the command injection attack described above.

Protection
--------

It's not clear to us how this attack can be prevented. The best method would
be for the EC firmware to be validated before it is loaded, or else validated
by PureBoot; however, we are not certain that this can be done. Our
understanding is that the EC handles boot processes prior to these systems
being initialized.

There are a few important questions in this space as well:

Can this attack be stopped if the user makes use of the Librem [3] key? We can
still flash the EC, though it would require disassembling the laptop to
access the EC flash chip as discussed in this blog post [4] on EC booting
issues.

Based on Purism response: The Librem key will not prevent EC access and
flashing.

Can you prevent booting from USB on the Librem 14? Does the Librem default,
when purchased, to a configuration requiring a password to boot from USB?

Based on Purism response: Will not be implemented. Restricts user freedom.

It will be possible to disable EC writing using DIP switches [5], which
would protect against a remote attacker compromising the firmware. We
question the effectiveness of this as a means of protection, as this
class of vulnerability is best suited to an evil maid type of attack.

Based on Purism response: Being implemented, with no clear timeline.

Proof of Concept
--------

We have not developed a proof of concept for this attack, but we may do so
at a later point in time.

Vendor Response
--------

Purism is working on enabling the DIP switches to lock EC writing and suggests
the use of glitter nail polish [6] to make tampering evident. They state that
this issue is common to the entire industry and not specific to their product
line. They also prioritize user freedom over solutions that would lock the
user out of the EC.

Purism has declined to request a CVE to identify this vulnerability, but does
acknowledge that an actual vulnerability exists.

Conclusion
--------

From our perspective:

 - Modification of the EC will allow an attacker to subvert the mechanisms
   that follow it, including PureBoot functionality
 - Requiring a password to boot from USB would increase the difficulty and
   evidence of this attack at low cost
 - The use of the DIP switches may be the only technically feasible solution
   that is possible at this time
 - Respecting user freedom is extremely important
 - Cryptographic validation of the EC, ideally allowing the user to sign their
   own firmware, is the only completely effective solution to confirm that the
   security of the device has not been compromised

Timeline
--------

2021-12-05  Initial communication of vulnerability to Purism security contact
2021-12-06  Initial response from Purism team, discussion follows, all delays
            on researcher side
2021-12-20  Final response from Purism team
2021-12-27  Revision of vulnerability write up based on discussions

References
--------

[1] https://github.com/mthbernardes/QMKhuehuebr/
[2] https://docs.qmk.fm/#/feature_dynamic_macros
[3] https://puri.sm/products/librem-key/
[4] https://puri.sm/posts/wrangling-the-ec-adventures-in-power-sequencing/
[5] https://forums.puri.sm/t/ec-protection-librem-14/15050
[6] https://puri.sm/posts/anti-interdiction-update-six-month-retrospective/

Personal Note

It is no secret that I have been a vocal supporter of Purism and their efforts to bring Open and Secure systems to the masses. This often comes at a price premium which is understood to actively support the high quality development of libre software and systems. Omitting, in design, a check so critical while simultaneously posting with great frequency about the extreme level of security offered does not sit well with me. Attempts to contribute resolutions to other identified gaps have gone nowhere beyond making this vulnerability clear. Then there is the Librem 5 debacle, which further erodes confidence in the company itself. As my hardware already in hand works, I will not turn it into eWaste. I just cannot, in good faith, add to it or suggest it to anyone else.

SSH At Scale With OpenSSH Certificates - Final points

This is the third post in a series on using OpenSSH Certificates to secure access to large numbers of similar devices. The first post can be found here, and the second can be found here.

The practical example left the means of scaling to the imagination, but there is one thing that is not obvious without looking at the source code of ssh-keygen itself:

A certificate may not have more than 255 principals

To use certificates at scale, where we assume fleets on the order of 2^16 or more devices, one needs to split a target queue into chunks of no more than 255 devices. Once split up into these chunks, it is rather simple to fetch certificates allowing access to devices from each chunk and map those to their corresponding target lists. Simple set and dictionary objects can handle this well within Python, for example, and can be used to feed a worker pool. The AsyncSSH library in Python supports certificates exceptionally well, and would make a good basis from which one could build both an SSH CA and client tooling capable of highly concurrent and secure access to a very large number of similar targets.
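
The chunking itself needs nothing fancy. A minimal shell sketch, assuming a flat targets.txt of hostnames, a CA key at ca_key, and a client public key at client_key.pub (all hypothetical names):

# split the target list into files of at most 255 hosts each
split -l 255 targets.txt chunk_

# mint one certificate per chunk, with the chunk members as the principals
for chunk in chunk_*; do
    principals=$(paste -sd, "${chunk}")
    ssh-keygen -s ca_key -I batch-access -z $(date +%s) -V +15m \
        -n "${principals}" client_key.pub
    mv client_key-cert.pub "${chunk}-cert.pub"
done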

It is worth noting that when using certificates and eliminating password usage, an emergency hatch is necessary. My example used a single CA key, and the loss (or compromise) of that key would be Very Bad News™. One should not depend on a single point of failure, so consider a rotational scheme where your devices know up front about a possible set of keys. Perhaps, if you are in an embedded environment, your base image contains one common CA that will always be available in an emergency but is hopefully never needed in production. Such a key should ideally be stored away in an airgapped HSM with strict, and audited, access policies governing its use. Another set of CA keys may be defined during device provisioning, and could correspond to keys available from a certificate service available over some network connection.

Finally, when using certificates in the ways discussed in this series, client keys can be ephemeral. The certificate authority grants the powers needed to access the systems that trust it to any public key it signs. If this is combined securely with an external auth provider trusted by the CA, then any client tooling created can utilize per-job key material that is itself never exported to disk. When validity periods are kept to a minimum, this greatly reduces the potential for abuse and narrows the window of opportunity for attacks.

SSH At Scale With OpenSSH Certificates - Practical Example

As promised in my last post, here is an example setup for how one might use machine-specific data to shape SSH access using OpenSSH Certificates. To avoid too much irritation on my local system, I created a simple test container using podman and the following Dockerfile:

# syntax=docker/dockerfile:1
FROM docker.io/alpine:latest 

RUN apk --no-cache add openssh-server bash && \
    adduser -s /bin/bash -D -u 1001 demo && \
    echo "demo:$(dd if=/dev/urandom bs=1 count=32 2>/dev/null | base64)" | chpasswd -c sha512 && \
    mkdir -p /opt/ssh/config

WORKDIR /opt/ssh

CMD ["/bin/bash", "/opt/ssh/config/run"]

The content of /opt/ssh/config/run is as follows:

#!/bin/bash

set -e

HOST_KEY="/opt/ssh/config/ssh_host_ecdsa_key"
CONF="/opt/ssh/config/sshd_config"

if [ ! -e "${HOST_KEY}" ]; then
   ssh-keygen -t ecdsa -b 256 -N '' -q -f "${HOST_KEY}"
fi

/usr/sbin/sshd -D -e -h "${HOST_KEY}" -f "${CONF}"

The reference sshd_config is:

# Setting some core values that are helpful for use in a
# system using Certificates.

HostKey /opt/ssh/config/ssh_host_ecdsa_key
HostCertificate /opt/ssh/config/ssh_host_ecdsa_key-cert.pub

LoginGraceTime 10s
PermitRootLogin no
StrictModes yes
MaxAuthTries 3
MaxSessions 10

PasswordAuthentication no
PubkeyAuthentication yes

AuthorizedKeysFile	 none

TrustedUserCAKeys /opt/ssh/config/ssh_ca_keys
AuthorizedPrincipalsCommand /opt/ssh/config/auth_principals %u
AuthorizedPrincipalsCommandUser nobody


# override default of no subsystems
Subsystem	sftp	/usr/lib/ssh/sftp-server

The magic happens in auth_principals:

#!/bin/bash

set -e

case "${1}" in
    "demo")
	echo "${HOSTNAME}-demo"
	;;
esac

While auth_principals is a fairly trivial Bash example, the key points are that:

  1. AuthorizedPrincipalsCommand needs to return a string matching one of the principals encoded on the presented certificate when a user tries to login with a given username

  2. This can call upon anything the machine knows about itself and can programmatically access. The use of HOSTNAME is just an example. As an administrator you can do as you like. Be creative! (One variation is sketched below.)
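
As one hypothetical variation, principals could be derived from /etc/machine-id rather than HOSTNAME:

#!/bin/bash

set -e

# derive the principal from the machine-id instead of the hostname
machine_id="$(cat /etc/machine-id)"

case "${1}" in
    "demo")
	echo "${machine_id}-demo"
	;;
esac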

Running the Example

After building the test container, fire it up:

$ podman run --rm -it --hostname=$(openssl rand -hex 8) -p 9022:22 -v ./config:/opt/ssh/config:Z ssh-cert-example
Server listening on 0.0.0.0 port 22.
Server listening on :: port 22.

...

Note the volume mount of a config directory which itself contains:

  1. The scripts, and configs, shown above: sshd_config, run, and auth_principals
  2. The HostKey and HostCertificate
  3. A hello script that I will set as the ForceCommand on a sample user certificate
  4. An ssh_ca_keys file referenced in sshd_config with the public key associated with the SSH key used as a CA signing key.

Given that I created the container with a random HOSTNAME value, the container needs a little inspecting before I can proceed:

$ podman inspect strange_heisenberg | jq '.[0].Config.Hostname'
"cac0e8fd0d329d7e"

Minting a Certificate

From the information above we know that a user, demo, can access the system by presenting an OpenSSH Certificate with a principal matching ${HOSTNAME}-demo. Given that the HOSTNAME variable expands to cac0e8fd0d329d7e, let us sign a certificate accordingly:

$ ssh-keygen \
	-I demo@ajmartinez.com \
	-V $(date +%Y%m%d%H%M%S):$(date --date="+15 minutes" +%Y%m%d%H%M%S) \
	-z $(date +%s) \
	-n cac0e8fd0d329d7e-demo \
	-O force-command=/opt/ssh/config/hello \
	-s ../ssh_ca \
	demo.pub
Signed user key demo-cert.pub: id "demo@ajmartinez.com" serial 1642122125 for cac0e8fd0d329d7e-demo valid from 2022-01-13T19:02:05 to 2022-01-13T19:17:05

Checking the contents:

$ ssh-keygen -Lf demo-cert.pub 
demo-cert.pub:
        Type: ecdsa-sha2-nistp256-cert-v01@openssh.com user certificate
        Public key: ECDSA-CERT SHA256:qfxed1FR8kXtXMXBWTjEjwPLjBKWz0nKbthaFGGVO/E
        Signing CA: ECDSA SHA256:Tb4fK9xMEtZRnxHlXsvXaPoPj1A8vtxNXvWkb1Wpju8 (using ecdsa-sha2-nistp384)
        Key ID: "demo@ajmartinez.com"
        Serial: 1642122125
        Valid: from 2022-01-13T19:02:05 to 2022-01-13T19:17:05
        Principals: 
                cac0e8fd0d329d7e-demo
        Critical Options: 
                force-command /opt/ssh/config/hello
        Extensions: 
                permit-X11-forwarding
                permit-agent-forwarding
                permit-port-forwarding
                permit-pty
                permit-user-rc

Logging In

The easiest part of all, logging in as a client:

$ ssh -i demo -o CertificateFile=demo-cert.pub -p 9022 demo@localhost
Welcome to cac0e8fd0d329d7e! OpenSSH Certificates are cool huh?
Shared connection to localhost closed.

What trickery is this? No prompt to accept a random key fingerprint into my known_hosts?? Surely you jest!? No, I just added the ssh_ca.pub to ~/.ssh/known_hosts as a @cert-authority entry:

@cert-authority [localhost]:9022,[::1]:9022 ecdsa-sha2-nistp384 AAAAE2VjZHNhLXNoYTItbmlzdHAzODQAAAAIbmlzdHAzODQAAABhBMGDesyChnteRlL3/fkcFUQk+qDuL5dnbFPeT8oejuaDOv4UT3yLU/2bXJZlEjbknztORXuy3ViqCBQskqPkfPglyv0Uqpn4VhRbh9j1fK6MzcPg50OWDw1hioCohazx7w==

Checking Server Access

In the world of distributed SSH access without certificate use, and with an industry worst-practice of shared accounts with shared credentials, no one ever has any clue who logged in as demo. Maybe it was someone authorized to do so. Maybe it was someone who left an organization a decade ago.

The output from my test container, for each access, looks like this:

Accepted publickey for demo from 10.0.2.100 port 53930 ssh2: ECDSA-CERT SHA256:qfxed1FR8kXtXMXBWTjEjwPLjBKWz0nKbthaFGGVO/E ID demo@ajmartinez.com (serial 1642122125) CA ECDSA SHA256:Tb4fK9xMEtZRnxHlXsvXaPoPj1A8vtxNXvWkb1Wpju8
Received disconnect from 10.0.2.100 port 53930:11: disconnected by user
Disconnected from user demo 10.0.2.100 port 53930
Accepted publickey for demo from 10.0.2.100 port 53934 ssh2: ECDSA-CERT SHA256:qfxed1FR8kXtXMXBWTjEjwPLjBKWz0nKbthaFGGVO/E ID demo@ajmartinez.com (serial 1642122125) CA ECDSA SHA256:Tb4fK9xMEtZRnxHlXsvXaPoPj1A8vtxNXvWkb1Wpju8
Received disconnect from 10.0.2.100 port 53934:11: disconnected by user
Disconnected from user demo 10.0.2.100 port 53934

And after waiting for my 15-minute validity period to expire:

Certificate invalid: expired
maximum authentication attempts exceeded for demo from 10.0.2.100 port 53940 ssh2 [preauth]
Disconnecting authenticating user demo 10.0.2.100 port 53940: Too many authentication failures [preauth]

Conclusion

The building blocks for OpenSSH certificate use are simple and accessible to admins of all skill levels. Substantial benefits exist over the use of LDAP, authorized_keys, or shared credentials:

  1. Certificate auth is lightning fast
  2. One need only maintain TrustedUserCAKeys on servers
  3. One need only maintain @cert-authority entries, which can be scoped to hostnames, IPs, etc., on client systems
  4. Certificates are portable. If you have a system in an airgapped bunker, one can mint a certificate with a limited validity period attached to an ephemeral private key that will allow access to the system to someone physically present. Try that with LDAP.
  5. It is clear who accessed what and when.

While the examples given were simple and manually executed on a Bash shell, there are a number of ways one could build a highly-available (and secure) web service CA. Python and Rust both have appropriate libraries, and I am certain other languages do as well. With a little imagination, and a lot of attention to detail, you too can have secure SSH access that is both easy to deploy and easy to maintain.

There are a few limitations to be aware of, and I will cover these in another post.

SSH At Scale with OpenSSH Certificates

The Issue

Service and maintenance of widely deployed Linux based systems can be a challenging task. This often requires distributed global support personnel with varying levels of system access. A means of auditing when and where that access is used should be a strict requirement. In fleets of IoT devices where base configurations are common, but resources are limited, one might find a need to balance system simplicity and complex access models for support teams.

An Ideal Solution with OpenSSH Certificates

OpenSSH Certificates provide a means of gating access to Linux systems with extremely minimal overhead on either the client or server, and are supported in nearly every version of OpenSSH released in the last decade. If your systems are not severely outdated, this solution can work for you.

Configuration for use of certificates is quite simple, and requires no more than an understanding of a few parameters in sshd_config, such as TrustedUserCAKeys, AuthorizedPrincipalsCommand, and AuthorizedPrincipalsCommandUser.

Example Flow
Example IdentityToken
{
	"user_id": "someone@example.org",
	"principals": ["device_id1", "device_id2", "device_idN"],
	"nbf": 1641016800,
	"exp": 1641017100
}
Example Certificate

A number of ways exist by which one might mint an OpenSSH Certificate, ssh-keygen included.

Assuming a CA is running some process that accepts JWTs, validates the signing JWK, and verifies claim fields against some input validation defined by organizational needs, the creation of a certificate for the IdentityToken shown above might look like:

ssh-keygen \
	-I someone@example.org \
	-s ${CA_KEY_PATH} \
	-n device_id1,device_id2,device_idN \
	-z 12345678 \
	-V $(date --date=@1641016800 +%Y%m%d%H%M%S):$(date --date=@1641017100 +%Y%m%d%H%M%S) \
	user_provided_pubkey.pub
Abstracted use case

The resulting user_provided_pubkey-cert.pub from the example above can then be returned to the user, who may use the certificate to access systems where the signing CA is trusted and the certificate's principals are accepted for the requested login.

When such access occurs, the authorization logs will show the certificate's key ID, serial, and signing CA alongside the login.

Conclusion

OpenSSH versions from any non-deprecated distribution have supported certificate login for several years. A simple, and robust, solution exists for accessing distributed systems at scale. With some creativity, and a toolbox of open standards, one can provide secure and auditable access to systems over SSH. In a later post, I will share samples showing how one might configure clients and servers for OpenSSH Certificate use.

Fun with Emacs

Somewhere around a year ago I bailed on vi(m) as my primary editor. I did this after two decades of faithfully carrying the vi(m) torch in the holy flame wars of "no myyyy editor is better." For the first few weeks of GNU/Emacs use, I tried to use Emacs natively with its own keybindings. This was, to an old vi(m) user, maddening. I quickly found myself using EVIL mode, and proclaiming that Emacs is actually the best version of vi(m) in existence (and yes, I tried NeoVim and all the rest).

In a given week, I frequently find myself connected to remote hosts on which Emacs is not installed and this has left me with plenty of time to use my old friends from the vi family. With some twenty years of familiarity it is not like any of the keybindings or wizard-like motions have fled my mind. At some point, quite probably because it is often difficult to reason about which mode one might be in when accessing a system over a remote link best described as "glacial", I started to get pretty tired of smacking Esc all the time. I confess to frequently abusing sed -i when I know exactly what I want to change and where.

There is an easier way that I keep forgetting even exists: TRAMP.

This very post was written using TRAMP to:

  1. Connect to my server as my normal user
  2. Change to the user under which my site's process runs
  3. Create and edit the markdown from which the post is rendered
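
In TRAMP terms, those three steps collapse into opening a single multi-hop path. A sketch with hypothetical host, user, and file names:

# C-x C-f from within Emacs, or from a shell:
emacs "/ssh:amartinez@webhost|sudo:siteuser@webhost:/var/www/site/posts/draft.md"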

Some of the more advanced extensions I run in Emacs for use as a rather powerful Python and Rust IDE appear to conflict with TRAMP, but running in a minimal config is trivial and functional, so I may not even mess with figuring out exactly where the failure is induced. Over the next few weeks I may experiment with how this works in my day-to-day job, where I must frequently access remote hosts just to edit a text file or two. Doing it all from Emacs has some appeal.

Even with TRAMP, I still find myself annoyed with changing modes to do a number of things. EVIL was removed from my init.el and several modes are now less hampered by binding conflicts. While it's only been a day, I am enjoying it rather a lot. Adding a little enjoyment back to computing is worth it.

Life Below Gig

For the last two years or so, I have lived in The Netherlands and enjoyed gigabit downstream at home. Unfortunate family circumstances have me back home in Texas, and the step backwards, having lost some 80% of my downstream, is remarkable. Almost more alarming is the huge loss of upstream (6 Mbps vs 40 Mbps).

It's clear that when I return I will have to pay whatever extortion Xfinity requires as they're the monopoly power in my zip code. It's also clear that A Rust Site Engine likely needs some new features like pagination to make it easier to consume this site if you don't have super fast internet.

Basic Tails Setup

The following mini-guide will take you down the path to a basic Tails install with one important extra feature: support for offline USB HSM use.

Requirements

Getting Tails

Here you have two choices, which are well described here, but boil down to:

HSM Support

Once you have a base Tails install the rest is quite simple.

  1. Boot your new Tails USB
  2. Connect to Tor
  3. Hit Super and start typing "Configure persistent volume"
  4. Create your passphrase to encrypt the persistent storage volume
  5. Click the Create button
  6. When the feature list appears, enable "Additional Software"
  7. Reboot
  8. Unlock your persistent storage in the Welcome Screen
  9. Under "Additional Settings" on the Welcome Screen expand the options and choose "Administration Password"
  10. Connect to Tor
  11. Open a terminal and run sudo apt update && sudo apt --yes install opensc libengine-pkcs11-openssl
  12. Tails will update and ask if you want to persist this Additional Software. Tell it yes, you want the additional software available every time you unlock your Persistent Storage

At this point, if you reboot and unlock your persistent storage, your Tails system will be able to use any USB HSM supported by OpenSC. Installation of software from the persistent storage does not require an administration password, and for added security it is probably best to avoid setting one unless your workflow requires administrative rights for some reason. After your software finishes installing from persistent storage you are ready to use your HSM directly with tools like OpenSSL and OpenSSH:

Signing Example

# Here 20 is the key ID of a signing key on a Nitrokey HSM 2
amnesia@amnesia:~$ openssl dgst -engine pkcs11 -keyform e -sign 20 -out special.sig special.img
engine "pkcs11" set
Enter PKCS#11 token PIN for UserPIN (MY-MAGIC-KEY):

# And now to verify the resulting signature
amnesia@amnesia:~$ openssl dgst -engine pkcs11 -keyform e -verify 20 -signature special.sig special.img
engine "pkcs11" set
Verified OK

Use the latest Rust

Given the risks associated with CVE-2021-29922, anyone using A Rust Site Engine should make sure to build with Rust 1.53 or later. Generally, ARSE should only be used behind a reverse proxy that mitigates the risks, but safety is a short rustup update && cargo install arse away.

TLS Implementation Failures

By now we have all attempted to access a website in any modern browser and found ourselves reading a warning that proceeding is dangerous. These tend to pop up when one encounters self-signed certificates, which themselves are not inherently evil, rather than certificates issued by one of the many globally trusted root certificate authorities. Failures in TLS implementation are not necessarily due to the use of self-signed certificates, but could rest in a failure to add the signing certificate to the appropriate trust store after having verified the signer is who they say they are.

Everyone verifies certificates, right? Failing to do so extinguishes any real benefit of transport layer security, and exposes an extraordinarily large attack surface in the multitude of RESTful APIs and chat services that make the world of IoT tick. If, for whatever reason, your service does not mandate client certificates, how safe can you be if you are not certain your clients are checking certificates? Since it requires more work to ignore certificate checking (examples below), surely no one is going the extra mile to do it wrong...

Unfortunately, ignoring certificate checks is fairly normal in some circles (looking at you, IoT) and if you want to know if a device on your network is guilty the process for finding out is trivial. This, of course, also means that a malicious attack is just as easy. So is preventing such attacks: always check certificates.

Are you curious if the brand new IoT widget you just received is Doing It Right™? By now we know every one of these devices is constantly phoning home to the mothership about your every move, but how can you check if this is done securely? Glad you asked!

If a picture is worth a thousand words...

No time to watch an ASCII Cast?

  1. bettercap to gather information on network hosts, and ARP spoof
  2. sslsplit to forge TLS certs on the fly
  3. An iptables pre-routing NAT rule to direct TLS traffic through sslsplit
  4. tshark to inspect the raw traffic, and anything intercepted by sslsplit
  5. Five minutes of your time

Final Thoughts

If the answer to "are you verifying certificates?" is no, then you are doing it wrong and putting both sides of your communications at risk. If you are a developer, and you do not know if you are checking certificates go take a look at your libraries and find out which extra options you need to use to disable checking. Search your source for these options. If you find them, file a bug and fix it. Immediately!
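
As a starting point, a few of the well-known opt-outs worth grepping a codebase for (a non-exhaustive sample):

# each of these, when enabled, disables certificate verification
grep -rn "verify=False" .            # Python requests
grep -rn "InsecureSkipVerify" .      # Go crypto/tls
grep -rn "CURLOPT_SSL_VERIFYPEER" .  # libcurl, look for it set to 0/false
grep -rn -e "--insecure" .           # curl in shell scripts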

SmartCards and Fedora

Attempting to use my second GPG Smartcard with Fedora presented some challenges in dealing with pcscd. The root cause is that polkit does not allow normal users access to pcsc or the smartcard itself. This can be resolved with a single rule:

In /etc/polkit-1/rules.d/42-pcsc.rules:

polkit.addRule(
  function(action, subject) {
    if ((action.id == "org.debian.pcsc-lite.access_pcsc" ||
        action.id == "org.debian.pcsc-lite.access_card") &&
        subject.isInGroup("wheel")) {
          return polkit.Result.YES;
        }
});

For the subject.isInGroup condition, I used the group wheel as I am the only member of that group on the system in question. Use your own discretion here, or use an even more specific condition to allow only one user, like subject.user == "foo".

Additional Points

While this does allow access through pkcs11 and pkcs15 tools or gpg, I have not yet found the magic potion that will allow me to use both. Whichever tool is used first holds a monopoly on the device. That said, on a modern Linux distro just using pkcs11 ought to do the trick.

Update: 2021-06-18

You can simply kill gpg-agent if you wish to use the pkcs11 interface after gpg takes a greedy lock on the device.
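
With modern GnuPG the clean way to do that, rather than pkill, is:

gpgconf --kill gpg-agent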

Encryption

Use -engine pkcs11 with openssl subcommands that support it:

openssl rsautl -engine pkcs11 -keyform e -inkey <KEY_ID> -encrypt -in <INPUT> -out <OUTPUT>

SSH

Use "pkcs11:id=%<KEY_ID>?pin-value=<PIN>" as the identity file argument for ssh either on the command line, or in an ssh_config file. You will likely wish to get the PIN value itself from somewhere so it's not just in plaintext in your history:

ssh -i "pkcs11:id=%03?pin-value=123456" user@host

Or in an ssh_config file:

Host host
  IdentityFile "pkcs11:id=%03?pin-value=123456"
  User user

Adding SSH Agent Support to Split GPG

Split GPG is a very cool feature of Qubes OS but it leaves out one critical feature: enabling SSH support so the GPG backend qube can make use of an authentication subkey. There are a few different ways to solve this, and this guide provided some of the inspiration for what follows.

The Landscape

Here are the requirements for what follows:

Qubes RPC Policy

The first step is to configure an appropriate Qubes RPC Policy. A basic, and generally sane option, is to use a default configuration that asks the user to approve all requests and allows any qube to target any other qube with such a request. In my own configuration there are explicit allow rules for specific qubes where I use SSH frequently for admin purposes.

In dom0 create /etc/qubes-rpc/policy/qubes.SshAgent:

admin personal-gpg allow
@anyvm @anyvm ask

Actions in the Split GPG VM

The following actions all take place in the qube configured to act as the GPG backend for a Split GPG configuration.

Enable SSH support for gpg-agent:

$ echo "enable-ssh-support" >> /home/user/.gnupg/gpg-agent.conf

Update .bash_profile to use the gpg-agent socket as SSH_AUTH_SOCK by appending:

unset SSH_AUTH_SOCK
if [ "${gnupg_SSH_AUTH_SOCK_by:-0}" -ne $$ ]; then
	export SSH_AUTH_SOCK="$(gpgconf --list-dirs agent-ssh-socket)"
fi
export GPG_TTY=$(tty)
gpg-connect-agent updatestartuptty /bye >/dev/null

Create /rw/config/qubes.SshAgent with the following content, and make it executable:

#!/bin/sh
# Qubes Split SSH Script

# Notification for requests
notify-send "[`qubesdb-read /name`] SSH Agent access from: $QREXEC_REMOTE_DOMAIN"

# SSH connection
socat - UNIX-CONNECT:$SSH_AUTH_SOCK
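
And, per the note above, mark the script executable:

sudo chmod +x /rw/config/qubes.SshAgent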

Update /rw/config/rc.local appending the following:

ln -s /rw/config/qubes.SshAgent /etc/qubes-rpc/qubes.SshAgent

Sourcing .bash_profile and /rw/config/rc.local should put the qube in a state where a GPG authentication subkey, if present, is available through ssh-agent:

Example from my system:

[user@personal-gpg ~]$ ssh-add -l
4096 SHA256:V2KMVlJjPOn86z6a2srEcnMQj78OujEXJ597PJ6+wyY (none) (RSA)

Template VM Modifications

For my tastes it made the most sense to make a systemd service available to all qubes using my f33-dev template, and then start that service from /rw/config/rc.local on the qubes where I want to use the new feature.

In the appropriate Template VM create a service similar to the following, but replace personal-gpg with the name of your Split GPG backend qube.

/etc/systemd/system/split-ssh.service:

[Unit]
Description=Qubes Split SSH
StartLimitIntervalSec=500
StartLimitBurst=5

[Service]
Type=simple
User=user
Group=user
Restart=on-failure
RestartSec=5s
WorkingDirectory=/home/user
Environment="AGENT_SOCK=/run/user/1000/SSHAgent" "AGENT_VM=personal-gpg"
ExecStart=socat "UNIX-LISTEN:${AGENT_SOCK},fork" "EXEC:qrexec-client-vm ${AGENT_VM} qubes.SshAgent"

[Install]
WantedBy=multi-user.target

Once this has been added run the following, and shut the template qube down:

sudo systemctl daemon-reload

The Client Side

In the actual SSH client qubes, there are a few actions required to complete the loop.

Append the following to .bashrc - make sure this matches the AGENT_SOCK in your systemd service:

### Split SSH Config
export SSH_AUTH_SOCK="/run/user/1000/SSHAgent"

In /rw/config/rc.local append the following to start the service:

systemctl start split-ssh

Source .bashrc and /rw/config/rc.local, and with the split GPG backend qube running, test that your key is available:

[user@admin ~]$ ssh-add -l
4096 SHA256:V2KMVlJjPOn86z6a2srEcnMQj78OujEXJ597PJ6+wyY (none) (RSA)

Since my Qubes RPC policy allows the admin qubes to reach personal-gpg without my confirmation, a system notification appears stating:

[personal-gpg] SSH Agent access from: admin

Conclusion

With a few simple steps the power of Split GPG can be extended to include SSH Agent support. As a result, network-attached qubes used for administration of remote assets no longer directly store the private key material used for authentication, and the attack surface is that much smaller. There are a few ways to get the pubkey to add to a remote ~/.ssh/authorized_keys, but the easiest is probably ssh-add -L.

Security Keys and Qubes OS

With the arrival of my second GPG Smartcard, I thought now would be a good time to go over how I use Qubes OS features along with some more general products for various signing, encryption, and authentication tasks.

The Landscape

Here are the various components at play:

The base of all but one of my qubes is Fedora.

Getting Started

The new security key needs to have its PIN set, and since my Qubes OS configuration uses a USB qube it will be necessary to give my running disposable VM access to the key itself:

In dom0, where my target vm is disp4632 and my BACKEND:DEVID is sys-usb:2-1:

$ qvm-usb attach disp4632 sys-usb:2-1

In the disposable VM run:

[user@disp4632 ~]$ gpg --card-status
Reader ...........: Purism, SPC Librem Key (000000000000000000009BB1) 00 00
Application ID ...: D276000124010303000500009BB10000
Application type .: OpenPGP
Version ..........: 3.3
Manufacturer .....: ZeitControl
Serial number ....: 00009BB1
Name of cardholder: [not set]
Language prefs ...: de
Salutation .......: 
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: rsa2048 rsa2048 rsa2048
Max. PIN lengths .: 64 64 64
PIN retry counter : 3 0 3
Signature counter : 0
KDF setting ......: off
Signature key ....: [none]
Encryption key....: [none]
Authentication key: [none]
General key info..: [none]

Change the PINs

  1. gpg --card-edit in the disposable VM
  2. admin at the gpg/card> prompt
  3. passwd at the gpg/card> prompt
  4. Select 1 and follow the prompts, where the first PIN is the default: 123456
  5. Select 3 and follow the prompts, where the first Admin PIN is the default: 12345678
  6. Select q and quit.

Initialize the Security Key

Remove the new Security Key

In dom0 run:

qvm-usb detach disp4632 sys-usb:2-1

Insert the original Security Key

In dom0 run:

# Assuming you plugged the original key into the same port
qvm-usb attach disp4632 sys-usb:2-1

Insert and mount the USB drive

In dom0 find the appropriate block device, and attach it to the disposable VM:

qvm-block list
...

qvm-block attach disp4632 sys-usb:sdb1

In the disposable VM find the attached disk (likely /dev/xvdi)

[user@disp4632 ~]$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
--- SNIP ---
xvdi    202:128  1 28.9G  0 disk

Then mount the disk:

[user@disp4632 ~]$ udisksctl mount -b /dev/xvdi
Mounted /dev/xvdi at /mnt/removable

Note that I did not sudo mount /dev/xvdi /mnt/removable as the operation does not require root, and we do not use powers we do not need, do we?!

Extract the encrypted backup from the USB drive

[user@disp4632 ~]$ cp /mnt/removable/gpg-backup/backup* .

Unmount and Remove the USB drive

[user@disp4632 ~]$ udisksctl unmount -b /dev/xvdi
Unmounted /dev/xvdi.

In dom0:

qvm-block detach disp4632 sys-usb:sdb1

Decrypt the backup

This assumes you have installed opensc and have pkcs15-tool and pkcs11 drivers.

First, find the Key ID for encryption key on your existing Security Key:

[user@disp4632 ~]$ pkcs15-tool -D
Using reader with a card: Purism, SPC Librem Key (000000000000000000009BB4) 00 00
PKCS#15 Card [OpenPGP card]:
        Version        : 0
        Serial number  : 000500009bb4
        Manufacturer ID: ZeitControl
        Language       : de
        Flags          : PRN generation, EID compliant

// SNIP

Private RSA Key [Encryption key]
        Object Flags   : [0x03], private, modifiable
        Usage          : [0x22], decrypt, unwrap
        Access Flags   : [0x1D], sensitive, alwaysSensitive, neverExtract, local
        Algo_refs      : 0
        ModLength      : 4096
        Key ref        : 1 (0x01)
        Native         : yes
        Auth ID        : 02
        ID             : 02 <-- THIS ID
        MD:guid        : ee23dccc-fc38-2dc2-3bc8-bb5f859168d4

// SNIP

Now use it to decrypt the pbkdf2 key used to encrypt the GPG backup tarball itself. This hybrid encryption scheme allows securely storing data of arbitrary size: the tarball is encrypted using pbkdf2 with a randomly generated secret, and that secret is in turn encrypted with the Security Key's encryption key.

Decrypting the pbkdf2 password file with the Security Key:

[user@disp4632 ~]$ openssl rsautl -engine pkcs11 -keyform e -decrypt -inkey 02 -in backup.key.enc -out backup.key
engine "pkcs11" set.
Enter PKCS#11 token PIN for OpenPGP card (User PIN):

Decrypting the GPG backup with the pbkdf2 password file:

[user@disp4632 ~]$ openssl enc -chacha20 -pbkdf2 -pass file:backup.key -d -in backup.tar.gz.enc -out backup.tar.gz

Extract the backup

tar xf backup.tar.gz

Verify the keyring is intact

[user@disp4632 ~]$ gpg -k
/home/user/.gnupg/pubring.kbx
-----------------------------
pub   rsa4096 2021-05-08 [C]
      FCBF31FDB34C8555027AD1AF0AD2E8529F5D85E1
uid           [ultimate] Anthony J. Martinez <@ajmartinez:txrx.staart.one>
sub   rsa4096 2021-05-08 [S]
sub   rsa4096 2021-05-08 [E]
sub   rsa4096 2021-05-08 [A]

Remove the original Security Key

In dom0:

qvm-usb detach disp4632 sys-usb:2-1

Insert the new Security Key again

In dom0:

qvm-usb attach disp4632 sys-usb:2-1

Export the signing, encryption, and authentication subkeys to the Security Key

Edit the key in expert mode:

[user@disp4632 ~]$ gpg --expert --edit-key FCBF31FDB34C8555027AD1AF0AD2E8529F5D85E1

In the gpg> prompt select each subkey and use the keytocard command.

Example, using the signing key (key 1):

gpg> key 1

sec  rsa4096/0AD2E8529F5D85E1
     created: 2021-05-08  expires: never       usage: C   
     trust: ultimate      validity: ultimate
ssb* rsa4096/A2206FDD769DBCFC <-- NOTICE THE * HERE - this key is selected
     created: 2021-05-08  expires: never       usage: S   
ssb  rsa4096/6BE6910237B3B233
     created: 2021-05-08  expires: never       usage: E   
ssb  rsa4096/FD94BDD7BED5E262
     created: 2021-05-08  expires: never       usage: A   
[ultimate] (1). Anthony J. Martinez <anthony@ajmartinez.com>
[ultimate] (2)  Anthony J. Martinez <@ajmartinez:txrx.staart.one>

gpg> keytocard
gpg> key 1 <-- this is to deselect key 1

Repeat the above for keys 2 and 3.

Verify the card status

[user@disp4632 ~]$ gpg --card-edit

Reader ...........: Purism, SPC Librem Key (000000000000000000009BB1) 00 00
Application ID ...: D276000124010303000500009BB10000
Application type .: OpenPGP
Version ..........: 3.3
Manufacturer .....: ZeitControl
Serial number ....: 00009BB1
Name of cardholder: [not set]
Language prefs ...: de
Salutation .......: 
URL of public key : [not set]
Login data .......: [not set]
Signature PIN ....: forced
Key attributes ...: rsa4096 rsa4096 rsa4096
Max. PIN lengths .: 64 64 64
PIN retry counter : 3 0 3
Signature counter : 0
KDF setting ......: off
Signature key ....: C9ED 41D4 EB62 80BB E61F  0E59 A220 6FDD 769D BCFC
      created ....: 2021-05-08 11:43:52
Encryption key....: 335D C8BC E4A6 8FFF B9B5  CBEF 6BE6 9102 37B3 B233
      created ....: 2021-05-08 11:44:54
Authentication key: D157 68B9 CCCF 4FB5 6FC2  971E FD94 BDD7 BED5 E262
      created ....: 2021-05-08 11:45:39

Cleanup

From here the new Security Key is configured, and the disposable VM is of no further use. Since disposable VMs are destroyed when the application they were created to run is stopped, the only cleanup necessary is to close the terminal to the disposable VM.

Additional Notes

On my system, I also have vault and personal-gpg qubes. These are both network-isolated and function much the same way the physical key does. The personal-gpg qube holds the very same subkeys as both Librem Keys, and through the use of Split GPG allows for a smartcard-like use of the qube from my other qubes. In a later post, I will detail how I use QubesRPC in personal-gpg to also serve as my ssh-agent for using the authentication subkey in things like my admin qube to prevent me from needing dozens of copies of my SSH private keys everywhere. The vault qube is home to the master secret key, and as such never has any data fed in to it.

The process used to decrypt data can be reversed to encrypt data as well. I will leave that as an exercise for the reader, but the short version is: instead of the decrypt option(s) for the openssl tools, use their encrypt counterparts. If you wish to generate a random secret to use with pbkdf2, the following should do the trick:

openssl rand -base64 -out secret.key 32
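
For convenience, a sketch of the encryption direction, simply mirroring the decryption commands above (same hypothetical file names and key ID 02):

# encrypt the tarball with the random secret
openssl enc -chacha20 -pbkdf2 -pass file:secret.key -in backup.tar.gz -out backup.tar.gz.enc

# then encrypt the secret itself to the Security Key's encryption key
openssl rsautl -engine pkcs11 -keyform e -encrypt -inkey 02 -in secret.key -out secret.key.enc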