(Avoid) Implementing STARTTLS

STARTTLS seems simple. It consists of a single message to switch to encryption. But as you zoom in, you start to see increasingly intricate issues that keep unfolding. It’s best to avoid it.

Sierpiński carpet. Infinite perimeter and zero area. Start with a square, split the square into 9 equal squares, remove the central square, and continue recursively.

Yet… someone needs it.

You start bargaining: You know that STARTTLS is a real-world attack target. And you know that it’s an easy target. You really want to avoid it. You try to make STARTTLS unattractive, convince your colleagues to repeat their talk about STARTTLS at FOSDEM ’24, submit a follow-up talk to FOSDEM ’25, too, and keep repeating that we should avoid it.

By the way, the Call for Participation in the “Modern Email” FOSDEM ’25 DevRoom is still open!

And yet… someone needs it.

As an open-source maintainer in the email space, you will (sooner rather than later) find yourself in a position where someone is desperately asking for help because their email provider doesn’t support implicit TLS but only STARTTLS. If these servers are operated by smaller organizations, universities, etc., you should ask them to update their infrastructure. But, sooner or later, you may end up blocked by some unreasonable entity¹. So…

How to (avoid) implementing STARTTLS?

… and still making everyone happy?

STARTTLS consists of two phases: a plaintext phase and an encrypted phase. In the plaintext phase, a network attacker may have tampered with any of the data.

Consider this trace:

S: * OK Hello, World!       // Who is greeting me here, actually?
S: * 100000000000000 exists // Are you serious?
C: A STARTTLS
S: A NO                     // Uhm... okay?
<----- TLS handshake ----->
C: B fetch 100000000000000 ...

Nothing before the TLS handshake can be trusted. And I mean nothing. Not the “OK”, not the “Hello, World!”, certainly not the “100000000000000 exists”, and also not the “NO” (which is a typical STARTTLS stripping attack). We must throw away every byte before the TLS handshake.

This raises the question, “Why should we expose our application to a potential adversary already?” and brings us to my suggestion: Let’s avoid (most of) STARTTLS. Let’s not parse anything before the TLS handshake and try to transition to TLS no matter what. Let’s do what openssl s_client -starttls imap does.

Minimal STARTTLS implementation (in Rust)

Our STARTTLS client implementation could be …

use tokio::{
    io::{AsyncBufReadExt, AsyncWriteExt, BufReader},
    net::TcpStream,
};

// Sketch: `TlsStream` and `Client` come from your TLS and IMAP libraries.
#[tokio::main]
async fn main() {
    // Start with plaintext.
    let stream = {
        let stream = TcpStream::connect("127.0.0.1:1143").await.unwrap();
        let stream = do_starttls_prefix(stream).await;

        // We now have a TcpStream that expects a TLS handshake *exactly* as on port 993.
        stream
    };

    // Proceed with TLS.
    let stream = TlsStream::connect(stream);

    // Nit: The server won't send another `Greeting` after STARTTLS.
    let mut client = Client::new_without_greeting();

    // Continue as you would with implicit TLS (993).
    //...
}

/// Bring a STARTTLS connection to the point where TLS is expected.
async fn do_starttls_prefix(stream: TcpStream) -> TcpStream {
    let reader = BufReader::new(stream);
    let mut lines = reader.lines();

    // Receive (and discard) greeting.
    // Note: Greeting is *always* a single line.
    let _ = lines.next_line().await.unwrap();

    // Send STARTTLS command.
    // `write_all` (unlike `write`) guarantees the whole command is sent.
    lines.get_mut().write_all(b"A STARTTLS\r\n").await.unwrap();

    // Receive (and discard) all lines up until we get a
    // command completion result for our "A STARTTLS" command.
    loop {
        // The inner `unwrap` panics if the server closes the connection early.
        let line = lines.next_line().await.unwrap().unwrap();
        if line.starts_with("A ") {
            break;
        }
    }

    // Unwrap `Lines` and `BufReader` to get the `TcpStream` back.
    // Any plaintext bytes still sitting in the buffer are dropped.
    lines.into_inner().into_inner()
}

While this example is specific to IMAP, you can do the same for SMTP, POP3, LDAP, etc. Think of STARTTLS as an “insecure prefix” that you barely need to overcome to make your server accept your beautiful TLS bytes. Treat STARTTLS as a guest. Greet it warmly and offer it what it needs, but keep it from taking over your space so that your routines stay intact.
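To illustrate the "same for SMTP" claim, here is a minimal synchronous sketch using only the standard library. It is not taken from any existing implementation. The function names, the EHLO domain "localhost", and the choice to send EHLO before STARTTLS (which many servers expect) are assumptions made for this example. The one SMTP-specific twist is that replies can span several lines: continuation lines have a "-" after the three-digit code, the last line has a space.

```rust
use std::io::{self, BufRead, Write};

/// Read and discard one SMTP reply, which may span several lines.
/// Continuation lines look like "250-...", the final line like "250 ...".
fn discard_smtp_reply<R: BufRead>(reader: &mut R) -> io::Result<()> {
    let mut line = Vec::new();
    loop {
        line.clear();
        // Read raw bytes so that invalid UTF-8 from the peer is not an issue.
        if reader.read_until(b'\n', &mut line)? == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "connection closed during plaintext prefix",
            ));
        }
        // Only the separator after the reply code is inspected.
        // The reply code and text are ignored on purpose.
        if line.get(3) != Some(&b'-') {
            return Ok(());
        }
    }
}

/// Bring an SMTP connection to the point where TLS is expected.
/// Hypothetical helper: nothing the server says is interpreted.
fn smtp_starttls_prefix<R: BufRead, W: Write>(
    reader: &mut R,
    writer: &mut W,
) -> io::Result<()> {
    // Greeting.
    discard_smtp_reply(reader)?;

    // Assumption: "localhost" as the EHLO domain.
    writer.write_all(b"EHLO localhost\r\n")?;
    writer.flush()?;
    discard_smtp_reply(reader)?;

    writer.write_all(b"STARTTLS\r\n")?;
    writer.flush()?;
    discard_smtp_reply(reader)?;

    Ok(())
}
```

With a real std::net::TcpStream, one way to use this is to wrap a shared reference in a BufReader for reading, pass another shared reference for writing, and drop the BufReader afterwards so that any leftover plaintext bytes are thrown away before the TLS handshake starts.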

FAQ

Q: Is the greeting indeed a single line?

Yes. None of the greeting ABNF rules …

Tree of all ABNF rules used by greeting.

… allow a newline. Further, “IMAP folklore” suggests that a Code must not contain literals. At least, all extensions I saw so far seem to be very careful about this “rule”.

Q: Is the loop required?

Yes. IMAP servers may send their * CAPABILITY ... right after the greeting. The loop discards these lines.

Q: Why new_without_greeting?

STARTTLS is defined so that after the transition to TLS, there will not be a second greeting. This is a minor difference between a session started via implicit TLS (port 993) and a session that we primed towards TLS.

Q: Is .starts_with enough?

Probably yes. The implementation could be made more solid, e.g., by using a random tag instead of A, in case the server sends a literal with a line that happens to start with "A ". But… why should it?
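If you want the random variant anyway, a small sketch could look like the following. Both function names are made up for this example, and it assumes that the standard library's randomly seeded RandomState is unpredictable enough for this purpose (it is not a cryptographic guarantee).

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Build an unpredictable, purely alphanumeric IMAP tag ("T" plus 16 hex digits).
/// Assumption: the random seed of `RandomState` is good enough here.
fn random_tag() -> String {
    let n = RandomState::new().build_hasher().finish();
    format!("T{:016x}", n)
}

/// Check whether `line` is the command completion result for `tag`,
/// i.e., whether it starts with the tag followed by a space.
fn is_completion_for(line: &str, tag: &str) -> bool {
    line.strip_prefix(tag)
        .map_or(false, |rest| rest.starts_with(' '))
}
```

The client would then send the tag followed by " STARTTLS\r\n" and break out of the loop once is_completion_for returns true. The status after the tag is still ignored.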

Q: Is good error reporting still possible?

We lose precision in our errors. When we implement STARTTLS this way, we cannot detect a server that does not advertise STARTTLS or rejects our command. We will only see a TLS error. This is the price we pay for a reduced attack surface.
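One way to soften this is to attach a hint to the TLS error. The sketch below is only an illustration: the ConnectError type, its variants, and the wording of the messages are invented for this example and do not come from any existing library.

```rust
use std::{error::Error, fmt, io};

/// Hypothetical error type for a "STARTTLS prefix, then TLS" connection attempt.
#[derive(Debug)]
enum ConnectError {
    /// I/O failure while still in the plaintext prefix.
    Prefix(io::Error),
    /// Any failure reported by the TLS library during the handshake.
    Tls(Box<dyn Error + Send + Sync>),
}

impl fmt::Display for ConnectError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConnectError::Prefix(e) => {
                write!(f, "plaintext STARTTLS prefix failed: {}", e)
            }
            // We cannot tell *why* the handshake failed, so we only hint.
            ConnectError::Tls(e) => write!(
                f,
                "TLS handshake failed: {} (hint: the server may not support \
                 STARTTLS or may have rejected the command)",
                e
            ),
        }
    }
}

impl Error for ConnectError {}
```

The hint stays deliberately vague: the client never looked at the server's answer, so it cannot claim to know the cause.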

¹ Microsoft supports IMAPS on port 993 but not SUBMISSIONS on port 465. This has an interesting historical background. Today, it’s unreasonable.
