A Blast From the Past: Executing Code in Terminal Emulators via Escape Sequences

In the beginning, there were hardware terminals. Today we use terminal emulators, also called ttys, which are programs that emulate a video terminal. On modern computers we're mostly used to graphical user interfaces (GUIs), whereas a terminal emulator like xterm is used to access the command line interfaces (CLIs) or text user interfaces (TUIs) of programs.

In this article we'll talk about how the Apache vulnerability CVE-2013-1862 can be used to mount an escape sequence attack. The most common terminals used in the old days were the following.

  • VT52: a CRT computer terminal built by Digital Equipment Corporation (DEC) in 1974, shown in the picture below. It supported 95 ASCII characters, 32 graphic characters and bi-directional scrolling [3].

    escape1

  • VT100: a terminal built by Digital Equipment Corporation (DEC) in 1978, which had additional graphical features such as blinking, bold text, underlining, etc. [2]. The VT100 is the terminal that most terminal emulators emulate nowadays. It supports ANSI escape codes, described later in the article.

    escape2

  • VT220: a terminal built by Digital Equipment Corporation (DEC) in 1983, which used a much faster microprocessor for its time and supported an 8-bit character set.

    escape3

In the table below we can see the capabilities of the most widely used terminal emulators that we can choose from today.

escape4

Terminal emulators usually support escape sequences that conform to terminal control sequence standards such as ECMA-48, ANSI X3.64 or ISO/IEC 6429. An ANSI escape code, or escape sequence, is a way of sending metadata and control information to the terminal over the same communication channel as the text, by encoding it with special characters. Since only one communication channel is available, all a program can do is send characters to the screen; there is no separate channel for anything else. Escape sequences solve this problem, because they allow us to mark certain characters as a command, while the rest of the characters are data.

Many programs emit escape sequences directly to add functionality such as changing the color of important text to red. Ideally, programs should not emit terminal escape sequences by hand, but should rather use the ncurses library or the tput and reset utilities.

Linux terminal controls invoke special functionality in the terminal by using control characters. Character encoding is primarily used to represent printable characters on the screen, but special characters that carry additional information about the text are also supported, such as the position of the text, a beep indicating that the text has been received, etc. [10].

Escape Sequences

Escape sequences are used to support different features such as formatting, changing colors, positioning the cursor, reconfiguring the keyboard, updating the window title, etc. The encoding works by using a specific sequence of bytes, which are embedded directly into the text stream, but are handled specially by the terminal. Different sets of escape sequences were supported by the hardware terminals of the old days and are supported by terminal emulators nowadays, so we have to look up whether an escape sequence is supported by a specific terminal before using it.

Today, most terminal emulators interpret at least some of the ANSI escape sequences, which can be generated by a wide range of console programs like: grep, sed, awk, cat, tail, less, more, emacs, etc.

The ASCII table is presented in the picture below, so we can easily look up the characters mentioned.

escape5

The standard ASCII control codes are the following (and are also presented on the ASCII picture above) [10]:

ASCII control codes
Acronym Name Description
NUL Null Used as a string terminator, especially in C programming language.
SOH Start of Heading First character of the message header.
STX Start of text First character of the message text.
ETX End of Text Used as a break character (Ctrl-C) to interrupt a program.
EOT End of Transmission Used to signal end of file, which is used to logout from a terminal.
ENQ Enquiry A signal used for checking whether the receiving end is still present.
ACK Acknowledge A signal used to indicate a successful receipt of the message used as a response code to ENQ.
BEL Bell Used to invoke a sound bell in terminal.
BS Backspace Delete the last character to the left of the current position of the cursor.
HT Horizontal Tabulation Introduce a special tab stop character.
LF Line Feed Introduce an end of line.
VT Vertical Tabulation Position the form at the next line tab stop.
FF Form Feed Used to clear the screen in some terminal emulators.
CR Carriage Return Often used together with LF to mark the end of line.
SO Shift Out Switch to an alternative character set.
SI Shift In Return to regular character set after Shift Out.
DLE Data Link Escape Causes the next character to be interpreted as raw data and not as a control code or graphic character.
DC1 Device Control One Reserved for controlling the connected device.
DC2 Device Control Two Reserved for controlling the connected device.
DC3 Device Control Three Reserved for controlling the connected device.
DC4 Device Control Four Reserved for controlling the connected device.
NAK Negative Acknowledge Indicates an error was detected in the previous received block and a re-transmission should occur.
SYN Synchronous Idle Used in synchronous transmission systems to provide a signal from which synchronous correction may be achieved.
ETB End of Transmission Block Indicates the end of a transmission block of data.
CAN Cancel Indicates that the data preceding it are in error or are to be disregarded.
EM End of medium Indicates that the end of the usable portion of the tape has been reached.
SUB Substitute Indicates that garbled or invalid characters have been received.
ESC Escape The ESC can be used in software user interfaces to exit from a screen/menu/mode and on terminals to signal that what follows is a special command sequence rather than normal text.
FS File Separator Used for file separation.
GS Group separator Used for group separation.
RS Record Separator Used for record separation.
US Unit separator Used for unit separation.
SP Space It causes the active position to be advanced by one character position.
DEL Delete On terminals, the character is generated by pressing the backspace, which deletes the last character.

We should also look at the extended control codes presented in the table below [10]. Extended control codes are built on the ESC control code, which introduces an escape sequence. Each escape sequence starts with the ESC character (0x1B in hexadecimal), also called the escape character, which invokes an alternative interpretation of the subsequent characters, so that they are interpreted as a command rather than as data. Remember that, in general, an escape character can be any character, like the backslash \ in C++, which is used to escape characters for a new line, a tab, etc. An extended control code consists of the ESC character followed by a character in the range 0x40-0x5F.

Extended control codes
ESC Acronym Name Description
ESC @ PAD Padding Character Not part of ECMA-48.
ESC A HOP High Octet Preset Not part of ECMA-48.
ESC B BPH Break Permitted Here Follows a graphic character where a line break is permitted.
ESC C NBH No Break Here Follows the graphic character that is not to be broken.
ESC D IND Index Move the active position one line down, to eliminate ambiguity about the meaning of LF.
ESC E NEL Next Line Equivalent to CRLF and is used to mark end of line.
ESC F SSA Start of Selected Area Used by block-oriented terminals.
ESC G ESA End of Selected Area Used by block-oriented terminals.
ESC H HTS Horizontal Tabulation Set Causes a character tabulation stop to be set at the active position.
ESC I HTJ Horizontal Tabulation With Justification The spaces or lines are placed preceding the active field so that preceding graphic character is placed just before the next tab stop.
ESC J VTS Vertical Tabulation Set Causes a line tabulation stop to be set at the active position.
ESC K PLD Partial Line Forward Used to produce subscripts and superscripts.
ESC L PLU Partial Line Backward Used to produce subscripts and superscripts.
ESC M RI Reverse Line Feed  
ESC N SS2 Single-Shift 2 Next character invokes a graphic character from the G2 graphic sets.
ESC O SS3 Single-Shift 3 Next character invokes a graphic character from the G3 graphic sets.
ESC P DCS Device Control String Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C).
ESC Q PU1 Private Use 1 Reserved for a function without standardized meaning for private use as required.
ESC R PU2 Private Use 2 Reserved for a function without standardized meaning for private use as required.
ESC S STS Set Transmit State  
ESC T CCH Cancel Character Destructive backspace, intended to eliminate ambiguity about meaning of BS.
ESC U MW Message Waiting  
ESC V SPA Start of Protected Area Used by block-oriented terminals.
ESC W EPA End of Protected Area Used by block-oriented terminals.
ESC X SOS Start of String Followed by a control string terminated by ST (0x9C) that may contain any character except SOS or ST.
ESC Y SGCI Single Graphic Character Introducer Not part of ISO/IEC 6429.
ESC Z SCI Single Character Introducer To be followed by a single printable character (0x20 through 0x7E) or format effector (0x08 through 0x0D) used to access a graphic character regardless of which graphic or control set is in use.
ESC [ CSI Control Sequence Introducer Used to introduce control sequences that take parameters.
ESC \ ST String Terminator  
ESC ] OSC Operating System Command Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C). This control code was intended to be used to allow in-band signaling of protocol information, but can also be used as part of escape sequences.
ESC ^ PM Privacy Message Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C). This control code was intended to be used to allow in-band signaling of protocol information, but can also be used as part of escape sequences.
ESC _ APC Application Program Command Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C). This control code was intended to be used to allow in-band signaling of protocol information, but can also be used as part of escape sequences.

From the tables above we can infer that escape sequences can be used in two ways:

  • Single Control Sequences: escape sequences starting with ESC 0x1B character followed by the ASCII character 0x40-0x5F (this is effectively presented in the table above).
  • Parameter Control Sequences: a special case of single control sequence "ESC [", which is used for control sequences that take parameters. The sequence is called CSI or Control Sequence Introducer.
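The two cases above can be told apart by looking at the single byte that follows ESC. The following is a minimal sketch in POSIX shell; the function name classify_escape is made up for illustration, and anything outside the 0x40-0x5F range is simply reported as "other".

```shell
#!/bin/sh
# classify_escape (hypothetical helper): classify the character that follows ESC.
# $1: the single character following the ESC (0x1B) byte.
classify_escape() {
    # printf '%d' "'c" yields the numeric code of the character c.
    code=$(printf '%d' "'$1")
    if [ "$code" -eq 91 ]; then
        # 0x5B is '[', the Control Sequence Introducer.
        echo "parameter control sequence (CSI)"
    elif [ "$code" -ge 64 ] && [ "$code" -le 95 ]; then
        # 0x40-0x5F: the range covered by the extended control codes table.
        echo "single control sequence"
    else
        echo "other"
    fi
}

classify_escape '['
classify_escape 'D'
```

Note that "ESC c", used later in the article to reset the terminal, falls outside the 0x40-0x5F range and is therefore reported as "other" by this sketch.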

The structure of ANSI escape sequences is as follows, where the first line presents the most generic representation of the escape sequence from which all escape sequences can be derived. The rest of the lines contain generally preferred and widely used escape sequences, which are easier to understand and work with.

ESC[ <private_chars><n1>;<n2><trailing_chars><final_byte>
ESC[ <n1>;<n2><final_byte>
ESC[ <n1><final_byte>
ESC[ <final_byte>

The elements in the escape sequences are described in the table below.

Escape sequence elements
Element Description
ESC [ The control sequence introducer character, which specifies the control sequence that takes parameters.
private_chars Optional parameter specifying leading private mode characters from range 0x30-0x3F.
n1 Optional number parameter by default set to either 0 or 1 depending on the command.
; Semicolon used for future standardization, which delimits the first and second number and can be omitted if n2 and trailing_chars are not used.
n2 Optional number parameter by default set to either 0 or 1 depending on the command.
trailing_chars Optional parameter – if omitted, the semicolon can also be omitted.
final_byte Any character in range 0x40-0x7E, which can be modified by intermediate bytes in range 0x20-0x2F.
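To make the structure concrete, here is a rough sketch that splits the part of a sequence following "ESC[" into its elements. It only handles the simpler forms (at most two numeric parameters and a final byte) and ignores private_chars and trailing_chars; the function name parse_csi is hypothetical.

```shell
#!/bin/sh
# parse_csi (hypothetical helper): split the body of a CSI sequence,
# i.e. everything after "ESC[", into n1, n2 and the final byte.
parse_csi() {
    body=$1
    params=${body%?}             # everything except the last character
    final=${body#"$params"}      # the last character is the final byte
    n1=${params%%;*}             # text before the first semicolon
    case "$params" in
        *\;*) n2=${params#*;} ;; # text after the first semicolon
        *)    n2= ;;
    esac
    # Omitted parameters fall back to a command-specific default.
    echo "n1=${n1:-default} n2=${n2:-default} final=$final"
}

parse_csi '00;31m'
parse_csi '2A'
parse_csi 'A'
```

For the body "00;31m" this reports n1=00, n2=31 and the final byte m, which is the red color sequence used later in the article.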

Before showing real-world usage of escape sequences, let's also take a look at the man page of the echo command, which displays a line of text in the terminal emulator. We'll be using echo to show how escape sequences can be used when printing text to a terminal emulator. When echo is used with the -e parameter, it enables interpretation of the following backslash escapes:

  • \\ : backslash
  • \a : alert (BEL)
  • \b : backspace
  • \c : produce no further output
  • \e : escape
  • \f : form feed
  • \n : new line
  • \r : carriage return
  • \t : horizontal tab
  • \v : vertical tab
  • \0NNN : byte with octal value NNN (1 to 3 digits)
  • \xHH : byte with hexadecimal value HH (1 to 2 digits)
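Not every shell's built-in echo understands -e or \e, so as an alternative sketch the same bytes can be produced with printf and the octal escape \033, and inspected with od. The helper name esc_hex is made up for this example.

```shell
#!/bin/sh
# esc_hex (hypothetical helper): show the bytes a printf format string
# produces, as a continuous string of hexadecimal digits.
esc_hex() {
    # $1 is deliberately used as the format so its backslash escapes are expanded.
    printf "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

esc_hex '\033'       # the ESC byte itself
esc_hex '\a'         # BEL
esc_hex '\033[1A'    # ESC [ 1 A: move the cursor one cell up
```

The first call prints 1b, which is the ESC byte that starts every escape sequence discussed in this article.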

Let's take a look at a few escape sequences and how we can use them [5]:

Examples of escape sequences
CSI Command/Description Image
CSI n A Moves the cursor n (default 1) cells up. The first echo prints the sentence "This is a sunny day." normally. The second echo first prints "This is a " and then moves one line up overwriting part of the command-line text with the rest of the string "sunny day.". Then a new line is also printed that overwrites the original text "This is a sunny ". escape6
CSI n B Moves the cursor n (default 1) cells down. The second echo command prints the first part of the string, adds a new line and the rest of the string is displayed. escape7
CSI n C Moves the cursor n (default 1) cells forward. The first part of the string is displayed, after which an additional space character is introduced and the rest of the string is displayed. The result is two spaces being displayed between the two words. escape8
CSI n D Moves the cursor n (default 1) cells back. The first part of the string is displayed, after which the last character is deleted and the rest of the string displayed – the result is the deletion of the space character that delimits the two words. escape9
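The four cursor movement sequences from the table differ only in their final byte, so they can be wrapped in one small helper. This is a sketch, not a replacement for tput; the function name cursor_move and its direction keywords are assumptions made for this example.

```shell
#!/bin/sh
# cursor_move (hypothetical helper): emit CSI n A/B/C/D.
# $1: direction (up|down|forward|back), $2: number of cells (default 1).
cursor_move() {
    n=${2:-1}
    case "$1" in
        up)      final=A ;;
        down)    final=B ;;
        forward) final=C ;;
        back)    final=D ;;
        *) echo "unknown direction: $1" >&2; return 1 ;;
    esac
    # \033 is the ESC byte; ESC followed by '[' is the CSI.
    printf '\033[%s%s' "$n" "$final"
}

# Same idea as the "CSI n C" row: skip one cell between the two words.
printf 'This is a '
cursor_move forward 1
printf 'sunny day.\n'
```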

Let's also present the color escape sequences that are quite often used in shell scripts to present text in a different color. I use the following color escape sequences to define different colors that can be used for displaying text in a terminal emulator.

black='\e[00;30m'
red='\e[00;31m'
green='\e[00;32m'
yellow='\e[00;33m'
blue='\e[00;34m'
magenta='\e[00;35m'
cyan='\e[00;36m'
white='\e[00;37m'
grey='\e[30;1m\]\[\016'
none='\e[00m'

The syntax of the color escape sequences differs based on whether we're trying to change text color or background color in a terminal emulator. The syntax for each of them is presented below:

  • text : \e[{attribute_code};{text_color_code}m
  • background : \e[{attribute_code};{text_color_code};{background_color_code}m

The color for representing text elements starts with the control sequence introducer followed by the attribute code, followed by a semicolon ';' followed by the text color code and a letter 'm'. The black color above is represented with an escape sequence '\e[00;30m', where the 00 is the attribute_code and the 30 is the text_color_code.

There are many more escape sequences that can be used in the terminal emulator; they can be found on the internet at [5, 11, 12]. Most of them are used for visually formatting text displayed in a terminal emulator, while some can also be dangerous and can be used for arbitrary code execution. We'll take a look at those in the next section of the article.

To use the same example as before, we can display the word 'sunny' in red by adding the red color escape sequence before the word, as shown in the picture below. After the word 'sunny', we must reset the text color back to black by using the black color escape sequence, otherwise the rest of the line will be displayed in red.

escape10
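The same effect can be sketched with printf instead of "echo -e". The helper below is hypothetical; it uses attribute code 00 with a text color code as described above, and resets with the "none" sequence (ESC[00m) rather than switching explicitly to black, which is an assumption that suits terminals whose default text color is not black.

```shell
#!/bin/sh
# colorize (hypothetical helper): wrap text in a color escape sequence.
# $1: text color code (30-37), $2: the text to color.
colorize() {
    # ESC[00;<color>m switches the color, ESC[00m resets all attributes.
    printf '\033[00;%sm%s\033[00m' "$1" "$2"
}

printf 'This is a %s day.\n' "$(colorize 31 sunny)"
```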

So far we've only used "echo -e" to emit escape sequences in order to display certain words in red instead of black. But when an attacker wants to exploit a vulnerability in an Apache web server, he must be able to place special characters in log files, which get interpreted as escape sequences when the file is displayed in a terminal emulator. Below we can see the printf command being used to write the same sentence into the test.bin file. When the contents of test.bin are displayed in the terminal emulator with the cat command, the escape sequences are interpreted, since "sunny" is written in red. At the end we also display the contents of test.bin in hexadecimal format with the xxd command to see the bytes constituting the file.

escape11

We can also edit the text file in vim editor to see the bytes comprising a file.

escape12

Below I've highlighted the ESC byte, which has a hexadecimal representation 0x1b, the '[' character 0x5b and the zeros 0x30. What follows is the ; (0x3b), 3 (0x33), 1 (0x31), m (0x6d), etc.

escape13

Therefore, if an attacker would like to use escape sequences, he must be able to get those bytes into the log file, which the victim will subsequently output by using the cat command.
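Seen from the victim's side, a log file can be inspected without letting the terminal interpret anything. The sketch below relies on "cat -v", which prints control characters in caret notation (ESC becomes ^[), and on grep to count the lines containing a raw ESC byte. Both function names are made up for this example.

```shell
#!/bin/sh
# show_escapes (hypothetical helper): display a file with control
# characters made visible instead of being interpreted by the terminal.
show_escapes() {
    cat -v "$1"
}

# count_escape_lines (hypothetical helper): count lines that contain
# at least one raw ESC (0x1B) byte.
count_escape_lines() {
    grep -c "$(printf '\033')" "$1"
}

logfile=$(mktemp)
printf 'This is a \033[00;31msunny\033[00m day.\n' > "$logfile"
show_escapes "$logfile"
count_escape_lines "$logfile"
rm -f "$logfile"
```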

Exploitation

Escape sequences are often used in shell scripts to provide syntax highlighting, but they can also be used by an attacker to make a terminal emulator print garbage characters and possibly execute arbitrary commands on the system on which the terminal emulator is running. For an attack to be possible, the attacker must be able to display arbitrary data in the victim's terminal emulator. An attacker can do this by using security vulnerabilities in different software components to save arbitrary data to log files. An example is the CVE-2013-1862 vulnerability in Apache HTTP Server 2.2.x before 2.2.25, where the mod_rewrite module writes arbitrary data to the log file without sanitizing special characters, which could be used to inject escape sequences. A complete description of the vulnerability is presented below.

mod_rewrite.c in the mod_rewrite module in the Apache HTTP Server 2.2.x before 2.2.25 writes data to a log file without sanitizing non-printable characters, which might allow remote attackers to execute arbitrary commands via an HTTP request containing an escape sequence for a terminal emulator.

Because the vulnerability exists in the Apache HTTP server, it allows an attacker to write escape sequences into the error log files.
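The missing step is sanitization of non-printable characters before they reach the log. As a purely illustrative sketch (this is not the fix Apache shipped, and the function name sanitize_log_line is hypothetical), a filter that replaces every byte that is neither printable ASCII nor a newline could look like this:

```shell
#!/bin/sh
# sanitize_log_line (hypothetical helper): read text on stdin and replace
# every byte that is neither printable ASCII nor a newline with '?'.
sanitize_log_line() {
    # LC_ALL=C makes tr treat the input as plain bytes.
    LC_ALL=C tr -c '[:print:]\n' '?'
}

# A request line carrying an embedded color escape sequence.
printf 'GET /\033[00;31mindex.html HTTP/1.1\n' | sanitize_log_line
```

After filtering, the ESC byte is gone, so the remaining "[00;31m" is displayed as ordinary text rather than acted upon by the terminal.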

Dangerous Escape Sequences

Terminal emulators support several features that can be abused, as described below [8]:

  • Screen Dumping: a screen dump escape sequence will open an arbitrary file and write the current content of the terminal into the file. Some terminal emulators will not write to existing files, but only to new files, while others will simply overwrite the file with the new contents. An attacker might use this feature to create a new backdoor PHP file in the DocumentRoot of the web server, which can later be used to execute arbitrary commands.
  • Window Title: an escape sequence exists for setting the window title, which will change the window title string. This feature can be used together with another escape sequence, which reads the current window title and prints it to the current command line. Since a carriage return character is prohibited in the window title, an attacker can store the command in a window title and print it to the current command line, but it would still require a user to press enter in order to execute it. There are techniques for making the command invisible, like setting the text color to the same color as the background, which increases the chances of the user pressing the enter key.
  • Command Execution: some terminal emulators could even allow execution of the command directly by using an escape sequence.
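As a harmless illustration of the window title feature, the sketch below only sets the title. It assumes the xterm-style convention "ESC ] 0 ; title BEL", which the text above does not spell out and which not every terminal emulator honors; the function name set_title is hypothetical. The sequence that reads the title back is deliberately left out.

```shell
#!/bin/sh
# set_title (hypothetical helper): emit an OSC sequence that sets the
# window title. Assumes the xterm-style form ESC ] 0 ; <title> BEL.
set_title() {
    printf '\033]0;%s\007' "$1"
}

# Inspect the bytes instead of sending them to the terminal.
set_title 'build finished' | od -An -c
```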

To test the screen dumping capability, we will download rxvt version 2.7.8, which we can do by using the following commands. The compilation will most probably fail, which is why we need to apply this patch, which solves the problem and lets us build successfully.

# apt-get install build-essential libxt-dev
# wget http://downloads.sourceforge.net/project/rxvt/rxvt-dev/2.7.8/rxvt-2.7.8.tar.gz
# tar xvzf rxvt-2.7.8.tar.gz
# cd rxvt-2.7.8/
# ./configure && make && make install

Then we have to consult the rxvt documentation available in rxvt-2.7.8/doc/rxvtRef.html file, which contains the following escape sequences:

Used escape sequences in rxvt-2.7.8
Escape Sequence Description
ESC c Reset the current contents of the terminal window, where ESC is written as the '\e' character.
LF Insert a new line into the current terminal window; the '\n' character.
ESC ] 55;Pt Save the entire scrollback buffer of the current terminal window into a file specified by Pt, where ESC is written as the '\e' character.
BEL Bell; the '\a' character.

Let's now execute the command shown in the picture below in the newly built rxvt; we're using echo to send escape sequences to the current terminal emulator. The red box (number 1) marks the "ESC c", which resets the contents of the current terminal window. Next, the "data" string representing the actual file contents is echoed to the terminal window. Afterwards, there's the "ESC ]55;Pt", where the Pt text string is '/tmp/index.php'. That command dumps the current contents of the terminal window into the file /tmp/index.php. At the end there's a bell character indicating the end of the command.

escape14
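For readers without the picture, the byte sequence described above can be sketched as follows. The order of the elements follows the description (ESC c, the data, ESC ]55;Pt, BEL); the exact placement of the newline is an assumption, and the function name build_dump_payload is hypothetical. The bytes are piped into od here, so nothing is interpreted by the terminal.

```shell
#!/bin/sh
# build_dump_payload (hypothetical helper): assemble the rxvt 2.7.8
# screen dump sequence described in the text.
# $1: text to leave on the screen, $2: path the screen is dumped to (Pt).
build_dump_payload() {
    # ESC c        reset the terminal window
    # <data> LF    the content that ends up in the file (newline assumed)
    # ESC ] 55;Pt  dump the window contents into the file Pt
    # BEL          end of the command
    printf '\033c%s\n\033]55;%s\007' "$1" "$2"
}

# Show the bytes only; do not send them to a terminal emulator.
build_dump_payload data /tmp/index.php | od -An -c
```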

Once the command is executed, only the word "data" is present in the terminal window, while the rest of the characters have been reset.

escape15

If we display the contents of the /tmp/index.php file, we can observe that the word data is present in it along with a number of new line characters. Therefore the attack on the terminal emulator has been successful and an arbitrary file was written to the file system. An attacker can use this functionality to create a backdoor in the DocumentRoot of the Apache web server in order to gain complete access to the system.

escape16

The actual vulnerable code is in the rxvt-2.7.8/src/command.c source file and is presented below. The code uses a switch statement, which checks whether the requested operation is Xterm_dumpscreen, in which case a file handle is opened, the contents of the screen are dumped to the file, and the file is closed.

escape17

Conclusion

In this article we've seen how escape sequences can be used to perform various actions in a terminal emulator. It's important to emphasize that up-to-date terminal emulators most likely don't have such vulnerabilities, since they have been mitigated in the past. However, recently developed terminal emulators used on other platforms, such as embedded devices, Android or iOS, can still contain such vulnerabilities.

All in all, this makes an interesting attack vector, which we must be aware of when penetration testing. If we can trick an application into writing non-printable characters to a log file, then arbitrary escape sequences can be included in the file, which could be used to execute arbitrary code in a vulnerable version of a terminal emulator, write an arbitrary file to the file system, etc.

Additional research needs to be done to determine whether recent terminal emulators support dangerous escape sequences that could lead to a system compromise. The terminal emulators on Android, iOS and other embedded systems are the ones that should be checked.
