In the early days of computing there were hardware terminals. Terminal emulators, also called ttys, are programs that emulate such a video terminal. On modern computers we're mostly used to graphical user interfaces (GUIs), whereas a terminal emulator like xterm is used to access the command line interfaces (CLIs) or text user interfaces (TUIs) of programs.
In this article we'll look at how the Apache vulnerability CVE-2013-1862 can be used to mount an escape sequence attack. The most common hardware terminals in the old days included the VT100, VT52 and VT220 [2, 3, 4].
The table below shows the capabilities of the most widely used terminal emulators available today.
Terminal emulators usually support escape sequences that abide by terminal control sequence standards such as ECMA-48, ANSI X3.64 or ISO/IEC 6429. An ANSI escape code, or escape sequence, is a way of sending metadata and control information to the terminal over the same communication channel as the text itself, by encoding it with special characters. Since only one communication channel is available, everything sent over it would otherwise simply be printed to the screen as text. Escape sequences solve this problem, because they allow us to mark certain characters as commands, while the rest of the characters are data.
Many programs emit escape sequences directly to add functionality, such as changing the color of important text to red. Ideally, programs should not emit terminal escape sequences by hand, but should rather rely on ncurses or on utilities such as tput and reset.
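As a rough sketch of the tput approach, the function below asks the terminfo database for the "set foreground color" and "reset attributes" strings instead of hard-coding them. The function name highlight and the fallback to xterm when TERM is unset are assumptions made for this example. If tput or the terminal description is missing, the text is simply printed without color.

```shell
# highlight: print the given text in red, using tput to look up the
# terminal-specific sequences instead of hard-coding them.
highlight() (
    term=${TERM:-xterm}                          # assumed fallback terminal type
    red=$(tput -T "$term" setaf 1 2>/dev/null) || red=''
    reset=$(tput -T "$term" sgr0 2>/dev/null)  || reset=''
    printf '%s%s%s\n' "$red" "$1" "$reset"
)

highlight 'important text'
```

Because the sequences come from terminfo, the same script keeps working on terminals that use different byte sequences for the same effect.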
Linux terminal controls can be used to invoke special functionality in the terminal by means of control characters. Character encoding is primarily used to represent printable characters on the screen, but it also supports special characters that carry additional information about the text, such as the position of the text or a beep indicating that the text has been received [10].
Escape Sequences
Escape sequences are used to support features such as formatting, color changes, cursor positioning, keyboard reconfiguration and window title updates. The encoding works by using a specific sequence of bytes, which are embedded directly into the text stream but are handled specially by the terminal. The set of escape sequences supported by the hardware terminals of the old days differs from the set supported by today's terminal emulators, so we have to look up whether an escape sequence is supported by a specific terminal before using it.
Today, most terminal emulators interpret at least some of the ANSI escape sequences, which can be generated by a wide range of console programs such as grep, sed, awk, cat, tail, less, more and emacs.
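Because the control bytes travel in the same stream as the text, anything that reads the stream has to separate the two. The sketch below does this for one family of sequences only, those starting with ESC [, by deleting them with sed and leaving the plain text. The function name strip_csi is hypothetical, and other sequence types are deliberately not handled.

```shell
# strip_csi: copy stdin to stdout with CSI sequences removed, i.e.
# ESC [ + parameter bytes (0x30-0x3F) + intermediate bytes (0x20-0x2F)
# + one final byte (0x40-0x7E).
strip_csi() {
    esc=$(printf '\033')
    LC_ALL=C sed "s|${esc}\[[0-?]*[ -/]*[@-~]||g"
}

printf 'It is \033[00;31msunny\033[00m today\n' | strip_csi
```

The terminal performs the mirror image of this operation: it acts on the bytes that sed removes here and displays the rest.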
The ASCII table is presented in the picture below, so we can easily look up the characters mentioned.
The standard ASCII control codes are the following (they are also shown in the ASCII picture above) [10]:
Acronym | Name | Description |
---|---|---|
NUL | Null | Used as a string terminator, especially in the C programming language. |
SOH | Start of Heading | First character of the message header. |
STX | Start of Text | First character of the message text. |
ETX | End of Text | Used as a break character (Ctrl-C) to interrupt a program. |
EOT | End of Transmission | Used to signal end of file, which is used to logout from a terminal. |
ENQ | Enquiry | A signal used for checking whether the receiving end is still present. |
ACK | Acknowledge | A signal used to indicate a successful receipt of the message used as a response code to ENQ. |
BEL | Bell | Used to invoke a sound bell in terminal. |
BS | Backspace | Delete the last character to the left of the current cursor position. |
HT | Horizontal Tabulation | Introduce a special tab stop character. |
LF | Line Feed | Introduce an end of line. |
VT | Vertical Tabulation | Position the form at the next line tab stop. |
FF | Form Feed | Used to clear the screen in some terminal emulators. |
CR | Carriage Return | Often used together with LF to mark the end of line. |
SO | Shift Out | Switch to an alternative character set. |
SI | Shift In | Return to regular character set after Shift Out. |
DLE | Data Link Escape | Causes the next character to be interpreted as raw data and not as a control code or graphic character. |
DC1 | Device Control One | Reserved for controlling the connected device. |
DC2 | Device Control Two | Reserved for controlling the connected device. |
DC3 | Device Control Three | Reserved for controlling the connected device. |
DC4 | Device Control Four | Reserved for controlling the connected device. |
NAK | Negative Acknowledge | Indicates an error was detected in the previous received block and a re-transmission should occur. |
SYN | Synchronous Idle | Used in synchronous transmission systems to provide a signal from which synchronous correction may be achieved. |
ETB | End of Transmission Block | Indicates the end of a transmission block of data. |
CAN | Cancel | Indicates that the data preceding it are in error or are to be disregarded. |
EM | End of Medium | Indicates that the end of the usable portion of the tape has been reached. |
SUB | Substitute | Indicates that garbled or invalid characters have been received. |
ESC | Escape | The ESC can be used in software user interfaces to exit from a screen/menu/mode and on terminals to signal that what follows is a special command sequence rather than normal text. |
FS | File Separator | Used for file separation. |
GS | Group Separator | Used for group separation. |
RS | Record Separator | Used for record separation. |
US | Unit Separator | Used for unit separation. |
SP | Space | It causes the active position to be advanced by one character position. |
DEL | Delete | On terminals, the character is generated by pressing the backspace key, which deletes the last character. |
We should also look at the extended control codes presented in the table below [10]. Extended control codes are built from the ESC control code, which introduces an escape sequence. Each escape sequence starts with the ESC character (0x1B in hexadecimal representation), also called the escape character, which invokes an alternative interpretation of the subsequent characters, so that they are interpreted as a command rather than as data. Remember that an escape character can be any character, like the backslash \ in C++, which is used to escape characters for a new line, a tab, etc. The sequences in the table consist of the ESC escape character followed by a character in the range 0x40-0x5F.
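Each two-byte sequence of this kind also has a single-byte 8-bit form, obtained by adding 0x40 to the second character, so the range 0x40-0x5F maps onto 0x80-0x9F. The helper below is a small sketch of that arithmetic; the name c1_code is made up for this example.

```shell
# c1_code: print the 8-bit code equivalent to the two-byte sequence
# "ESC <char>", where <char> must lie in the range 0x40-0x5F.
c1_code() {
    n=$(printf '%d' "'$1")            # numeric value of the character
    if [ "$n" -lt 64 ] || [ "$n" -gt 95 ]; then
        echo "character outside 0x40-0x5F" >&2
        return 1
    fi
    printf '0x%02X\n' "$((n + 64))"   # 0x40-0x5F maps onto 0x80-0x9F
}

c1_code '['    # CSI, prints 0x9B
c1_code ']'    # OSC, prints 0x9D
c1_code 'P'    # DCS, prints 0x90
```

The same rule explains why ST is written as 0x9C in the descriptions: its second character has the code 0x5C.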
ESC | Acronym | Name | Description |
---|---|---|---|
ESC @ | PAD | Padding Character | Not part of ECMA-48. |
ESC A | HOP | High Octet Preset | Not part of ECMA-48. |
ESC B | BPH | Break Permitted Here | Follows a graphic character where a line break is permitted. |
ESC C | NBH | No Break Here | Follows the graphic character that is not to be broken. |
ESC D | IND | Index | Move the active position one line down, to eliminate ambiguity about the meaning of LF. |
ESC E | NEL | Next Line | Equivalent to CRLF and is used to mark end of line. |
ESC F | SSA | Start of Selected Area | Used by block-oriented terminals. |
ESC G | ESA | End of Selected Area | Used by block-oriented terminals. |
ESC H | HTS | Horizontal Tabulation Set | Causes a character tabulation stop to be set at the active position. |
ESC I | HTJ | Horizontal Tabulation With Justification | The spaces or lines are placed preceding the active field so that preceding graphic character is placed just before the next tab stop. |
ESC J | VTS | Vertical Tabulation Set | Causes a line tabulation stop to be set at the active position. |
ESC K | PLD | Partial Line Forward | Used to produce subscripts and superscripts. |
ESC L | PLU | Partial Line Backward | Used to produce subscripts and superscripts. |
ESC M | RI | Reverse Line Feed | |
ESC N | SS2 | Single-Shift 2 | Next character invokes a graphic character from the G2 graphic sets. |
ESC O | SS3 | Single-Shift 3 | Next character invokes a graphic character from the G3 graphic sets. |
ESC P | DCS | Device Control String | Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C). |
ESC Q | PU1 | Private Use 1 | Reserved for a function without standardized meaning for private use as required. |
ESC R | PU2 | Private Use 2 | Reserved for a function without standardized meaning for private use as required. |
ESC S | STS | Set Transmit State | |
ESC T | CCH | Cancel Character | Destructive backspace, intended to eliminate ambiguity about meaning of BS. |
ESC U | MW | Message Waiting | |
ESC V | SPA | Start of Protected Area | Used by block-oriented terminals. |
ESC W | EPA | End of Protected Area | Used by block-oriented terminals. |
ESC X | SOS | Start of String | Followed by a control string terminated by ST (0x9C) that may contain any character except SOS or ST. |
ESC Y | SGCI | Single Graphic Character Introducer | Not part of ISO/IEC 6429. |
ESC Z | SCI | Single Character Introducer | To be followed by a single printable character (0x20 through 0x7E) or format effector (0x08 through 0x0D) used to access a graphic character regardless of which graphic or control set is in use. |
ESC [ | CSI | Control Sequence Introducer | Used to introduce control sequences that take parameters. |
ESC \ | ST | String Terminator | Terminates the control strings started by DCS, SOS, OSC, PM and APC. |
ESC ] | OSC | Operating System Command | Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C). This control code was intended to be used to allow in-band signaling of protocol information, but can also be used as part of escape sequences. |
ESC ^ | PM | Privacy Message | Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C). This control code was intended to be used to allow in-band signaling of protocol information, but can also be used as part of escape sequences. |
ESC _ | APC | Application Program Command | Followed by a string of printable characters (0x20 through 0x7E) and format effectors (0x08 through 0x0D), terminated by ST (0x9C). This control code was intended to be used to allow in-band signaling of protocol information, but can also be used as part of escape sequences. |
From the tables above we can infer that escape sequences can be used in two ways:
- Single Control Sequences: escape sequences consisting of the ESC character (0x1B) followed by an ASCII character in the range 0x40-0x5F (these are the sequences presented in the table above).
- Parameter Control Sequences: a special case of the single control sequence "ESC [", which introduces control sequences that take parameters. This sequence is called CSI, or Control Sequence Introducer.
The structure of ANSI escape sequences is as follows, where the first line presents the most generic representation of the escape sequence from which all escape sequences can be derived. The rest of the lines contain generally preferred and widely used escape sequences, which are easier to understand and work with.

    ESC[ <private_chars><n1>;<n2><trailing_chars><final_byte>
    ESC[ <n1>;<n2><final_byte>
    ESC[ <n1><final_byte>
    ESC[ <final_byte>

The elements in the escape sequences are described in the table below.
Element | Description |
---|---|
ESC [ | The control sequence introducer character, which specifies the control sequence that takes parameters. |
private_chars | Optional parameter specifying leading private mode characters from range 0x30-0x3F. |
n1 | Optional number parameter by default set to either 0 or 1 depending on the command. |
; | Semicolon, which delimits the first and second number and can be omitted if n2 and trailing_chars are not used. |
n2 | Optional number parameter by default set to either 0 or 1 depending on the command. |
trailing_chars | Optional parameter – if omitted, the semicolon can also be omitted. |
final_byte | Any character in range 0x40-0x7E, which can be modified by intermediate bytes in range 0x20-0x2F. |
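
To make the structure more tangible, the sketch below splits the part of a sequence that follows "ESC [" into the elements from the table. The function name parse_csi is hypothetical, and the example assumes that the leading private characters are limited to <, =, > and ?, which is narrower than the range given in the table.

```shell
# parse_csi: split the part of a CSI sequence that follows "ESC [" into
# the elements from the table. Prints nothing if the input is malformed.
parse_csi() {
    printf '%s\n' "$1" | LC_ALL=C sed -n \
        's|^\([<=>?]*\)\([0-9;]*\)\([ -/]*\)\([@-~]\)$|private_chars=\1 numbers=\2 trailing_chars=\3 final_byte=\4|p'
}

parse_csi '00;31m'   # two numbers, final byte m
parse_csi '?25h'     # private character ?, one number, final byte h
parse_csi '2J'       # one number, final byte J
```

For '00;31m' this reports numbers=00;31 and final_byte=m, with the two optional elements left empty.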
Before showing real-world usage of escape sequences, let's also take a look at the man page of the echo command, which is used to display a line of text in the terminal emulator. We'll be using the echo command to show how escape sequences can be used when printing text to a terminal emulator. When echo is used together with the -e option, it enables interpretation of the following backslash escapes:
- \\ : backslash
- \a : alert (BEL)
- \b : backspace
- \c : produce no further output
- \e : escape
- \f : form feed
- \n : new line
- \r : carriage return
- \t : horizontal tab
- \v : vertical tab
- \0NNN : byte with octal value NNN (1 to 3 digits)
- \xHH : byte with hexadecimal value HH (1 to 2 digits)
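
To see which bytes a few of these escapes actually produce, we can expand them and print the result in hexadecimal. The sketch below uses printf rather than echo -e, since support for the -e option varies between shells, and it spells the escape character in octal as \033 because not every printf understands \e. The helper name show_bytes is made up for this example.

```shell
# show_bytes: expand the backslash escapes in $1 with printf and print
# the resulting bytes as hexadecimal digits. $1 is used as the printf
# format, so it must not contain a literal % character.
show_bytes() {
    printf "$1" | od -An -v -tx1 | tr -d ' \t\n'
    echo
}

show_bytes '\a'         # 07 (BEL)
show_bytes '\t'         # 09 (HT)
show_bytes '\033'       # 1b (ESC)
show_bytes '\033[00m'   # 1b5b30306d
```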
Let's take a look at a few escape sequences and how we can use them [5]:
Let's also present the color escape sequences that are quite often used in shell scripts to present text in a different color. I use the following color escape sequences to define different colors that can be used for displaying text in a terminal emulator.

    black='\e[00;30m'
    red='\e[00;31m'
    green='\e[00;32m'
    yellow='\e[00;33m'
    blue='\e[00;34m'
    magenta='\e[00;35m'
    cyan='\e[00;36m'
    white='\e[00;37m'
    grey='\e[30;1m\]\[\016'
    none='\e[00m'

The syntax of the color escape sequences differs based on whether we're trying to change text color or background color in a terminal emulator. The syntax for each of them is presented below:
- text : \e[{attribute_code};{text_color_code}m
- background : \e[{attribute_code};{text_color_code};{background_color_code}m
The color for representing text elements starts with the control sequence introducer followed by the attribute code, followed by a semicolon ';' followed by the text color code and a letter 'm'. The black color above is represented with an escape sequence '\e[00;30m', where the 00 is the attribute_code and the 30 is the text_color_code.
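The two syntaxes can be wrapped in one small helper. This is only a sketch: the function name sgr is hypothetical, and the background color code 44 (blue) comes from the usual color numbering rather than from the list of variables above.

```shell
# sgr: build a color escape sequence from an attribute code, a text
# color code and an optional background color code.
sgr() {
    if [ -n "$3" ]; then
        printf '\033[%s;%s;%sm' "$1" "$2" "$3"
    else
        printf '\033[%s;%sm' "$1" "$2"
    fi
}

reset=$(printf '\033[00m')   # same bytes as the 'none' sequence

# Red text, then white text on a blue background.
printf '%ssunny%s and %scloudy%s\n' \
    "$(sgr 00 31)" "$reset" "$(sgr 00 37 44)" "$reset"
```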
There are many more escape sequences that can be used in the terminal emulator; they are documented in [5, 11, 12]. Most of them affect the visual presentation of text displayed in a terminal emulator, while some are dangerous and can be used for arbitrary code execution. We'll take a look at those in the next section of the article.
To use the same example as before, we can display the word 'sunny' in red by adding the red color escape sequence before the word, as shown in the picture below. After the word 'sunny', we must reset the text color back to black by using the black color escape sequence, otherwise the rest of the line will also be displayed in red.
So far we've only used "echo -e" to emit escape sequences in order to present certain words in red instead of black. But when an attacker wants to exploit a vulnerability in an Apache web server, he must be able to include special characters in log files, which get parsed as escape sequences when the files are displayed in the terminal emulator. Below we can see the usage of the printf command, where the same sentence is redirected to the test.bin file. When the contents of the test.bin file are displayed in the terminal emulator with the cat command, the escape sequences are interpreted, since the word "sunny" is written in red. At the end we also display the contents of the test.bin file in hexadecimal format with the xxd command to see the bytes constituting the file.
We can also edit the text file in vim editor to see the bytes comprising a file.
Below I've highlighted the ESC byte, which has a hexadecimal representation 0x1b, the '[' character 0x5b and the zeros 0x30. What follows is the ; (0x3b), 3 (0x33), 1 (0x31), m (0x6d), etc.
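The same inspection can be sketched in a few lines of shell. The sentence written to the file is a stand-in chosen for this example, the helper names write_sample and hex_bytes are hypothetical, and od is used in place of xxd only because it is available on more systems.

```shell
# write_sample: write a sentence containing the red color sequence and
# the reset sequence to the file named by $1.
write_sample() {
    printf 'It is \033[00;31msunny\033[00m today\n' > "$1"
}

# hex_bytes: print the first $2 bytes of file $1 as hexadecimal digits.
hex_bytes() {
    od -An -v -tx1 -N "$2" "$1" | tr -d ' \t\n'
    echo
}

file=$(mktemp)
write_sample "$file"
hex_bytes "$file" 14    # prints 4974206973201b5b30303b33316d
rm -f "$file"
```

The first six bytes are the text "It is ", followed by 1b 5b 30 30 3b 33 31 6d, which is exactly the ESC [ 0 0 ; 3 1 m sequence.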
Therefore, if an attacker would like to use escape sequences, he must be able to get those bytes into the log file, which will subsequently be displayed by the victim using the cat command.
Exploitation
Escape sequences are often used in shell scripts to provide syntax highlighting, but they can also be used by an attacker to cause the terminal emulator to print garbage characters and possibly to execute arbitrary commands on the system on which the terminal emulator is running. For an attack to be possible, the attacker must be able to display arbitrary data in the victim's terminal emulator. An attacker can do this by using security vulnerabilities in different software components to save arbitrary data to log files. An example is the CVE-2013-1862 vulnerability in Apache HTTP Server 2.2.x before 2.2.25, where the mod_rewrite module writes data to the log file without sanitizing non-printable characters, which can be used to inject escape sequences. The complete description of the vulnerability is presented below [7].
mod_rewrite.c in the mod_rewrite module in the Apache HTTP Server 2.2.x before 2.2.25 writes data to a log file without sanitizing non-printable characters, which might allow remote attackers to execute arbitrary commands via an HTTP request containing an escape sequence for a terminal emulator.
Because the vulnerability exists in the Apache HTTP Server, it allows an attacker to write escape sequences into the server's log files.
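To illustrate what sanitizing non-printable characters means in practice, the filter below replaces every byte that is not printable ASCII, a tab or a newline with a question mark. This is only a sketch of the general idea, not the fix that was applied to Apache; the function name sanitize_log and the sample request line are made up for this example.

```shell
# sanitize_log: replace every byte that is not printable ASCII, a tab or
# a newline with '?', so that an ESC byte can no longer start a sequence.
sanitize_log() {
    LC_ALL=C tr -c '\11\12\40-\176' '?'
}

printf 'GET /\033[00;31mpath HTTP/1.1\n' | sanitize_log
```

The output is "GET /?[00;31mpath HTTP/1.1": the parameters of the sequence are still there, but without the ESC byte the terminal treats them as ordinary text.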
Dangerous Escape Sequences
Terminal emulators support multiple features as described below [8]:
- Screen Dumping: a screen dump escape sequence opens an arbitrary file and writes the current content of the terminal into it. Some terminal emulators will not write to existing files, but only to new files, while others will simply overwrite the file with the new contents. An attacker might use this feature to create a new backdoor PHP file in the DocumentRoot of the web server, which can later be used to execute arbitrary commands.
- Window Title: an escape sequence exists for setting the window title, which changes the window title string. This feature can be used together with another escape sequence, which reads the current window title and prints it to the current command line. Since a carriage return character is prohibited in the window title, an attacker can store the command in a window title and print it to the current command line, but it would still require the user to press enter in order to execute it. There are techniques for making the command invisible, like setting the text color to the same color as the background, which increases the chances of the user pressing the enter key.
- Command Execution: some terminal emulators could even allow execution of the command directly by using an escape sequence.
To test the screen dumping capability, we will download and build rxvt version 2.7.8 using the commands below. The compilation will most probably fail, in which case we need to apply this patch, after which we'll be able to build successfully.

    # apt-get install build-essential libxt-dev
    # wget http://downloads.sourceforge.net/project/rxvt/rxvt-dev/2.7.8/rxvt-2.7.8.tar.gz
    # tar xvzf rxvt-2.7.8.tar.gz
    # cd rxvt-2.7.8/
    # ./configure && make && make install

Then we have to consult the rxvt documentation available in rxvt-2.7.8/doc/rxvtRef.html file, which contains the following escape sequences:
Escape Sequence | Description |
---|---|
ESC c | Reset the current contents of the terminal window; ESC is written as '\e'. |
LF | Insert a new line into the current terminal window; written as '\n'. |
ESC ] 55;Pt | Save the entire scrollback buffer of the current terminal window into the file specified by Pt; ESC is written as '\e'. |
BEL | Bell; written as '\a'. |
Let's now execute the command presented in the picture below in the newly built rxvt; we're using echo to send escape sequences to the current terminal emulator. The red box (number 1) marks the "ESC c" sequence, which resets the contents of the current terminal window. Next, the "data" string representing the actual file contents is echoed to the terminal window. Afterwards, there's the "ESC ] 55;Pt" sequence, where the Pt text string is '/tmp/index.php'. That command dumps the current contents of the terminal window into the file /tmp/index.php. At the end there's a bell character indicating the end of the command.
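The byte layout described above can be sketched as follows. The helper name dump_payload is hypothetical, and the exact command from the picture may differ, for example in the new line characters it contains. Here the bytes are piped into od for inspection instead of being sent to a terminal.

```shell
# dump_payload: emit ESC c, the text to place on the screen, then
# ESC ] 55 ; <path> BEL. $1 is the text and $2 is the file path.
dump_payload() {
    printf '\033c%s\033]55;%s\007' "$1" "$2"
}

# Show the bytes; 033 is the ESC byte and \a is the bell character.
dump_payload 'data' '/tmp/index.php' | od -c
```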
Once the command is executed, only the word "data" is present in the terminal window, while the rest of the characters have been reset.
If we display the contents of the /tmp/index.php file, we can observe that the word "data" is present in it, along with a number of new line characters. Therefore the attack on the terminal emulator has been successful and an arbitrary file was written to the file system. An attacker can use this functionality to create a backdoor in the DocumentRoot of the Apache web server in order to gain complete access to the system.
The actual vulnerable code is present in the rxvt-2.7.8/src/command.c source file, which contains the code presented below. The code uses a switch statement, which checks whether the requested operation is Xterm_dumpscreen, in which case a file handle is opened, the contents of the screen are dumped to the file, and the file is closed.
Conclusion
In this article we've seen how escape sequences can be used to perform various actions in the terminal emulator. It's important to emphasize that up-to-date terminal emulators in use today most likely don't have such vulnerabilities, since they have been mitigated in the past. However, recently developed terminal emulators, as well as those used on other platforms such as embedded devices, Android and iOS, can still contain such vulnerabilities.
All in all, this makes for an interesting attack vector, which we must be aware of when penetration testing. If we can get the application to write non-printable characters to a log file, then arbitrary escape sequences can be included in the file, which could be used to execute arbitrary code in a vulnerable version of a terminal emulator, write an arbitrary file to the file system, etc.
Additional research is needed to determine whether recent terminal emulators support dangerous escape sequences that could lead to a system compromise. The terminal emulators on Android, iOS and embedded systems are the ones that should be checked.
References:
[1] Terminal emulator, https://en.wikipedia.org/wiki/Terminal_emulator.
[2] VT100, https://en.wikipedia.org/wiki/VT100.
[3] VT52, https://en.wikipedia.org/wiki/VT52.
[4] VT220, https://en.wikipedia.org/wiki/VT220.
[5] ANSI escape code, https://en.wikipedia.org/wiki/ANSI_escape_code.
[6] Ascii Table, http://www.asciitable.com.
[7] CVE-2013-1862, http://www.cvedetails.com/cve/CVE-2013-1862.
[8] TERMINAL EMULATOR SECURITY ISSUES, http://marc.info/?l=bugtraq&m=104612710031920&q=p3.
[9] Xterm Control Sequences, http://www.xfree86.org/current/ctlseqs.html.
[10] C0 and C1 control codes, https://en.wikipedia.org/wiki/C0_and_C1_control_codes.
[11] ANSI Escape sequences (ANSI Escape codes), http://ascii-table.com/ansi-escape-sequences.php.
[12] ANSI Escape Sequences, http://www.isthe.com/chongo/tech/comp/ansi_escapes.html.