TransWikia.com

Removing ANSI color codes from text stream

Super User Asked on November 27, 2021

Examining the output from

perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";'

in a text editor (e.g., vi) shows the following:

^[[37mABC
^[[0m

How would one remove the ANSI color codes from the output file? I suppose the best way would be to pipe the output through a stream editor of sorts.

The following does not work

perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' | perl -pe 's/\^\[\[37m//g' | perl -pe 's/\^\[\[0m//g'

15 Answers

Python port of Adam Katz's excellent and comprehensive perl answer:

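    import re

    def escape_ansi(line):
        # re1: CSI sequences (ESC [ parameters intermediates final)
        re1 = re.compile(r'\x1b\[[\x30-\x3f]*[\x20-\x2f]*[\x40-\x7e]')
        # re2: string sequences terminated by ST (ESC \)
        re2 = re.compile(r'\x1b[PX^_].*?\x1b\\')
        # re3: OSC sequences terminated by BEL or ST
        re3 = re.compile(r'\x1b\][^\a]*(?:\a|\x1b\\)')
        # re4: remaining two-byte escapes
        re4 = re.compile(r'\x1b[\[\]A-Z\\^_@]')
        # re5: zero-width ASCII characters
        # see https://superuser.com/a/1388860
        re5 = re.compile(r'[\x00-\x1f\x7f-\x9f\xad]+')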

        for r in [re1, re2, re3, re4, re5]:
            line = r.sub('', line)

        return line

This includes the C0/C1 control character removal (re5), so drop that pattern if you don't need it. I realize that multiple regex passes are not optimal, but it did the trick for me and optimization wasn't a concern.
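If the multiple passes ever do become a concern, one possible variation is to join the same five patterns into a single alternation and substitute once. This is only a sketch: the names ANSI_RE and strip_ansi are made up for illustration, and a single pass is not guaranteed to match the multi-pass result on every unusual input. Note that the last alternative also removes tabs and newlines, since they are C0 control characters.

```python
import re

# Hypothetical names; the alternatives are the five patterns from the
# function above, tried in the same order at each position.
ANSI_RE = re.compile(
    r'\x1b\[[\x30-\x3f]*[\x20-\x2f]*[\x40-\x7e]'  # CSI sequences
    r'|\x1b[PX^_].*?\x1b\\'                       # strings terminated by ST
    r'|\x1b\][^\a]*(?:\a|\x1b\\)'                 # OSC terminated by BEL or ST
    r'|\x1b[\[\]A-Z\\^_@]'                        # remaining two-byte escapes
    r'|[\x00-\x1f\x7f-\x9f\xad]+'                 # C0/C1 controls, DEL, soft hyphen
)

def strip_ansi(text):
    """Remove escape sequences and control characters in one pass."""
    return ANSI_RE.sub('', text)

if __name__ == '__main__':
    print(strip_ansi('\x1b[37mABC\x1b[0m'))
```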

Answered by Kevin on November 27, 2021

I've had to look this up too many times, so I decided to make a free online tool for it. No need to remember sed commands for this!

Hope it works well for you, too: https://maxschmitt.me/ansistrip/

Answered by Macks on November 27, 2021

There's also a dedicated tool for the job: ansifilter. Use the default --text output format.

ref: https://stackoverflow.com/a/6534712

Answered by Juan on November 27, 2021

Combining @Adam-Katz @Mike answers I get:

sed -E $'s|\x1b\\[[0-?]*[ -/]*[@-~]||g;
         s|\x1b[PX^_][^\x1b]*\x1b\\\\||g;
         s:\x1b\\][^\x07]*(\x07|\x1b\\\\)::g;
         s|\x1b[@-_]||g'

This should work on macOS, Linux, and mingw64x (Git for Windows).

Note: On very old GNU sed (before 4.2, as old as the one in CentOS 6.0), the -E flag needs to be replaced with -r.

Explanation of the regexes

1st: An ANSI CSI Code consists of (in order)

  1. One \x1b
  2. One [
  3. Zero or more parameter bytes 0x30-0x3f
  4. Zero or more intermediate bytes 0x20-0x2f
  5. One final byte 0x40-0x7e

2nd and 3rd: I'm unfamiliar with these in practice, but have read about them in the linked page.

4th: Just a catch all to get all remaining escape codes, assuming there are zero extra bytes. As these codes could do anything they want, it's possible data bytes get left behind, but extremely unlikely as they aren't used much in practice.
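To make the CSI structure described above concrete, here is a small Python sketch that splits each CSI sequence into its parameter, intermediate, and final bytes. The names CSI_RE and describe_csi are invented for this illustration and are not part of any library.

```python
import re

# ESC [ , then parameter bytes 0x30-0x3f, intermediate bytes 0x20-0x2f,
# and exactly one final byte 0x40-0x7e.
CSI_RE = re.compile(r'\x1b\[([\x30-\x3f]*)([\x20-\x2f]*)([\x40-\x7e])')

def describe_csi(text):
    """Return (parameters, intermediates, final) for each CSI sequence."""
    return [m.groups() for m in CSI_RE.finditer(text)]

if __name__ == '__main__':
    for parts in describe_csi('\x1b[1;31mred\x1b[K'):
        print(parts)
```

For the sample string this reports a color sequence with parameters 1;31 and final byte m, followed by an erase sequence with no parameters and final byte K.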

Answered by Andy on November 27, 2021

ansi2txt

https://unix.stackexchange.com/a/527259/116915

cat typescript | ansi2txt | col -b
  • ansi2txt: remove ANSI color codes
  • col -b: remove ^H or ^M

Answered by yurenchen on November 27, 2021

"tput sgr0" left this control character ^(B^[
Here is a modified version to take care of that.

perl -pe 's/\e[\[\(][0-9;]*[mGKFB]//g' logfile.log

Answered by GustafAnkarloo on November 27, 2021

I believe this is an authoritative removal of all ANSI escape sequences:

perl -pe '
  s/\e\[[\x30-\x3f]*[\x20-\x2f]*[\x40-\x7e]//g;
  s/\e[PX^_].*?\e\\//g;
  s/\e\][^\a]*(?:\a|\e\\)//g;
  s/\e[\[\]A-Z\\^_@]//g;'

(Please note that perl, like many other languages (but not sed), accepts \e as the escape character Esc, \x1b or \033 by code, shown in terminals as ^[. I'm using it here because it seems more intuitive.)

This perl command, which you can run all on one line if you prefer, has four replacements in it:

The first goes after CSI sequences (escape code sequences that begin with the "Control Sequence Introducer" of Esc[, which covers a lot more than the Select Graphic Rendition sequences that make up the color codes and other text decorations).

The second replacement removes the remaining sequences that involve trailing characters and terminate with ST (the String Terminator, Esc\). The third replacement is the same thing but also allows Operating System Command sequences to end with a BEL (\x07, \007, often \a).

The fourth replacement removes the remaining escapes.

Also consider removing other zero-width ASCII characters such as BEL and other more obscure C0 and C1 control characters. I've been using s/[\x00-\x1f\x7f-\x9f\xad]+//g, which also includes Delete and Soft Hyphen. This excludes Unicode's higher coded zero-width characters but I believe it's exhaustive for ASCII (Unicode \x00-\xff). If you do this, remove these last since they can be involved in longer sequences.
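The advice to remove control characters last can be illustrated with a short Python sketch. The sample string (an OSC sequence that sets a window title, followed by a prompt) is made up for the demonstration; the two patterns are the control-character class and the OSC replacement described above.

```python
import re

CONTROLS = re.compile(r'[\x00-\x1f\x7f-\x9f\xad]+')
OSC = re.compile(r'\x1b\][^\a]*(?:\a|\x1b\\)')

# Hypothetical sample: an OSC title sequence ended by BEL, then visible text.
sample = '\x1b]0;window title\x07prompt$ '

# Controls first: ESC and BEL vanish, so the OSC pattern no longer matches
# and the payload is left behind as visible debris.
wrong = OSC.sub('', CONTROLS.sub('', sample))

# Sequences first, controls last: the whole OSC sequence is removed.
right = CONTROLS.sub('', OSC.sub('', sample))

if __name__ == '__main__':
    print(repr(wrong))
    print(repr(right))
```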

Answered by Adam Katz on November 27, 2021

This is what worked for me (tested on Mac OS X)

perl -pe 's/\e\[[0-9;]*[mGKF]//g'

Answered by Miguel Mota on November 27, 2021

commandlinefu gives this answer which strips ANSI colours as well as movement commands:

sed "s,\x1B\[[0-9;]*[a-zA-Z],,g"

For just colours, you want:

 sed "s,\x1B\[[0-9;]*m,,g"

Answered by Tom Hale on November 27, 2021

If you prefer something simple, you could use my strip-ansi-cli package (Node.js required):

$ npm install --global strip-ansi-cli

Then use it like this:

$ strip-ansi < colors.o

Or just pass in a string:

$ strip-ansi '^[[37mABC^[[0m'

Answered by Sindre Sorhus on November 27, 2021

I had a similar problem with removing characters added when collecting interactive top output via PuTTY, and this helped:

cat putty1.log | perl -pe 's/\x1b.*?[mGKH]//g'

Answered by Michał Faleński on November 27, 2021

The accepted answer didn't work for me, so I created this regex instead to remove the escape sequences produced by the perl Term::ANSIColor module.

cat colors.o | perl -pe 's/\x1b\[[^m]+m//g;'

Grawity's regex should work fine, but using +'s appears to work ok too.

Answered by castl3bravo on November 27, 2021

I have found a better escape sequence remover if you're using macOS. Check this:

perl -pe 's/\x1b\[[0-9;]*[mG]//g'

Answered by user204331 on November 27, 2021

The characters ^[[37m and ^[[0m are part of the ANSI escape sequences (CSI codes).  See also these specifications.

Using GNU sed

sed 's/\x1b\[[0-9;]*m//g'
  • \x1b (or \x1B) is the escape special character
    (sed does not support alternatives \e and \033)
  • \[ is the second character of the escape sequence
  • [0-9;]* is the color value(s) regex
  • m is the last character of the escape sequence

⚠ On macOS, the default sed command does not support special characters like \e as pointed out by slm and steamer25 in the comments. Use instead gsed that you can install using brew install gnu-sed.

Example with OP's command line:   (OP means Original Poster)

perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' |
      sed 's/\x1b\[[0-9;]*m//g'

Tom Hale suggests removing all other escape sequences by using [a-zA-Z] instead of just the letter m, which is specific to the graphics mode (color) escape sequence. But [a-zA-Z] may be too wide and could remove too much. Michał Faleński and Miguel Mota propose removing only some escape sequences using [mGKH] and [mGKF] respectively. Britton Kerin indicates K must also be used in addition to m to remove the colors from gcc error/warning (do not forget to redirect gcc 2>&1 | sed...).

sed 's/\x1b\[[0-9;]*m//g'           # Remove color sequences only
sed 's/\x1b\[[0-9;]*[a-zA-Z]//g'    # Remove all escape sequences
sed 's/\x1b\[[0-9;]*[mGKH]//g'      # Remove color and move sequences
sed 's/\x1b\[[0-9;]*[mGKF]//g'      # Remove color and move sequences

Last character   Purpose
--------------   -----------------------------------------
m                Graphics Rendition Mode (including Color)
G                Horizontal cursor move
K                Horizontal deletion
H                New cursor position
F                Move cursor to previous n lines
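The choice of final characters in the table above can also be made a parameter. The following Python sketch assumes a hypothetical helper named strip_csi whose finals argument lists the last characters to remove; the sample string imitates the kind of output Britton Kerin describes for gcc, where a color code is followed by an erase code ending in K.

```python
import re

def strip_csi(text, finals='m'):
    """Remove ESC [ digits-and-semicolons sequences ending in one of finals."""
    # finals is a hypothetical parameter; re.escape keeps characters such as
    # ']' or '^' from changing the meaning of the character class.
    pattern = re.compile(r'\x1b\[[0-9;]*[' + re.escape(finals) + r']')
    return pattern.sub('', text)

if __name__ == '__main__':
    line = '\x1b[01m\x1b[Kfile.c:1:\x1b[m\x1b[K error'
    print(repr(strip_csi(line)))        # colors only, K sequences remain
    print(repr(strip_csi(line, 'mK')))  # colors and line erasure
```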

Using perl

The version of sed installed on some operating systems may be limited (e.g. macOS). The command perl has the advantage of being generally easier to install/update on more operating systems. Adam Katz suggests using \e (same as \x1b) in PCRE.

Choose your regex depending on how many commands you want to filter:

perl -pe 's/\e\[[0-9;]*m//g'          # Remove colors only
perl -pe 's/\e\[[0-9;]*[mG]//g'
perl -pe 's/\e\[[0-9;]*[mGKH]//g'
perl -pe 's/\e\[[0-9;]*[a-zA-Z]//g'
perl -pe 's/\e\[[0-9;]*m(?:\e\[K)?//g' # Adam Katz's trick

Example with OP's command line:

perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' |
      perl -pe 's/\e\[[0-9;]*m//g'

Usage

As pointed out by Stuart Cardall's comment, this sed command line is used by the project Ultimate Nginx Bad Bot (1000 stars) to clean up the email report ;-)

Answered by oHo on November 27, 2021

What is displayed as ^[ is not ^ and [; it is the ASCII ESC character, produced by Esc or Ctrl+[ (the ^ notation means the Ctrl key).

ESC is 0x1B hexadecimal or 033 octal, so you have to use \x1B or \033 in your regexes:

perl -pe 's/\033\[37m//g; s/\033\[0m//g'

perl -pe 's/\033\[\d*(;\d*)*m//g'
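The same point can be checked in Python. This sketch assumes the question's perl one-liner prints the string held in colored, then compares a pattern that looks for a literal caret and bracket (what the editor displays) with one that looks for the real ESC byte.

```python
import re

# Assumed output of the perl one-liner from the question.
colored = '\x1b[37mABC\n\x1b[0m'

# Matches the two visible characters "^[" followed by "[", which never occur.
literal_caret = re.compile(r'\^\[\[[0-9;]*m')
# Matches the single ESC byte (0x1B) followed by "[".
real_escape = re.compile(r'\x1b\[[0-9;]*m')

unchanged = literal_caret.sub('', colored)
cleaned = real_escape.sub('', colored)

if __name__ == '__main__':
    print(repr(unchanged))
    print(repr(cleaned))
```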

Answered by user1686 on November 27, 2021
