How can I remove ANSI escape sequences from a string in Python?

This is my string:

'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m' 

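I am using code to retrieve the output of an SSH command, and I want my string to contain only 'examplefile.zip'.

What can I use to remove the extra escape sequences?

Remove them with a regular expression: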

    import re
    ansi_escape = re.compile(r'\x1b[^m]*m')
    ansi_escape.sub('', sometext)

Demo:

    >>> import re
    >>> ansi_escape = re.compile(r'\x1b[^m]*m')
    >>> sometext = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'
    >>> ansi_escape.sub('', sometext)
    'ls\r\nexamplefile.zip\r\n'

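The accepted answer to this question only considers color and font effects. There are many sequences that do not end in 'm', such as cursor positioning, erasing, and scroll regions.

The complete regular expression for control sequences (aka ANSI escape sequences) is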

 /(\x9B|\x1B\[)[0-?]*[ -\/]*[@-~]/ 

See ECMA-48 Section 5.4 and ANSI escape code.
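To see the difference in Python, here is a small sketch comparing the two patterns on a string that contains an erase-line sequence (`\x1b[2K`). The sample string and the variable names are made up for illustration.

```python
import re

# Pattern from the accepted answer: everything from ESC up to the next 'm'.
color_only = re.compile(r'\x1b[^m]*m')
# Control-sequence pattern quoted above, written as a Python raw string.
csi = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')

# Hypothetical sample: an erase-line sequence (final byte 'K')
# followed by ordinary text that happens to contain an 'm'.
sample = 'a\x1b[2Kb m c'

color_result = color_only.sub('', sample)
csi_result = csi.sub('', sample)

print(repr(color_result))  # 'a c'    -> the text 'b m' was swallowed
print(repr(csi_result))    # 'ab m c' -> only the sequence was removed
```

The first pattern keeps consuming characters until it finds an 'm', so it can eat ordinary text after any sequence that ends in a different final byte.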

Function

Based on Martijn Pieters♦'s answer, with Jeff's regular expression.

    import re

    def escape_ansi(line):
        # Strip CSI control sequences (ESC [ ... final byte) from the line.
        ansi_escape = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')
        return ansi_escape.sub('', line)

Test

    def test_remove_ansi_escape_sequence(self):
        line = '\t\u001b[0;35mBlabla\u001b[0m \u001b[0;36m172.18.0.2\u001b[0m'
        escaped_line = escape_ansi(line)
        self.assertEqual(escaped_line, '\tBlabla 172.18.0.2')

Testing

If you want to run it yourself, use python3 (better unicode support, blablabla). Here is how the test file should look:

    import unittest
    import re

    def escape_ansi(line):
        …

    class TestStringMethods(unittest.TestCase):
        def test_remove_ansi_escape_sequence(self):
            …

    if __name__ == '__main__':
        unittest.main()

The suggested regular expressions didn't do the trick for me, so I created one of my own. The following is a Python regex that I created based on the spec found here.

    ansi_regex = r'\x1b(' \
                 r'(\[\??\d+[hl])|' \
                 r'([=<>a-kzNM78])|' \
                 r'([\(\)][a-b0-2])|' \
                 r'(\[\d{0,2}[ma-dgkjqi])|' \
                 r'(\[\d+;\d+[hfy]?)|' \
                 r'(\[;?[hf])|' \
                 r'(#[3-68])|' \
                 r'([01356]n)|' \
                 r'(O[mlnp-z]?)|' \
                 r'(/Z)|' \
                 r'(\d+)|' \
                 r'(\[\?\d;\d0c)|' \
                 r'(\d;\dR))'
    ansi_escape = re.compile(ansi_regex, flags=re.IGNORECASE)

I tested my regex on the following snippet (basically a copy-paste from the ascii-table.com page):

 \x1b[20h Set \x1b[?1h Set \x1b[?3h Set \x1b[?4h Set \x1b[?5h Set \x1b[?6h Set \x1b[?7h Set \x1b[?8h Set \x1b[?9h Set \x1b[20l Set \x1b[?1l Set \x1b[?2l Set \x1b[?3l Set \x1b[?4l Set \x1b[?5l Set \x1b[?6l Set \x1b[?7l Reset \x1b[?8l Reset \x1b[?9l Reset \x1b= Set \x1b> Set \x1b(A Set \x1b)A Set \x1b(B Set \x1b)B Set \x1b(0 Set \x1b)0 Set \x1b(1 Set \x1b)1 Set \x1b(2 Set \x1b)2 Set \x1bN Set \x1bO Set \x1b[m Turn \x1b[0m Turn \x1b[1m Turn \x1b[2m Turn \x1b[4m Turn \x1b[5m Turn \x1b[7m Turn \x1b[8m Turn \x1b[1;2 Set \x1b[1A Move \x1b[2B Move \x1b[3C Move \x1b[4D Move \x1b[H Move \x1b[;H Move \x1b[4;3H Move \x1b[f Move \x1b[;f Move \x1b[1;2 Move \x1bD Move/scroll \x1bM Move/scroll \x1bE Move \x1b7 Save \x1b8 Restore \x1bH Set \x1b[g Clear \x1b[0g Clear \x1b[3g Clear \x1b#3 Double-height \x1b#4 Double-height \x1b#5 Single \x1b#6 Double \x1b[K Clear \x1b[0K Clear \x1b[1K Clear \x1b[2K Clear \x1b[J Clear \x1b[0J Clear \x1b[1J Clear \x1b[2J Clear \x1b5n Device \x1b0n Response: \x1b3n Response: \x1b6n Get \x1b[c Identify \x1b[0c Identify \x1b[?1;20c Response: \x1bc Reset \x1b#8 Screen \x1b[2;1y Confidence \x1b[2;2y Confidence \x1b[2;9y Repeat \x1b[2;10y Repeat \x1b[0q Turn \x1b[1q Turn \x1b[2q Turn \x1b[3q Turn \x1b[4q Turn \x1b< Enter/exit \x1b= Enter \x1b> Exit \x1bF Use \x1bG Use \x1bA Move \x1bB Move \x1bC Move \x1bD Move \x1bH Move \x1b12 Move \x1bI \x1bK \x1bJ \x1bZ \x1b/Z \x1bOP \x1bOQ \x1bOR \x1bOS \x1bA \x1bB \x1bC \x1bD \x1bOp \x1bOq \x1bOr \x1bOs \x1bOt \x1bOu \x1bOv \x1bOw \x1bOx \x1bOy \x1bOm \x1bOl \x1bOn \x1bOM \x1b[i \x1b[1i \x1b[4i \x1b[5i 
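As a rough usage sketch, the pattern above can be applied with `sub` just like the simpler ones. The sample string below is hypothetical; it combines three sequences taken from the list (`\x1b[?1h`, `\x1b[0m`, `\x1b[4;3H`) with single letters between them so the result is easy to check.

```python
import re

# Same pattern as in the answer above, repeated so the sketch is self-contained.
ansi_regex = r'\x1b(' \
             r'(\[\??\d+[hl])|' \
             r'([=<>a-kzNM78])|' \
             r'([\(\)][a-b0-2])|' \
             r'(\[\d{0,2}[ma-dgkjqi])|' \
             r'(\[\d+;\d+[hfy]?)|' \
             r'(\[;?[hf])|' \
             r'(#[3-68])|' \
             r'([01356]n)|' \
             r'(O[mlnp-z]?)|' \
             r'(/Z)|' \
             r'(\d+)|' \
             r'(\[\?\d;\d0c)|' \
             r'(\d;\dR))'
ansi_escape = re.compile(ansi_regex, flags=re.IGNORECASE)

# Hypothetical sample: mode set, attribute reset, cursor move, with letters in between.
sample = 'x\x1b[?1hy\x1b[0mz\x1b[4;3Hw'
cleaned = ansi_escape.sub('', sample)

print(repr(cleaned))  # 'xyzw'
```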

Hope this helps someone :)

If you want to remove the \r\n bit, you can pass the string through this function (written by sarnold):

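    def stripEscape(string):
        """ Removes all escape sequences from the input string """
        # Build the set of characters to delete: code points 0x01 to 0x1F.
        delete = ""
        i = 1
        while (i < 0x20):
            delete += chr(i)
            i += 1
        # Python 2 form: str.translate(table, deletechars).
        # Python 3's str.translate takes a single mapping argument instead.
        t = string.translate(None, delete)
        return t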

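Careful though, this will lump together the text before and after the escape sequences. So, using Martijn's filtered string 'ls\r\nexamplefile.zip\r\n', you will get lsexamplefile.zip. Note the ls in front of the desired filename.

I would use the stripEscape function first to remove the escape sequences, then pass the output to Martijn's regular expression, which would avoid concatenating the unwanted bit.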
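For anyone on Python 3, here is a hedged sketch of how the two steps might be combined. `strip_control_chars` is a hypothetical Python 3 port of stripEscape (it is not from the original answers), and splitting on line breaks with `splitlines` is a swapped-in alternative for keeping 'ls' and the filename apart. One thing worth checking in your own setup: the characters deleted by stripEscape include the ESC byte (0x1B), so the order of the two steps changes the result.

```python
import re

# Control-sequence pattern from the earlier answer in this thread.
CSI = re.compile(r'(\x9B|\x1B\[)[0-?]*[ -/]*[@-~]')

def strip_control_chars(text):
    """Hypothetical Python 3 port of stripEscape: delete code points 0x01-0x1F."""
    table = {code: None for code in range(1, 0x20)}
    return text.translate(table)

raw = 'ls\r\n\x1b[00m\x1b[01;31mexamplefile.zip\x1b[00m\r\n\x1b[01;31m'

# Regex first, then split on line breaks: the pieces stay separate.
no_ansi = CSI.sub('', raw)
lines = no_ansi.splitlines()

# Regex first, then delete control characters: the pieces are merged.
merged = strip_control_chars(no_ansi)

# Control characters first: ESC is gone, so the regex no longer matches anything.
stripped_first = CSI.sub('', strip_control_chars(raw))

print(lines)           # ['ls', 'examplefile.zip']
print(merged)          # lsexamplefile.zip
print(stripped_first)  # ls[00m[01;31mexamplefile.zip[00m[01;31m
```

With the lines kept separate, the filename from the question is simply `lines[1]`.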