How to make the 'cut' command treat several consecutive delimiters as one?

I am trying to extract a certain (the fourth) field from a column-based, space-aligned text stream. I tried to use the cut command as follows:

cat text.txt | cut -d " " -f 4

Unfortunately, cut does not treat several spaces as one delimiter. I could pipe through awk

awk '{ print $4 }'

or sed

sed -E "s/[[:space:]]+/ /g"

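to collapse the spaces, but I would like to know whether there is a way to make cut itself handle several consecutive delimiters natively.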

Try:

 cat text.txt | tr -s ' ' | cut -d ' ' -f4

From the tr man page:

 -s, --squeeze-repeats   replace each input sequence of a repeated character
                         that is listed in SET1 with a single occurrence
                         of that character
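To see what the squeeze changes, here is a minimal sketch on a made-up sample line (the variable names and the text are only illustrative). It compares cut on the raw line with cut after tr -s:

```shell
# Hypothetical sample line with uneven runs of spaces between the columns.
line='alpha   beta      gamma delta'

# Without squeezing, cut treats every single space as a delimiter,
# so fields 2 and 3 are empty and field 4 is the second word.
raw=$(printf '%s\n' "$line" | cut -d ' ' -f 4)

# tr -s ' ' collapses each run of spaces into one space,
# so field 4 becomes the fourth word.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 4)

printf 'raw: [%s]\nsqueezed: [%s]\n' "$raw" "$squeezed"
```

One caveat: if a line starts with spaces, a single leading space survives the squeeze, and cut then sees an empty first field.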

As you note in your question, awk is really the way to go. Using cut is possible together with tr -s to squeeze the spaces, as kev's answer shows.

However, let me go through all the possible combinations for future readers. Explanations are in the Test section.

tr | cut

 tr -s ' ' < file | cut -d' ' -f4

awk

 awk '{print $4}' file

bash

 while read -r _ _ _ myfield _
 do
    echo "fourth field: $myfield"
 done < file

sed

 sed -r 's/^([^ ]*[ ]*){3}([^ ]*).*/\2/' file

Test

Given this file, let's test the commands:

 $ cat a
 this   is     line 1 more text
 this      is line    2     more text
 this    is line 3     more text
 this is   line 4            more    text

 $ cut -d' ' -f4 a
 is
                         # it does not show what we want!


 $ tr -s ' ' < a | cut -d' ' -f4
 1
 2                       # this makes it!
 3
 4
 $

 $ awk '{print $4}' a
 1
 2
 3
 4
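With its default field separator, awk also ignores leading blanks and treats tabs like spaces, which tr -s ' ' alone does not cover. A small sketch, assuming a made-up input line:

```shell
# Hypothetical line: leading spaces, then a mix of tabs and spaces.
line=$(printf '  one\t two \t\tthree   four five')

# With the default FS, awk splits on runs of blanks (spaces and tabs)
# and discards leading blanks, so $4 is the fourth word.
fourth=$(printf '%s\n' "$line" | awk '{ print $4 }')

printf '%s\n' "$fourth"
```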

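This reads the fields sequentially. By using _ we indicate that this is a throwaway ("junk") variable, so those fields are ignored. This way, the fourth field of each line ends up in $myfield (named $a in the run below), no matter how many spaces separate the fields.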

 $ while read -r _ _ _ a _; do echo "4th field: $a"; done < a
 4th field: 1
 4th field: 2
 4th field: 3
 4th field: 4
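The same splitting can be tried on a single string. This is only a sketch: the input text and the variable name fourth are made up, and it relies on the default IFS (space, tab, newline).

```shell
# Hypothetical input with irregular spacing, including leading blanks.
input='   one  two     three four   five six'

# read splits on IFS whitespace: leading blanks are dropped and runs of
# blanks count as one separator. The final _ receives the rest of the line.
read -r _ _ _ fourth _ <<EOF
$input
EOF

printf '%s\n' "$fourth"
```

Because the here-document feeds read directly rather than through a pipeline, the variable is set in the current shell and remains available afterwards.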

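This matches three groups of non-spaces followed by spaces with ([^ ]*[ ]*){3}. Then it captures everything up to the next space as the fourth field, which is finally printed with \2.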

 $ sed -r 's/^([^ ]*[ ]*){3}([^ ]*).*/\2/' a
 1
 2
 3
 4
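The expression can also be tried on one line at a time. This sketch uses -E, which both GNU and BSD sed accept for extended regular expressions, in place of -r; the sample line and variable names are only illustrative.

```shell
# Hypothetical sample line with uneven spacing.
line='this   is     line 1 more text'

# ([^ ]*[ ]*){3} consumes the first three words with their trailing spaces,
# ([^ ]*) captures the fourth word, and .* swallows the rest of the line,
# so replacing the whole match with \2 leaves only the fourth word.
fourth=$(printf '%s\n' "$line" | sed -E 's/^([^ ]*[ ]*){3}([^ ]*).*/\2/')

printf '%s\n' "$fourth"
```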

After becoming frustrated with the many limitations of cut, I wrote my own replacement, which I called cuts, for "cut on steroids".

cuts provides what is probably the most minimalist solution to this problem and to many other related cut/paste problems.

One example, out of many, that addresses this particular question:

 $ cat text.txt
 0   1        2 3
 0 1          2   3 4

 $ cuts 2 text.txt
 2
 2

cuts supports:

and much more, none of which is provided by standard cut.

See also: https://stackoverflow.com/a/24543231/1296044

Source (free software): http://arielf.github.io/cuts/

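With the versions of cut I know of, no, this is not possible. cut is mainly useful for parsing files where the separator is not whitespace (for example /etc/passwd) and which have a fixed number of fields. Two separators in a row mean an empty field, and that applies to whitespace too.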
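A short sketch of why empty fields matter for such files. The record below is made up in the style of /etc/passwd, with two empty fields; squeezing the delimiter first shifts every later column.

```shell
# Hypothetical passwd-style record; the second and fifth fields are empty.
record='alice::1000:100::/home/alice:/bin/sh'

# cut keeps the empty fields, so positions stay stable: field 3 is the UID.
uid=$(printf '%s\n' "$record" | cut -d ':' -f 3)

# Squeezing the colons removes the empty fields and shifts the columns,
# so field 3 is now the GID instead.
shifted=$(printf '%s\n' "$record" | tr -s ':' | cut -d ':' -f 3)

printf 'uid: %s\nafter squeeze: %s\n' "$uid" "$shifted"
```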

This Perl one-liner shows how closely Perl is related to awk:

 perl -lane 'print $F[3]' text.txt

Note, however, that the @F autosplit array starts at index $F[0], while awk fields start at $1.