How do I define a hash table in Bash?

What is the equivalent of a Python dictionary in Bash? (It should work across OS X and Linux.)

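Bash 4

Bash 4 natively supports this feature. Make sure your script's hashbang is #!/usr/bin/env bash or #!/bin/bash, or anything else that references bash rather than sh. Make sure you are executing your script directly, and not doing something like sh script, which causes your bash hashbang to be ignored. This is basic stuff, but so many people keep failing at it that it bears repeating.

You declare an associative array by doing: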

 declare -A animals

You can fill it with elements using the normal array assignment operator:

 animals=( ["moo"]="cow" ["woof"]="dog")

Or combine the two steps:

 declare -A animals=( ["moo"]="cow" ["woof"]="dog")

Then use them just like normal arrays. "${animals[@]}" expands to the values; "${!animals[@]}" (notice the !) expands to the keys. Don't forget to quote them:

 echo "${animals[moo]}"
 for sound in "${!animals[@]}"; do echo "$sound - ${animals[$sound]}"; done
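
Beyond lookup and iteration, a few other dictionary-style operations come up often: adding one entry, testing for a key, deleting a key, and counting entries. The following is a minimal sketch, assuming bash 4.0 or later; the extra key meow and the value cat are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: requires bash 4.0+ for declare -A.
declare -A animals=( ["moo"]="cow" ["woof"]="dog" )

# Add (or overwrite) a single entry; "meow"/"cat" are illustrative values.
animals["meow"]="cat"

# Test for a key: ${name[key]+x} expands to "x" only when the element is set,
# so an entry with an empty value still counts as present.
if [[ -n "${animals[moo]+x}" ]]; then
    echo "moo is present"
fi

# Delete an entry; the quotes stop the shell from globbing the brackets.
unset 'animals[woof]'

# Number of entries left.
echo "${#animals[@]} entries"
```

This prints "moo is present" followed by "2 entries".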

Bash 3

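Before bash 4, you don't have associative arrays. Do not use eval to emulate them. Avoid eval like the plague, because it is the plague of shell scripting. The most important reason is that you don't want your data treated as executable code (there are many other reasons too).

First and foremost: consider upgrading to bash 4. Seriously. The future is now; stop living in the past and making every poor soul stuck maintaining your code suffer from the stupid, broken, ugly hacks you force into it.

If you have some silly excuse why you "can't upgrade", declare is a far safer option. It does not evaluate data as bash code the way eval does, so it is much harder to inject arbitrary code through it.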
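
To make the difference concrete, here is a small sketch that feeds the same value to eval and to declare. The variable names and the payload string are invented for the demonstration; the "injected" command is a harmless assignment standing in for whatever an attacker might supply.

```bash
#!/usr/bin/env bash
# Sketch only: shows a value being run as code by eval but not by declare.
key=moo
payload='cow; injected=yes'   # hypothetical untrusted value

# eval re-parses the string, so the shell sees two commands:
#   animals_moo=cow
#   injected=yes
eval "animals_$key=$payload"
echo "after eval:    animals_moo='$animals_moo' injected='${injected:-no}'"

unset animals_moo injected

# declare receives the whole string as one argument and stores it literally.
declare "animals_$key=$payload"
echo "after declare: animals_moo='$animals_moo' injected='${injected:-no}'"
```

Note that the key is still interpolated into a variable name in both cases, so it should be restricted to characters that are valid in an identifier.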

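Let's prepare the answer by introducing the concepts:

First, indirection (seriously: never use this unless you have lost your mind or have some other bad excuse for writing hacks).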

 $ animals_moo=cow; sound=moo; i="animals_$sound"; echo "${!i}"
 cow

Secondly, declare:

 $ sound=moo; animal=cow; declare "animals_$sound=$animal"; echo "$animals_moo"
 cow

Putting them together:

 # Set a value:
 declare "array_$index=$value"
 # Get a value:
 arrayGet() {
     local array=$1 index=$2
     local i="${array}_$index"
     printf '%s' "${!i}"
 }

Let's use it:

 $ sound=moo
 $ animal=cow
 $ declare "animals_$sound=$animal"
 $ arrayGet animals "$sound"
 cow

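Note: declare cannot be put inside a function. Any use of declare inside a bash function makes the variable it creates local to that function, which means we cannot use it to access or modify global arrays. (In bash 4.2 and later you can use declare -g to declare global variables, but in bash 4 you should be using associative arrays in the first place, not this hack.)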
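
The scoping problem is easy to reproduce. The sketch below first shows the value vanishing when declare runs inside a function, then shows one possible workaround that is not part of the original answer: printf -v (available since bash 3.1) assigns to a named variable without making it local. The helper names setLocal and arraySet are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: setLocal and arraySet are hypothetical helper names.

# declare inside a function creates a function-local variable, so the
# value is gone as soon as the function returns.
setLocal() {
    declare "animals_$1=$2"
}
setLocal woof dog
echo "after setLocal: ${animals_woof:-<unset>}"

# printf -v does not create a local, so with no local of that name in
# scope the assignment lands in the global variable.
arraySet() {
    local array=$1 index=$2 value=$3
    printf -v "${array}_${index}" '%s' "$value"
}
arraySet animals moo cow
echo "after arraySet: ${animals_moo:-<unset>}"
```

As with declare, the array name and index end up in a variable name, so they must be valid identifier characters.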

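Summary

Upgrade to bash 4 and use declare -A. If you can't, consider switching entirely to awk before resorting to ugly hacks like the ones described above. And definitely stay away from eval hackery.

There is parameter substitution, though it may be un-PC as well... like indirection.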

 #!/bin/bash
 # Array pretending to be a Pythonic dictionary
 ARRAY=( "cow:moo"
         "dinosaur:roar"
         "bird:chirp"
         "bash:rock" )
 for animal in "${ARRAY[@]}" ; do
     KEY="${animal%%:*}"
     VALUE="${animal##*:}"
     printf "%s likes to %s.\n" "$KEY" "$VALUE"
 done
 printf "%s is an extinct animal which likes to %s\n" "${ARRAY[1]%%:*}" "${ARRAY[1]##*:}"

The Bash 4 way is better of course, but if you need a hack, only a hack will do. You can search the array/hash with similar techniques.
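
As one illustration of such a search, the following sketch walks the array and returns the value for a given key. The function name lookup is hypothetical, and the approach assumes keys never contain a colon.

```bash
#!/usr/bin/env bash
# Sketch only: lookup is a hypothetical helper; keys must not contain ':'.
ARRAY=( "cow:moo"
        "dinosaur:roar"
        "bird:chirp"
        "bash:rock" )

lookup() {
    local wanted=$1 entry
    for entry in "${ARRAY[@]}"; do
        # Compare the text before the first colon with the requested key.
        if [[ "${entry%%:*}" == "$wanted" ]]; then
            # Print everything after the first colon and stop searching.
            printf '%s\n' "${entry#*:}"
            return 0
        fi
    done
    return 1   # key not found
}

lookup dinosaur
lookup unicorn || echo "unicorn: no such key"
```

The lookup is linear in the number of entries, which is usually acceptable for the small tables this hack is suited to.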

This is what I was looking for here:

 declare -A hashmap
 hashmap["key"]="value"
 hashmap["key2"]="value2"
 echo "${hashmap["key"]}"
 for key in "${!hashmap[@]}"; do echo "$key"; done
 for value in "${hashmap[@]}"; do echo "$value"; done
 echo "hashmap has ${#hashmap[@]} elements"

This did not work for me with bash 4.1.5:

 animals=( ["moo"]="cow" )

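You can further modify the hput()/hget() interface so that you have named hashes, as follows:

 hput() {
     eval "$1""$2"='$3'
 }
 hget() {
     eval echo '${'"$1$2"'#hash}'
 }

and then

 hput capitals France Paris
 hput capitals Netherlands Amsterdam
 hput capitals Spain Madrid
 echo `hget capitals France` and `hget capitals Netherlands` and `hget capitals Spain`

This lets you define other maps that don't conflict (for example, a second map that looks up countries by capital city). Either way, I think you'll find that this is all pretty terrible, performance-wise.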

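If you really want fast hash lookup, there's a terrible, terrible hack that actually works really well. It is this: write your keys and values out to a temporary file, one pair per line, then use grep "^$key" to get them back out, piping through cut or awk or sed or whatever to retrieve the values.

Like I said, it sounds terrible, and it sounds like it ought to be slow and do all sorts of unnecessary IO, but in practice it is very fast (disk cache is awesome, isn't it?), even for very large hash tables. You have to enforce key uniqueness yourself, and so on. Even if you only have a few hundred entries, the output file/grep combo is going to be quite a bit faster; in my experience, several times faster. It also eats less memory.

Here's one way to do it:

 hinit() {
     rm -f /tmp/hashmap.$1
 }
 hput() {
     echo "$2 $3" >> /tmp/hashmap.$1
 }
 hget() {
     grep "^$2 " /tmp/hashmap.$1 | awk '{ print $2 };'
 }
 hinit capitals
 hput capitals France Paris
 hput capitals Netherlands Amsterdam
 hput capitals Spain Madrid
 echo `hget capitals France` and `hget capitals Netherlands` and `hget capitals Spain`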
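
The file-based approach leaves key uniqueness to the caller. One way to enforce it, sketched here under assumptions that are not in the original answer, is to rewrite the file on every put so that any earlier line for the same key is dropped. The hashdir variable and the rewrite strategy are illustrative, and keys and values are assumed to contain no whitespace.

```bash
#!/usr/bin/env bash
# Sketch only: hashdir and the rewrite-on-put strategy are assumptions.
hashdir=$(mktemp -d)

hput() {
    local file="$hashdir/hashmap.$1" tmp
    touch "$file"
    tmp=$(mktemp "$hashdir/tmp.XXXXXX")
    # Copy every line whose first field is not the key, then append the new pair.
    awk -v k="$2" '$1 != k' "$file" > "$tmp"
    printf '%s %s\n' "$2" "$3" >> "$tmp"
    mv "$tmp" "$file"
}

hget() {
    # Print the value of the first matching key and stop reading.
    awk -v k="$2" '$1 == k { print $2; exit }' "$hashdir/hashmap.$1"
}

hput capitals France Lyon
hput capitals France Paris      # replaces the earlier France entry
hput capitals Spain Madrid
echo "$(hget capitals France) and $(hget capitals Spain)"
```

Rewriting the file makes each put cost a full pass over the table, so this trades write speed for uniqueness. Remove the directory with rm -rf "$hashdir" when finished.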

 hput () {
     eval hash"$1"='$2'
 }
 hget () {
     eval echo '${hash'"$1"'#hash}'
 }
 hput France Paris
 hput Netherlands Amsterdam
 hput Spain Madrid
 echo `hget France` and `hget Netherlands` and `hget Spain`

 $ sh hash.sh
 Paris and Amsterdam and Madrid

Consider a solution using the bash builtin read, as illustrated in the following snippet from a ufw firewall script. This approach has the advantage of handling as many delimited fields as you need (not just 2). We used the | delimiter because port range specifiers may require a colon, e.g. 6001:6010.

 #!/usr/bin/env bash
 readonly connections=(
     '192.168.1.4/24|tcp|22'
     '192.168.1.4/24|tcp|53'
     '192.168.1.4/24|tcp|80'
     '192.168.1.4/24|tcp|139'
     '192.168.1.4/24|tcp|443'
     '192.168.1.4/24|tcp|445'
     '192.168.1.4/24|tcp|631'
     '192.168.1.4/24|tcp|5901'
     '192.168.1.4/24|tcp|6566'
 )
 function set_connections(){
     local range proto port
     for fields in "${connections[@]}"
     do
         IFS=$'|' read -r range proto port <<< "$fields"
         ufw allow from "$range" proto "$proto" to any port "$port"
     done
 }
 set_connections
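
The same splitting technique can be tried without touching a firewall. This sketch uses invented sample records and simply prints the fields, including a port range that contains a colon.

```bash
#!/usr/bin/env bash
# Sketch only: the records below are invented sample data.
records=(
    '10.0.0.0/8|tcp|6001:6010'
    '192.168.1.0/24|udp|53'
)

for record in "${records[@]}"; do
    # IFS is set only for this one read, so the global IFS is untouched.
    IFS='|' read -r range proto port <<< "$record"
    printf 'range=%s proto=%s port=%s\n' "$range" "$proto" "$port"
done
```

Adding a fourth field only requires another variable name in the read command and another |-separated value in each record.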

I really liked Al P's answer but wanted uniqueness enforced cheaply, so I took it one step further and used a directory. There are some obvious limitations (directory file limits, invalid file names), but it should work for most cases.

 hinit() {
     rm -rf /tmp/hashmap.$1
     mkdir -p /tmp/hashmap.$1
 }
 hput() {
     printf '%s' "$3" > /tmp/hashmap.$1/$2
 }
 hget() {
     cat /tmp/hashmap.$1/$2
 }
 hkeys() {
     ls -1 /tmp/hashmap.$1
 }
 hdestroy() {
     rm -rf /tmp/hashmap.$1
 }
 hinit ids
 for (( i = 0; i < 10000; i++ )); do
     hput ids "key$i" "value$i"
 done
 for (( i = 0; i < 10000; i++ )); do
     printf '%s\n' $(hget ids "key$i") > /dev/null
 done
 hdestroy ids

It also performs a bit better in my tests.

 $ time bash hash.sh
 real    0m46.500s
 user    0m16.767s
 sys     0m51.473s
 $ time bash dirhash.sh
 real    0m35.875s
 user    0m8.002s
 sys     0m24.666s

Just thought I'd pitch in. Cheers!

Edit: added hdestroy()

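I agree with @lhunath and others that associative arrays are the way to go with Bash 4. If you are stuck with Bash 3 (OS X, old distros that you cannot update), you can also use expr, which should be available everywhere, together with a string and regular expressions. I like this approach especially when the dictionary is not too big.

Choose 2 separators that you will not use in keys or values (e.g. ',' and ':')
Write your map as a string (note the separator ',' also at the beginning and end)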
```
 animals=",moo:cow,woof:dog," 
```

Use a regex to extract the values

 get_animal() {
     echo "$(expr "$animals" : ".*,$1:\([^,]*\),.*")"
 }

Split the string to list the items

 get_animal_items() {
     arr=$(echo "${animals:1:${#animals}-2}" | tr "," "\n")
     for i in $arr
     do
         value="${i##*:}"
         key="${i%%:*}"
         echo "${value} likes to $key"
     done
 }

Now you can use it:

 $ animal=$(get_animal "moo")
 $ echo "$animal"
 cow
 $ get_animal_items
 cow likes to moo
 dog likes to woof
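
The steps above only read from the map. Updating it can be sketched along the same lines; put_animal below is a hypothetical helper, not part of the original answer. It assumes keys and values contain neither separator nor regex metacharacters, and that values are never empty (expr reports an empty capture as a failed match).

```bash
#!/usr/bin/env bash
# Sketch only: put_animal is a hypothetical helper, not from the original answer.
animals=",moo:cow,woof:dog,"

get_animal() {
    expr "$animals" : ".*,$1:\([^,]*\),.*"
}

put_animal() {
    local old
    # expr exits non-zero when nothing (or an empty value) is captured.
    if old=$(get_animal "$1"); then
        # Drop the existing ",key:value," entry, keeping one separator.
        animals=${animals/",$1:$old,"/,}
    fi
    animals="${animals}$1:$2,"
}

put_animal quack duck   # new key
put_animal moo calf     # existing key, value replaced
echo "$animals"
get_animal moo
```

With the starting map shown, the string ends up as ",woof:dog,quack:duck,moo:calf," and the final lookup prints calf.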

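Just use the file system

The file system is a tree structure that can be used as a hash map. Your hash table is a temporary directory, your keys are file names, and your values are file contents. The advantage is that it can handle huge hash maps and does not require a specific shell.

Create a hash table

hashtable=$(mktemp -d)

Add an element

echo "$value" > "$hashtable/$key"

Read an element

value=$(< "$hashtable/$key")

Performance

Of course it is slow, but not that slow. I tested it on my machine, with an SSD and btrfs, and it does around 3000 element reads/writes per second.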
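
Because keys become file names, a key that is empty or contains a slash cannot be stored. The following sketch wraps the three operations in small functions that reject such keys and adds an existence test and a delete. All of the ht_ function names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: the ht_* helper names are hypothetical.
hashtable=$(mktemp -d)

# Reject keys that cannot be used as a single file name.
ht_valid_key() {
    case $1 in
        ''|.|..|*/*) return 1 ;;
        *) return 0 ;;
    esac
}
ht_put() { ht_valid_key "$1" && printf '%s' "$2" > "$hashtable/$1"; }
ht_has() { ht_valid_key "$1" && [ -f "$hashtable/$1" ]; }
ht_get() { ht_has "$1" && cat "$hashtable/$1"; }
ht_del() { ht_valid_key "$1" && rm -f "$hashtable/$1"; }

ht_put color blue
ht_has color && echo "color = $(ht_get color)"
ht_put 'bad/key' oops || echo "rejected bad/key"
ht_del color
ht_has color || echo "color removed"

# Remove the whole table when done.
rm -rf "$hashtable"
```

The script prints "color = blue", "rejected bad/key" and "color removed", in that order.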

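A couple of things: you can use memory instead of /tmp by using /dev/shm, which is available on any 2.6 kernel (Red Hat); other distros may vary. Also, hget can be reimplemented using read, as follows:

 function hget {
     while read key idx
     do
         if [ "$key" = "$2" ]
         then
             echo "$idx"
             return
         fi
     done < /dev/shm/hashmap.$1
 }

In addition, assuming that all keys are unique, the return short-circuits the read loop and avoids having to read through all the entries. If your implementation can have duplicate keys, simply leave out the return. This saves the expense of reading and of forking both grep and awk. Using /dev/shm for both implementations, timing hget on a 3-entry hash while searching for the last entry gave the following:

Grep/Awk:

 hget() {
     grep "^$2 " /dev/shm/hashmap.$1 | awk '{ print $2 };'
 }
 $ time echo $(hget FD oracle)
 3
 real    0m0.011s
 user    0m0.002s
 sys     0m0.013s

Read/echo:

 $ time echo $(hget FD oracle)
 3
 real    0m0.004s
 user    0m0.000s
 sys     0m0.004s

Across multiple invocations I never saw less than a 50% improvement. Since /dev/shm is used in both cases, this can all be attributed to fork overhead.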

Bash 3 solution:

While reading some of the answers, I put together a quick little function that I would like to contribute back, in case it helps others.

 # Define a hash like this
 MYHASH=("firstName:Milan"
         "lastName:Adamovsky")
 # Function to get value by key
 getHashKey() {
     declare -a hash=("${!1}")
     local key KEY VALUE
     local lookup=$2
     for key in "${hash[@]}" ; do
         KEY=${key%%:*}
         VALUE=${key#*:}
         if [[ $KEY == "$lookup" ]]
         then
             echo "$VALUE"
         fi
     done
 }
 # Function to get a list of all keys
 getHashKeys() {
     declare -a hash=("${!1}")
     local KEY key keys
     for key in "${hash[@]}" ; do
         KEY=${key%%:*}
         keys+="${KEY} "
     done
     echo $keys
 }
 # Here we want to get the value of 'lastName'
 echo $(getHashKey MYHASH[@] "lastName")
 # Here we want to get all keys
 echo $(getHashKeys MYHASH[@])

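Prior to bash 4 there is no good way to use associative arrays in bash. Your best bet is to use an interpreted language that actually supports such things, like awk. On the other hand, bash 4 does support them.

As for the less good ways in bash 3, here is a reference that might help: http://mywiki.wooledge.org/BashFAQ/006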

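A coworker just mentioned this thread. I have independently implemented hash tables in bash, and my implementation does not depend on version 4. From a blog post of mine from March 2010 (before some of the answers here), entitled Hash tables in bash:

 # Here's the hashing function
 ht() {
     local ht=`echo "$*" |cksum`
     echo "${ht//[!0-9]}"
 }
 # Example:
 myhash[`ht foo bar`]="a value"
 myhash[`ht baz baf`]="b value"
 echo ${myhash[`ht baz baf`]} # "b value"
 echo ${myhash[@]} # "a value b value" though perhaps reversed

Sure, it makes an external call to cksum and is therefore a bit slower, but the implementation is very clean and usable. It is not bidirectional, and the built-in way is a lot better, but neither should really be used anyway. Bash is for quick one-offs, and such things should very rarely involve complexity that might require hashes, except perhaps in your .bashrc and friends.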

To get a little more performance, remember that grep can stop as soon as it finds the nth match; in this case n would be 1.

grep --max-count=1 … or grep -m 1 …
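
Applied to a file-backed lookup, the option might be used as in the sketch below. The hashfile variable and its sample contents are invented for the example, and the single-argument hget here is a simplified stand-in for the versions shown earlier in the thread.

```bash
#!/usr/bin/env bash
# Sketch only: hashfile and its contents are invented sample data.
hashfile=$(mktemp)
printf '%s\n' 'France Paris' 'Netherlands Amsterdam' 'Spain Madrid' > "$hashfile"

hget() {
    # -m 1 makes grep stop reading after the first matching line;
    # cut then prints everything after the first space.
    grep -m 1 "^$1 " "$hashfile" | cut -d ' ' -f 2-
}

hget Netherlands
```

The saving grows with the size of the file and with how early the key appears; for a key on the last line, grep still reads everything. Delete the temporary file with rm -f "$hashfile" when finished.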

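I also used the bash 4 way, but I found an annoying bug.

I needed to update the associative array content dynamically, so I did it this way:

 for instanceId in $instanceList
 do
     aws cloudwatch describe-alarms --output json --alarm-name-prefix $instanceId| jq '.["MetricAlarms"][].StateValue'| xargs | grep -E 'ALARM|INSUFFICIENT_DATA'
     [ $? -eq 0 ] && statusCheck+=([$instanceId]="checkKO") || statusCheck+=([$instanceId]="allCheckOk")
 done

I found that with bash 4.3.11, appending to an existing key in the dictionary appended to the value if the key was already present. So, for example, after a few repetitions the value was "checkKOcheckKOallCheckOK", which was not good.

There is no such problem with bash 4.3.39, where appending an existing key replaces the current value if the key is already present.

I solved this simply by clearing and re-declaring the statusCheck associative array before the loop: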

 unset statusCheck; declare -A statusCheck
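
Another option, offered here as a sketch rather than as the poster's fix, is to avoid += altogether: a plain element assignment replaces whatever value the key already had. The instance ID and the sequence of results below are made up.

```bash
#!/usr/bin/env bash
# Sketch only: requires bash 4.0+; the instance ID and results are made up.
declare -A statusCheck

# Simulate three passes over the same instance with different results.
for result in checkKO checkKO allCheckOk; do
    # name[key]=value replaces the element; no += is involved.
    statusCheck["i-0example"]=$result
done

echo "${statusCheck[i-0example]}"
```

This prints allCheckOk, the last value assigned, and the array holds a single entry.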

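I created HashMaps in bash 3 using dynamic variables. I explained how that works in my answer to: Associative arrays in Shell scripts

You can also take a look at shell_map, which is a HashMap implementation made in bash 3.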

