Introduction to Bash Arrays

codingman included in Linux

2016-03-29 3352 words 7 minutes

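1. Introduction

Bash arrays are powerful data structures that are especially useful when we work with collections of files or strings. In this tutorial, we'll explore how to use them.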

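2. Types of Arrays

Bash has two types of arrays:

Indexed arrays: values are accessed through integer indices
Associative arrays: values are accessed through keys (these are also known as maps)

Our examples mostly use the first type, but we'll occasionally discuss maps as well. One notable aspect is that Bash arrays have no maximum size limit. There is also no requirement that elements be assigned contiguously.

We'll explain that last point shortly. For now, let's see how to define arrays.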

2.1. Indexed Arrays

We can declare an indexed array in several ways.

Let's start with the *declare* keyword and its *-a* option:

declare -a indexed_array

We can also initialize the array with some string values:

declare -a indexed_array=("Blogdemo" "is" "cool")

Since we provide initial values, we can skip the declare keyword and its flag:

indexed_array=("Blogdemo" "is" "cool")

We can also use explicit indices in the initialization list:

indexed_array=([0]="Blogdemo" [1]="is" [2]="cool")

Or we can assign the elements one by one:

indexed_array[0]="Blogdemo"
indexed_array[1]="is"
indexed_array[2]="cool"

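All of these are valid array definitions in Bash. Note, however, that indices start at zero. Remember what we said about there being no requirement for contiguous assignment? Let's see what that means: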

indexed_array[0]="Blogdemo"
indexed_array[2]="cool"

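This is also a valid array definition. Unlike the other elements, the element at index 1 is never assigned, but this does no harm: the array simply holds two elements, at indices 0 and 2.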
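To make the effect of the gap visible, here is a minimal sketch. The array name sparse_array and the two helper variables are made up for this illustration. It prints the number of assigned elements and the indices that actually hold a value:

```bash
# Hypothetical sparse array: index 1 is intentionally left unassigned.
sparse_array=()
sparse_array[0]="Blogdemo"
sparse_array[2]="cool"

element_count=${#sparse_array[@]}        # number of assigned elements
assigned_indices="${!sparse_array[*]}"   # indices that actually hold a value

echo "Count: $element_count"
echo "Indices: $assigned_indices"
echo "Index 1 expands to: '${sparse_array[1]}'"
```

The count is 2 rather than 3, and the unassigned index simply expands to an empty string.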

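2.2. Associative Arrays

Now, let's see how to declare associative arrays.

Unlike indexed arrays, **we must declare associative arrays explicitly with the *-A* option**: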

declare -A associative_array=(["one"]="Blogdemo" ["two"]="is" ["three"]="cool")

Similarly, we can assign more values by specifying a key and its value:

associative_array["four"]="yeah"

Our map now contains four elements.
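As a quick check, the sketch below counts the entries and tests whether a key exists. It assumes Bash 4.0 or later, which is needed for associative arrays. The variables map_size, has_four and has_five are names chosen only for this example:

```bash
declare -A associative_array=(["one"]="Blogdemo" ["two"]="is" ["three"]="cool")
associative_array["four"]="yeah"

map_size=${#associative_array[@]}

# ${array[key]+x} expands to "x" only when the key is set,
# so it distinguishes a missing key from an empty value.
if [[ -n "${associative_array[four]+x}" ]]; then has_four="yes"; else has_four="no"; fi
if [[ -n "${associative_array[five]+x}" ]]; then has_five="yes"; else has_five="no"; fi

echo "Size: $map_size, four: $has_four, five: $has_five"
```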

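3. Basic Operations

Next, we'll look at some basic operations on arrays.

3.1. Printing Array Elements

Printing the elements of an array is one of the most intuitive and basic operations. Let's see what it looks like: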

declare -a indexed_array=("Blogdemo" "is" "cool")
echo "Array elements : ${indexed_array[@]}"

We get the output:

Array elements : Blogdemo is cool

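Here, we use the @ symbol as the index to refer to all members of the array.

We also wrap the array variable in the ${} construct. This triggers Bash parameter expansion, which offers some interesting possibilities that we'll see later.

Alternatively, we can use the * symbol as the index to get the same output: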

echo "Array elements : ${indexed_array[*]}"

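However, there is a difference between the two options, which we'll also discuss later.

3.2. Enhanced Iteration

Enhanced iteration is similar to iterating over a list in Java: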

declare -a indexed_array=("Blogdemo" "is" "cool")
for element in ${indexed_array[@]}
    do
       echo "["$element"]"
    done

Here, we again use the parameter expansion ${indexed_array[@]} to return all elements of the array.

Then we simply loop over them:

[Blogdemo]
[is]
[cool]

Of course, this also works with the * symbol:

for element in ${indexed_array[*]}
    do
        echo "["$element"]"
    done

We can also use enhanced iteration on associative arrays:

declare -A associative_array=(["one"]="Blogdemo" ["two"]="is" ["three"]="cool")
for element in ${associative_array[@]}
    do
        echo "["$element"]"
    done

Let's look at the output:

[is]
[cool]
[Blogdemo]

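We can see that the output is not sorted. Bash does not guarantee any particular order for the elements of an associative array, so they may be printed in a different order from the initialization list.

3.3. Iterating with Indices

We can combine index-based iteration with the enhanced loop construct: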

declare -a indexed_array=("Blogdemo" "is" "cool")
for index in ${!indexed_array[@]}
    do
        echo "["$index"]:["${indexed_array[$index]}"]"
    done

This time, we use ${!indexed_array[@]} to return all indices of the array:

[0]:[Blogdemo]
[1]:[is]
[2]:[cool]

Let's see what happens with an associative array:

declare -A associative_array=(["one"]="Blogdemo" ["two"]="is" ["three"]="cool")
for key in ${!associative_array[@]}
    do
        echo "["$key"]:["${associative_array[$key]}"]"
    done

Our index values are now actually the keys of the map.

Again, the output has no particular order:

[two]:[is]
[three]:[cool]
[one]:[Blogdemo]
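When a stable order matters, one possible approach is to sort the keys first and then look up each value. The sketch below assumes Bash 4.0 or later (for mapfile and associative arrays), an external sort command, and keys that contain no newlines. The sorted_keys array is a name introduced only for this illustration:

```bash
declare -A associative_array=(["one"]="Blogdemo" ["two"]="is" ["three"]="cool")

# Sort the keys with the external sort command, one key per line.
# This assumes that no key contains a newline.
mapfile -t sorted_keys < <(printf '%s\n' "${!associative_array[@]}" | sort)

for key in "${sorted_keys[@]}"
do
    echo "[$key]:[${associative_array[$key]}]"
done
```

The keys now come out alphabetically (one, three, two), regardless of the internal order of the map.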

Of course, we can also write a classic incrementing loop:

for ((index=0; index < ${#indexed_array[@]} ; index++))
    do
        echo "["$index"]:["${indexed_array[$index]}"]"
    done

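Unlike the previous approach, we now use ${#indexed_array[@]} to retrieve the number of elements in the array.

We saw earlier that associative arrays return keys rather than indices. Therefore, this kind of incrementing loop does not work for maps. It also assumes that the indices run contiguously from 0, so it can miss elements of an indexed array that has gaps.

3.4. Inserting Elements into an Array

At the beginning, we said that Bash arrays have no size or contiguity constraints. Therefore, we can use a simple assignment to insert an element at any index: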

declare -a indexed_array=("Blogdemo" "is" "cool")
indexed_array[2]="lorem"
indexed_array[5]="ipsum"
for index in ${!indexed_array[@]}
    do
        echo "["$index"]:["${indexed_array[$index]}"]"
    done

This is not a very intuitive way to add values, and it can produce confusing results:

[0]:[Blogdemo]
[1]:[is]
[2]:[lorem]
[5]:[ipsum]

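In this case, the indices are not contiguous: we simply skipped positions 3 and 4. The assignment to index 2 also replaced the existing value "cool".

Direct assignment therefore only really makes sense when we want to replace an existing element. To append a value, there is a cleaner approach. Starting again from the original three-element array:

declare -a indexed_array=("Blogdemo" "is" "cool")
indexed_array+=("lorem")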
for index in ${!indexed_array[@]}
    do
        echo "["$index"]:["${indexed_array[$index]}"]"
    done

Let's see what happened here:

[0]:[Blogdemo]
[1]:[is]
[2]:[cool]
[3]:[lorem]

We used the += construct to append a new value to our array. It can also be used to initialize an array.
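The following sketch illustrates both uses of += and one common pitfall. The name fresh_array is hypothetical and is explicitly unset first so that the example starts from an undefined variable:

```bash
unset fresh_array                  # hypothetical name; make sure it starts undefined
fresh_array+=("Blogdemo")          # += on an unset name creates the array
fresh_array+=("is" "cool")         # several elements can be appended at once

echo "Count: ${#fresh_array[@]}"
echo "Indices: ${!fresh_array[*]}"

# Pitfall: without parentheses, += appends text to element 0 instead.
fresh_array+="!"
echo "First element: ${fresh_array[0]}"
```

The parentheses are what make += an array append. Without them, Bash treats the operation as string concatenation on the first element.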

3.5. Removing Elements from an Array

When we want to remove items from an array, we use the unset builtin:

declare -a indexed_array=("Blogdemo" "is" "cool")
echo "Array elements : ${indexed_array[@]}"
unset 'indexed_array[1]'   # quoted so the brackets are not treated as a glob pattern
echo "Size of array after removal: ${#indexed_array[@]}"
echo "Array elements after removal: ${indexed_array[@]}"

In this example, we remove the element at index 1:

Array elements : Blogdemo is cool
Size of array after removal: 2
Array elements after removal: Blogdemo cool
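Note that unset does not shift the remaining elements, so the array keeps a gap at the removed index. A minimal sketch of this, together with one way to re-pack the indices, could look as follows. The variables indices_before and indices_after are names used only for this illustration:

```bash
declare -a indexed_array=("Blogdemo" "is" "cool")
unset 'indexed_array[1]'            # quoted so the shell cannot glob-expand it

indices_before="${!indexed_array[*]}"

# Re-assigning the quoted expansion packs the elements into indices 0..n-1.
indexed_array=("${indexed_array[@]}")
indices_after="${!indexed_array[*]}"

echo "Indices after unset: $indices_before"
echo "Indices after re-indexing: $indices_after"
```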

Let's see what happens if we use unset without providing an index:

declare -a indexed_array=("Blogdemo" "is" "cool")
echo "Array elements : ${indexed_array[*]}"
unset indexed_array
echo "Size of array after removal: ${#indexed_array[@]}"
echo "Removed complete array : ${indexed_array[@]}"

We get the output:

Array elements : Blogdemo is cool
Size of array after removal: 0
Removed complete array :

This means our array has been removed completely. The same thing happens if we use the @ or * symbol as the index:

declare -a indexed_array=("Blogdemo" "is" "cool")
echo "Array elements : ${indexed_array[*]}"
unset 'indexed_array[@]'
echo "Size of array:" ${#indexed_array[@]}
echo "Removed complete array : ${indexed_array[@]}"

Now let's try adding an element to the array we have just removed:

declare -a indexed_array=("Blogdemo" "is" "cool")
echo "Array elements : ${indexed_array[@]}"
unset 'indexed_array[@]'
indexed_array+=("lorem ipsum")
echo "Array elements : ${indexed_array[@]}"

This works because we are re-initializing the array:

Array elements : Blogdemo is cool
Array elements : lorem ipsum

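4. Advanced Operations

Now, let's look at some more advanced scenarios involving arrays.

4.1. Using Quotes in Iteration

Remember our earlier printing and iteration examples? We said there is a difference between @ and *. Let's see what it is about: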

declare -a indexed_array=("Blogdemo" "is" "cool")
for element in "${indexed_array[*]}"
    do
        echo "["$element"]"
    done

The output is now:

[Blogdemo is cool]

This is not what we expected. Now let's try again with @ as the index:

for element in "${indexed_array[@]}"
    do
        echo "["$element"]"
    done

We get the output:

[Blogdemo]
[is]
[cool]

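Let's explain what happened. We put double quotes around the ${indexed_array[*]} construct.

Inside double quotes, the * symbol expands the whole array to a single word, with the elements separated by a space (more precisely, by the first character of IFS). The @ symbol, on the other hand, expands each element to a separate word.

Quoting the @ form is especially useful when our array elements contain space characters: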

declare -a indexed_array=("Blogdemo is" "so much" "cool")
echo "Without quotes:"
for element in ${indexed_array[@]}
    do
        echo "["$element"]"
    done
echo "With quotes:"
for element in "${indexed_array[@]}"
    do
        echo "["$element"]"
    done

And the corresponding output:

Without quotes:
[Blogdemo]
[is]
[so]
[much]
[cool]
With quotes:
[Blogdemo is]
[so much]
[cool]

Note that when we use the unquoted construct, the array elements are subject to further word splitting.
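The single-word behavior of the quoted * form can be put to good use for joining elements. The sketch below is one way to do it, assuming we want a comma as the separator. The count_words function is a hypothetical helper that only reports how many arguments it received:

```bash
declare -a indexed_array=("Blogdemo is" "so much" "cool")

# "${array[*]}" joins elements with the first character of IFS.
# The subshell keeps the IFS change local to this one expansion.
joined=$(IFS=,; echo "${indexed_array[*]}")
echo "Joined: $joined"

# "${array[@]}" keeps each element as a separate word.
count_words() { echo "$#"; }   # hypothetical helper: prints its argument count
echo "Quoted @ passes $(count_words "${indexed_array[@]}") arguments"
echo "Quoted * passes $(count_words "${indexed_array[*]}") argument"
```

Changing IFS inside the command substitution avoids having to save and restore it in the calling shell.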

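4.2. Transforming Arrays

Usually, when we think about transforming the elements of an array, iteration is the first thing that comes to mind.

Fortunately, Bash provides some neat parameter expansion tricks that save us from iterating.

We'll only look at a few interesting ones, but we can always consult the Bash man page for more.

Let's start by searching for and replacing a specific string in the array elements: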

declare -a indexed_array=("Blogdemo is" "so much" "cool" "cool cool")
echo "Initial array : ${indexed_array[@]}"
echo "Replacing cool with better: ${indexed_array[@]/cool/better}"

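Here we use the ${indexed_array[@]/cool/better} syntax to perform the replacement:

Initial array : Blogdemo is so much cool cool cool
Replacing cool with better: Blogdemo is so much better better cool

However, this only replaces the first occurrence of the string in each element. If we want to replace all occurrences, we need to use the // syntax: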

declare -a indexed_array=("Blogdemo is" "so much" "cool" "cool cool")
echo "Replacing cool with better: ${indexed_array[@]//cool/better}"

We get the desired result:

Replacing cool with better: Blogdemo is so much better better better

Let's see what happens if we don't specify a replacement string:

echo "Replacing cool with nothing: ${indexed_array[@]//cool}"

This removes the matching string from our array elements:

Replacing cool with nothing: Blogdemo is so much
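Because echo joins everything with spaces, it is hard to tell from the outputs above that the replacement works element by element. The sketch below prints each element in brackets with printf to make the boundaries visible. The variables first_only and all_matches are names chosen only for this example:

```bash
declare -a indexed_array=("Blogdemo is" "so much" "cool" "cool cool")

# printf repeats its format for every argument, so each element gets its own brackets.
first_only=$(printf '[%s]' "${indexed_array[@]/cool/better}")
all_matches=$(printf '[%s]' "${indexed_array[@]//cool/better}")

echo "First match only: $first_only"
echo "All matches:      $all_matches"
```

The last element shows the difference: with a single slash it becomes "better cool", and with a double slash it becomes "better better".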

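We need to keep in mind that **this search is case-sensitive by default**, unless we change the shell's optional behavior (for example, by enabling the nocasematch option with shopt).

Now, let's try converting to uppercase: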

declare -a indexed_array=("Blogdemo" "is" "cool")
echo "Uppercasing sentence case: ${indexed_array[@]^}"
echo "Uppercasing all characters: ${indexed_array[@]^^}"

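Here, **we use ^ and ^^ to convert to uppercase**. The ^ form changes only the first character of each element, while ^^ changes every character: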

Uppercasing sentence case: Blogdemo Is Cool
Uppercasing all characters: BLOGDEMO IS COOL

Conversely, we can use the , and ,, constructs to convert to lowercase:

indexed_array=("Blogdemo" "Is" "COol")
echo "Lowercasing sentence case: ${indexed_array[@],}"
echo "Lowercasing all characters: ${indexed_array[@],,}"

We get the desired result:

Lowercasing sentence case: blogdemo is cOol
Lowercasing all characters: blogdemo is cool

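4.3. Assignment Between Arrays

In the previous transformation examples, our initial array was never modified. Each time we applied a transformation, the result was only printed to standard output. If we want to keep the results, we can assign them to a separate array: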

indexed_array=("BLogdemo" "Is" "COol")
lowercased_array=(${indexed_array[@],})
echo "Lowercasing sentence case: ${lowercased_array[@]}"

And the corresponding output:

Lowercasing sentence case: bLogdemo is cOol

Let's look at another example that combines a transformation with an assignment:

indexed_array=("Blogdemo is" "so much" "cool")
echo "Uppercasing sentence case1: ${indexed_array[@]^}"
echo "No of elements in first_array: ${#indexed_array[@]}"
second_array=(${indexed_array[@]})
echo "Uppercasing sentence case2: ${second_array[@]^}"
echo "No of elements in second_array: ${#second_array[@]}"

We would expect the two outputs to match. However, that's not the case:

Uppercasing sentence case1: Blogdemo is So much Cool
No of elements in first_array: 3
Uppercasing sentence case2: Blogdemo Is So Much Cool
No of elements in second_array: 5

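So what happened? Remember the quoting in the earlier iteration example? Because we didn't use quotes, Bash applied word splitting, and second_array ends up with more elements than the first array. Quoting the expansion, as in second_array=("${indexed_array[@]}"), would have preserved the three elements.

Assignment is also useful for merging arrays. Here we quote both expansions so that each element stays intact:

declare -a first_array=("Blogdemo" "is" "cool")
declare -a second_array=("lorem" "ipsum")
declare -a merged=("${first_array[@]}" "${second_array[@]}")
echo "First array : ${first_array[@]}"
echo "Second array : ${second_array[@]}"
echo "Merged array : ${merged[@]}"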

Let's take a closer look at what we did. We initialized the merged array with all the elements of the first and second arrays:

First array : Blogdemo is cool
Second array : lorem ipsum
Merged array : Blogdemo is cool lorem ipsum
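To tie the copy and merge cases together, here is a short sketch with elements that contain spaces. The arrays copy and merged and their contents are chosen only for this illustration. It shows that quoted expansions keep the element boundaries intact:

```bash
declare -a first_array=("Blogdemo is" "so much" "cool")
declare -a second_array=("lorem ipsum")

copy=("${first_array[@]}")                          # quoted: element boundaries kept
merged=("${first_array[@]}" "${second_array[@]}")   # quoted merge of both arrays

echo "Elements in copy: ${#copy[@]}"
echo "Elements in merged: ${#merged[@]}"
echo "Last merged element: ${merged[3]}"
```

Without the double quotes, the copy would contain five elements and the merged array seven, because every space would start a new element.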

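4.4. Traversing with Offset and Length

In Bash, this is formally called substring expansion, but it also works on indexed arrays. For maps, however, its behavior is undefined. Sometimes, we need to extract a specific part of an array: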

declare -a indexed_array=("Blogdemo" "is" "cool" "and" "better" "than" "before")
echo "Offset 1 length 3: ${indexed_array[@]:1:3}"

Let's see what this does:

Offset 1 length 3: is cool and

This construct takes an offset and a length. It then selects length elements, starting at the index equal to offset. Now let's omit the length:

echo "Offset 1 no length: ${indexed_array[@]:1}"

Then we get all the elements of the array from the offset to the end:

Offset 1 no length: is cool and better than before

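Now let's see whether a negative offset changes anything:

echo "Offset -4 length 3: ${indexed_array[@]: -4:3}"

We get an interesting result:

Offset -4 length 3: and better than

A negative offset is taken relative to one greater than the maximum index of the array, so it counts back from the end. With seven elements, an offset of -4 starts at index 3.

Also note the space character before the negative offset. If we omit it, Bash confuses the expression with another construct, the ${parameter:-default} expansion.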
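The effect of the missing space is easy to overlook, so here is a small sketch that compares the two forms. The variables last_two, no_space and parenthesized are names used only for this illustration:

```bash
declare -a indexed_array=("Blogdemo" "is" "cool" "and" "better" "than" "before")

last_two="${indexed_array[*]: -2}"        # space present: offset -2, counted from the end
no_space="${indexed_array[*]:-2}"         # no space: parsed as ${parameter:-default}
parenthesized="${indexed_array[*]:(-2)}"  # parentheses also avoid the ambiguity

echo "With space:    $last_two"
echo "Without space: $no_space"
echo "Parenthesized: $parenthesized"
```

Because the array is set and not empty, the form without a space never uses its default value of 2 and simply expands to all seven elements.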