An Introduction to XML in Groovy

2019-04-16

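1. Introduction

Groovy provides a substantial number of methods dedicated to traversing and manipulating XML content.

In this tutorial, we'll demonstrate how to add, edit, or delete elements from XML in Groovy using various approaches. We'll also show how to create an XML structure from scratch.

2. Defining the Model

Let's define an XML structure in our resources directory that we'll use throughout our examples: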

<articles>
    <article>
        <title>First steps in Java</title>
        <author id="1">
            <firstname>Siena</firstname>
            <lastname>Kerr</lastname>
        </author>
        <release-date>2018-12-01</release-date>
    </article>
    <article>
        <title>Dockerize your SpringBoot application</title>
        <author id="2">
            <firstname>Jonas</firstname>
            <lastname>Lugo</lastname>
        </author>
        <release-date>2018-12-01</release-date>
    </article>
    <article>
        <title>SpringBoot tutorial</title>
        <author id="3">
            <firstname>Daniele</firstname>
            <lastname>Ferguson</lastname>
        </author>
        <release-date>2018-06-12</release-date>
    </article>
    <article>
        <title>Java 12 insights</title>
        <author id="1">
            <firstname>Siena</firstname>
            <lastname>Kerr</lastname>
        </author>
        <release-date>2018-07-22</release-date>
    </article>
</articles>

And read it into an InputStream variable:

def xmlFile = getClass().getResourceAsStream("articles.xml")

3. XmlParser

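Let's start exploring this stream with the XmlParser class.

3.1. Reading

Reading and parsing an XML file is probably the most common XML operation a developer has to perform. XmlParser provides a very straightforward interface for exactly that: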

def articles = new XmlParser().parse(xmlFile)

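At this point, we can access the attributes and values of the XML structure using GPath expressions.

Now let's implement a simple test using Spock to check that our articles object is correct: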

def "Should read XML file properly"() {
    given: "XML file"
    when: "Using XmlParser to read file"
    def articles = new XmlParser().parse(xmlFile)
    then: "Xml is loaded properly"
    articles.'*'.size() == 4
    articles.article[0].author.firstname.text() == "Siena"
    articles.article[2].'release-date'.text() == "2018-06-12"
    articles.article[3].title.text() == "Java 12 insights"
    articles.article.find { it.author.'@id'.text() == "3" }.author.firstname.text() == "Daniele"
}

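To understand how to access XML values and how to use GPath expressions, let's focus for a moment on the internal structure of the result of the XmlParser#parse operation.

The articles object is an instance of groovy.util.Node. Every Node consists of a name, an attributes map, a value, and a parent (which can be either null or another Node).

In our case, the value of articles is a groovy.util.NodeList instance, which is a wrapper class for a collection of Nodes. NodeList extends java.util.ArrayList, which provides access to elements by index. To obtain the string value of a Node, we use groovy.util.Node#text().

In the above example, we introduced a few GPath expressions:

articles.article[0].author.firstname - gets the first name of the author of the first article; articles.article[n] directly accesses the article at index n (indices are zero-based)
'*' - gets the list of children of articles; it's the equivalent of groovy.util.Node#children()
author.'@id' - gets the id attribute of the author element; author.'@attributeName' accesses an attribute value by its name (the equivalent bracket forms are author['@id'] and author['@attributeName'])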
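The expressions listed above can be tried outside of a Spock test with a small standalone script. This is only a sketch: it parses an inline, shortened copy of the sample document with parseText instead of reading articles.xml, and it assumes Groovy 3 or later, where XmlParser lives in groovy.xml (on Groovy 2.x the import can be dropped, since groovy.util.XmlParser is imported by default).

```groovy
import groovy.xml.XmlParser // assumption: Groovy 3+; drop this import on Groovy 2.x

// Shortened inline copy of the sample document (two articles only)
def xml = '''
<articles>
    <article>
        <title>First steps in Java</title>
        <author id="1"><firstname>Siena</firstname><lastname>Kerr</lastname></author>
    </article>
    <article>
        <title>SpringBoot tutorial</title>
        <author id="3"><firstname>Daniele</firstname><lastname>Ferguson</lastname></author>
    </article>
</articles>
'''
def articles = new XmlParser().parseText(xml)

// The root is a Node; navigating by child name yields a NodeList
assert articles instanceof Node
assert articles.article instanceof NodeList

// '*' and children() both list the direct children of the root
assert articles.'*'.size() == articles.children().size()

// Index into the NodeList, then read the text of the nested element
assert articles.article[0].author.firstname.text() == 'Siena'

// Attribute access on a NodeList (needs text()) and on a single Node
assert articles.article[1].author.'@id'.text() == '3'
assert articles.article[1].author[0]['@id'] == '3'
```

Note the difference in the last two lines: author is a NodeList, so its '@id' is again a list and needs text(), whereas author[0] is a single Node whose attribute comes back as a plain String.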

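3.2. Adding a Node

Similar to the previous example, let's first read the XML content into a variable. This allows us to define a new node and add it to our articles list using groovy.util.Node#append.

Now let's implement a test that proves our point: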

def "Should add node to existing xml using NodeBuilder"() {
    given: "XML object"
    def articles = new XmlParser().parse(xmlFile)
    when: "Adding node to xml"
    def articleNode = new NodeBuilder().article(id: '5') {
        title('Traversing XML in the nutshell')
        author {
            firstname('Martin')
            lastname('Schmidt')
        }
        'release-date'('2019-05-18')
    }
    articles.append(articleNode)
    then: "Node is added to xml properly"
    articles.'*'.size() == 5
    articles.article[4].title.text() == "Traversing XML in the nutshell"
}

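As we can see in the above example, the process is pretty straightforward.

Note also that we used groovy.util.NodeBuilder, which is a neat alternative to defining a Node through the Node constructors.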
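To make the comparison between NodeBuilder and the Node constructors concrete, here is a minimal sketch that adds one article each way. The inline document, the second article's title ('Groovy XML basics') and its id ('6') are made up for illustration, and the import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlParser // assumption: Groovy 3+; drop this import on Groovy 2.x

def articles = new XmlParser().parseText(
    '<articles><article><title>First steps in Java</title></article></articles>')

// Option 1: describe the node with the NodeBuilder DSL, then append it
def built = new NodeBuilder().article(id: '5') {
    title('Traversing XML in the nutshell')
}
articles.append(built)

// Option 2: use the Node constructors directly; passing a parent as the
// first argument attaches the new node to that parent right away
def manual = new Node(articles, 'article', [id: '6'])
new Node(manual, 'title', 'Groovy XML basics') // hypothetical title

assert articles.article.size() == 3
assert articles.article[1].title.text() == 'Traversing XML in the nutshell'
assert articles.article[2].'@id' == '6'
assert articles.article[2].title.text() == 'Groovy XML basics'
```

The constructor form needs one statement per node, which is why the builder tends to read better for anything nested more than a level or two.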

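3.3. Modifying a Node

We can also modify the values of nodes using XmlParser. To do so, let's once again parse the content of the XML file. Next, we can edit a content node by changing the value field of the Node object.

Let's remember that a GPath expression on an XmlParser result always returns a NodeList instance, so to modify the first (and only) element, we have to access it by its index.

Let's check our assumption by writing a quick test: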

def "Should modify node"() {
    given: "XML object"
    def articles = new XmlParser().parse(xmlFile)
    when: "Changing value of one of the nodes"
    articles.article.each { it.'release-date'[0].value = "2019-05-18" }
    then: "XML is updated"
    articles.article.findAll { it.'release-date'.text() != "2019-05-18" }.isEmpty()
}

In the above example, we've also used the Groovy Collections API to traverse the NodeList.
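The point about always getting a NodeList back can be checked in isolation. The following sketch uses a one-article inline document (an assumption made to keep it short) and assumes Groovy 3 or later for the import.

```groovy
import groovy.xml.XmlParser // assumption: Groovy 3+; drop this import on Groovy 2.x

def articles = new XmlParser().parseText(
    '<articles><article><release-date>2018-12-01</release-date></article></articles>')

// Even though there is exactly one release-date, navigation returns a list
def dates = articles.article.'release-date'
assert dates instanceof NodeList
assert dates.size() == 1

// So we pick the Node by index before assigning its value
dates[0].value = '2019-05-18'

assert articles.article[0].'release-date'.text() == '2019-05-18'
```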

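3.4. Replacing a Node

Next, let's see how to replace a whole node instead of just modifying one of its values.

Similarly to adding a new element, we'll use NodeBuilder for the Node definition and then replace one of the existing nodes with it using groovy.util.Node#replaceNode: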

def "Should replace node"() {
    given: "XML object"
    def articles = new XmlParser().parse(xmlFile)
    when: "Replacing node"
    def articleNode = new NodeBuilder().article(id: '5') {
        title('Traversing XML in the nutshell')
        author {
            firstname('Martin')
            lastname('Schmidt')
        }
        'release-date'('2019-05-18')
    }
    articles.article[0].replaceNode(articleNode)
    then: "Node is replaced properly"
    articles.'*'.size() == 4
    articles.article[0].title.text() == "Traversing XML in the nutshell"
}

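3.5. Deleting a Node

Removing a node using XmlParser is quite tricky. Although the Node class provides the remove(Node child) method, in most cases we wouldn't use it on its own.

Instead, we'll show how to delete nodes whose value fulfills a given condition.

By default, accessing nested elements through a chain of Node and NodeList references returns a new list of the matching child nodes rather than the parent's own child list. Because of that, we can't use the removeAll method (which groovy.util.NodeList inherits from java.util.ArrayList) directly on our article collection.

To delete nodes by a predicate, we first have to find all nodes that fulfill our condition, and then iterate through them, calling groovy.util.Node#remove on the parent each time.

Let's implement a test that removes all articles whose author has an id other than 3: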

def "Should remove article from xml"() {
    given: "XML object"
    def articles = new XmlParser().parse(xmlFile)
    when: "Removing all articles but the ones with id==3"
    articles.article
      .findAll { it.author.'@id'.text() != "3" }
      .each { articles.remove(it) }
    then: "There is only one article left"
    articles.children().size() == 1
    articles.article[0].author.'@id'.text() == "3"
}

As we can see, as a result of our remove operation, we're left with an XML structure containing only one article, whose author id is 3.
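Why removing from the navigated list is not enough can be seen in a short script. This sketch uses a stripped-down inline document with two articles (an assumption for brevity) and assumes Groovy 3 or later for the import.

```groovy
import groovy.xml.XmlParser // assumption: Groovy 3+; drop this import on Groovy 2.x

def articles = new XmlParser().parseText(
    '<articles><article><author id="1"/></article><article><author id="3"/></article></articles>')

// articles.article builds a fresh NodeList; removing from it only changes that list
def selected = articles.article
selected.removeAll { it.author.'@id'.text() != '3' }
assert selected.size() == 1
assert articles.children().size() == 2 // the tree itself is untouched

// Removing through the parent Node changes the actual tree
articles.article
    .findAll { it.author.'@id'.text() != '3' }
    .each { articles.remove(it) }

assert articles.children().size() == 1
assert articles.article[0].author.'@id'.text() == '3'
```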

4. XmlSlurper

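Groovy also provides another class dedicated to working with XML. In this section, we'll show how to read and manipulate the XML structure using XmlSlurper.

4.1. Reading

As in our previous examples, let's start with parsing the XML structure from a file: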

def "Should read XML file properly"() {
    given: "XML file"
    when: "Using XmlSlurper to read file"
    def articles = new XmlSlurper().parse(xmlFile)
    then: "Xml is loaded properly"
    articles.'*'.size() == 4
    articles.article[0].author.firstname == "Siena"
    articles.article[2].'release-date' == "2018-06-12"
    articles.article[3].title == "Java 12 insights"
    articles.article.find { it.author.'@id' == "3" }.author.firstname == "Daniele"
}

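As we can see, the interface is identical to that of XmlParser. However, the output structure uses groovy.util.slurpersupport.GPathResult, which is a wrapper class for Node. GPathResult provides simplified definitions of methods such as equals() and toString() by wrapping Node#text(). As a result, we can read fields and parameters directly using just their names.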
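A brief sketch of what this means in practice: values returned by XmlSlurper are not Strings, yet they compare equal to Strings through equals(). The inline document is a shortened copy of the sample, and the import assumes Groovy 3 or later (on Groovy 2.x, XmlSlurper is in groovy.util and needs no import).

```groovy
import groovy.xml.XmlSlurper // assumption: Groovy 3+; drop this import on Groovy 2.x

def articles = new XmlSlurper().parseText('''
<articles>
    <article>
        <title>Java 12 insights</title>
        <author id="1"><firstname>Siena</firstname></author>
    </article>
</articles>
''')

def title = articles.article[0].title

// The result is a GPathResult, not a String...
assert !(title instanceof String)

// ...but equals() and toString() are backed by the node text
assert title == 'Java 12 insights'
assert title.text() == 'Java 12 insights'
assert title.toString() == 'Java 12 insights'

// Attributes behave the same way and offer conversion helpers
assert articles.article[0].author.'@id' == '1'
assert articles.article[0].author.'@id'.toInteger() == 1
```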

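4.2. Adding a Node

Adding a Node is also very similar to using XmlParser. In this case, however, groovy.util.slurpersupport.GPathResult#appendNode provides a method that takes an instance of java.lang.Object as an argument. As a result, we can simplify new Node definitions by following the same convention introduced by NodeBuilder: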

def "Should add node to existing xml"() {
    given: "XML object"
    def articles = new XmlSlurper().parse(xmlFile)
    when: "Adding node to xml"
    articles.appendNode {
        article(id: '5') {
            title('Traversing XML in the nutshell')
            author {
                firstname('Martin')
                lastname('Schmidt')
            }
            'release-date'('2019-05-18')
        }
    }
    articles = new XmlSlurper().parseText(XmlUtil.serialize(articles))
    then: "Node is added to xml properly"
    articles.'*'.size() == 5
    articles.article[4].title == "Traversing XML in the nutshell"
}

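If we need to modify the structure of our XML with XmlSlurper, we have to reinitialize our articles object to see the results. We can achieve that using a combination of the groovy.util.XmlSlurper#parseText and groovy.xml.XmlUtil#serialize methods.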
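The need for the serialize-and-reparse round trip can be observed directly. In this sketch (inline one-article document, Groovy 3 or later assumed for the XmlSlurper import), the appended article is not visible through GPath until the document has been serialized and parsed again.

```groovy
import groovy.xml.XmlSlurper // assumption: Groovy 3+; drop this import on Groovy 2.x
import groovy.xml.XmlUtil

def articles = new XmlSlurper().parseText(
    '<articles><article><title>First steps in Java</title></article></articles>')

articles.appendNode {
    article {
        title('Traversing XML in the nutshell')
    }
}

// The change is recorded but not yet visible when navigating the original object
assert articles.article.size() == 1

// Serializing applies the pending change...
def serialized = XmlUtil.serialize(articles)
assert serialized.contains('Traversing XML in the nutshell')

// ...and parsing the serialized text gives a structure that reflects it
def reparsed = new XmlSlurper().parseText(serialized)
assert reparsed.article.size() == 2
assert reparsed.article[1].title == 'Traversing XML in the nutshell'
```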

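4.3. Modifying a Node

As we mentioned before, GPathResult introduces a simplified approach to data manipulation. That being said, in contrast to XmlParser, we can modify values directly using the node name or parameter name: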

def "Should modify node"() {
    given: "XML object"
    def articles = new XmlSlurper().parse(xmlFile)
    when: "Changing value of one of the nodes"
    articles.article.each { it.'release-date' = "2019-05-18" }
    then: "XML is updated"
    articles.article.findAll { it.'release-date' != "2019-05-18" }.isEmpty()
}

Notice that when we only modify the values of the XML object, we don't have to parse the whole structure again.
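As a small illustration of value-only changes, the sketch below updates both an element's text and an attribute and reads them back from the same object, with no re-parse. The inline document and the new id value '7' are made up for the example; the import assumes Groovy 3 or later.

```groovy
import groovy.xml.XmlSlurper // assumption: Groovy 3+; drop this import on Groovy 2.x

def articles = new XmlSlurper().parseText(
    '<articles><article><author id="1"/><release-date>2018-12-01</release-date></article></articles>')

def article = articles.article[0]

// Assign element text by node name and an attribute by '@name'
article.'release-date' = '2019-05-18'
article.author.'@id' = '7' // hypothetical new id

// Both changes are visible immediately on the same object
assert article.'release-date' == '2019-05-18'
assert article.author.'@id' == '7'
```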

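4.4. Replacing a Node

Now let's move on to replacing a whole node. Again, GPathResult comes to the rescue. We can easily replace a node using groovy.util.slurpersupport.NodeChild#replaceNode; NodeChild extends GPathResult and follows the same convention of taking an Object value as an argument: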

def "Should replace node"() {
    given: "XML object"
    def articles = new XmlSlurper().parse(xmlFile)
    when: "Replacing node"
    articles.article[0].replaceNode {
        article(id: '5') {
            title('Traversing XML in the nutshell')
            author {
                firstname('Martin')
                lastname('Schmidt')
            }
            'release-date'('2019-05-18')
        }
    }
    articles = new XmlSlurper().parseText(XmlUtil.serialize(articles))
    then: "Node is replaced properly"
    articles.'*'.size() == 4
    articles.article[0].title == "Traversing XML in the nutshell"
}

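As was the case when adding a node, we're modifying the structure of the XML, so we have to parse it again.

4.5. Deleting a Node

To remove a node using XmlSlurper, we can reuse the groovy.util.slurpersupport.NodeChild#replaceNode method simply by providing an empty Node definition: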

def "Should remove article from xml"() {
    given: "XML object"
    def articles = new XmlSlurper().parse(xmlFile)
    when: "Removing all articles but the ones with id==3"
    articles.article
      .findAll { it.author.'@id' != "3" }
      .replaceNode {}
    articles = new XmlSlurper().parseText(XmlUtil.serialize(articles))
    then: "There is only one article left"
    articles.children().size() == 1
    articles.article[0].author.'@id' == "3"
}

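Again, modifying the XML structure requires reinitialization of our articles object.

5. XmlParser vs XmlSlurper

As we've shown in our examples, the usages of XmlParser and XmlSlurper are pretty similar. We can achieve more or less the same results with both. However, some differences between them can tilt the scales towards one or the other.

First of all, XmlParser always parses the whole document into a DOM-like structure. Because of that, we can read from and write to it at the same time. We can't do the same with XmlSlurper, as it evaluates paths more lazily. As a result, XmlParser can consume more memory.

On the other hand, XmlSlurper uses more straightforward definitions, making it simpler to work with. We also need to remember that any structural changes made to XML using XmlSlurper require reinitialization, which can have an unacceptable performance hit when many changes are made one after another.

The decision of which tool to use should be made with care and depends entirely on the use case.

6. MarkupBuilder

Apart from reading and manipulating the XML tree, Groovy also provides tooling for creating an XML document from scratch. Let's now use groovy.xml.MarkupBuilder to create a document consisting of the first two articles from our first example: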

def "Should create XML properly"() {
    given: "Node structures"
    when: "Using MarkupBuilder to create xml structure"
    def writer = new StringWriter()
    new MarkupBuilder(writer).articles {
        article {
            title('First steps in Java')
            author(id: '1') {
                firstname('Siena')
                lastname('Kerr')
            }
            'release-date'('2018-12-01')
        }
        article {
            title('Dockerize your SpringBoot application')
            author(id: '2') {
                firstname('Jonas')
                lastname('Lugo')
            }
            'release-date'('2018-12-01')
        }
    }
    then: "Xml is created properly"
    XmlUtil.serialize(writer.toString()) == XmlUtil.serialize(xmlFile.text)
}

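In the above example, we can see that MarkupBuilder uses the very same approach for Node definitions that we used with NodeBuilder and GPathResult previously.

To compare the output from MarkupBuilder with the expected XML structure, we used the groovy.xml.XmlUtil#serialize method. For the comparison to hold, xmlFile here has to point to a resource that contains only those two articles.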
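When no reference file is at hand, one alternative way to check MarkupBuilder output is to parse it back and assert on the resulting tree. The following is a sketch of that idea with a single article, not the approach used in the test above; the XmlParser import assumes Groovy 3 or later.

```groovy
import groovy.xml.MarkupBuilder
import groovy.xml.XmlParser // assumption: Groovy 3+; drop this import on Groovy 2.x

def writer = new StringWriter()
new MarkupBuilder(writer).articles {
    article {
        title('First steps in Java')
        author(id: '1') {
            firstname('Siena')
            lastname('Kerr')
        }
        // Element names that aren't valid identifiers are written as quoted method names
        'release-date'('2018-12-01')
    }
}

// Parse the generated markup back into a Node tree and verify its content
def parsed = new XmlParser().parseText(writer.toString())

assert parsed.name() == 'articles'
assert parsed.article.size() == 1
assert parsed.article[0].author[0].'@id' == '1'
assert parsed.article[0].'release-date'.text() == '2018-12-01'
```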