Grep and sed equivalent for XML command-line processing

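When doing shell scripting, the data is typically in files of single-line records, such as CSV. Handling that data with grep and sed is really simple. But I often have to deal with XML, so I would really like a way to script access to XML data from the command line. What are the best tools?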

I have found xmlstarlet to be very good at this sort of thing.

http://xmlstar.sourceforge.net/

It should also be available in most distribution repositories. An introductory tutorial is here:

http://www.ibm.com/developerworks/library/x-starlet.html
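
As a rough sketch of what grep-like and sed-like use can look like with xmlstarlet: the file books.xml and its contents are made up for illustration, and on some distributions the binary is installed as xml rather than xmlstarlet.

```shell
# Hypothetical sample file, created only for this illustration.
cat > books.xml <<'EOF'
<?xml version="1.0"?>
<books>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
</books>
EOF

# grep-like: match every <title> and print its text, one per line.
titles=$(xmlstarlet sel -t -m '//book/title' -v . -n books.xml)
echo "$titles"

# sed-like: update the title of the book with id="2".
# The edited document goes to stdout; books.xml itself is left untouched.
updated=$(xmlstarlet ed -u '//book[@id="2"]/title' -v 'Second edition' books.xml)
echo "$updated"
```

The sel subcommand covers querying and the ed subcommand covers editing, so between them they fill roughly the roles grep and sed play for line-oriented data.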

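Some promising tools:

nokogiri: parsing HTML/XML DOMs in Ruby using XPath and CSS selectors.
hpricot: deprecated.
fxgrep: uses its own XPath-like syntax to query documents. Written in SML, so installation may be difficult.
LT XML: XML toolkit derived from SGML tools, including sggrep, sgsort, xmlnorm and others. Uses its own query syntax. The documentation is very formal. Written in C. LT XML 2 claims support for XPath, XInclude and other W3C standards.
xmlgrep2: simple and powerful searching with XPath. Written in Perl using XML::LibXML and libxml2.
XQSharp: supports XQuery, the extension to XPath. Written for the .NET Framework.
xml-coreutils: Laird Breyer's toolkit, an equivalent of GNU coreutils. Discussed in an interesting essay on what the ideal toolkit should include.
xmldiff: simple tool for comparing two XML files.
xmltk: does not seem to be packaged in Debian, Ubuntu, Fedora or MacPorts, has not had a release since 2007, and uses non-portable build automation.

xml-coreutils seems to be the best documented and the most UNIX-oriented.

There is also the xml2 and 2xml pair. It lets the usual string editing tools process XML.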

Example. q.xml:

 <?xml version="1.0"?>
 <foo>
     text
     more text
     <textnode>ddd</textnode><textnode a="bv">dsss</textnode>
     <![CDATA[ asfdasdsa <foo> sdfsdfdsf <bar> ]]>
 </foo>

xml2 < q.xml

 /foo=
 /foo= text
 /foo= more text
 /foo=
 /foo/textnode=ddd
 /foo/textnode
 /foo/textnode/@a=bv
 /foo/textnode=dsss
 /foo=
 /foo= asfdasdsa <foo> sdfsdfdsf <bar>
 /foo=

xml2 < q.xml | grep textnode | sed 's!/foo!/bar/baz!' | 2xml

 <bar><baz><textnode>ddd</textnode><textnode a="bv">dsss</textnode></baz></bar>

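P.S. There are also html2 / 2html.

To Joseph Holsten's excellent list I would add the xpath command-line script that comes with the Perl library XML::XPath. It is a great way to extract information from XML files: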

  xpath -q -e '/entry[@xml:lang="fr"]' *xml

You can use xmllint:

 xmllint --xpath //title books.xml

It should be bundled with most distributions, and it is also bundled with Cygwin.

 $ xmllint --version
 xmllint: using libxml version 20900

See:

 $ xmllint
 Usage : xmllint [options] XMLfiles ...
         Parse the XML files and output the result of the parsing
         --version : display the version of the XML library used
         --debug : dump a debug tree of the in-memory document
         ...
         --schematron schema : do validation against a schematron
         --sax1: use the old SAX1 interfaces for processing
         --sax: do not build a tree but work just at the SAX level
         --oldxml10: use XML-1.0 parsing rules before the 5th edition
         --xpath expr: evaluate the XPath expression, inply --noout
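
A small sketch of how the --xpath option can be used to pull out plain values rather than whole elements. The file books.xml and its contents are invented for illustration, and this assumes an xmllint build recent enough to have --xpath, as in the usage text above.

```shell
# Hypothetical sample file, created only for this illustration.
cat > books.xml <<'EOF'
<?xml version="1.0"?>
<books>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
</books>
EOF

# Wrapping the path in string() yields the text of the first match
# instead of the serialized <title> element.
first=$(xmllint --xpath 'string(//book[1]/title)' books.xml)
echo "$first"

# XPath functions such as count() can be evaluated the same way.
xmllint --xpath 'count(//book)' books.xml
```

Because string() and count() return a single scalar, the result is easy to capture in a shell variable and pass on to other tools.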

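It depends on what you want to do.

XSLT may be the way to go, but there is a learning curve. Try xsltproc, and note that you can pass parameters to the stylesheet.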
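
To make the point about parameters concrete, here is a minimal sketch. The stylesheet pick.xsl, the parameter name wanted, and the file books.xml are all made up for this example; only xsltproc and its --stringparam option are the real tool.

```shell
# Hypothetical sample file, created only for this illustration.
cat > books.xml <<'EOF'
<?xml version="1.0"?>
<books>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
</books>
EOF

# Hypothetical stylesheet: prints the title of each book whose id
# equals the "wanted" parameter, one title per line.
cat > pick.xsl <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="wanted"/>
  <xsl:template match="/">
    <xsl:for-each select="//book[@id=$wanted]/title">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF

# --stringparam passes a string value into the stylesheet's <xsl:param>.
result=$(xsltproc --stringparam wanted 2 pick.xsl books.xml)
echo "$result"
```

Keeping the selection criterion in a parameter means the same stylesheet can be reused from a shell loop with different values.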

There are also xmlsed and xmlgrep from the NetBSD xmltools!

http://blog.huoc.org/xmltools-not-dead.html

If you are looking for a solution on Windows, PowerShell has built-in functionality for reading and writing XML.

test.xml:

 <root>
   <one>I like applesauce</one>
   <two>You sure bet I do!</two>
 </root>

PowerShell script:

 # load XML file into local variable and cast as XML type.
 $doc = [xml](Get-Content ./test.xml)
 $doc.root.one                                   # echoes "I like applesauce"
 $doc.root.one = "Who doesn't like applesauce?"  # replace inner text of <one> node
 # create new node...
 $newNode = $doc.CreateElement("three")
 $newNode.set_InnerText("And don't you forget it!")
 # ...and position it in the hierarchy
 $doc.root.AppendChild($newNode)
 # write results to disk
 $doc.save("./testNew.xml")

testNew.xml:

 <root>
   <one>Who doesn't like applesauce?</one>
   <two>You sure bet I do!</two>
   <three>And don't you forget it!</three>
 </root>

Source: https://serverfault.com/questions/26976/update-xml-from-the-command-line-windows

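XQuery might be a good solution. It is (relatively) easy to learn and is a W3C standard.

I would recommend XQSharp as a command-line processor.

There is also saxon-lint for the command line, which can use XPath 3.0 / XQuery 3.0. (The other command-line tools use XPath 1.0.)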

Examples:

HTTP/HTML:

 $ saxon-lint --html --xpath 'count(//a)' http://stackoverflow.com/q/91791
 328

XML:

 $ saxon-lint --xpath '//a[@class="x"]' file.xml

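JEdit has a plugin called "XQuery" that provides querying functionality for XML documents.

Not quite the command line, but it works!

Decide which operations you want to perform on your XML files, then create a script (perhaps in Python or Perl) that exposes this functionality through parameters to your shell scripts.

I started with xmlstarlet and still use it. When a query gets tough and I need XPath 2 and XQuery support for the XML, I turn to xidel: http://www.videlibri.de/xidel.html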
