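How do I validate an XML file in Java against an XSD that uses include?

I am using the Java 5 javax.xml.validation.Validator to validate XML files. I have done this for a schema that only uses imports, and everything works fine. Now I am trying to validate against another schema that uses both import and include. The problem is that the elements declared in the main schema are ignored: validation reports that it cannot find their declarations.

Here is how I build the schema: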

    InputStream includeInputStream = getClass().getClassLoader().getResource("include.xsd").openStream();
    InputStream importInputStream = getClass().getClassLoader().getResource("import.xsd").openStream();
    InputStream mainInputStream = getClass().getClassLoader().getResource("main.xsd").openStream();

    // Each stream has to be wrapped in a Source before it can be placed in the array
    Source[] sourceSchema = new SAXSource[] {
        new SAXSource(new InputSource(includeInputStream)),
        new SAXSource(new InputSource(importInputStream)),
        new SAXSource(new InputSource(mainInputStream))
    };
    Schema schema = factory.newSchema(sourceSchema);

And here is an excerpt of the declarations in main.xsd:

    <xsd:schema xmlns="http://schema.omg.org/spec/BPMN/2.0"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:import="http://www.foo.com/import"
                targetNamespace="http://main/namespace"
                elementFormDefault="qualified"
                attributeFormDefault="unqualified">

        <xsd:import namespace="http://www.foo.com/import" schemaLocation="import.xsd"/>
        <xsd:include schemaLocation="include.xsd"/>

        <xsd:element name="element" type="tElement"/>
        <...>
    </xsd:schema>

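If I copy the contents of the included XSD into main.xsd, it works fine. If I don't, validation cannot find the declaration of "element".

You need to use an LSResourceResolver for this to work. Have a look at the sample code below.

The validation method: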

    import java.io.StringReader;
    import javax.xml.XMLConstants;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.transform.Source;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.validation.Schema;
    import javax.xml.validation.SchemaFactory;
    import javax.xml.validation.Validator;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    // note that if your XML already declares the XSD to which it has to conform,
    // then there's no need to declare the schemaName here
    void validate(String xml, String schemaName) throws Exception {
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        builderFactory.setNamespaceAware(true);
        DocumentBuilder parser = builderFactory.newDocumentBuilder();

        // parse the XML string into a document object
        Document document = parser.parse(new InputSource(new StringReader(xml)));

        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

        // associate the schema factory with the resource resolver,
        // which is responsible for resolving the imported XSDs
        factory.setResourceResolver(new ResourceResolver());

        // note that if your XML already declares the XSD to which it has to conform,
        // then there's no need to create a validator from a Schema object
        Source schemaFile = new StreamSource(getClass().getClassLoader().getResourceAsStream(schemaName));
        Schema schema = factory.newSchema(schemaFile);

        Validator validator = schema.newValidator();
        validator.validate(new DOMSource(document));
    }

The resource resolver implementation:

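    import java.io.InputStream;
    import org.w3c.dom.ls.LSInput;
    import org.w3c.dom.ls.LSResourceResolver;

    public class ResourceResolver implements LSResourceResolver {

        public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                       String systemId, String baseURI) {
            // note: in this sample, the XSDs are expected to be in the root of the classpath
            InputStream resourceAsStream = this.getClass().getClassLoader().getResourceAsStream(systemId);
            if (resourceAsStream == null) {
                // not on the classpath: returning null lets the parser try its default lookup
                return null;
            }
            return new Input(publicId, systemId, resourceAsStream);
        }
    }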

The Input implementation returned by the resource resolver:

    import java.io.BufferedInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.Reader;
    import org.w3c.dom.ls.LSInput;

    public class Input implements LSInput {

        private String publicId;
        private String systemId;
        private BufferedInputStream inputStream;

        public Input(String publicId, String sysId, InputStream input) {
            this.publicId = publicId;
            this.systemId = sysId;
            this.inputStream = new BufferedInputStream(input);
        }

        // The schema content is handed to the parser as a string
        public String getStringData() {
            synchronized (inputStream) {
                try {
                    // Read until end of stream; available() alone is not a reliable length
                    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                    byte[] chunk = new byte[4096];
                    int read;
                    while ((read = inputStream.read(chunk)) != -1) {
                        buffer.write(chunk, 0, read);
                    }
                    // Decodes with the platform default charset
                    return buffer.toString();
                } catch (IOException e) {
                    e.printStackTrace();
                    System.out.println("Exception " + e);
                    return null;
                }
            }
        }

        public String getPublicId() { return publicId; }
        public void setPublicId(String publicId) { this.publicId = publicId; }

        public String getSystemId() { return systemId; }
        public void setSystemId(String systemId) { this.systemId = systemId; }

        public BufferedInputStream getInputStream() { return inputStream; }
        public void setInputStream(BufferedInputStream inputStream) { this.inputStream = inputStream; }

        // The remaining LSInput members are not used in this sample
        public String getBaseURI() { return null; }
        public InputStream getByteStream() { return null; }
        public boolean getCertifiedText() { return false; }
        public Reader getCharacterStream() { return null; }
        public String getEncoding() { return null; }

        public void setBaseURI(String baseURI) { }
        public void setByteStream(InputStream byteStream) { }
        public void setCertifiedText(boolean certifiedText) { }
        public void setCharacterStream(Reader characterStream) { }
        public void setEncoding(String encoding) { }
        public void setStringData(String stringData) { }
    }

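I had to make some modifications to the post by AMegmondoEmber.

My main schema file has some includes from sibling folders, and the included files in turn have includes from their own local folders. I also had to track the base resource path and the relative path of the current resource. This code works for me, but keep in mind that it assumes every XSD file has a unique name. If you have XSD files with the same name but different content at different paths, it will probably cause you problems.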

    import java.io.ByteArrayInputStream;
    import java.io.InputStream;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Scanner;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.w3c.dom.ls.LSInput;
    import org.w3c.dom.ls.LSResourceResolver;

    /**
     * The Class ResourceResolver.
     */
    public class ResourceResolver implements LSResourceResolver {

        /** The logger. */
        private final Logger logger = LoggerFactory.getLogger(this.getClass());

        /** The schema base path. */
        private final String schemaBasePath;

        /** The path map: XSD file name to the classpath folder it was found in. */
        private Map<String, String> pathMap = new HashMap<String, String>();

        /**
         * Instantiates a new resource resolver.
         *
         * @param schemaBasePath the schema base path
         */
        public ResourceResolver(String schemaBasePath) {
            this.schemaBasePath = schemaBasePath;
            logger.warn("This LSResourceResolver implementation assumes that all XSD files have a unique name. "
                    + "If you have some XSD files with same name but different content (at different paths) in your schema structure, "
                    + "this resolver will fail to include the other XSD files except the first one found.");
        }

        /* (non-Javadoc)
         * @see org.w3c.dom.ls.LSResourceResolver#resolveResource(java.lang.String, java.lang.String, java.lang.String, java.lang.String, java.lang.String)
         */
        @Override
        public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                       String systemId, String baseURI) {
            // The base resource that includes this current resource
            String baseResourceName = null;
            String baseResourcePath = null;

            // Extract the current resource name
            String currentResourceName = systemId.substring(systemId.lastIndexOf("/") + 1);

            // If this resource hasn't been added yet
            if (!pathMap.containsKey(currentResourceName)) {
                if (baseURI != null) {
                    baseResourceName = baseURI.substring(baseURI.lastIndexOf("/") + 1);
                }

                // we don't need "./" since getResourceAsStream cannot understand it
                if (systemId.startsWith("./")) {
                    systemId = systemId.substring(2, systemId.length());
                }

                // If the baseResourcePath has already been discovered, get that from pathMap
                if (pathMap.containsKey(baseResourceName)) {
                    baseResourcePath = pathMap.get(baseResourceName);
                } else {
                    // The baseResourcePath should be the schemaBasePath
                    baseResourcePath = schemaBasePath;
                }

                // Read the resource as input stream
                String normalizedPath = getNormalizedPath(baseResourcePath, systemId);
                InputStream resourceAsStream = this.getClass().getClassLoader()
                        .getResourceAsStream(normalizedPath);

                // If the current resource is not in the same path as the base resource,
                // add the current resource's path to pathMap
                if (systemId.contains("/")) {
                    pathMap.put(currentResourceName,
                            normalizedPath.substring(0, normalizedPath.lastIndexOf("/") + 1));
                } else {
                    // The current resource should be at the same path as the base resource
                    pathMap.put(systemId, baseResourcePath);
                }

                Scanner s = new Scanner(resourceAsStream).useDelimiter("\\A");
                String s1 = s.next()
                        // the parser could not handle elements broken across lines, e.g. (<xs:element \n name="buxing">)
                        .replaceAll("\\n", " ")
                        // these two whitespace replacements are only cosmetic
                        .replace('\t', ' ')
                        .replaceAll("\\s+", " ")
                        // some files start with a special character that marks them as UTF-8
                        .replaceAll("[^\\x20-\\x7e]", "");
                InputStream is = new ByteArrayInputStream(s1.getBytes());

                return new LSInputImpl(publicId, systemId, is); // same as Input class
            }

            // If this resource has already been added, do not add the same resource again. It throws
            // "org.xml.sax.SAXParseException: sch-props-correct.2: A schema cannot contain two global components
            // with the same name; this schema contains two occurrences of ..."
            // Return null instead.
            return null;
        }

        /**
         * Gets the normalized path.
         *
         * @param basePath the base path
         * @param relativePath the relative path
         * @return the normalized path
         */
        private String getNormalizedPath(String basePath, String relativePath) {
            if (!relativePath.startsWith("../")) {
                return basePath + relativePath;
            } else {
                // Each leading "../" removes the last folder from basePath
                while (relativePath.startsWith("../")) {
                    basePath = basePath.substring(0,
                            basePath.substring(0, basePath.length() - 1).lastIndexOf("/") + 1);
                    relativePath = relativePath.substring(3);
                }
                return basePath + relativePath;
            }
        }
    }
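The manual "../" handling in getNormalizedPath can also be left to java.net.URI, which resolves "./" and "../" segments against the path of the including schema. The following is only a minimal sketch: the class name SchemaPaths and the method resolve are hypothetical, and it assumes the classpath paths contain no spaces or other characters that are illegal in a URI.

```java
import java.net.URI;

/** Hypothetical helper: resolves a schemaLocation against the including schema's path. */
final class SchemaPaths {

    private SchemaPaths() {
    }

    /**
     * @param includingSchemaPath classpath path of the schema that contains the include,
     *                            e.g. "schemas/includes/subPart.xsd"
     * @param schemaLocation      the schemaLocation value found in that schema,
     *                            e.g. "../common/types.xsd"
     * @return the classpath path of the included schema, with "." and ".." segments removed
     */
    static String resolve(String includingSchemaPath, String schemaLocation) {
        // URI.resolve drops the last segment of the base path, appends the
        // relative reference and then normalizes the result
        return URI.create(includingSchemaPath).resolve(schemaLocation).toString();
    }
}
```

With this helper, resolving "../common/types.xsd" against "schemas/includes/subPart.xsd" yields "schemas/common/types.xsd", and the result can be passed to getResourceAsStream directly. Because it works on full paths rather than file names, it does not depend on every XSD having a unique name.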

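The accepted answer is perfectly fine, but it does not work with Java 8 without some modifications. It would also be nice to be able to specify a base path from which the imported schemas are read.

I used the following code with Java 8; it allows specifying an embedded schema path other than the root path: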

    // Note: DOMInputImpl is a JDK-internal class
    import com.sun.org.apache.xerces.internal.dom.DOMInputImpl;
    import org.w3c.dom.ls.LSInput;
    import org.w3c.dom.ls.LSResourceResolver;

    import java.io.InputStream;
    import java.util.Objects;

    public class ResourceResolver implements LSResourceResolver {

        private String basePath;

        public ResourceResolver(String basePath) {
            this.basePath = basePath;
        }

        @Override
        public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                       String systemId, String baseURI) {
            // note: the XSDs are looked up under basePath, or in the root of the classpath if basePath is null
            InputStream resourceAsStream = this.getClass().getClassLoader()
                    .getResourceAsStream(buildPath(systemId));
            Objects.requireNonNull(resourceAsStream,
                    String.format("Could not find the specified xsd file: %s", systemId));
            return new DOMInputImpl(publicId, systemId, baseURI, resourceAsStream, "UTF-8");
        }

        private String buildPath(String systemId) {
            return basePath == null ? systemId : String.format("%s/%s", basePath, systemId);
        }
    }

This implementation also gives the user a meaningful message if the schema cannot be read.
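DOMInputImpl lives in a com.sun package that is internal to the JDK, so depending on it may not be possible on every Java version. The standard DOM Level 3 Load and Save API can create an empty LSInput through DOMImplementationLS.createLSInput(), which avoids both the internal class and a hand-written LSInput. The sketch below illustrates this under some assumptions: the class MapResourceResolver, its compile and isValid helpers, and the idea of serving schemas from an in-memory map are all hypothetical and chosen only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

/** Hypothetical resolver that serves included schemas from an in-memory map. */
final class MapResourceResolver implements LSResourceResolver {

    private final DOMImplementationLS ls;
    private final Map<String, String> schemaTextBySystemId;

    MapResourceResolver(Map<String, String> schemaTextBySystemId) {
        this.schemaTextBySystemId = schemaTextBySystemId;
        try {
            // Standard DOM bootstrap: ask the registry for an implementation with the "LS" feature
            this.ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not bootstrap a DOM implementation", e);
        }
        if (this.ls == null) {
            throw new IllegalStateException("No DOM implementation supports the LS feature");
        }
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                   String systemId, String baseURI) {
        String text = schemaTextBySystemId.get(systemId);
        if (text == null) {
            return null; // unknown resource: let the parser use its default lookup
        }
        LSInput input = ls.createLSInput(); // empty LSInput supplied by the DOM implementation
        input.setPublicId(publicId);
        input.setSystemId(systemId);
        input.setBaseURI(baseURI);
        input.setCharacterStream(new StringReader(text));
        return input;
    }

    /** Compiles the main schema, resolving its includes and imports from the map. */
    static Schema compile(String mainSchema, Map<String, String> included) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver(new MapResourceResolver(included));
        try {
            return factory.newSchema(new StreamSource(new StringReader(mainSchema)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("Schema did not compile", e);
        }
    }

    /** Returns true if the XML text validates against the schema. */
    static boolean isValid(Schema schema, String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a classpath-based resolver, the map lookup would be replaced by getResourceAsStream and the result passed to setByteStream, which leaves encoding detection to the parser.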

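For us, resolveResource ended up looking like the code below. We got there after hitting a prolog exception and some odd errors such as: Element type "xs:schema" must be followed by either attribute specifications, ">" or "/>". Element type "xs:element" must be followed by either attribute specifications, ">" or "/>". (These were caused by elements being broken across multiple lines.)

The path history is needed because of the structure of the includes: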

    main.xsd                    (this has include "includes/subPart.xsd")
    /includes/subPart.xsd       (this has include "./subSubPart.xsd")
    /includes/subSubPart.xsd

So the code looks like this:

    String pathHistory = "";

    @Override
    public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                   String systemId, String baseURI) {
        // we don't need a leading "./" since getResourceAsStream cannot understand it
        if (systemId.startsWith("./")) {
            systemId = systemId.substring(2);
        }
        InputStream resourceAsStream = Message.class.getClassLoader().getResourceAsStream(systemId);
        if (resourceAsStream == null) {
            // not found from the root: retry relative to the folder of the last resource found
            resourceAsStream = Message.class.getClassLoader().getResourceAsStream(pathHistory + systemId);
        } else {
            pathHistory = getNormalizedPath(systemId);
        }
        Scanner s = new Scanner(resourceAsStream).useDelimiter("\\A");
        String s1 = s.next()
                // the parser could not handle elements broken across lines, e.g. (<xs:element \n name="buxing">)
                .replaceAll("\\n", " ")
                // these two whitespace replacements are only cosmetic
                .replace('\t', ' ')
                .replaceAll("\\s+", " ")
                // some files start with a special character that marks them as UTF-8
                .replaceAll("[^\\x20-\\x7e]", "");
        InputStream is = new ByteArrayInputStream(s1.getBytes());
        return new LSInputImpl(publicId, systemId, is);
    }

    private String getNormalizedPath(String baseURI) {
        // classpath resource names use "/" on every platform, so do not rely on file.separator
        return baseURI.substring(0, baseURI.lastIndexOf('/') + 1);
    }
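The "special character" mentioned in the comments above is most likely a UTF-8 byte order mark (BOM); that is an assumption, since the original files are not shown. If so, the BOM can be removed on its own instead of stripping every non-ASCII character, which would also delete legitimate characters in annotations or enumeration values. XML itself allows line breaks inside a start tag, so flattening whitespace should not normally be required once the content reaches the parser intact. A minimal sketch, with the hypothetical names BomStripper and decodeWithoutBom:

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: decodes UTF-8 schema bytes and drops a leading BOM, nothing else. */
final class BomStripper {

    private BomStripper() {
    }

    static String decodeWithoutBom(byte[] utf8Bytes) {
        // Java's UTF-8 decoder keeps the BOM as the character U+FEFF
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }
}
```

The returned string could be handed to the parser through a StringReader, or returned from getStringData() in an LSInput implementation.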

The accepted answer is very verbose and builds a DOM in memory first. Includes seem to work out of the box for me, including relative references.

    SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Schema schema = schemaFactory.newSchema(new File("../foo.xsd"));
    Validator validator = schema.newValidator();
    validator.validate(new StreamSource(new File("./foo.xml")));
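A likely reason the File-based call resolves relative references is that a schema loaded from a File or URL carries a system ID, which gives the parser a base location for each schemaLocation. A StreamSource built from a bare InputStream has no system ID, so the parser has nothing to resolve "include.xsd" against. The sketch below demonstrates this with newSchema(URL); the class SystemIdSchemaDemo, its helper methods and the temporary demo files are hypothetical and exist only to make the example runnable.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class SystemIdSchemaDemo {

    private static final String NS = "http://main/namespace";

    /** Writes a hypothetical main.xsd and types/include.xsd to a temp folder. */
    static URL writeDemoSchemas() {
        String include = "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\""
                + " targetNamespace=\"" + NS + "\">"
                + "<xsd:simpleType name=\"tElement\"><xsd:restriction base=\"xsd:int\"/>"
                + "</xsd:simpleType></xsd:schema>";
        String main = "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\""
                + " xmlns:m=\"" + NS + "\" targetNamespace=\"" + NS + "\""
                + " elementFormDefault=\"qualified\">"
                + "<xsd:include schemaLocation=\"types/include.xsd\"/>"
                + "<xsd:element name=\"element\" type=\"m:tElement\"/></xsd:schema>";
        try {
            Path dir = Files.createTempDirectory("xsd-demo");
            Files.createDirectories(dir.resolve("types"));
            Files.write(dir.resolve("types").resolve("include.xsd"),
                    include.getBytes(StandardCharsets.UTF_8));
            Path mainFile = dir.resolve("main.xsd");
            Files.write(mainFile, main.getBytes(StandardCharsets.UTF_8));
            return mainFile.toUri().toURL();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Compiles a schema from a URL, so relative includes resolve against that URL. */
    static Schema compile(URL mainSchema) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(mainSchema);
        } catch (SAXException e) {
            throw new IllegalArgumentException("Schema did not compile", e);
        }
    }

    /** Returns true if the XML text validates against the schema. */
    static boolean isValid(Schema schema, String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For schemas on the classpath, the same idea would be to pass the URL returned by getClass().getResource(...) to newSchema, or to call setSystemId on the StreamSource, rather than passing only an InputStream.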

If an element cannot be found in the XML, you will get an xml:lang exception. Element names are case sensitive.

    SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Source schemaFile = new StreamSource(getClass().getClassLoader().getResourceAsStream("cars-fleet.xsd"));
    Schema schema = schemaFactory.newSchema(schemaFile);
    Validator validator = schema.newValidator();
    // xml is expected to be a File, InputStream or Reader; a String would be treated as a system ID
    StreamSource source = new StreamSource(xml);
    validator.validate(source);