Swift: extract regex matches

I want to extract the substrings of a string that match a regex pattern.

So I am looking for something like this:

    func matchesForRegexInText(regex: String!, text: String!) -> [String] {
        ???
    }

This is what I have so far:

    func matchesForRegexInText(regex: String!, text: String!) -> [String] {
        var regex = NSRegularExpression(pattern: regex, options: nil, error: nil)
        var results = regex.matchesInString(text, options: nil,
            range: NSMakeRange(0, countElements(text))) as Array<NSTextCheckingResult>
        /// ???
        return ...
    }

The problem is that matchesInString gives me an array of NSTextCheckingResult, where NSTextCheckingResult.range is of type NSRange.

NSRange is incompatible with Range<String.Index>, so it prevents me from using text.substringWithRange(...).

Any idea how to achieve this simple thing in Swift without too many lines of code?

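Even though the matchesInString() method takes a String as its first argument, it works internally with NSString, and the range parameter must be given in terms of the NSString length, not the Swift string length. Otherwise it will fail for "extended grapheme clusters" such as flags.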
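To make the length mismatch concrete, here is a minimal sketch comparing the two counts for the sample string used in this answer. The variable names are only illustrative.

```swift
import Foundation

let text = "🇩🇪€4€9"

// Swift's `count` is the number of extended grapheme clusters
// (user-perceived characters): the flag, "€", "4", "€", "9".
let characterCount = text.count

// NSString measures length in UTF-16 code units. The flag is made of two
// regional-indicator scalars, each needing a surrogate pair (4 units total).
let utf16Length = NSString(string: text).length

print(characterCount)    // 5
print(utf16Length)       // 8
print(text.utf16.count)  // 8, the same measure without going through NSString
```

A range built from the character count would therefore stop three UTF-16 units short and miss the final "9".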

As of Swift 4 (Xcode 9), the Swift standard library provides functions to convert between Range<String.Index> and NSRange.

    func matches(for regex: String, in text: String) -> [String] {
        do {
            let regex = try NSRegularExpression(pattern: regex)
            // The NSRange must cover the string's UTF-16 length.
            let results = regex.matches(in: text,
                                        range: NSRange(text.startIndex..., in: text))
            return results.map {
                String(text[Range($0.range, in: text)!])
            }
        } catch let error {
            print("invalid regex: \(error.localizedDescription)")
            return []
        }
    }

Example:

    let string = "🇩🇪€4€9"
    let matched = matches(for: "[0-9]", in: string)
    print(matched)
    // ["4", "9"]
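The two conversions mentioned above can also be tried in isolation. This is a small sketch for Swift 4 or later; the helper names `utf16Range(of:in:)` and `substring(at:in:)` are made up for illustration and are not part of Foundation.

```swift
import Foundation

// Range<String.Index> -> NSRange (hypothetical helper name).
func utf16Range(of needle: String, in text: String) -> NSRange? {
    guard let swiftRange = text.range(of: needle) else { return nil }
    return NSRange(swiftRange, in: text)
}

// NSRange -> Range<String.Index> (hypothetical helper name).
// The initializer is failable, so this avoids the force unwrap.
func substring(at nsRange: NSRange, in text: String) -> String? {
    guard let swiftRange = Range(nsRange, in: text) else { return nil }
    return String(text[swiftRange])
}

let sample = "🇩🇪€4€9"
if let nsRange = utf16Range(of: "4", in: sample) {
    print(nsRange.location, nsRange.length)          // 5 1
    print(substring(at: nsRange, in: sample) ?? "")  // 4
}
```

The location is 5 rather than 2 because the flag occupies four UTF-16 code units and the euro sign one.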

(Older answer for Swift 3 and earlier:)

So you should convert the given Swift string to an NSString and then extract the ranges. The result is converted to a Swift string array automatically.

(The code for Swift 1.2 can be found in the edit history.)

Swift 2 (Xcode 7.3.1):

    func matchesForRegexInText(regex: String, text: String) -> [String] {
        do {
            let regex = try NSRegularExpression(pattern: regex, options: [])
            let nsString = text as NSString
            let results = regex.matchesInString(text,
                options: [], range: NSMakeRange(0, nsString.length))
            return results.map { nsString.substringWithRange($0.range) }
        } catch let error as NSError {
            print("invalid regex: \(error.localizedDescription)")
            return []
        }
    }

Example:

    let string = "🇩🇪€4€9"
    let matches = matchesForRegexInText("[0-9]", text: string)
    print(matches)
    // ["4", "9"]

Swift 3 (Xcode 8):

    func matches(for regex: String, in text: String) -> [String] {
        do {
            let regex = try NSRegularExpression(pattern: regex)
            let nsString = text as NSString
            let results = regex.matches(in: text,
                range: NSRange(location: 0, length: nsString.length))
            return results.map { nsString.substring(with: $0.range) }
        } catch let error {
            print("invalid regex: \(error.localizedDescription)")
            return []
        }
    }

Example:

    let string = "🇩🇪€4€9"
    let matched = matches(for: "[0-9]", in: string)
    print(matched)
    // ["4", "9"]

My answer builds on the answers already given, but makes regex matching more robust by adding further support:

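  • Returns not only the matches but also all capture groups of each match (see the examples below)
  • Supports optional matches instead of returning an empty array
  • Avoids do/catch by not printing to the console, and uses the guard construct instead
  • Adds matchingStrings as an extension to String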

Swift 3:

    //: Playground - noun: a place where people can play

    import Foundation

    extension String {
        func matchingStrings(regex: String) -> [[String]] {
            guard let regex = try? NSRegularExpression(pattern: regex, options: []) else { return [] }
            let nsString = self as NSString
            let results = regex.matches(in: self, options: [], range: NSMakeRange(0, nsString.length))
            return results.map { result in
                (0..<result.numberOfRanges).map {
                    // A group that did not take part in the match yields an empty string.
                    result.rangeAt($0).location != NSNotFound
                        ? nsString.substring(with: result.rangeAt($0))
                        : ""
                }
            }
        }
    }

    "prefix12 aaa3 prefix45".matchingStrings(regex: "fix([0-9])([0-9])")
    // Prints: [["fix12", "1", "2"], ["fix45", "4", "5"]]

    "prefix12".matchingStrings(regex: "(?:prefix)?([0-9]+)")
    // Prints: [["prefix12", "12"]]

    "12".matchingStrings(regex: "(?:prefix)?([0-9]+)")
    // Prints: [["12", "12"]], other answers return an empty array here

    // Safely accessing the capture of the first match (if any):
    let number = "prefix12suffix".matchingStrings(regex: "fix([0-9]+)su").first?[1]
    // Prints: Optional("12")

Swift 2:

    extension String {
        func matchingStrings(regex: String) -> [[String]] {
            guard let regex = try? NSRegularExpression(pattern: regex, options: []) else { return [] }
            let nsString = self as NSString
            let results = regex.matchesInString(self, options: [], range: NSMakeRange(0, nsString.length))
            return results.map { result in
                (0..<result.numberOfRanges).map {
                    result.rangeAtIndex($0).location != NSNotFound
                        ? nsString.substringWithRange(result.rangeAtIndex($0))
                        : ""
                }
            }
        }
    }
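For Swift 4 and later, the same idea might look like the sketch below. It is an assumed port rather than part of the original answer: it uses `range(at:)`, which replaced `rangeAt(_:)`, and the failable `Range(_:in:)` initializer instead of bridging to NSString.

```swift
import Foundation

extension String {
    /// Assumed Swift 4+ port of `matchingStrings(regex:)`: one array per match,
    /// holding the whole match followed by each capture group.
    func matchingStrings(regex pattern: String) -> [[String]] {
        guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
        let fullRange = NSRange(startIndex..., in: self)
        return regex.matches(in: self, range: fullRange).map { result -> [String] in
            (0..<result.numberOfRanges).map { index -> String in
                let nsRange = result.range(at: index)
                // Groups that did not participate report NSNotFound; map them to "".
                guard nsRange.location != NSNotFound,
                      let range = Range(nsRange, in: self) else { return "" }
                return String(self[range])
            }
        }
    }
}

print("prefix12 aaa3 prefix45".matchingStrings(regex: "fix([0-9])([0-9])"))
// [["fix12", "1", "2"], ["fix45", "4", "5"]]

print("12".matchingStrings(regex: "(prefix)?([0-9]+)"))
// [["12", "", "12"]]
```

The second call uses a capturing optional group to show the empty-string placeholder for a group that did not match.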

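If you want to extract the substrings themselves from a String, not just their positions (that is, the actual strings, including emoji), then the following may be a simpler solution.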

    extension String {
        func regex (pattern: String) -> [String] {
            do {
                let regex = try NSRegularExpression(pattern: pattern, options: NSRegularExpressionOptions(rawValue: 0))
                let nsstr = self as NSString
                let all = NSRange(location: 0, length: nsstr.length)
                var matches : [String] = [String]()
                regex.enumerateMatchesInString(self, options: NSMatchingOptions(rawValue: 0), range: all) {
                    (result : NSTextCheckingResult?, _, _) in
                    if let r = result {
                        let result = nsstr.substringWithRange(r.range) as String
                        matches.append(result)
                    }
                }
                return matches
            } catch {
                return [String]()
            }
        }
    }

Example usage:

 "someText 👿🏅👿⚽️ pig".regex("👿⚽️") 

This returns the following:

 ["👿⚽️"] 

Note that using "\w+" may produce an unexpected, seemingly empty "" match:

 "someText 👿🏅👿⚽️ pig".regex("\\w+") 

This returns the following String array:

 ["someText", "️", "pig"] 
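A likely explanation for the odd middle element is that it is not empty: it holds the invisible variation selector U+FE0F that follows the football emoji, and the regex engine's `\w` class includes combining marks. The sketch below spells the emoji out as Unicode scalars to make that visible. It assumes the "⚽️" in the example is U+26BD followed by U+FE0F, and both the helper name `allMatches(of:in:)` and the stricter pattern are illustrative choices, not part of the answer above.

```swift
import Foundation

// Hypothetical helper: all matches of `pattern` in `text` (Swift 4+).
func allMatches(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match -> String? in
        guard let range = Range(match.range, in: text) else { return nil }
        return String(text[range])
    }
}

// The example text, with the emoji written as escapes (assumed scalar values).
let text = "someText \u{1F47F}\u{1F3C5}\u{1F47F}\u{26BD}\u{FE0F} pig"

// "\\w+" also matches the lone variation selector after the football.
let wordMatches = allMatches(of: "\\w+", in: text)
let scalars = wordMatches.map { $0.unicodeScalars.map { String($0.value, radix: 16) } }
print(scalars[1])  // ["fe0f"]

// One possible stricter pattern: letters, decimal digits and underscore only.
let strict = allMatches(of: "[\\p{L}\\p{Nd}_]+", in: text)
print(strict)      // ["someText", "pig"]
```

The stricter pattern also excludes combining marks in decomposed text, so whether it is suitable depends on the input.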

I found that the accepted answer's solution unfortunately does not compile on Swift 3 for Linux. Here is a modified version that does:

    import Foundation

    func matches(for regex: String, in text: String) -> [String] {
        do {
            let regex = try RegularExpression(pattern: regex, options: [])
            let nsString = NSString(string: text)
            let results = regex.matches(in: text, options: [],
                range: NSRange(location: 0, length: nsString.length))
            return results.map { nsString.substring(with: $0.range) }
        } catch let error {
            print("invalid regex: \(error.localizedDescription)")
            return []
        }
    }

The main differences are:

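  1. Swift on Linux seems to require dropping the NS prefix on Foundation objects for which there is no Swift-native equivalent. (See Swift evolution proposal #86.)

  2. Swift on Linux also requires specifying the options argument for both the RegularExpression initializer and the matches method.

  3. For some reason, coercing a String into an NSString does not work in Swift on Linux, but initializing a new NSString with the String as its source does.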

This version also works with Swift 3 on macOS / Xcode, with the sole exception that you must use the name NSRegularExpression instead of RegularExpression.

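@p4bloch if you want to capture the results from a series of capture parentheses, you need to use the rangeAtIndex(index) method of NSTextCheckingResult instead of range. Here is @MartinR's Swift 2 method from above, adapted for capture parentheses. In the returned array, the first result [0] is the entire match, and the individual capture groups start at [1]. I commented out the map operation (so it is easier to see what I changed) and replaced it with nested loops.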

    func matches(for regex: String!, in text: String!) -> [String] {
        do {
            let regex = try NSRegularExpression(pattern: regex, options: [])
            let nsString = text as NSString
            let results = regex.matchesInString(text, options: [], range: NSMakeRange(0, nsString.length))
            var match = [String]()
            for result in results {
                // Index 0 is the whole match; 1... are the capture groups.
                for i in 0..<result.numberOfRanges {
                    match.append(nsString.substringWithRange( result.rangeAtIndex(i) ))
                }
            }
            return match
            //return results.map { nsString.substringWithRange( $0.range )} //rangeAtIndex(0)
        } catch let error as NSError {
            print("invalid regex: \(error.localizedDescription)")
            return []
        }
    }

An example use case: say you want to split a "title year" string such as "Finding Dory 2016". You could do this:

    print ( matches(for: "^(.+)\\s(\\d{4})" , in: "Finding Dory 2016"))
    // ["Finding Dory 2016", "Finding Dory", "2016"]

This is how I did it. I hope it brings a new perspective on how this works in Swift.

In this example I get every string enclosed in [] (the matches include the brackets themselves):

    var sample = "this is an [hello] amazing [world]"

    var regex = NSRegularExpression(pattern: "\\[.+?\\]",
        options: NSRegularExpressionOptions.CaseInsensitive,
        error: nil)

    var matches = regex?.matchesInString(sample, options: nil,
        range: NSMakeRange(0, countElements(sample))) as Array<NSTextCheckingResult>

    for match in matches {
        // The cast to NSString is required to match the range format.
        let r = (sample as NSString).substringWithRange(match.range)
        println("found= \(r)")
    }
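If only the text inside the brackets is wanted, a capture group can do the trimming. This is a sketch in Swift 4 or later rather than the Swift 1 syntax above, and the function name `bracketedContents(in:)` is made up for illustration.

```swift
import Foundation

// Hypothetical helper: returns the text inside each [...] pair, brackets excluded.
func bracketedContents(in text: String) -> [String] {
    // Group 1 captures the (non-greedy) content between the brackets.
    guard let regex = try? NSRegularExpression(pattern: "\\[(.+?)\\]") else { return [] }
    // Use the UTF-16 based range, not the character count.
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match -> String? in
        guard let inner = Range(match.range(at: 1), in: text) else { return nil }
        return String(text[inner])
    }
}

print(bracketedContents(in: "this is an [hello] amazing [world]"))
// ["hello", "world"]
```

Building the range with `NSRange(_:in:)` also avoids the length mismatch that `countElements(sample)` can cause when the text contains emoji.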

This is a very simple solution that returns an array of strings with the matches.

Swift 3:

    import Foundation

    // The method uses `self`, so it has to live in an extension of String.
    extension String {
        internal func stringsMatching(regularExpressionPattern: String, options: NSRegularExpression.Options = []) -> [String] {
            guard let regex = try? NSRegularExpression(pattern: regularExpressionPattern, options: options) else {
                return []
            }

            let nsString = self as NSString
            let results = regex.matches(in: self, options: [], range: NSMakeRange(0, nsString.length))

            return results.map {
                nsString.substring(with: $0.range)
            }
        }
    }