Splitting a camelCase word with PHP preg_match (regular expression)

How would I go about splitting this word:

oneTwoThreeFour 

into an array, so that I can get:

 one Two Three Four 

with preg_match?

I tried this, but it just gives the whole word:

    $words = preg_match("/[a-zA-Z]*(?:[a-z][a-zA-Z]*[A-Z]|[A-Z][a-zA-Z]*[a-z])[a-zA-Z]*\b/", $string, $matches);

You can also use preg_match_all as:

    preg_match_all('/((?:^|[A-Z])[a-z]+)/', $str, $matches);

Explanation:

    (       - Start of capturing parenthesis.
    (?:     - Start of non-capturing parenthesis.
    ^       - Start anchor.
    |       - Alternation.
    [A-Z]   - Any one capital letter.
    )       - End of non-capturing parenthesis.
    [a-z]+  - One or more lowercase letters.
    )       - End of capturing parenthesis.
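To make the explanation concrete, here is a minimal sketch of reading the result. The variable names `$str`, `$matches` and `$words` are illustrative; the pattern is the one explained above.

```php
<?php
// Collect every camelCase "word" from the example string in the question.
$str = 'oneTwoThreeFour';

// $matches[0] holds the full-pattern matches. $matches[1] holds group 1,
// which is identical here because the outer parentheses wrap the whole pattern.
preg_match_all('/((?:^|[A-Z])[a-z]+)/', $str, $matches);

$words = $matches[0];
print_r($words);
```

For this input, `$words` contains one, Two, Three and Four. Note that preg_match_all returns the number of matches, so the words themselves have to be read from `$matches`.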

You can use preg_split as:

    $arr = preg_split('/(?=[A-Z])/', $str);

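I am basically splitting the input string just before each uppercase letter. The regex used, (?=[A-Z]), matches the position immediately before an uppercase letter.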
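A small sketch of that lookahead split, wrapped in a helper. The function name `splitBeforeUppercase` is made up for illustration, and the `PREG_SPLIT_NO_EMPTY` flag is an addition not present in the answer: it discards any empty pieces, for example one that may appear when the input starts with a capital letter.

```php
<?php
// Hypothetical helper name; the pattern is the lookahead from the answer.
function splitBeforeUppercase(string $str): array
{
    // The lookahead matches a position, not text, so no characters are lost.
    // PREG_SPLIT_NO_EMPTY (an assumption added here) drops empty pieces.
    return preg_split('/(?=[A-Z])/', $str, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(splitBeforeUppercase('oneTwoThreeFour'));
print_r(splitBeforeUppercase('OneTwo'));
```

The first call yields one, Two, Three, Four; the second yields One, Two.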

I know this is an old question with an accepted answer, but IMHO there is a better solution:

    <?php // test.php Rev:20140412_0800
    $ccWord = 'NewNASAModule';
    $re = '/(?#! splitCamelCase Rev:20140412)
        # Split camelCase "words". Two global alternatives. Either g1of2:
          (?<=[a-z])      # Position is after a lowercase,
          (?=[A-Z])       # and before an uppercase letter.
        | (?<=[A-Z])      # Or g2of2; Position is after uppercase,
          (?=[A-Z][a-z])  # and before upper-then-lower case.
        /x';
    $a = preg_split($re, $ccWord);
    $count = count($a);
    for ($i = 0; $i < $count; ++$i) {
        printf("Word %d of %d = \"%s\"\n",
            $i + 1, $count, $a[$i]);
    }
    ?>

Note that this regex (like codaddict's '/(?=[A-Z])/' solution, which works like a charm for well-formed camelCase words) matches only a position within the string and consumes no text at all. This solution has the additional benefit that it also correctly handles not-so-well-formed words such as StartsWithCap and hasConsecutiveCAPS.

Input:

oneTwoThreeFour
StartsWithCap
hasConsecutiveCAPS
NewNASAModule

Output:

Word 1 of 4 = "one"
Word 2 of 4 = "Two"
Word 3 of 4 = "Three"
Word 4 of 4 = "Four"

Word 1 of 3 = "Starts"
Word 2 of 3 = "With"
Word 3 of 3 = "Cap"

Word 1 of 3 = "has"
Word 2 of 3 = "Consecutive"
Word 3 of 3 = "CAPS"

Word 1 of 3 = "New"
Word 2 of 3 = "NASA"
Word 3 of 3 = "Module"

Edit: Revised the regex, script and test data to correctly split the "NewNASAModule" case (in response to rr's comment).
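To see why the second alternative matters, the following sketch compares a plain "before every uppercase letter" split with the two-alternative pattern on the same word. The variable names are illustrative, and the `PREG_SPLIT_NO_EMPTY` flag on the simple split is an assumption added to keep the comparison free of empty pieces.

```php
<?php
// Simple split: before every uppercase letter.
$simple = '/(?=[A-Z])/';
// Two-alternative split: lower-to-upper boundary, or the last capital of a
// run of capitals when it is followed by a lowercase letter.
$twoWay = '/(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])/';

$word = 'NewNASAModule';

$simpleParts = preg_split($simple, $word, -1, PREG_SPLIT_NO_EMPTY);
$twoWayParts = preg_split($twoWay, $word);

echo implode(' ', $simpleParts), "\n"; // New N A S A Module
echo implode(' ', $twoWayParts), "\n"; // New NASA Module
```

The simple pattern breaks the acronym into single letters, while the two-alternative pattern keeps "NASA" together.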

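A function version of @ridgerunner's answer.

    /**
     * Converts camelCase string to have spaces between each word.
     * @param $camelCaseString
     * @return string
     */
    function fromCamelCase($camelCaseString) {
        $re = '/(?<=[a-z])(?=[A-Z])/x';
        $a = preg_split($re, $camelCaseString);
        // implode() with the separator first; the reversed argument order
        // is no longer accepted by current PHP versions.
        return implode(' ', $a);
    }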

While ridgerunner's answer works great, it does not seem to work with all-caps substrings that appear in the middle of a sentence. I use the following, which seems to handle these just fine:

    function splitCamelCase($input)
    {
        return preg_split(
            '/(^[^A-Z]+|[A-Z][^A-Z]+)/',
            $input,
            -1, /* no limit for replacement count */
            PREG_SPLIT_NO_EMPTY /* don't return empty elements */
                | PREG_SPLIT_DELIM_CAPTURE /* don't strip anything from output array */
        );
    }

Some test cases:

    assert(splitCamelCase('lowHigh') == ['low', 'High']);
    assert(splitCamelCase('WarriorPrincess') == ['Warrior', 'Princess']);
    assert(splitCamelCase('SupportSEELE') == ['Support', 'SEELE']);
    assert(splitCamelCase('LaunchFLEIAModule') == ['Launch', 'FLEIA', 'Module']);
    assert(splitCamelCase('anotherNASATrip') == ['another', 'NASA', 'Trip']);

    $string = preg_replace('/([a-z0-9])([A-Z])/', "$1 $2", $string);

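The trick is a repeatable pattern: wherever a lowercase letter or digit ($1) is immediately followed by an uppercase letter ($2), a space is inserted between them, so lowerUPPERlowerUPPER boundaries are handled one after another. For example, helloWorld becomes "hello World", and HelloWorld becomes "Hello World". Then you can lowercase the result and uppercase the first word, explode it on the space, or use a _ or some other character to keep the words separate.

Short and simple.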
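As a rough sketch of those follow-up steps (the input string and variable names are made up for illustration):

```php
<?php
// Insert a space at every lowercase-or-digit / uppercase boundary.
$spaced = preg_replace('/([a-z0-9])([A-Z])/', '$1 $2', 'helloWorldFooBar');
echo $spaced, "\n"; // hello World Foo Bar

// Break the result into an array of words.
$parts = explode(' ', $spaced);

// Use an underscore as the separator instead, all lowercase.
$snake = strtolower(str_replace(' ', '_', $spaced));
echo $snake, "\n"; // hello_world_foo_bar

// Lowercase everything, then capitalize only the first word.
$sentence = ucfirst(strtolower($spaced));
echo $sentence, "\n"; // Hello world foo bar
```

One limitation to keep in mind: because $1 must be a lowercase letter or digit, a run of capitals is not separated from the word that follows it, so 'NewNASAModule' becomes "New NASAModule".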

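Another option is matching /[A-Z]?[a-z]+/. If you know your input is in the right format, it should work nicely.

[A-Z]? matches one uppercase letter (or nothing). [a-z]+ then matches all the lowercase letters that follow, up to the next match.

Working example: http://www.ideone.com/MKYkX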
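A short sketch of this pattern used with preg_match_all, including one input that is not "in the right format" to show what the caveat means. The variable names are illustrative.

```php
<?php
// Well-formed camelCase: every word is an optional capital plus lowercase letters.
preg_match_all('/[A-Z]?[a-z]+/', 'oneTwoThreeFour', $wellFormed);
print_r($wellFormed[0]); // one, Two, Three, Four

// A run of capitals has no lowercase letters after it, so it never matches
// and is silently skipped.
preg_match_all('/[A-Z]?[a-z]+/', 'NewNASAModule', $withAcronym);
print_r($withAcronym[0]); // New, Module
```

Because this approach collects matches rather than splitting at positions, any text the pattern cannot match (such as "NASA" here) is left out of the result.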

I took cool dude Ridgerunner's code (above) and made it into a function:

    echo deliciousCamelcase('NewNASAModule');

    function deliciousCamelcase($str)
    {
        $formattedStr = '';
        $re = '/
              (?<=[a-z])
              (?=[A-Z])
            | (?<=[A-Z])
              (?=[A-Z][a-z])
            /x';
        $a = preg_split($re, $str);
        $formattedStr = implode(' ', $a);
        return $formattedStr;
    }

This will return: New NASA Module

Maybe my question can help you. I asked the same thing yesterday, but about Java:

Split a string at uppercase letters

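You can split where the string "glides" from lowercase to uppercase, like this:

    // Both boundary letters are captured; a letter outside the parentheses
    // would be consumed by the split and dropped from the result.
    // PREG_SPLIT_DELIM_CAPTURE to also return bracketed things
    $parts = preg_split('/([a-z])([A-Z])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
    var_dump($parts);

The annoying part is that you will have to rebuild the words from the corresponding items in $parts.

Hope this helps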
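One possible way to do that rebuilding is sketched below. It assumes a pattern that captures both boundary letters, '/([a-z])([A-Z])/', because with PREG_SPLIT_DELIM_CAPTURE only parenthesized parts of the delimiter are returned. The function name `rebuildWords` is hypothetical.

```php
<?php
// Hypothetical helper: split with both boundary letters captured, then
// stitch the pieces back into words.
function rebuildWords(string $string): array
{
    // Pieces arrive in a fixed rhythm:
    // text, lowercase letter, uppercase letter, text, lowercase letter, ...
    $parts = preg_split('/([a-z])([A-Z])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

    $words = [];
    $current = '';
    foreach ($parts as $i => $part) {
        if ($i % 3 === 2) {
            // The captured uppercase letter starts the next word.
            $words[] = $current;
            $current = $part;
        } else {
            // Plain text or the captured lowercase letter extends the word.
            $current .= $part;
        }
    }
    $words[] = $current;

    return $words;
}

print_r(rebuildWords('oneTwoThreeFour'));
```

For 'oneTwoThreeFour' this yields one, Two, Three, Four. Compared with the lookaround-based splits elsewhere in this thread, it is more work for the same result, which is the annoyance mentioned above.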

First of all, codaddict, thank you for your pattern, it helped a lot!

I needed a solution that also works when the single-letter word "a" is present:

e.g. thisIsACamelcaseSentence.

I found the solution by doing a two-step preg_match, and made a function with some options:

    /*
     * input: 'thisIsACamelCaseSentence' output: 'This Is A Camel Case Sentence'
     * options $case: 'allUpperCase'[default] >> 'This Is A Camel Case Sentence'
     *                'allLowerCase'          >> 'this is a camel case sentence'
     *                'firstUpperCase'        >> 'This is a camel case sentence'
     * @return: string
     */
    function camelCaseToWords($string, $case = null) {
        $case = isset($case) ? $case : 'allUpperCase';

        // Find first occurrences of two capitals
        preg_match_all('/((?:^|[A-Z])[A-Z]{1})/', $string, $twoCapitals);

        // Split them with the 'zzzzzz' string, e.g. 'AZ' turns into 'AzzzzzzZ'
        foreach ($twoCapitals[0] as $match) {
            $firstCapital = $match[0];
            $lastCapital = $match[1];
            $temp = $firstCapital.'zzzzzz'.$lastCapital;
            $string = str_replace($match, $temp, $string);
        }

        // Now split words
        preg_match_all('/((?:^|[A-Z])[a-z]+)/', $string, $words);

        $output = "";
        $i = 0;
        foreach ($words[0] as $word) {
            switch ($case) {
                case 'allUpperCase':
                    $word = ucfirst($word);
                    break;
                case 'allLowerCase':
                    $word = strtolower($word);
                    break;
                case 'firstUpperCase':
                    $word = ($i == 0) ? ucfirst($word) : strtolower($word);
                    break;
            }

            // Remove the 'zzzzzz' from a word if it has it
            $word = str_replace('zzzzzz', '', $word);
            $output .= $word." ";
            $i++;
        }
        return $output;
    }

Feel free to use it, and if there is an "easier" way to do this in one step, please comment!