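How do I split a string in Haskell?

Is there a standard way to split a string in Haskell?

The functions lines and words work great for splitting on a space or a newline, but surely there is a standard way to split on a comma? I couldn't find it on Hoogle.

To be specific, I'm looking for something where split "," "my,comma,separated,list" returns ["my","comma","separated","list"].

Thanks.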

There is a package for this called split.

 cabal install split 

Use it like this:

    ghci> import Data.List.Split
    ghci> splitOn "," "my,comma,separated,list"
    ["my","comma","separated","list"]

It comes with a lot of other functions for splitting on matching delimiters or on several delimiters.
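
As a rough illustration of those other functions, the sketch below exercises a few combinators exported by Data.List.Split (splitOneOf, splitWhen, endBy, wordsBy). It assumes the split package is installed as shown above; the sample strings and the name examples are made up for illustration.

```haskell
import Data.List.Split (endBy, splitOneOf, splitWhen, wordsBy)

-- Hypothetical sample inputs; each pair is (description, result).
examples :: [(String, [String])]
examples =
  [ ("splitOneOf: any of several delimiters", splitOneOf ",;" "a,b;c")
  , ("splitWhen: delimiter chosen by a predicate", splitWhen (== ',') "a,,b")
  , ("endBy: delimiter acts as a terminator", endBy ";" "a;b;")
  , ("wordsBy: like words, drops empty fields", wordsBy (== ',') ",a,,b,")
  ]
```

Running mapM_ print examples in GHCi shows each description next to its result.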

Remember that you can look up the definition of Prelude functions!

http://www.haskell.org/onlinereport/standard-prelude.html

Looking there, the definition of words is:

    words :: String -> [String]
    words s = case dropWhile Char.isSpace s of
      "" -> []
      s' -> w : words s''
        where (w, s'') = break Char.isSpace s'

So, change it into a function that takes a predicate:

    wordsWhen :: (Char -> Bool) -> String -> [String]
    wordsWhen p s = case dropWhile p s of
      "" -> []
      s' -> w : wordsWhen p s''
        where (w, s'') = break p s'

Then call it with whatever predicate you want!

 main = print $ wordsWhen (==',') "break,this,string,at,commas" 
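
Because the predicate is an ordinary function, the same definition covers several delimiters at once. The sketch below repeats wordsWhen so that it stands alone and adds two helpers whose names are hypothetical. Note that, like words, this approach collapses runs of delimiters, so empty fields are dropped.

```haskell
import Data.Char (isAlphaNum)

-- wordsWhen as defined above, repeated so this block is self-contained.
wordsWhen :: (Char -> Bool) -> String -> [String]
wordsWhen p s = case dropWhile p s of
  "" -> []
  s' -> w : wordsWhen p s''
    where (w, s'') = break p s'

-- Hypothetical helper: split on either a comma or a semicolon.
splitCommasOrSemicolons :: String -> [String]
splitCommasOrSemicolons = wordsWhen (`elem` ",;")

-- Hypothetical helper: treat every non-alphanumeric character as a delimiter.
alphaNumWords :: String -> [String]
alphaNumWords = wordsWhen (not . isAlphaNum)
```

For instance, splitCommasOrSemicolons "a,b;c,,d" yields ["a","b","c","d"]: the empty field between the two commas is gone.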

If you use Data.Text, there is splitOn:

http://hackage.haskell.org/packages/archive/text/0.11.2.0/doc/html/Data-Text.html#v:splitOn

This is built into the Haskell Platform.

For example:

    import qualified Data.Text as T

    main = print $ T.splitOn (T.pack " ") (T.pack "this is a test")

Or:

    {-# LANGUAGE OverloadedStrings #-}

    import qualified Data.Text as T

    main = print $ T.splitOn " " "this is a test"

In the module Text.Regex (part of the Haskell Platform), there is a function:

 splitRegex :: Regex -> String -> [String] 

which splits a string based on a regular expression. The API can be found on Hackage.

Use Data.List.Split, which comes from the split package:

    [me@localhost]$ ghci
    Prelude> import Data.List.Split
    Prelude Data.List.Split> let l = splitOn "," "1,2,3,4"
    Prelude Data.List.Split> :t l
    l :: [[Char]]
    Prelude Data.List.Split> l
    ["1","2","3","4"]
    Prelude Data.List.Split> let { convert :: [String] -> [Integer]; convert = map read }
    Prelude Data.List.Split> let l2 = convert l
    Prelude Data.List.Split> :t l2
    l2 :: [Integer]
    Prelude Data.List.Split> l2
    [1,2,3,4]
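
One caveat with map read is that read throws a runtime error when a field is not a valid number. A minimal sketch of a safer conversion follows, using readMaybe from Text.Read in base. The name convertSafe is hypothetical, and the input is assumed to have been split into fields already.

```haskell
import Text.Read (readMaybe)

-- Hypothetical safer counterpart of convert:
-- returns Nothing if any single field fails to parse as an Integer.
convertSafe :: [String] -> Maybe [Integer]
convertSafe = traverse readMaybe
```

With this, convertSafe ["1","2","3","4"] gives Just [1,2,3,4], while convertSafe ["1","x"] gives Nothing instead of crashing.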

Try this one:

    import Data.List (unfoldr)

    separateBy :: Eq a => a -> [a] -> [[a]]
    separateBy chr = unfoldr sep
      where
        sep [] = Nothing
        sep l  = Just . fmap (drop 1) . break (== chr) $ l

It only works for a single char, but should be easily extendable.
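
One possible way to extend it to a multi-element delimiter is sketched below. The name separateByList is hypothetical, and the sketch assumes a non-empty delimiter (an empty one would never consume input). Like separateBy, it drops an empty field after a trailing delimiter.

```haskell
import Data.List (isPrefixOf, unfoldr)

-- Hypothetical extension of separateBy to a delimiter that is itself a list.
-- Assumption: delim is non-empty.
separateByList :: Eq a => [a] -> [a] -> [[a]]
separateByList delim = unfoldr sep
  where
    sep [] = Nothing
    sep l  = Just (go l)
    -- go returns (next field, remaining input after the delimiter)
    go [] = ([], [])
    go xs@(y : ys)
      | delim `isPrefixOf` xs = ([], drop (length delim) xs)
      | otherwise             = let (w, rest) = go ys in (y : w, rest)
```

For example, separateByList ", " "a, b, c" yields ["a","b","c"].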

    split :: Eq a => a -> [a] -> [[a]]
    split d [] = []
    split d s = x : split d (drop 1 y)
      where (x, y) = span (/= d) s

E.g.

    split ';' "a;bb;ccc;;d"
    > ["a","bb","ccc","","d"]

A single trailing delimiter will be dropped:

    split ';' "a;bb;ccc;;d;"
    > ["a","bb","ccc","","d"]
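
If the empty field after a trailing delimiter matters (as it often does for CSV-like data), a small variant can keep it. This is only a sketch with a hypothetical name, splitKeepTrailing. Note that it also differs from the version above on empty input, where it returns one empty field rather than no fields.

```haskell
-- Hypothetical variant of split: every delimiter ends a field,
-- so a trailing delimiter produces a final empty field.
splitKeepTrailing :: Eq a => a -> [a] -> [[a]]
splitKeepTrailing d s = case break (== d) s of
  (x, [])       -> [x]                          -- no delimiter left: last field
  (x, _ : rest) -> x : splitKeepTrailing d rest -- drop the delimiter and continue
```

Here splitKeepTrailing ';' "a;bb;ccc;;d;" yields ["a","bb","ccc","","d",""].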

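I don't know how to add a comment to Steve's answer, but I would like to recommend the GHC libraries documentation, and in there specifically the sublist functions in Data.List, which are much better as a reference than just reading the plain Haskell report.

Generically, a fold with a rule on when to start a new sublist should solve it too.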
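
A minimal sketch of that fold-based idea, assuming a single-element delimiter: the accumulator is a non-empty list of sublists, and each delimiter opens a fresh one. The name splitFold is hypothetical. Unlike words, this version keeps empty fields.

```haskell
-- Fold-based sketch: start a new sublist whenever the delimiter is seen.
-- Keeps empty fields, so an empty input yields one empty field.
splitFold :: Eq a => a -> [a] -> [[a]]
splitFold d = foldr step [[]]
  where
    step c acc@(cur : done)
      | c == d    = [] : acc          -- delimiter: open a fresh sublist
      | otherwise = (c : cur) : done  -- ordinary element: extend the current sublist
    step _ []     = [[]]              -- unreachable: the accumulator is never empty
```

For example, splitFold ',' "a,,b," yields ["a","","b",""].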

I started learning Haskell yesterday, so correct me if I'm wrong, but:

    split :: Eq a => a -> [a] -> [[a]]
    split x y = func x y [[]]
      where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) =
          if y == x
            then func x ys ([]:(z:zs))
            else func x ys ((y:z):zs)

gives:

    *Main> split ' ' "this is a test"
    ["this","is","a","test"]

or maybe you wanted

    *Main> splitWithStr " and " "this and is and a and test"
    ["this","is","a","test"]

which would be:

    splitWithStr :: Eq a => [a] -> [a] -> [[a]]
    splitWithStr x y = func x y [[]]
      where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) =
          if (take (length x) (y:ys)) == x
            then func x (drop (length x) (y:ys)) ([]:(z:zs))
            else func x ys ((y:z):zs)

Example in ghci:

    > import qualified Text.Regex as R
    > R.splitRegex (R.mkRegex "x") "2x3x777"
    > ["2","3","777"]

In addition to the efficient, pre-built functions given in the other answers, I'll add my own, which are simply part of a repertoire of Haskell functions I wrote while learning the language:

    -- Correct but inefficient implementation
    wordsBy :: String -> Char -> [String]
    wordsBy s c = reverse (go s [])
      where
        go s' ws = case (dropWhile (\c' -> c' == c) s') of
          ""  -> ws
          rem -> go ((dropWhile (\c' -> c' /= c) rem)) ((takeWhile (\c' -> c' /= c) rem) : ws)

    -- Breaks up by predicate function to allow for more complex conditions (\c -> c == ',' || c == ';')
    wordsByF :: String -> (Char -> Bool) -> [String]
    wordsByF s f = reverse (go s [])
      where
        go s' ws = case ((dropWhile (\c' -> f c')) s') of
          ""  -> ws
          rem -> go ((dropWhile (\c' -> (f c') == False)) rem) (((takeWhile (\c' -> (f c') == False)) rem) : ws)

The solutions are at least tail-recursive, so they won't incur a stack overflow.