Finding local maxima and minima

I'm looking for a computationally efficient way to find the local maxima/minima of a large list of numbers in R, hopefully without for loops…

For example, if I have a data file like 1 2 3 2 1 1 2 1, I want the function to return 3 and 7, which are the positions of the local maxima.

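diff(diff(x)) (or diff(x, differences=2): thanks to @ZheyuanLi) essentially computes the discrete analogue of the second derivative, so it should be negative at local maxima. The +1 below accounts for the fact that the result of diff is shorter than the input vector.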

Edit: added @Tommy's correction for cases where delta-x is not 1…

    tt <- c(1,2,3,2,1, 1, 2, 1)
    which(diff(sign(diff(tt)))==-2)+1
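To make the one-liner easier to follow, here is a small sketch that breaks it into its intermediate steps, using the vector from the question. The variable names d, s, dd and peaks are introduced only for illustration.

```r
tt <- c(1, 2, 3, 2, 1, 1, 2, 1)

d  <- diff(tt)   # first differences:            1  1 -1 -1  0  1 -1
s  <- sign(d)    # direction of each step:       1  1 -1 -1  0  1 -1
dd <- diff(s)    # change in direction per step: 0 -2  0  1  1 -2

# A value of -2 means the series went from rising (+1) straight to
# falling (-1). Adding 1 realigns the result with tt, because each
# diff() call shortens the vector by one element.
peaks <- which(dd == -2) + 1
peaks
```

On this input the result is positions 3 and 7, as requested in the question. The flat step between positions 5 and 6 shows up as a 0 in s, which is why plateaus never produce a -2.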

My suggestion above ( http://finzi.psych.upenn.edu/R/library/ppc/html/ppc.peaks.html ) is intended for cases where the data are noisier.

@Ben's solution is pretty sweet. It doesn't handle the following cases, though:

    # all these return numeric(0):
    x <- c(1,2,9,9,2,1,1,5,5,1) # duplicated points at maxima
    which(diff(sign(diff(x)))==-2)+1
    x <- c(2,2,9,9,2,1,1,5,5,1) # duplicated points at start
    which(diff(sign(diff(x)))==-2)+1
    x <- c(3,2,9,9,2,1,1,5,5,1) # start is maxima
    which(diff(sign(diff(x)))==-2)+1

Here's a more robust (and slower, uglier) version:

    localMaxima <- function(x) {
      # Use -Inf instead if x is numeric (non-integer)
      y <- diff(c(-.Machine$integer.max, x)) > 0L
      rle(y)$lengths
      y <- cumsum(rle(y)$lengths)
      y <- y[seq.int(1L, length(y), 2L)]
      if (x[[1]] == x[[2]]) {
        y <- y[-1]
      }
      y
    }

    x <- c(1,2,9,9,2,1,1,5,5,1)
    localMaxima(x) # 3, 8
    x <- c(2,2,9,9,2,1,1,5,5,1)
    localMaxima(x) # 3, 8
    x <- c(3,2,9,9,2,1,1,5,5,1)
    localMaxima(x) # 1, 3, 8

Use the zoo library function rollapply:

    x <- c(1, 2, 3, 2, 1, 1, 2, 1)
    library(zoo)
    xz <- as.zoo(x)
    rollapply(xz, 3, function(x) which.min(x)==2)
    #    2     3     4     5     6     7
    #FALSE FALSE FALSE  TRUE FALSE FALSE
    rollapply(xz, 3, function(x) which.max(x)==2)
    #    2     3     4     5     6     7
    #FALSE  TRUE FALSE FALSE FALSE  TRUE

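Then pull the index using 'coredata' for those values where 'which.max' is a "center value", signalling a local maximum. You could obviously do the same for local minima using which.min instead of which.max.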

    rxz <- rollapply(xz, 3, function(x) which.max(x)==2)
    index(rxz)[coredata(rxz)]
    #[1] 3 7

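I am assuming you do not want the starting or ending values, but if you do, you could pad the ends of your vectors before processing, rather like telomeres do on chromosomes.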
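The padding idea can be sketched without any extra package. This example swaps zoo's rollapply for base R's embed(), which builds the same width-3 windows as matrix rows. Padding with -Inf is an assumption on my part (any value lower than the data would do), and the sample vector is made up for illustration.

```r
# Made-up data whose first and last values are both local maxima
x <- c(3, 2, 3, 2, 1, 1, 2, 4)

# Pad both ends so every original element has a left and right neighbour
xp <- c(-Inf, x, -Inf)

# embed() returns one row per window, in reverse order:
# column 1 = right neighbour, column 2 = centre, column 3 = left neighbour
w <- embed(xp, 3)

# Same tie-breaking as which.max(window) == 2: the centre must be strictly
# greater than the left neighbour and at least as large as the right one
is_max <- w[, 2] > w[, 3] & w[, 2] >= w[, 1]
maxima <- which(is_max)
maxima
```

Because one padding value was added on the left, row i of the window matrix is centred on x[i], so no index shift is needed. Here the result is 1, 3 and 8, which includes both ends.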

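(I'm noting the ppc package ("Peak Probability Contrasts", for doing mass spectrometry analyses) simply because I was unaware of its availability until reading @BenBolker's comment above, and I think adding these few words will increase the chances that someone with a mass-spec interest will see this in a search.)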

There are some good solutions provided, but it depends on what you need.

Just diff(tt) returns the differences.

You want to detect when you go from increasing values to decreasing values. One way to do this is provided by @Ben:

  diff(sign(diff(tt)))==-2

The problem here is that this will only detect changes that go immediately from strictly increasing to strictly decreasing.

A slight change will allow for repeated values at the peak (returning TRUE for the last occurrence of the peak value):

  diff(diff(x)>=0)<0

Then, you simply need to properly pad the front and back if you want to detect maxima at the beginning or end of the vector.
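As a rough illustration of the two points above, the sketch below compares the unpadded form with a padded one. The vector is the "start is maxima" case from the earlier answer, and the names rising, interior and with_ends are mine.

```r
x <- c(3, 2, 9, 9, 2, 1, 1, 5, 5, 1)

# TRUE wherever the next value is not smaller than the current one
rising <- diff(x) >= 0

# Interior peaks only: a switch from "not falling" to "falling".
# The +1 compensates for the element lost by the outer diff().
interior <- which(diff(rising) < 0) + 1

# Pad with TRUE in front and FALSE behind, so the vector is treated as
# rising into its first element and falling away from its last one.
# The padded vector is one longer than x, so no +1 is needed.
with_ends <- which(diff(c(TRUE, rising, FALSE)) < 0)

interior
with_ends
```

Here interior gives 4 and 9 (the last element of each plateau), while with_ends also reports position 1.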

Here's everything wrapped in a function (including finding valleys):

    which.peaks <- function(x,partial=TRUE,decreasing=FALSE){
      if (decreasing){
        if (partial){
          which(diff(c(FALSE,diff(x)>0,TRUE))>0)
        }else {
          which(diff(diff(x)>0)>0)+1
        }
      }else {
        if (partial){
          which(diff(c(TRUE,diff(x)>=0,FALSE))<0)
        }else {
          which(diff(diff(x)>=0)<0)+1
        }
      }
    }

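I had some trouble getting the locations to work with the previous solutions, so I came up with a way to grab the minima and maxima directly. The code below does this and plots the result, marking the minima in red and the maxima in green. Unlike the which.max() function, this pulls all the indices of the minima/maxima out of a data frame. A zero is added inside the first diff() call to account for the shortened result that diff() returns each time it is used. Inserting it into the innermost diff() call avoids having to add an offset outside the logical expression. It doesn't matter much, but I feel it's a cleaner way to do it.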

    # create example data called stockData
    stockData = data.frame(x = 1:30, y=rnorm(30,7))

    # get the location of the minima/maxima. note the added zero offsets
    # the location to get the correct indices
    min_indexes = which(diff( sign(diff( c(0,stockData$y)))) == 2)
    max_indexes = which(diff( sign(diff( c(0,stockData$y)))) == -2)

    # get the actual values where the minima/maxima are located
    min_locs = stockData[min_indexes,]
    max_locs = stockData[max_indexes,]

    # plot the data and mark minima with red and maxima with green
    plot(stockData$y, type="l")
    points( min_locs, col="red", pch=19, cex=1 )
    points( max_locs, col="green", pch=19, cex=1 )
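A small deterministic check may help when comparing the two offset styles (padding inside the innermost diff() versus adding 1 afterwards). The values below are made up for illustration. One thing this sketch suggests is that the padded 0 also acts as an implicit left neighbour for the first element, so the two styles can disagree at position 1.

```r
# Illustrative values only (not from the original post)
y <- c(5, 3, 4, 8, 6)

# Style 1: pad inside the innermost diff(), no "+ 1" needed
max_padded <- which(diff(sign(diff(c(0, y)))) == -2)
min_padded <- which(diff(sign(diff(c(0, y)))) == 2)

# Style 2: no padding, shift the result by one instead
max_shifted <- which(diff(sign(diff(y))) == -2) + 1
min_shifted <- which(diff(sign(diff(y))) == 2) + 1

max_padded   # 1 and 4: position 1 is flagged because y[1] > 0 and y[2] < y[1]
max_shifted  # 4 only
min_padded   # 2
min_shifted  # 2
```

For interior points both styles agree. Whether position 1 should count depends on how the first value compares with the padding value, so for data that can be negative or start below zero a different pad may be worth considering.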

Here's the solution for minima:

@Ben's solution:

    x <- c(1,2,3,2,1,2,1)
    which(diff(sign(diff(x)))==+2)+1 # 5

Please note the cases described in Tommy's post!

@Tommy's solution:

    localMinima <- function(x) {
      # Use Inf instead if x is numeric (non-integer)
      y <- diff(c(.Machine$integer.max, x)) > 0L
      rle(y)$lengths
      y <- cumsum(rle(y)$lengths)
      y <- y[seq.int(1L, length(y), 2L)]
      if (x[[1]] == x[[2]]) {
        y <- y[-1]
      }
      y
    }

    x <- c(1,2,9,9,2,1,1,5,5,1)
    localMinima(x) # 1, 7, 10
    x <- c(2,2,9,9,2,1,1,5,5,1)
    localMinima(x) # 7, 10
    x <- c(3,2,9,9,2,1,1,5,5,1)
    localMinima(x) # 2, 7, 10

Please note: neither localMaxima nor localMinima can handle duplicated maxima/minima at the start!
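One possible way around that limitation is to collapse runs of equal values with rle() before comparing neighbours. The sketch below is not from either answer above: the function name plateau_maxima is hypothetical, and the choices to count the ends as peaks and to report the first index of each plateau are assumptions.

```r
# Hypothetical helper: local maxima with plateaus collapsed via rle().
# Returns the first index of each peak plateau; ends count as peaks.
plateau_maxima <- function(x) {
  r <- rle(x)
  v <- r$values
  n <- length(v)
  # Compare each run with the runs before and after it (-Inf at the ends)
  is_peak <- c(-Inf, v[-n]) < v & c(v[-1], -Inf) < v
  # Starting position of each run in the original vector
  starts <- cumsum(c(1, r$lengths[-n]))
  starts[is_peak]
}

plateau_maxima(c(1, 2, 9, 9, 2, 1, 1, 5, 5, 1))  # 3 8
plateau_maxima(c(3, 2, 9, 9, 2, 1, 1, 5, 5, 1))  # 1 3 8
plateau_maxima(c(9, 9, 2, 1, 1, 5, 5, 1))        # 1 6
```

The last call is the case the note above warns about: a duplicated maximum at the start is reported at position 1. Swapping -Inf for Inf and < for > would give the corresponding sketch for minima.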

The answer by @42- is great, but I had a use case where I didn't want to use zoo. It's easy to implement this with dplyr using lag and lead:

    library(dplyr)
    # tibble() replaces the deprecated data_frame()
    test = tibble(x = sample(1:10, 20, replace = TRUE))
    mutate(test, local.minima = if_else(lag(x) > x & lead(x) > x, TRUE, FALSE))

Like the rollapply solution, you can control the window size and the edge cases through the lag/lead arguments n and default, respectively.
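As a minimal sketch of the edge-case control, the example below uses the vector from the question and sets default = Inf so that the first and last elements are compared against an infinitely large neighbour. The choice of Inf and the names res_na and res_edges are assumptions for illustration; it requires dplyr to be installed.

```r
library(dplyr)

test <- tibble(x = c(1, 2, 3, 2, 1, 1, 2, 1))

# Without a default, lag()/lead() return NA at the edges,
# so the first and last rows can never be flagged
res_na <- mutate(test, local.minima = lag(x) > x & lead(x) > x)

# With default = Inf the missing neighbour counts as "larger",
# so an edge value only has to beat its one real neighbour
res_edges <- mutate(
  test,
  local.minima = lag(x, default = Inf) > x & lead(x, default = Inf) > x
)

which(res_na$local.minima)     # no positions
which(res_edges$local.minima)  # 1 and 8
```

The strict comparison means the flat stretch at positions 5 and 6 is not reported in either case. Passing n = 2 to lag and lead would compare against values two steps away instead of the immediate neighbours.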

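I posted this elsewhere, but I think it's an interesting way to go about it. I'm not sure how computationally efficient it is, but it's a very concise way of solving the problem.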

    vals=rbinom(1000,20,0.5)
    text=paste0(substr(format(diff(vals),scientific=TRUE),1,1),collapse="")
    sort(na.omit(c(gregexpr('[ ]-',text)[[1]]+1,ifelse(grepl('^-',text),1,NA),
      ifelse(grepl('[^-]$',text),length(vals),NA))))

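I took a stab at this today. I know you said hopefully without for loops, but I stuck with using the apply function. It's fairly compact and fast, and it allows a threshold to be specified, so you can look more than 1 step to either side.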

The function:

    inflect <- function(x, threshold = 1){
      up   <- sapply(1:threshold, function(n) c(x[-(seq(n))], rep(NA, n)))
      down <- sapply(-1:-threshold, function(n) c(rep(NA,abs(n)), x[-seq(length(x), length(x) - abs(n) + 1)]))
      a    <- cbind(x,up,down)
      list(minima = which(apply(a, 1, min) == a[,1]), maxima = which(apply(a, 1, max) == a[,1]))
    }

To visualize it and play with thresholds, you can run the following code:

    # Pick a desired threshold # to plot up to
    n <- 2
    # Generate Data
    randomwalk <- 100 + cumsum(rnorm(50, 0.2, 1)) # climbs upwards most of the time
    bottoms <- lapply(1:n, function(x) inflect(randomwalk, threshold = x)$minima)
    tops <- lapply(1:n, function(x) inflect(randomwalk, threshold = x)$maxima)
    # Color functions
    cf.1 <- grDevices::colorRampPalette(c("pink","red"))
    cf.2 <- grDevices::colorRampPalette(c("cyan","blue"))
    plot(randomwalk, type = 'l', main = "Minima & Maxima\nVariable Thresholds")
    for(i in 1:n){
      points(bottoms[[i]], randomwalk[bottoms[[i]]], pch = 16, col = cf.1(n)[i], cex = i/1.5)
    }
    for(i in 1:n){
      points(tops[[i]], randomwalk[tops[[i]]], pch = 16, col = cf.2(n)[i], cex = i/1.5)
    }
    legend("topleft", legend = c("Minima",1:n,"Maxima",1:n), pch = rep(c(NA, rep(16,n)), 2),
           col = c(1, cf.1(n),1, cf.2(n)), pt.cex = c(rep(c(1, c(1:n) / 1.5), 2)), cex = .75, ncol = 2)
