I have data that involves a success/fail variable at many different locations with x-y coordinates (integers between 0 and 80 for both). I want to model the expected probability of success at a particular location. I can do this pretty easily using plyr:
sucprop <- ddply(df, .(xcrd,ycrd), function(x) data.frame(obs=nrow(x),prop=mean(x$success)))
This gives me the proportion of successes at each coordinate. The success rate at one point should be similar to that at nearby points, so I was wondering how I can best take the average of successes over points within +-5 in both the x and y direction.
So for the (25,50) point I would take the average of observations within (20-30,45-55).
What is the best way to do this? Can I input this straight into .variables in ddply, or do I have to work out some sort of rolling index?
You want a rolling mean of sorts. Here's one way with sapply:
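# Example data: 10 random locations with a 0/1 success flag
ff <- data.frame(x = rnorm(10, 40, 5), y = rnorm(10, 50, 7), success = rbinom(10, 1, .4))

# Mean success of all rows within +-5 of row q in both x and y
newmean <- function(q) {
  xmax <- ff[q, "x"] + 5
  xmin <- ff[q, "x"] - 5
  ymax <- ff[q, "y"] + 5
  ymin <- ff[q, "y"] - 5
  k <- ff[ff$x < xmax & ff$x > xmin & ff$y < ymax & ff$y > ymin, "success"]
  mean(k)
}

# Apply to every row and store the result as a new column
ff$neighborhood_prob <- sapply(X = 1:nrow(ff), newmean)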
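Since the coordinates in the question are integers and the window there is meant to include its edges (20-30, 45-55), here is a rough variant of the same idea with inclusive bounds. It is only a sketch: the toy data and the names window_mean, r, and pts are made up for illustration, while the column names xcrd, ycrd, and success are taken from the question.

```r
# Hypothetical toy data on an integer grid, using the question's column names
set.seed(1)
df <- data.frame(
  xcrd    = sample(0:80, 500, replace = TRUE),
  ycrd    = sample(0:80, 500, replace = TRUE),
  success = rbinom(500, 1, 0.4)
)

# Mean success over all observations within +-r of (x0, y0), edges included
window_mean <- function(x0, y0, data, r = 5) {
  near <- abs(data$xcrd - x0) <= r & abs(data$ycrd - y0) <= r
  mean(data$success[near])
}

# One smoothed estimate per observed coordinate pair
pts <- unique(df[, c("xcrd", "ycrd")])
pts$neighborhood_prob <- mapply(window_mean, pts$xcrd, pts$ycrd,
                                MoreArgs = list(data = df))
```

This averages the raw observations in the window rather than the per-coordinate proportions, so coordinates with more observations carry more weight. For a point that has no observations of its own, window_mean can still be called directly, but it returns NaN if the window is empty.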