Rank correlation matrix in R
How can I produce a rank correlation matrix in an elegant way in R, given a data frame with many columns? I couldn't find a built-in function, so I tried
> test=data.frame(x=c(1,2,3,4,5), y=c(5,4,3,2,1))
> cor(rank(test))
(only 2 columns for simplicity; the real data has 5 columns), but that gave
> Error in cor(rank(test)) : supply both 'x' and 'y' or a matrix-like 'x'
I figured this is because rank takes a single vector. So I tried
> cor(lapply(test,rank))
so that rank is applied to each column of the data frame, treating the data frame as a list, but that gave the error
> supply both 'x' and 'y' or a matrix-like 'x'
and I ended up getting it working with
> cor(data.frame(lapply(test,rank)))
   x  y
x  1 -1
y -1  1
However, this seems pretty verbose and ugly. I'm thinking there must be a better way. If so, what is it?
You are doing it wrong: use the "kendall" method argument of cor() instead:
R> testdf <- data.frame(x=c(1,2,3,4,5), y=c(5,4,3,2,1))
R> cor(testdf, method="kendall")
   x  y
x  1 -1
y -1  1
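The same call should work unchanged on a wider data frame, such as the five-column case mentioned in the question. A minimal sketch, assuming a made-up numeric data frame (the name `wide`, its column names, and its values are invented for illustration and are not from the question):

```r
# Hypothetical five-column data frame; names and values are invented.
set.seed(42)
wide <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20),
                   d = rnorm(20), e = rnorm(20))

# A single call returns the full 5 x 5 rank correlation matrix,
# with rows and columns labelled by the data frame's column names.
tau <- cor(wide, method = "kendall")
dim(tau)
```

No explicit call to rank() is needed, because cor() handles the ranking internally for the rank-based methods.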
From help(cor):

For cor(), if method is "kendall" or "spearman", Kendall's tau or Spearman's rho statistic is used to estimate a rank-based measure of association. These are more robust and have been recommended if the data do not necessarily come from a bivariate normal distribution. For cov(), a non-Pearson method is unusual but available for the sake of completeness. Note that "spearman" basically computes cor(R(x), R(y)) (or cov(., .)) where R(u) := rank(u, na.last = "keep"). In the case of missing values, the ranks are calculated depending on the value of use, either based on complete observations, or based on pairwise completeness with reranking for each pair.
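Going by the help text quoted above, the rank-then-correlate approach from the question corresponds to the "spearman" method, while "kendall" is a different statistic that happens to agree on the perfectly reversed example. A small sketch to check this, assuming a slightly shuffled variant of the example data (the `y` values here are changed for illustration so that the two statistics differ):

```r
# Variant of the example data; y is shuffled so the result is not exactly +/-1.
test <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2, 1, 4, 3, 5))

# Manual approach in the spirit of the question: rank each column,
# then take the ordinary (Pearson) correlation of the ranks.
manual <- cor(sapply(test, rank))

# Built-in rank correlations.
rho <- cor(test, method = "spearman")
tau <- cor(test, method = "kendall")

all.equal(manual, rho)   # the manual result matches Spearman's rho
rho["x", "y"]            # 0.8
tau["x", "y"]            # 0.6
```

So if the goal is to reproduce cor(data.frame(lapply(test, rank))) exactly, method="spearman" is probably the closer match. method="kendall" is still a valid rank correlation, but its values can differ.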