R - Merge data frames for Cohen's kappa
I'm trying to analyze data using R, but I'm not familiar with R (yet), so I'm totally stuck.
What I'm trying to do is manipulate the input data so that I can use it to calculate Cohen's kappa. The problem is that rater_2 has given several ratings for each item, and I need to select one. If rater_2 has given the same rating to an item as rater_1, that rating should be chosen; if not, any rating from the list can be used.
I tried

    unique(merge(rater_1, rater_2, all.x=TRUE))

which brings me close, but if the ratings of the two raters diverge, only one should be kept.
So my question is: how do I get from

    item rating_1
    1    3
    2    5
    3    4

    item rating_2
    1    2
    1    3
    2    4
    2    1
    2    2
    3    4
    3    2

to

    item rating_1 rating_2
    1    3        3
    2    5        4
    3    4        4

?
There are fancy ways to do this, but I thought it might be helpful to combine a few basic techniques to accomplish the task. Usually, in your question, you should include an easy way to generate the data, like this:
    # Create a sample data set.
    set.seed(1)
    id <- rep(1:50)
    rater_1 <- sample(1:5, 50, replace=TRUE)
    df1 <- data.frame(id, rater_1)
    id <- rep(1:50, each=2)
    rater_2 <- sample(1:5, 100, replace=TRUE)
    df2 <- data.frame(id, rater_2)
Now, here is one simple technique for doing this.
    # Merge the data frames.
    all.merged <- merge(df1, df2)
    #   id rater_1 rater_2
    # 1  1       2       3
    # 2  1       2       5
    # 3  2       2       3
    # 4  2       2       2
    # 5  3       3       1
    # 6  3       3       1

    # Find the ones that are equal.
    same.rating <- all.merged[all.merged$rater_2 == all.merged$rater_1, ]
    # Consider id 44: it matches twice.
    # Remove the duplicates.
    same.rating <- same.rating[!duplicated(same.rating), ]
    # Find the ones that never matched.
    not.same.rating <- all.merged[!(all.merged$id %in% same.rating$id), ]
    # Pick one. I chose to pick the maximum.
    picked.rating <- aggregate(rater_2 ~ id + rater_1, not.same.rating, max)
    # Stick the two together.
    result <- rbind(same.rating, picked.rating)
    result <- result[order(result$id), ] # Sort
    #     id rater_1 rater_2
    # 27   1       2       5
    # 4    2       2       2
    # 33   3       3       1
    # 44   4       5       3
    # 281  5       2       4
    # 11   6       5       5
A fancy way to do this:
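    same.or.random <- function(x) {
      # Indices of the rows where both raters agree.
      matched <- which(x$rater_1 == x$rater_2)
      if (length(matched) > 0) x[matched[1], ]  # keep one agreeing row
      else x[sample(nrow(x), 1), ]              # otherwise pick a row at random
    }
    merged <- merge(df1, df2)
    do.call(rbind, by(merged, merged$id, same.or.random))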
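To tie this back to the original goal, here is a minimal sketch that applies the same idea to the small example from the question and then computes Cohen's kappa by hand in base R. The helper names `pick.one` and `cohens_kappa` are made up for illustration and are not from any package. The fallback takes the maximum rating (as in the simple technique above) so the result is deterministic. The kappa formula used is the standard unweighted one, (p_o - p_e) / (1 - p_e).

```r
# Toy data from the question (column names follow the question).
rater_1 <- data.frame(item = 1:3, rating_1 = c(3, 5, 4))
rater_2 <- data.frame(item = c(1, 1, 2, 2, 2, 3, 3),
                      rating_2 = c(2, 3, 4, 1, 2, 4, 2))

# Hypothetical helper: keep a rating_2 that agrees with rating_1 if there
# is one, otherwise fall back to the largest rating_2 for that item.
pick.one <- function(x) {
  matched <- which(x$rating_1 == x$rating_2)
  if (length(matched) > 0) x[matched[1], ] else x[which.max(x$rating_2), ]
}

merged <- merge(rater_1, rater_2)  # joins on the shared column "item"
picked <- do.call(rbind, by(merged, merged$item, pick.one))

# Hypothetical helper: unweighted Cohen's kappa for two rating vectors.
cohens_kappa <- function(a, b) {
  lv  <- sort(unique(c(a, b)))                 # shared set of categories
  tab <- table(factor(a, levels = lv), factor(b, levels = lv))
  p   <- tab / sum(tab)                        # joint proportions
  po  <- sum(diag(p))                          # observed agreement
  pe  <- sum(rowSums(p) * colSums(p))          # agreement expected by chance
  (po - pe) / (1 - pe)                         # NaN if pe == 1
}

kappa.value <- cohens_kappa(picked$rating_1, picked$rating_2)
print(picked)
print(kappa.value)
```

On this toy data the selected ratings are 3, 4, 4 for rater 2, which matches the table asked for in the question. Keep in mind that preferring agreeing ratings inflates the observed agreement, so the resulting kappa is an upper-leaning estimate rather than a neutral one.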