One of our users need to find the max-value of a variable. He also needs to find the corresponding value in another variable.
As in – the maximum value in column A is in row 42. What is the value in column B, row 42.
And of course we need to do it for several groups.
Let us begin by making a dataset. Four groups in id,
library(tidyverse)
id <- 1:3
val <- c(10,20)
kor <- c("a", "b", "c")
example <- expand.grid(id,val) %>%
as_tibble() %>%
arrange(Var1) %>%
cbind(kor, stringsAsFactors=F) %>%
rename(group=Var1, value=Var2, corr = kor)
example
## group value corr ## 1 1 10 a ## 2 1 20 b ## 3 2 10 c ## 4 2 20 a ## 5 3 10 b ## 6 3 20 c
We have six observations, divided into three groups. They all have a value, and a letter in “corr” that is the corresponding value we are interested in.
So. In group 1 we should find the maximum value 20, and the corresponding value “b”.
In group 2 the max value is stil 20, but the corresponding value we are looking for is “a”.
And in group 3 the max value is yet again 20, but the corresponding value is now “c”.
How to do that?
example %>%
group_by(group) %>%
mutate(max=max(value)) %>%
mutate(max_corr=corr[(value==max)]) %>%
ungroup()
## # A tibble: 6 x 5 ## group value corr max max_corr ## <int> <dbl> <chr> <dbl> <chr> ## 1 1 10. a 20. b ## 2 1 20. b 20. b ## 3 2 10. c 20. a ## 4 2 20. a 20. a ## 5 3 10. b 20. c ## 6 3 20. c 20. c
The maximum value for all groups is 20. And the corresponding value to that in the groups is b, a and c respectively.
Isn’t there an easier solution using summarise function? Probably. But our user needs to do this for a lot of variables. And their names have nothing in common.