Sort a file containing a lot of strings alphabetically.

Calculate a value for each string: the sum of the alphabetical positions of its letters. For example:

“ABC” has the value 1 + 2 + 3 = 6
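The arithmetic can be checked directly in base R. This is only a small sketch: `word` and `positions` are names made up for illustration, and it assumes the strings contain upper-case A-Z only.

```r
# LETTERS is the built-in vector "A", "B", ..., "Z"
word <- "ABC"

# Split the word into single characters and look up each position in LETTERS
positions <- match(strsplit(word, "")[[1]], LETTERS)

positions       # 1 2 3
sum(positions)  # 6
```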

library(tidyverse)

Read the file, split on “,”, unlist, and store the result in a tibble. Then remove the double quotes and sort:

data <- readLines("../data/0022.txt") %>% 
  strsplit(",") %>% 
  unlist() %>% 
  as_tibble() %>% 
  mutate(value = str_remove_all(value, '"')) %>% 
  arrange(value) 
## Warning in readLines("../data/0022.txt"): incomplete final line found on '../data/0022.txt'

Write a function to calculate the value:

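positional_sum <- function(string){
  # Apply the calculation to each element, so a whole vector can be passed in
  sapply(string, function(s){
    # Split into single characters and look up each one's position in A-Z
    sum(match(strsplit(s, "")[[1]], LETTERS))
  })
}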

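This is a bit cumbersome, because I would like a vectorised version: strsplit() returns a list, so the calculation is wrapped in sapply() to handle a whole column at once.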
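One possible alternative, offered as a sketch rather than a definitive fix: `strsplit()` already accepts a whole character vector, so the list it returns can be passed straight to `vapply()`. The name `positional_sum_v` is hypothetical. It still loops internally, but it returns a plain, unnamed integer vector and states the expected result type explicitly.

```r
# Hypothetical alternative: split all strings at once, then sum per element
positional_sum_v <- function(strings) {
  vapply(
    strsplit(strings, ""),                       # list of character vectors
    function(chars) sum(match(chars, LETTERS)),  # position of each letter in A-Z
    integer(1)                                   # each result is a single integer
  )
}

positional_sum_v(c("ABC", "COLIN"))  # 6 53
```

Characters that are not upper-case letters give `NA` from `match()`, so the sum for such a string is `NA` rather than a silently wrong number.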

Now we add the positional sums to our data frame, multiply each by its position in the data frame (the row number), and add everything together:

answer <- data %>% 
  mutate(sum = positional_sum(value)) %>% 
  mutate(sum = sum*row_number()) %>% 
  summarise(answer = sum(sum))
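To sanity-check the logic without the data file, the same steps can be run on a tiny made-up input. This is a base R sketch, not the tidyverse pipeline above; the three names and all variable names are invented for illustration.

```r
# Made-up input; the real data comes from the file read earlier
names_vec <- c("BOB", "ALICE", "CAROL")

# Sort alphabetically (the radix method uses C-locale ordering)
sorted_names <- sort(names_vec, method = "radix")

# Letter sums: ALICE = 30, BOB = 19, CAROL = 49
letter_sums <- vapply(
  strsplit(sorted_names, ""),
  function(chars) sum(match(chars, LETTERS)),
  integer(1)
)

# Multiply each sum by its position in the sorted list and add up
scores <- letter_sums * seq_along(sorted_names)
total <- sum(scores)

total  # 30 * 1 + 19 * 2 + 49 * 3 = 215
```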