Sådan får du en Raspberry Pi model 3 til at boote Home Assistant fra SSD disk

Det skulle være utroligt enkelt. Nettet fortæller at en Raspberry Pi 3 (og 4) per automatik kan boote fra usb-devices. Og hvis ikke den kan, så skal man bare sætte et særligt sd-kort i, der piller ved en bit i et eller andet, og så virker det.

Det gør det bare ikke, og at dømme efter de frustrerede opslag på nettet, er det ikke kun mig der har problemet.

Så skal man bare have et SD-kort med filen bootcode.bin, og så kører det, får man at vide på nettet. Det gør det også. Hver tiende gang.

Det man gør er at man placerer hele boot-partitionen på sd-kortet. Og så kører det 9 ud af 10 gange. Derefter skriver man boot_delay=2 (hvor de 2 dækker over to sekunder, det kan være det virker med 1 for dig), i config.txt. Og så virker det! Hver gang!

Themes i R

Der er sikkert bedre ovesigter. Men det her er de jeg er faldet over, og godt vil kunne huske.

TV-themes: Farver, baggrund med videre. Tematiseret efter TV-shows. Brooklyn Nine-Nine, Game of Thrones, Rick & Morty, Simpsons og Parks & Rec. Blandt mange andre.

Wes Anderson: Knap så meget temaer, som farve paletter. Der matcher Wes Andersons film.

 

 

Project Euler 209

Project Euler 206, Concealed Square

Find the unique positive integer whose square has the form 1_2_3_4_5_6_7_8_9_0,
where each “_” is a single digit.

All right. The largest square we can get is:

largest <- 1929394959697989990

And the smallest:

smallest <- 1020304050607080900

That means, that the integer we are looking for is in this interval:

sqrt(smallest)
## [1] 1010101010.1010101
sqrt(largest)
## [1] 1389026623.1062636

We know that the square has to end on 0. That means, that the integer must also end on 0.

Any integer with 0 as the last digit, will give us a square that ends with 00.

From that we can conclude, that the square end with 900, and from that, that the integer must end with either 30 or 70.

That trims the number of integers we need to check significantly.

Although the range we are looking at, is reduced only a little:

1010101030 to 1389026670.

We are going to need a couple of libraries:

library(tibble)
library(dplyr)
result <- seq(1010101030,1389026670, by = 10) %>% 
  enframe() %>% 
  filter(value%%100 %in% c(30,70)) %>% 
  mutate(square = (value)**2) %>% 
  filter(square%/%10000000000000000%%10 == 2) %>% 
  filter(square%/%100000000000000%%10 == 3) %>% 
  filter(square%/%1000000000000%%10 == 4) %>% 
  filter(square%/%10000000000%%10 == 5) %>% 
  filter(square%/%100000000%%10 == 6) %>% 
  filter(square%/%1000000%%10 == 7) %>% 
  filter(square%/%10000%%10 == 8) %>% 
  select(value)

Not very elegant. A more detailed analysis might have given the result without having to write any code.

Errorbars on barcharts

Errorbars

An errorbar is a graphical indication of the standard deviation of a measurement. Or a confidence interval.

The mean of a measurement is something. We want to illustrate how confident we are of that value.

How do we plot that?

I am going to use three libraries:

library(ggplot2)
library(dplyr)
library(tibble)

ggplot2 for plotting, dplyr to make manipulating data easier. And tibble to support a single line later.

Then, we need some data.

data <- data.frame(
  name=letters[1:3],
  mean=sample(seq(4,15),3),
  sd=c(1,0.2,3)
)

I now have a dataframe with three variables, name, mean and standard deviation (sd).

How do we plot that?

Simple:

data %>% 
  ggplot() +
  geom_bar( aes(x=name, y=mean), stat="identity", fill="skyblue") +
  geom_errorbar(aes(x=name, ymin=mean-sd, ymax = mean+sd), width=0.1)

plot of chunk unnamed-chunk-3

So. What on earth was that?

The pipe-operator %>% takes whatever is on the left-hand-side, and inserts it as the first variable in whatever is on the right-hand-side.

What is on the right-hand-side is the ggplot function. That would normally have a datat=something as the first variable. Here that is the data we constructed earlier.

To that initial plot, which is completely empty, we add a geom_bar. That plots the bars on the plot. It takes an x-value, name, and a y-value, mean. And we tell the function, that rather than counting the number of observations of each x-value (the default behavior of geom_bar), it should use the y-values provided. We also want a nice lightblue color for the bars.

To that bar-chart, we now add errorbars. geom_errorbar needs to know the x- and y-values of the bars, in order to place them correctly. It also needs to know where to place the upper errorbar, and the lower errorbar. And we supply the information that ymin, the lower, should be the mean value minus the standard deviation. And the upper bar, ymax, the sum of the mean and sd. Finally we need to decide how broad those lines should be. We do that by writing “width=0.1”. We do not actually have to, but the default value results in an ugly plot.

And there you go, a barchart with errorbars!

Next step

That was all very nice. However! We do not usually have a nice dataframe with means and standard deviations calculated directly. More often, we have a dataframe like this:

mtcars %>% 
  remove_rownames() %>% 
  select(cyl, disp) %>% 
  head()
##   cyl disp
## 1   6  160
## 2   6  160
## 3   4  108
## 4   6  258
## 5   8  360
## 6   6  225

I’ll get back to what the code actually means later.

Here we have 32 observations (only 6 shown above), of the variables “cyl” and “disp”. I would now like to make a barplot of the mean value of disp for each of the three different values or groups in cyl (4,6 and 8). And add the errorbars.

You could scroll through all the data, sort them by cyl, manually count the number of observations in each group, add the disp, divide etc etc etc.

But there is a simpler way:

mtcars %>% 
  remove_rownames() %>% 
  select(cyl, disp) %>% 
  group_by(cyl) %>% 
  summarise(mean=mean(disp), sd=sd(disp))
## # A tibble: 3 x 3
##     cyl  mean    sd
##   <dbl> <dbl> <dbl>
## 1     4  105.  26.9
## 2     6  183.  41.6
## 3     8  353.  67.8

mtcars is a build-in dataset (cars in the us in 1974). I send that, using the pipe-operator, to the function remove_rownames, that does exactly that. We don’t need them, and they will just confuse us. That result is then send to the function select, that selects the two columns/variables cyl and disp, and discards the rest. Next, we group the data according to the value of cyl. There are three different values, 4, 6 and 8. And then we use the summarise function, to calculate the mean and the standard deviation of disp, for each of the three groups.

Now we should be ready to plot. We just send the result above to the plot function from before:

mtcars %>% 
  remove_rownames() %>% 
  select(cyl, disp) %>% 
  group_by(cyl) %>% 
  summarise(mean=mean(disp), sd=sd(disp)) %>% 
  ggplot() +
    geom_bar( aes(x=cyl, y=mean), stat="identity", fill="skyblue") +
    geom_errorbar(aes(x=cyl, ymin=mean-sd, ymax = mean+sd), width=0.1)

plot of chunk unnamed-chunk-6
All we need to remember is to change “name” in the original to “cyl”. All done!

But wait! There is more!!

Those errorbars can be shown in more than one way.

Let us start by saving our means and sds in a dataframe:

data <- mtcars %>% 
  remove_rownames() %>% 
  select(cyl, disp) %>% 
  group_by(cyl) %>% 
  summarise(mean=mean(disp), sd=sd(disp))

geom_crossbar results in this:

data %>% 
ggplot() +
  geom_bar( aes(x=cyl, y=mean), stat="identity", fill="skyblue") +
  geom_crossbar( aes(x=cyl, y=mean, ymin=mean-sd, ymax=mean+sd))

plot of chunk unnamed-chunk-8

I think it is ugly. But whatever floats your boat.

Then there is just a vertival bar, geom_linerange. I think it makes it a bit more difficult to compare the errorbars. On the other hand, it results in a plot that is a bit more clean:

data %>% ggplot() +
  geom_bar( aes(x=cyl, y=mean), stat="identity", fill="skyblue") +
  geom_linerange( aes(x=cyl, ymin=mean-sd, ymax=mean+sd))

plot of chunk unnamed-chunk-9

And here is geom_pointrange. The mean is shown as a point. This probably works best without the bars.

data %>% ggplot() +
  geom_bar( aes(x=cyl, y=mean), stat="identity", fill="skyblue", alpha=0.5) +
  geom_pointrange( aes(x=cyl, y=mean, ymin=mean-sd, ymax=mean+sd))

plot of chunk unnamed-chunk-10

Project Euler 5 – Smallest multiple

What is the smallest, positive, number that can be divided by all numbers from 1 to 20 without any remainder?

We are given that 2520 is the smallest that can be divided by all numbers from 1:10.

One number that can definitely be divided by all numbers from 1:20 is:

factorial(20)
## [1] 2.432902e+18

But given that

factorial(10)
## [1] 3628800

is rather larger than 2520, it is definitely not the answer.

The answer must be a multiple of all the primes smaller than 20. A number that is divisible by 15, will be divisible by
3 and 5.

The library “numbers” have a lot of useful functions. Primes(20) returns all primes smaller than 20, and prod() returns the product of all those primes

library(numbers)
prod(Primes(20))
## [1] 9699690

Could that be the answer?

What we are looking at is the modulo-operator. 9699690 modulo 2 – what is the remainder? We know that all the remainders, dividing by 1 to 20 must be 0.

prod(Primes(20)) %% 2
## [1] 0

And our large product is divisible by 2 without a remainder.

Thankfully the operator is vectorized, so we can do all the divisions in one go:

9699690 %% 1:20
##  [1]  0  0  0  2  0  0  0  2  3  0  0  6  0  0  0 10  0 12  0 10

Nope.

9699690 %% 4
## [1] 2

Leaves a remainder.

(2*9699690) %% 4
## [1] 0

Now I just need to find the number to multiply 9699690 with, in order for all the divisions to have a remainder of 0.
That is, change i in this code until the answer is true.

i <- 2
all((i*9699690) %% 1:20 == 0)
## [1] FALSE

Starting with 1*9699690, I test if all the remainders of the divisions by all numbers from 1 to 20 is zero.
As long as they are not, I increase i by 1, save i*9699690 as the answer, and test again.
If the test is TRUE, that is all the remainders are 0, the while-loop quits, and I have the answer.

i <- 1
while(!all((i*9699690) %% 1:20 == 0)){
 i <- i + 1
 answer <- i*9699690
}

Project Euler 4 – Largest palindrome product

I solved this more than four years ago. So I have no idea if this was the way I did it originally.

The problem is:

A palindromic number is a number that reads the same both ways. Like “A man, a plan, a canal, Panama!” just with numbers.

The problem informs us, that the largest palindrome made from the product of two two-digit numbers, is 9009.

We get that from 91 × 99. We must now find the largest palindrome, made from the product of two three-digit numbers.

First, we need to figure out how to test if a number is palindromic. There might be easier ways. But this is mine.

Convert the number to a string. Split it in individual characters reverse the string, and compare to the original.

Like this, using the stringr library:

library(stringr)
rev(unlist(str_split("9009", ""))) == unlist(str_split("9009", ""))
## [1] TRUE TRUE TRUE TRUE

Everything is TRUE, as it should be – lets make a function.

is_palindromic <- function(x){
  all(rev(unlist(str_split(x, ""))) == unlist(str_split(x, "")))
}

is_palindromic(9009)
## [1] TRUE

Only problem is that it is not vectorized

is_palindromic(c(991, 9009))
## [1] FALSE

Just for the fun of it, lets write something vectorized. This is definitely not the way I did it originally.

is_palindromic <- function(x){
  unlist(lapply(lapply(str_split(x,""), rev),str_c, collapse="")) == as.character(x)
}

is_palindromic(c(991, 9009))
## [1] FALSE  TRUE

Nice.
There are definitely faster ways of doing it.

Now I just need to generate all products made up of two three-digit numbers, test if it is palindromic, and find the largest.

I use expand.grid to generate all combinations of all three-digit numbers, transmute the product into existence, and filter out all the non-palindromic products. Then I arrange the product variable in descending order, and take the first.

library(dplyr)
answer <- expand.grid(100:999, 100:999) %>% 
  transmute(product = Var1*Var2) %>% 
  filter(is_palindromic(product)) %>% 
  arrange(desc(product)) %>% 
  slice(1)

Not very fast… The main problem is that I check all 810.000 products. There is no need for that. If I instead test the products in order, from the largest, to the smallest, the first palindromic number I find, is the largest.

Let us generate a list of candidates, in descending order:

cand <- expand.grid(100:999, 100:999) %>% 
  transmute(product = Var1*Var2) %>% 
  arrange(desc(product)) %>% 
  .$product

And now, for each element in the candidate-vector, if it is palindromic, set that to be the answer, and break out of the for-loop.

for(i in cand){
  if(is_palindromic(i)){
    answer <- i
    break()
  }
}

Much faster.

Project Euler 203

Project Euler 203

Squarefree Binomial Coefficients. Building Pascals triangle to 51 rows (look it up – its a bloody nightmare to write out here): Locate all distinct numbers in that triangle, that does not divide by a square of a prime.

OK. First step is to get all the numbers in 51 rows of Pascals triangle.

R has a built in function, choose(i,j) that returns the number for row i, position j.

We can use that to iterate through all the possible positions in a 51 row triangle:

numbers <- 1
for(i in 0:50){
  for(j in 0:i){
    numbers <- c(numbers, choose(i,j))
  }
}

Next step is to make sure that we only have unique, distinct, values:

numbers <- unique(numbers)

We now need to divide each number by the square of a prime. My first instinct was to generate all primes smaller than the squareroot of the largest number in numbers.

That would be all primes lower than 11,243,247.

I would then square all those primes, and see if one of them divided neatly into the numbers in the triangel.

Thats an awfull lot of primes.

Much easier would be to note, that if the square of a prime divides neatly into a number, then the prime does as well. And is in fact a prime factor in that number.

And since we have a nice library that makes it easy to get the primefactors, thats the way to do it.

library(numbers)
library(purrr)

res <- numbers %>%
    discard(function(x) any(x%%(factors(x)**2)==0))

answer <- sum(res)+1

Passing the numbers vector to the discard function. Discard the element, if the element, modulo the primefactors in that element squared, has one (or more) results that are equal to 0.

The answer is the sum of the elements that are left. Plus 1, since 1 is discarded by the function. factors(1) – for some reason – returns 1.

Nice and simple really.

Corresponding value to a max-value

One of our users need to find the max-value of a variable. He also needs to find the corresponding value in another variable.
As in – the maximum value in column A is in row 42. What is the value in column B, row 42.

And of course we need to do it for several groups.

Let us begin by making a dataset. Four groups in id,

library(tidyverse)
id <- 1:3
val <- c(10,20)
kor <- c("a", "b", "c")


example <- expand.grid(id,val) %>% 
  as_tibble() %>% 
  arrange(Var1) %>% 
  cbind(kor, stringsAsFactors=F) %>% 
  rename(group=Var1, value=Var2, corr = kor)

example
##   group value corr
## 1     1    10    a
## 2     1    20    b
## 3     2    10    c
## 4     2    20    a
## 5     3    10    b
## 6     3    20    c

We have six observations, divided into three groups. They all have a value, and a letter in “corr” that is the corresponding value we are interested in.

So. In group 1 we should find the maximum value 20, and the corresponding value “b”.
In group 2 the max value is stil 20, but the corresponding value we are looking for is “a”.
And in group 3 the max value is yet again 20, but the corresponding value is now “c”.

How to do that?

example %>%
  group_by(group) %>% 
  mutate(max=max(value)) %>% 
  mutate(max_corr=corr[(value==max)]) %>% 
  ungroup()
## # A tibble: 6 x 5
##   group value corr    max max_corr
##   <int> <dbl> <chr> <dbl> <chr>   
## 1     1   10. a       20. b       
## 2     1   20. b       20. b       
## 3     2   10. c       20. a       
## 4     2   20. a       20. a       
## 5     3   10. b       20. c       
## 6     3   20. c       20. c

The maximum value for all groups is 20. And the corresponding value to that in the groups is b, a and c respectively.

Isn’t there an easier solution using summarise function? Probably. But our user needs to do this for a lot of variables. And their names have nothing in common.