Merging PDF-files

You may need to merge several pdf-files into one.

There are several tools. Some of them you have to pay for.

If you are in a linux environment, try this:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=final.pdf filea.pdf fileb.pdf

continue with filec.pdf etc.

Some say it is slow.

Others say, yes it is slow, but it produces smaller final files, and are able to handle “difficult pdfs”.

I say it is nice that it can be done from the command line.

Euler 3

Problem 3 in Project Euler

What is the largest prime factor in 600851475143

Super simple. The library numbers have a function primeFactors(). It simply returns all prime factors.
Put that inside a max(), and you have the result.

library(numbers)
max(primeFactors(600851475143))
## [1] Censored

Using the library purrr, it can be written like this, and this is a way of coding that I need a bit more practice in, so lets do it that way as well.

library(purrr)
600851475143 %>%
  primeFactors() %>%
  max()
## [1] Censored

The %>% are a pipe function. It takes whats on the left, and passes it to whats on the right. In this case the number is piped to the function primeFactors(), and the result of that function is piped to the max() function. And you get the result.

Lesson learned – nope. I already knew that numbers is a pretty useful library.

Command line tools

Linux have a lot of small tools, that only does one thing. But do it really well (compare to Windows, that has a lot of large tools that does everything rather badly).

This is really just a note to myself. These tools are really useful but they are not (yet) second nature to me. I often find myself in the situation, where I know there is a tool for something, that I have used several times before. But simply can’t remember what it was.

grep. Searches files for lines matching a regular expression. Useful parameters (or at least parameters I have a regular use for):

-c returns a count of the lines matching.
-n returns the linenumber (in the file) of the matching line.

tail. returns the last part of a file

-n 6 returns the last 6 lines of the file (standard 10)

head. Returns the first part of a file

-n 6 returns the first 6 lines of the file (standard 10)

cut. Removes sections of lines in a file (or other input)

-d x. Splits the line at x. Use ‘ ‘ for space
-f 1. Select the first field.

wc. Counts stuff in files.

-l. counts the lines in the file (or other input)
-w count the words in the file (or other input)

|. Piping. Takes the result of the command in front of it, and pass it to the command after it (and that is the direction. If you find examples on Stackoverflow that will only give the desired result if the direction is reversed, don’t be surprised if it does not work…)

cat. Prints one or more files to standard output (your screen).

But if we print to another output, eg with “> file.name”, we can concatenate several files.

find. Searches for files. “find .” finds everything. Pipe it to grep to search for something specific. eg “find . | grep ‘acta'” to find all files containing the string “acta”.

-print prints the complete filename.
-print0 prints the complete filename even if it includes a newline.

csvsql. Commandline tool for analysing csv-files using sql-syntax

csvstat. Commandline tool for analysing csv-files

 

Copy rows to another sheet – based on cell-values

And handling images while you’re at it.

Given: We have some data in a sheet – lets call it Source. Based on some values in another sheet – lets call that Condition – we want to copy rows from Source to a third sheet. We’ll call that Target.
To complicate things, we want to copy images as well.

Set some variables to Target, Source and Condition.
Delete the content of the existing target sheet. First alle the images, and then the rest. Note that I’m not deleting everything, just from row 6 and down.
Then for each something (d) in column B (adjust ranges – here I’m only looking at the rows from 2 to 9), check if the relevant row in Source matches, then copy to Target.

There’s a small detail here, I needed to insert an identifier in Target, defined by a value in Condition. Instead of trying to insert in Column B, I’m just searching and replacing a placeholder – “£$”, a string I was pretty certain would not show up anywhere.


Sub CopyYes()
Dim c As Range
Dim j As Integer
Dim Source As Worksheet
Dim Target As Worksheet
Dim Condition As Worksheet
Dim k As String
Dim fnd As Variant
Dim rplc As Variant
fnd = "£$"

Set Source = ActiveWorkbook.Worksheets("Ark4") 'Note that ranges in Souce and Condition below should be adjusted. We're not quite there yet.
Set Target = ActiveWorkbook.Worksheets("Ark3")
Set Condition = ActiveWorkbook.Worksheets("Ark1")

' Start by clearing target sheet
' begin with images
Target.Pictures.Delete
' Then we'll delete the rest
'
With Target
.Rows(6 & ":" & .Rows.Count).Delete
End With

j = 7 'This will start copying data to Target sheet at row 1
For Each d In Condition.Range("B2:B9") 'Ark1
k = d.Offset(0, -1)
rplc = k
For Each c In Source.Range("B2:B52") 'Ark2
If d = c Then
Source.Rows(c.row).Copy Target.Rows(j)
j = j + 1
End If
Next c
Target.Cells.Replace what:=fnd, Replacement:=rplc

Next d
'we'll end by hiding some columns
Target.Columns("A:E").Hidden = True
End Sub

Changing class on columns i a dataframe

Given a dataframe with some columns that has a “wrong” class, eg character, that should be numeric. How to make that change.

i <- as.character(1:10)
c <- rep("a",10)
c
##  [1] "a" "a" "a" "a" "a" "a" "a" "a" "a" "a"
df <- data.frame(a=i,b=i, c=c, d=i, stringsAsFactors = FALSE)
sapply(df,class)
##           a           b           c           d 
## "character" "character" "character" "character"

The dataframe has four columns. The columns a,b and d are characters, but should be numeric.

df[,c("a","b","d")] <- sapply(df[,c("a","b","d")], as.numeric)
sapply(df,class)
##           a           b           c           d 
##   "numeric"   "numeric" "character"   "numeric"

An important thing to note here is that apply functions a bit different from the other members of the apply-family:

apply(df,2,class)
##           a           b           c           d 
## "character" "character" "character" "character"
sapply(df,class)
##           a           b           c           d 
##   "numeric"   "numeric" "character"   "numeric"

What happens is that apply coerces df to an array. And all content in an array have to be of the same class. The lowest common denominator for that is character.


					

Excel dansk/english danish/english dictionary

Because you often find the solution to your excel problem on an english language site. But have to solve the problem in a danish version of Excel.
Remember that you also have to change “,” to “;”

 

 

Tilføjelsesprogram- og automatiseringsfunktioner Add-in and Automation functions
KALD CALL
EUROKONVERTER EUROCONVERT
HENTPIVOTDATA GETPIVOTDATA
REGISTER.ID REGISTER.ID
SQL.REQUEST SQL.REQUEST
Kubefunktioner Cube functions
KUBE.KPI.MEDLEM CUBEKPIMEMBER
KUBE.MEDLEM CUBEMEMBER
KUBEMEDLEM.EGENSKAB CUBEMEMBERPROPERTY
KUBEMEDLEM.RANG CUBERANKEDMEMBER
KUBESÆT CUBESET
KUBESÆT.TÆL CUBESETCOUNT
KUBEVÆRDI CUBEVALUE
Databasefunktioner Database functions
DMIDDEL DAVERAGE
DTÆL DCOUNT
DTÆLV DCOUNTA
DHENT DGET
DMAKS DMAX
DMIN DMIN
DPRODUKT DPRODUCT
DSTDAFV DSTDEV
DSTDAFVP DSTDEVP
DSUM DSUM
DVARIANS DVAR
DVARIANSP DVARP
Dato- og klokkeslætsfunktioner Date and time functions
DATO DATE
DATOVÆRDI DATEVALUE
DAG DAY
DAGE360 DAYS360
EDATO EDATE
SLUT.PÅ.MÅNED EOMONTH
TIME HOUR
MINUT MINUTE
MÅNED MONTH
ANTAL.ARBEJDSDAGE NETWORKDAYS
NU NOW
SEKUND SECOND
KLOKKESLÆT TIME
TIDSVÆRDI TIMEVALUE
IDAG TODAY
UGEDAG WEEKDAY
UGE.NR WEEKNUM
ARBEJDSDAG WORKDAY
ÅR YEAR
ÅR.BRØK YEARFRAC
Tekniske funktioner Engineering functions
BESSELI BESSELI
BESSELJ BESSELJ
BESSELK BESSELK
BESSELY BESSELY
BIN.TIL.DEC BIN2DEC
BIN.TIL.HEX BIN2HEX
BIN.TIL.OKT BIN2OCT
KOMPLEKS COMPLEX
KONVERTER CONVERT
DEC.TIL.BIN DEC2BIN
DEC.TIL.HEX DEC2HEX
DEC.TIL.OKT DEC2OCT
DELTA DELTA
FEJLFUNK ERF
FEJLFUNK.KOMP ERFC
GETRIN GESTEP
HEX.TIL.BIN HEX2BIN
HEX.TIL.DEC HEX2DEC
HEX.TIL.OKT HEX2OCT
IMAGABS IMABS
IMAGINÆR IMAGINARY
IMAGARGUMENT IMARGUMENT
IMAGKONJUGERE IMCONJUGATE
IMAGCOS IMCOS
IMAGDIV IMDIV
IMAGEKSP IMEXP
IMAGLN IMLN
IMAGLOG10 IMLOG10
IMAGLOG2 IMLOG2
IMAGPOTENS IMPOWER
IMAGPRODUKT IMPRODUCT
IMAGREELT IMREAL
IMAGSIN IMSIN
IMAGKVROD IMSQRT
IMAGSUB IMSUB
IMAGSUM IMSUM
OKT.TIL.BIN OCT2BIN
OKT.TIL.DEC OCT2DEC
OKT.TIL.HEX OCT2HEX
Finansielle funktioner Financial functions
PÅLØBRENTE ACCRINT
PÅLØBRENTE.UDLØB ACCRINTM
AMORDEGRC AMORDEGRC
AMORLINC AMORLINC
KUPONDAGE.SA COUPDAYBS
KUPONDAGE.A COUPDAYS
KUPONDAGE.ANK COUPDAYSNC
KUPONDAG.NÆSTE COUPNCD
KUPONBETALINGER COUPNUM
KUPONDAG.FORRIGE COUPPCD
AKKUM.RENTE CUMIPMT
AKKUM.HOVEDSTOL CUMPRINC
DB DB
DSA DDB
DISKONTO DISC
KR.DECIMAL DOLLARDE
KR.BRØK DOLLARFR
VARIGHED DURATION
EFFEKTIV.RENTE EFFECT
FV FV
FVTABEL FVSCHEDULE
RENTEFOD INTRATE
R.YDELSE IPMT
IA IRR
ISPMT ISPMT
MVARIGHED MDURATION
MIA MIRR
NOMINEL NOMINAL
NPER NPER
NUTIDSVÆRDI NPV
ULIGE.KURS.PÅLYDENDE ODDFPRICE
ULIGE.FØRSTE.AFKAST ODDFYIELD
ULIGE.SIDSTE.KURS ODDLPRICE
ULIGE.SIDSTE.AFKAST ODDLYIELD
YDELSE PMT
H.YDELSE PPMT
KURS PRICE
KURS.DISKONTO PRICEDISC
KURS.UDLØB PRICEMAT
NV PV
RENTE RATE
MODTAGET.VED.UDLØB RECEIVED
LA SLN
ÅRSAFSKRIVNING SYD
STATSOBLIGATION TBILLEQ
STATSOBLIGATION.KURS TBILLPRICE
STATSOBLIGATION.AFKAST TBILLYIELD
VSA VDB
INTERN.RENTE XIRR
NETTO.NUTIDSVÆRDI XNPV
AFKAST YIELD
AFKAST.DISKONTO YIELDDISC
AFKAST.UDLØBSDATO YIELDMAT
Informationsfunktioner Information functions
CELLE CELL
FEJLTYPE ERROR.TYPE
INFO INFO
ER.TOM ISBLANK
ER.FJL ISERR
ER.FEJL ISERROR
ER.LIGE ISEVEN
ER.LOGISK ISLOGICAL
ER.IKKE.TILGÆNGELIG ISNA
ER.IKKE.TEKST ISNONTEXT
ER.TAL ISNUMBER
ER.ULIGE ISODD
ER.REFERENCE ISREF
ER.TEKST ISTEXT
TAL N
IKKE.TILGÆNGELIG NA
VÆRDITYPE TYPE
Logiske funktioner Logical functions
OG AND
FALSK FALSE
HVIS IF
HVIS.FEJL IFERROR
IKKE NOT
ELLER OR
SAND TRUE
Opslags- og referencefunktioner Lookup and reference functions
ADRESSE ADDRESS
OMRÅDER AREAS
VÆLG CHOOSE
KOLONNE COLUMN
KOLONNER COLUMNS
VOPSLAG HLOOKUP
HYPERLINK HYPERLINK
INDEKS INDEX
INDIREKTE INDIRECT
SLÅ.OP LOOKUP
SAMMENLIGN MATCH
FORSKYDNING OFFSET
RÆKKE ROW
RÆKKER ROWS
RTD RTD
TRANSPONER TRANSPOSE
LOPSLAG VLOOKUP
Matematiske og trigonometriske funktioner Math and trigonometry functions
ABS ABS
ARCCOS ACOS
ARCCOSH ACOSH
ARCSIN ASIN
ARCSINH ASINH
ARCTAN ATAN
ARCTAN2 ATAN2
ARCTANH ATANH
AFRUND.LOFT CEILING
KOMBIN COMBIN
COS COS
COSH COSH
GRADER DEGREES
LIGE EVEN
EKSP EXP
FAKULTET FACT
DOBBELT.FAKULTET FACTDOUBLE
AFRUND.GULV FLOOR
STØRSTE.FÆLLES.DIVISOR GCD
HELTAL INT
MINDSTE.FÆLLES.MULTIPLUM LCM
LN LN
LOG LOG
LOG10 LOG10
MDETERM MDETERM
MINVERT MINVERSE
MPRODUKT MMULT
REST MOD
MAFRUND MROUND
MULTINOMIAL MULTINOMIAL
ULIGE ODD
PI PI
POTENS POWER
PRODUKT PRODUCT
KVOTIENT QUOTIENT
RADIANER RADIANS
SLUMP RAND
SLUMP.MELLEM RANDBETWEEN
ROMERTAL ROMAN
AFRUND ROUND
RUND.NED ROUNDDOWN
RUND.OP ROUNDUP
SERIESUM SERIESSUM
FORTEGN SIGN
SIN SIN
SINH SINH
KVROD SQRT
KVRODPI SQRTPI
SUBTOTAL SUBTOTAL
SUM SUM
SUM.HVIS SUMIF
SUM.HVISER SUMIFS
SUMPRODUKT SUMPRODUCT
SUMKV SUMSQ
SUMX2MY2 SUMX2MY2
SUMX2PY2 SUMX2PY2
SUMXMY2 SUMXMY2
TAN TAN
TANH TANH
AFKORT TRUNC
Statistiske funktioner Statistical functions
MAD AVEDEV
MIDDEL AVERAGE
MIDDELV AVERAGEA
MIDDEL.HVIS AVERAGEIF
MIDDEL.HVISER AVERAGEIFS
BETAFORDELING BETADIST
BETAINV BETAINV
BINOMIALFORDELING BINOMDIST
CHIFORDELING CHIDIST
CHIINV CHIINV
CHITEST CHITEST
KONFIDENSINTERVAL CONFIDENCE
KORRELATION CORREL
TÆL COUNT
TÆLV COUNTA
ANTAL.BLANKE COUNTBLANK
TÆLHVIS COUNTIF
TÆL.HVISER COUNTIF
KOVARIANS COVAR
KRITBINOM CRITBINOM
SAK DEVSQ
EKSPFORDELING EXPONDIST
FFORDELING FDIST
FINV FINV
FISHER FISHER
FISHERINV FISHERINV
PROGNOSE FORECAST
FREKVENS FREQUENCY
FTEST FTEST
GAMMAFORDELING GAMMADIST
GAMMAINV GAMMAINV
GAMMALN GAMMALN
GEOMIDDELVÆRDI GEOMEAN
FORØGELSE GROWTH
HARMIDDELVÆRDI HARMEAN
HYPGEOFORDELING HYPGEOMDIST
SKÆRING INTERCEPT
TOPSTEJL KURT
STOR LARGE
LINREGR LINEST
LOGREGR LOGEST
LOGINV LOGINV
LOGNORMFORDELING LOGNORMDIST
MAKS MAX
MAKSV MAXA
MEDIAN MEDIAN
MIN MIN
MINV MINA
HYPPIGST MODE
NEGBINOMFORDELING NEGBINOMDIST
NORMFORDELING NORMDIST
NORMINV NORMINV
STANDARDNORMFORDELING NORMSDIST
STANDARDNORMINV NORMSINV
PEARSON PEARSON
FRAKTIL PERCENTILE
PROCENTPLADS PERCENTRANK
PERMUT PERMUT
POISSON POISSON
SANDSYNLIGHED PROB
KVARTIL QUARTILE
PLADS RANK
FORKLARINGSGRAD RSQ
SKÆVHED SKEW
HÆLDNING SLOPE
MINDSTE SMALL
STANDARDISER STANDARDIZE
STDAFV STDEV
STDAFVV STDEVA
STDAFVP STDEVP
STDAFVPV STDEVPA
STFYX STEYX
TFORDELING TDIST
TINV TINV
TENDENS TREND
TRIMMIDDELVÆRDI TRIMMEAN
TTEST TTEST
VARIANS VAR
VARIANSV VARA
VARIANSP VARP
VARIANSPV VARPA
WEIBULL WEIBULL
ZTEST ZTEST
Tekstfunktioner Text functions
ASC ASC
BAHTTEKST BAHTTEXT
TEGN CHAR
RENS CLEAN
KODE CODE
SAMMENKÆDNING CONCATENATE
KR DOLLAR
EKSAKT EXACT
FIND, FINDB FIND, FINDB
FAST FIXED
JIS JIS
VENSTRE, VENSTREB LEFT, LEFTB
LÆNGDE, LÆNGDEB LEN, LENB
SMÅ.BOGSTAVER LOWER
MIDT, MIDTB MID, MIDB
FONETISK PHONETIC
STORT.FORBOGSTAV PROPER
ERSTAT, ERSTATB REPLACE, REPLACEB
GENTAG REPT
HØJRE, HØJREB RIGHT, RIGHTB
SØG, SØGB SEARCH, SEARCHB
UDSKIFT SUBSTITUTE
T T
TEKST TEXT
FJERN.OVERFLØDIGE.BLANKE TRIM
STORE.BOGSTAVER UPPER
VÆRDI VALUE

Replacing values in a dataframe – to what a previous value was

Given a set of data, where some values indicate that they are the same as a previous value, how to replace them with the correct value.

Eg, this dataframe:

(m <- data.frame(i=c(1:10,NA), t=c("lorem", "do", "do", "Do", "ipsum", "do", "Do", "(do)", "dolor", NA, "test"), stringsAsFactors=F))
##     i     t
## 1   1 lorem
## 2   2    do
## 3   3    do
## 4   4    Do
## 5   5 ipsum
## 6   6    do
## 7   7    Do
## 8   8  (do)
## 9   9 dolor
## 10 10  <NA>
## 11 NA  test

How to replace the first three “do”s with “lorem” and the next set of “do”s with “ipsum”

Using fill() from the tidyr package is straight forward. It takes a vector, locates all NA, and replaces them with the last, non-NA value.
Simple enough, change all the variations of “do” to NA, run fill(). Done.
One problem, there might be NAs in the dataset, that we do not want to affect.
Solution – there might be a more elegant one, but this works:

  1. Change the NAs to something that do not occur in the data
  2. Change to variations of “do” to NA
  3. Use the fill()-function
  4. Change the NAs from step 1 back to NA
library(tidyr)
rpl <- "replacement"
m[is.na(m$t),]$t <- rpl
doset <- c("do", "Do", "(do)")
m[(m$t %in% doset),]$t <- NA

m <- m %>% fill(t)
m[(m$t == rpl),]$t <- NA
m
##     i     t
## 1   1 lorem
## 2   2 lorem
## 3   3 lorem
## 4   4 lorem
## 5   5 ipsum
## 6   6 ipsum
## 7   7 ipsum
## 8   8 ipsum
## 9   9 dolor
## 10 10  <NA>
## 11 NA  test

Done!

Oh, and by the way, this is my first post generated directly from RStudio.

Hide rows, based on value of cell – in Excel

So – you want to hide some rows on a worksheet, based on the value in a cell. Or more than one.
Here’s how to do that with VBA
Find the last row of the range that you want to apply the hiding to.
Get a range of rows, in this case starting at A7, and ending at “LastRow”.
For each value in that range, if the value i column A is equal to the value in cell G1 (that is Cells(1,7), And the value three columns over, eg in colum D (that is c.Offset(0,3)), is equal to the value in cell G2 (Cells(2,7), then set the entire row to be hidden, else set it to be shown.


Private Sub Worksheet_Change(ByVal Target As Range)
Dim LastRow As Long, c As Range
Application.EnableEvents = False
LastRow = Application.WorksheetFunction.CountA(Range("A7:A100000")) + 6
On Error Resume Next
For Each c In Range("A7:A" & LastRow)
If (c.Value = Cells(1, 7).Value And c.Offset(0, 3).Value = Cells(2, 7).Value) Then
c.EntireRow.Hidden = False
Else
c.EntireRow.Hidden = True
End If
Next
On Error GoTo 0
Application.EnableEvents = True
End Sub