Sunday, December 15, 2013

Sommers'D - two dirty implementations: R vs F#

A couple of days ago I started playing with F#.

Although I'm VERY far from being skillful F# programmer, I am slowly moving forward.

Not far ago I implemented one of the measures of association in R - Somers' D.

Somers' D is sometimes used for testing concordance of internal and external risk scores / rankings.

I had some problems with finding the right formula for Somers' D Asymptotic Standard Error, and when I finally found the solution I didn't have much energy to clean up my R code ;)

I thought, using this dirty code as a base for Somers' D implementation in F# may bring interesting results. My intention was not to give R too large a head start over F#.

Still, the differences are visible in many places...

First of all, I have been pretty surprised that basic matrix operations are not available in the core F# version.

It is necessary to add F# PowerPack to work with matrices.

Even then, working with matrices in F# does not seem so natural as in R (or Matlab). Or, probably, I still know too little about F#.

A couple of examples:

constructing matrix in R:

1. PD    <- c(0.05,0.10,0.50,1,2,5,25)/100
2. total <- c(5,10,20,25,20,15,5)/100
3.
4. defaulted    <- total*PD
5. nondefaulted <- total*(1-PD)
6.
7. <- sum(total)
8.
9. portfolio <- rbind(defaulted,nondefaulted)/n

constructing matrix in F#:

1. #r "FSharp.PowerPack.dll"
2.
3. let PD             = [ 0.05; 0.10; 0.50; 1.00; 2.00; 5.00; 25.00 ]
4. let counterparties = [ 5.; 10.; 20.; 25.; 20.; 15.; 5]
5.
6. let groups = PD.Length // risk groups no.
7.
8. let div100 x = x / 100.0
9.
10. let PDprct = [ for x in PD do yield div100 x ]
11. let CPprct = [ for x in counterparties do yield div100 x ]
12.
13. let n = CPprct |> Seq.sum
14.
15. let defaulted    = [ for i in 1..groups do yield CPprct.[i-1]*PDprct.[i-1]/]
16. let nondefaulted = [ for i in 1..groups do yield CPprct.[i-1]*(1.0-PDprct.[i-1])/]
17.
18. let x = matrix [ defaulted; nondefaulted ]

calculating WR/DR in R:

1. wr <- n^2-sum(sapply(1:nrow(x)function(i) sum(x[i,])^2))

calculating WR/DR in F#:

1. let xr = x.NumRows
2.
3. let rowSum (x : matrix) (i : int) = Array.sum (RowVector.toArray (x.Row(i-1)))
4.
5. // WR / DR
6.
7. let wr =
8.
9.     let mutable mat_sum = 0.0
10.
11.     for i in 1..xr do
12.         let row2  = rowSum x i ** 2.0
13.         mat_sum   <- mat_sum + row2
14.
15.     n ** 2.0 - mat_sum

Later it gets a little better, but...

'A' function in R:

1. <- function(x,i,j) {
2.
3.   xr <- nrow(x)
4.   xc <- ncol(x)
5.
6.   sum(x[1:xr>i,1:xc>j])+sum(x[1:xr<i,1:xc<j])
7.
8. }

'A' function in F#:

1. let A (x : matrix) i j =
2.
3.     let xr = x.NumRows
4.     let xc = x.NumCols
5.
6.     let rowIdx1 = List.filter (fun x -> x>i) [ 1..xr ]
7.     let colIdx1 = List.filter (fun x -> x>j) [ 1..xc ]
8.
9.     let rowIdx2 = List.filter (fun x -> x<i) [ 1..xr ]
10.     let colIdx2 = List.filter (fun x -> x<j) [ 1..xc ]
11.
12.     let mutable Asum = 0.0
13.
14.     for r_i in rowIdx1 do
15.         for r_j in colIdx1 do
16.             Asum <- Asum + x.[r_i-1,r_j-1]
17.
18.     for r_i in rowIdx2 do
19.         for r_j in colIdx2 do
20.             Asum <- Asum + x.[r_i-1,r_j-1]
21.
22.     Asum

As I've mentioned at the beginning of the post - both codes are "dirty". Also, I definitely know R better than F# (even if it may not be apparent from the R code above ;)

Still, F# seems to require more coding and many "simple" operations (matrices...) may not be so easy in F# in comparison to R.

I'm still to find where F# excels :)

[ dirty R code ]

[ dirty F# code ]