Sunday, December 15, 2013

Sommers'D - two dirty implementations: R vs F#

A couple of days ago I started playing with F#.

Although I'm VERY far from being skillful F# programmer, I am slowly moving forward.

Not far ago I implemented one of the measures of association in R - Somers' D.

Somers' D is sometimes used for testing concordance of internal and external risk scores / rankings.

I had some problems with finding the right formula for Somers' D Asymptotic Standard Error, and when I finally found the solution I didn't have much energy to clean up my R code ;)

I thought, using this dirty code as a base for Somers' D implementation in F# may bring interesting results. My intention was not to give R too large a head start over F#.

Still, the differences are visible in many places...

First of all, I have been pretty surprised that basic matrix operations are not available in the core F# version.

It is necessary to add F# PowerPack to work with matrices.

Even then, working with matrices in F# does not seem so natural as in R (or Matlab). Or, probably, I still know too little about F#.

A couple of examples:

constructing matrix in R:

  1. PD    <- c(0.05,0.10,0.50,1,2,5,25)/100
  2. total <- c(5,10,20,25,20,15,5)/100
  3.  
  4. defaulted    <- total*PD
  5. nondefaulted <- total*(1-PD)
  6.  
  7. <- sum(total)
  8.  
  9. portfolio <- rbind(defaulted,nondefaulted)/n

constructing matrix in F#:

  1. #r "FSharp.PowerPack.dll"
  2.  
  3. let PD             = [ 0.05; 0.10; 0.50; 1.00; 2.00; 5.00; 25.00 ]
  4. let counterparties = [ 5.; 10.; 20.; 25.; 20.; 15.; 5]
  5.  
  6. let groups = PD.Length // risk groups no.
  7.  
  8. let div100 x = x / 100.0
  9.  
  10. let PDprct = [ for x in PD do yield div100 x ]
  11. let CPprct = [ for x in counterparties do yield div100 x ]
  12.  
  13. let n = CPprct |> Seq.sum
  14.  
  15. let defaulted    = [ for i in 1..groups do yield CPprct.[i-1]*PDprct.[i-1]/]
  16. let nondefaulted = [ for i in 1..groups do yield CPprct.[i-1]*(1.0-PDprct.[i-1])/]
  17.  
  18. let x = matrix [ defaulted; nondefaulted ]

calculating WR/DR in R:

  1. wr <- n^2-sum(sapply(1:nrow(x)function(i) sum(x[i,])^2))

calculating WR/DR in F#:

  1. let xr = x.NumRows
  2.  
  3. let rowSum (x : matrix) (i : int) = Array.sum (RowVector.toArray (x.Row(i-1)))
  4.  
  5. // WR / DR
  6.  
  7. let wr =
  8.  
  9.     let mutable mat_sum = 0.0
  10.  
  11.     for i in 1..xr do
  12.         let row2  = rowSum x i ** 2.0
  13.         mat_sum   <- mat_sum + row2
  14.  
  15.     n ** 2.0 - mat_sum

Later it gets a little better, but...

'A' function in R:

  1. <- function(x,i,j) {
  2.  
  3.   xr <- nrow(x)
  4.   xc <- ncol(x)
  5.  
  6.   sum(x[1:xr>i,1:xc>j])+sum(x[1:xr<i,1:xc<j])
  7.  
  8. }

'A' function in F#:

  1. let A (x : matrix) i j =
  2.  
  3.     let xr = x.NumRows
  4.     let xc = x.NumCols
  5.  
  6.     let rowIdx1 = List.filter (fun x -> x>i) [ 1..xr ]
  7.     let colIdx1 = List.filter (fun x -> x>j) [ 1..xc ]
  8.  
  9.     let rowIdx2 = List.filter (fun x -> x<i) [ 1..xr ]
  10.     let colIdx2 = List.filter (fun x -> x<j) [ 1..xc ]
  11.  
  12.     let mutable Asum = 0.0
  13.  
  14.     for r_i in rowIdx1 do
  15.         for r_j in colIdx1 do
  16.             Asum <- Asum + x.[r_i-1,r_j-1]
  17.  
  18.     for r_i in rowIdx2 do
  19.         for r_j in colIdx2 do
  20.             Asum <- Asum + x.[r_i-1,r_j-1]
  21.  
  22.     Asum

As I've mentioned at the beginning of the post - both codes are "dirty". Also, I definitely know R better than F# (even if it may not be apparent from the R code above ;)

Still, F# seems to require more coding and many "simple" operations (matrices...) may not be so easy in F# in comparison to R.

I'm still to find where F# excels :)

[ dirty R code ]

[ dirty F# code ]


No comments: