sobota, 15 lutego 2014

Visualizing option strategies with R

It has been a while since I wrote something about options.

Recently I found myself in a need for simple tool for visualizing option strategies.

Option strategy is a combination of a number of option positions - both long and short - and sometimes underlying assets (such as equities) - i.e. it has multiple "legs".

To calculate the payoff (*) of the option strategy one needs to know:

  • types of assets used to construct a particular strategy - possibilities include CALL and PUT options, as well as underlying assets
  • type of positions opened - options can be bought or sold (written); underlying assets can be bought or sold short
  • option strike prices - any option has a price at which it "activates"; for example CALL option with a strike price of 100, generates profit only when the price of underlying asset is above 100
  • option premiums - buying an option has a price, writing it generates some income
  • amount of assets utilized - for example you can buy more than 1 option
Transaction costs are not cover by my simplistic model.

(*) payoff is the value of the strategy at expiration; it should not be confused with the theoretical value derived from option pricing models such as Black-Scholes

Probably the simplest options-related investment strategy is a purchase of some stock and hedging it with PUT option - as a result, while one have unlimited profit potential, his loss is limited with the help of PUT option:

> assets.mat
     trans type   strike price amount
[1,] "buy" "base" "40"   "0"   "1"   
[2,] "buy" "put"  "35"   "2"   "1" 

A pure option strategy is Bull Call Spread, when one simultaneously buys and sells CALL options with different strike prices:

> assets.mat
     trans  type   strike price amount
[1,] "buy"  "call" "40"   "3"   "1"   
[2,] "sell" "call" "45"   "1"   "1" 

Option strategy can consist both long and short positions, and different amounts of options at opposing sides - like in the Call Backspread:

> assets.mat
     trans  type   strike price amount
[1,] "sell" "call" "40"   "4"   "1"   
[2,] "buy"  "call" "45"   "2"   "2" 

Option strategy can be composed with more than two legs - like in the Condor:

> assets.mat
     trans  type   strike price amount
[1,] "buy"  "call" "35"   "11"  "1"   
[2,] "buy"  "call" "55"   "1"   "1"   
[3,] "sell" "call" "40"   "7"   "1"   
[4,] "sell" "call" "50"   "2"   "1"  

Obviously there are many more different option strategies. You can learn about them at the following places:

And you can experiment with them and any other ideas connected with option strategies with the R code I have prepared. Enjoy!

[ R Source ]

czwartek, 9 stycznia 2014

Basics of scorecard model validation

Source: DW

Some time ago I wrote about Somers' D measure of association.

Somers' D is one of the many tools that can be used for validating scorecard models.

The task of a scorecard model is to assign scores, such as ratings, to the clients (in case of corporations usually called "counterparties") of a financial institution.

The ratings assigned by a scorecard model should reflect the probability of the client not meeting its obligations towards a bank or other financial institution.

As I showed in my previous post about relation between probability of default and interest rates, the higher risk of default needs to be compensate by higher interest rate charged.

Two aspects of a scorecard model are usually examined to assess the correctness of the model:

  1. discriminatory power
  2. calibration

Discriminatory power is about the model's ability to rate clients/counterparties according to their credit quality, i.e. verify whether clients/counterparties of similar quality fall into similar ratings classes.

Meanwhile, calibration verifies whether ratings given are correct indicators of future defaults. In other worlds, we compare probability of default forecast by the model with actual ("true") defaults, here.

Tools commonly used for the assessment of the models discriminatory power are Receiver Operating Characteristic (ROC), above mentioned Somers' D and Gini coefficient.

There is a number of tests for verifying model calibration. Probably the most often cited in the literature are: 

  • binomial test
  • Hosmer-Leweshow test
  • Spiegelhalter test
  • normal test

In most of the cases, the scorecard validation tests require knowledge of the previous scores and actual past defaults. 

When previous scores are not available and cannot be retroactively calculated, some external ratings - such as issued by credit agencies - may be used. An additional condition should be met in such a situation - the scores generated by the internal model need to be aligned with the external ratings.

As I noted before, the actual default rates change over time. Meanwhile, most of the scorecard validation tools does not take this fluctuations into account. They assume that the future will be similar to the (averaged or recent) past.

In addition, the mentioned above validation tools do not take into consideration the differences in characteristics of the past and current assets being evaluated by the scorecard model. It is assumed the model will take care of the differences. However, the assets being rated may have pretty complex internal structures hidden from the model.

Lastly, scorecard models do not usually say anything about the possible recovery rates and resolution times.

If you would like to play a little with scorecard validation tests, you may take a look at:

[ Scorecard validation tests in R ]

[ Scorecard validation tests examples in R ]

[ Measures of association in R ]

Validation tests in R

niedziela, 15 grudnia 2013

Sommers'D - two dirty implementations: R vs F#

A couple of days ago I started playing with F#.

Although I'm VERY far from being skillful F# programmer, I am slowly moving forward.

Not far ago I implemented one of the measures of association in R - Somers' D.

Somers' D is sometimes used for testing concordance of internal and external risk scores / rankings.

I had some problems with finding the right formula for Somers' D Asymptotic Standard Error, and when I finally found the solution I didn't have much energy to clean up my R code ;)

I thought, using this dirty code as a base for Somers' D implementation in F# may bring interesting results. My intention was not to give R too large a head start over F#.

Still, the differences are visible in many places...

First of all, I have been pretty surprised that basic matrix operations are not available in the core F# version.

It is necessary to add F# PowerPack to work with matrices.

Even then, working with matrices in F# does not seem so natural as in R (or Matlab). Or, probably, I still know too little about F#.

A couple of examples:

constructing matrix in R:

  1. PD    <- c(0.05,0.10,0.50,1,2,5,25)/100
  2. total <- c(5,10,20,25,20,15,5)/100
  4. defaulted    <- total*PD
  5. nondefaulted <- total*(1-PD)
  7. <- sum(total)
  9. portfolio <- rbind(defaulted,nondefaulted)/n

constructing matrix in F#:

  1. #r "FSharp.PowerPack.dll"
  3. let PD             = [ 0.05; 0.10; 0.50; 1.00; 2.00; 5.00; 25.00 ]
  4. let counterparties = [ 5.; 10.; 20.; 25.; 20.; 15.; 5]
  6. let groups = PD.Length // risk groups no.
  8. let div100 x = x / 100.0
  10. let PDprct = [ for x in PD do yield div100 x ]
  11. let CPprct = [ for x in counterparties do yield div100 x ]
  13. let n = CPprct |> Seq.sum
  15. let defaulted    = [ for i in 1..groups do yield CPprct.[i-1]*PDprct.[i-1]/]
  16. let nondefaulted = [ for i in 1..groups do yield CPprct.[i-1]*(1.0-PDprct.[i-1])/]
  18. let x = matrix [ defaulted; nondefaulted ]

calculating WR/DR in R:

  1. wr <- n^2-sum(sapply(1:nrow(x)function(i) sum(x[i,])^2))

calculating WR/DR in F#:

  1. let xr = x.NumRows
  3. let rowSum (x : matrix) (i : int) = Array.sum (RowVector.toArray (x.Row(i-1)))
  5. // WR / DR
  7. let wr =
  9.     let mutable mat_sum = 0.0
  11.     for i in 1..xr do
  12.         let row2  = rowSum x i ** 2.0
  13.         mat_sum   <- mat_sum + row2
  15.     n ** 2.0 - mat_sum

Later it gets a little better, but...

'A' function in R:

  1. <- function(x,i,j) {
  3.   xr <- nrow(x)
  4.   xc <- ncol(x)
  6.   sum(x[1:xr>i,1:xc>j])+sum(x[1:xr<i,1:xc<j])
  8. }

'A' function in F#:

  1. let A (x : matrix) i j =
  3.     let xr = x.NumRows
  4.     let xc = x.NumCols
  6.     let rowIdx1 = List.filter (fun x -> x>i) [ 1..xr ]
  7.     let colIdx1 = List.filter (fun x -> x>j) [ 1..xc ]
  9.     let rowIdx2 = List.filter (fun x -> x<i) [ 1..xr ]
  10.     let colIdx2 = List.filter (fun x -> x<j) [ 1..xc ]
  12.     let mutable Asum = 0.0
  14.     for r_i in rowIdx1 do
  15.         for r_j in colIdx1 do
  16.             Asum <- Asum + x.[r_i-1,r_j-1]
  18.     for r_i in rowIdx2 do
  19.         for r_j in colIdx2 do
  20.             Asum <- Asum + x.[r_i-1,r_j-1]
  22.     Asum

As I've mentioned at the beginning of the post - both codes are "dirty". Also, I definitely know R better than F# (even if it may not be apparent from the R code above ;)

Still, F# seems to require more coding and many "simple" operations (matrices...) may not be so easy in F# in comparison to R.

I'm still to find where F# excels :)

[ dirty R code ]

[ dirty F# code ]

środa, 11 grudnia 2013

F# - even the longest journey begins with a single step. And there may be bumps along the way

Don't take me wrong. I have just started familiarizing myself with F# - a fairly new functional programming language developed with heavy involvement of Microsoft.

My intention has been to examine, whether F# can be used for various tasks I usually perform with R (

As for now, F# looks pretty strange.

It is different in many ways from standard programming languages like C/C+. It is also different from R.

Learning it seems like solving a series of logic puzzles, at this stage.

My (very early) F# code is definitely not optimal, but it may give a hint of what may come later.

Take for example a simple function for calculating return on investment in a bond, used in my previous post.

In R, the function looks like that:

  1. # expected (discounted) return
  2. pv <- function(fa,n,cr,rf) {
  3.   -fa+sum(sapply(1:n, function(i) (fa*cr)/(1+rf)^i))+fa/(1+rf)^n
  4. }

You can see the code in context here:

Meanwhile, my F# equivalent is:

At least both functions return the same result :)

The nice thing about F# is that, although Microsoft did not include it in the free Visual Studio Express 2013, there is an online version of the F# available. You can write and test your F# code there.

OK, why F# may look strange? Just a couple of observations:
  • calculating power for floats and integers is handled differently - pown for integers and ** for floats
  • once a function is used with one type of argument - say int - you cannot use it again with any other type - say float
  • separate operations for adding a single element at the beginning of a list (::) and for joining the lists (@)
  • some symbol combinations (example: !!!), while it is possible to define the operations they perform, cannot be used between arguments, i.e. !!! 2 3 is fine, while 2 !!! 3 is not
I would like to stress again, that I am at the very beginning of my journey with F#. 

The peculiarities of F# have not discouraged me so far. I'd say, it is quite the opposite. They have increased my hunger for learning fore about this bizarre creature ;)

wtorek, 10 grudnia 2013

When interest rates go to infinity

CAUTION: It is my first post about corporate debt so excessive simplifications and mistakes are highly probable. Comments welcome.


The standard formula for calculating bond return with 3 year maturity and annual coupon of 5% tells us, that we should expected discounted return of around 13.6%, given the extremely low current "risk free" interest rate.

> pv(fa=100,n=3,cr=0.05,rf)
[1] 13.59618

Do you think it is adequate for the risk we are taking?

Actually it depends :)

Fig.: Probability of Default vs. Interest Rate curve

If we are unlucky and our bond defaults, we may actually lose approx. between 46% and 60% of our investment (assuming RR=37% and RT=1Y, see below).

> de(fa=100,di=0,cr=0.05,rv=0.4,rf,rl=1) # default in year one; no coupon payments
[1] -60.17087
> de(fa=100,di=1,cr=0.05,rv=0.4,rf,rl=1) # default after first coupon payment
[1] -55.36236
> de(fa=100,di=2,cr=0.05,rv=0.4,rf,rl=1) # default after second coupon payment
[1] -50.5744
> de(fa=100,di=3,cr=0.05,rv=0.4,rf,rl=1) # default after third coupon payment
[1] -45.80689

Three critical factors here are Probability of Default (PD), Recovery Rate (RR) and Resolution Time (RT).

The first tells us, how likely we are to lost all or part of our initial investment. 

The second - what part of the investment we could get back.

The third - when can we expect some of our money back after the default.

Average may be misleading here. The default rate for speculative bonds surpassed 11% in the period. In addition, intensity of defaults varies between geographies and industries.

According to Moody's, Resolution Time can take between 6 months and more than 3 years.

Let's focus on the Probability of Default - i.e. freeze all the other parameters: bond maturity = 3 years, Recovery Rate = 37%, Resolution Time = 1 year, and risk free (RF) interest rate = 0.429%.

The 5% annual coupon on our bond implies its Probability of Default at around 5.5%.

This estimation method used means that if we would have a large portfolio of identical bonds with equal and constant PD of 5.5% and annual coupon of 5%, we would finish our investment with (discounted) return of zero - i.e. we have treated our coupon as zero profit interest rate.

PD of 5.5% is clearly above the average historical default rate as recorded by Standard&Poor's. Hence if we believe the actual PD will be lower, say 2%, we will make a profit. Zero profit interest rate at PD equal 2% is 2.2%, so the difference (spread) between our coupon and risk level is 2.8 pp. 

The table below shows the relation between PD and zero profit interest rates required for PDs between 0% and 10%:

          PD    IR
  [1,]   0.00  0.005
  [2,]   0.01  0.013
  [3,]   0.02  0.022
  [4,]   0.03  0.031
  [5,]   0.04  0.039
  [6,]   0.05  0.048
  [7,]   0.06  0.057
  [8,]   0.07  0.066
  [9,]   0.08  0.075
 [10,]   0.09  0.085
 [11,]   0.10  0.094

Reminder: debt maturity, RR, RT and RF are still frozen

Clearly, when default rate increases, we should ask for the higher interest rate. However, as the chart at the beginning shows, the situation starts to be pretty hilarious after reaching some PD level. Around PD of 60%, we need to ask for 100% interest. And even further the required zero profit interest rate goes into infinity...

[ R code used ]

piątek, 29 listopada 2013

Encrypting files with AES in R

The recent news about the widespread NSA electronic spying renewed people interest in securing their data.

R seems an unlikely tool for data encryption, but actually can be used for this purpose.

The digest package provides implementation of two crucial crystallographic algorithms:

By joining these two algorithms, it is possible to write a rudimentary code performing encryption/decryption of virtually any file.

The code first creates a 256-bit hash from an arbitrary key-phrase, using SHA-256. This hash key is used as a key to encrypt/decrypt a file using AES-256.

An additional feature of the code is in-memory compression of the data before encryption.

Also, since AES-256 requires 16-bytes blocks of data, some random information may be glued to the file before encryption. 

Definitely this is not the best-in-class implementation of the symmetric encryption, but for many purposes it may be good enough.

A small warning: in-memory decompression may return an error when one tries to decrypt a file with incorrect key-phrase.

czwartek, 26 września 2013

Elementary Expectation Maximization algorithm with R code

I have recently found a short article describing the Expectation Maximization algorithm.

The Expectation Maximization algorithm enables parameter estimation in probabilistic models with incomplete data.

It is closely related with Maximum Likelihood Estimation (MLE), but while MLE requires complete data, Expectation Maximization works with some latent (unobserved) variables.

The values of the parameters being estimated with the Expectation Maximization algorithm converge at the values calculated with MLE.

For example, for a model with 10 (unfair) coin selections and 100 tosses per selection we may get:

> heads
 [1] 11 14 72 66 63 73 65  9 65 68
> coin
 [1] 1 1 2 2 2 2 2 1 2 2

> # actual probabilities
> head.prob1.init
[1] 0.124
> head.prob2.init
[1] 0.704

> # MLE estimates (we know which coin was selected in every turn)
> head.prob1.est
[1] 0.113
> head.prob2.est
[1] 0.674

> # Maximum Expectation estimates (coin selection is unknown)
> prob1.maxexp
[1] 0.113
> prob2.example
[1] 0.674

The above mentioned tutorial explains the basics of the Expectation Maximization algorithm well enough, so there is no need to repeat its content here.

What the article is missing is a code showing actual implementation of the simple version of this algorithm.

Hence I have written a short R code, demonstrating what's exactly going on here.

Enjoy! :)


Two other earlier introductory posts:

For Dummies: Intro to Generalized Method of Moments

Cointegrated drunk and her dog visualized

piątek, 5 lipca 2013

Trust no one?

I'd written quite extensively about the dark side of the Polish discretionary investment funds (the term in vogue is absolute return, now), but I had some hope in the emerging class of Polish quantitative funds.

So far, they rather disappoint, what frankly I much regret :(

The largest of them, UniSystem 1, managed by Union Investment, is down -8.5% YTD.


It has been falling systematically since the opening of the fund to the public in late 2012.

Unfortunately, it doesn't quite look like a simulated results presented in the fund's marketing materials:

One may say, that the fund is just following the trend:

Source: stooq
Note: fund certificates traded on open market are highly discounted (by some 18%)

Indeed, UniSystem 1 seems highly correlated with WIG20, the main index of the Warsaw Stock Exchange.

It is a little strange for a fund marketed as a market-neutral vehicle, aimed at various markets around the world, and willing to benefit from geographical diversification.

There may be something wrong with the investment models the fund employs.

The poor performance of UniSystem 1 has probably been the reason for cancellation of the new issue of the fund's certificates.

The disappointing fund performance is quite sad, especially taking into account that heavily beaten in recent years SuperFund seems to be finally catching some breath this year. Even if it is still behind S&P500:

Source: SuperFund

Nevertheless, I still hope, UniSystem 1 can turn around, and at least stabilize. The case of Provide Able 2 Trend may be a faint hint of such possibility:


czwartek, 4 lipca 2013

Creating some real value behind digital currencies

While algorithmic trading is moving into the cloud, the cloud is moving into stock exchange :)

Deutsche Boerse is going to start Cloud Exchange, when it will be possible to trade processing power and storage.

It may be interesting to know what entities will be buying computing resources, and if the demand for the resources increases in volatile or in calm times on the markets.

As I wrote a little ago, it may be more difficult to generate higher investment returns in times of low market volatility.

Also, the computation exchange may give some more tangible value to Bitcoin and other crypto or rather digital currencies, that depend on spending processing power on solving difficult mathematical problems. It may even create some possibilities for arbitrage between the price of processing power and the price of the crypto currencies.

Still, I dislike the idea of spending processor cycles and hence burning electricty just for solving some artificial riddles, instead of doing some beneficial computing-intensive calculations, like Folding@Home.

Maybe the introduction of Cloud Exchange will be the first step for creating some more solid fundations for digital currencies.