Monday, 23 June 2014

Stochastic Mortality (Limited)

Introduction

I have now added some stochastic mortality functionality to Mortality Manager. However, I have limited this to just generating projected rates using R and simply adding the ability to use these rates.

This post therefore covers how I created the rates using R (but through F#) and how these rates can be viewed and plotted in Mortality Manager.

Generating the Rates

To generate the rates I used the following:
To create the rates you first need to ensure that the demography package is downloaded to make it available to R. You then need to download the mx and population rates, which are needed as inputs to the Lee-Carter code in the R package.

I chose to use UK rates. These contain rates for the general UK population from 1922 to 2011. I downloaded the two required files - Mx_1x1.txt and Exposures_1x1.txt. Here is a sample from each:

Mx_1x1.txt


United Kingdom, Death rates (period 1x1)     Last modified: 06-May-2013, MPv5 (May07)

   Year      Age       Female       Male         Total
   1922        0     0.070053     0.092574     0.081500
   1922        1     0.024976     0.027255     0.026130
   1922        2     0.012143     0.012979     0.012566
   1922        3     0.006072     0.006324     0.006199
   1922        4     0.004426     0.004305     0.004365
   1922        5     0.003861     0.003781     0.003821
   1922        6     0.002958     0.003157     0.003058

...

Exposures_1x1.txt

United Kingdom, Exposure to risk (period 1x1)     Last modified: 06-May-2013, MPv5 (May07)

   Year      Age       Female          Male         Total
   1922        0       446849.54    461921.95    908771.49
   1922        1       452480.82    463987.76    916468.59
   1922        2       427577.73    439102.38    866680.10
   1922        3       359372.64    367637.22    727009.86
   1922        4       326025.47    331448.37    657473.84
   1922        5       349516.32    355195.04    704711.36
   1922        6       379820.11    385139.04    764959.16
   1922        7       404411.29    407120.72    811532.01
   1922        8       418049.48    419024.29    837073.76
   1922        9       418158.45    420482.13    838640.58 

...

I then created an F# Console Application to generate the projected rates:

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 
22: 
23: 
24: 
25: 
26: 
27: 
28: 
29: 
30: 
31: 
open System
open System.IO
open RProvider
open RDotNet
open RProvider.demography
open MortMgr

[<EntryPoint>]
let main argv = 
    let uk = R.read_demogdata("c:/Mx_1x1.txt","c:/Exposures_1x1.txt", "mortality", label="UK")
    let maxage = 110
    let h,nsim,ser,snm = 100,1000,"Male","UK Males"
    let LCm=R.lca(uk,series=box ser,max_age=box maxage,interpolate=box true)
    let fcastLCm=R.forecast_lca(LCm,h=h)
    let yrs = fcastLCm.AsList().["year"].AsInteger()|>Seq.toArray
    let sim=R.simulate_fmforecast(namedParams["object",box fcastLCm;"nsim",box nsim])
    let simv = sim.AsNumeric()|>Seq.toArray
    let qxts =
        let ans = Array.zeroCreate nsim
        for s = 0 to nsim-1 do
            let qxt = Array2D.create 121 h 1.0
            for a = 0 to 120 do
                for t = 0 to h-1 do
                    let mxt = simv.[s*h*(maxage+1)+t*(maxage+1)+(min a maxage)]
                    qxt.[a,t] <- mxt/(1.0+0.5*mxt)
            ans.[s]<-qxt
        ans
    let stbl = {StochName=snm;StartYear=yrs.[0];EndYear=yrs.[yrs.Length-1];Nosims=qxts.Length}
    IndStoch.addtbl stbl
    qxts|>Array.mapi (fun i q -> IndStoch.addsim stbl (i+1) q)|>ignore
    0 // return an integer exit code

Lines 1 to 6 set up the required references. Line 3 and 4 reference the R Provider. Line 5 references the specific R package. Line 6 references Mortality Manager, which is used to store the generated simulations.

Line 10 loads the files using R. Lines 11 and 12 set up some required parameters. In this case, we are going to generate 1,000 simulations for Males, projecting for 100 years.

Line 13 uses R to fit the input data to the Lee-Carter model. Line 14 then generates the forecast function. Line 15 gets the years to be forecast and converts this to an array. Line 16 then generates 1,000 sample forecast simulations. Line 17 then converts these simulations into an array.

Lines 18 to 27 then converts these simulations of mx to a 2D array of qx values.

Lines 28 to 30 store the results. Lines 28 and 29 just stores the general summary of the data as a Record Type. Line 30 separately stores each simulation. These results are stored as Json using Json.NET.

 Viewing the Simulations

I have just added a single Task Pane to allow you to view these simulations.


This provides the ability to view an individual simulation using the View Simulation button. In this case, simulation 1 is displayed. This shows the projected values of qx for each age and for each projected year. Note that the projection starts at 2012, one year after the last year of the data from the Human Mortality Database.

Simulation from a Specific Age/Year

The other facility provided is to view projected rates for a specified age and year.

This allows you to get simulations for a particular age from a particular year using the View Simulations button. This shows how the model might simulate rates of mortality for a particular individual. In this case, we are simulating from year 2020 for a male then aged 30.

You also get a plot of these simulations:


Mortality Manager v004

I have also packaged up Mortality Manager into a new distribution. These files are stored with the source code on bitbucket. The application is available in macro enabled workbooks for 32 bit and 64 bit versions of Excel:
The application is also available as an XLL, again for 32 bit or 64 bit Excel:
If you use these, you may also want to download an example workbook which uses the application:

Saturday, 7 June 2014

More Plotting

Introduction

As I mentioned in my last post, I have decided to extend the code I have for Excel interop through a new project on bitbucket. This is principally to add further charting capabilities and allow it to be used generally, in particular, when using F# Interactive.

I recently came across an interesting book on forecasting called "Forecasting: principles and practice" by Rob J Hyndman and George Athana­sopou­los. This is available in hard copy, but also published online at:
This contains a number of interesting plots using R and is supported by a CRAN package containing code and data for the examples in the book. In this post, I will replicate the plots shown in the chapter "the forecasters toolbox" - 2.1 Graphics.

Setting up the data

I first needed to make the data available to F#. I loaded up RStudio, loaded the fpp package and then tested I could generate the first example chart. I then saved the backing data in CSV format. This is the simple R code:

##do plot
plot(melsyd[,"Economy.Class"], main="Economy class passengers: Melbourne-Sydney",xlab="Year",ylab="Thousands")

##write csv for F#
write.zoo(as.zoo(melsyd),file="c:/vs/vs13/FSharp.XlInt/fpp/ch2/melsyd.txt",sep=",")


This generated a text file with these contents:

"Index","First.Class","Business.Class","Economy.Class"
1987.48076923077,1.912,NA,20.167
1987.5,1.848,NA,20.161
1987.51923076923,1.856,NA,19.993
1987.53846153846,2.142,NA,20.986
1987.55769230769,2.118,NA,20.497
....


The following chart was displayed:

In F# interactive I then used Deedle to load this data for use by F# and then called my library to display the chart in Excel.

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
#load @"..\..\packages\Deedle.0.9.12\Deedle.fsx"
#r @"..\..\FSharp.XlIntLib\bin\debug\Office.dll"
#r @"..\..\FSharp.XlIntLib\bin\debug\Microsoft.Office.Interop.Excel.dll"
#r @"..\..\FSharp.XlIntLib\bin\debug\FSharp.XlIntLib.dll"

open System
open Deedle
open FSharp.XlInt

// call unless Excel is open and OK to use open version
Xl.start()

//do melsyd
let melsyd0 = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/melsyd.txt")
let melsyd1 : Frame<float, string> = melsyd0 |> Frame.indexRows "Index"
let plt1 = 
    Plt.line (melsyd1?``Economy.Class``
              |> Series.observations, Title = "Economy Melbourne-Sydney", XTitle = "Year", YTitle = "Thousands", XSpacing = 48, XNumberFormat = "0")

In lines 1 to 8, I set up the references needed to Deedle, Excel Interop and my library. In line 11, I call a utility function in my library that opens Excel and makes this instance the one to work with. In line 14, I use Deedle to load the CSV file into a dataframe. In line 15, I create an amended version of the dataframe that uses the Index field as the index of the dataframe. Finally, in lines 16 to 18, I call my library to plot the Economy.Class series. This is the result:


Further Plots

Similarly, I loaded other data from R into CSV files and then loaded them for use by F#. This is the remainder of the F# code used to generate further plots:

 1: 
 2: 
 3: 
 4: 
 5: 
 6: 
 7: 
 8: 
 9: 
10: 
11: 
12: 
13: 
14: 
15: 
16: 
17: 
18: 
19: 
20: 
21: 
22: 
23: 
24: 
25: 
26: 
27: 
28: 
29: 
30: 
31: 
32: 
33: 
34: 
//do a10
let a10 = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/a10.txt")
let a101 : Frame<float, string> = a10 |> Frame.indexRows "Index"
let sales = a101?x
            |> Series.observations
let plt2 = Plt.line (sales, Title = "Antidiabetic drug sales", XTitle = "Year", YTitle = "$ million", XSpacing = 60, XNumberFormat = "0")

//do seasonplot
Plt.seasonplot (sales, Title = "Seasonal drug sales", YTitle = "$ million")
//do monthplot
Plt.monthplot (sales, Title = "Seasonal deviation drug sales", YTitle = "$ million")

//do fuel
let fuel = Frame.ReadCsv(__SOURCE_DIRECTORY__ + "/fuel.txt")
let city = fuel?City
           |> Series.values
let carbon = fuel?Carbon
             |> Series.values
let xy = Seq.zip city carbon

//jitter
Plt.scatterplot (Plt.jitter (xy), Title = "Jitter", XTitle = "City mpg", YTitle = "Carbon footprint") |> ignore

//scatterplot matrix
let litres = fuel?Litres
             |> Series.values
let highway = fuel?Highway
              |> Series.values
let data = [| litres; city; highway; carbon |]
let labels = [| "Litres"; "City"; "Highway"; "Carbon" |]

Plt.scatterplotmatrix (data, labels, Title = "ScatterPlot Matrix") |> ignore
//call to tidy up excel
Xl.close()

In lines 1 to 5, the data for sales is loaded. In line 6, a simple line plot of this date is created:

In line 9, the data is loaded into a plot showing the seasonal changes each month for each year:


In line 11, there is a variation on this plot which shows for each month how the sales change over the years:

In lines 14 to 19, some car fuel data is loaded, which is then plotted as a standard scatter plot in line 22. The more interesting element of this plot is that the data is "jittered" before being plotted to help illustrate where certain points have overlapping data items.

In lines 25 to 30, four of the series in the fuel dataframe are set up to be plotted against each other in the commonly used scatterplot matrix. In line 32, this data was plotted:


Finally, in line 34, a utility function was called to close this instance of Excel.