Open Source Risk Modeling Manifesto

This post is a summary of a presentation given at the 2014 Autumn TopQuants Meeting, aka, the Open Source Risk Modeling Manifesto.

The dismal state of quantitative risk modeling

The current framework of internal risk modeling at financial institutions has suffered a fatal triple stroke. We saw, in quick succession, market risk, operational risk and credit risk measurement failures, covering practically all business models.

This fact left the science and art of quantitative risk modeling reeling under the crushing weight of empirical evidence. The aspect of failure we are interested in here is the technical failure of risk models, that is, the engineering side of things, as distinct from risk management failure, which concerns the human context of using risk models. After all,

Good risk managers can use primitive or poor risk models to good effect and poor risk managers will ignore or subvert the outcomes of perfect risk models

It would take volumes to document all the specific weaknesses and faults of risk modeling revealed by the successive crises since 2008. For our purposes some cursory glances will suffice to set the tone:

  • In the market risk space, the academic mantra that “credit risk is just another form of market risk” has proven disastrously wrong. This exposed deep methodological difficulties (model risk) plaguing the market risk treatment of illiquid products.
  • In operational risk, multi-billion fines levied on the industry revealed that the best practice technical approach, the “reduced form” Advanced Measurement Approach (AMA), is essentially blind to the buildup of internal risk factors and unable to offer a reasonable update of views after an event has materialized.
  • Finally, and most unfortunately, credit risk models, which are vital for the real economy, managed to get every statistical moment of the distribution wrong:
    • The first order (PD / expected loss) estimates have proven unable to capture the deterioration of underwriting standards (key product / client risk factors were ignored).
    • The second order (volatility or correlation) aspects have not captured the dependency between markets because of obsolete approaches to estimating sector correlations.
    • The tail risk side of the models has not included rare but disastrous events such as sovereign default. Contagion and systemic risk modeling was still in its infancy.
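To make the three aspects of the credit loss distribution listed above concrete, here is a minimal Python sketch. It assumes a textbook one-factor Gaussian latent variable model, which is an illustrative choice and not a method taken from the presentation. The function names and all parameter values (portfolio size, PD, LGD, correlation) are hypothetical. The sketch suggests how an asset correlation parameter leaves the expected loss roughly unchanged while inflating both the volatility and the tail of the loss distribution.

```python
import random
from statistics import NormalDist, mean, pstdev


def simulate_losses(n_obligors=100, pd=0.02, lgd=0.5, rho=0.2,
                    n_scenarios=2000, seed=42):
    """Simulate portfolio loss fractions under a one-factor Gaussian model.

    All defaults are hypothetical illustration values, not calibrated numbers.
    """
    rng = random.Random(seed)
    # An obligor defaults when its latent variable falls below this threshold
    threshold = NormalDist().inv_cdf(pd)
    w_sys = rho ** 0.5           # loading on the shared systematic factor
    w_idio = (1.0 - rho) ** 0.5  # loading on the obligor-specific factor
    losses = []
    for _ in range(n_scenarios):
        z = rng.gauss(0.0, 1.0)  # systematic factor, common to all obligors
        defaults = sum(
            1 for _ in range(n_obligors)
            if w_sys * z + w_idio * rng.gauss(0.0, 1.0) < threshold
        )
        losses.append(lgd * defaults / n_obligors)
    return losses


def summarize(losses, tail=0.99):
    """Return first order, second order and tail statistics of the losses."""
    ordered = sorted(losses)
    idx = min(len(ordered) - 1, int(tail * len(ordered)))
    return {
        "expected_loss": mean(ordered),   # first order
        "volatility": pstdev(ordered),    # second order
        "tail_quantile": ordered[idx],    # empirical tail quantile
    }


independent = summarize(simulate_losses(rho=0.0))
correlated = summarize(simulate_losses(rho=0.3))

for label, stats in (("rho=0.0", independent), ("rho=0.3", correlated)):
    print(label, {k: round(v, 4) for k, v in stats.items()})
```

In this toy setup, a model that gets the PD right but understates the correlation would report a plausible expected loss while understating both the volatility and the tail quantile, which is the kind of second order and tail failure described above.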

The problem with internal risk models is already reflected in various new regulatory policies introduced since the crisis (non-risk based metrics such as the leverage ratio, standardization of risk models, etc.) that reverse genuine technical achievements spanning decades of effort.

But what is there to be done?

The risk modeling community is certainly not missing intellectual firepower. It can revisit and fix what is fixable and jettison what was unworkable. The real challenge is to constructively channel this firepower towards a more robust and professional landscape that will serve the industry and will also be recognized by other stakeholders. Alas, this is not an easy task. Very deservedly, there is little appetite for one more round of self-declared and self-serving “excellence”.

The current industry setup around internal risk models has failed. Our view is that a viable future can instead adopt and emulate the behaviors, organizational patterns and toolkits of technical areas that have succeeded rather than failed in tasks of similar complexity.

While inspiration can be drawn from many other areas of human endeavor (most areas of engineering actually qualify: when was the last time your car exploded going uphill?), our focus here is on a paradigm we denote as Open Source Risk Modeling.

What does the success of open source teach us?

Risk models are essentially just special purpose software, and developing risk modeling solutions has many affinities with developing open source software.

We believe that:

re-engineering some key parts of the risk modeling workflow along the lines followed by open source communities offers a viable technical change program that can re-establish (in due course) confidence in the risk quantification tools developed by the financial industry

Open source has ushered in new working paradigms that are extremely effective at solving tough problems:

  • Wikipedia, a community-driven encyclopedia, is the 6th-ranked website globally and has eclipsed every other effort to compile a general-purpose encyclopedia.
  • Linux, the stable and high-performance open source operating system, dominates both internet servers and mobile devices.
  • MariaDB, MongoDB and others form a growing list of production-ready open source databases that increasingly dominate rankings of the most important database technologies.
  • Stack Overflow, a website supporting collaborative programming, receives 4M hits per day. The software world was indeed changed by open source!
  • GitHub, a cloud repository for open source projects, has become the go-to place for trying out and then joining the open source movement.

The above examples (just a small sampler of a vast and growing universe!) utilize to varying degrees the following three key concepts:

  • Open source licensing that allows access to and propagation of intellectual property,
  • Promotion of standards that enable inter-operability and quality control, and
  • A collaborative work ethic that pools the efforts of independent agents.

The concept of open source licensing is fundamental for the current boom in software. Under the open source paradigm, while developers retain copyrights to their creation, the software (or other IP) is released under a license that permits (for example) inspection of the source code and - depending on the type of license - modification of the code and even further packaging of the code into new products, possibly even commercial resale. This setup acts multiplicatively, enabling the building of complex software frameworks with multiple contributors.

While the licensing and contributor agreements take care of the legal framework for collaboration, it is the collaborative tools and standards that make open source communities true productivity beehives. There is by now a huge range of tools, websites, techniques and how-tos. Just a sample: developer education tools (Stack Overflow, public wikis), collaboration tools (GitHub), project management styles (agile and scrum), documentation tools (new markup schemes), package management tools, open standards (W3C) and application programming interfaces (APIs).

Besides the legal framework and the enabling technical toolkit, there are a number of behaviors that are prevalent in open source and very conducive to productive, high-quality development: attribution becomes the means to build reputation, peer review is used in accepting contributions of code, and ideas are selected in online forums that discuss project directions. Some of these behaviors are reminiscent of academic environments, but they generally occur rather naturally and without much formal governance.

Open Source Risk Modeling

In-house use of open source software to support various operations (e.g., Linux servers) is by now a reality in the financial sector. But as far as the broader risk analysis stack is concerned, open source is only marginally present, although not completely new: there are certain microfinance initiatives that developed field-oriented open source front-end systems (MIFOS, Cyclos), there are some trading-oriented pricing and risk libraries (QuantLib, OpenGamma), there are insurance (actuarial) risk models (PillarOne, OpenUnderwriter) and finally there are numerous contributions to open source systems such as R and Python. We maintain an up-to-date list here.

Conspicuously missing from the above list is any significant effort targeting the risk modeling of “core” banking operations, including for example the standard credit, operational and business risk analysis tools. This is where Open Risk aims to make a difference by supporting the formation of an open source community focusing on this area. The architecture of this open source risk modeling framework would consist of a broad contribution community, comprising individuals in academia, financial firms and/or regulatory bodies. Anybody from within (or outside) the community can check out, comment on, test, validate and opine on the risk library. Checking in is subject to open standards that are enforced by peer review within the community. Users can either use standardized versions (use the code verbatim) or use customized versions (fork the code).

Open Risk is currently organizing the development of an open source risk library. While in principle contributions are welcome in any language / platform, there are benefits to standardizing around a few key promising technologies. For this reason we suggest Python, R, Julia and/or C++.

Contributions in other languages are also welcome provided they adhere to the Open Risk API. While the work program is huge, we are aiming first for proof-of-principle projects. Some demo libraries and the following organizational tools are already available for any interested developer:

  • Risk Manual: This is a public wiki holding the documentation of the principles and methodologies behind the risk library
  • GitHub: This is the public repository storing the libraries. To use it, create a GitHub account and you are ready to commit code!

Q & A

The question that arises most frequently from finance professionals who have not been involved in open source concerns the economic perspective. Without going into all the details here, suffice it to say that there are multiple channels that can support the different modalities of an open community: from corporate sponsorship, to crowd-funding, to ad-driven business models, to added services (such as training and support), to “pro” versions of software that offer additional / full functionality.

If all other industries find economic ways to benefit from open source, there should be a way for the financial sector as well!

The second most frequent question from finance professionals concerns data privacy. This is a genuine issue. The answer is simply that a good majority of risk model development does not require sensitive client data, and certainly not before the final stages of calibration. Open source risk modeling will need to adapt to some of the significant constraints of this particular industry.

Do you have suggestions / ideas / observations around open source in general or Open Risk in particular? Tell us through the feedback button!