Data Science

Open Risk Academy Course: Tensor Calculations with the Eigen C++ Library

A DeepDive using the Eigen C++ Library to perform Tensor calculations

Reading Time: 3 min.
Course Objective: The objective of the course is to provide an introduction to using Eigen::Tensor as a high-level library for working with Tensors in C++ projects. We learn: the concepts and techniques of the Eigen Tensor class; how to declare and initialize Tensors of various ranks and types, and how to access Tensor elements; elementary unary and binary operations involving Tensors; more complex operations (reductions, contractions); and how to modify the shape of Tensors. The course is now live at the Academy, and the GitHub repository hosts the C++ scripts used in the course.
Mathematical Representations of Credit Portfolio Data

What do we mean by credit data? This post discusses mathematical terminology and concepts that are useful when working with credit data, taking us from network graph representations of credit systems to commonly used reference data sets.

Reading Time: 27 min.
Definition of Credit Data: What do we mean by credit data? For our purposes, Credit Data is any well-defined dataset that has direct applications in the assessment of the Credit Risk of an individual or an organization, or, more generally, a dataset that allows the application of data-driven Credit Portfolio Management policies. The appearance of credit data is quite familiar to practitioners: a spreadsheet, or a table in a database, with a number of columns and rows full of all sorts of information about borrowers and loans.
Exploring Ten Years of FOSDEM talks

We look into ten years of FOSDEM conference data to start getting to grips with the open source phenomenon and also explore techniques for data review and exploratory data analysis using (of course) open source Python tools. In the process we identify the imprint of the pandemic on attendance, the longest ever title, the distribution of mindshare of time and some notable newcomers.

Reading Time: 12 min.
FOSDEM is a non-commercial, volunteer-organized, two-day conference celebrating free and open-source software development. The conference has a geographic focus on European open source ecosystems and projects. FOSDEM is primarily aimed at developers across the entire range of software, and aims to enable them to meet and discuss the status of projects. We look into ten years of FOSDEM conference data to start getting to grips with the open source phenomenon and also explore techniques for data review and exploratory data analysis using (of course) open source Python tools.
Representing Matrices as JSON Objects: Part 1

Representing a matrix as a JSON object is a task that appears in many modern data science contexts, in particular when one wants to exchange matrix data online. While there is no universally agreed way to achieve this task in all circumstances, in this series of posts we discuss a number of options and the associated tradeoffs.

Reading Time: 13 min.
Motivation and Objective: Representing a matrix as a JSON object is a task that appears in many modern data science contexts, in particular when one wants to exchange matrix data online. There is no universally agreed way to achieve this task, and various options are available depending on the matrix type and the programming tools and environment one has available. Matrices are not native structures in general purpose computing environments. They are typically handled with specific packages (modules, extensions or libraries).
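As a rough illustration of one such option, the sketch below stores a dense matrix as a JSON object with an explicit shape and row-major nested arrays, using only the Python standard library. The field names ("shape", "data") and the function names are assumptions made for this example, not a layout proposed in the post.

```python
import json


def matrix_to_json(rows):
    """Serialize a dense matrix (a list of equal-length rows) to a JSON string."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same length")
    # Hypothetical layout: explicit shape plus row-major nested arrays
    return json.dumps({"shape": [n_rows, n_cols], "data": rows})


def matrix_from_json(text):
    """Parse the JSON layout above and check it against the declared shape."""
    obj = json.loads(text)
    n_rows, n_cols = obj["shape"]
    rows = obj["data"]
    if len(rows) != n_rows or any(len(row) != n_cols for row in rows):
        raise ValueError("data does not match the declared shape")
    return rows


matrix = [[1.0, 0.0], [0.0, 2.5]]
encoded = matrix_to_json(matrix)
decoded = matrix_from_json(encoded)
```

Carrying the shape alongside the data is one possible tradeoff: it is redundant for well-formed input, but it lets a consumer detect truncated or ragged payloads before using the matrix.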
Class Inheritance in Data Science

Object-oriented programming (OOP) techniques such as classes and inheritance are common in many application programming environments but don't travel well outside computer memory. When considering data science tasks and objectives, the transition from object hierarchies to data structures (and vice versa) is not always straightforward. In this short course we explore how some programming languages, data formats, database APIs and web frameworks handle hierarchical classes.

Reading Time: 3 min.
Summary: In this short course we explore how some programming languages, data formats, database APIs and web frameworks handle hierarchical classes. Content: Object-oriented programming (OOP) techniques such as classes and inheritance are common in many application programming environments but alas don’t “travel well” outside computer memory. The potentially intricate relationships of objects (both the data they hold and the meaning and possible uses of the data) are not easy to transfer (except of course by full replication of code and data).
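To make the difficulty concrete, here is a minimal Python sketch, not taken from the course material, of a two-level class hierarchy written out to JSON and read back. The class names (Loan, Mortgage), the "type" tag and the registry are all hypothetical choices for illustration: JSON itself has no notion of inheritance, so the class identity has to be carried as ordinary data and resolved by code on the receiving side.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Loan:
    loan_id: str
    balance: float


@dataclass
class Mortgage(Loan):
    # A subclass adds fields on top of those it inherits from Loan
    property_value: float


# Hypothetical registry mapping the serialized type tag back to a class
REGISTRY = {cls.__name__: cls for cls in (Loan, Mortgage)}


def to_json(obj):
    """Flatten the object to a dict and record its class name as a plain field."""
    payload = asdict(obj)
    payload["type"] = type(obj).__name__
    return json.dumps(payload)


def from_json(text):
    """Rebuild the object by looking up the class named in the type tag."""
    payload = json.loads(text)
    cls = REGISTRY[payload.pop("type")]
    return cls(**payload)


original = Mortgage(loan_id="L1", balance=100000.0, property_value=150000.0)
restored = from_json(to_json(original))
```

Note that only the data and a class label survive the round trip; the class definitions (and any behaviour attached to them) must already exist wherever the JSON is read, which is the "replication of code" the text refers to.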
Open Risk Hydra GSOC 2021 Credit Risk Project Wrap Up

Reading Time: 5 min.
The GSOC 2021 collaboration between Open Risk and the Hydra Ecosystem - Project Wrap-Up. Google Summer of Code 2021 came and went amid the still-ongoing worldwide pandemic. Open Risk was happy to join forces with the Hydra Ecosystem in exploring a proof-of-concept for next-generation APIs using Hydra. The project aimed to guide students to build a hypermedia-enabled REST service that can serve standardized credit portfolio data.
Open Risk Mentoring GSOC 2021 Hydra Nextgen API Project

For the Google Summer of Code 2021 season Open Risk is happy to join forces with the Hydra Ecosystem to mentor a student project that aims to build a hypermedia-enabled REST service around standardized credit portfolio data.

Reading Time: 4 min.
A GSOC 2021 summer project collaboration between Open Risk and the Hydra Ecosystem. Summer is underway and for the Google Summer of Code 2021 season Open Risk is happy to join forces with the Hydra Ecosystem. The project aims to guide students to build a hypermedia-enabled REST service around standardized credit portfolio data. More specifically, the project will build a REST service as the backend for a hypothetical banking entity that collects and disseminates credit portfolio data conforming to an established public standard (the EBA NPL templates, see below).
Risk Function Ontology

The Risk Function Ontology (RFO) is a new ontology describing risk management roles (posts) and functions.

Reading Time: 3 min.
The Risk Function Ontology is a framework that aims to represent and categorize knowledge about risk management functions using semantic web information technologies. Codenamed RFO, it codifies the relationship between the various components of a risk management organization. Individuals, teams or even whole departments tasked with risk management exist in some shape or form in most organizations. The ontology allows the definition of risk management roles in more precise terms, which in turn can be used in a variety of contexts: towards better structured actual job descriptions, more accurate description of internal processes and easier inspection of alignment and consistency with risk taxonomies.
Making Open Risk Data easier

We introduce an online database that allows the (relatively) easy publication of structured risk data.

Reading Time: 1 min.
In an earlier blog post we discussed the promise of Open Risk Data and how the widespread availability of good information that is relevant for risk management can substantially help mitigate diverse risks. The list of Open Risk Data providers, particularly from the public sector, keeps increasing and we are aiming to document all available datasets in the dedicated page of the Open Risk Manual. The trailblazing Wikidata project: In this post we want to introduce another facility, an online database that allows the (relatively) easy publication of structured risk data.
Risk Model Ontology

Reading Time: 2 min.
Semantic Web Technologies: The Risk Model Ontology is a framework that aims to represent and categorize knowledge about risk models using semantic web information technologies. In principle any semantic technology can be the starting point for a risk model ontology. The Open Risk Manual adopts the W3C’s Web Ontology Language (OWL). OWL is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things.
Overview of the Julia-Python-R Universe

We introduce a side-by-side review of the main open source ecosystems supporting the Data Science domain: Julia, Python, R, the trio sometimes abbreviated as Jupyter.

Reading Time: 3 min.
A new Open Risk Manual entry offers a side-by-side review of the main open source ecosystems supporting the Data Science domain: Julia, Python, R, sometimes abbreviated as Jupyter. Motivation: A large component of Quantitative Risk Management relies on data processing and quantitative tools (aka Data Science). In recent years open source software targeting Data Science has found increased adoption in diverse applications. The "Overview of the Julia-Python-R Universe" article is a side-by-side comparison of a wide range of aspects of the Python, Julia and R language ecosystems.
Data Quality and Exploratory Data Analysis using Python

Reading Time: 0 min.
In two new Open Risk Academy courses we work out, step by step, how to use Python to review risk data from a data quality perspective and how to perform exploratory data analysis with pandas, seaborn and statsmodels: "Introduction to Risk Data Review" and "Exploratory Data Analysis using Pandas, Seaborn and Statsmodels".
Data Scientists Have No Future

Reading Time: 1 min.
In the current overheated environment, the working definition of a Data Scientist seems to be: doing whatever it takes to get the job done in a digital #tech domain that we have long neglected but which is now coming back to haunt us! That is nice urgency while it lasts, but it is not a serious job description for the future. You will always find entrepreneurial institutions to offer degrees and certifications on the latest trending hashtag.
The Promise of Open Risk Data

Reading Time: 3 min.
There is a legend that every time a data set is released into the open, somewhere dies a black swan. Well, it is not a true legend. Legends take centuries of oral storytelling to form. In our frantic age, dominated by the daily news cycle and viral Twitter storms, legends have been replaced by the rather more short-lived memes and #hashtags. Black Swans need no introductions. The whole informal theory of black swans concerns improbable events (low likelihood events) that come as a nasty surprise and have large impact.
Can accounting ever be sexy? From IFRS 9 to Sustainability

Reading Time: 1 min.
Accounting probably would not count among the more glamorous of professions. The reasons for that status and whether it is justified are beyond the scope of this brief commentary. What is interesting to note, though, is that the relative attractiveness of accounting is arguably improving, driven by a number of systemic societal developments: the need for more proactive assessment of the state of the world, eliminating the infamous “rear-view mirror” pathology.
What Inka quipus teach us about data management

Reading Time: 3 min.
Chances are that your knowledge of ancient Peruvian culture is a bit rusty. Maybe you have some vague high-school memories of an extensive but backward empire that was conquered and then asset-stripped by a handful of Spanish conquistadores. Or maybe your best preserved memory is the excitement of reading von Daniken’s speculations that the Nazca lines are extraterrestrial spaceports. But unless you happened at some point later in life to hear about the work of Prof.
Open Source Risk Data with MongoDB and Python

Reading Time: 3 min.
Open source software is all the rage these days in IT and the concept is making rapid inroads in all parts of the enterprise. An earlier comprehensive survey by Gartner, Inc. found that by 2011 more than half of organizations surveyed had adopted open-source software (OSS) solutions as part of their IT strategy. This percentage may have currently exceeded the 75% mark according to open source advisory firms.
Open Risk API

Reading Time: 3 min.
If you work in financial risk management you will most likely recognize where the following sentence is coming from: "One of the most significant lessons learned from the global financial crisis that began in 2007 was that banks' information technology (IT) and data architectures were inadequate to support the broad management of financial risks. This had severe consequences to the banks themselves and to the stability of the financial system as a whole." For those lucky few risk managers not affected by inadequate IT systems, the excerpt is from the Basel Committee’s Principles for effective risk data aggregation and risk reporting (2013).
White Paper 03, Introducing the Open Risk API

Reading Time: 1 min.
Open Risk White Paper 3: Introducing the Open Risk API. We develop a proposal for an open source application programming interface (API) that allows for the distributed development, deployment and use of financial risk models. The proposal aims to explore the following key question: how to integrate in a robust and trustworthy manner diverse risk modeling and risk data resources, contributed by multiple authors, using different technologies, and which very likely will evolve over time.
The Zen of Modeling

Reading Time: 1 min.
Risk modeling is as much art as it is science. The Zen of Modeling aims to capture the struggle for risk modeling beauty:
- An undocumented risk model is only a computer program
- A risk model that cannot be programmed is only a concept
- A risk model only comes to life with empirical validation
- Correct implementation of an imperfect model is better than wrong implementation of a perfect model
- In complex systems there is always more than one path to a risk model
- There are no persistently true models but there are many persistently wrong models
- Correlation is imperfectly correlated with causation
- Nirvana is the simplest model that is fit for purpose
- Hierarchical systems lead to hierarchical models