Data Types are a fundamental building block of data science: Data science is about data, but data are not simple and tame beasts. They have character and attitude, which can cause a lot of friction between them and the data scientist. There is a lot of sweat and tears involved when confronting data, and data scientists could do worse than learn how to handle Data Type quirks in particular. Namely, a good fraction of data science involves not modelling data, not transforming data, not even cleaning data, but simply goading data into the right containers, providing them with the right stage that fits their character.
Visualizing a year in lockdowns and restricted mobility: As we move into February 2021 the world will have experienced almost a year under pandemic conditions. This has markedly changed behavioral patterns of human mobility across the board. One major difference with previous pandemics is that, through the use of a variety of digital technologies and new data collection channels, we now have an unprecedented view of those changing mobility patterns.
Constructing a Global Mobility Index (GMI): In previous posts (here, and here) we introduced new Open Risk Dashboard functionalities that integrate COVID-19 community mobility data (currently focusing on the datasets provided by Google). As a reminder, these reports chart human mobility trends over time, as collected from mobile geolocation data. The granularity is by geography and across different categories of places / activities such as retail and recreation areas, groceries and pharmacies, parks, transit stations, workplaces, and residential areas.
Is the size of global debt truly “astronomical”? The notion of astronomical numbers and figures seeps quite frequently into everyday language when large quantities of something are encountered in “normal” life. The strict definition of astronomical is obviously something of, or relating to, astronomy and astronomical observations, but in common usage it also denotes something enormously or inconceivably large. This is, of course, because astronomical figures are inconceivably large!
Using Sankey Diagrams: Sankey Diagrams are a type of flow diagram composed of interconnected arrows. The width of the arrows is proportional to the flow rate. Sankey diagrams are often used in physical sciences (physics, chemistry, biology) and engineering but also in economics. They can be used to represent the relative role and significance of various inputs and outputs in a given process. Sankey diagrams emphasize the major transfers within a system.
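The proportionality rule described above (arrow width tracks flow rate) can be sketched in a few lines of Python. This is only an illustrative sketch: the flow names and values, the helper `arrow_widths`, and the `max_width` drawing unit are all hypothetical and not taken from the post.

```python
# Hypothetical flows as (source, target, value); the numbers are illustrative only.
flows = [
    ("Input A", "Process", 60.0),
    ("Input B", "Process", 40.0),
    ("Process", "Output X", 75.0),
    ("Process", "Losses", 25.0),
]


def arrow_widths(flows, max_width=20.0):
    """Scale each flow to an arrow width proportional to its value.

    The largest flow is drawn at max_width (arbitrary drawing units),
    and every other arrow is scaled down by the same factor.
    """
    largest = max(value for _, _, value in flows)
    if largest <= 0:
        raise ValueError("at least one flow must be positive")
    return {(src, dst): max_width * value / largest for src, dst, value in flows}


widths = arrow_widths(flows)
for (src, dst), width in widths.items():
    print(f"{src} -> {dst}: width {width:.1f}")
```

Because every arrow shares one scale factor, the ratio between any two widths equals the ratio between the underlying flows, which is what lets a reader pick out the major transfers at a glance.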
What this blog post is about (and what it isn’t): With the ever more widespread adoption of Data Science, defined as the intensive use of data in various forms of decision making, there is a renewed interest in Visualization as an effective channel for humans to understand data at various stages of the data lifecycle. There is a large variety of data visualization tools which can produce an ever more bewildering variety of visualization types
The community mobility reports and OpenCPM: In a previous post we introduced new OpenCPM functionality that integrates COVID-19 community mobility data (currently from Google). The reports chart movement trends over time by geography, across different categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential. While these reports are unlikely to persist as open data sources in the long term, the current availability (as of May 2020) enables providing within OpenCPM a mobility data dashboard that can help draw insights through visualization and statistical analysis.
The community mobility reports and OpenCPM: As the COVID-19 pandemic unfolded, technology providers (most notably Google and Apple) made available to the public aggregated and anonymized data about human mobility in the crisis period (on the basis of smartphone location data). These Community Mobility Reports provide insights into how mobility patterns changed in response both to pandemic news and to policies aimed at combating COVID-19. The reports chart movement trends over time by geography, across different categories of locations and activities, such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential.
Course Content: This course is a CrashProgram (short course) introducing the GeoJSON specification for the encoding of geospatial features. The course is at an introductory technical level. It requires some familiarity with data specifications such as JSON and a very basic knowledge of Python. Who Is This Course For: The course is useful to any developer or data scientist who wants to work with geospatial features encoded in the GeoJSON format. How Does The Course Help: Mastering the course content provides background knowledge towards the following activities:
Course Content: This course is an introduction to the concept of credit contagion. It covers the following topics: Contagion Risk Overview and Definition; Various Contagion Types and Modelling Challenges; The Simple Contagion Model by Davis and Lo; Supply Chains Contagion; Sovereign Contagion. Who Is This Course For: The course is useful to: Risk Analysts across the financial industry and beyond; Risk Management students; Quantitative Risk Managers developing or validating risk models. How Does The Course Help: Mastering the course content provides background knowledge towards the following activities:
Connecting the Dots: Economic Networks as Property Graphs: We develop a quantitative framework that approaches economic networks from the point of view of contractual relationships between agents (and the interdependencies those generate). The representation of agent properties, transactions and contracts is done in the context of a property graph. A typical use case for the proposed framework is the study of credit networks. You can find the white paper here: (OpenRiskWP08_131219)
A new logo for the Open Risk Manual: We have updated the logo for the Open Risk Manual. The new logo aims to make more explicit both the inspiration that the Open Risk Manual project draws from the trail-blazing Wikipedia initiative (and the growing collection of associated Wikimedia projects) and the reliance on the open source ecosystem of software and tools, including the MediaWiki software and the important Semantic MediaWiki extension.
Visualization of large scale economic data sets: Economic data are increasingly being aggregated and disseminated by Statistics Agencies and Central Banks using modern APIs (application programming interfaces), which enable unprecedented accessibility to wider audiences. In turn, the availability of relevant information enables more informed decision making by a variety of actors in both public and private sectors. An excellent example of such a modern facility is the European Central Bank’s Statistical Data Warehouse (SDW), an online economic data repository that provides features to access, find, compare, download and share the ECB’s published statistical information.
Motivation for the comparison: A large component of risk management relies on data processing and quantitative tools. In turn, such information processing pipelines and numerical algorithms must be implemented in computer systems. Computing systems come in an extraordinarily large variety, but in recent years open source software has seen increased adoption for diverse applications (machine learning, data science, artificial intelligence). In particular, cloud computing environments are primarily based on open source projects at the systems level.
The challenge with historical credit data: Historical credit data are vital for a host of credit portfolio management activities, starting with the assessment of the performance of different types of credits and going all the way to the construction of sophisticated credit risk models. Such is the importance of data inputs that, for risk models impacting significant decision making / external reporting, there are even prescribed minimum requirements for the type and quality of necessary historical credit data.
Release of version 0.4.1 of the transitionMatrix package focuses on stressing transition matrices: Further building the open source OpenCPM toolkit, this release of transitionMatrix features: Feature: Added functionality for conditioning multi-period transition matrices; Training: Example calculation and visualization of conditional matrices; Datasets: State space description and CGS mappings for top-6 credit rating agencies. Conditional Transition Probabilities: The calculation of conditional transition probabilities given an empirical transition matrix is a highly non-trivial task involving many modelling assumptions.
Representing economic activity using pictograms: Visualization can produce significant new insights when applied to quantitative data. It is currently undergoing a renaissance that mirrors other developments in computing and data science. Sophisticated open source libraries such as d3.js or matplotlib, to name but a couple, are enabling an ever wider range of users to distill valuable information from the avalanche of data being produced. Yet when it comes to visualizing data that relate to abstract concepts it can be quite difficult to find an appropriate grammar to express the quantitative context.
Open Risk released version 0.1 of the Transition Matrix Library. Motivation: State transition phenomena, where a system exhibits stochastic (random) migration between well-defined discrete states (see picture below for an illustration), are very common in a variety of fields. Depending on the precise specification and modelling assumptions they may go under the name of multi-state models, Markov chain models or state-space models. In financial applications a prominent example of a phenomenon that can be modelled using state transitions is the credit rating migration of pools of borrowers.
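To make the idea of state transitions concrete, here is a minimal sketch that estimates a one-period transition matrix by simple counting (a cohort-style estimator), assuming a discrete-time Markov chain. It is not the Transition Matrix Library's API: the function `estimate_transition_matrix`, the state labels and the rating histories are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def estimate_transition_matrix(sequences, states):
    """Estimate one-period transition probabilities by counting.

    sequences: list of observed state histories, one per borrower.
    states: ordered list of all state labels.
    Returns a nested dict P[i][j] = estimated probability of moving i -> j.
    """
    counts = Counter()
    for seq in sequences:
        # Count each observed one-step move (current state, next state).
        for current, nxt in zip(seq, seq[1:]):
            counts[(current, nxt)] += 1

    matrix = {}
    for i in states:
        row_total = sum(counts[(i, j)] for j in states)
        if row_total == 0:
            # Assumption: a state with no observed exits is treated as absorbing.
            matrix[i] = {j: (1.0 if j == i else 0.0) for j in states}
        else:
            matrix[i] = {j: counts[(i, j)] / row_total for j in states}
    return matrix


# Hypothetical rating histories: "A" and "B" are ratings, "D" is default.
states = ["A", "B", "D"]
histories = [
    ["A", "A", "B"],
    ["A", "B", "D"],
    ["B", "B", "B"],
    ["A", "A", "A"],
]

P = estimate_transition_matrix(histories, states)
for i in states:
    print(i, [round(P[i][j], 3) for j in states])
```

Each row of the result sums to one, so it can be read as the probability distribution over next-period states for a borrower currently in that row's state. Treating unobserved states as absorbing is a modelling choice made here for simplicity.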
Visualizing the risk management of the future: How do we communicate risk insights? The information tools used by risk managers to communicate insights have been transformed multiple times over the ages. In each era we have adopted existing technologies, but we also created demand for new technologies. Our era is no exception. To understand where we are going, we need to understand where we are coming from. So let's briefly recap our industrious past before we peer briefly into the visually exciting crystal ball.
Correlation Radar added to the Dashboard: About the Correlation Radar: The EU Risk Dashboard is a web app developed by Open Risk to assist with the exploration and understanding of the large number of economic indicators published by the ECB in its Statistical Data Warehouse. The app data are derived from the timeseries available in the Warehouse. Most readings in the currently selected series are monthly or quarterly and are updated when those become available at the Warehouse.