Processing Agency Mortgage Data with Awk, Pandas and Django. Part 2, Dynamic Data

Python is the Swiss Army knife of modern programming languages and a prime candidate to be the Swiss Army knife of risk modelling as well.

Course Content:

This crash course illustrates how to process loan-level US Agency mortgage data using awk, pandas and django. The second part of the course focuses on the performing book. This part covers the following topics:

  • Concepts of the Credit Life Cycle and how changing states are captured in Loan Data as Dynamic (Variable) Fields with a focus on performing loans (excluding delinquent loans)
  • Selecting performing loan data using awk and pandas
  • Working with Date formats
  • Data Quality Concepts for Performing Loans
  • Manipulating and exporting derived data models using pandas
  • Importing performing book credit data models into a Django-based web platform (openNPL) that enables further interactive work with such data
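
To give a flavour of the selection and date-handling topics listed above, here is a minimal pandas sketch. It is not taken from the course material: the column names, the MMYYYY period format, the "00" status code for performing loans and the sample rows are all illustrative assumptions, so check them against the documentation of the data set you actually download.

```python
import io

import pandas as pd

# Hypothetical pipe-delimited extract. Column names and values are
# illustrative assumptions, not the actual agency file layout.
raw = io.StringIO(
    "loan_id|reporting_period|delinquency_status|current_upb\n"
    "A001|012020|00|150000.00\n"
    "A001|022020|00|149500.00\n"
    "A002|012020|01|210000.00\n"
    "A003|022020|00|98000.00\n"
)

# Read every field as text so codes such as "00" keep their leading zeros.
df = pd.read_csv(raw, sep="|", dtype=str)

# Parse the assumed MMYYYY reporting period into a datetime
# (pandas sets the day to the first of the month).
df["reporting_period"] = pd.to_datetime(df["reporting_period"], format="%m%Y")

# Convert the balance column to a numeric type.
df["current_upb"] = pd.to_numeric(df["current_upb"])

# Keep only records flagged as current (assumed code "00" = performing).
performing = df[df["delinquency_status"] == "00"].reset_index(drop=True)

print(performing)
```

Reading the status field as text rather than as a number is a deliberate choice here: an integer conversion would turn "00" into 0 and could fail on any non-numeric status codes present in the data.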

Nota Bene: The course requires actual historical loan performance data for its proper completion. These data are not provided with the course. Students must source the data themselves from the Data Dynamics website and comply with the applicable terms and conditions.

Who Is This Course For:

The course is useful to:

  • Data Engineers / Data Scientists across the financial industry and beyond that need to work with mortgage data
  • Credit Risk Management professionals and students
  • Credit Portfolio Management professionals

How Does The Course Help:

Mastering the course content provides background knowledge for the following activities:

  • Processing large loan-level historical performance data sets
  • Pre-processing, categorizing, segmenting and improving such data sets in preparation for further analysis

What Will You Get From The Course:

  • You will be able to work confidently with loan-level historical performance data
  • You will be able to contribute to the specific use cases mentioned above

Course Level and Difficulty Level:

This course is part of the Risk Modeling using Python family.

  • This is a Core Level course in Risk Modelling. A good grounding at Introductory level in various Data Engineering and Data Science topics is a prerequisite for making the most of this course.
  • This is a Technical course, which means that certain technology elements (Python, CLI) are needed to master the material.

If you have not taken an Open Risk Academy course before, the "CrashCourse Academy Demo" provides a quick overview of the Academy.

The following table places the course in the Open Risk Academy skills diagram:

             Introductory Level    Core Level      Advanced Level
Technical                          CrashProgram

Course Material:

The course material comprises the following:

Time Requirements and Important Dates:

  • The course is self-paced and can be undertaken at any point. It requires a commitment of about five hours total, depending on student familiarity and existing development environment.

Where To Get Help:

If you get stuck on any issue with the course or the Academy:

  • If the issue is related to the course topics or material, check the Course Forum first
  • If the issue is related to the operation of the Open Risk Academy, check the Academy FAQ first. If the issue persists contact us at

Enroll and Get Started with PYT37067
