Overview of the Julia-Python-R Universe

We introduce a side-by-side review of the main open source ecosystems supporting the Data Science domain: Julia, Python and R, a trio sometimes abbreviated as Jupyter.

October 16, 2019 (Last Modified: April 23, 2024)

Releases, Open Risk Manual, Open Source Tools

A new Open Risk Manual entry offers a side-by-side review of the main open source ecosystems supporting the Data Science domain: Julia, Python and R, sometimes abbreviated as Jupyter.

Motivation

A large component of Quantitative Risk Management relies on data processing and quantitative tools (aka Data Science). In recent years, open source software targeting Data Science has seen increasing adoption in diverse applications. The Overview of the Julia-Python-R Universe article is a side-by-side comparison of a wide range of aspects of the Python, Julia and R language ecosystems.

The comparison of the three ecosystems aims:

To be useful for people who are somewhat familiar with programming and want to compare options and choose the most appropriate tool
To promote interoperability, cross-validation and overall best-practices across the three pillars
To be factual, as much as possible, without drifting to judgement / opinions
To cover the use cases relevant for the implementation of quantitative risk models

This comparison does not aim:

To be a detailed / comprehensive catalog of all available libraries (which by now number in the many thousands!)
To cover use cases that are far removed from quantitative risk models
To be exhaustive (for example to identify all the possible computer systems one can run a Python interpreter on, or count all the possible ways one can perform linear regression in R)

Topics covered

History and Community
Devices and Operating Systems
Package Management
Package Documentation
Language Characteristics
Development Environment
Files, Databases and Data Manipulation
Data Quality and Data Validation
Workflow Management
General Purpose Mathematical Libraries
Core Statistics Libraries
Stochastic Processes
Econometrics / Timeseries Libraries
Machine Learning Libraries
GeoSpatial Libraries
Visualization
Web, Desktop and Mobile Deployment
Privacy-Preserving Computation
Semantic Web / Semantic Data
Bindings to Other Languages
High Performance Computing
Using R, Python and Julia together

Disclaimers

The comparison absolutely does not provide an assessment of which system is better. The proper way to use the comparison is to start with one's objectives, knowledge level and use case, and then figure out what other components to add to one's toolkit.

Remark: The comparison attempted here is not entirely appropriate, as the three systems for computing have quite different origins and, therefore, quite diverging architectural design choices.

For example, strictly speaking R is not a general programming language. R is a system for statistical computation and graphics. It consists of a sufficiently general language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files.

Yet despite this disclaimer, we hope you agree with us that a comparison is justified, because in a very large domain of applications and use cases the three frameworks can be used interchangeably (or nearly so).

Structure

The comparison data are provided in tabular format in several distinct tables.
Each table documents a relevant language or ecosystem subdomain.
The number and focus areas of the different tables are somewhat arbitrary and may expand in the future.
The order of the topics is roughly from more generic aspects towards more specialized / advanced areas, concluding with interoperability.
Each table entry (row) highlights key functionality within the subdomain. The language columns point to information or packages and, where applicable, there might be additional commentary.
Reference links are included when useful.

At the bottom of some tables there is a row labeled Package Review. This row has a collection of links to the CRAN Task Views, which aim to summarize the large number of R packages available for some data science tasks. There are also links to a mirror effort to create Python Task Views (this content is still a work in progress; contributors are welcome, see below).

Getting Involved

You can provide simple and anonymous feedback on the wiki version of the overview using the feedback button at the bottom of the page. Alternatively you can become an Open Risk Manual author and actively edit the page.

If you are more comfortable using GitHub / Markdown, there is a mirror page available here, which can be seen as a web page here. Please note that the tables are in HTML format, as they are generated automatically.

People interested in developing the Python Task Views can contribute via the GitHub repo.
