Data language for building maintainable social science datasets


The project is under active development and is not production-ready. You are free to experiment with the language and tools, but backward compatibility is not guaranteed. If you are interested in using REAM for serious projects, please contact me.

Migration from the old documentation is a work in progress. Not all features are well-documented, and the website needs more polishing. See here for a more detailed discussion.

REAM is an alternative to spreadsheets and databases for managing social science datasets, with an emphasis on maintainability. It compiles to both analysis-ready datasets (CSV, JSON) and human-readable documentation (HTML, PDF). Multiple formats, one source.

graph LR
  SOURCE[REAM File] --> PARSER(["REAM Parser (ream-core)"])
  PARSER --> DATA[("Datasets: CSV, JSON, etc.")]
  SOURCE --> CONVERTER(["Third-party Markdown Converter"])
  CONVERTER --> DOC[["Documentation: HTML, PDF, etc."]]

REAM has three main components:

  1. a data serialization language for structured datasets (work in progress)
  2. a data template for generating datasets (work in progress)
  3. a collection of filters for data manipulation (planned)

The language, along with the toolchain, aims to help social scientists build human-readable, modular and reusable datasets, ranging from small personal projects to large collaborative projects. Learn more about REAM's features to see whether it is the right choice for your next project.

It reads like Markdown, writes like Markdown, converts like Markdown. Is it Markdown?
REAM Ain't Markdown!
# Country
- name: Belgium
- capital: Brussels
- population: 11433256
  > data from 2019; retrieved from World Bank
- euro_zone: TRUE
  > joined in 1999

## Language
- name: Dutch
- size: 0.59

## Language
- name: French
- size: 0.4

## Language
- name: German
- size: 0.01
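
To make the "one source, multiple formats" idea concrete, here is a minimal Python sketch of how the sample above could be flattened into CSV rows. It is not the ream-core parser and does not follow the official REAM grammar. It handles only the toy subset shown here (`#` headers, `- key: value` variables, `>` annotations) and assumes that each nested `Language` entry is joined with the variables of its parent `Country`. The function names (`parse_entries`, `flatten`, `to_csv`) and the `Class.variable` column naming are hypothetical choices made for this illustration.

```python
import csv
import io

# The sample from this page, embedded so the sketch is self-contained.
SAMPLE = """\
# Country
- name: Belgium
- capital: Brussels
- population: 11433256
  > data from 2019; retrieved from World Bank
- euro_zone: TRUE
  > joined in 1999

## Language
- name: Dutch
- size: 0.59

## Language
- name: French
- size: 0.4

## Language
- name: German
- size: 0.01
"""


def parse_entries(text):
    """Return (level, class_name, variables) tuples in document order."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            hashes, _, name = line.partition(" ")
            entries.append((len(hashes), name.strip(), {}))
        elif line.startswith("- ") and entries:
            key, _, value = line[2:].partition(":")
            entries[-1][2][key.strip()] = value.strip()
        # Lines starting with ">" are annotations; this sketch skips them.
    return entries


def flatten(entries):
    """Assumption: join each deepest-level entry with its ancestors' variables."""
    rows, stack = [], []
    max_level = max(level for level, _, _ in entries)
    for level, name, variables in entries:
        # Keep only ancestors that sit above the current header level.
        stack = [entry for entry in stack if entry[0] < level]
        stack.append((level, name, variables))
        if level == max_level:
            row = {}
            for _, cls, cls_vars in stack:
                for key, value in cls_vars.items():
                    # Hypothetical naming scheme to avoid clashes such as "name".
                    row[f"{cls}.{key}"] = value
            rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the flattened rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `print(to_csv(flatten(parse_entries(SAMPLE))))` yields one row per language, each carrying the country-level variables. Annotations are dropped in this sketch; in REAM they belong to the documentation output rather than the dataset.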