Virtualization of PIM Capable Memory

May-2022 (ENS-Lyon) to Jan-2024 (Remote)

Supervised by Prof. Alain Tchana and Prof. Oana Balmau.

Abstract:

Data movement is the leading cause of performance degradation and energy consumption in modern data centers. Processing-in-memory (PIM) is an architecture that addresses data movement by bringing computation inside the memory chips. Although PIM is not a new concept, the recent availability of UPMEM, the first commercial PIM device for off-the-shelf machines, has led to a resurgence of interest in this space.

PIM device virtualization for cloud environments is crucial to enable large-scale adoption of processing-in-memory. We are the first to study the virtualization of UPMEM PIM devices by designing and implementing vPIM, an open-source UPMEM virtualization system for the cloud. We hope this work will lay the foundation for future research on PIM for cloud computing.

[Figure: overview]

The paper has been submitted to USENIX ATC '24.

Persistent Memory in TimescaleDB: Persistent Memory Caching

Research Project

January-2022 to April-2022

DISCS-lab of McGill University

Supervised by Prof. Oana Balmau.

Abstract:

Previously, we conducted a baseline study and benchmark of Time Series Databases (TSDBs) such as InfluxDB and TimescaleDB to understand their structures and properties. Our goal was to eventually deploy new storage technologies such as Persistent Memory (PMEM) in a TSDB and assess how emerging storage technologies could change the way data-intensive storage systems work. In this project, we explored several ways of adapting the database and evaluated several Persistent Memory Development Kit (PMDK) libraries that could be used. We then implemented a persistent memory cache and added it to TimescaleDB and PostgreSQL. Finally, we benchmarked the database with the original cache and the PMEM cache running in parallel and analyzed its performance.

Background:

The Intel® Optane™ DC Persistent Memory Module (PMEM) is the first commercially available NVDIMM that creates a new tier between volatile DRAM and block-based storage. Persistent memory is non-volatile, byte-addressable, low-latency memory with densities greater than or equal to those of Dynamic Random Access Memory (DRAM). It has the potential to dramatically increase system performance and enable a fundamental change in computing architecture: applications, middleware, and operating systems are no longer bound by file system overhead in order to run persistent transactions.

[Figure: Persistent Memory Module]

We conducted a structural study and performance benchmark of two popular Time Series Databases: InfluxDB and TimescaleDB. We found that the speed of the same query varies greatly between the two databases, and that query rates also differ considerably between query types within the same database. We decided to choose one database and apply persistent memory to it. InfluxDB is implemented in Go, while TimescaleDB is implemented in C. From an implementation perspective, we chose TimescaleDB, because many PMDK libraries support C, which gives more flexibility in programming.

Comparison Study of Time Series Databases: InfluxDB vs TimescaleDB

Research Project

September-2021 to December-2021

DISCS-lab of McGill University

Supervised by Prof. Oana Balmau.

Abstract:

Time has become a crucial factor in the analysis and use of data, as it helps organizations better understand trends and patterns over time and, eventually, predict future behavior. Time series data and Time Series Databases (TSDBs) were developed to make optimized use of this time dimension. Resolving database bottlenecks and improving system efficiency are therefore crucial to further progress. Motivated by the bottlenecks of Time Series Databases, we conducted this research project to study the functionalities and properties of Time Series Databases, to compare their performance on insertion and various kinds of queries, and ultimately to choose the Time Series Database best suited to our needs. This project serves as a baseline and as preparation for our goal of eventually deploying new storage technologies such as Persistent Memory in a TSDB and assessing how emerging storage technologies could change the way data-intensive storage systems work.

Jiaxuan Chen
