Prometheus

Project website: http://prometheus.io/
GitHub: https://github.com/prometheus


Sparse High Resolution Histograms

Currently co-authoring the proof of concept (PoC) for sparse high resolution histograms in Prometheus with Björn Rabenstein. This work touches almost all of the Prometheus codebase: scraping, TSDB, PromQL, recording and alerting rules, and soon remote read and remote write. It will eventually be released in the main branch; the current experiments are happening in the sparsehistogram branch. Find the detailed design doc by Björn here, and another document of his on PromQL extensions here.


Prometheus Alert-Generator Compliance Specification and Test Suite

Authored the 1.0 specification for Prometheus Alert-Generator compliance, which you can find here. I am currently authoring the test suite that automatically checks any software against this specification. You can find the ongoing work in prometheus/compliance.


Snapshot of In-Memory Chunks on Shutdown for Faster Restarts

Snapshotting the in-memory chunks on shutdown brought down the restart time of Prometheus by up to 80%! It was added in PR#7229. You can find a detailed explanation in this blog post.


Memory-Mapping of Head Chunks from Disk

Memory-mapping head chunks from disk brought down the memory usage of Prometheus by up to 50%. This was achieved with the combination of PR#6830 and PR#6679. You can read more about it in this blog post, and find a detailed explanation in this blog post.


'@' Modifier in PromQL

Based on this design doc, the @ modifier for PromQL was added in PR#8121. Learn more in this blog post.
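As a rough illustration of the syntax (the metric name http_requests_total and the timestamp are placeholders, not taken from the PR), the modifier pins the evaluation time of a selector to a fixed Unix timestamp, independent of the query's own evaluation time:

```promql
# Value of the series at Unix time 1609746000, whatever the query evaluation time is.
http_requests_total @ 1609746000

# The modifier also works on range selectors: the 5m window ends at that timestamp.
rate(http_requests_total[5m] @ 1609746000)

# start() and end() resolve to the start and end of a range query.
http_requests_total @ start()
```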


Subquery Support in PromQL

With PR#4831, subqueries of the form

<instant_query> '[' <range> ':' [ <resolution> ] ']' [ offset <duration> ]

were introduced in Prometheus. You can read more about them in this blog post.
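For instance, assuming a hypothetical counter named http_requests_total, the grammar above could be instantiated as follows; the metric name and durations are illustrative only:

```promql
# Evaluate the 5m rate at 1m resolution over the last 30m.
rate(http_requests_total[5m])[30m:1m]

# A subquery yields a range vector, so it can feed a range function:
# the highest 5m rate seen over the last 30m.
max_over_time(rate(http_requests_total[5m])[30m:1m])

# Resolution omitted: the global evaluation interval is used instead.
max_over_time(rate(http_requests_total[5m])[30m:])

# With an offset: the same 30m window, shifted 1h into the past.
max_over_time(rate(http_requests_total[5m])[30m:1m] offset 1h)
```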


Vertical Compaction and Queries in TSDB

With PR#370, Prometheus became able to handle time-overlapping blocks of data. This enabled backfilling of old data into Prometheus.


Persisting the State of Alerts Across Restarts

My GSoC 2018 work, as the title says. PR#4061. Read more about it in my blog post.


Unit Testing of Rules in promtool

My GSoC 2018 work, as the title says. PR#4350 added unit testing of rules in promtool. Read more about it in my blog post.


Performance and Memory Optimizations

It's more about the investigation than the final fix.