Michael P. Andersen, David E. Culler
February 22, 2016
The increase in high-precision, high-sample-rate telemetry timeseries poses a problem for existing timeseries databases, which can neither cope with the throughput demands of these streams nor provide the necessary primitives for effective analysis of them. We present a novel abstraction for telemetry timeseries data and a data structure for providing this abstraction: a time-partitioning, version-annotated, copy-on-write tree. An implementation in Go is shown to outperform existing solutions, demonstrating a throughput of 53 million inserted values per second and 119 million queried values per second on a four-node cluster. The system achieves a 2.9x compression ratio and satisfies statistical queries spanning a year of data in under 200ms, as demonstrated on a year-long production deployment storing 2.1 trillion data points. The principles and design of this database are generally applicable to a large variety of timeseries types and represent a significant advance in the development of technology for the Internet of Things.
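The abstract names the data structure but does not describe its layout. As a rough illustration only, the following Go sketch shows one way a time-partitioning, version-annotated, copy-on-write tree could work: each node covers a fixed time range, stores precomputed statistics for that range, and is never modified in place, so every insert yields a new root (a new version) that shares untouched subtrees with older versions. The binary fan-out, the `leafWidth` constant, and all type and function names here are assumptions made for this sketch and are not taken from the authors' implementation.

```go
package main

import (
	"fmt"
	"math"
)

// Point is one timestamped sample.
type Point struct {
	Time  int64
	Value float64
}

// Stats is the aggregate kept in every node for its time range.
type Stats struct {
	Count    uint64
	Min, Max float64
	Sum      float64
}

// merge combines two aggregates; an empty aggregate is the identity.
func merge(a, b Stats) Stats {
	if a.Count == 0 {
		return b
	}
	if b.Count == 0 {
		return a
	}
	return Stats{Count: a.Count + b.Count, Min: math.Min(a.Min, b.Min),
		Max: math.Max(a.Max, b.Max), Sum: a.Sum + b.Sum}
}

const leafWidth = 4 // hypothetical: ranges this narrow store raw points

// node covers the half-open time range [start, start+width).
type node struct {
	start, width int64
	version      uint64 // version in which this node was created
	stats        Stats
	left, right  *node   // children of internal nodes (binary fan-out assumed)
	points       []Point // raw points, leaves only
}

// insert returns a new node containing p; the old node n is left untouched.
func insert(n *node, start, width int64, p Point, ver uint64) *node {
	c := &node{start: start, width: width, version: ver}
	if n != nil {
		c.stats, c.left, c.right = n.stats, n.left, n.right
		c.points = append([]Point(nil), n.points...)
	}
	c.stats = merge(c.stats, Stats{Count: 1, Min: p.Value, Max: p.Value, Sum: p.Value})
	if width <= leafWidth {
		c.points = append(c.points, p)
		return c
	}
	half := width / 2
	if p.Time < start+half {
		c.left = insert(c.left, start, half, p, ver)
	} else {
		c.right = insert(c.right, start+half, width-half, p, ver)
	}
	return c
}

// query aggregates the points with from <= Time < to below n.
func query(n *node, from, to int64) Stats {
	if n == nil || to <= n.start || from >= n.start+n.width {
		return Stats{}
	}
	if from <= n.start && n.start+n.width <= to {
		return n.stats // fully covered: use the precomputed aggregate
	}
	if n.width <= leafWidth {
		var s Stats
		for _, p := range n.points {
			if p.Time >= from && p.Time < to {
				s = merge(s, Stats{Count: 1, Min: p.Value, Max: p.Value, Sum: p.Value})
			}
		}
		return s
	}
	return merge(query(n.left, from, to), query(n.right, from, to))
}

// Tree keeps one root per version; roots[0] is the empty tree.
type Tree struct {
	start, width int64
	roots        []*node
}

func NewTree(start, width int64) *Tree {
	return &Tree{start: start, width: width, roots: []*node{nil}}
}

// Insert adds p and returns the number of the newly created version.
func (t *Tree) Insert(p Point) (uint64, error) {
	if p.Time < t.start || p.Time >= t.start+t.width {
		return 0, fmt.Errorf("time %d outside tree range", p.Time)
	}
	ver := uint64(len(t.roots))
	t.roots = append(t.roots, insert(t.roots[ver-1], t.start, t.width, p, ver))
	return ver, nil
}

// Query returns statistics for [from, to) as of the given version.
func (t *Tree) Query(ver uint64, from, to int64) Stats {
	if ver >= uint64(len(t.roots)) {
		return Stats{}
	}
	return query(t.roots[ver], from, to)
}

func main() {
	t := NewTree(0, 16)
	v1, _ := t.Insert(Point{Time: 1, Value: 10})
	v2, _ := t.Insert(Point{Time: 9, Value: 30})
	fmt.Println(t.Query(v1, 0, 16).Count, t.Query(v2, 0, 16).Count)
}
```

In this sketch a query whose range fully covers a node reads that node's stored aggregate instead of descending to the raw points, which is one plausible reason statistical queries over long spans can be answered quickly. Older versions remain queryable because their roots are never overwritten.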
Metadata
- Authors: Michael P. Andersen, David E. Culler
- Deposited: December 22, 2021
- Available: December 22, 2021
- ISSN: --
- Text Version: fast16-papers-andersen.pdf.txt
- PDF Version: fast16-papers-andersen.pdf