Datasets

Large-scale public datasets for benchmarking OLAP databases and query engines: NYC Taxi, GitHub Archive, Common Crawl, Criteo, and Kaggle.

Large-scale public datasets commonly used for benchmarking OLAP databases, query engines, and data lake tools.