Sprocket Data Sets

What is this?

This page describes the datasets available through the Sprocket platform. Most are provided in the Parquet format, a compact and efficient columnar format that holds the same kind of tabular data as CSV while taking up much less space.

Because Parquet is less widely supported than CSV or JSON, we recommend a tool called DuckDB for working with these datasets. DuckDB can query the datasets remotely without downloading them first, which makes it fast and efficient. Most examples here assume you're using DuckDB.
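
As a rough sketch of what querying a dataset in place looks like, the helper below builds a DuckDB SQL statement that reads a Parquet file straight from a URL using DuckDB's `read_parquet` function. The URL and the `buildParquetQuery` name are hypothetical placeholders, not actual Sprocket endpoints or APIs.

```javascript
// Hypothetical placeholder URL; substitute a real Sprocket dataset URL.
const EXAMPLE_URL = "https://example.com/datasets/players.parquet";

// Builds a DuckDB query that reads a remote Parquet file in place.
function buildParquetQuery(url, limit = 10) {
  if (!Number.isInteger(limit) || limit < 0) {
    throw new RangeError("limit must be a non-negative integer");
  }
  // Double any single quotes so the URL stays a valid SQL string literal.
  const quoted = url.replace(/'/g, "''");
  return `SELECT * FROM read_parquet('${quoted}') LIMIT ${limit}`;
}

console.log(buildParquetQuery(EXAMPLE_URL, 5));
```

The resulting string can be pasted into any DuckDB session. Reading over HTTP(S) depends on DuckDB's HTTP file support being available in the environment you use.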

If you're more comfortable with common formats, we also provide these datasets as CSV and JSON. These can be used directly in Google Sheets, or in other websites and tools you use or build.
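
For a website or tool you build yourself, a CSV dataset can be turned into plain row objects with a small parser. The sketch below is one minimal way to do that, assuming the file has a header row and uses standard double-quote escaping; `parseCsv` and the sample data are illustrative and not part of any Sprocket library.

```javascript
// Parses CSV text into an array of row objects keyed by the header row.
// Handles quoted fields, escaped quotes ("") and commas or newlines inside quotes.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // an escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n") {
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  // Flush the last line when the text does not end with a newline.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }

  const [header, ...records] = rows;
  return records.map((r) =>
    Object.fromEntries(header.map((name, j) => [name, r[j] ?? ""]))
  );
}

// Made-up sample data for illustration only.
const sampleCsv = 'name,team\n"Doe, Jane",Alpha\nBob,Beta\n';
console.log(parseCsv(sampleCsv));
```

Every value comes back as a string, so numeric columns need converting (for example with `Number(...)`) before doing arithmetic on them.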


Exploring this data:

This is not an exhaustive list of tools that are compatible with the Parquet format, but it has a couple of options that can help you get started.

The quickest way to get started is with the DuckDB Web Console, which runs an instance of DuckDB in your browser, or with one of the tools listed below.

Reporting Tools

Command line tools

Browser Based Tools

Desktop Tools

  • Tad A fast, free, cross-platform tabular data viewer application powered by DuckDB.
  • qStudio A free SQL tool specialized for data analysts. It runs on every operating system and allows easy browsing of tables and charting of results.
  • DBeaver DBeaver Community is a free cross-platform database tool for developers, database administrators, analysts, and everyone working with data.

Using this data in a JavaScript project:

Browser Usage:

The Sprocket team also provides a ready-to-use JavaScript bundle that you can load on your webpage. Use the example below to load it on your page and run a simple query.

<!-- Load the Javascript bundle -->
<script type="module" async src="https://f004.backblazeb2.com/file/sprocket-artifacts/public/pages/assets/sprdb/sprdb.js" defer id="spr_script"></script>
<!-- Use the library -->
<script>
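// Wait for the bundle to finish loading; look the element up explicitly
// rather than relying on the implicit global created by its id
document.getElementById("spr_script").addEventListener("load", async () => {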
    // Wait for the database to initialize itself, this may take a couple seconds
    await spr.ready

    // Log that it is ready; and run a simple query
    console.log("SprocketDB Ready")
    console.log(await spr.query("SELECT 1"))
})
</script>

Reference:

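// Runs a SQL query and resolves to an array of row objects
type Query = (query_string: string) => Promise<Record<string, string | number | boolean>[]>
type Schema = Record<string, Record<string, string>>

type spr = { query: Query, schema: Schema, ready: Promise<void> }

// Pseudocode: the bundle exposes an object of this shape as a global
window.spr = spr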
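
To make the `Schema` type above more concrete, here is a small sketch that flattens a `Schema`-shaped object into readable lines. It assumes the outer keys are table names, the inner keys are column names, and the inner values are column type names; the reference above gives only the shape, so that reading is a guess. `describeSchema` and the mock data are hypothetical.

```javascript
// Flattens a Schema-shaped object into "table.column: type" lines.
// Assumption: outer keys are table names, inner keys are column names,
// and inner values are column type names.
function describeSchema(schema) {
  const lines = [];
  for (const [table, columns] of Object.entries(schema)) {
    for (const [column, type] of Object.entries(columns)) {
      lines.push(`${table}.${column}: ${type}`);
    }
  }
  return lines;
}

// Mock data for illustration only; the real tables and columns may differ.
const mockSchema = { players: { name: "VARCHAR", salary: "INTEGER" } };
console.log(describeSchema(mockSchema).join("\n"));
```

In the browser, the same helper could be called as `describeSchema(spr.schema)` once `await spr.ready` has resolved.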

Node.js Usage:

For server-side usage, the Sprocket team has created the sprocketdb package. You can install and use this package to easily query the datasets.

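import { sprocketDb } from "sprocketdb";

const main = async () => {
  // Factory function that sets up the database and returns a query function
  const query = await sprocketDb();

  // Fetch a single row from the players table
  const aPlayer = await query("SELECT * FROM players LIMIT 1");

  console.log(aPlayer[0]);
};

// Report any setup or query failure instead of leaving the rejection unhandled
main().catch(console.error);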