Data Products

Accelerate your data time-to-value with starter kits and LLM data extraction.

Over a career, you repeatedly run into the same missing datasets, the ones you end up carrying from one job to the next. A really good calendar dimension. Sets of intervals for grouping timestamps. A mapping from postal codes to latitude/longitude. We publish and license these kinds of missing datasets.
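To make the calendar dimension idea concrete, here is a minimal sketch in Python using only the standard library. The function name `build_calendar_dimension` and the column names are illustrative assumptions, not the schema of any dataset we license. It shows the general shape: one row per day, with the date attributes precomputed.

```python
from datetime import date, timedelta


def build_calendar_dimension(start: date, end: date) -> list[dict]:
    """Return one row per calendar day from start to end, inclusive.

    Column names here are hypothetical, chosen for illustration only.
    """
    rows = []
    current = start
    while current <= end:
        # ISO 8601 year, week number, and weekday (Monday=1 ... Sunday=7).
        iso_year, iso_week, iso_weekday = current.isocalendar()
        rows.append({
            "date_key": int(current.strftime("%Y%m%d")),  # e.g. 20240101
            "date": current.isoformat(),
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day": current.day,
            "iso_year": iso_year,
            "iso_week": iso_week,
            "iso_weekday": iso_weekday,
            "is_weekend": iso_weekday >= 6,
        })
        current += timedelta(days=1)
    return rows


calendar = build_calendar_dimension(date(2024, 1, 1), date(2024, 12, 31))
```

A production calendar dimension would typically add fiscal periods, holidays, and locale-specific attributes. Those are usually the parts that take the longest to get right.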

Most companies also lack a comprehensive semantic layer with a baseline of reports, visualizations, and analytics products. Product revenue mix. Predicting churn. Revenue per cohort. We package roughly the first year's worth of projects for a particular industry, business function, or use case into a Semantic Starter Kit. Each kit includes a "silver" and "gold" layer data model, reference visualizations, and Python scripts.
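As one example of the kind of analysis named above, revenue per cohort can be sketched in a few lines of Python. This is a simplified illustration, not code from a Semantic Starter Kit: the `orders` records, the function name `revenue_per_cohort`, and the choice to define a cohort as the month of a customer's first order are all assumptions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records: (customer_id, order_date, revenue).
orders = [
    ("c1", date(2024, 1, 15), 100.0),
    ("c1", date(2024, 2, 10), 50.0),
    ("c2", date(2024, 1, 20), 80.0),
    ("c3", date(2024, 2, 5), 120.0),
    ("c3", date(2024, 3, 1), 30.0),
]


def revenue_per_cohort(orders):
    """Sum revenue by cohort, where a cohort is the month of a customer's first order."""
    # Pass 1: find each customer's earliest order date.
    first_order = {}
    for customer_id, order_date, _ in orders:
        if customer_id not in first_order or order_date < first_order[customer_id]:
            first_order[customer_id] = order_date

    # Pass 2: attribute every order's revenue to the customer's cohort month.
    totals = defaultdict(float)
    for customer_id, _, revenue in orders:
        cohort = first_order[customer_id].strftime("%Y-%m")
        totals[cohort] += revenue
    return dict(totals)


cohort_revenue = revenue_per_cohort(orders)
```

With the sample data, customers c1 and c2 fall into the 2024-01 cohort and c3 into 2024-02, so all of c3's revenue, including the March order, is attributed to February.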

We can also use LLMs to create datasets directly, drawing only on their training data. Most of our datasets are created directly or indirectly by LLMs. Extracting data from LLMs is difficult to do at scale, given the high compute cost, estimation challenges, legal issues, validation effort, and the need to track changes over time. We can deliver the datasets you need quickly and within your token budget.