{cas} a journal by Cas Stantonius

Posts tagged with synthetic-data

2 posts found

Data Onboarder - Part 1

· 100 min read · notebook

Can I use DSPy to optimize converting JSON into existing SQL?

## TL;DR

- **Objective**: Generate synthetic data with known data manipulations (transformations and corruptions), then optimize prompts so an agent can spot those manipulations and ultimately suggest *how* the JSON could be incorporated into the SQL.
- **Process**: Created a mock database and Python tools for querying it, defined the manipulations, generated synthetic data based on the existing schema, built an agent to predict these known data anomalies, and attempted to score the predictions.
- **Outcome**: The agent generally did a good job of identifying the issues, but we don't yet know how well, since scoring proved to be a challenge.
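To make the idea of "known manipulations" and scoring concrete, here is a minimal sketch in plain Python. It is not the notebook's code: the `Manipulation` class, the manipulation kinds (`null`, `uppercase`, `rename`), and the exact-match scoring are all assumptions chosen for illustration. It applies labeled manipulations to a record and then compares a set of predicted manipulations against the ground truth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Manipulation:
    """A labeled change applied to one field (hypothetical structure)."""
    field: str
    kind: str  # assumed kinds: "null", "uppercase", "rename"


def apply_manipulations(record: dict, manipulations: list[Manipulation]) -> dict:
    """Return a manipulated copy of the record; the input is left untouched."""
    out = dict(record)
    for m in manipulations:
        if m.kind == "null":
            out[m.field] = None  # corruption: drop the value
        elif m.kind == "uppercase":
            out[m.field] = str(out[m.field]).upper()  # transformation: change case
        elif m.kind == "rename":
            out[m.field + "_v2"] = out.pop(m.field)  # transformation: rename the key
        else:
            raise ValueError(f"unknown manipulation kind: {m.kind}")
    return out


def score(predicted: set[Manipulation], actual: set[Manipulation]) -> dict:
    """Exact-match precision, recall, and F1 over (field, kind) pairs."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


record = {"id": 1, "email": "a@example.com", "city": "Toronto"}
truth = [Manipulation("email", "uppercase"), Manipulation("city", "rename")]
corrupted = apply_manipulations(record, truth)

# Stand-in for an agent's output: one correct guess, one wrong guess.
predicted = {Manipulation("email", "uppercase"), Manipulation("id", "null")}
result = score(predicted, set(truth))
print(corrupted)
print(result)
```

Exact matching on `(field, kind)` pairs is the simplest possible metric. It gives no credit for partially correct or differently worded predictions, which may be one reason scoring free-form agent output is harder than it first appears.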
