How this works (plain English)

Last updated: 2026-04-02

This tool helps you check whether your private AI chat content may have ended up in public training datasets. We built it for everyone, not just technical users.

What happens when you search

  1. You type a query (like an email, phone number, name, or unusual phrase).
  2. We send that query to public HuggingFace dataset search endpoints.
  3. We check for matches in known high-risk datasets.
  4. We show you redacted previews and match confidence.
  5. We immediately discard your query and results after the response is returned.
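For the technically curious, the steps above can be sketched in a few lines of Python. This is a minimal illustration, not our actual implementation: it assumes the public Hugging Face datasets-server full-text search endpoint (`https://datasets-server.huggingface.co/search`), and the `redact_preview` helper and its masking style are hypothetical examples of how a redacted snippet could be produced.

```python
from urllib.parse import urlencode

# Public Hugging Face full-text search endpoint (assumed for this sketch).
SEARCH_BASE = "https://datasets-server.huggingface.co/search"

def build_search_url(dataset: str, query: str, config: str = "default",
                     split: str = "train", length: int = 10) -> str:
    """Step 2: build a search request URL for one public dataset."""
    params = {"dataset": dataset, "config": config, "split": split,
              "query": query, "offset": 0, "length": length}
    return f"{SEARCH_BASE}?{urlencode(params)}"

def redact_preview(text: str, query: str, context: int = 12) -> str:
    """Step 4 (hypothetical helper): show a short window around the match,
    with the matched query itself masked out."""
    i = text.lower().find(query.lower())
    if i == -1:
        return ""  # no match in this row
    start = max(0, i - context)
    end = min(len(text), i + len(query) + context)
    window = text[start:end]
    # Mask the matched span so the sensitive value is never displayed.
    masked = (window[: i - start]
              + "█" * len(query)
              + window[i - start + len(query):])
    prefix = "…" if start > 0 else ""
    suffix = "…" if end < len(text) else ""
    return prefix + masked + suffix
```

In a full client you would fetch `build_search_url(...)` for each dataset in the registry, run every returned row through `redact_preview`, display the redacted snippets, and then drop the query and raw results (step 5). Nothing here persists anything to disk.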

What we are

What we are not

What we store

We store public dataset registry metadata (dataset names and links) and basic service health metrics (uptime, latency, and error rates). We never store raw user query content or raw match content.

What your result means

What to do if you find a match

  1. Use the "Report to dataset host" button.
  2. File an FTC complaint if you want a regulatory paper trail.
  3. Review your state privacy rights.

Short version

We help you check public datasets. We do not keep your sensitive search data. We show only redacted snippets. We give you next steps.

We'll also help you figure out what to do next. Why are we doing this? Because someone needs to, and at HPL Company we believe people like us broke this. It's our responsibility to help you fix it. Have questions? Need more help? Did we miss something? Let us know.

Questions or corrections: hello@hplcompany.com