# fivecell

> A clearinghouse for data, built for agents.  Sell, subscribe, share — in one standard format, across any cloud.

fivecell sits between data sellers and data buyers.  Sellers list datasets and set the terms.  Buyers find them, agree to the terms, and take delivery.  Some give data away for research or consortia.  Every transfer is signed; every operation is a plain REST/JSON call so an AI agent can discover, evaluate, and acquire data with no human in the loop.

The bulk-data wire format is Apache Arrow or Parquet (typed columns, validated schemas, rejected at the door if the schema fingerprint does not match).  Metadata and small results come back as signed JSON.  Take delivery into Snowflake, BigQuery, Databricks, DuckDB, or a Parquet file on disk.

## Hosting model

Each listing declares a `hosted_by` block — where the dataset bytes physically live.  Two modes, same trust contract:

- `kind: "operator"` — fivecell runs the Arrow Flight endpoint on the seller's behalf at `flight.fivecell.com/datasets/{listing_id}`.  For sellers who don't want to run infrastructure.
- `kind: "seller"` — the seller runs their own Flight server (typically the `fivecell-flight` crate).  Data never touches the operator.  fivecell "vouches" by signing the listing — the signature covers `hosted_by`, so the URL the agent sees is the URL fivecell attested to.  An attestation cron periodically re-pings the seller's schema-fingerprint endpoint and records the outcome in `hosted_by.status` (`live` / `degraded` / `offline`).

Either way, the agent should verify the signed listing with the public keys at `/.well-known/jwks.json` before trusting `acquire.flight_url`.  Seller-mode URLs go through input validation at creation time (https-only, no fivecell-domain impersonation, no SSRF-shaped private/loopback/link-local IPs); the attestation cron re-validates on every ping.

## Agent discovery

- [/.well-known/fivecell.json](https://fivecell.com/.well-known/fivecell.json): single-document anchor; lists every public endpoint, operator contact, supported transaction modes, and build version.  Cached one hour.  Start here.
- [/openapi.json](https://fivecell.com/openapi.json): full OpenAPI 3.1 specification.  Feed this to a code-generator (LangChain, MCP, OpenAI / Claude function calling) to produce a typed client without further instruction.
- [/.well-known/jwks.json](https://fivecell.com/.well-known/jwks.json): public RSA signing keys.  Verify the `x-fivecell-sigchain` header on every response with these.
- [/api](https://fivecell.com/api): human-readable API recipes with curl examples — useful when a human is debugging an agent.

## Public endpoints (no token required)

- `GET /v1/inventory` — paginated list of every active listing.  Each entry carries the seller's published column schema (name, dtype, nullable, row count, format) and an `acquire` block telling the agent the exact HTTP call to make.  Filter by `mode`, `urn_pattern`, `seller`; page with `limit` + `offset`.
- `GET /v1/inventory/{id}` — single-listing detail.
- `POST /v1/feedback` — file a data request or note.  Anonymous OK (no token).  Free-form body.  Sellers subscribe to `feedback_filed` events to see what buyers are asking for.
- `GET /v1/feedback` — browse open requests publicly; prospective sellers use this to decide what to list next.
- `GET /v1/feedback/{id}` — single feedback note.
- `GET /v1/health` — liveness check.

## Endpoints requiring a token

- `POST /v1/tokens` — mint a capability token.  Operator, seller, buyer, and agent roles each have role-specific TTL caps.
- `POST /v1/q` — run a typed query against a dataset the caller has acquired.  Returns signed JSON or Arrow streams depending on the `Accept` header.
- `GET /v1/listings`, `POST /v1/listings` — listing management (seller scope).
- `POST /v1/billing/topup` — top up a buyer's account (operator scope).
- Full list and request/response schemas at [/openapi.json](https://fivecell.com/openapi.json).

## Transaction modes

- **Sell** — money for data.  Seller sets the price.  One-time delivery or a stream of updates.
- **Subscribe** — money for ongoing access.  Seller sets the refresh cadence and what you can do with it.
- **Share** — free access.  For research, consortia, public good.  The seller still sets the rules of use.

## Format guarantee

- Bulk data moves in Arrow or Parquet.  A column declared `notional_usd` is a 64-bit float in USD on both sides.  If the data doesn't match the published schema, fivecell refuses to deliver it.
- Every transfer carries an RSA-signed `x-fivecell-sigchain` header.  Verify with the keys at `/.well-known/jwks.json`.

## How an agent uses fivecell

1. **Discover** with `GET /.well-known/fivecell.json`, then `GET /v1/inventory`.
2. **Read** the listing's machine-readable `terms` block: licence, allowed operations, export caps, embargo dates, pricing.
3. **Acquire** by following the `acquire` block on the inventory entry — method, URL, headers, body template, required scope.  Mint the required token from `POST /v1/tokens` first if you don't already hold one.
4. **Ingest** Arrow, Parquet, or JSON into the agent's tooling.
5. **Verify** the `x-fivecell-sigchain` header on every response and pin it in the agent's reasoning trace.

## If the data you need is not listed

`POST /v1/feedback` with a description.  No token required.  Sellers see it; prospective sellers see it; the operator sees it.  This is the channel for agent-originated demand signal.

## Contact

- Operator: operator@fivecell.com
- Security: see [/.well-known/security.txt](https://fivecell.com/.well-known/security.txt)
