{
  "access": "public",
  "type": "reference",
  "format": "markdown",
  "title": "Data Tools Reference",
  "chunked": true,
  "url": "https://library.datagrout.ai/data-tools",
  "summary": "Pure JSON/structure manipulation tools for operating on any value — maps, arrays, or scalars. Zero credits, no external calls, no LLM. Plus `data.map` for fan-out orchestration over MCP tools. All tools accept either inline `payload` or a `cache_ref` from `_meta.datagrout.cache_ref` of any previous tool response. All Data tools are available at `data-grout@1/data.*@1`.",
  "content_markdown": "# Data Tools\n\nPure JSON/structure manipulation tools for operating on any value — maps, arrays, or scalars. Zero credits, no external calls, no LLM. Plus `data.map` for fan-out orchestration over MCP tools. All tools accept either inline `payload` or a `cache_ref` from `_meta.datagrout.cache_ref` of any previous tool response. All Data tools are available at `data-grout@1/data.*@1`.\n\n---\n\n## `data.get@1`\n\nAccess a value at a path within a nested structure.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | any | conditionally | -- | JSON value to access. Provide either `payload` or `cache_ref` |\n| `cache_ref` | string | conditionally | -- | Cache reference from a prior tool call |\n| `path` | array | yes | -- | Ordered list of keys (strings) and indices (integers) |\n\n### Example\n\n```json\n{\n  \"name\": \"data-grout@1/data.get@1\",\n  \"arguments\": {\n    \"cache_ref\": \"rc_abc123...\",\n    \"path\": [\"QueryResponse\", \"Invoice\", 0, \"TotalAmt\"]\n  }\n}\n```\n\n**Response:** `{\"value\": 150.00, \"found\": true}`\n\n---\n\n## `data.pick@1`\n\nKeep only the specified keys from a map. 
If `payload` is a list of maps, the keys are picked from each map.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | object \\| array | conditionally | -- | Map or list of maps |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `keys` | array | yes | -- | List of key names to keep |\n\n### Example\n\n```json\n{\n  \"name\": \"data-grout@1/data.pick@1\",\n  \"arguments\": {\n    \"payload\": {\"name\": \"Alice\", \"age\": 30, \"email\": \"a@b.com\", \"internal_id\": \"xyz\"},\n    \"keys\": [\"name\", \"age\"]\n  }\n}\n```\n\n**Response:** `{\"data\": {\"name\": \"Alice\", \"age\": 30}}`\n\n---\n\n## `data.omit@1`\n\nRemove the specified keys from a map. If `payload` is a list of maps, the keys are removed from each map.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | object \\| array | conditionally | -- | Map or list of maps |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `keys` | array | yes | -- | List of key names to remove |\n\n---\n\n## `data.take@1`\n\nReturn the first N items from an array. When operating on a `cache_ref`, creates a structural-sharing view instead of duplicating data.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | array | conditionally | -- | Array to take from |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `n` | integer | yes | -- | Number of items to take |\n\n**Response:** `{\"data\": [...], \"count\": N}`\n\n---\n\n## `data.drop@1`\n\nSkip the first N items from an array and return the rest. 
Creates a view ref when using `cache_ref`.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | array | conditionally | -- | Array to drop from |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `n` | integer | yes | -- | Number of items to skip |\n\n---\n\n## `data.keys@1`\n\nReturn the keys of a map or the indices (0..n-1) of an array.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | any | conditionally | -- | Map or array |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n\n**Response:** `{\"keys\": [\"name\", \"age\", \"email\"], \"count\": 3}`\n\n---\n\n## `data.count@1`\n\nCount items in an array, keys in a map, or characters in a string.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | any | conditionally | -- | Value to count |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n\n**Response:** `{\"count\": 42}`\n\n---\n\n## `data.flatten@1`\n\nFlatten a nested map into a single-level map whose keys are the nested paths joined by `separator` (a dot by default).\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | object \\| array | conditionally | -- | Nested map or array |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `separator` | string | no | `\".\"` | Path separator |\n\n### Example\n\n```json\n{\n  \"name\": \"data-grout@1/data.flatten@1\",\n  \"arguments\": {\n    \"payload\": {\"user\": {\"name\": \"Alice\", \"address\": {\"city\": \"SF\"}}}\n  }\n}\n```\n\n**Response:** `{\"data\": {\"user.name\": \"Alice\", \"user.address.city\": \"SF\"}}`\n\n---\n\n## `data.merge@1`\n\nMerge two maps together. 
Values in `target` override values in `base` when keys conflict.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `base` | object | conditionally | -- | Base map (inline) |\n| `base_cache_ref` | string | conditionally | -- | Cache reference for base |\n| `target` | object | conditionally | -- | Map to merge on top |\n| `target_cache_ref` | string | conditionally | -- | Cache reference for target |\n| `deep` | boolean | no | `false` | Recursively merge nested maps |\n\n---\n\n## `data.filter@1`\n\nFilter items in an array using declarative predicate conditions with path-based field access. Works on any array: record maps, nested objects, or flat values. When operating on a `cache_ref`, creates a structural-sharing view.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | array | conditionally | -- | Array to filter |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `where` | array | yes | -- | List of condition objects |\n\nEach condition in `where`:\n\n| Field | Type | Description |\n|-------|------|-------------|\n| `path` | array | Path to the value to compare (list of keys/indices). 
`[]` compares the element itself |\n| `field` | string | Shorthand for `path: [field]` — use for simple record maps |\n| `op` | string | Operator: eq, neq, gt, gte, lt, lte, in, not_in, contains, starts_with, ends_with, is_null, not_null |\n| `value` | any | Comparison value (not needed for is_null/not_null) |\n\n### Example: filter nested objects\n\n```json\n{\n  \"name\": \"data-grout@1/data.filter@1\",\n  \"arguments\": {\n    \"cache_ref\": \"rc_abc123...\",\n    \"where\": [\n      {\"path\": [\"address\", \"state\"], \"op\": \"eq\", \"value\": \"CA\"},\n      {\"field\": \"active\", \"op\": \"eq\", \"value\": true}\n    ]\n  }\n}\n```\n\n**Response:** `{\"data\": [...], \"count\": 15, \"removed\": 85}`\n\n### Example: filter flat array\n\n```json\n{\n  \"name\": \"data-grout@1/data.filter@1\",\n  \"arguments\": {\n    \"payload\": [1, 5, 10, 15, 20, 25],\n    \"where\": [{\"path\": [], \"op\": \"gte\", \"value\": 10}]\n  }\n}\n```\n\n**Response:** `{\"data\": [10, 15, 20, 25], \"count\": 4, \"removed\": 2}`\n\n---\n\n## `data.sort@1`\n\nSort any array. For flat arrays, sorts by value directly. 
For arrays of maps, use `by` to specify sort keys.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | array | conditionally | -- | Array to sort |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `by` | string \\| array | no | -- | Field name or list of sort specs with `field`/`path` and `dir` |\n| `dir` | string | no | `\"asc\"` | Default direction when `by` is omitted or a simple string |\n\n### Example: multi-field sort\n\n```json\n{\n  \"name\": \"data-grout@1/data.sort@1\",\n  \"arguments\": {\n    \"cache_ref\": \"rc_abc123...\",\n    \"by\": [\n      {\"field\": \"department\", \"dir\": \"asc\"},\n      {\"field\": \"salary\", \"dir\": \"desc\"}\n    ]\n  }\n}\n```\n\n### Example: sort flat array descending\n\n```json\n{\n  \"name\": \"data-grout@1/data.sort@1\",\n  \"arguments\": {\n    \"payload\": [3, 1, 4, 1, 5, 9],\n    \"dir\": \"desc\"\n  }\n}\n```\n\n**Response:** `{\"data\": [9, 5, 4, 3, 1, 1], \"count\": 6}`\n\n---\n\n## `data.unique@1`\n\nDeduplicate an array, keeping the first occurrence of each value.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | array | conditionally | -- | Array to deduplicate |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `by` | string | no | -- | Field name to deduplicate on (for arrays of maps) |\n| `path` | array | no | -- | Path to the value to deduplicate on (takes precedence over `by`) |\n\n### Example\n\n```json\n{\n  \"name\": \"data-grout@1/data.unique@1\",\n  \"arguments\": {\n    \"cache_ref\": \"rc_abc123...\",\n    \"by\": \"email\"\n  }\n}\n```\n\n**Response:** `{\"data\": [...], \"count\": 42, \"duplicates_removed\": 8}`\n\n---\n\n## `data.aggregate@1`\n\nReduce a list to a single value using a named operation. 
Works on flat lists of numbers/values or on a specific field across a list of records.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | array | conditionally | -- | List of values to aggregate |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `op` | string | yes | -- | Operation: sum, mean, min, max, count, count_distinct, product, median, mode, range, join, first, last, flatten |\n| `field` | string | no | -- | When payload is a list of maps, the field to extract before aggregating |\n| `separator` | string | no | `\", \"` | Separator for the `join` operation |\n\n### Example: sum a flat list\n\n```json\n{\n  \"name\": \"data-grout@1/data.aggregate@1\",\n  \"arguments\": {\n    \"payload\": [12, 435, 67543, 2567],\n    \"op\": \"sum\"\n  }\n}\n```\n\n**Response:** `{\"result\": 70557, \"op\": \"sum\", \"count\": 4, \"skipped\": 0}`\n\n### Example: average a field across records\n\n```json\n{\n  \"name\": \"data-grout@1/data.aggregate@1\",\n  \"arguments\": {\n    \"cache_ref\": \"rc_abc123...\",\n    \"op\": \"mean\",\n    \"field\": \"TotalAmt\"\n  }\n}\n```\n\n**Response:** `{\"result\": 2450.75, \"op\": \"mean\", \"count\": 24, \"skipped\": 0}`\n\n### Example: join strings\n\n```json\n{\n  \"name\": \"data-grout@1/data.aggregate@1\",\n  \"arguments\": {\n    \"payload\": [\"Alice\", \"Bob\", \"Charlie\"],\n    \"op\": \"join\",\n    \"separator\": \" | \"\n  }\n}\n```\n\n**Response:** `{\"result\": \"Alice | Bob | Charlie\", \"op\": \"join\", \"count\": 3, \"skipped\": 0}`\n\n---\n\n## `data.map@1`\n\nFan-out: call any MCP tool once per item in a list, collecting results. Like `list.map { |item| tool.call(item) }` where the function is any tool available on the current server. Each iteration is a separate tool call with its own credit charge. 
Executions run in parallel by default.\n\n### Parameters\n\n| Parameter | Type | Required | Default | Description |\n|-----------|------|----------|---------|-------------|\n| `payload` | array | conditionally | -- | List of items to iterate over |\n| `cache_ref` | string | conditionally | -- | Cache reference |\n| `tool` | string | yes | -- | Fully-qualified tool name to call for each item |\n| `as` | string | no | -- | Inject each item as this named argument |\n| `template` | object | no | -- | Argument template with `$item` / `$item.field` placeholders |\n| `extra_args` | object | no | `{}` | Additional arguments merged into every tool call |\n| `max_items` | integer | no | `50` | Safety cap on items processed. Set to 0 for no limit |\n| `concurrency` | integer | no | `5` | Max parallel tool calls. Set to 1 for sequential |\n| `on_error` | string | no | `\"collect\"` | Error handling: `collect` (errors inline), `skip` (omit failures), `abort` (stop on first error) |\n| `timeout_ms` | integer | no | `30000` | Per-item timeout in milliseconds |\n\n### Example: look up multiple items\n\n```json\n{\n  \"name\": \"data-grout@1/data.map@1\",\n  \"arguments\": {\n    \"payload\": [\"INV-001\", \"INV-002\", \"INV-003\"],\n    \"tool\": \"quickbooks@1/get-invoice@1\",\n    \"as\": \"id\"\n  }\n}\n```\n\nEach item is passed as `{\"id\": \"INV-001\"}`, `{\"id\": \"INV-002\"}`, etc. 
Results are collected into a list.\n\n### Example: with template\n\n```json\n{\n  \"name\": \"data-grout@1/data.map@1\",\n  \"arguments\": {\n    \"cache_ref\": \"rc_abc123...\",\n    \"tool\": \"salesforce@1/get-account@1\",\n    \"template\": {\"account_id\": \"$item.Id\"},\n    \"concurrency\": 3,\n    \"on_error\": \"skip\"\n  }\n}\n```\n\n### Example: enrich records with extra args\n\n```json\n{\n  \"name\": \"data-grout@1/data.map@1\",\n  \"arguments\": {\n    \"payload\": [\"AAPL\", \"GOOG\", \"MSFT\"],\n    \"tool\": \"market-data@1/get-quote@1\",\n    \"as\": \"symbol\",\n    \"extra_args\": {\"include_history\": true, \"period\": \"1d\"}\n  }\n}\n```\n\n**Response:**\n\n```json\n{\n  \"results\": [\n    {\"symbol\": \"AAPL\", \"price\": 185.23, ...},\n    {\"symbol\": \"GOOG\", \"price\": 142.50, ...},\n    {\"symbol\": \"MSFT\", \"price\": 415.80, ...}\n  ],\n  \"total\": 3,\n  \"succeeded\": 3,\n  \"failed\": 0,\n  \"skipped\": 0,\n  \"_cache_ref\": \"rc_xyz789...\"\n}\n```\n\nThe combined results are cached with their own `cache_ref` for downstream chaining — pipe into `data.filter`, `frame.group`, `prism.chart`, etc.\n\n---\n\n## Using `cache_ref` with data tools\n\nEvery tool call response includes `_meta.datagrout.cache_ref`. Use this ref in subsequent data tool calls to avoid re-transmitting payloads:\n\n```json\n[\n  {\"tool\": \"salesforce@1/get-all-accounts@1\", \"args\": {}, \"output\": \"accounts\"},\n  {\"tool\": \"data-grout@1/data.take@1\", \"args\": {\"cache_ref\": \"$accounts._meta.datagrout.cache_ref\", \"n\": 5}},\n  {\"tool\": \"data-grout@1/data.pick@1\", \"args\": {\"cache_ref\": \"$accounts._meta.datagrout.cache_ref\", \"keys\": [\"Name\", \"Industry\"]}}\n]\n```\n\nCache entries have a 10-minute touch-on-access TTL. All cached data is AES-256-GCM encrypted at rest. Entries are isolated per user.\n"
}