---
title: "LLM Tool Calling"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{LLM Tool Calling}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include=FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(myIO)
# Render tool results as the JSON an agent actually receives.
as_json <- function(x) {
  cat(jsonlite::toJSON(x, pretty = TRUE, auto_unbox = TRUE, null = "null"))
}
```

## The problem this solves

Ask an AI assistant to "make a box plot of Sepal.Width by Species with myIO" and
it will confidently write something. Often it is subtly wrong: a chart type that
doesn't exist (`scatterplot` instead of `point`), a mapping key it invented
(`value_var` instead of `y_var`), or a function argument with the wrong name
(`setAxisFormat(axis_x = ...)`). The code looks plausible and fails at runtime —
or worse, runs and renders nothing.

myIO ships the pieces that let an AI **check its work before it commits**: a
machine-readable description of every chart it can draw, and a validator that
flags structural mistakes and suggests the fix. The result is that an assistant
can generate a myIO chart, discover it got a name wrong, correct it, and only
then hand you working code.

## New to "tool calling"?

A short orientation; skip to [What you can rely on](#what-you-can-rely-on) if
this is familiar.

Modern LLMs can be given **tools** — functions you describe to the model that it
may call mid-conversation. Instead of answering in one shot, the model can call
a tool, read the result, and revise. **Tool calling** is that loop.
**MCP** (the Model Context Protocol) is a standard way to expose such tools to
assistants like Claude Desktop, Claude Code, or Cursor, so you configure them
once and any MCP-aware client can use them.

myIO exposes six tools. The two that do the work are `validate_spec` (is this
chart description valid?) and `validate_call` (is this function call valid?);
the other four let the model discover what's available. They're usable two ways:
as plain **R functions** (shown throughout this article, so every example below
is real output) and as an **MCP server** (covered at the end) for assistants.

## A first example

List the chart types myIO knows about — this is the model's menu:

```{r}
myio_list_chart_types()
```

Now suppose the assistant proposes a box plot but guesses the mapping keys:

```{r}
attempt <- list(
  type = "boxplot",
  mapping = list(column_var = "Species", value_var = "Sepal.Width")
)
as_json(myio_validate_spec(attempt))
```

`valid` is `false`. The errors say `boxplot` actually needs `x_var` and `y_var`,
and — crucially — each unknown key carries a `suggestion`. Applying them
(`column_var` → `x_var`, `value_var` → `y_var`) gives a spec that passes:

```{r}
fixed <- list(
  type = "boxplot",
  mapping = list(x_var = "Species", y_var = "Sepal.Width")
)
as_json(myio_validate_spec(fixed))
```

That is the whole idea: the model doesn't have to know myIO's API by heart, and
it doesn't have to guess. It checks, and the validator tells it what to do.

## What you can rely on {#what-you-can-rely-on}

For anyone evaluating this as a real solution, the properties that matter:

- **Complete and current by construction.** The schema is *generated from the
  engine's own contracts* — the same definitions the renderer uses — so it can't
  drift out of sync with what myIO actually draws, and it covers the full
  surface, not a hand-picked subset:

  ```{r}
  length(myio_list_chart_types())   # chart types
  length(myio_list_functions())     # exported functions
  ```

- **Stable, machine-readable errors.** Every result is `{ valid, errors }`, and
  each error has a fixed `code` your code can branch on — never prose to parse:
  `UNKNOWN_TYPE`, `MISSING_MAPPING`, `UNKNOWN_MAPPING_KEY`, `INVALID_TRANSFORM`,
  `MISSING_COLUMN`, `NON_NUMERIC_COLUMN`, `UNKNOWN_FUNCTION`, `UNKNOWN_ARGUMENT`.

- **Identical across languages.** The R functions here and the MCP server share
  one generated schema and are held to the same conformance test corpus, so an
  agent gets the same answer whichever surface it calls.
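
As a minimal sketch of branching on `code`, the chunk below routes a hand-built result through `switch()`. The `result` list is a stand-in that mimics the `{ valid, errors }` shape, not real validator output; the `field` and `suggestion` entries and the `describe_error()` helper are assumptions made for illustration.

```{r}
# Hand-built stand-in for a validator result (illustrative, not real output).
result <- list(
  valid = FALSE,
  errors = list(
    list(code = "UNKNOWN_MAPPING_KEY", field = "value_var", suggestion = "y_var"),
    list(code = "MISSING_MAPPING", field = "x_var")
  )
)

# Hypothetical helper: choose a repair action from the stable error code.
describe_error <- function(err) {
  switch(
    err$code,
    UNKNOWN_MAPPING_KEY = ,
    UNKNOWN_ARGUMENT = if (is.null(err$suggestion)) {
      "ask the user"
    } else {
      paste("rename to", err$suggestion)
    },
    MISSING_MAPPING = ,
    MISSING_COLUMN = "supply the missing name",
    UNKNOWN_TYPE = ,
    UNKNOWN_FUNCTION = "pick a name from the list tools",
    "report to the user"  # default for any other code
  )
}

actions <- vapply(result$errors, describe_error, character(1))
actions
```

Because the branch key is the `code` string rather than the message text, the dispatch keeps working if the wording of an error changes.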

Each chart type carries its own contract — required mappings, numeric fields,
valid transforms — which the model can fetch on demand:

```{r}
as_json(myio_chart_schema("boxplot"))
```

## Checking specs against real data

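By default, validation is structural. Pass a column type map and it also checks
the spec against your actual data: mapped columns must exist, and numeric fields
must map to numeric columns. In the example below, `mpg` is deliberately declared
as `character`, so the numeric check on `y_var` is expected to fail.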

```{r}
as_json(myio_validate_spec(
  list(type = "point", mapping = list(x_var = "wt", y_var = "mpg")),
  columns = list(wt = "numeric", mpg = "character")
))
```
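
The column type map can be derived from a data frame rather than typed by hand. The sketch below assumes the validator accepts the labels used in the example above (`"numeric"`, `"character"`) and treats any other class name as non-numeric; `column_types()` is a hypothetical helper, not part of myIO.

```{r}
# Hypothetical helper: build a name -> type map from a data frame.
# Integer and double columns are both reported as "numeric" (an assumption
# about the labels the validator expects); other columns use their first class.
column_types <- function(df) {
  lapply(df, function(col) if (is.numeric(col)) "numeric" else class(col)[1])
}

str(column_types(iris[c("Species", "Sepal.Width")]))
```

The resulting list can then be passed as the `columns` argument of `myio_validate_spec()`.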

## Checking function calls

The same applies to the `set*()` styling functions — wrong argument names are a
common LLM mistake. `validate_call` checks the name and arguments against the
real signature and suggests corrections:

```{r}
as_json(myio_validate_call("setAxisFormat", list(axis_x = ".0f")))
```

To see which argument names the function actually accepts, fetch its signature:

```{r}
myio_function_signature("setAxisFormat")
```
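
A suggestion for a misnamed argument can be applied mechanically. The sketch below assumes an `UNKNOWN_ARGUMENT` error carries `field` and `suggestion` entries, as the mapping-key errors do in the repair loop later in this article; the hand-built error and the `apply_arg_suggestions()` helper are illustrative only.

```{r}
# Hypothetical helper: rename arguments according to validator suggestions.
apply_arg_suggestions <- function(args, errors) {
  for (err in errors) {
    fixable <- identical(err$code, "UNKNOWN_ARGUMENT") &&
      !is.null(err$suggestion) &&
      !is.null(err$field) &&
      err$field %in% names(args)
    if (fixable) {
      # Rename in place so the argument order is preserved.
      names(args)[names(args) == err$field] <- err$suggestion
    }
  }
  args
}

# Hand-built error for illustration; "xAxis" is a placeholder suggestion,
# not a name taken from the real signature.
errors <- list(
  list(code = "UNKNOWN_ARGUMENT", field = "axis_x", suggestion = "xAxis")
)
repaired <- apply_arg_suggestions(list(axis_x = ".0f"), errors)
names(repaired)
```

The repaired list would then go back through `myio_validate_call()` before the call is emitted.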

## The six tools

| Tool | R function | Purpose |
|------|------------|---------|
| `list_chart_types` | `myio_list_chart_types()` | Every chart type in the schema |
| `get_chart_schema` | `myio_chart_schema(type)` | One type's contract (or all) |
| `validate_spec` | `myio_validate_spec(spec, columns)` | Validate a chart spec, with optional data-contract checks |
| `list_functions` | `myio_list_functions()` | Every exported function name |
| `get_function_signature` | `myio_function_signature(fn)` | A function's argument names |
| `validate_call` | `myio_validate_call(fn, args)` | Validate a function name + arguments |
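
One way to expose the table above to a custom agent loop is a lookup from tool name to R function. This is a sketch only: `dispatch_tool()` and `myio_tool_registry()` are hypothetical helpers, and the sketch assumes tool-call arguments arrive as a named list whose names match the R parameters in the table, which the MCP server may not guarantee.

```{r}
# Tool names mapped to the R functions from the table above.
# The functions are looked up only when the registry is built.
myio_tool_registry <- function() {
  list(
    list_chart_types       = myio_list_chart_types,
    get_chart_schema       = myio_chart_schema,
    validate_spec          = myio_validate_spec,
    list_functions         = myio_list_functions,
    get_function_signature = myio_function_signature,
    validate_call          = myio_validate_call
  )
}

# Hypothetical dispatcher: look up a tool by name and call it with named args.
dispatch_tool <- function(name, args = list(), registry = myio_tool_registry()) {
  tool <- registry[[name]]
  if (is.null(tool)) {
    stop("Unknown tool: ", name, call. = FALSE)
  }
  do.call(tool, args)
}
```

With myIO attached, `dispatch_tool("validate_spec", list(spec = fixed))` would be equivalent to calling `myio_validate_spec(fixed)` directly.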

## The repair loop

Putting it together — the pattern an agent follows whether it calls the R
functions or the MCP tools:

1. Propose a chart spec.
2. Call `validate_spec`.
3. If `valid` is `false`, apply the `suggestion` for each error and go to 2. If
   an error carries no suggestion, stop rather than retry the same spec.
4. Once valid, validate any accompanying `set*()` call with `validate_call`.

This loop, run against the actual validator, converges the broken attempt above
to a working spec without any hard-coded knowledge of myIO's API:

```{r}
spec <- list(type = "boxplot",
             mapping = list(column_var = "Species", value_var = "Sepal.Width"))
res  <- myio_validate_spec(spec)

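# Cap the attempts so an error without a usable suggestion cannot loop forever.
attempts <- 0
while (!res$valid && attempts < 5) {
  attempts <- attempts + 1
  for (err in res$errors) {
    if (identical(err$code, "UNKNOWN_MAPPING_KEY") && !is.null(err$suggestion)) {
      # Move the value to the suggested key, then drop the unknown key.
      spec$mapping[[err$suggestion]] <- spec$mapping[[err$field]]
      spec$mapping[[err$field]] <- NULL
    }
  }
  res <- myio_validate_spec(spec)
}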
as_json(spec)
```

## Wiring it into an assistant (MCP)

To give the six tools to an MCP-aware assistant (Claude Desktop, Claude Code,
Cursor), run the bundled Node server. It is deliberately separate from the R
package so installing myIO never pulls in Node dependencies.

```{sh, eval=FALSE}
cd mcp
npm install
node server.mjs
```

Add it to the client's MCP configuration (the shape is the same across Claude
Desktop, Claude Code, and Cursor):

```json
{
  "mcpServers": {
    "myio": {
      "command": "node",
      "args": ["/absolute/path/to/myIO/mcp/server.mjs"]
    }
  }
}
```

For a local, fully open-source loop you can drive the same tools from an Ollama
model's tool-calling API; the steps are identical to the repair loop above.
Because the server reads the same generated schema, its results match the R
functions in this article exactly.

## What validation guarantees — and what it does not

`validate_spec` and `validate_call` guarantee **contract correctness**: a known
chart type, an allowed transform, the expected mapping keys, optional
data-column compatibility, and known function arguments. They do **not**
guarantee the chosen chart is the *right* chart for the question — asking for a
box plot when a histogram was wanted produces a perfectly valid spec. Semantic
and aesthetic judgement stays with the model; these tools close the
structural-error class, which is the part an LLM most reliably gets wrong.