Files and media

How non-text payloads flow through a Weft project.

Any non-text payload in a Weft project (an image, an audio clip, a video, a PDF, a spreadsheet) moves through the file system. You do not pass raw bytes between nodes. Instead, files are uploaded once through the weft-api, given a stable URL, and passed around as a small JSON handle that media-typed ports consume.

This keeps the data flow light (edges carry references, not megabytes) and gives every file a stable retrieval URL across the entire execution graph.

Media types and the handle shape

Weft has four media types: Image, Audio, Video, Document. Media is a shorthand union covering all four (useful for nodes like WhatsAppSendMedia that accept any of them).

A media port carries a JSON object with three fields:

  • url — the stable retrieval URL for the file's bytes.
  • mimeType — the file's MIME type (e.g. image/png).
  • filename — the original file name.

Inside an ExecPython node, you access the fields with image["url"], image["mimeType"], image["filename"]. To produce a media output, return a dict in that exact shape under your port's name.
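A minimal sketch of both directions — note the port names (image, thumbnail) and the run entry point are illustrative, not a fixed ExecPython signature:

```python
def run(image):
    # Read the three fields off an incoming image-typed handle.
    url = image["url"]            # stable retrieval URL
    mime = image["mimeType"]      # e.g. "image/png"
    name = image["filename"]

    # Produce a media output: a dict in the same three-field shape,
    # keyed by the output port's name.
    return {
        "thumbnail": {
            "url": url,                     # illustrative: reuses the same file
            "mimeType": mime,
            "filename": f"thumb_{name}",
        }
    }
```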

The upload flow

The weft-api exposes a two-step upload endpoint at /api/v1/files:

  1. POST /api/v1/files with {filename, mimeType, ephemeral?, executionId?}. The server creates a file record and returns {file_id, upload_url, url, filename, mimeType}.
  2. PUT the raw bytes to upload_url. Once the bytes are on disk, GET url will serve them back.
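The two steps above can be sketched in Python with the standard library. The endpoint path and field names come from the flow described here; the base URL, function name, and the opener parameter (dependency injection for the HTTP layer) are illustrative:

```python
import json
import urllib.request

def upload_file(base_url, filename, mime_type, data,
                ephemeral=False, opener=urllib.request.urlopen):
    """Two-step upload against the weft-api.

    base_url is assumed to be the API root, e.g. "http://localhost:8000".
    """
    # Step 1: POST /api/v1/files to create the file record.
    body = json.dumps({"filename": filename, "mimeType": mime_type,
                       "ephemeral": ephemeral}).encode()
    req = urllib.request.Request(
        f"{base_url}/api/v1/files", data=body,
        headers={"Content-Type": "application/json"}, method="POST")
    with opener(req) as resp:
        record = json.load(resp)

    # Step 2: PUT the raw bytes to the returned upload_url.
    put = urllib.request.Request(record["upload_url"], data=data, method="PUT")
    opener(put).close()

    # The three-field handle that media-typed ports consume.
    return {"url": record["url"],
            "mimeType": record["mimeType"],
            "filename": record["filename"]}
```

After this returns, GET on the handle's url serves the bytes back, and the dict can be passed along any media-typed edge.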

On WeaveMind Cloud, the bytes are stored in R2 and the upload_url is a presigned R2 URL, so the dashboard uploads directly to R2 without the bytes ever touching the API server. In OSS local mode, the upload_url points back at the API server, which writes to the local data/files/ directory.

Where files come from

Files enter a project in a few ways:

  • Blob config fields in the Builder. Nodes like Image, Audio, Video, Document have a blob field. Drop a file onto the field in the expanded node view; the dashboard uploads it and stores the returned handle in the node's config.
  • Runner blob fields. When a node has a blob field and the Loom exposes it to visitors, the Runner renders a file upload widget. The visitor uploads a file, the node's config is updated with the new handle, and then the project runs.
  • Incoming messages with attachments. Platform receive triggers like WhatsAppReceive expose typed media output ports (image, video, audio, document). When a message arrives with an attachment, the matching port carries the handle and the others are null.
  • Node outputs. Nodes that generate or download media (image generation, text-to-speech, scrapers) upload the new file through the same API and return its handle as their output.
  • Explicit uploads from ExecPython. You can POST bytes from inside an ExecPython node and return the resulting handle as a media output.
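The attachment pattern in the third bullet — one port carries the handle, the rest are null — can be sketched as a small dispatch helper (the function name and argument order are illustrative):

```python
def pick_attachment(image=None, video=None, audio=None, document=None):
    """Return (port_name, handle) for whichever media port is populated,
    or (None, None) when the message had no attachment."""
    for name, handle in (("image", image), ("video", video),
                         ("audio", audio), ("document", document)):
        if handle is not None:
            return name, handle
    return None, None
```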

Why a URL and not bytes

Passing URLs instead of raw bytes has several benefits.

  • Cheap edges. A 50 MB video moves through a graph as a small JSON handle. The executor does not copy bytes between nodes.
  • Same reference in code and in UI. The Builder can preview an image output without downloading it inline. The Runner can render a video player from the same handle.
  • Durable state. The file is persisted the moment the PUT completes. If the execution crashes mid-run, the bytes are still on disk (or in R2) when the project resumes.
  • External APIs want URLs anyway. Most vision models, transcription APIs, and document analysis APIs accept a URL directly. The handle is already in the right shape to pass along.

Lifecycle

The upload endpoint tracks ephemeral and executionId in the file metadata, so the platform knows whether a given file is tied to a single execution (a webhook payload, a runner upload) or is persistent (a project-level asset). There is no automatic cleanup job today: files stay on disk until you delete them by hand. That is a known gap and a candidate for the post-launch cleanup pass.

For now, treat the file system as durable but not auto-pruned. Do not rely on it for long-term storage of anything you need to keep forever (put such assets in a real object store), but also do not expect files to disappear on their own between runs.

What's next

  • Types: the full type system, media types included.
  • Put a human in the loop: forms that can include display_image fields fed from upstream media ports.
  • Inspect a past run: every file produced by an execution is linked from the execution record.