File systems for agents | Amplify Partners

If you've spent any time on X this year (while waiting for Claude Code or Codex to generate your API integration) you might think AI agents are weeks away from replacing you at your job. It’s possible that a fast takeoff could happen, but today, agents are pretty tame. A relatively small number of agents access a relatively small number of files, making incremental updates. Agents are definitely operating at scale; it’s just human scale, not agent scale. As such, the infrastructure requirements for these agents are manageable. They get away with using systems built for mere mortals.

But if the pundits are right, this will soon change. There will be lots of agents, wrangling lots of files (concurrently) and making lots of updates. As agents become more capable, the question of how they read, write, and update data becomes more complex. In recent months, I’ve been swayed by the Twitterati: I now believe that file systems are better suited to agents than databases or object stores. I'm less convinced that today’s file systems are already future-proof for AI.

Why file systems?

When I first saw file systems trending on X, I thought this was just a rabbit hole for AI engineers to wander down, like graph RAG or SAEs. After years of getting pwned by Pandas as a data scientist, I finally discovered SQL databases and shortly thereafter became convinced they’re all we need. All other data management systems are snake oil (with the exception of FoundationDB, maybe Kafka, and certainly WarpStream).

But people on the Internet do occasionally know things, so I decided to dive deeper. We’ve seen that when infrastructure decisions are made early in a technological paradigm, they tend to calcify. The choices we make now about how agents interact with data will shape what's probable, possible, and painful for years to come. So we'd better be sure that file systems are the right choice. Or risk becoming the Bruno Mars of technical debt.

So I did what I always do. I talked to people (and agents) way smarter than me. And the folks building data infra for over a decade convinced me that file systems are, in fact, the way. Here’s why:

Training corpora contain lots of file system operations and capture context

A lot of code from the public internet, on which foundation models are trained, represents applications that interface with file systems. Code for reading files, writing files, navigating directory structures, and manipulating paths is the lingua franca of software engineering. Open any GitHub repository and you'll find file I/O operations scattered throughout: configuration parsing, log writing, asset loading, and data serialization.

Database code is also well represented in the training corpora, but it often depends on context that is not fully captured in the code itself, such as schemas, constraints, and production-specific access patterns. In practice, many database interactions are tightly coupled to external state that the model does not see at generation time.

This creates an asymmetry. File system operations have dead-simple semantics; they are usually self-contained and can be executed with local context. In contrast, database operations are more context-dependent and correctness-sensitive; their rich semantics may subtly differ from system to system. As a result, agents today tend to be more reliable when working with file-like abstractions where the necessary state is explicit and locally accessible, and less reliable when interacting with systems that depend on implicit or external structure.

File systems do what agents need to work with data

Before the GenAI boom, there were two common patterns for interfacing with data: OLTP and OLAP. In OLTP, applications would read, write, and update multiple entries in databases concurrently using transactions, which had to be fast and reliable. OLAP systems, while many support snapshot isolation within read transactions, were designed for a different workload, in which analysts run queries ad hoc or repeatedly through BI and other reporting applications. These analytics workloads focused on aggregating and exploring high volumes of structured data to generate insights.

In contrast, agents have idiosyncratic data access patterns. Consider a few examples:

A coding agent must retrieve a file containing source code, understand its structure, modify specific functions, and write the updated file back.
A legal agent might pull multiple contracts from storage, extract relevant clauses from each, cross-reference them, and synthesize an answer.
A research agent might iterate through a set of papers, extract key findings, and compile a literature review.

These workflows share common characteristics. They feature relatively high-throughput, low-latency operations on specific, often small, unstructured files retrieved from a larger corpus. Sometimes, the agent must access a series of documents, retrieving files that depend on one another (e.g., read file A, use its contents to determine which part of file B to modify, and write the result to file C).
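
The dependent-read pattern in that parenthetical can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed design: the file names (`plan.json`, `notes.txt`, `result.txt`), the `target_line` and `replacement` fields, and the `apply_plan` helper are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def apply_plan(workdir: Path) -> str:
    """Read file A, use it to pick a line of file B, write the edit to file C."""
    # File A (hypothetical plan.json) says which line of B to change and how.
    plan = json.loads((workdir / "plan.json").read_text())

    # File B is read only after A has told us what to look for.
    lines = (workdir / "notes.txt").read_text().splitlines()
    lines[plan["target_line"]] = plan["replacement"]

    # File C receives the result; B itself is left untouched.
    result = "\n".join(lines) + "\n"
    (workdir / "result.txt").write_text(result)
    return result


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        (workdir / "plan.json").write_text(
            json.dumps({"target_line": 1, "replacement": "beta (revised)"})
        )
        (workdir / "notes.txt").write_text("alpha\nbeta\ngamma\n")
        return apply_plan(workdir)
```

Each step needs only a path and the contents of the previous file, which is the kind of self-contained, locally executable operation the earlier section describes.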

Why not ____

Why not OLTP databases? Relational databases are great for data with clear relational structure - datasets that can be represented as tables with schemas, foreign keys linking entities, and normalized forms that eliminate redundancy. They’re especially useful when reliability is critical, when you need strict consistency, atomic multi-row updates, and rollback guarantees.

Most agentic workloads don’t fit this mold.

When a coding agent works with a repository, the “data” is source code - text files with rich, irregular structures that vary by language, framework, and project conventions. When a legal agent processes contracts, the “data” is natural language - documents with implicit structure that resists clean tabular representation. In both cases, the underlying data is unstructured or only loosely structured.

This can create friction at the schema level. Relational databases often require schemas to be defined upfront, but agents don’t always know ahead of time what they will store or work with. Their data models can evolve as tasks, tools, and contexts change. Modeling this in tables can lead to frequent schema changes or generic representations that offload structure and validation to the application layer.

The mismatch may extend to access patterns as well. Agents don’t typically operate on small, well-scoped rows. Instead, they read and write large, contiguous chunks of data, such as entire files, full documents, or complete histories. They frequently ask for “the last N steps” or need to load an entire conversation or codebase into context. These workloads are also inherently sequential and history-oriented. Files map naturally to this model, while OLTP systems tend to treat data as unordered sets of rows, making ordered histories and append-like workflows less natural to express, especially when large blobs or nested structures are involved.
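
As a rough sketch of the "last N steps" access pattern, assume an agent keeps its history as one JSON object per line in an append-only file. The `history.jsonl` name, the step fields, and both helper functions are hypothetical.

```python
import json
import tempfile
from collections import deque
from pathlib import Path


def append_step(log_path: Path, step: dict) -> None:
    """Append one step as a JSON line; earlier lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(step) + "\n")


def last_n_steps(log_path: Path, n: int) -> list:
    """Return the most recent n steps in the order they were written."""
    with log_path.open("r", encoding="utf-8") as f:
        # deque(maxlen=n) keeps only the final n lines while streaming the file.
        tail = deque(f, maxlen=n)
    return [json.loads(line) for line in tail]


def demo() -> list:
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "history.jsonl"
        for i in range(5):
            append_step(log_path, {"step": i, "action": f"tool_call_{i}"})
        return last_n_steps(log_path, 2)
```

Order comes for free from the position of each line in the file, so no sort key or sequence column is needed. This version still streams the whole file to find the tail; a production implementation would more likely seek backward from the end.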

You can work around this by storing unstructured data as blobs or JSON and using the relational database as a metadata index. Many systems do exactly this. But at that point, the database is no longer being used for its intended purpose. You’ve effectively built a file system abstraction on top of a transactional engine, while still paying the operational and performance costs of a system optimized for very different workloads. Why not just use the right tool for the job?

To be clear, I don’t believe that relational databases are bad, and I suspect they will play a key role in the development of agents and LLM-driven applications. However, many agentic systems operate over evolving collections of documents, logs, and artifacts, and mapping those cleanly onto schema-first, row-oriented interfaces can introduce additional complexity.

Why not object storage? Object storage (S3, GCS, Azure Blob Storage) optimizes for an entirely different workload. It is built for high throughput on large objects like media files, data lake archives, backup snapshots, or ML training datasets measured in terabytes or petabytes. It is also designed primarily for durability, with replication and persistence guarantees that make it ideal for long-term storage.

Agentic workloads often invert these assumptions. Agents retrieve small files frequently, update them often, and operate in tight loops where they read context, write intermediate state, read it back, and iterate. Object storage is too slow for this pattern. S3-style systems optimize for throughput and durability, not low per-operation latency, making repeated small reads and writes inefficient, even with newer tiers like S3 Express One Zone, which reduce latency but introduce higher cost and single-AZ durability tradeoffs.

Object storage is also poorly suited to how agents mutate data. Objects are immutable blobs: you cannot efficiently append to or partially update an object, only rewrite it in full. Agents, by contrast, often append logs, update small pieces of state, or refine outputs incrementally. As a result, simple operations become repeated full rewrites that are both inefficient and awkward.
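
To make the cost of whole-object rewrites concrete, here is a toy comparison. `ToyObjectStore` is a hypothetical in-memory stand-in that models only put/get semantics; it is not any vendor's API, and it ignores features such as multipart uploads.

```python
import tempfile
from pathlib import Path


class ToyObjectStore:
    """Hypothetical in-memory stand-in for a put/get object store."""

    def __init__(self) -> None:
        self._objects = {}
        self.bytes_uploaded = 0

    def put(self, key: str, body: bytes) -> None:
        # A put replaces the whole object; there is no partial write here.
        self._objects[key] = body
        self.bytes_uploaded += len(body)

    def get(self, key: str) -> bytes:
        return self._objects.get(key, b"")


def append_to_object(store: ToyObjectStore, key: str, line: bytes) -> None:
    # "Append" is read-modify-write: fetch, concatenate, upload everything again.
    store.put(key, store.get(key) + line)


def append_to_file(path: Path, line: bytes) -> int:
    # Opening in append mode writes only the new bytes at the end of the file.
    with path.open("ab") as f:
        return f.write(line)


def demo():
    store = ToyObjectStore()
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "agent.log"
        file_bytes_written = 0
        for _ in range(100):
            line = b"x" * 9 + b"\n"  # 10 bytes per log line
            append_to_object(store, "agent.log", line)
            file_bytes_written += append_to_file(path, line)
        same_content = path.read_bytes() == store.get("agent.log")
    return store.bytes_uploaded, file_bytes_written, same_content
```

In this toy model, one hundred 10-byte appends upload 50,500 bytes through the object interface and write 1,000 bytes through the file interface, for identical final content. The gap grows quadratically with the number of appends.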

Consistency and coordination are similarly limited. Object stores generally lack transactional semantics, atomic multi-object updates, and strong coordination primitives across objects. Even with improved consistency models, they are not designed for tightly coupled read-after-write loops or coordinated updates across related artifacts. Agents often need predictable state ordering and safe updates across multiple files, which requires adding coordination layers on top.

There is also a semantic mismatch. Object storage exposes a flat namespace with key-value semantics, even if hierarchy can be emulated through prefixes. File systems instead provide true hierarchical organization with directories, relative paths, and navigation primitives. Agents, trained on code that manipulates file systems, naturally reason in terms of paths and directories, making large collections of small files cumbersome to manage in object storage.
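
A small sketch of that difference, under the assumption that a flat keyspace is just a list of strings: `FLAT_KEYS` and both listing helpers are hypothetical, and real object stores typically provide server-side prefix and delimiter listing rather than the client-side scan shown here.

```python
import tempfile
from pathlib import Path

# Hypothetical flat keyspace: "directories" exist only as shared key prefixes.
FLAT_KEYS = [
    "repo/src/app.py",
    "repo/src/utils/io.py",
    "repo/tests/test_app.py",
    "repo/README.md",
]


def list_children_flat(keys, prefix):
    """Emulate one directory level by scanning every key for a prefix."""
    children = set()
    for key in keys:
        if key.startswith(prefix):
            # Keep only the first path segment after the prefix.
            children.add(key[len(prefix):].split("/", 1)[0])
    return sorted(children)


def list_children_fs(directory: Path):
    """On a file system the directory itself knows its children."""
    return sorted(p.name for p in directory.iterdir())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for key in FLAT_KEYS:
            path = root / key
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text("")
        fs_listing = list_children_fs(root / "repo")
        # Relative navigation has no direct equivalent in a flat keyspace.
        sibling = (root / "repo" / "src" / ".." / "tests").resolve()
        return list_children_flat(FLAT_KEYS, "repo/"), fs_listing, sibling.name
```

Both listings return the same names, but only the file system version treats `repo/` as a real entity that can be listed, moved, or navigated relative to.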

Finally, object storage lacks native querying or indexing beyond key prefixes. Agents often need to search prior artifacts, retrieve relevant context, or filter by metadata, which typically requires adding a separate indexing layer, vector database, or metadata store. At that point, object storage becomes a durable backing store for large blobs and checkpoints rather than the system agents interact with directly.

Why file systems? Compared to other options, file systems occupy a practical sweet spot for agent workloads. At the interface level, they present a simple, stable abstraction: a collection of self-contained files organized in a POSIX-like hierarchy. This maps directly to the kinds of data agents work with. Codebases, documents, datasets, configs, and logs are already organized as files and directories, with relationships encoded in paths and structure implied by layout. They are not naturally tabular or monolithic blobs, so agents can operate on data in its native form rather than translating it into something else. File systems also avoid forcing premature structure. There is no schema to define upfront and no migration process to run when formats evolve. Files can be text, JSON, code, or anything else, and can change shape over time without friction. Structure emerges where it is useful rather than being imposed globally.

File systems also fit naturally with how agents access and modify data. Many agent workloads are inherently ordered, where the sequence of operations matters and artifacts are best understood as a progression over time. Agents repeatedly modify small files in place by appending logs, editing functions, and tweaking prompts, creating a tight working set with strong locality. These workloads resemble ordered streams more than unordered sets of rows, and file systems handle them naturally through low-latency access, caching, and efficient in-place updates.

Consistency is similarly pragmatic. On a local file system, operations often appear immediate and predictable, simplifying reasoning for agents. Distributed or networked variants can weaken those guarantees and introduce edge cases around visibility and ordering, but the model remains easier to work with than systems that require coordinating transactions across many independent records.

The result is an abstraction that is minimal, composable, and already well suited to how agents and developers work. You can layer indexing, retrieval, versioning, or durability underneath it, sync to object storage, attach a vector index, or maintain metadata elsewhere. But the core interface remains stable: the agent reads and writes files, and everything else is an implementation detail.

The state of file systems today

So if file systems are the right abstraction, do existing implementations meet the needs of agent developers? Today, the answer is a tentative yes, but that could change quickly.

Most developers building agents today rely on local or single-node file systems, typically accessed through standard POSIX interfaces. The workloads are modest: small-to-medium-sized files, limited concurrency, and tight feedback loops where latency matters more than throughput. Under these conditions, a single machine, backed by SSD storage, is sufficient. Many agent frameworks implicitly assume this model. They read and write files directly, cache intermediate outputs locally, and treat the file system as a simple persistence layer; they don’t require coordination across multiple machines.

More recently, developers have begun to abstract the file system itself. Instead of interacting directly with the disk, agents may operate on virtualized file systems, including sandboxed or scoped file systems for tool execution and safety, or container-backed file systems that isolate execution environments. These abstractions preserve the familiar file interface while decoupling storage from a specific disk, so it is easier to reset state, enforce boundaries, and manage execution.
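
One simple form of scoping is a wrapper that refuses any path resolving outside a root directory. The `ScopedFS` class below is a hypothetical sketch of that idea, not how any particular sandbox works. Path checks alone are also not a hard security boundary, since they can race with concurrent symlink changes, which is one reason real sandboxes tend to rely on containers or OS-level isolation.

```python
import tempfile
from pathlib import Path


class ScopedFS:
    """Hypothetical wrapper that confines reads and writes to one root directory."""

    def __init__(self, root: Path) -> None:
        self.root = root.resolve()

    def _resolve(self, relative: str) -> Path:
        # Resolve symlinks and ".." first, then verify we are still under root.
        candidate = (self.root / relative).resolve()
        if candidate != self.root and self.root not in candidate.parents:
            raise PermissionError(f"{relative!r} escapes the sandbox")
        return candidate

    def write_text(self, relative: str, text: str) -> None:
        target = self._resolve(relative)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)

    def read_text(self, relative: str) -> str:
        return self._resolve(relative).read_text()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        fs = ScopedFS(Path(tmp) / "sandbox")
        fs.write_text("notes/todo.txt", "ship it")
        inside = fs.read_text("notes/todo.txt")
        try:
            fs.read_text("../outside.txt")
            escaped = True
        except PermissionError:
            escaped = False
        return inside, escaped
```

The agent still sees ordinary relative paths and ordinary read and write calls; only the root they resolve against is controlled, which also makes resetting state as simple as deleting one directory.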

When developers outgrow a single machine, they typically turn to distributed file systems that attempt to preserve the same programming model. Systems like NFS, HDFS, Lustre, BeeGFS, and CephFS all expose a shared namespace that can be mounted across multiple machines, allowing different processes to read and write the same files concurrently, with coordination typically handled at the application level.

Although these systems are often grouped together, they vary significantly in design. NFS is essentially a network protocol that allows clients to access files hosted on a remote server, and is still widely used for its simplicity and compatibility. HDFS was built for large-scale data processing and is optimized for high-throughput, sequential access. Lustre and BeeGFS prioritize parallel I/O and aggregate bandwidth, making them well-suited to HPC and training workloads. CephFS reflects a more modern architecture, distributing both data and metadata to improve scalability and fault tolerance.

These systems work well because they are tightly aligned with the workloads they were built for: large files, relatively stable datasets, and access patterns that are either read-heavy or append-heavy. In that regime, performance is predictable, and the abstraction largely holds.

Issues arise when the workload shifts toward many small files, frequent updates, and a growing number of independent processes interacting through shared state, which puts pressure on metadata systems. In addition to serving data, the file system implicitly coordinates behavior among processes. And while distributed file systems can accommodate that shift to a degree, they were not designed with it as the primary requirement.

Concurrency and isolation guarantees

When you have hundreds or thousands of agents operating simultaneously, reading from and modifying shared state, and writing it back, the problem changes. The file system must not only store data reliably but also mediate interactions between independent processes. For that, strong coordination and consistency guarantees are not optional. What happens when two agents try to modify the same file? What does it mean for an operation that spans multiple files to fail halfway through?

Most file systems do support concurrent access in some form. Multiple processes can read and write files simultaneously. Distributed systems extend that model across machines. But the guarantees are weaker than they appear since coordination is typically pushed to the application layer, relying on advisory locking, single-writer patterns, or implicit resolution like last-write-wins. Although these approaches work when conflicts are rare or when access patterns are well-structured, they become fragile under contention.
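
The sketch below shows both halves of that paragraph: a lost update under last-write-wins, and an application-level advisory lock that avoids it. The two "agents" are simulated sequentially so the outcome is deterministic. The `advisory_lock` helper and `state.json` file are hypothetical; the lock relies on exclusive file creation (`O_CREAT | O_EXCL`), whose behavior on networked file systems may differ from a local disk.

```python
import json
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def advisory_lock(target: Path):
    """Cooperative lock: only processes that also call this are excluded."""
    lock_path = target.with_name(target.name + ".lock")
    # O_CREAT | O_EXCL makes creation fail if the lock file already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)


def read_counter(path: Path) -> int:
    return json.loads(path.read_text())["count"]


def write_counter(path: Path, value: int) -> None:
    path.write_text(json.dumps({"count": value}))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        state = Path(tmp) / "state.json"

        # Last-write-wins: both agents read 0, so one increment is lost.
        write_counter(state, 0)
        seen_by_a = read_counter(state)
        seen_by_b = read_counter(state)
        write_counter(state, seen_by_a + 1)
        write_counter(state, seen_by_b + 1)
        unlocked_total = read_counter(state)

        # With the lock, each read-modify-write runs to completion in turn.
        write_counter(state, 0)
        for _agent in ("a", "b"):
            with advisory_lock(state):
                write_counter(state, read_counter(state) + 1)
        locked_total = read_counter(state)

        # A second holder is refused while the lock file exists.
        with advisory_lock(state):
            try:
                with advisory_lock(state):
                    second_holder_admitted = True
            except FileExistsError:
                second_holder_admitted = False
        return unlocked_total, locked_total, second_holder_admitted
```

The fragility is visible even here: nothing stops a process that skips `advisory_lock` from writing anyway, and a crash between acquiring and releasing leaves a stale lock file behind that someone has to clean up.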

The truth is that traditional distributed file systems weren't designed for highly concurrent, fine-grained mutation of shared state. They optimize for throughput and availability, with consistency guarantees that vary by system. They typically do not provide transactional semantics across multiple files or coordinated updates across related pieces of state. The guarantees they do provide are often sufficient when conflicts are rare, but become limiting when many independent processes are actively mutating shared state.

There is interesting work happening here. Turso, building on SQLite, is exploring file system architectures specifically designed for agent state management. In AgentFS, the entire agent runtime (files, state, tool calls, and execution history) lives inside a single SQLite database, which can be copied, forked, snapshotted, or moved across machines as a single artifact. SQLite provides transactional guarantees, fast local access via kernel page caching, and queryability, all within one portable file. The design also enforces copy-on-write isolation, allowing multiple agents to operate in sandboxed environments without affecting each other's state, though true concurrent multi-agent writes remain an active area of development.
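
To illustrate why a transactional engine is attractive underneath a file interface, here is a minimal sketch using Python's built-in `sqlite3` module. It is not AgentFS's schema or API: the `files` table and the helper functions are assumptions made for this example, and it shows only atomic multi-file writes.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    # ":memory:" keeps the sketch self-contained; a real store would use a file path.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, content TEXT NOT NULL)")
    return conn


def write_files(conn: sqlite3.Connection, updates: dict) -> None:
    """Write several files atomically: all of them land or none do."""
    # Using the connection as a context manager commits on success
    # and rolls back if the block raises.
    with conn:
        for path, content in updates.items():
            conn.execute(
                "INSERT OR REPLACE INTO files (path, content) VALUES (?, ?)",
                (path, content),
            )


def read_file(conn: sqlite3.Connection, path: str):
    row = conn.execute("SELECT content FROM files WHERE path = ?", (path,)).fetchone()
    return row[0] if row else None


def demo():
    conn = open_store()
    write_files(conn, {"a.txt": "v1", "b.txt": "v1"})
    try:
        # content=None violates NOT NULL, so the second write fails midway.
        write_files(conn, {"a.txt": "v2", "b.txt": None})
    except sqlite3.IntegrityError:
        pass
    result = (read_file(conn, "a.txt"), read_file(conn, "b.txt"))
    conn.close()
    return result
```

After the failed update, `a.txt` still reads `v1` even though its new content was written before the error. That is the answer to "what happens when a multi-file operation fails halfway" that plain file systems generally leave to the application.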

Materialization and query capabilities

Today, an agent might retrieve a single file and make a single update. Very soon, agents will need to pull data from multiple files, materialize intermediate results, and run operations across those results. Not long after that, they’ll be doing something closer to what we once relied on systems like Spark for (i.e. joining data across documents, aggregating it, and computing new outputs).

What’s changing is that retrieval is no longer a one-shot step. RAG assumes a single pass in which the agent retrieves context, passes it to the model, and moves on. But with agentic retrieval, the process is different. The agent must combine information across files, dynamically assembling context and continuously reconstructing the working set it needs at each step.

At that point, this stops looking like retrieval and starts looking like a query problem.

The agent is selecting, filtering, transforming, and composing unstructured data. It is, in effect, running queries over a corpus, just without anything resembling a real query engine. And so the gap becomes obvious.

We have robust systems for structured data, and reasonably good ones for semi-structured formats like JSON and Parquet. But agents are working with highly unstructured content: code, documents, and natural language. Unfortunately, the infrastructure for querying and manipulating that at scale is still immature. File systems expose the data, but they don’t provide the operations needed to work with it at this level.
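
A hand-rolled example shows how quickly ordinary file access turns into query logic. The sketch below selects files by glob, filters lines by a pattern, and groups the matches, all in application code. The `TODO(owner)` convention and the function names are hypothetical.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path


def select(root: Path, pattern: str):
    """SELECT: yield (path, text) for every file matching a glob pattern."""
    for path in sorted(root.rglob(pattern)):
        yield path, path.read_text()


def count_todos_by_owner(root: Path) -> dict:
    """Filter lines containing TODO(owner) and aggregate counts per owner."""
    todo = re.compile(r"TODO\((\w+)\)")
    counts = Counter()
    for _path, text in select(root, "*.py"):
        for line in text.splitlines():
            match = todo.search(line)  # WHERE the line mentions a TODO
            if match:
                counts[match.group(1)] += 1  # GROUP BY owner
    return dict(counts)


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "pkg").mkdir()
        (root / "app.py").write_text("# TODO(ana): handle retries\nprint('hi')\n")
        (root / "pkg" / "io.py").write_text(
            "# TODO(raj): close file\n# TODO(ana): add tests\n"
        )
        (root / "notes.md").write_text("TODO(ana): not a Python file\n")
        return count_todos_by_owner(root)
```

Every piece of this (the scan, the predicate, the aggregation) is re-implemented by the caller and re-executed from scratch on each call, with no planner, index, or cached intermediate result to lean on.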

There are teams starting to explore this space, and some are making real progress. Archil is rethinking the file system layer for AI workloads. Their system sits in front of object storage and gives agents fast, consistent access to data across environments, while explicitly optimizing for the latency characteristics of agent workloads. Vortex is attacking the problem at the data layer. It’s a columnar file format designed for extremely fast reads, including random access, selective reads, and large batch scans, with tight integration with modern query engines. The format is especially well-suited to read-dominant agent analytics workloads, where agents repeatedly scan, filter, and aggregate large collections of data across many steps.

Access control

As these workloads become more complex, access control becomes part of the execution model. File systems do support ACLs, but they’re typically defined at the level of files, directories, users, and groups. Permissions are static, evaluated at access time, and don’t extend to how data is used once it’s read.

That model breaks down for agent workloads. Agents don’t just read a file once. They read from many files, combine information, and construct intermediate results that persist across steps. Access decisions need to be enforced not just at file open, but throughout the execution of a multi-step workflow. For example, an agent might be allowed to read two documents independently but not to combine them or to propagate specific fields into downstream outputs.

This requires more granular and dynamic controls. Policies need to operate at the level of fields, fragments, and derived data, not just files, and they need to be evaluated continuously as the agent builds up state. That includes controlling what data can be cached, what intermediate results can be materialized, and what can be passed into subsequent steps.

This starts to look like query-time policy enforcement. Permissions are applied during selection, filtering, and transformation, and they need to compose correctly as data moves through the system.
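
One way to picture policies that follow data through a workflow is label propagation: every value carries the set of sources it was derived from, and the policy is checked whenever values are combined. The sketch below is a deliberately simplified assumption about how such a scheme could look. The labels, the `FORBIDDEN_COMBINATIONS` rule, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Labeled:
    """A value plus the set of source labels it was derived from."""
    value: str
    labels: frozenset


# Hypothetical policy: these labels may never appear together in one derived value.
FORBIDDEN_COMBINATIONS = [frozenset({"hr", "finance"})]


class PolicyViolation(Exception):
    pass


def read(value: str, label: str) -> Labeled:
    return Labeled(value, frozenset({label}))


def combine(a: Labeled, b: Labeled) -> Labeled:
    """Merge two values; the policy is checked at combination time."""
    labels = a.labels | b.labels
    for forbidden in FORBIDDEN_COMBINATIONS:
        if forbidden <= labels:
            raise PolicyViolation(f"cannot combine sources {sorted(forbidden)}")
    return Labeled(a.value + " " + b.value, labels)


def demo():
    salaries = read("salary bands", "hr")
    budget = read("Q3 budget", "finance")
    handbook = read("style guide", "public")

    # Reading each document alone is fine, and so is a permitted combination.
    derived = combine(salaries, handbook)

    # Labels travel with derived data, so the check also catches indirect mixes.
    try:
        combine(derived, budget)
        blocked = False
    except PolicyViolation:
        blocked = True
    return sorted(derived.labels), blocked
```

A file-level ACL would have allowed all three reads and then lost track of the data. Here the second combination is rejected because the intermediate result still remembers that it came from the `hr` source.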

Scale of files themselves

Today’s agent workloads are still constrained. They operate over small files, modest context windows, limited working memory, and bounded outputs. That naturally limits the size of the data they touch. As models improve and context windows expand, agents will start working over much larger artifacts. A coding agent won’t operate file by file; it will need to understand and modify entire codebases. A legal agent won’t review a single contract; it will need to synthesize across hundreds of documents.

Obviously, these aren’t one-shot operations. Agents will revisit the same data, build intermediate results, and iterate across steps. At that point, the bottleneck will shift. The file system will no longer just be responsible for storing data. It will need to support efficient access to large files and large collections of files. Agents will need to scan selectively, cache intermediate state, and move across data without repeatedly loading it into memory.
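
As a small illustration of selective access with cached intermediate state, the sketch below scans a large file once to record where each line starts, then uses that index to jump directly to individual records. The `corpus.txt` file and both helpers are hypothetical, and a real system would persist the index rather than hold it in a Python list.

```python
import tempfile
from pathlib import Path


def build_line_index(path: Path) -> list:
    """One pass over the file, recording the byte offset where each line starts."""
    offsets = []
    position = 0
    with path.open("rb") as f:
        for line in f:
            offsets.append(position)
            position += len(line)
    return offsets


def read_line_at(path: Path, offsets: list, line_number: int) -> str:
    """Jump straight to one line using the cached index instead of rescanning."""
    with path.open("rb") as f:
        f.seek(offsets[line_number])
        return f.readline().decode("utf-8").rstrip("\n")


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "corpus.txt"
        body = "".join(f"record-{i}\n" for i in range(10_000))
        path.write_bytes(body.encode("utf-8"))
        offsets = build_line_index(path)  # pay for the full scan once
        return read_line_at(path, offsets, 7321)
```

The full scan happens once; every later lookup reads a single line from disk. Today that index is the caller's responsibility to build, store, and invalidate when the file changes.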

As these workloads scale, the problem becomes one of access and computation, not just storage.

File systems for the Agent Era

I now believe that file systems are the right abstraction for agents. They match how agents are trained, how they think about data, and how they operate in practice. But the systems we have today weren’t designed for what agents are becoming.

As agents scale, the workload changes. Many agents operate concurrently over shared state. They don’t just read files; they coordinate through them. They don’t just retrieve data; they construct working sets, materialize intermediate results, and iterate across steps.

That shift introduces new requirements. Concurrency needs stronger guarantees. Retrieval becomes a query problem over unstructured data. Access control moves from static ACLs to dynamic, execution-aware policies. And file systems need to support efficient operations over much larger artifacts.

We’re starting to see early answers, but the shape of the stack is still in flux. The choices we make now about how agents interact with data will harden quickly, so it’s worth getting them right.

Authors

Sarah Catanzaro

Editors

Justin Gage

Acknowledgments

Thank you to Will Manning for edits on an earlier draft.