HOW-TO · RAG

How to Implement Secure File Operations for Agents

Advanced · 25 min · By Fredoline Eruo
Target environment
Ubuntu 24.04 · Ollama 0.4.x
PREREQUISITES

File system tools working, security review needed, Python 3.10+

What this does

Secure file operations prevent path traversal, unauthorized access, command injection, and data leaks when agents interact with the file system. Security must be enforced at the tool level, not by convention.

Steps

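  • Enforce an allowlist of permitted directories. Resolve every requested path to an absolute, symlink-free form before checking it, and compare whole path components rather than string prefixes.
import os

ALLOWED_DIRS = [
    os.path.realpath(os.path.expanduser("~/agent_workspace")),
    os.path.realpath("/tmp/agent_data"),
]

def validate_path(requested_path: str) -> str:
    # realpath() collapses ".." segments and resolves symlinks.
    real_path = os.path.realpath(requested_path)
    for allowed in ALLOWED_DIRS:
        # commonpath() compares path components, so "/tmp/agent_data_evil"
        # is not mistaken for a child of "/tmp/agent_data".
        if os.path.commonpath([real_path, allowed]) == allowed:
            return real_path
    raise PermissionError(f"Path {requested_path} is not in allowed directories")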
  • Restrict file extensions. Only allow safe file types.
ALLOWED_EXTENSIONS = {".txt", ".md", ".json", ".csv", ".py", ".yaml", ".yml", ".log"}

def validate_extension(path: str):
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise PermissionError(f"File extension {ext} is not allowed")
  • Set file size limits. Prevent the agent from reading or writing huge files.
MAX_READ_SIZE = 10 * 1024 * 1024  # 10 MB
MAX_WRITE_SIZE = 5 * 1024 * 1024   # 5 MB

def validate_read_size(path: str):
    size = os.path.getsize(path)
    if size > MAX_READ_SIZE:
        raise ValueError(f"File too large: {size} bytes (max {MAX_READ_SIZE})")

def validate_write_size(content: str):
    size = len(content.encode("utf-8"))  # measure bytes, not characters
    if size > MAX_WRITE_SIZE:
        raise ValueError(f"Content too large: {size} bytes (max {MAX_WRITE_SIZE})")
  • Implement an audit log for all file operations.
import logging
import json

file_audit_logger = logging.getLogger("file_audit")

def audit_file_operation(agent_id: str, operation: str, path: str, status: str, details: str = ""):
    file_audit_logger.info(json.dumps({
        "agent": agent_id,
        "operation": operation,
        "path": path,
        "status": status,
        "details": details
    }))
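
A logger obtained with logging.getLogger() inherits the root logger's WARNING level unless the application has lowered it, so info() calls on it are discarded until a level and a handler are set. The sketch below shows one way to wire this up. The function names configure_audit_logger and audit_record, the "ts" field, and the log path are illustrative assumptions, not part of the original guide.

```python
import json
import logging
from datetime import datetime, timezone


def configure_audit_logger(log_path: str) -> logging.Logger:
    """Attach a file handler to the "file_audit" logger (log_path is a placeholder)."""
    logger = logging.getLogger("file_audit")
    logger.setLevel(logging.INFO)  # inherited WARNING level would drop info() records
    logger.propagate = False       # keep audit lines out of the application's root log
    if not logger.handlers:        # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
    return logger


def audit_record(agent_id: str, operation: str, path: str, status: str) -> str:
    """Build one JSON audit line holding metadata only, never file contents."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent_id,
        "operation": operation,
        "path": path,
        "status": status,
    })
```

Calling configure_audit_logger once at startup should be enough; later getLogger("file_audit") calls return the same configured logger.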
  • Add the security wrapper to every file tool.
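# `tool` is the decorator supplied by your agent framework; import it before use.
@tool
def secure_read_file(path: str, agent_id: str = "unknown") -> str:
    """Read a file with security validation."""
    try:
        safe = validate_path(path)
        validate_extension(safe)
        validate_read_size(safe)
        with open(safe, "r", encoding="utf-8") as f:
            content = f.read()
        audit_file_operation(agent_id, "read", path, "success")
        return content
    except (PermissionError, ValueError) as e:
        audit_file_operation(agent_id, "read", path, "denied", str(e))
        return f"Access denied: {e}"
    except OSError as e:
        # Missing file, directory instead of a file, I/O failure, and so on.
        audit_file_operation(agent_id, "read", path, "error", str(e))
        return f"Read failed: {e}"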
  • Prevent command injection in shell-based tools. Never pass user input directly to a shell.
import subprocess

@tool
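def grep_file(pattern: str, path: str) -> str:
    """Search a file using grep (safe version)."""
    try:
        safe = validate_path(path)
        validate_extension(safe)
    except PermissionError as e:
        return f"Access denied: {e}"
    # Use the list form of subprocess so no shell parses the arguments; never
    # use shell=True with string interpolation. "--" ends option parsing, so a
    # pattern that starts with "-" cannot be read as a flag.
    try:
        result = subprocess.run(
            ["grep", "--", pattern, safe],
            capture_output=True, text=True, timeout=10
        )
    except subprocess.TimeoutExpired:
        return "Search timed out"
    if result.returncode > 1:  # grep exits 0 on match, 1 on no match, 2 on error
        return f"grep failed: {result.stderr.strip()}"
    return result.stdout or "No matches"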
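
The steps define a write-size limit but show only a read tool. A write path could be sketched as below, assuming the same checks apply. The name secure_write and its parameters are hypothetical, and the extension check, the audit call, and the framework's tool decorator are left out to keep the example self-contained.

```python
import os
import tempfile


def secure_write(path: str, content: str, allowed_dirs: list[str],
                 max_bytes: int = 5 * 1024 * 1024) -> str:
    """Validate size and location, then write atomically; returns the resolved path."""
    data = content.encode("utf-8")
    if len(data) > max_bytes:
        raise ValueError(f"Content too large: {len(data)} bytes (max {max_bytes})")

    real_path = os.path.realpath(path)
    roots = [os.path.realpath(d) for d in allowed_dirs]
    if not any(os.path.commonpath([real_path, r]) == r for r in roots):
        raise PermissionError(f"Path {path} is not in allowed directories")

    # Write to a temporary file in the same directory, then rename it over the
    # target, so readers never observe a half-written file.
    directory = os.path.dirname(real_path)
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, real_path)
    except BaseException:
        os.unlink(tmp_path)  # clean up the temporary file on any failure
        raise
    return real_path
```

Note that tempfile.mkstemp() creates the file readable and writable only by the owner, so the final file keeps those permissions unless they are changed afterwards. The target directory must already exist.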

Verification

python -c "
import os
allowed = [os.path.abspath('/tmp')]
path = os.path.abspath('/tmp/../etc/passwd')
print(any(path.startswith(a) for a in allowed))
# Expected: False (path traversal detected)
"

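The command above covers ".." traversal only. Two further cases are worth checking: a sibling directory whose name merely begins with an allowed directory's name, and a symlink inside the allowed directory that points outside it. The script below is a rough check under the assumption that containment is tested with os.path.realpath() and os.path.commonpath(); the helper name is_inside and the directory names are made up for the test.

```python
import os
import tempfile


def is_inside(path: str, root: str) -> bool:
    """True if path, after resolving symlinks and '..', lies within root."""
    real_root = os.path.realpath(root)
    real_path = os.path.realpath(path)
    return os.path.commonpath([real_path, real_root]) == real_root


base = tempfile.mkdtemp()
allowed = os.path.join(base, "agent_data")
sibling = os.path.join(base, "agent_data_evil")
os.mkdir(allowed)
os.mkdir(sibling)

# A symlink placed inside the allowed directory that points outside it.
link = os.path.join(allowed, "escape")
os.symlink(sibling, link)

print(is_inside(os.path.join(allowed, "notes.txt"), allowed))   # True
print(is_inside(os.path.join(sibling, "secret.txt"), allowed))  # False
print(is_inside(os.path.join(link, "secret.txt"), allowed))     # False
# A raw prefix comparison accepts the sibling directory:
print(os.path.realpath(sibling).startswith(os.path.realpath(allowed)))  # True
```
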
Common failures

  • Symlink attacks. A file within an allowed directory could be a symlink to a forbidden file. Resolve symlinks with os.path.realpath().
  • Race conditions (TOCTOU). A file passes validation but is replaced before being read. Open and validate using the file descriptor, not the path.
  • Logging sensitive content. Audit logs may contain file contents with PII. Log only metadata (path, size, timestamp), not content.
  • Version mismatch. The installed package or runtime differs from the command shown. Check the version first, then rerun the smallest verification command.
  • Local environment drift. Another service, virtual environment, model, or path is being used. Print the active binary path and configuration before changing the guide steps.
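
The symlink and TOCTOU items above can be combined into one read routine: resolve and check the path, open it, then validate the opened descriptor instead of the path. This is a minimal sketch for POSIX systems such as the Ubuntu target; the name read_via_descriptor is hypothetical. It narrows the race window without closing it entirely, since an intermediate directory could still be swapped between the check and the open.

```python
import os
import stat

MAX_READ_SIZE = 10 * 1024 * 1024  # same 10 MB limit used in the steps


def read_via_descriptor(path: str, allowed_root: str) -> bytes:
    """Open first, then validate the opened descriptor rather than the path."""
    real_root = os.path.realpath(allowed_root)
    real_path = os.path.realpath(path)
    if os.path.commonpath([real_path, real_root]) != real_root:
        raise PermissionError(f"Path {path} is not in allowed directories")

    # O_NOFOLLOW makes open() fail if the final path component is a symlink,
    # for example one swapped in after realpath() ran. (POSIX-only flag.)
    fd = os.open(real_path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        info = os.fstat(fd)  # describes the file that was actually opened
        if not stat.S_ISREG(info.st_mode):
            raise PermissionError("Not a regular file")
        if info.st_size > MAX_READ_SIZE:
            raise ValueError(f"File too large: {info.st_size} bytes")
        with os.fdopen(fd, "rb", closefd=False) as f:
            # Cap the read as well, in case the file grows after fstat().
            return f.read(MAX_READ_SIZE)
    finally:
        os.close(fd)
```

Because the size and file-type checks run on the descriptor, replacing the file on disk after the open cannot change what is read.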

Related guides

  • How to Add File System Operations as Agent Tools
  • How to Build Custom Tools for Agents