AlphaFold
AlphaFold is a deep learning model developed by DeepMind that predicts the 3D structure of proteins from their amino acid sequences. For operators running local AI, it is relevant as a specialized model that needs significant computational resources to run inference: typically a high-VRAM GPU (e.g., 24 GB or more), substantial system RAM, and a large amount of disk space for its sequence databases. It is not a general-purpose language model but a domain-specific tool for structural biology, usually run from the open-source AlphaFold2 code in a Docker container or through community packages such as ColabFold.
Deeper dive
AlphaFold2 uses an attention-based architecture (the Evoformer, a transformer variant) to process multiple sequence alignments (MSAs) and template structures, outputting atomic coordinates and a per-residue confidence score (pLDDT, on a 0 to 100 scale). Running it on consumer hardware is challenging mainly because of the data, not the model size: the full inference pipeline requires downloading large sequence databases (e.g., BFD, MGnify) that total hundreds of gigabytes compressed and more than 2 TB unpacked, while the model parameters themselves are only a few gigabytes. Operators typically use precomputed predictions from the AlphaFold Protein Structure Database (linked from UniProt entries) rather than running AlphaFold locally. If they do need local predictions, ColabFold reduces resource demands by replacing the local database search with faster MMseqs2-based MSA generation. AlphaFold3 is a newer model with a broader scope, not a lighter one, and its weights must be requested from Google DeepMind.
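The per-residue confidence scores mentioned above are written into the B-factor column of the PDB files AlphaFold produces, so they can be read back without any structural biology library. The following is a minimal sketch under that assumption; the function names `plddt_per_residue` and `summarize_plddt` are illustrative, and the threshold of 70 follows the commonly used convention that scores below 70 indicate low confidence.

```python
from __future__ import annotations

from statistics import mean


def plddt_per_residue(pdb_text: str) -> dict[tuple[str, int], float]:
    """Map (chain_id, residue_number) -> pLDDT, read from C-alpha atoms.

    Assumes AlphaFold-style output, where the B-factor field holds pLDDT.
    """
    scores: dict[tuple[str, int], float] = {}
    for line in pdb_text.splitlines():
        # PDB ATOM records are fixed-width: atom name in columns 13-16,
        # chain ID in column 22, residue number in 23-26, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resnum = int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores


def summarize_plddt(
    scores: dict[tuple[str, int], float], low_threshold: float = 70.0
) -> tuple[float, list[tuple[str, int]]]:
    """Return the mean pLDDT and the residues scoring below the threshold."""
    low = [key for key, value in scores.items() if value < low_threshold]
    return mean(scores.values()), low
```

A long stretch of low-scoring residues often corresponds to a flexible or disordered region, so a list like this is a quick way to decide which parts of a prediction to treat with caution before opening it in a viewer.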
Practical example
An operator with an RTX 4090 (24 GB VRAM) and 64 GB system RAM could run AlphaFold2 inference on a single protein of moderate length (around 400 residues) using the official Docker setup. Expect roughly 30 minutes, though much of that time goes to the CPU-bound MSA search rather than the GPU, so it varies with storage and CPU speed. However, the full database set (hundreds of gigabytes to download and over 2 TB once unpacked) is impractical for most local setups. Instead, operators often use ColabFold, which runs on Google Colab or locally, queries a remote MMseqs2 server for MSAs by default, and can predict a 400-residue protein in under 10 minutes on the same GPU.
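Because disk space is usually the first constraint an operator hits, it can help to check it before starting a database download. The sketch below uses only the Python standard library; the helper names `free_gib` and `fits` and the 10% headroom are assumptions for illustration, and the required size is passed in as a parameter because it depends on which database set is chosen.

```python
import shutil

GIB = 1024 ** 3  # bytes per gibibyte


def free_gib(path: str) -> float:
    """Free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / GIB


def fits(free: float, required: float, headroom: float = 0.10) -> bool:
    """True if `required` GiB plus a safety margin fits into `free` GiB."""
    return free >= required * (1 + headroom)
```

For example, `fits(free_gib("/path/to/data"), required)` with `required` set to the unpacked size of the chosen databases gives a yes-or-no answer before any download starts.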
Workflow example
In practice, an operator interested in protein structure prediction would not use llama.cpp or Ollama. The official route is to clone the AlphaFold repository, build its Docker image, and launch it through the bundled wrapper script rather than a bare docker run: python3 docker/run_docker.py --fasta_paths=input.fasta --max_template_date=2022-01-01 --data_dir=/path/to/data --output_dir=/path/to/output. For lighter needs, they might use ColabFold via a Jupyter notebook or its command-line tool: colabfold_batch --num-recycle 3 input.fasta output_dir. The workflow involves preparing a FASTA file, running the prediction, and inspecting the output PDB files with tools like PyMOL or ChimeraX.
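The first two workflow steps, preparing a FASTA file and starting the prediction, can be sketched in a few lines of Python. This is a minimal illustration, not an official interface: the helper names `write_fasta`, `colabfold_command`, and `run_prediction` are hypothetical, and it assumes ColabFold is installed so that `colabfold_batch` is available on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def write_fasta(path, name: str, sequence: str, width: int = 60) -> Path:
    """Write a single-sequence FASTA file, wrapping the sequence at `width`."""
    seq = "".join(sequence.split()).upper()  # drop whitespace and newlines
    if not seq.isalpha():
        raise ValueError("sequence must contain only letters")
    lines = [f">{name}"] + [seq[i:i + width] for i in range(0, len(seq), width)]
    out = Path(path)
    out.write_text("\n".join(lines) + "\n")
    return out


def colabfold_command(fasta, out_dir, num_recycle: int = 3) -> list[str]:
    """Build the colabfold_batch argument list without running it."""
    return ["colabfold_batch", "--num-recycle", str(num_recycle),
            str(fasta), str(out_dir)]


def run_prediction(fasta, out_dir, num_recycle: int = 3) -> None:
    """Run the prediction, failing early if ColabFold is not installed."""
    if shutil.which("colabfold_batch") is None:
        raise RuntimeError("colabfold_batch not found on PATH")
    subprocess.run(colabfold_command(fasta, out_dir, num_recycle), check=True)
```

Building the command as a list and keeping it separate from execution makes it easy to log or inspect what will be run before committing GPU time to it.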
Reviewed by Fredoline Eruo. See our editorial policy.