COURSE · FND · B002

Python for AI — Zero to Useful

Learn Python for AI, from zero to useful, through RunLocalAI's practical lens: Python, programming, data and APIs, hardware fit, runtime settings, verification habits, and local-vs-cloud tradeoffs.

36 chapters · 40h · Foundations track · By Fredoline Eruo
PREREQUISITES
  • B001

Why This Course Exists

Python dominates AI because it lets you ship working code without ceremony. Unlike academic languages designed for proofs, Python evolved in the real world—glue scripts became libraries became ecosystems. NumPy appeared in 2006. Pandas followed in 2008. By 2023, Python 3.12 delivered error messages that actually help.

This course teaches Python as a tool for AI work, not as an abstract programming language. Every chapter connects to practical AI tasks: cleaning datasets, calling APIs, processing embeddings, handling rate limits. You will not learn everything about Python. You will learn exactly what you need to work with AI systems.

The course assumes you completed B001. You should know what a terminal is and have edited a text file. Everything else builds from there.

What You Will Know After

After completing all 36 chapters, you will:

  • Write Python functions that handle real-world data pipelines
  • Manipulate arrays with NumPy (the array model that TensorFlow and PyTorch tensors mirror)
  • Clean and transform tabular data with Pandas
  • Call REST APIs, handle errors, and implement retry logic
  • Parse JSON responses and extract what AI models need
  • Understand when to use lists versus dictionaries versus NumPy arrays
  • Debug Python code using explicit error messages and print statements

You will not be a Python expert. You will be useful.
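
To make two of the outcomes above concrete (parsing JSON and choosing between lists and dictionaries), here is a minimal sketch. The response body and its field names (`results`, `text`, `score`) are assumptions made up for illustration, not any provider's actual schema.

```python
import json

# Hypothetical response body; the field names are illustrative assumptions.
raw = '{"results": [{"id": "a1", "text": "hello", "score": 0.92}, {"id": "b2", "text": "world"}]}'


def extract_texts(body: str, min_score: float = 0.5) -> list[str]:
    """Return the text of every result whose score meets the threshold."""
    data = json.loads(body)            # JSON text -> Python dict
    results = data.get("results", [])  # missing key -> empty list, no KeyError
    return [
        item["text"]
        for item in results
        if item.get("score", 0.0) >= min_score  # a missing score counts as 0.0
    ]


texts = extract_texts(raw)
print(texts)  # ['hello']
```

The dictionary holds the named fields of one record, and the list holds the ordered batch of records. That split is the one you will meet in most API responses.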


CHAPTERS
  1. Why Python for AI? · 15 min
     Python won AI not through performance but through the ecosystem. You write Python because nearly every AI tool you need already has a Python interface, and code you write today is likely to still run in five years with little or no modification.
  2. Your Environment · 20 min
     Keep your Python environment isolated per project. Never run `pip install` globally unless you understand system-level dependencies. Use virtual environments from day one.
  3. Variables and Types · 20 min
     Python values have types, but you do not declare them; they are determined at runtime. F-strings are your primary tool for building strings with embedded values. Type errors happen when you mix types, so handle conversions explicitly.
  4. Control Flow · 20 min
     Python uses indentation to define blocks. Four spaces or one tab: pick one and stay consistent, because Python 3 rejects inconsistent mixing of the two. `for` loops iterate over sequences; `while` loops repeat as long as a condition holds.
  5. Functions · 20 min
     Functions let you name and reuse logic. Default arguments reduce the number of functions you need to write. Use keyword arguments in function calls when the argument meaning is not obvious from position.
  6. Lists and List Comprehensions · 20 min
     List comprehensions replace 3-line loops with one expression. Use them for transformations and filtering. They are idiomatic Python, and AI code will look foreign without them.
  7. Dictionaries for AI Config · 20 min
     Dictionaries are the go-to structure for configuration and structured data. `.get()` prevents KeyError crashes. Nested dictionaries mirror the JSON structure of API requests and responses.
  8. File I/O · 20 min
     Always use `with open()` for file operations. `json.dump()` and `json.load()` handle the serialization between Python dictionaries and JSON text.
  9. Error Handling · 20 min
     Catch specific exceptions, not a bare `except:` or a blanket `except Exception`. This prevents masking bugs. Use `raise` to fail fast when inputs are invalid, because silent failures cause worse bugs than loud ones.
  10. NumPy Arrays for ML · 20 min
      NumPy arrays are not lists. They are fixed-type, contiguous memory blocks. This structure enables the vectorized operations that make AI computationally feasible. Convert lists to arrays when doing math.
  11. NumPy Operations · 20 min
      NumPy operations apply to entire arrays without explicit loops. Reduction operations such as `sum()` and `mean()` collapse an array to a scalar, or along one axis when you pass `axis`. Masking selects elements based on conditions; this is how you filter numerical data.
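  12. NumPy Broadcasting · 20 min
      Broadcasting eliminates explicit loops for element-wise operations between arrays of different shapes. The key is matching dimensions: compared from the trailing dimension backward, each pair of sizes must be equal or one of them must be 1. Use `keepdims=True` and `np.newaxis` to control shapes explicitly.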
  13. Pandas Series · 20 min
      Series extend NumPy arrays with labels (index) and missing data support. Use `.iloc[]` for position-based access, `.loc[]` for label-based access.
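  14. Pandas DataFrames · 20 min
      DataFrames are tables with named columns. Filter rows and select a column in one step: `df.loc[df["col"] > value, "other_col"]`. The chained form `df[df["col"] > value]["other_col"]` works for reading but is unreliable for assignment. `inplace=True` modifies the original DataFrame; assigning the result to a variable is usually clearer and keeps method chaining possible.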
  15. Data Cleaning with Pandas · 20 min
      Data cleaning is most of AI data work; 80% is the figure commonly quoted. Pandas method chaining keeps cleaning readable. Always verify data after cleaning: dropping too many rows is a silent bug.
  16. Working with APIs · 20 min
      AI APIs accept JSON and return JSON. `requests.post(url, json=payload)` serializes the dict to JSON and sets the Content-Type header. Always check `response.status_code` before processing.
  17. Rate Limiting and Retries · 15 min
      Rate limits are normal, not exceptional. Build retry logic with exponential backoff from the start. Use `time.sleep()` to pause between attempts. Check the `Retry-After` header for server-specified delays.
  18. Reading API Responses · 20 min
      API responses are nested dictionaries. Use chained `.get()` calls with defaults to handle missing keys gracefully. List comprehensions extract fields from batched responses.
  19. Object-Oriented Python · 15 min
      OOP's real value in AI code is encapsulating the fit/transform pattern that appears everywhere: preprocessors, feature extractors, vectorizers, model wrappers. A leading underscore (`_`) marks an attribute as internal by convention, which protects internal state from accidental misuse.
  20. Classes for AI Tools · 15 min
      Dataclasses are your friend for AI tool configurations. They are built on type hints (which IDEs and linters use) and give you sensible defaults and clean structure. Nest them for complex pipelines.
  21. Inheritance and Composition · 15 min
      Use inheritance for shared behavior and protocols, such as a base wrapper pattern. Use composition for capability aggregation: a pipeline that "has-a" tokenizer and "has-a" vectorizer. Most AI pipeline code should lean toward composition.
  22. Regular Expressions · 15 min
      Regex patterns use metacharacters: `.` (any char), `\d` (digit), `\w` (word char), `+` (one or more), `*` (zero or more), `[]` (character class). Escape with `\` for literal versions: `\.` for dot, `\+` for plus.
  23. Text Processing with Regex · 15 min
      Compile regex patterns once if you're using them repeatedly: `pattern = re.compile(r'...')`. Then call `pattern.search(text)` instead of `re.search(r'...', text)`. The `re` module caches recently used patterns, so the gain is skipping the cache lookup on every call and keeping each pattern named in one place.
  24. Data Visualization with Matplotlib · 15 min
      Learn the matplotlib hierarchy: `Figure` (the whole canvas) contains `Axes` (individual plots). Most of your interaction is with Axes methods (`plot()`, `scatter()`, `hist()`, `set_xlabel()`). In a script, always call `plt.show()` or `plt.savefig()`, or your plot won't appear.
  25. Plotting AI Metrics · 15 min
      For AI metrics, prefer `sklearn.metrics` plotting helpers, which handle axis scaling, legends, and formatting correctly. Plot your metrics as soon as you generate them, not at the end of long training runs where failures hide.
  26. Performance Profiling · 15 min
      Profile before optimizing. You'll often find that your intuition about what's slow is wrong: actual bottlenecks are frequently in places you'd never suspect (data loading, string formatting, logging). Use `time.perf_counter()` for quick measurements, `cProfile` for systematic analysis.
  27. Optimizing Python Code · 15 min
      Profile first. Then optimize hot paths. Use NumPy for numerical work, `join()` for strings, comprehensions over loops, `lru_cache` for repeated function calls. Never sacrifice correctness for speed, and never assume what the bottleneck is.
  28. Virtual Environments Deep Dive · 15 min
      Create a new virtual environment for every project. Never install packages globally unless you're sure you need system-wide availability. When you need to reproduce an environment: `pip install -r requirements.txt`. A practical addition: `venv` with `.python-version` and auto-activation using direnv or shell integration:

      ```bash
      # .envrc for direnv (if using direnv)
      layout python python

      # Or in .bashrc/.zshrc, add git-aware prompt modification
      # that auto-activates when entering directories with .venv
      ```
  29. Dependency Management · 15 min
      For AI projects specifically, dependency hell is real. CUDA versions, NumPy builds with different BLAS libraries, torch with CUDA versus CPU builds: all require careful environment management. Poetry and pip-tools both help, but for GPU-dependent projects, consider Docker containers where the base image pins everything.
  30. Project Structure for AI · 15 min
      Choose a structure and commit to it. The `src/` layout + `scripts/` + `tests/` + `config/` pattern works for most AI tools. Separate configuration from code so that configuration can change without a deployment.
  31. Building a CLI Tool · 15 min
      Click handles the boring parts: help text generation, argument parsing, error handling, color output. Use `@click.group()` for multi-command tools, `@click.argument()` for required inputs, and `@click.option()` for flags and named parameters.
  32. CLI Argument Parsing · 15 min
      Use `click.Path(exists=True)` for file validation; the CLI won't proceed if the path doesn't exist. For custom validation, raise `click.BadParameter()` for a bad value or `click.Abort()` to stop the run. Separate validation from business logic.
  33. CLI with Rich Output · 15 min
      Use `rich` when your tool produces lots of output, shows progress over time, or displays structured data. It dramatically improves the user experience of CLI tools without much extra code. For logging to file, write plain text; for terminal display, use rich.
  34. AI Pipeline Script · 15 min
      Use `ThreadPoolExecutor`, not `ProcessPoolExecutor`, for I/O-bound work such as API calls. The rule: processes for CPU-bound work, threads for slow external calls. The batches + progress bar pattern keeps users informed during long operations.
  35. Batch Processing with Progress · 15 min
      Checkpoint state to disk for resumability. Long-running batches will fail: VMs restart, sessions disconnect. With a recovery mechanism, starting over is unnecessary. Use small batches with tight feedback loops rather than monolithic processing.
  36. Final Project: AI Data Pipeline · 25 min
      This is production code you can extend: swap the simulated embedding for real API calls, add vector database integration, build validation pipelines, add monitoring. The structure (loader, preprocessor, embedder) separates concerns for testability and extension.
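
The error-handling advice in chapter 9 can be sketched roughly as follows. The function name `load_config` and the required `model` key are hypothetical, chosen only to show the pattern: catch the one exception you expect, and raise early when the input is invalid.

```python
import json


def load_config(text: str) -> dict:
    """Parse a JSON config string and validate it, failing fast on bad input."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:  # catch only the failure we expect
        raise ValueError(f"config is not valid JSON: {err}") from err
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")
    if "model" not in config:  # hypothetical required key
        raise ValueError("config is missing the required 'model' key")
    return config


print(load_config('{"model": "example-model"}'))

try:
    load_config("{not json")
except ValueError as err:
    print(f"rejected: {err}")
```

Any other exception, such as a typo-induced `NameError`, still surfaces with its full traceback instead of being swallowed.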
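
Chapter 17's retry advice might look like the sketch below. It uses no HTTP library: `RateLimited` is a hypothetical exception standing in for whatever your client raises on a 429 response, and `retry_after` stands in for a parsed `Retry-After` header. The `sleep` parameter is an assumption added so the example can record delays instead of waiting.

```python
import time


class RateLimited(Exception):
    """Hypothetical error for an HTTP 429; retry_after is a delay in seconds."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_retries(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); on RateLimited, wait and try again, up to max_attempts."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: fail loudly
            if err.retry_after is not None:
                delay = err.retry_after             # the server told us how long to wait
            else:
                delay = base_delay * 2 ** attempt   # exponential backoff: 1s, 2s, 4s, ...
            sleep(delay)


# Usage: a fake call that is rate limited twice, then succeeds.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited()
    return {"ok": True}


waits = []
result = call_with_retries(flaky, sleep=waits.append)  # record delays instead of sleeping
print(result, waits)  # {'ok': True} [1.0, 2.0]
```

In real code you would leave `sleep` at its default and wrap the actual API call in a small function or lambda.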
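
For chapter 20, a nested dataclass configuration could be sketched like this. The class names, field names, and default values are illustrative assumptions, not the course's actual code.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ModelConfig:
    name: str = "example-model"  # hypothetical model name
    temperature: float = 0.2
    max_tokens: int = 256


@dataclass
class PipelineConfig:
    batch_size: int = 16
    # default_factory gives every pipeline its own ModelConfig instance
    model: ModelConfig = field(default_factory=ModelConfig)


config = PipelineConfig(model=ModelConfig(temperature=0.0))
print(asdict(config))
# {'batch_size': 16, 'model': {'name': 'example-model', 'temperature': 0.0, 'max_tokens': 256}}
```

`asdict()` turns the nested structure back into plain dictionaries, which is convenient when you want to write the configuration out as JSON.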
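
Chapters 34 and 35 combine naturally: threads for slow calls, small batches, and a checkpoint after each batch. The sketch below is one way to do it under stated assumptions. `fake_embed` stands in for a real API call, the checkpoint is a single JSON file, and input texts are assumed to be unique because they serve as dictionary keys.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fake_embed(text):
    """Stand-in for a slow API call (assumption): a one-number 'embedding'."""
    return [float(len(text))]


def process_in_batches(items, checkpoint: Path, batch_size=2, workers=4):
    """Embed items batch by batch, saving progress to disk after each batch."""
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    todo = [t for t in items if t not in done]  # skip work finished in an earlier run
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(todo), batch_size):
            batch = todo[start:start + batch_size]
            # pool.map runs the calls concurrently and yields results in input order
            for text, vector in zip(batch, pool.map(fake_embed, batch)):
                done[text] = vector
            checkpoint.write_text(json.dumps(done))  # checkpoint after every batch
            print(f"{len(done)}/{len(items)} done")
    return done


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "checkpoint.json"
    first = process_in_batches(["a", "bb", "ccc"], path)
    # A later run resumes from the checkpoint: only "dddd" is new work.
    second = process_in_batches(["a", "bb", "ccc", "dddd"], path)
```

If the process dies mid-run, at most one batch of work is lost, which is the argument for keeping batches small.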