Conda & Python Installation

Before we start coding, you need Python and a way to manage packages and environments. We'll use Conda, a package and environment manager that is widely used in data science and engineering.


What is Conda?

Conda is a package manager and environment manager rolled into one:

  • Package manager: installs software, including Python itself, R, C libraries, and more (pip, by contrast, installs only Python packages)
  • Environment manager: creates an isolated environment per project, so conflicting dependency versions don't interfere with each other

You switch between environments with a single command. Each is completely independent.

Conda vs. pip

Feature                                       | Conda                           | pip
----------------------------------------------|---------------------------------|-------------------------------
Installs Python itself                        | Yes                             | No
Manages non-Python packages (C libs, R, etc.) | Yes                             | No
Environment management built in               | Yes                             | No (needs venv separately)
Resolves dependency conflicts                 | Yes (thorough)                  | Partially
Package source                                | Anaconda / conda-forge channels | PyPI
Speed                                         | Slower (more thorough)          | Faster

You'll use both

In practice, data engineers use Conda to manage environments and install core packages, then use pip inside a Conda environment for Python-only packages that aren't on conda-forge.


Miniforge vs. Anaconda vs. Miniconda

There are several Conda distributions. Here's the difference:

Distribution | What it includes                    | Best for
-------------|-------------------------------------|--------------------------------------------
Miniforge    | Conda + conda-forge defaults        | Recommended: lightweight, open-source
Miniconda    | Conda + Anaconda defaults           | Lightweight, uses Anaconda channel
Anaconda     | Conda + 250+ pre-installed packages | Beginners who want everything pre-installed

We recommend Miniforge — it's lightweight, uses the community-maintained conda-forge channel by default, and is fully open-source.


Install Miniforge

macOS

# Download and run the installer
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh

Follow the prompts:

  • Press Enter to review the license, then type yes
  • Accept the default install location (~/miniforge3)
  • Type yes when asked to initialize Miniforge (this adds it to your shell)

Then restart your terminal, or run:

source ~/.zshrc   # zsh is the macOS default; use ~/.bashrc if your shell is bash

Windows

  1. Download the installer from github.com/conda-forge/miniforge/releases
    • Choose Miniforge3-Windows-x86_64.exe
  2. Run the installer
    • Check "Add Miniforge3 to my PATH environment variable"
    • Check "Register Miniforge3 as my default Python"
  3. Open a new Command Prompt or PowerShell window

Linux

# Download and run the installer
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh

Follow the same prompts as macOS. Restart your terminal or run:

source ~/.bashrc
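
On macOS and Linux, the download command builds the installer filename from two uname calls. The sketch below only prints the filename and URL that would be used on your machine and downloads nothing. The function name miniforge_installer_name and the variable base_url are illustrative, not part of Miniforge.

```shell
# Build the Miniforge installer filename for the current machine.
# uname prints the OS name (for example Linux or Darwin);
# uname -m prints the CPU architecture (for example x86_64 or arm64).
miniforge_installer_name() {
  printf 'Miniforge3-%s-%s.sh\n' "$(uname)" "$(uname -m)"
}

# Release URL prefix taken from the install command in this section.
base_url="https://github.com/conda-forge/miniforge/releases/latest/download"

installer="$(miniforge_installer_name)"
echo "Installer file: ${installer}"
echo "Download URL:   ${base_url}/${installer}"
```

Running this before the real download is a quick way to confirm that the filename matches one of the files listed on the Miniforge releases page.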

Verify the installation

conda --version

You should see something like conda 24.x.x. Also verify Python:

python --version
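
If either command fails, it helps to know whether the tool is missing or just not on your PATH. Here is a minimal sketch that reports where each command resolves to. The helper name check_tool is made up for this example.

```shell
# Report whether a command is on PATH and, if so, where it resolves to.
# Returns 0 when the command is found and 1 when it is not.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH"
    return 1
  fi
}

# Check both tools; keep going even if the first one is missing.
for tool in conda python; do
  check_tool "$tool" || true
done
```

If python resolves to a path outside ~/miniforge3, your shell is probably still picking up a different Python installation first.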

Essential Conda Commands

Create an environment

# Create an environment called "workshop" with Python 3.12
conda create -n workshop python=3.12

Activate / deactivate

# Activate the environment
conda activate workshop

# Your prompt changes to show the active environment:
# (workshop) username@computer:~$

# Deactivate (go back to base)
conda deactivate

Install packages

# Install packages with conda
conda install pandas matplotlib requests

# Install from pip (inside a conda environment)
pip install some-package-not-on-conda

List environments and packages

# See all your environments
conda env list

# See packages in the current environment
conda list

# Search for a package
conda search numpy

Remove an environment

conda env remove -n old-project

Set Up the Workshop Environment

Let's create the environment we'll use for the rest of the workshop:

# Create the environment
conda create -n workshop python=3.12 pandas matplotlib requests -y

# Activate it
conda activate workshop

# Verify
python --version
pip list

You should see:

  • Python 3.12.x
  • pandas, matplotlib, and requests in your package list

From now on, always activate this environment before working on workshop code:

conda activate workshop
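
If you script the setup, you may want it to be safe to run twice. The sketch below creates the environment only when it is not already listed by conda env list. The helper names env_in_list and ensure_env are hypothetical, and the parsing assumes the environment name is the first column of each non-comment line.

```shell
# Succeed if an environment named $1 appears in `conda env list` output on stdin.
# Header lines start with "#", so their first field never matches a real name.
env_in_list() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Create the workshop-style environment only when it is missing.
ensure_env() {
  if conda env list | env_in_list "$1"; then
    echo "Environment '$1' already exists"
  else
    conda create -n "$1" python=3.12 pandas matplotlib requests -y
  fi
}

# Usage (requires conda on PATH):
#   ensure_env workshop
```

Activation stays a separate step, because conda activate changes the current shell session rather than the environment on disk.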

Troubleshooting

conda: command not found

The installer didn't add Conda to your PATH. Run:

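# macOS/Linux: name your shell so the matching startup file is updated
~/miniforge3/bin/conda init bash   # or: ~/miniforge3/bin/conda init zsh
source ~/.bashrc                   # or: source ~/.zshrc

On Windows, make sure you checked "Add to PATH" during installation. If you didn't, either reinstall or open the Miniforge Prompt from the Start menu, which makes Conda available without changing PATH.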
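
On macOS and Linux, the shell name to pass to conda init can be read from the SHELL variable. This sketch prints the command to run rather than running it. It assumes the default install location ~/miniforge3, and the helper name conda_init_hint is invented for illustration.

```shell
# Print the conda init command for a given shell path (default: $SHELL).
# Assumes Miniforge was installed to the default location ~/miniforge3.
conda_init_hint() {
  shell_name="$(basename "${1:-${SHELL:-bash}}")"
  case "$shell_name" in
    bash|zsh|fish) echo "$HOME/miniforge3/bin/conda init $shell_name" ;;
    # Unrecognized shell: fall back to bash as a conservative default.
    *) echo "$HOME/miniforge3/bin/conda init bash" ;;
  esac
}

# Show the suggested command for the current shell.
conda_init_hint
```

After running the printed command, open a new terminal so the updated startup file is loaded.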

Conda is very slow to resolve environments

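Recent Conda releases (23.10 and later) use the libmamba solver by default, which is much faster than the classic solver, and Miniforge also ships the mamba command. If you are on an older Conda installation (for example, an old Miniconda or Anaconda), you can switch: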

conda install -n base conda-libmamba-solver
conda config --set solver libmamba

Should I use pip or conda to install packages?

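Prefer conda install when the package is available on conda-forge. Use pip install as a fallback for packages that are only on PyPI. If an environment needs both, install the Conda packages first and the pip packages afterwards, and keep the pip-installed set small, because Conda does not manage what pip installs.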
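
To see how much pip and Conda are mixed in an environment, you can filter the output of conda list. The sketch below assumes the usual four-column layout (name, version, build, channel), in which pip-installed packages show pypi in the channel column. The helper name pip_installed is hypothetical.

```shell
# Read `conda list` output on stdin and print the name and version of every
# package whose last column (the channel) is "pypi", i.e. installed with pip.
pip_installed() {
  awk '!/^#/ && $NF == "pypi" { print $1, $2 }'
}

# Usage (requires conda on PATH, inside an activated environment):
#   conda list | pip_installed
```

An empty result means everything in the environment came from Conda channels.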

