Installation Guide

Quick Install

Get started with StatClean in seconds using pip.

From PyPI (Recommended)

pip install statclean

This installs the latest stable version with all required dependencies.

Requirements

StatClean requires Python 3.7 or higher and the following packages:

Core Dependencies

  • numpy ≥ 1.19.0
  • pandas ≥ 1.2.0
  • matplotlib ≥ 3.3.0
  • seaborn ≥ 0.11.0
  • scipy ≥ 1.6.0
  • tqdm ≥ 4.60.0

Optional Dependencies

scikit-learn ≥ 0.24.0

Used for shrinkage covariance estimation in Mahalanobis distance calculations and for the example datasets.
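To see whether the optional dependency is present before relying on those features, a check along these lines may help. It is a minimal sketch using only the standard library; `has_module` is a hypothetical helper name, and how StatClean behaves when scikit-learn is absent is not stated in this guide.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


# scikit-learn is imported under the module name "sklearn".
if has_module("sklearn"):
    print("scikit-learn found")
else:
    print("scikit-learn not found; install it with: pip install scikit-learn")
```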

Development Installation

For contributing or accessing the latest features:

# Clone repository
git clone https://github.com/SubaashNair/StatClean.git
cd StatClean

# Install in development mode
pip install -e .

# Install development dependencies
pip install -r requirements.txt

Verification

Test your installation with this simple verification script:

import statclean
from statclean import StatClean
import pandas as pd

# Quick test
df = pd.DataFrame({'test': [1, 2, 3, 100, 4]})
cleaner = StatClean(df)
print("StatClean installed successfully!")
print(f"Version: {statclean.__version__}")

If no errors appear, StatClean is properly installed!
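If the script fails, it can help to compare the installed dependency versions with the minimums listed under Core Dependencies. The following is a rough sketch, not part of StatClean: `version_tuple` and `check_dependencies` are hypothetical helpers, the comparison looks only at the leading numeric parts of each version, and `importlib.metadata` is in the standard library only from Python 3.8 (Python 3.7 would need the `importlib_metadata` backport).

```python
import re
from importlib import metadata  # standard library from Python 3.8

# Minimum versions copied from the Core Dependencies list.
MINIMUMS = {
    "numpy": "1.19.0",
    "pandas": "1.2.0",
    "matplotlib": "3.3.0",
    "seaborn": "0.11.0",
    "scipy": "1.6.0",
    "tqdm": "4.60.0",
}


def version_tuple(version: str) -> tuple:
    """Keep the leading integer of each dotted part, padded to three items."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_dependencies(minimums=MINIMUMS) -> dict:
    """Map each package to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = "ok" if ok else f"too old ({installed})"
    return report


for package, status in check_dependencies().items():
    print(f"{package}: {status}")
```

Any package reported as missing or too old can usually be fixed with `pip install --upgrade <package>`.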

Platform Compatibility

Operating Systems

  • Windows (10+)
  • macOS (10.14+)
  • Linux (Ubuntu 18.04+, CentOS 7+)

Python Version Support

  • Python 3.7
  • Python 3.8
  • Python 3.9
  • Python 3.10
  • Python 3.11
  • Python 3.12
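
A script that depends on StatClean can guard against unsupported interpreters at startup. This is a small sketch using only the standard library; `python_is_supported` is a hypothetical helper, and the 3.7 minimum is taken from the Requirements section.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the Requirements section


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Compare the (major, minor) pair against the required minimum."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    raise SystemExit(
        f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
        f"found {sys.version_info.major}.{sys.version_info.minor}"
    )
print(f"Python {sys.version_info.major}.{sys.version_info.minor} is supported")
```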

Troubleshooting

# If "import statclean" fails, make sure the package is installed
pip install statclean

# Check if it's installed
pip show statclean

# If dependency versions conflict, create a fresh virtual environment
python -m venv statclean_env
source statclean_env/bin/activate  # On Windows: statclean_env\Scripts\activate
pip install statclean
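
To confirm that the interpreter you are running is the one inside the virtual environment, a quick check like the following may be useful. It is a sketch using only the standard library; `in_virtualenv` is a hypothetical helper, and it relies on `sys.base_prefix`, which is set by `venv` and by recent versions of `virtualenv`.

```python
import sys


def in_virtualenv() -> bool:
    """True when sys.prefix differs from the base interpreter's prefix."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print(f"Interpreter: {sys.executable}")
print(f"Inside a virtual environment: {in_virtualenv()}")
```

If this prints False after activation, `pip` and `python` may be resolving to different installations; running `python -m pip install statclean` ties the install to the interpreter in use.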

# For headless servers, set matplotlib backend
export MPLBACKEND=Agg

Add this to your script for programmatic backend setting:

import matplotlib
matplotlib.use('Agg')  # Must be before importing pyplot
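
When the same script runs both on desktops and on headless servers, the backend can be chosen at startup instead of hard-coded. The sketch below is one possible heuristic, not StatClean behavior: `pick_backend` is a hypothetical helper, and it assumes that a Linux machine with neither `DISPLAY` nor `WAYLAND_DISPLAY` set has no display.

```python
import os
import sys


def pick_backend(environ, platform):
    """Return a backend name to set, or None to keep matplotlib's default."""
    if environ.get("MPLBACKEND"):
        return environ["MPLBACKEND"]  # respect an explicit user choice
    if not platform.startswith("linux"):
        return None  # the display check below only makes sense on Linux
    has_display = bool(environ.get("DISPLAY") or environ.get("WAYLAND_DISPLAY"))
    return None if has_display else "Agg"


backend = pick_backend(os.environ, sys.platform)
if backend:
    # Set this before matplotlib is imported anywhere in the process.
    os.environ["MPLBACKEND"] = backend
print(f"Selected backend: {backend or 'matplotlib default'}")
```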

# Install optimized numerical libraries
# (numpy defines no "mkl" extra on PyPI; the "performance" extra requires pandas 2.0+)
pip install "pandas[performance]"

# For Intel processors
conda install mkl mkl-service

Docker Installation

Use StatClean in a containerized environment:

FROM python:3.9-slim

# Install StatClean
RUN pip install statclean

# Set working directory
WORKDIR /app

# Copy your application
COPY . .

# Run your application
CMD ["python", "your_script.py"]

Conda Installation

StatClean is not yet on conda-forge. Once it is published there, install it with:

conda install -c conda-forge statclean

IDE Setup

VS Code

Install the Python extension. StatClean's type hints then give VS Code full IntelliSense support.

Recommended extensions: Python, Pylance

PyCharm

StatClean includes comprehensive type annotations for excellent PyCharm integration.

Both the Professional and Community editions are supported.

Jupyter

pip install jupyter
jupyter notebook

Then import StatClean:

from statclean import StatClean

Next Steps

Now that StatClean is installed, explore the documentation: