Installation Guide

Requirements

pyCTAKES requires Python 3.8 or higher and is tested on:

Python 3.8, 3.9, 3.10, 3.11, 3.12
Linux, macOS, and Windows
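
Before installing, it can help to confirm that the interpreter meets the stated minimum. The following is a minimal sketch using only the standard library; the helper name `python_is_supported` is hypothetical and not part of pyCTAKES.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the requirements above


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the stated minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_is_supported() else "too old for pyCTAKES"
    print(f"Python {current}: {status}")
```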

Basic Installation

Install from PyPI (Recommended)

pip install pyctakes

This installs pyCTAKES with the core dependencies needed for rule-based processing.

Development Installation

For the latest development version:

pip install git+https://github.com/sonish777/pyctakes.git

Optional Dependencies

pyCTAKES supports multiple NLP backends. Install additional packages for enhanced functionality:

spaCy (Recommended)

For advanced tokenization, POS tagging, and model-based NER:

# Install spaCy
pip install spacy

# Download English model
python -m spacy download en_core_web_sm

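# For biomedical/clinical models (optional)
pip install scispacy
# scispaCy models are not served by "spacy download"; install en_core_sci_sm
# with pip from the release URL listed in the scispaCy README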

Stanza

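An alternative NLP backend. Stanza's language models are downloaded separately, for example by calling stanza.download("en") from Python after running: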

pip install stanza

UMLS Integration

For comprehensive concept mapping:

pip install quickumls
# Requires a UMLS license and a local QuickUMLS index built from your UMLS installation

Complete Installation

For all features:

# Install with all optional dependencies
pip install "pyctakes[all]"  # quotes keep shells such as zsh from expanding the brackets

# Or install components separately
pip install pyctakes spacy scispacy stanza
python -m spacy download en_core_web_sm
# scispaCy models are not served by "spacy download"; install en_core_sci_sm
# with pip from the release URL listed in the scispaCy README
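
After installing, a quick standard-library check can show which optional components are actually importable. This is an illustrative sketch: the helper name `installed_optional_modules` is hypothetical, and the module names are assumed to be the usual import names of the packages and models mentioned above.

```python
from importlib.util import find_spec

# Assumed import names for the optional packages and models named in this guide
OPTIONAL_MODULES = [
    "spacy",
    "scispacy",
    "stanza",
    "quickumls",
    "en_core_web_sm",
    "en_core_sci_sm",
]


def installed_optional_modules(names=OPTIONAL_MODULES):
    """Map each top-level module name to True if it can be located, else False."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in installed_optional_modules().items():
        print(f"{name:16} {'installed' if present else 'missing'}")
```

Because `find_spec` only locates a module without importing it, the check stays fast even when large models are installed.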

Verification

Verify your installation:

import pyctakes

# Test basic functionality
pipeline = pyctakes.create_basic_pipeline()
result = pipeline.process_text("Patient has diabetes.")
print(f"Found {len(result.document.annotations)} annotations")
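
The snippet above stops with an exception if anything is missing. A slightly more defensive variant, sketched here with a hypothetical helper named `verify_pyctakes`, reports the failure instead. It relies only on the `create_basic_pipeline` and `process_text` calls shown above and assumes nothing else about the pyCTAKES API.

```python
import importlib


def verify_pyctakes(text="Patient has diabetes."):
    """Return (ok, message) describing whether a basic pipeline runs."""
    try:
        pyctakes = importlib.import_module("pyctakes")
    except ImportError as exc:
        return False, f"pyctakes is not importable: {exc}"
    try:
        # Same calls as the verification snippet in this guide
        pipeline = pyctakes.create_basic_pipeline()
        result = pipeline.process_text(text)
        count = len(result.document.annotations)
    except Exception as exc:
        return False, f"pyctakes imported, but the basic pipeline failed: {exc!r}"
    return True, f"OK: found {count} annotations"


if __name__ == "__main__":
    ok, message = verify_pyctakes()
    print(message)
```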

Docker Installation

Run pyCTAKES in Docker:

# Pull the image
docker pull sonish777/pyctakes:latest

# Run interactively
docker run -it sonish777/pyctakes:latest python

# Process a file
docker run -v "$(pwd)":/data sonish777/pyctakes:latest \
  pyctakes annotate /data/clinical_note.txt

Troubleshooting

Common Issues

1. spaCy model not found

python -m spacy download en_core_web_sm

2. Permission errors

pip install --user pyctakes

3. Environment conflicts

# Use virtual environment
python -m venv pyctakes_env
source pyctakes_env/bin/activate  # Linux/Mac
# or pyctakes_env\Scripts\activate  # Windows
pip install pyctakes
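
To confirm that the interpreter you are running belongs to the virtual environment, a short check along these lines may help. It is a sketch with a hypothetical helper name; it detects environments created with `python -m venv` (or virtualenv) and will report False for a conda environment.

```python
import sys


def in_virtual_environment():
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print(f"Interpreter: {sys.executable}")
    print(f"Inside a virtual environment: {in_virtual_environment()}")
```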

Platform-Specific Notes

macOS Apple Silicon (M1/M2)

# Install with conda for better compatibility
conda install pyctakes -c conda-forge

Windows

# Use conda on Windows for easier dependency management
conda install pyctakes -c conda-forge

Linux

# Debian/Ubuntu: building some dependencies may require Python headers and a compiler toolchain
sudo apt-get install python3-dev build-essential
pip install pyctakes

Performance Optimization

For optimal performance:

Install spaCy models: Enables model-based NER, which generally improves accuracy over rule-based processing alone
Use SSD storage: Faster model loading
Allocate sufficient RAM: 4 GB or more recommended for large models
GPU support: Install CUDA-compatible packages for transformer models
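
Whether a GPU is actually visible can be checked before loading transformer models. The sketch below assumes the transformer backend is PyTorch-based, which this guide does not state; the helper name `cuda_status` is hypothetical.

```python
def cuda_status():
    """Return True/False for CUDA availability, or None if PyTorch is absent."""
    try:
        import torch  # assumption: transformer models run on PyTorch
    except ImportError:
        return None
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    status = cuda_status()
    if status is None:
        print("PyTorch is not installed; GPU check skipped")
    elif status:
        print("CUDA GPU available")
    else:
        print("No CUDA GPU detected; models will run on CPU")
```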