Troubleshooting¶
Solutions for common issues when using QuantNado.
Installation Issues¶
Python 3.14 Compatibility Error¶
Problem: the configured Python interpreter version (3.14) is newer than PyO3's maximum supported version (3.13)
Solution: Downgrade to Python 3.13.
QuantNado requires Python 3.12 or 3.13 because of PyO3 compatibility. Python 3.14 support is planned.
Permission Denied When Creating Dataset¶
Problem: PermissionError: /path/to/dataset.zarr
Solution:
1. Check write permissions in the output directory.
2. Fix permissions if needed.
3. Or write to a different location (e.g., a temporary directory).
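The first and last steps can be sketched in Python with the standard library alone. This is a minimal illustration, not part of QuantNado: the helper names `can_write_to` and `pick_output_dir` are hypothetical, and the fallback to the system temporary directory is an assumption about what "a different location" should be.

```python
import os
import tempfile
from pathlib import Path


def can_write_to(directory):
    """Return True if the directory exists and the current user may create files in it."""
    path = Path(directory)
    # Creating a file needs both write and search (execute) permission on the directory.
    return path.is_dir() and os.access(path, os.W_OK | os.X_OK)


def pick_output_dir(preferred):
    """Use the preferred directory if writable, else fall back to the system temp directory."""
    if can_write_to(preferred):
        return Path(preferred)
    return Path(tempfile.gettempdir())


out_dir = pick_output_dir(".")
print(f"Dataset would be written to: {out_dir / 'dataset.zarr'}")
```

The resulting path can then be passed to `--output`. Fixing permissions (step 2) is left to `chmod` or to your system administrator.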
BAM File Issues¶
Error: "No such file or directory" for BAM¶
Problem: BAM file not found or glob pattern doesn't match
Solution:
1. Check that the files exist.
2. Verify that you are in the correct directory.
Error: "Missing index file"¶
Problem: BAM file lacks a .bai index
Solution: Create the index with samtools index, which writes a .bai file next to the BAM file.
Dataset Creation Issues¶
Very Slow Dataset Creation¶
Problem: Processing takes hours for moderate-sized files
Solution: Enable parallelization with --max-workers:
quantnado create-dataset *.bam \
    --output dataset.zarr \
    --chromsizes hg38.chrom.sizes \
    --max-workers 8 # Adjust based on CPU cores
Check available cores:
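One way to check the core count from Python is sketched below. The helper `suggest_max_workers` and its "leave one core free" rule are assumptions for illustration, not a QuantNado recommendation.

```python
import os


def suggest_max_workers(reserve=1):
    """Suggest a worker count: usable CPUs minus a small reserve, never below 1."""
    try:
        # On Linux this honours the CPU affinity mask, which a job scheduler may restrict.
        available = len(os.sched_getaffinity(0))
    except AttributeError:
        # sched_getaffinity does not exist on macOS or Windows.
        available = os.cpu_count() or 1
    return max(1, available - reserve)


print(f"Suggested --max-workers value: {suggest_max_workers()}")
```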
"Invalid chromsizes format" Error¶
Problem: Chromsizes file not recognized
Solution: Verify format (tab-separated, 2 columns):
Should be:
- Column 1: Chromosome name (chr1, chr2, etc.)
- Column 2: Chromosome length (integer)
- Tab-separated (no spaces)
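The format rules above can be checked mechanically. The following is a minimal sketch using only the standard library. `validate_chromsizes` is a hypothetical helper, and QuantNado's own parser may accept or reject slightly different input.

```python
from pathlib import Path


def validate_chromsizes(path):
    """Return a list of problems in a chrom.sizes file; an empty list means it looks valid."""
    problems = []
    lines = Path(path).read_text().splitlines()
    if not lines:
        problems.append("file is empty")
    for number, line in enumerate(lines, start=1):
        fields = line.split("\t")
        if len(fields) != 2:
            problems.append(
                f"line {number}: expected 2 tab-separated columns, got {len(fields)}"
            )
            continue
        name, length = fields
        if not name or name != name.strip():
            problems.append(f"line {number}: chromosome name is empty or padded with spaces")
        if not (length.isascii() and length.isdigit()) or int(length) <= 0:
            problems.append(f"line {number}: length {length!r} is not a positive integer")
    return problems
```

Calling `validate_chromsizes("hg38.chrom.sizes")` and printing the result shows each offending line, which is usually enough to spot a space-separated or headered file.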
Download correct file:
# For hg38
curl -O https://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes
# For other genomes, check UCSC Genome Browser
Peak Calling Issues¶
No Peaks Called¶
Problem: Peak calling produces empty output
Solutions:
1. Reduce the quantile threshold:
quantnado call-peaks \
    --bigwig-dir ./signals/ \
    --output-dir ./peaks/ \
    --quantile 0.95 # Less stringent
2. Check the bigWig signal.
3. Disable the blacklist temporarily.
4. Check for all-zero regions.
Python API Issues¶
Cannot Import QuantNado¶
Problem: ImportError: No module named 'quantnado'
Solution:
# Verify installation
python -c "import quantnado; print(quantnado.__version__)"
# If not found, install
pip install quantnado
# For development:
pip install -e . # From source directory
Dataset Open Fails¶
Problem: Error: Cannot open Zarr store
Solution:
1. Check that the dataset exists.
2. Verify that it is a valid Zarr store.
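Both checks can be approximated without importing QuantNado or zarr. The sketch below assumes the dataset is a directory-based store on a local filesystem. `describe_store` is a hypothetical helper, and it only looks for the metadata files that mark the root of a Zarr hierarchy (`zarr.json` for Zarr format 3, `.zgroup` or `.zarray` for format 2), so it cannot detect damage deeper in the store.

```python
from pathlib import Path

# Files that mark the root of a Zarr hierarchy in format 3 and format 2 respectively.
ZARR_MARKERS = ("zarr.json", ".zgroup", ".zarray")


def describe_store(path):
    """Classify a path as missing, not a directory, or a plausible Zarr store."""
    store = Path(path)
    if not store.exists():
        return "missing"
    if not store.is_dir():
        return "not a directory"
    if any((store / marker).is_file() for marker in ZARR_MARKERS):
        return "looks like a Zarr store"
    return "directory without Zarr metadata"


print(describe_store("dataset.zarr"))
```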
Memory Issues With Large Datasets¶
Problem: Out of memory when accessing data
Solution: Load data in chunks:
from quantnado import QuantNado
qn = QuantNado.open("large_dataset.zarr")
chroms = qn.to_xarray()
# Process chromosome by chromosome
for chrom in qn.chromosomes:
    data = chroms[chrom] # Lazy load
    result = data.compute() # Compute only what you need
    del result # Free memory
HPC/Cluster Issues¶
SLURM Job Fails With "Module Not Found"¶
Problem: Script runs locally but fails on cluster
Solution: Activate conda environment in job script:
#!/bin/bash
#SBATCH --job-name=quantnado
#SBATCH --cpus-per-task=16
# Activate environment
source /path/to/conda/etc/profile.d/conda.sh
conda activate quantnado
# Run command
quantnado create-dataset *.bam --output dataset.zarr
Out of Memory in SLURM Job¶
Problem: Job killed with MemoryLimit exceeded
Solution: Request more memory in the job script, for example with SLURM's --mem option.
Or reduce parallelization:
quantnado create-dataset *.bam \
    --output dataset.zarr \
    --max-workers 4 # Fewer threads = less memory
Getting Help¶
If issues persist:
1. Check Basic Usage for examples
2. Review CLI Reference for all options
3. Open an issue on GitHub
4. Include relevant error messages and the command used
Problem: Dependency conflict between anndata and zarr.
Root Cause: anndata doesn't support zarr 3.x but QuantNado requires it.
Solution:
Or reinstall QuantNado fresh:
ImportError: cannot import name 'BaseCompressedSparseDataset'¶
Problem: Incompatible anndata version.
Solution:
Module 'pkg_resources' has been deprecated¶
Problem: Warning about deprecated pkg_resources (non-fatal).
Solution: This is only a warning and does not affect functionality. It should disappear once the affected dependencies release versions that no longer use pkg_resources.
Dataset Creation Issues¶
KeyError: "not all values found in index 'sample'"¶
Problem: Trying to select samples that don't exist in dataset.
Example Error:
X = chrom['chr1'].sel(sample=['CAT-RCH-ACV_H3K27ac']).compute()
# KeyError: "not all values found in index 'sample'"
Solution:
1. Check the available samples.
2. Remember that sample names are BAM file stems (the file name without its extension).
3. Select the correct sample.
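The naming rule can be illustrated with plain Python. This sketch assumes that "stem" means the same thing as `pathlib.Path.stem`. The helpers `sample_name` and `missing_samples` and the file paths are hypothetical.

```python
from pathlib import Path


def sample_name(bam_path):
    """Sample name under the stem rule: the file name without its final extension."""
    return Path(bam_path).stem


def missing_samples(requested, available):
    """Return the requested names that are not among the available sample names."""
    known = set(available)
    return [name for name in requested if name not in known]


bam_files = ["/data/sampleA_H3K27ac.bam", "/data/sampleB_H3K27ac.sorted.bam"]  # hypothetical
available = [sample_name(path) for path in bam_files]
print("Available:", available)
print("Not found:", missing_samples(["sampleA_H3K27ac", "sampleB_H3K27ac"], available))
```

Note that `Path.stem` strips only the last suffix, so `sampleB_H3K27ac.sorted.bam` becomes `sampleB_H3K27ac.sorted`. A mismatch of this kind is a common reason for the KeyError above.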
FileNotFoundError: BAM file not found¶
Problem: Specified BAM file path doesn't exist.
Solution:
from pathlib import Path
# Verify files exist
bam_files = list(Path("/path/to/bams").glob("*.bam"))
print(f"Found {len(bam_files)} BAM files")
print(bam_files)
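# Check for indexed files (the index may be named file.bam.bai or file.bai)
for bam in bam_files:
    candidates = [Path(str(bam) + ".bai"), bam.with_suffix(".bai")]
    if not any(index.exists() for index in candidates):
        print(f"Missing index for {bam}")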
ValueError: chromsizes_dict appears empty¶
Problem: No valid chromosomes extracted from BAM file.
Causes:
- BAM file is corrupted
- BAM file header is incomplete
- Invalid chromsizes file
Solutions:
1. Verify BAM file integrity.
2. Provide an explicit chromsizes file.
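A cheap integrity check is to confirm that the file starts with the gzip magic bytes and ends with the 28-byte empty BGZF block that the SAM/BAM specification defines as the end-of-file marker. The sketch below does only that. `quick_check` is a hypothetical helper, it is similar in spirit to `samtools quickcheck`, and it will not catch corruption in the middle of a file.

```python
from pathlib import Path

# Empty BGZF block that terminates a complete BAM file (SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 "
    "02 00 1b 00 03 00 00 00 00 00 00 00 00 00"
)


def quick_check(bam_path):
    """Return a list of problems; an empty list means the cheap checks passed."""
    path = Path(bam_path)
    if not path.is_file():
        return ["file does not exist"]
    if path.stat().st_size < len(BGZF_EOF):
        return ["file is too small to be a BAM file"]
    problems = []
    with path.open("rb") as handle:
        if handle.read(2) != b"\x1f\x8b":
            problems.append("file does not start with the gzip magic bytes")
        # Seek relative to the end of the file (whence=2) to read the last 28 bytes.
        handle.seek(-len(BGZF_EOF), 2)
        if handle.read() != BGZF_EOF:
            problems.append("BGZF end-of-file marker is missing (file may be truncated)")
    return problems
```

A truncated upload or an interrupted copy typically fails the end-of-file check, in which case re-transferring the BAM file is the first thing to try.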
ValueError: sample_names must not be empty¶
Problem: No BAM files provided.
Solution:
# Check bam_files list
bam_files = [...]
print(f"BAM files: {bam_files}")
# Verify it's not empty
assert len(bam_files) > 0
Dataset creation is slow¶
Problem: Processing takes longer than expected.
Typical durations:
- 10M reads: 1-2 minutes per sample
- 50M reads: 5-10 minutes per sample
- 100M+ reads: 15-30+ minutes per sample
Optimization:
# Use fewer workers for faster startup (less memory)
quantnado create-dataset *.bam --output dataset.zarr --max-workers 1
# Use more workers for larger datasets (more memory, work runs in parallel)
quantnado create-dataset *.bam --output dataset.zarr --max-workers 8
MemoryError during dataset creation¶
Problem: Running out of RAM during processing.
Solutions:
1. Use fewer workers.
2. Process in batches.
3. Increase available memory (system dependent).
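Batching can be as simple as splitting the list of BAM files into fixed-size groups. The sketch below shows only the splitting. `batched` is a hypothetical helper, the file names are placeholders, and the idea that each batch is written to its own output store is an assumption rather than documented QuantNado behaviour.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


bam_files = [f"sample{i}.bam" for i in range(1, 8)]  # hypothetical file names
for index, batch in enumerate(batched(bam_files, 3), start=1):
    # Assumption: each batch is processed in a separate run with its own output store.
    print(f"batch {index}: {batch}")
```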
Analysis Issues¶
ValueError: not all values found in index¶
Problem: Selecting ranges that don't exist.
Solution:
# Check the actual data
signal = qn.reduce(intervals_path="regions.bed", reduction="mean")
print(signal.dims)
print(signal.coords)
# Use valid sample names
valid_samples = signal.sample.values
signal_subset = signal.sel(sample=valid_samples[:2])
ValueError: fixed_width is not divisible by bin_size¶
Problem: Incompatible parameters.
qn.extract(
    feature_type="promoter",
    gtf_path="genes.gtf",
    fixed_width=2000,
    bin_size=50, # 2000 / 50 = 40 ✓ OK
    bin_agg="mean"
)
Solution: Choose values so that fixed_width is an exact multiple of bin_size, as in the call above (2000 / 50 = 40 bins).
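The divisibility rule is easy to check before calling `extract`. The helpers below are hypothetical and only mirror the arithmetic. They do not call QuantNado.

```python
def check_binning(fixed_width, bin_size):
    """Return the number of bins, or raise ValueError if the width does not divide evenly."""
    if bin_size <= 0 or fixed_width % bin_size != 0:
        raise ValueError(
            f"fixed_width={fixed_width} is not divisible by bin_size={bin_size}"
        )
    return fixed_width // bin_size


def valid_bin_sizes(fixed_width, minimum=1):
    """List every bin size of at least `minimum` that divides fixed_width exactly."""
    return [size for size in range(minimum, fixed_width + 1) if fixed_width % size == 0]


print(check_binning(2000, 50))       # 40 bins
print(valid_bin_sizes(2000, 10)[:5])  # smallest valid bin sizes of 10 or more
```

For `fixed_width=2000`, a `bin_size` of 300 raises the error, while 50, 100, or 250 all divide evenly.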
Shape mismatch errors¶
Problem: Dimension mismatch in operations.
Debug:
signal = qn.reduce(intervals_path="regions.bed")
print("Signal shape:", signal["mean"].shape) # (ranges, sample)
# PCA expects (feature, sample)
pca, transformed = qn.pca(signal["mean"], n_components=10)
# extract expects (regions, position, sample)
extracted = qn.extract(intervals_path="regions.bed")
print("Extracted shape:", extracted.shape)
Data Issues¶
NaN values in output¶
Problem: Reduced signal contains NaN values.
Causes:
- Regions outside chromosome boundaries
- Missing data in original BAM
- Integer overflow
Solutions:
import numpy as np
signal = qn.reduce(intervals_path="regions.bed")
# Check for NaN
print(f"NaN count: {np.isnan(signal['mean']).sum().values}")
# Remove NaN-containing regions
signal_clean = signal.dropna('ranges', how='any')
# Or use for PCA with explicit NaN handling
pca, transformed = qn.pca(
    signal["mean"],
    nan_handling_strategy="drop"
)
Zero signal everywhere¶
Problem: All signal values are zero.
Likely causes:
- BAM file has no reads
- Chromosome names don't match
- Region definition is incorrect
Debug:
# Check BAM file
samtools flagstat file.bam
samtools view -c file.bam
# Check chromosome names
samtools view -H file.bam | grep '^@SQ'
Unexpected sparsity¶
Problem: Dataset is sparser than expected.
Info: Check the dataset's sparsity metrics.
High sparsity is expected for genomic signal data, where 90% or more of the values are usually zero.
File & Storage Issues¶
Zarr store is corrupted¶
Problem: Can't read Zarr dataset.
Symptoms:
- Permission errors
- "zarr.errors.ArrayNotFoundError"
- Missing data
Solutions:
1. Verify store integrity.
2. Recreate from BAM files if the store is damaged.
3. Check disk space and permissions.
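Disk space and access rights can be inspected from Python with the standard library. This sketch assumes a directory-based store that already exists on a local filesystem. `store_size_bytes` and `storage_report` are hypothetical helpers.

```python
import os
import shutil
from pathlib import Path


def store_size_bytes(store_path):
    """Sum the sizes of all files under a directory-based store."""
    total = 0
    for root, _dirs, files in os.walk(store_path):
        for name in files:
            total += (Path(root) / name).stat().st_size
    return total


def storage_report(store_path):
    """Summarise size, free space, and access rights for an existing store directory."""
    store = Path(store_path)
    usage = shutil.disk_usage(store)  # total, used, free for the filesystem holding the store
    return {
        "store_bytes": store_size_bytes(store),
        "free_bytes": usage.free,
        "readable": os.access(store, os.R_OK | os.X_OK),
        "writable": os.access(store, os.W_OK | os.X_OK),
    }
```

Printing `storage_report("dataset.zarr")` shows at a glance whether a failed read is more likely a permissions problem or a full disk.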
Permission denied errors¶
Problem: Cannot access Zarr store or BAM files.
Solution:
# Check file permissions, then make the BAM file readable
ls -l file.bam
chmod 644 file.bam
# Or fix the directory permissions
chmod 755 /path/to/dataset.zarr
Out of disk space¶
Problem: Zarr creation fails due to insufficient space.
Solution:
1. Free up space
2. Calculate the required space
3. Use remote storage if available
Performance Issues¶
Dask operations are slow¶
Problem: Extract or reduce operations take too long.
Solutions:
1. Reduce dataset size:
# Use fewer chromosomes
signal = qn.reduce(
    intervals_path="regions_chr1_only.bed",
    reduction="mean"
)
# Use fewer samples
signal = signal.sel(sample=['sample1', 'sample2'])
2. Increase chunk size (trade-off with memory).
3. Compute in batches rather than all at once.
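Per-chromosome BED files such as the `regions_chr1_only.bed` used above can be produced with a few lines of Python. This is a sketch with the standard library only. `split_bed_by_chromosome` is a hypothetical helper, and the output naming pattern is chosen to match the example file name.

```python
from collections import defaultdict
from pathlib import Path


def split_bed_by_chromosome(bed_path, out_dir):
    """Write one BED file per chromosome and return {chromosome: output path}."""
    lines_by_chrom = defaultdict(list)
    for line in Path(bed_path).read_text().splitlines():
        # Skip blank lines and the header-style lines a BED file may start with.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom = line.split(None, 1)[0]  # first whitespace-separated field
        lines_by_chrom[chrom].append(line)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = {}
    for chrom, lines in lines_by_chrom.items():
        target = out_dir / f"regions_{chrom}_only.bed"
        target.write_text("\n".join(lines) + "\n")
        outputs[chrom] = target
    return outputs
```

Each output path can then be passed as `intervals_path` to `qn.reduce`, one chromosome at a time, so that only a fraction of the regions is held in memory per call.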
High memory usage¶
Problem: Dask operations consume too much memory.
Solutions:
1. Set a Dask memory limit.
2. Use smaller chunks.
3. Process regions in batches.
Metadata Issues¶
Metadata not being read¶
Problem: Metadata is empty or not loaded.
Debug:
qn = QuantNado.open("dataset.zarr")
print(qn.metadata) # Check if empty
# Check if metadata was set
print(qn.store.list_metadata_columns())
Solution: Ensure metadata was provided during creation:
qn = QuantNado.from_bam_files(
    bam_files=[...],
    store_path="dataset.zarr",
    metadata="samples.csv", # Provide CSV path
    sample_column="sample_id"
)
Metadata column name mismatch¶
Problem: Sample column not found.
Solution:
import pandas as pd
# Check CSV column names
metadata = pd.read_csv("samples.csv")
print(metadata.columns)
# Use correct column name
qn = QuantNado.from_bam_files(
    bam_files=[...],
    store_path="dataset.zarr",
    metadata="samples.csv",
    sample_column="sample_id" # Must match CSV column
)
Getting Help¶
If issues persist:
- Check Documentation: Full docs
- Search GitHub Issues: QuantNado Issues
- Enable Debug Logging
- Report Issue with:
  - Error message and traceback
  - Python version: python --version
  - QuantNado version: python -c "import quantnado; print(quantnado.__version__)"
  - Operating system
  - Minimal reproducible example
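The version and platform details for a report can be collected in one step. The sketch below uses only the standard library. `environment_report` is a hypothetical helper, and it assumes the installed distribution is named `quantnado`, as in the `pip install quantnado` command earlier on this page.

```python
import platform
from importlib import metadata


def environment_report(package="quantnado"):
    """Collect the Python, package, and operating system details useful in a bug report."""
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        package_version = "not installed"
    return {
        "python": platform.python_version(),
        "package": package,
        "package_version": package_version,
        "os": platform.platform(),
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Pasting this output into the issue alongside the traceback covers the Python version, QuantNado version, and operating system items in the list above.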