1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
*.exe
107 changes: 79 additions & 28 deletions C++/Lanczos/README.md
@@ -1,28 +1,79 @@
**Lanczos Method with Selective Ortohanalization**

These functions need liblapacke (and hence libblas and liblapack). To install them do:
sudo apt update
sudo apt install libopenblas-base
sudo apt install libopenblas-dev
sudo apt install liblapack3
sudo apt install liblapack-dev
sudo apt install liblapacke-dev

Then make sure that you are using version 7 of gcc and g++

**Rmark:** In my case the folder containing **straw** is **~/HiC/straw_may_2022**. The user should replace it by their own folder.

To create an executable for computing few leading eigenvectors of the correlation matrix of contact matrix for a particular chromosome do:
**g++ -O2 -o Lan.exe s_fLan.cpp s_fSOLan.c s_dthMul.c hgFlipSign.c ~/HiC/straw_may_2022/C++/straw.cpp -I. -I ~/HiC/straw_may_2022/C++ -lz -lcurl -lpthread -lblas -llapack -llapacke**
Run
./Lan.exe
to see usage.
By default it uses unnormalized observed over expected (o/e) matrix.
Use -o flag to use observed matrix instead (usually not recommended)
Use -n norm to use normalized matrix; norm can be NONE (no normalization - default), VC, VC_SQRT, KR, SCALE, SCALA, etc.

To do the above for Genome Wide (GW) contact matrix do:
**g++ -O2 -o GWev.exe s_fGW.cpp getGWMatrix.cpp s_fSOLan.c s_dthMul.c ~/HiC/straw_may_2022/C++/straw.cpp -I ~/HiC/straw_may_2022/C++ -lz -lcurl -lpthread -lblas -llapack -llapacke**
Run
./GWev.exe for usage.
By default it uses **inter**chromosomal matrix. To use the full matrix specife the -f flag
**Lanczos Method with Selective Orthogonalization**

This package provides tools for computing leading eigenvectors of contact matrices from Hi-C data.
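For orientation, the basic Lanczos recurrence can be sketched as follows. This is only an illustration, not the code in `s_fSOLan.c`: it uses a small dense matrix and full reorthogonalization (simpler than the selective variant this package is named for), and the names `Tridiag` and `lanczosTridiag` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense symmetric matrix, for illustration only

// Diagonal (alpha) and off-diagonal (beta) of the Lanczos tridiagonal matrix.
struct Tridiag {
    Vec alpha;
    Vec beta;
};

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) s += a[k] * b[k];
    return s;
}

static Vec matVec(const Mat& A, const Vec& v) {
    Vec r(v.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j) r[i] += A[i][j] * v[j];
    return r;
}

// Runs at most m Lanczos steps on symmetric A, starting from q.
// Every new vector is orthogonalized against ALL previous basis vectors
// (full reorthogonalization), which is an assumption made for brevity.
Tridiag lanczosTridiag(const Mat& A, Vec q, std::size_t m) {
    assert(!A.empty() && A.size() == q.size() && m >= 1 && m <= A.size());
    const double norm0 = std::sqrt(dot(q, q));
    assert(norm0 > 0.0);
    for (double& x : q) x /= norm0;

    std::vector<Vec> basis;
    Tridiag T;
    for (std::size_t k = 0; k < m; ++k) {
        basis.push_back(q);
        Vec w = matVec(A, q);
        T.alpha.push_back(dot(w, q));
        for (const Vec& u : basis) {  // remove components along the basis
            const double c = dot(w, u);
            for (std::size_t i = 0; i < w.size(); ++i) w[i] -= c * u[i];
        }
        const double b = std::sqrt(dot(w, w));
        if (k + 1 == m || b < 1e-12) break;  // done, or invariant subspace found
        T.beta.push_back(b);
        for (std::size_t i = 0; i < w.size(); ++i) q[i] = w[i] / b;
    }
    return T;
}
```

The leading eigenvalues of the small tridiagonal matrix defined by `alpha` and `beta` approximate the leading eigenvalues of `A`; computing them (for example with a LAPACKE tridiagonal solver) is left out of this sketch.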

## Building the Software

### Prerequisites

The software requires:
- C++ compiler (GCC/G++ 4.8 or later)
- OpenBLAS
- LAPACK/LAPACKE
- libcurl
- zlib
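- straw library (the Linux and Windows build scripts expect it in `~/straw` and `%USERPROFILE%\straw`; the macOS script expects `../../../straw` relative to the directory it is run from; modify `STRAW_PATH` in the script if yours lives elsewhere)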

### Platform-Specific Build Instructions

#### Linux
```bash
# Make the script executable
chmod +x build_linux.sh
# Run the build script
./build_linux.sh
```

The script installs the required dependencies with `apt-get`, so it needs `sudo` privileges.

#### macOS
```bash
# Make the script executable
chmod +x build_mac.sh
# Run the build script
./build_mac.sh
```

The script installs the required dependencies with Homebrew, which must already be installed.

#### Windows
1. Install MinGW-w64 from [WinLibs](https://winlibs.com/)
2. Add MinGW-w64 bin directory to your PATH
3. Run the build script:
```cmd
build_windows.bat
```

You may need to modify the paths in `build_windows.bat` to match your MinGW-w64 installation.

## Usage

### Chromosome-Specific Analysis (Lan.exe)
```bash
./Lan.exe [options] <hicfile> <chromosome> <outbase> <resolution> [nv]

Options:
  -o            Use the observed matrix instead of the observed/expected (o/e) matrix
  -t <float>    Set tolerance (default: 1.0e-7)
  -e <float>    Set epsilon (default: 1.0e-8)
  -I <int>      Set maximum iterations (default: 200)
  -n <string>   Set normalization method, e.g. VC, VC_SQRT, KR or SCALE (default: NONE)
  -T <int>      Set number of threads (default: 1)
  -v <int>      Set verbosity level (default: 1)
```

### Genome-Wide Analysis (GWev.exe)
```bash
./GWev.exe [options] <hicfile> <outbase> <resolution> [nv]

Options:
  -f            Use the full matrix instead of the inter-chromosomal matrix only
  -t <float>    Set tolerance (default: 1.0e-7)
  -e <float>    Set epsilon (default: 1.0e-8)
  -I <int>      Set maximum iterations (default: 200)
  -T <int>      Set number of threads (default: 1)
  -v <int>      Set verbosity level (default: 1)
```

## Output Format
The programs generate eigenvector files in WIG format that can be visualized in genome browsers.
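As a rough illustration of what such a file can look like, the sketch below writes one eigenvector as a `fixedStep` WIG track with one value per bin. The exact layout produced by `Lan.exe` and `GWev.exe` is not shown here, so the choice of `fixedStep`, the `start=1` offset, and the helper names `writeWigFixedStep` and `wigToString` are all assumptions.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes one value per bin as a fixedStep WIG track (1-based start position).
// Hypothetical helper: the real programs may emit a different WIG flavor.
void writeWigFixedStep(std::ostream& out, const std::string& chrom,
                       long binsize, const std::vector<double>& values) {
    assert(binsize > 0);
    out << "fixedStep chrom=" << chrom << " start=1 step=" << binsize
        << " span=" << binsize << "\n";
    for (double x : values) out << x << "\n";
}

// Convenience wrapper that returns the track as a string.
std::string wigToString(const std::string& chrom, long binsize,
                        const std::vector<double>& values) {
    std::ostringstream os;
    writeWigFixedStep(os, chrom, binsize, values);
    return os.str();
}
```

With `step` and `span` both equal to the resolution, each value covers exactly one bin, which is how a genome browser would draw a per-bin eigenvector track.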
68 changes: 68 additions & 0 deletions C++/Lanczos/build_linux.sh
@@ -0,0 +1,68 @@
#!/bin/bash

# Exit on error
set -e

echo "🔧 Setting up build environment..."

# Install dependencies using apt
echo "📦 Installing dependencies..."
sudo apt-get update
sudo apt-get install -y \
build-essential \
gcc \
g++ \
libopenblas-dev \
liblapack-dev \
liblapacke-dev \
libcurl4-openssl-dev \
zlib1g-dev

# Check if straw is available
STRAW_PATH="$HOME/straw"
if [ ! -d "$STRAW_PATH" ]; then
echo "⚠️ Warning: straw library not found at $STRAW_PATH"
echo "Please install straw library and place it in $STRAW_PATH"
echo "or modify this script with the correct path"
exit 1
fi

echo "🔨 Building executables..."

# Common compiler flags
COMMON_FLAGS="-O2 -Wno-format-security -I/usr/include -I$STRAW_PATH/C++"
COMMON_LIBS="-L/usr/lib -lz -lcurl -lpthread -lopenblas -llapack -llapacke"

# First compile straw library
echo "Building straw library..."
g++ $COMMON_FLAGS -c "$STRAW_PATH/C++/straw.cpp" -o straw.o

# Compile Lan.exe
echo "Building Lan.exe..."
g++ $COMMON_FLAGS -std=c++11 -o Lan.exe \
s_fLan.cpp \
s_fSOLan.c \
s_dthMul.c \
hgFlipSign.c \
straw.o \
-I. \
$COMMON_LIBS

# Compile GWev.exe
echo "Building GWev.exe..."
g++ $COMMON_FLAGS -std=c++11 -o GWev.exe \
s_fGW.cpp \
getGWMatrix.cpp \
s_fSOLan.c \
s_dthMul.c \
straw.o \
$COMMON_LIBS

# Clean up object files
rm -f straw.o

echo "✅ Build completed successfully!"
echo
echo "You can now run:"
echo " ./Lan.exe - for chromosome-specific analysis"
echo " ./GWev.exe - for genome-wide analysis"
69 changes: 69 additions & 0 deletions C++/Lanczos/build_mac.sh
@@ -0,0 +1,69 @@
#!/bin/bash

# Exit on error
set -e

echo "🔧 Setting up build environment..."

# Assume Homebrew is installed

# Install dependencies
echo "📦 Installing dependencies..."
brew install openblas
brew install lapack
brew install gcc@13 # Latest stable GCC
brew install curl
brew install zlib

# Locate the Homebrew installs of OpenBLAS and LAPACK.
# `brew --prefix` resolves to /opt/homebrew on Apple Silicon and /usr/local on Intel Macs.
OPENBLAS_PREFIX="$(brew --prefix openblas)"
LAPACK_PREFIX="$(brew --prefix lapack)"
export LDFLAGS="-L$OPENBLAS_PREFIX/lib -L$LAPACK_PREFIX/lib"
export CPPFLAGS="-I$OPENBLAS_PREFIX/include -I$LAPACK_PREFIX/include"

# Check if straw is available (the path is relative to the directory the script is run from)
STRAW_PATH="../../../straw"
if [ ! -d "$STRAW_PATH" ]; then
echo "⚠️ Warning: straw library not found at $STRAW_PATH"
echo "Please install straw library and place it in $STRAW_PATH"
echo "or modify this script with the correct path"
exit 1
fi

echo "🔨 Building executables..."

# Common compiler flags
COMMON_FLAGS="-O2 -Wno-format-security $CPPFLAGS -I$STRAW_PATH/C++"
COMMON_LIBS="$LDFLAGS -lz -lcurl -lpthread -lopenblas -llapack -llapacke"

# First compile straw library
echo "Building straw library..."
g++-13 $COMMON_FLAGS -c "$STRAW_PATH/C++/straw.cpp" -o straw.o

# Compile Lan.exe
echo "Building Lan.exe..."
g++-13 $COMMON_FLAGS -std=c++11 -o Lan.exe \
s_fLan.cpp \
s_fSOLan.c \
s_dthMul.c \
hgFlipSign.c \
straw.o \
-I. \
$COMMON_LIBS

# Compile GWev.exe
echo "Building GWev.exe..."
g++-13 $COMMON_FLAGS -std=c++11 -o GWev.exe \
s_fGW.cpp \
getGWMatrix.cpp \
s_fSOLan.c \
s_dthMul.c \
straw.o \
$COMMON_LIBS

# Clean up object files
rm -f straw.o

echo "✅ Build completed successfully!"
echo
echo "You can now run:"
echo " ./Lan.exe - for chromosome-specific analysis"
echo " ./GWev.exe - for genome-wide analysis"
67 changes: 67 additions & 0 deletions C++/Lanczos/build_windows.bat
@@ -0,0 +1,67 @@
@echo off
setlocal enabledelayedexpansion

echo 🔧 Setting up build environment...

REM Check if MinGW-w64 is installed and in PATH
where g++ >nul 2>&1
if %ERRORLEVEL% NEQ 0 (
echo Error: g++ not found. Please install MinGW-w64 and add it to your PATH
echo You can download it from: https://winlibs.com/
exit /b 1
)

REM Set paths - MODIFY THESE AS NEEDED
set "MINGW_PATH=C:\mingw64"
set "STRAW_PATH=%USERPROFILE%\straw"
set "OPENBLAS_PATH=%MINGW_PATH%\opt\openblas"

REM Check if straw exists
if not exist "%STRAW_PATH%" (
echo ⚠️ Warning: straw library not found at %STRAW_PATH%
echo Please install straw library and place it in %STRAW_PATH%
echo or modify this script with the correct path
exit /b 1
)

echo 🔨 Building executables...

REM Common compiler flags
set "COMMON_FLAGS=-O2 -Wno-format-security -I%MINGW_PATH%\include -I%STRAW_PATH%\C++"
set "COMMON_LIBS=-L%MINGW_PATH%\lib -lz -lcurl -lpthread -lopenblas -llapack -llapacke"

REM First compile straw library
echo Building straw library...
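g++ %COMMON_FLAGS% -c "%STRAW_PATH%\C++\straw.cpp" -o straw.o
REM Stop here if compilation failed; batch files do not abort on errors by themselves
if errorlevel 1 exit /b 1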

REM Compile Lan.exe
echo Building Lan.exe...
g++ %COMMON_FLAGS% -std=c++11 -o Lan.exe ^
s_fLan.cpp ^
s_fSOLan.c ^
s_dthMul.c ^
hgFlipSign.c ^
straw.o ^
-I. ^
%COMMON_LIBS%
if errorlevel 1 exit /b 1

REM Compile GWev.exe
echo Building GWev.exe...
g++ %COMMON_FLAGS% -std=c++11 -o GWev.exe ^
s_fGW.cpp ^
getGWMatrix.cpp ^
s_fSOLan.c ^
s_dthMul.c ^
straw.o ^
%COMMON_LIBS%
if errorlevel 1 exit /b 1

REM Clean up object files
del straw.o

echo ✅ Build completed successfully!
echo.
echo You can now run:
echo Lan.exe - for chromosome-specific analysis
echo GWev.exe - for genome-wide analysis

endlocal
12 changes: 6 additions & 6 deletions C++/Lanczos/getGWMatrix.cpp
@@ -13,15 +13,15 @@ unsigned int getMatrix(string fname, int binsize, string norm, string ob, bool i
  string unit("BP");
  ifstream fin;

- long master = 0L;
+ int64_t master = 0LL;
  map<string, chromosome> chromosomeMap;
  string genomeID;
- int version = 0;
- long nviPosition = 0;
- long nviLength = 0;
- long totalFileSize;
+ int32_t nChrs = 0;
+ int32_t version = 0;
+ int64_t nviPosition = 0LL;
+ int64_t nviLength = 0LL;
+ int64_t totalFileSize;

- int nChrs;
  vector<std::string> chroms;
  vector<int> chrLen;
  fin.open(fname, fstream::in);
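The switch from `long` to `int64_t` matters for portability: on 64-bit Windows (LLP64) `long` is only 32 bits wide, so file offsets stored as 8-byte fields would not fit. The sketch below shows one way to read such a field without depending on the width of `long`. It assumes a little-endian 8-byte field; the actual header layout is not shown in this hunk, and `readInt64LE` and `parseInt64LE` are hypothetical helpers, not functions from straw.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

static_assert(sizeof(std::int64_t) == 8, "int64_t is exactly 8 bytes everywhere");

// Reads 8 bytes as a little-endian signed 64-bit value.
// Returns false if fewer than 8 bytes are available.
bool readInt64LE(std::istream& in, std::int64_t& out) {
    unsigned char buf[8];
    if (!in.read(reinterpret_cast<char*>(buf), sizeof buf)) return false;
    std::uint64_t u = 0;
    for (int k = 7; k >= 0; --k) u = (u << 8) | buf[k];  // most significant byte last
    out = static_cast<std::int64_t>(u);
    return true;
}

// Convenience wrapper for in-memory data; asserts that 8 bytes were present.
std::int64_t parseInt64LE(const std::string& bytes) {
    std::istringstream in(bytes);
    std::int64_t value = 0;
    const bool ok = readInt64LE(in, value);
    assert(ok && "need at least 8 bytes");
    (void)ok;
    return value;
}
```

A value such as 2^32 survives the round trip here, whereas it would be truncated if it were stored in a 32-bit `long`.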
1 change: 1 addition & 0 deletions C++/Lanczos/s_dthMul.c
@@ -24,6 +24,7 @@ void *Mul(void *threadid) {
  res[i[p]] += x[p]*v[j[p]];
  res[j[p]] += x[p]*v[i[p]];
  }
+ return NULL;
  }

  void utmvMul(unsigned int *i,unsigned int *j,float *x,long m,double *v,unsigned int k,double *res, int nth, double **rs) {
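The two update lines in `Mul` are the core of a symmetric matrix-vector product over triplets: each stored entry contributes to both `res[i]` and `res[j]`. A single-threaded sketch of that idea is given below. It assumes only the upper triangle is stored and skips the mirrored update for diagonal entries so they are not counted twice; whether the real code handles the diagonal elsewhere is not visible in this hunk, and `symTripletMul` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes res = A * v for a symmetric matrix A given as upper-triangle
// triplets (row[p], col[p], val[p]). Single-threaded illustration only.
std::vector<double> symTripletMul(const std::vector<unsigned int>& row,
                                  const std::vector<unsigned int>& col,
                                  const std::vector<float>& val,
                                  const std::vector<double>& v) {
    assert(row.size() == col.size() && row.size() == val.size());
    std::vector<double> res(v.size(), 0.0);
    for (std::size_t p = 0; p < val.size(); ++p) {
        const unsigned int r = row[p];
        const unsigned int c = col[p];
        assert(r < v.size() && c < v.size());
        res[r] += val[p] * v[c];
        if (r != c) res[c] += val[p] * v[r];  // mirrored entry below the diagonal
    }
    return res;
}
```

In a threaded version each thread would need its own result buffer (or some other form of synchronization), because different entries can update the same element of `res`.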