Out of memory on 3060Ti? #12

@mahnoorfirdous

Greetings.

I was trying to run the test on my RTX 3060 Ti. If this GPU is unsupported, I couldn't find that stated anywhere in the repository. I compiled with CUDA 12.2 and gcc-10.

Here's my compilation output:

/usr/bin/cmake -S/home/mahnoor/mlexp/GPUStressTest -B/home/mahnoor/mlexp/GPUStressTest/build --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start /home/mahnoor/mlexp/GPUStressTest/build/CMakeFiles /home/mahnoor/mlexp/GPUStressTest/build//CMakeFiles/progress.marks
make  -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/home/mahnoor/mlexp/GPUStressTest/build'
make  -f CMakeFiles/gst.dir/build.make CMakeFiles/gst.dir/depend
make[2]: Entering directory '/home/mahnoor/mlexp/GPUStressTest/build'
cd /home/mahnoor/mlexp/GPUStressTest/build && /usr/bin/cmake -E cmake_depends "Unix Makefiles" /home/mahnoor/mlexp/GPUStressTest /home/mahnoor/mlexp/GPUStressTest /home/mahnoor/mlexp/GPUStressTest/build /home/mahnoor/mlexp/GPUStressTest/build /home/mahnoor/mlexp/GPUStressTest/build/CMakeFiles/gst.dir/DependInfo.cmake "--color="
make[2]: Leaving directory '/home/mahnoor/mlexp/GPUStressTest/build'
make  -f CMakeFiles/gst.dir/build.make CMakeFiles/gst.dir/build
make[2]: Entering directory '/home/mahnoor/mlexp/GPUStressTest/build'
[ 16%] Building CUDA object CMakeFiles/gst.dir/main.cu.o
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler  --options-file CMakeFiles/gst.dir/includes_CUDA.rsp -ccbin /usr/bin/g++-10 -m64 -Xcompiler "-Wall -Wextra -fno-strict-aliasing -Wno-unused-parameter -I/usr/local/cuda/include -I/usr/local/cuda/cublasLt -L/usr/local/cuda/lib64" -Xfatbin -compress-all -Xcudafe --display_error_number -DDEBUG_MATRIX_SIZES -O3 -DNDEBUG -std=c++14 "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" -MD -MT CMakeFiles/gst.dir/main.cu.o -MF CMakeFiles/gst.dir/main.cu.o.d -x cu -c /home/mahnoor/mlexp/GPUStressTest/main.cu -o CMakeFiles/gst.dir/main.cu.o
[ 33%] Building CUDA object CMakeFiles/gst.dir/util/test_args.cu.o
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler  --options-file CMakeFiles/gst.dir/includes_CUDA.rsp -ccbin /usr/bin/g++-10 -m64 -Xcompiler "-Wall -Wextra -fno-strict-aliasing -Wno-unused-parameter -I/usr/local/cuda/include -I/usr/local/cuda/cublasLt -L/usr/local/cuda/lib64" -Xfatbin -compress-all -Xcudafe --display_error_number -DDEBUG_MATRIX_SIZES -O3 -DNDEBUG -std=c++14 "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" -MD -MT CMakeFiles/gst.dir/util/test_args.cu.o -MF CMakeFiles/gst.dir/util/test_args.cu.o.d -x cu -c /home/mahnoor/mlexp/GPUStressTest/util/test_args.cu -o CMakeFiles/gst.dir/util/test_args.cu.o
[ 50%] Building CXX object CMakeFiles/gst.dir/util/measure.cc.o
/usr/bin/g++-10  -I/home/mahnoor/mlexp/GPUStressTest/util -I/usr/local/cuda/bin/../include -I/usr/local/cuda/cublasLt -Wall -Wextra -fno-strict-aliasing -Wno-unused-parameter -I/usr/local/cuda/include -I/usr/local/cuda/cublasLt -L/usr/local/cuda/lib64 -O3 -DNDEBUG -std=c++14 -MD -MT CMakeFiles/gst.dir/util/measure.cc.o -MF CMakeFiles/gst.dir/util/measure.cc.o.d -o CMakeFiles/gst.dir/util/measure.cc.o -c /home/mahnoor/mlexp/GPUStressTest/util/measure.cc
[ 66%] Building CUDA object CMakeFiles/gst.dir/util/device_info.cu.o
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler  --options-file CMakeFiles/gst.dir/includes_CUDA.rsp -ccbin /usr/bin/g++-10 -m64 -Xcompiler "-Wall -Wextra -fno-strict-aliasing -Wno-unused-parameter -I/usr/local/cuda/include -I/usr/local/cuda/cublasLt -L/usr/local/cuda/lib64" -Xfatbin -compress-all -Xcudafe --display_error_number -DDEBUG_MATRIX_SIZES -O3 -DNDEBUG -std=c++14 "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" -MD -MT CMakeFiles/gst.dir/util/device_info.cu.o -MF CMakeFiles/gst.dir/util/device_info.cu.o.d -x cu -c /home/mahnoor/mlexp/GPUStressTest/util/device_info.cu -o CMakeFiles/gst.dir/util/device_info.cu.o
[ 83%] Building CUDA object CMakeFiles/gst.dir/util/test_util.cu.o
/usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler  --options-file CMakeFiles/gst.dir/includes_CUDA.rsp -ccbin /usr/bin/g++-10 -m64 -Xcompiler "-Wall -Wextra -fno-strict-aliasing -Wno-unused-parameter -I/usr/local/cuda/include -I/usr/local/cuda/cublasLt -L/usr/local/cuda/lib64" -Xfatbin -compress-all -Xcudafe --display_error_number -DDEBUG_MATRIX_SIZES -O3 -DNDEBUG -std=c++14 "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_75,code=[compute_75,sm_75]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" -MD -MT CMakeFiles/gst.dir/util/test_util.cu.o -MF CMakeFiles/gst.dir/util/test_util.cu.o.d -x cu -c /home/mahnoor/mlexp/GPUStressTest/util/test_util.cu -o CMakeFiles/gst.dir/util/test_util.cu.o
/home/mahnoor/mlexp/GPUStressTest/util/test_util.h:95:20: warning: ‘cudaError_t gpuAllocPinnedAndMap(size_t, void**, void**)’ defined but not used [-Wunused-function]
   95 | static cudaError_t gpuAllocPinnedAndMap(size_t sizeInbytes, void** HostMemPtr,
      |                    ^~~~~~~~~~~~~~~~~~~~
[100%] Linking CXX executable gst
/usr/bin/cmake -E cmake_link_script CMakeFiles/gst.dir/link.txt --verbose=1
/usr/bin/g++-10  -Wall -Wextra -fno-strict-aliasing -Wno-unused-parameter -I/usr/local/cuda/include -I/usr/local/cuda/cublasLt -L/usr/local/cuda/lib64 -O3 -DNDEBUG CMakeFiles/gst.dir/main.cu.o CMakeFiles/gst.dir/util/test_args.cu.o CMakeFiles/gst.dir/util/measure.cc.o CMakeFiles/gst.dir/util/device_info.cu.o CMakeFiles/gst.dir/util/test_util.cu.o -o gst   -L/usr/local/cuda/targets/x86_64-linux/lib/stubs  -L/usr/local/cuda/targets/x86_64-linux/lib  -lcublas -lcublasLt -lcudadevrt -lcudart_static -lrt -lpthread -ldl 
make[2]: Leaving directory '/home/mahnoor/mlexp/GPUStressTest/build'
[100%] Built target gst
make[1]: Leaving directory '/home/mahnoor/mlexp/GPUStressTest/build'
/usr/bin/cmake -E cmake_progress_start /home/mahnoor/mlexp/GPUStressTest/build/CMakeFiles 0

Here is the output of ./gst:

./gst capturing GPU information...
WATCHDOG starting, TIMEOUT: 600 seconds
Detected 1 CUDA Capable device(s)
./gst Done.
Device 0: "NVIDIA GeForce RTX 3060 Ti"
./gst done capturing GPU information.
DEBUG_MATRIX_SIZES: Checking matrix size only (no CUDA execution) for: T4
Initilizing T4 based test suite
GPU Memory: 7, memgb: 16


Device 0: "NVIDIA GeForce RTX 3060 Ti", PCIe: 9
stress_tests[0].test_name FP16
P hsh
m 31864
n 38648
k 88304
ta 0
tb 1
B 0

***** STARTING TEST 0: FP16 On Device 0 NVIDIA GeForce RTX 3060 Ti
testing cublasLt
Allocate matrixSize Total Bytes A + B + C:  14915943040 
std::exception: out of memory
testing cublasLt fail
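
As a sanity check on the log above, the reported allocation can be reproduced from the printed dimensions. The sketch below assumes that A is m x k, B is k x n, C is m x n, and that each FP16 element takes 2 bytes; the function name is illustrative and is not taken from the tool's source.

```python
# Dimensions copied from the log above (m, n, k); FP16 elements are 2 bytes each.
M, N, K = 31864, 38648, 88304
BYTES_PER_FP16 = 2


def gemm_footprint_bytes(m, n, k, elem_bytes):
    """Bytes needed for A (m x k), B (k x n) and C (m x n) of one GEMM."""
    return (m * k + k * n + m * n) * elem_bytes


total = gemm_footprint_bytes(M, N, K, BYTES_PER_FP16)
print(total)                    # 14915943040, the same value as the log line
print(round(total / 2**30, 2))  # the same size expressed in GiB
```

The result matches the "Total Bytes A + B + C" line exactly and comes to roughly 13.9 GiB, well beyond the 8 GB on an RTX 3060 Ti. That is consistent with the log reporting a T4 based test suite with memgb: 16.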

I had expected the tool to adjust the matrix sizes for GPUs with less memory. If this is simply how the tool works, that is fine, but I would like to rule out that I am doing something wrong.
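
For illustration only, this is roughly what such an adjustment could look like. It is a hypothetical sketch, not how the tool behaves: the 6 GiB budget, the rounding to multiples of 8, and all function names are assumptions made for this example. Because the footprint grows with the square of a uniform scale factor, the factor is the square root of budget over footprint.

```python
import math


def gemm_footprint_bytes(m, n, k, elem_bytes):
    """Bytes needed for A (m x k), B (k x n) and C (m x n) of one GEMM."""
    return (m * k + k * n + m * n) * elem_bytes


def scale_dims_to_budget(m, n, k, elem_bytes, budget_bytes, multiple=8):
    """Shrink m, n, k by one common factor so the footprint fits the budget.

    Hypothetical helper: dimensions are rounded down to a multiple of
    `multiple`, so the scaled footprint never exceeds the budget.
    """
    total = gemm_footprint_bytes(m, n, k, elem_bytes)
    if total <= budget_bytes:
        return m, n, k
    # Footprint is quadratic in the scale factor, hence the square root.
    s = math.sqrt(budget_bytes / total)
    return tuple(max(multiple, int(d * s) // multiple * multiple) for d in (m, n, k))


budget = 6 * 2**30  # assumed budget: 6 GiB, leaving headroom on an 8 GB card
m, n, k = scale_dims_to_budget(31864, 38648, 88304, 2, budget)
print(m, n, k, gemm_footprint_bytes(m, n, k, 2))
```

Scaling all three dimensions by the same factor keeps the shape of the problem similar to the original test; whether that preserves the intended stress characteristics is a separate question for the maintainers.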

I am using Ubuntu Budgie 24.04 on an AMD Ryzen 5600.
