Failed internal tests #3

@elvircrn

On this version of the code: #2

and (at least) on the following matrices from https://sparse.tamu.edu/Williams:

  • cant/cant.mtx
  • pdb1HYS/pdb1HYS.mtx

with the flag -D CHECK_RESULT=1, the code produced the following output, which indicates that the tests failed:

Input:

./test -d 0 -aat 0 cant/cant.mtx

Output:

--------------------------------!!!!!!!!------------------------------------
device_id = 0
---------------------------------------------------------------
Device [ 0 ] GeForce GTX 1650 Ti @ 1485.00 MHz
MAT: -------------- cant/cant.mtx --------------
input matrix A: ( 62451, 62451 ) nnz = 4007383
 loadfile time    = 0.67493 sec
the tilesize = 16
SpGEMM nnzCub = 269486473
CSR to Tile conversion uses 28.78 ms
tile space overhead = 37.74 MB
step1 ----Calculate the number and tile-column index of tiles of matrixC---
step1 ---------------------- Runtime is  0.37 ms-------------------------

step2 --------Calculate the number of nonzeros of each tile of matrixC-----
step2 ---------------------- Runtime is  4.06 ms-------------------------

step3 ---------Calculate the val&col of nonzeros of matrixC-------------
step3 ---------------------- Runtime is  48.40 ms------------------------

-----------------------Malloc uses 0.71 ms-------------------------------
Non-empty tiles of C = 194910
nnzC = 17440029
CUDA  TileSpGEMM runtime is 53.63 ms, gflops = 10.05
-------------------------------check----------------------------------------
tile to CSR conversion complete!

--------------- SpGEMM (using cuSPARSE) ---------------
 - cuda SpGEMM start! Benchmark runs 1 times.
 - cuda SpGEMM completed!

nnzC = 0, nnzCub = 269486473, Compression rate =  inf
CUDA  cuSPARSE SpGEMM runtime is 1.3550 ms, GFlops = 397.7660
cuSPARSE failed!
---------------------------------------------------------------
---------------------------------------------------------------

Input:

./test -d 0 -aat 0 pdb1HYS/pdb1HYS.mtx

Output:

--------------------------------!!!!!!!!------------------------------------
device_id = 0
---------------------------------------------------------------
Device [ 0 ] GeForce GTX 1650 Ti @ 1485.00 MHz
MAT: -------------- pdb1HYS/pdb1HYS.mtx --------------
input matrix A: ( 36417, 36417 ) nnz = 4344765
 loadfile time    = 0.69516 sec
the tilesize = 16
SpGEMM nnzCub = 555322659
CSR to Tile conversion uses 33.98 ms
tile space overhead = 40.01 MB
step1 ----Calculate the number and tile-column index of tiles of matrixC---
step1 ---------------------- Runtime is  0.34 ms-------------------------

step2 --------Calculate the number of nonzeros of each tile of matrixC-----
step2 ---------------------- Runtime is  6.93 ms-------------------------

step3 ---------Calculate the val&col of nonzeros of matrixC-------------
step3 ---------------------- Runtime is  93.50 ms------------------------

-----------------------Malloc uses 0.95 ms-------------------------------
Non-empty tiles of C = 221571
nnzC = 19594581
CUDA  TileSpGEMM runtime is 101.79 ms, gflops = 10.91
-------------------------------check----------------------------------------
tile to CSR conversion complete!

--------------- SpGEMM (using cuSPARSE) ---------------
 - cuda SpGEMM start! Benchmark runs 1 times.
 - cuda SpGEMM completed!

nnzC = 0, nnzCub = 555322659, Compression rate =  inf
CUDA  cuSPARSE SpGEMM runtime is 1.3250 ms, GFlops = 838.2229
cuSPARSE failed!
---------------------------------------------------------------
---------------------------------------------------------------

However, when run against https://sparse.tamu.edu/SNAP/CollegeMsg,

Input:

./test -d 0 -aat 0 CollegeMsg/CollegeMsg.mtx

Output:

--------------------------------!!!!!!!!------------------------------------
device_id = 0
---------------------------------------------------------------
Device [ 0 ] GeForce GTX 1650 Ti @ 1485.00 MHz
MAT: -------------- /home/elvircrn/tug/thesis/repo/matrices/CollegeMsg/CollegeMsg.mtx --------------
input matrix A: ( 1899, 1899 ) nnz = 20296
 loadfile time    = 0.00273 sec
the tilesize = 16
SpGEMM nnzCub = 744395
CSR to Tile conversion uses 1.14 ms
tile space overhead = 0.61 MB
step1 ----Calculate the number and tile-column index of tiles of matrixC---
step1 ---------------------- Runtime is  0.20 ms-------------------------

step2 --------Calculate the number of nonzeros of each tile of matrixC-----
step2 ---------------------- Runtime is  0.90 ms-------------------------

step3 ---------Calculate the val&col of nonzeros of matrixC-------------
step3 ---------------------- Runtime is  3.51 ms------------------------

-----------------------Malloc uses 0.46 ms-------------------------------
Non-empty tiles of C = 14154
nnzC = 407071
CUDA  TileSpGEMM runtime is 5.17 ms, gflops = 0.29
-------------------------------check----------------------------------------
tile to CSR conversion complete!

--------------- SpGEMM (using cuSPARSE) ---------------
 - cuda SpGEMM start! Benchmark runs 1 times.
 - cuda SpGEMM completed!

nnzC = 407071, nnzCub = 744395, Compression rate = 1.83
CUDA  cuSPARSE SpGEMM runtime is 1.7550 ms, GFlops = 0.8483

Validating results...
[PASSED] nnzC = 407071
[PASSED] row_pointer
[PASSED] column_index & value
---------------------------------------------------------------
---------------------------------------------------------------

the code passes its own tests.

Given this setup, I was unable to reproduce the results from the paper. Please let me know if I have made an error at some point, or if more information is needed.
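As a sanity check on the logs above, here is a small sketch of how the printed "Compression rate" and cuSPARSE "GFlops" values seem to be derived. The formulas (nnzCub / nnzC and 2 * nnzCub / runtime) are inferred from the printed numbers and are an assumption, not taken from the source code; the function names are hypothetical. If the assumption holds, the "inf" rate in the failing runs follows directly from the cuSPARSE reference reporting nnzC = 0.

```python
import math


def compression_rate(nnz_cub: int, nnz_c: int) -> float:
    """Assumed formula: intermediate products per nonzero of C."""
    if nnz_c == 0:
        # A zero nnzC from the reference run gives an infinite rate,
        # which would match the "inf" printed in the failing logs.
        return math.inf
    return nnz_cub / nnz_c


def gflops(nnz_cub: int, runtime_ms: float) -> float:
    """Assumed formula: 2 flops per intermediate product, per second, in 1e9."""
    return 2.0 * nnz_cub / (runtime_ms * 1e-3) / 1e9


# (nnzCub, cuSPARSE nnzC, cuSPARSE runtime in ms), copied from the logs above.
runs = {
    "cant": (269486473, 0, 1.3550),
    "pdb1HYS": (555322659, 0, 1.3250),
    "CollegeMsg": (744395, 407071, 1.7550),
}

for name, (nnz_cub, nnz_c, ms) in runs.items():
    rate = compression_rate(nnz_cub, nnz_c)
    print(f"{name}: compression rate = {rate:.2f}, GFlops = {gflops(nnz_cub, ms):.4f}")
```

Under these assumptions, the very high cuSPARSE GFlops in the two failing runs would only reflect a short runtime with an empty result, not a real measurement.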

Thanks,
Elvir
