@tomsmeding
Member

Description

See the Note in accelerate-llvm-ptx/src/Data/Array/Accelerate/LLVM/PTX/Context.hs.

How has this been tested?

This is somewhat tricky to test, as one needs to let GC run after the PTX context has no users any more. This was my test file:

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.Concurrent (threadDelay)
import Control.Monad
import System.IO (hFlush, stdout)
import qualified Data.Array.Accelerate as A
import qualified Data.Array.Accelerate.Debug.Internal as A
import qualified Data.Array.Accelerate.LLVM.PTX as GPU

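main :: IO ()
main = do
  -- Run a small GPU computation so that a PTX context, a module and some arrays get allocated.
  print $ GPU.run $ A.sum (A.generate (A.I1 10000) (\(A.I1 i) -> A.toFloating i :: A.Exp Float))
  -- Idle for five seconds, printing progress, so that GC can run once the context has no users any more.
  forM_ [1..5] $ \_ -> do
    threadDelay 1000000
    putChar '*' >> hFlush stdout
  A.traceM A.verbose "done"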

Furthermore, I added debug prints in the finalizers of arrays and modules; as far as I can tell, these are the only places where a finalizer uses a CUDA context. These manual prints were necessary because simply passing +ACC -ddump-gc made the problem disappear, seemingly because more things were retained somehow.

The program above reliably fails on my machine (CUDA 12) and on Jizo (CUDA 13) before this PR, and reliably succeeds after. My debug prints also indicate that finalization order is indeed nondeterministic between the 1 module, 2 arrays and 1 context allocated in the above program: I've observed every possible order (apart from the relative order of the two arrays, which I didn't bother to distinguish in the output). The STM-based synchronisation introduced in this PR seems to properly ensure that resources are explicitly freed only if the Context isn't already destroyed.
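For readers who want a feel for this kind of guard, here is a minimal sketch of an STM-based "free only if the context is still alive" pattern. It is not the code from this PR (see the Note in Context.hs for that). The names Lifetime, freeIfAlive, destroyContext and demo are hypothetical, and the real CUDA calls are replaced by plain IO actions passed in as arguments.

```haskell
import Control.Concurrent.STM
import Control.Exception (finally)
import Control.Monad (when)
import Data.IORef

-- Hypothetical lifetime tracker; an assumption for illustration,
-- not a type from accelerate-llvm-ptx.
data Lifetime = Lifetime
  { destroyed :: TVar Bool  -- has the context already been destroyed?
  , inFlight  :: TVar Int   -- number of resource frees currently running
  }

newLifetime :: IO Lifetime
newLifetime = Lifetime <$> newTVarIO False <*> newTVarIO 0

-- Resource finalizer: run the explicit free action only if the context
-- is still alive. Returns True if the free action was run.
freeIfAlive :: Lifetime -> IO () -> IO Bool
freeIfAlive lt free = do
  alive <- atomically $ do
    dead <- readTVar (destroyed lt)
    if dead
      then pure False
      else do
        modifyTVar' (inFlight lt) (+ 1)  -- announce that a free is in progress
        pure True
  if alive
    then do
      free `finally` atomically (modifyTVar' (inFlight lt) (subtract 1))
      pure True
    else pure False

-- Context finalizer: wait until no free is in progress, mark the context
-- as destroyed, and only then run the destroy action (at most once).
destroyContext :: Lifetime -> IO () -> IO ()
destroyContext lt destroy = do
  first <- atomically $ do
    n <- readTVar (inFlight lt)
    check (n == 0)                       -- retries until in-flight frees finish
    dead <- readTVar (destroyed lt)
    writeTVar (destroyed lt) True
    pure (not dead)
  when first destroy

-- Simulate one possible finalization order and record which actions ran.
demo :: IO [String]
demo = do
  lt  <- newLifetime
  ref <- newIORef []
  let say s = modifyIORef ref (s :)
  _ <- freeIfAlive lt (say "free array 1")   -- context alive: runs
  destroyContext lt (say "destroy context")
  _ <- freeIfAlive lt (say "free array 2")   -- context gone: skipped
  reverse <$> readIORef ref
```

In this sketch, demo returns ["free array 1", "destroy context"]: the second array is not freed explicitly because the context was destroyed first. Whatever order the finalizers run in, a free action never touches a context that has already been destroyed.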

No automated test was added because this is tricky to do in an automated setting where the context is retained over invocations; creating a new context for the test would be possible. Do we want that?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@ivogabe ivogabe merged commit 3025e30 into master Nov 25, 2025
4 of 54 checks passed
@tomsmeding tomsmeding deleted the ptx-fix-finalizers branch November 25, 2025 21:43