@lionel- lionel- commented Jan 13, 2026

Branched from #981

Addresses posit-dev/positron#1766

Frontend-side PR to be opened soon (feature/breakpoints)
Requires https://github.com/lionel-/biome/pull/1
Requires r-lib/pkgload#323 (run pak::pak("r-lib/pkgload#323"))

This PR implements breakpoint support in the DAP server by injecting browser() calls at parse time rather than modifying live objects. Key benefits over RStudio's approach:

  • In packages, breakpoints work everywhere: functions, R6 methods, .onLoad() hooks, top-level code
  • In scripts, no need for debugSource() - just Cmd+Enter from the editor

This is the culmination of a series of preparatory PRs that implement a new approach for integrating with R, one that allows injection of breakpoints in a much more general and robust way:

  1. Annotate execution request code with source references
    Annotate execute request source code with code location #981
    Allows code evaluated from a script to be aware of location in frontend

    Positron side of source ref annotation for execution requests
    Include code locations in execute requests positron#10815

  2. Integrate Rowan syntax trees in Ark
    Create Rowan parse tree, store document contents as text, and use Biome position encoding converters #974
    Used to inject breakpoints in source code rather than dynamic objects

    Prepare Air for Ark dependency on Rowan
    Export LSP utils crate for Ark air#452

    Add post-order visit hook
    https://github.com/lionel-/biome/pull/1
    Work around limitation of Biome's Rowan API for tree manipulation

The first series of PRs makes it possible for code evaluated from an editor to carry source references. This PR builds on that to inject breakpoints before evaluation. With these, users can step through scripts using familiar gestures like Cmd+Enter.

For instance, if a breakpoint is set within an lapply() call:

lapply(1:3, function(x) {
  1  # BP
  x
})
[Screen recording: Screen.Recording.2026-01-13.at.15.48.40.mov]

Being able to invoke the debugger with Cmd+Enter without having to source a whole file as in RStudio is much more practical:

  • No need to evaluate potentially long-running expressions prior to the breakpoint.
  • No need to run expressions at the start of the script that reset state that's potentially important for the debugging session.

To achieve this, Ark is now integrated with the R REPL in a novel way that allows Ark to be in charge of parsing, so it can control source references. It's also by necessity in charge of evaluation, which will be useful later on to provide recover() functionality in Positron (evaluation will happen in the selected call frame).

The second series of PRs sets the stage for a novel approach for injecting breakpoints implemented in this PR. Traditionally, breakpoint injection has followed base R's setBreakpoint() approach which uses trace() to inject browser() calls in runtime functions. While this approach works well, it has inherent limitations and difficulties:

  • We're dynamically modifying functions after they've been copied in various places (e.g. imported by packages, exported to the search path, inserted in the S3 method table, or completely ad hoc things like manually copied in a private environment from an onload hook). Modifying one copy is not sufficient for consistent behaviour of the debugger, finding all copies is not possible in principle, and finding the most important ones is not trivial.

  • After injection, the existing srcref objects need to be adjusted to the modified AST, which is tricky.

  • Sometimes the breakpoint might be set in a place that is hard to reach, like R6 methods. RStudio still doesn't support them (see https://r6.r-lib.org/articles/Debugging.html).

The new approach implemented here is to inject browser() calls for breakpoints at parse time. By injecting breakpoints before R ever sees the code, we sidestep all of these limitations and difficulties. R only evaluates code that already contains breakpoint calls, so we never miss copies and source references are correct from the start.

With this approach, we support breakpoints anywhere in a package, including R6 methods, .onLoad() hooks, or even at top-level, without needing ongoing maintenance work, e.g. to add support to new object systems like S7.

[Screen recording: Screen.Recording.2026-01-13.at.15.52.33.mov]

Breakpoint injection

An injected breakpoint looks like this:

.ark_auto_step(.ark_breakpoint(browser(), "*url*", "*id*"))
#line *line* "*url*"
expression_to_break_on
  • .ark_auto_step() is an identity function that serves as a sentinel when R steps through code. If the user steps onto an injected breakpoint, we detect the auto-step call in the "debug at" message emitted by R and automatically step over it (i.e. call n).

  • .ark_breakpoint() takes a browser() call promised in the current environment, a URL, and the breakpoint's unique ID. It only forces the browser argument if the breakpoint is active. Since the argument is promised in the call-site environment, this causes R to mark that environment as being debugged with the RDEBUG() flag.

    R does not stop at quite the right place, though: it stops inside the .ark_breakpoint() wrapper, with .ark_auto_step() on the stack as well. To solve this, there is a second condition that triggers auto-stepping in ReadConsole: if the function of the top stack frame is .ark_breakpoint() (which we detect through a class assigned to the function), then we auto-step. This causes R to resume evaluation and, since the call-site environment is being debugged (RDEBUG() is set on the environment), it stops at the next expression automatically, in this case expression_to_break_on.

  • The #line directive right above expression_to_break_on maps the source references to the original location in the source document. When R stops on the expression, it emits the original location, allowing the DAP to communicate the appropriate stopping place to the frontend.
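As a rough illustration of the shape of an injected breakpoint, here is a minimal text-based sketch. It is not how the PR does it (the real injection rewrites a Rowan syntax tree), and `render_breakpoint` and `inject_breakpoint` are hypothetical names used only for this example.

```rust
/// Hypothetical helper: renders the text placed right before the expression
/// a breakpoint is anchored to. The `#line` directive maps the following
/// expression back to its original location.
fn render_breakpoint(uri: &str, id: u32, line: u32) -> String {
    format!(
        ".ark_auto_step(.ark_breakpoint(browser(), \"{0}\", \"{1}\"))\n#line {2} \"{0}\"\n",
        uri, id, line
    )
}

/// Hypothetical helper: inserts the breakpoint before the 1-based `line` of
/// `source`. Works on raw lines for simplicity, which assumes `line` is
/// already the start of a statement inside a `{}` list.
fn inject_breakpoint(source: &str, uri: &str, id: u32, line: u32) -> String {
    let mut out = String::new();
    for (i, text) in source.lines().enumerate() {
        if i as u32 + 1 == line {
            out.push_str(&render_breakpoint(uri, id, line));
        }
        out.push_str(text);
        out.push('\n');
    }
    out
}
```

Calling `inject_breakpoint(src, "file:///a.R", 1, 2)` on a three-line function definition places the wrapper call and the directive right before the second line and leaves the other lines untouched.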

Source instrumentation

base::source() and devtools::load_all() need breakpoint injection as described above, as well as top-level adjustments so it's possible to step through a script or top-level package file.

If the sourced file looks like:

1
2

The instrumented version ends up as:

{
#line 1 "file:///file.R"
1
base::.ark_auto_step(base::.ark_verify_breakpoints_range("file:///file.R", 1L, 2L))
#line 2 "file:///file.R"
2
}
  • The whole source is wrapped in {} to allow R to step through the code.

  • Line directives map each expression to original source.

  • An auto-stepped .ark_verify_breakpoints_range() call after each expression lets the DAP know that any breakpoints spanned by the last expression are now "verified", i.e. the breakpoints have been injected and the code containing them has been evaluated.

    A subtlety: verify calls are not injected after trailing expressions in a {} list. The issue is that every injected call (breakpoint or verify) needs a #line directive afterwards to restore correct source reference mapping. These directives are attached as leading trivia to the next sibling node. But a trailing expression has no next sibling, so there's nowhere to attach the directive.

    Instead, trailing expressions defer their verification to the parent {} list. For deeply nested trailing breakpoints, the verification bubbles up to the outermost list. In source()'s case, this unconditionally adds a final verify call covering all lines.
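The instrumentation of a sourced file can be sketched as follows. This is a simplified model under assumptions: `Expr` and `instrument_source` are hypothetical names, the verified range is assumed to be half-open (start line, then the line after the expression's end), which is only my reading of the examples in this description, and the final verify call that covers all lines is not modelled.

```rust
/// A top-level expression with its 1-based line span in the original file.
/// Hypothetical type, for illustration only.
struct Expr<'a> {
    text: &'a str,
    start_line: u32,
    end_line: u32, // inclusive
}

/// Wraps the expressions in `{}`, maps each one to its original line with a
/// `#line` directive, and adds a verify call after every expression except
/// the trailing one (which has no next sibling to carry a directive).
fn instrument_source(uri: &str, exprs: &[Expr<'_>]) -> String {
    let mut out = String::from("{\n");
    for (i, expr) in exprs.iter().enumerate() {
        out.push_str(&format!("#line {} \"{}\"\n", expr.start_line, uri));
        out.push_str(expr.text);
        out.push('\n');
        if i + 1 < exprs.len() {
            // Assumed range convention: [start_line, end_line + 1)
            out.push_str(&format!(
                "base::.ark_auto_step(base::.ark_verify_breakpoints_range(\"{}\", {}L, {}L))\n",
                uri,
                expr.start_line,
                expr.end_line + 1
            ));
        }
    }
    out.push_str("}\n");
    out
}
```

With the two-line file from the example above (`1` on line 1 and `2` on line 2), this produces the same braces, directives, and single verify call as the instrumented version shown.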

To hook into source(), we rebind base::source to a wrapper that optionally performs breakpoint injection. The hook has several safeguards to fall back to the original base::source() implementation:

  • If the global option ark.source_hook is set to FALSE.
  • If there are no breakpoints for the file URI.
  • If file is not supplied or can't be converted to a URI.
  • If any argument other than echo (used by Positron) or local (useful user argument) is supplied. This is conservative but avoids edge cases with argument combinations we haven't tested.
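The safeguards above amount to a small decision function. The sketch below is only a model of that logic (the actual hook is an R wrapper around base::source), and `SourceCall` and `should_instrument` are hypothetical names.

```rust
/// Hypothetical summary of a `source()` call as seen by the hook.
struct SourceCall<'a> {
    /// Value of the global option `ark.source_hook`.
    hook_enabled: bool,
    /// `None` if `file` was not supplied or can't be converted to a URI.
    file_uri: Option<&'a str>,
    /// Names of the supplied arguments other than `file`.
    extra_args: &'a [&'a str],
    /// Number of breakpoints currently known for the file URI.
    breakpoint_count: usize,
}

/// Returns `true` if the hook should inject breakpoints, `false` if it
/// should fall back to the original `base::source()`.
fn should_instrument(call: &SourceCall) -> bool {
    if !call.hook_enabled || call.file_uri.is_none() || call.breakpoint_count == 0 {
        return false;
    }
    // Only `echo` and `local` are tolerated next to `file`.
    call.extra_args
        .iter()
        .all(|arg| *arg == "echo" || *arg == "local")
}
```

Being conservative here is cheap: any fallback simply means the file is sourced without breakpoints, exactly as before this PR.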

When the hook activates, it reads the file, calls into Rust to annotate the source with breakpoints, parses the annotated code (which is now a single {...} expression) as R AST, and evaluates it in the appropriate environment.

pkgload::load_all() hooks similarly but is simpler, see r-lib/pkgload#323.

Auto-stepping over/outside injected expressions

To prevent injected expressions from interfering with debug-stepping, we automatically step over to the next statement when the debugger stops at an injected expression. Two cases:

  • When not in debug mode, R stops only at injected breakpoint calls. We auto-step to the next expression (the user expression we should actually stop at).
  • In debug mode, R also stops at injected calls (breakpoint or verification calls) while stepping. In this case too, we auto-step to the next expression.

This happens in ReadConsole.
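A minimal sketch of the two auto-step triggers, assuming the "debug at" message has the shape shown in the console output in this description. `stopped_expression` and `should_auto_step` are hypothetical names, and the top-frame check (done through a class on the function in the PR) is reduced to a boolean here.

```rust
/// Extracts the expression text from a message of the form
/// `debug at <uri>#<line>: <expression>`. Assumes the location itself
/// contains no ": " sequence.
fn stopped_expression(debug_message: &str) -> Option<&str> {
    let rest = debug_message.strip_prefix("debug at ")?;
    let (_location, expr) = rest.split_once(": ")?;
    Some(expr.trim_start())
}

/// Decides whether to send `n` on the user's behalf.
fn should_auto_step(debug_message: &str, top_frame_is_ark_breakpoint: bool) -> bool {
    // Second trigger: R stopped inside the `.ark_breakpoint()` wrapper.
    if top_frame_is_ark_breakpoint {
        return true;
    }
    // First trigger: R is about to evaluate an injected sentinel call.
    match stopped_expression(debug_message) {
        Some(expr) => {
            expr.starts_with(".ark_auto_step(") || expr.starts_with("base::.ark_auto_step(")
        }
        None => false,
    }
}
```

User expressions never match either condition, so stepping through ordinary code is unaffected.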

Limitations

  • While the general user expectation would be for breakpoints to just work right after activating them in the UI, the reality is that the breakpoint is "Unverified" (inactive) at first. To activate the breakpoint, the user has to send the code with new breakpoints to the R session, either by evaluating or sourcing / calling load_all().

    Alternative approaches that don't have these limitations:

    • Having R manage a list of breakpoints that can be compared against srcrefs of executed code. The interpreter would break when entering an expression that matches a breakpoint. This is the ideal setup.

    • Try to update live objects when the user sets a breakpoint. But then we're back to RStudio's limitations regarding finding the live objects and all their copies.

    A corollary is that when the user has multiple sessions, breakpoint state is tied to session state. Breakpoints that are verified in one session will not be verified in another: they need to be sourced one way or another in all sessions of interest.

  • Verbose console output:

    debug at file:///Users/lionel/Desktop/breakpoints.R#13: base::.ark_auto_step(base::.ark_verify_breakpoints_range("file:///Users/lionel/Desktop/breakpoints.R",
      7L, 13L))
    Browse[1]> n
    debug at file:///Users/lionel/Desktop/breakpoints.R#15: [1] 1
    Browse[1]> n
    debug at file:///Users/lionel/Desktop/breakpoints.R#16: [1] 2
    Browse[1]> n
    debug at file:///Users/lionel/Desktop/breakpoints.R#18: base::.ark_auto_step(base::.ark_breakpoint(browser(), "file:///Users/lionel/Desktop/breakpoints.R",  "2"))
    Called from: base::.ark_breakpoint(browser(), "file:///Users/lionel/Desktop/breakpoints.R",  "5")
    debug at file:///Users/lionel/Desktop/breakpoints.R#20: [1]

    Verbose printed functions after injection:

    # Browse[1]> sys.function()
    function() {
      base::.ark_auto_step(base::.ark_breakpoint(browser(), "file:///Users/lionel/Desktop/breakpoints.R", "5"))
      #line 20 "file:///Users/lionel/Desktop/breakpoints.R"
      1 # BP1
      base::.ark_auto_step(base::.ark_verify_breakpoints_range("file:///Users/lionel/Desktop/breakpoints.R", 20L, 21L))
      #line 21 "file:///Users/lionel/Desktop/breakpoints.R"
      2
    }

    Injection with trace() is not as verbose because it changes the function's class and print method.

    It's unclear how to do better. Filtering from WriteConsole is not appealing: the output is streamed, so it's hard to match with a regex without the full context, and hard to write a filter that is 100% correct (not necessarily a goal, but it seems like a useful property).

    We could potentially wrap functions with injected breakpoints in a constructor at parse-time, but this could cause unexpected behaviour and side effects in edge cases.

Breakpoint invalidation and adjustment

Some breakpoints can't hit precisely on the line requested by the user, but we're able to move them to valid lines:

  • Breakpoints inside multi-line expressions like

    list(
      1,  # BP here
      2
    )

    These breakpoints are moved up to the start of the statement.

  • Breakpoints on whitespace or comments

    # BP Here
    1

    These breakpoints are moved down to the start of the next statement.

Some breakpoints are invalid and can never be hit:

  • Breakpoints on a closing } line.
  • Breakpoints inside empty multiline braces { } (rare edge case)
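The adjustment rules can be condensed into one lookup over the statements of a single {} list. The sketch below ignores nesting (the PR processes lists bottom-up in the syntax tree), and `Anchor` and `anchor_breakpoint` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Anchor {
    /// The line R will actually stop at.
    Line(u32),
    /// The breakpoint can never be hit.
    Invalid,
}

/// `statements` holds the (start, end) lines, both inclusive and 1-based,
/// of the statements in one `{}` list, in source order.
fn anchor_breakpoint(requested: u32, statements: &[(u32, u32)]) -> Anchor {
    for &(start, end) in statements {
        if requested <= end {
            // Inside the statement: moved up to its start.
            // Before it (blank line or comment): moved down to its start.
            return Anchor::Line(start);
        }
    }
    // Past the last statement (closing brace) or no statements at all
    // (empty braces).
    Anchor::Invalid
}
```

For a list with statements spanning lines 2 to 5 and line 7, a request on line 3 anchors to line 2, a request on line 6 anchors to line 7, and a request on line 8 (the closing brace) is invalid.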

Permanent DAP session

Previously, the DAP session was only connected when a browser REPL was active. We used to send a start_debug notification to request that the frontend connect with a DAP client. This is no longer the case: the DAP is now expected to always be connected so that we can receive notifications about the state of breakpoints in the frontend. Without this, we would not know that we need to inject breakpoints in executed code.

The start_debug and stop_debug messages sent via the Jupyter comm are still used but have become hints for the frontend to show or hide the debug toolbar, not session lifecycle events.

When R enters the debugger, we send start_debug followed by a Stopped event. When R exits the debugger, we send stop_debug followed by a Continued event (not Terminated). The DAP connection remains active throughout, allowing the backend to:

  • Receive new breakpoints as the user sets them in the UI
  • Send verification events as code is evaluated
  • Preserve breakpoint state across debug sessions

DAP event notifications

Breakpoint state changes are communicated to the frontend via DAP Breakpoint events sent through a backend channel:

  1. At parse time: After annotate_input() or annotate_source(), we call notify_invalid_breakpoints() to inform the frontend about any breakpoints marked invalid during tree rewriting. This notification contains a message indicating the reason for invalidity, which is shown on hover in the frontend UI.

  2. At breakpoint hit: When .ark_breakpoint() is about to force its browser() argument, it calls ps_verify_breakpoint() which marks the breakpoint as verified and sends a BreakpointState event. This ensures the breakpoint turns red immediately when first hit, even before the expression containing it finishes evaluating.

  3. After expression evaluation: The .ark_verify_breakpoints_range() calls injected after expressions call ps_verify_breakpoints_range(), which loops over breakpoints in the line range and sends BreakpointState events for any that transition from unverified to verified.

    Note that the frontend only reacts to line-adjustments for verified breakpoints, so it would not be helpful to notify these earlier at parse time.

These events allow the frontend to update breakpoint appearance (gray dot to red dot, or show error message on hover) in real-time as code executes.

Breakpoint state management

Each breakpoint has a state that determines its behavior and appearance:

  • Unverified: Initial state. The breakpoint hasn't been injected into evaluated code yet. Shown as a gray dot.
  • Verified: The breakpoint has been injected and the containing code has been evaluated. Shown as a red dot.
  • Disabled: A previously verified breakpoint that the user unchecked. When the user unchecks a breakpoint in the frontend UI, it appears as a deletion to the backend (omitted from the SetBreakpoints request). We preserve it internally as Disabled so that when the user re-enables it, we can restore its Verified state directly without requiring them to source again.
  • Invalid: The breakpoint is at an invalid location (closing brace, empty braces). Shown as gray with a hover message explaining why.

Breakpoints also track:

  • injected: Whether the breakpoint was actually injected into code during annotation. This is crucial: verify_breakpoints() only verifies breakpoints where injected == true. This prevents a bug where a breakpoint added after parsing gets incorrectly verified when stopping at another breakpoint in the same function.

  • line vs original_line: original_line is what the frontend sent. line is the adjusted/anchored line where R will actually stop. We need both because the frontend doesn't know about our line adjustments, so subsequent SetBreakpoints requests always use the original line numbers. We match against original_line to find existing breakpoints and preserve their state. The frontend displays the adjusted line after verification.

When multiple breakpoints anchor to the same expression (e.g., one on a blank line above and one inside a multi-line call), only one .ark_breakpoint() call is injected (using the first breakpoint's ID), but all matching breakpoints get their line adjusted and injected flag set. They all become verified together and display at the same location.
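To make the state rules concrete, here is a minimal sketch of how a SetBreakpoints request could be reconciled with existing state. The types and function names are hypothetical and the logic is a simplified reading of the rules above, not the PR's actual code.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Unverified,
    Verified,
    Disabled,
    Invalid,
}

#[derive(Debug, Clone, PartialEq)]
struct Breakpoint {
    id: u32,
    original_line: u32, // what the frontend sent
    line: u32,          // adjusted line where R will stop
    state: State,
    injected: bool,
}

/// Reconciles the lines of a SetBreakpoints request with known breakpoints,
/// matching on `original_line` because the frontend never sees adjustments.
fn set_breakpoints(old: &[Breakpoint], requested: &[u32], next_id: &mut u32) -> Vec<Breakpoint> {
    let mut new = Vec::new();
    for &line in requested {
        match old.iter().find(|bp| bp.original_line == line) {
            Some(existing) => {
                let mut bp = existing.clone();
                if bp.state == State::Disabled {
                    bp.state = State::Verified; // re-enabled by the user
                }
                new.push(bp);
            }
            None => {
                new.push(Breakpoint {
                    id: *next_id,
                    original_line: line,
                    line,
                    state: State::Unverified,
                    injected: false,
                });
                *next_id += 1;
            }
        }
    }
    // Omitted breakpoints look like deletions: keep verified ones as Disabled.
    for bp in old {
        if requested.contains(&bp.original_line) {
            continue;
        }
        if matches!(bp.state, State::Verified | State::Disabled) {
            let mut kept = bp.clone();
            kept.state = State::Disabled;
            new.push(kept);
        }
    }
    new
}

/// Only injected, still-unverified breakpoints may become verified, so
/// disabled or late-added breakpoints stay as they are.
fn verify(bp: &mut Breakpoint) {
    if bp.injected && bp.state == State::Unverified {
        bp.state = State::Verified;
    }
}
```

Matching on `original_line` is what lets a disabled breakpoint come back as Verified, with its adjusted line intact, when the frontend sends it again.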

Document hashing and reconnection

Breakpoints are associated with a blake3 hash of the document content. This allows breakpoint state to survive DAP server disconnections, which happen when:

  • The user uses the disconnect command (the frontend automatically reconnects).
  • The console session goes to the background (the LSP is also disabled in this case, so we don't receive document change notifications).

When the server comes back online and receives a SetBreakpoints request, it compares the current document content against the stored hash. If the hash matches, existing breakpoint states (verified, adjusted lines, etc.) are preserved. If the document changed, all breakpoints for that URI are reset to unverified.

When a document changes while connected, did_change_document() removes all breakpoints for that URI and sends unverified events to the frontend, since the old line numbers may no longer be valid.
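The reconnection check can be sketched as below. To stay dependency-free, the example uses the standard library's `DefaultHasher` as a stand-in for blake3, and `StoredBreakpoints` and `on_set_breakpoints` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for the blake3 content hash used in the PR.
fn content_hash(contents: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    contents.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical per-URI record kept across DAP disconnections.
struct StoredBreakpoints {
    hash: u64,
    verified_lines: Vec<u32>,
}

/// Called when a SetBreakpoints request arrives after a reconnection.
/// Returns `true` if existing breakpoint state was preserved.
fn on_set_breakpoints(stored: &mut StoredBreakpoints, current_contents: &str) -> bool {
    let current = content_hash(current_contents);
    if current == stored.hash {
        return true;
    }
    // The document changed while we were offline: reset to unverified.
    stored.hash = current;
    stored.verified_lines.clear();
    false
}
```

Comparing hashes rather than tracking edits matters here because the LSP may be disabled while the session is in the background, so document change notifications can be missed.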

Tree rewriting implementation

Depends on https://github.com/lionel-/biome/pull/1

Breakpoint injection uses Biome's SyntaxRewriter, a tree visitor with preorder and postorder hooks that allows replacing nodes on the way out.

  • Preorder (visit_node): Cache line information for braced expressions. We record where each expression starts and its line range. This must be done before any modifications because the postorder hook sees a partially rebuilt tree with shifted token offsets.

  • Postorder (visit_node_post): Process expression lists bottom-up. For lists inside braces, we inject breakpoint calls, add #line directives, and mark remaining breakpoints (e.g. on closing braces) as invalid. The cached line info represents original source positions, which is exactly what we need for anchoring breakpoints to document lines.

We use SyntaxRewriter instead of BatchMutation because the latter doesn't handle insertions in lists (only replacements), and doesn't handle nested changes in a node that is later replaced. For example with a breakpoint on both { and 1:

{     # BP 1
   1  # BP 2
}

BP 2 causes changes inside the braces. Then BP 1 causes the whole brace expression to be replaced with a variant that has a #line directive. BatchMutation can't express both changes because it takes modifications upfront. SyntaxRewriter lets us replace nodes bottom-up as we go.

QA Notes

Activating breakpoints

  • There are multiple code paths that should all behave the same:

    • Evaluation from an editor (Cmd/Ctrl+Enter)
    • Sourcing file with and without echo
    • devtools::load_all()
  • Script breakpoints can be activated simply by evaluating from the editor with Cmd/Ctrl+Enter, either with the statement range or a selection. You can even immediately drop into the debugger, e.g. with:

    lapply(1:3, function(x) {
      1 # BP
      2
      3
    })

    This is a significant improvement over RStudio which requires debugSource() instead.

  • Package breakpoints require the pkgload PR (pak::pak("r-lib/pkgload#323")). With it installed, load_all() should allow breakpoints to work anywhere in the package: at top-level, in .onLoad hooks, in functions, in R6 methods, etc. This is a significant improvement over RStudio which doesn't support R6 methods.

  • We're hooking into base::source() and it should inject and verify breakpoints but only if file is supplied as sole argument. If any other argument is supplied, we use the regular base source() function. The exceptions are echo (which Positron may pass in) and local (useful user argument). If the global option ark.source_hook is set to FALSE, we always use regular source().

Verification behavior

  • Normally breakpoints are verified after an evaluation has finished. But if the evaluation causes R to stop on a breakpoint, it should become verified right away. For instance with:

    lapply(1:3, function(x) {
      x  # BP
    })

    The breakpoint on x should become a full red dot on the first stop there.

  • Inner breakpoints should be verified after stepping over them, both at top-level and inside brace lists.

    local({
      fn <- function() { # BP that causes to break in `local({ ... })`
        1 # Internal BP
      }
      fn # fn's breakpoints should be verified when stepping here
    })
  • If a breakpoint is added after a function was parsed/evaluated, stopping at another breakpoint in that function should not incorrectly verify the new breakpoint. For example:

    1. Parse and evaluate:

      fn <- function() {
        1 # BP1
        2
      }

      BP1 is injected and verified.

    2. Add an unverified breakpoint on line 2 (BP2). It's not injected yet since we haven't reparsed.

    3. Call fn() and stop at BP1. BP2 should remain Unverified since it was never actually injected.

  • When you source a file that causes an error mid-script, the breakpoints preceding the error site should be verified. The breakpoints following the error site remain unverified.

Adjustment and invalidation

  • After verification, all BPs inside multiline expressions should be mapped to the start line of the expression.

    local({
      list(  # Start of expression in { list where BPs get mapped to
        1, # BP inside multiline expression
           # BP on empty line
        2  # BP inside multiline expression
      )
    })
  • Breakpoints are invalid and should remain Unverified at all times when set:

    • In empty braces:
      {
        # BP
      }
    • On a closing brace }

    You can inspect the reason by hovering over the breakpoint.

Enable/disable

  • Breakpoints can be disabled then reenabled. If they were already verified prior to disabling, they should be restored as verified when reenabled.

  • When a breakpoint was verified then disabled by the user, stopping at the breakpoint with the debugger (via debug() or via another earlier breakpoint) should not bring back the disabled breakpoint online. Although we hold onto it internally in case it's re-enabled by user, it should remain inert until they do so.

Debug session behavior

  • The debug toolbar should appear when stopped at a breakpoint (or more generally in the R debugger) and disappear when continuing/exiting the debugger.

  • The Q command should cleanly exit the debugger and hide the toolbar. Same for n and f if they exit a debugger scope.

  • When you've stopped at a breakpoint, stepping onto a line where there is another breakpoint should just stop there normally (no double-stop on the same line because we automatically step over injected breakpoints).

  • If you set a breakpoint at top-level in a {} block, and step through it, evaluating another {} block without breakpoints shouldn't drop you in the debugger (I had to work around base R marking the global env as being debugged). This is covered by an Ark-side test.

Session and connection management

  • There should only be a single DAP session at any time. If there are multiple, you'll see the running sessions in the call stack viewpane of the frontend. When there is only a single session running, the session is normally (there are edge cases not discussed here) not shown at all in that viewpane.

  • Stopping, starting, and restarting sessions should all work as before.

  • Switch console sessions while a debugging session is ongoing. The debugging session of the now-background console should have been quit with Q.

  • After switching sessions back and forth, breakpoint state should be restored.

  • Disconnecting a debug session from the frontend via the debug toolbar or Shift-F5 will not only exit the debugger on the R side but also disconnect from the DAP server. The frontend starts up a new session automatically, allowing the backend to receive breakpoint updates. In other words, a hard disconnect shouldn't interfere with breakpoint functionality.

  • Editing a file while stopped at a breakpoint should instantly invalidate breakpoints for that file. All injected breakpoints are conditional, so invalidated breakpoints never cause R to break in the debugger. The goal is to prevent stepping through code with a stale source view in the editor. After invalidation, the user must re-source the code containing the breakpoints.

  • LSP session switching code is implicated. Should still work as before.

Testing

Backend tests (Ark)

The backend has extensive unit tests for code annotation in console_annotate.rs (43 tests):

  • Input annotation: Basic #line directive injection, line/character offsets, handling of existing whitespace and comments.
  • Breakpoint injection: Single and multiple breakpoints, nested brace lists, multiline expressions anchoring to start, blank lines anchoring to next expression, closing brace invalidation, empty braces invalidation.
  • Source annotation: Wrapping in {}, verification call injection, top-level breakpoints.
  • Edge cases: Doubly/triply nested braces, if-else branches, multiple breakpoints collapsing to same line.

Integration tests in kernel-debugger.rs cover browser REPL behavior (entering/exiting debugger, nested debugging, stepping, etc.) but don't yet cover breakpoint-specific scenarios.

These tests currently do not cover the Breakpoint events emitted by the backend (verification status, adjusted lines, reason for invalidity, etc).

Planned: DAP protocol tests

We plan to add protocol-level tests that communicate with the DAP server directly to verify breakpoint events:

  • SetBreakpoints response contains correct initial state (unverified, with correct IDs)
  • Breakpoint events are sent when breakpoints become verified
  • Breakpoint events are sent when breakpoints become invalid (with correct reason message)
  • Line adjustment is reflected in events after verification
  • Document hash changes cause breakpoints to reset to unverified
  • Disabled breakpoints are preserved and restored correctly

Suggested: Frontend UI tests

Important UI scenarios to cover with automated tests:

  • Basic flow: Set breakpoint -> source file -> breakpoint turns red -> hit breakpoint -> toolbar appears -> continue -> toolbar disappears
  • Breakpoint adjustment: Set breakpoint inside multiline expression -> after verification, dot moves to expression start
  • Invalid breakpoints: Set breakpoint on closing brace -> remains gray with hover message
  • Disable/enable: Verify breakpoint -> uncheck -> re-check -> still verified (no re-sourcing needed)
  • Document edit: Stop at breakpoint -> edit file -> breakpoints turn gray -> never activate again
  • Session switching: Breakpoint state is attached to active console session. Editing the file should disable breakpoints of background sessions too.

See also all the tricky usage patterns in QA notes above.

@lionel- lionel- force-pushed the feature/breakpoints branch 3 times, most recently from ec4c3d7 to 26384ed Compare January 15, 2026 11:14
@lionel- lionel- requested a review from DavisVaughan January 15, 2026 11:21
@lionel- lionel- force-pushed the feature/execute-location branch from e87eeef to bb58a49 Compare January 16, 2026 07:16
@lionel- lionel- force-pushed the feature/breakpoints branch from 26384ed to 27d6d53 Compare January 16, 2026 07:16
@lionel- lionel- force-pushed the feature/execute-location branch from bb58a49 to a41fec2 Compare January 16, 2026 07:22
@lionel- lionel- force-pushed the feature/breakpoints branch from 27d6d53 to 26e192f Compare January 16, 2026 07:22