Add breakpoint support to DAP #1003
lionel- wants to merge 42 commits into feature/execute-location from feature/breakpoints
This avoids unnecessary re-verification when users toggle breakpoints on/off in unchanged documents. It will also allow restoring breakpoints on session change in multi-session workflows.
In a script, set a breakpoint in a braced block and evaluate it:
{
1 # BP
}
Step through it. Then run another block that doesn't have a breakpoint:
{
2
}
Without the workaround, R will enter the debugger at the first expression of the
block.
- More consistent verification inside braces
- More consistent implementation in `annotate_input()` and `annotate_source()`
Fixes `View()`
Branched from #981
Addresses posit-dev/positron#1766
Frontend-side PR to be opened soon (feature/breakpoints)
Requires https://github.com/lionel-/biome/pull/1
Requires r-lib/pkgload#323 (run `pak::pak("r-lib/pkgload#323")`)
This PR adds breakpoint support to the DAP server by injecting `browser()` calls at parse time rather than modifying live objects. Key benefits over RStudio's approach:
- Breakpoints work in `.onLoad()` hooks and top-level code
- No `debugSource()` needed, just Cmd+Enter from the editor
This is the culmination of a series of preparatory PRs that implement a new approach for integrating with R that allows injection of breakpoints in a much more general and robust way:
Annotate execution request code with source references
Annotate execute request source code with code location #981
Allows code evaluated from a script to be aware of location in frontend
Positron side of source ref annotation for execution requests
Include code locations in execute requests positron#10815
Integrate Rowan syntax trees in Ark
Create Rowan parse tree, store document contents as text, and use Biome position encoding converters #974
Used to inject breakpoints in source code rather than dynamic objects
Prepare Air for Ark dependency on Rowan
Export LSP utils crate for Ark air#452
Add post-order visit hook
https://github.com/lionel-/biome/pull/1
Work around limitation of Biome's Rowan API for tree manipulation
The first series of PRs makes it possible for code evaluated from an editor to carry source references. It also allows this PR to build on this and inject breakpoints before evaluation. With these, users can step through scripts using familiar gestures like Cmd+Enter.
For instance, if a breakpoint is set within an `lapply()` call:
Screen.Recording.2026-01-13.at.15.48.40.mov
Being able to invoke the debugger with Cmd+Enter without having to source a whole file as in RStudio is much more practical:
To achieve this, Ark is now integrated with the R REPL in a novel way that allows Ark to be in charge of parsing, so it can control source references. It's also by necessity in charge of evaluation, which will be useful later on to provide `recover()` functionality in Positron (evaluation will happen in the selected call frame).
The second series of PRs sets the stage for a novel approach for injecting breakpoints implemented in this PR. Traditionally, breakpoint injection has followed base R's `setBreakpoint()` approach which uses `trace()` to inject `browser()` calls in runtime functions. While this approach works well, it has inherent limitations and difficulties:
- We're dynamically modifying functions after they've been copied in various places (e.g. imported by packages, exported to the search path, inserted in the S3 method table, or completely ad hoc things like manually copied in a private environment from an onload hook). Modifying one copy is not sufficient for consistent behaviour of the debugger, finding all copies is not possible in principle, and finding the most important ones is not trivial.
- After injection, the existing srcref objects need to be adjusted to the modified AST, which is tricky.
- Sometimes the breakpoint might be set in a place that is hard to reach, like R6 methods. RStudio still doesn't support them https://r6.r-lib.org/articles/Debugging.html
The new approach implemented here is to inject `browser()` calls for breakpoints at parse time. By injecting breakpoints before R ever sees the code, we sidestep all of these limitations and difficulties. R only evaluates code that already contains breakpoint calls, so we never miss copies and source references are correct from the start.
With this approach, we support breakpoints anywhere in a package, including R6 methods, `.onLoad()` hooks, or even at top-level, without needing ongoing maintenance work, e.g. to add support to new object systems like S7.
Screen.Recording.2026-01-13.at.15.52.33.mov
Breakpoint injection
An injected breakpoint looks like this:
- `.ark_auto_step()` is an identity function that serves as a sentinel when R steps through code. If the user steps on an injected breakpoint, we detect the auto-step call in the `debug at` message emitted by R and automatically step over it (i.e. call `n`).
- `.ark_breakpoint()` takes a `browser()` call promised in the current environment, a URL, and the breakpoint's unique ID. It only forces the browser argument if the breakpoint is active. Since the argument is promised in the call-site environment, this causes R to mark that environment as being debugged with the `RDEBUG()` flag.
It does not stop quite at the right place though: R stops inside the `.ark_breakpoint()` wrapper, with `.ark_auto_step()` on the stack as well. To solve this, there is a second condition triggering auto-stepping in `ReadConsole`: if the function of the top stack frame is `.ark_breakpoint()` (which we detect through a class assigned to the function), then we auto-step. This causes R to resume evaluation and, since the call-site environment is being debugged (`RDEBUG()` is set on the environment), it stops at the next expression automatically, in this case `expression_to_break_on`.
The `#line` directive right above `expression_to_break_on` maps the source references to the original location in the source document. When R stops on the expression, it emits the original location, allowing the DAP to communicate the appropriate stopping place to the frontend.
Source instrumentation
`base::source()` and `devtools::load_all()` need breakpoint injection as described above, as well as top-level adjustments so it's possible to step through a script or top-level package file.
If the sourced file looks like:
1
2
The instrumented version ends up as:
{
#line 1 "file:///file.R"
1
base::.ark_auto_step(base::.ark_verify_breakpoints_range("file:///test.R", 1L, 2L))
#line 2 "file:///file.R"
2
}
- The whole source is wrapped in `{}` to allow R to step through the code.
- Line directives map each expression to original source.
- An auto-stepped `.ark_verify_breakpoints_range()` call after each expression lets the DAP know that any breakpoints spanned by the last expression are now "verified", i.e. the breakpoints have been injected and the code containing them has been evaluated.
A subtlety: verify calls are not injected after trailing expressions in a `{}` list. The issue is that every injected call (breakpoint or verify) needs a `#line` directive afterwards to restore correct source reference mapping. These directives are attached as leading trivia to the next sibling node. But a trailing expression has no next sibling, so there's nowhere to attach the directive.
Instead, trailing expressions defer their verification to the parent `{}` list. For deeply nested trailing breakpoints, the verification bubbles up to the outermost list. In `source()`'s case, this unconditionally adds a final verify call covering all lines.
To hook into `source()`, we rebind `base::source` to a wrapper that optionally performs breakpoint injection. The hook has several safeguards to fall back to the original `base::source()` implementation:
- `ark.source_hook` is set to `FALSE`.
- `file` is not supplied or can't be converted to a URI.
- Any argument other than `echo` (used by Positron) or `local` (useful user argument) is supplied. This is conservative but avoids edge cases with argument combinations we haven't tested.
When the hook activates, it reads the file, calls into Rust to annotate the source with breakpoints, parses the annotated code (which is now a single `{...}` expression) as R AST, and evaluates it in the appropriate environment.
`pkgload::load_all()` hooks similarly but is simpler, see r-lib/pkgload#323.
Auto-stepping over/outside injected expressions
To prevent injected expressions from interfering with debug-stepping, we automatically step over to the next statement when the debugger stops at an injected expression. Two cases:
This happens in `ReadConsole`.
Limitations
While the general user expectation would be for breakpoints to just work right after activating them in the UI, the reality is that the breakpoint is "Unverified" (inactive) at first. To activate the breakpoint, the user has to send the code with new breakpoints to the R session, either by evaluating or sourcing / calling `load_all()`.
Alternative approaches that don't have these limitations:
Having R manage a list of breakpoints that can be compared against srcrefs of executed code. The interpreter would break when entering an expression that matches a breakpoint. This is the ideal setup.
Try and update live objects when user sets a breakpoint but then we're back to RStudio's limitations regarding finding the live objects and all their copies.
A corollary is that when user has multiple sessions, breakpoint state is tied to session state. Breakpoints that are verified in one session will not be verified in another session, they need to be sourced one way or another in all sessions of interest.
Verbose console output:
Verbose printed functions after injection:
Injection with `trace()` is not as verbose because it changes the function's class and print method.
It's unclear how to do better. Filtering from `WriteConsole` is not appealing because of its streaming nature: it is hard to match with a regex when you don't have the full context, and hard to have a filter that's 100% correct (although not necessarily a goal, this seems like a useful property).
We could potentially wrap functions with injected breakpoints in a constructor at parse-time, but this could cause unexpected behaviour and side effects in edge cases.
Breakpoint invalidation and adjustment
Some breakpoints can't hit precisely on the line requested by the user, but we're able to move them to valid lines:
Breakpoints inside multi-line expressions like
These breakpoints are moved up to the start of the statement.
Breakpoints on whitespace or comments
These breakpoints are moved down to the start of the next statement.
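The two adjustment rules above can be sketched as a small lookup over the statements of one braced block. This is a minimal sketch in Rust (the language Ark is written in), not Ark's actual implementation: `StatementSpan` and `anchor_line()` are hypothetical names, and the sketch assumes a flat, sorted list of non-overlapping statement line ranges, ignoring nested braces.

```rust
/// Line span of one statement inside a braced block (1-based, inclusive).
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StatementSpan {
    pub start_line: u32,
    pub end_line: u32,
}

/// Returns the line a breakpoint requested at `requested` would be anchored
/// to, or `None` when no statement can host it.
///
/// Assumes `statements` is sorted by `start_line` and spans don't overlap.
pub fn anchor_line(statements: &[StatementSpan], requested: u32) -> Option<u32> {
    statements
        .iter()
        // The first statement ending at or after the requested line either
        // contains that line (move up to the statement start) or begins
        // after it (whitespace or comment: move down to the next statement).
        .find(|stmt| requested <= stmt.end_line)
        .map(|stmt| stmt.start_line)
    // `None` means the line is past the last statement, so there is
    // nothing to anchor to.
}
```

With statements spanning lines 2-4 and line 7, a breakpoint on line 3 anchors to line 2, one on the blank line 5 anchors to line 7, and one on line 8 has no anchor.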
Some breakpoints are invalid and can never be hit:
- On a closing `}` line.
- Inside an empty `{ }` (rare edge case)
Permanent DAP session
Previously, the DAP session was only connected when a browser REPL was active. We used to send a `start_debug` notification to request the frontend to connect with a DAP client. This is no longer the case: the DAP is now expected to always be connected so that we can receive notifications about the state of breakpoints in the frontend. Without this, we would not know that we need to inject breakpoints in executed code.
The `start_debug` and `stop_debug` messages sent via the Jupyter comm are still used but have become hints for the frontend to show or hide the debug toolbar, not session lifecycle events.
When R enters the debugger, we send `start_debug` followed by a `Stopped` event. When R exits the debugger, we send `stop_debug` followed by a `Continued` event (not `Terminated`). The DAP connection remains active throughout, allowing the backend to:
DAP event notifications
Breakpoint state changes are communicated to the frontend via DAP `Breakpoint` events sent through a backend channel:
- At parse time: After `annotate_input()` or `annotate_source()`, we call `notify_invalid_breakpoints()` to inform the frontend about any breakpoints marked invalid during tree rewriting. This notification contains a message indicating the reason for invalidity and is shown on hover in the frontend UI.
- At breakpoint hit: When `.ark_breakpoint()` is about to force its `browser()` argument, it calls `ps_verify_breakpoint()` which marks the breakpoint as verified and sends a `BreakpointState` event. This ensures the breakpoint turns red immediately when first hit, even before the expression containing it finishes evaluating.
- After expression evaluation: The `.ark_verify_breakpoints_range()` calls injected after expressions call `ps_verify_breakpoints_range()`, which loops over breakpoints in the line range and sends `BreakpointState` events for any that transition from unverified to verified.
Note that the frontend only reacts to line-adjustments for verified breakpoints, so it would not be helpful to notify these earlier at parse time.
These events allow the frontend to update breakpoint appearance (gray dot to red dot, or show error message on hover) in real-time as code executes.
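The range verification step can be sketched as follows. This is a minimal sketch under stated assumptions, not Ark's code: the `BreakpointState` variants, the `Breakpoint` and `BreakpointEvent` structs, and `verify_range()` are hypothetical names, the line range is assumed to be inclusive, and the `injected` check reflects the flag described under breakpoint state management.

```rust
/// Hypothetical set of states; the variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BreakpointState {
    Unverified,
    Verified,
    Invalid,
    Disabled,
}

/// Hypothetical breakpoint record kept by the backend.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Breakpoint {
    pub id: i64,
    pub line: u32,
    pub state: BreakpointState,
    /// Whether the breakpoint was actually injected during annotation.
    pub injected: bool,
}

/// Hypothetical stand-in for the payload of a DAP `Breakpoint` event.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BreakpointEvent {
    pub id: i64,
    pub verified: bool,
    pub line: u32,
}

/// Marks injected, unverified breakpoints within `start_line..=end_line` as
/// verified and returns one event per breakpoint that changed state.
pub fn verify_range(
    breakpoints: &mut [Breakpoint],
    start_line: u32,
    end_line: u32,
) -> Vec<BreakpointEvent> {
    let mut events = Vec::new();
    for bp in breakpoints.iter_mut() {
        let in_range = (start_line..=end_line).contains(&bp.line);
        // Only breakpoints that made it into the evaluated code may be
        // verified; disabled, invalid, and already verified ones are skipped.
        if in_range && bp.injected && bp.state == BreakpointState::Unverified {
            bp.state = BreakpointState::Verified;
            events.push(BreakpointEvent {
                id: bp.id,
                verified: true,
                line: bp.line,
            });
        }
    }
    events
}
```

Returning the events instead of sending them keeps the sketch free of any channel type; calling the function a second time on the same range yields no events, since nothing transitions anymore.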
Breakpoint state management
Each breakpoint has a state that determines its behavior and appearance:
`SetBreakpoints` request). We preserve it internally as Disabled so that when the user re-enables it, we can restore its Verified state directly without requiring them to source again.
Breakpoints also track:
- `injected`: Whether the breakpoint was actually injected into code during annotation. This is crucial: `verify_breakpoints()` only verifies breakpoints where `injected == true`. This prevents a bug where a breakpoint added after parsing gets incorrectly verified when stopping at another breakpoint in the same function.
- `line` vs `original_line`: `original_line` is what the frontend sent. `line` is the adjusted/anchored line where R will actually stop. We need both because the frontend doesn't know about our line adjustments, so subsequent `SetBreakpoints` requests always use the original line numbers. We match against `original_line` to find existing breakpoints and preserve their state. The frontend displays the adjusted `line` after verification.
When multiple breakpoints anchor to the same expression (e.g., one on a blank line above and one inside a multi-line call), only one `.ark_breakpoint()` call is injected (using the first breakpoint's ID), but all matching breakpoints get their `line` adjusted and `injected` flag set. They all become verified together and display at the same location.
Document hashing and reconnection
Breakpoints are associated with a blake3 hash of the document content. This allows breakpoint state to survive DAP server disconnections, which happen when:
When the server comes back online and receives a `SetBreakpoints` request, it compares the current document content against the stored hash. If the hash matches, existing breakpoint states (verified, adjusted lines, etc.) are preserved. If the document changed, all breakpoints for that URI are reset to unverified.
When a document changes while connected, `did_change_document()` removes all breakpoints for that URI and sends unverified events to the frontend, since the old line numbers may no longer be valid.
Tree rewriting implementation
Depends on https://github.com/lionel-/biome/pull/1
Breakpoint injection uses Biome's `SyntaxRewriter`, a tree visitor with preorder and postorder hooks that allows replacing nodes on the way out.
- Preorder (`visit_node`): Cache line information for braced expressions. We record where each expression starts and its line range. This must be done before any modifications because the postorder hook sees a partially rebuilt tree with shifted token offsets.
- Postorder (`visit_node_post`): Process expression lists bottom-up. For lists inside braces, we inject breakpoint calls, add `#line` directives, and mark remaining breakpoints (e.g. on closing braces) as invalid. The cached line info represents original source positions, which is exactly what we need for anchoring breakpoints to document lines.
We use `SyntaxRewriter` instead of `BatchMutation` because the latter doesn't handle insertions in lists (only replacements), and doesn't handle nested changes in a node that is later replaced. For example, with a breakpoint on both `{` and `1`:
{ # BP 1
  1 # BP 2
}
BP 2 causes changes inside the braces. Then BP 1 causes the whole brace expression to be replaced with a variant that has a `#line` directive. `BatchMutation` can't express both changes because it takes modifications upfront. `SyntaxRewriter` lets us replace nodes bottom-up as we go.
QA Notes
Activating breakpoints
There are multiple code paths that should all behave the same:
- `devtools::load_all()`
Script breakpoints can be activated simply by evaluating from the editor with Cmd/Ctrl+Enter, either with the statement range or a selection. You can even immediately drop into the debugger, e.g. with:
This is a significant improvement over RStudio which requires `debugSource()` instead.
Package breakpoints require the pkgload PR (`pak::pak("r-lib/pkgload#323")`). With it installed, `load_all()` should allow breakpoints to work anywhere in the package: at top-level, in `.onLoad` hooks, in functions, in R6 methods, etc. This is a significant improvement over RStudio which doesn't support R6 methods.
We're hooking into `base::source()` and it should inject and verify breakpoints but only if `file` is supplied as sole argument. If any other argument is supplied, we use the regular base `source()` function. The exceptions are `echo` (which Positron may pass in) and `local` (useful user argument). If the global option `ark.source_hook` is set to `FALSE`, we always use regular `source()`.
Verification behavior
Normally breakpoints are verified after an evaluation has finished. But if the evaluation causes R to stop on a breakpoint, it should become verified right away. For instance with:
The breakpoint on `x` should become a full red dot on the first stop there.
Inner breakpoints should be verified after stepping over them, both at top-level and inside brace lists.
local({
  fn <- function() { # BP that causes to break in `local({ ... })`
    1 # Internal BP
  }
  fn # fn's breakpoints should be verified when stepping here
})
If a breakpoint is added after a function was parsed/evaluated, stopping at another breakpoint in that function should not incorrectly verify the new breakpoint. For example:
Parse and evaluate:
BP1 is injected and verified.
Add an unverified breakpoint on line 2 (BP2). It's not injected yet since we haven't reparsed.
Call `fn()` and stop at BP1. BP2 should remain Unverified since it was never actually injected.
When you source a file that causes an error mid-script, the breakpoints preceding the error site should be verified. The breakpoints following the error site remain unverified.
Adjustment and invalidation
After verification, all BPs inside multiline expressions should be mapped to the start line of the expression.
local({
  list( # Start of expression in { list where BPs get mapped to
    1, # BP inside multiline expression
    # BP on empty line
    2 # BP inside multiline expression
  )
})
Breakpoints are invalid and should remain Unverified at all times when set:
- In an empty block: `{ # BP }`
- On a closing brace: `}`
You can inspect the reason by hovering over the breakpoint.
Enable/disable
Breakpoints can be disabled then reenabled. If they were already verified prior to disabling, they should be restored as verified when reenabled.
When a breakpoint was verified then disabled by the user, stopping at the breakpoint with the debugger (via `debug()` or via another earlier breakpoint) should not bring the disabled breakpoint back online. Although we hold onto it internally in case it's re-enabled by the user, it should remain inert until they do so.
Debug session behavior
The debug toolbar should appear when stopped at a breakpoint (or more generally in the R debugger) and disappear when continuing/exiting the debugger.
The `Q` command should cleanly exit the debugger and hide the toolbar. Same for `n` and `f` if they exit a debugger scope.
When you've stopped at a breakpoint, stepping onto a line where there is another breakpoint should just stop there normally (no double-stop on the same line because we automatically step over injected breakpoints).
If you set a breakpoint at top-level in a `{}` block, and step through it, evaluating another `{}` block without breakpoints shouldn't drop you in the debugger (I had to work around base R marking the global env as being debugged). This is covered by an Ark-side test.
Session and connection management
There should only be a single DAP session at any time. If there are multiple, you'll see the running sessions in the call stack viewpane of the frontend. When there is only a single session running, the session is normally (there are edge cases not discussed here) not shown at all in that viewpane.
Stopping, starting, and restarting sessions should all work as before.
Switch console sessions while a debugging session is ongoing. The debugging session of the now background console should have quit with `Q`.
After switching sessions back and forth, breakpoint state should be restored.
Disconnecting a debug session from the frontend via the debug toolbar or Shift-F5 will not only exit the debugger on the R side but also disconnect from the DAP server. The frontend starts up a new session automatically, allowing the backend to receive breakpoint updates. In other words, a hard disconnect shouldn't interfere with breakpoint functionality.
Editing a file while stopped at a breakpoint should instantly invalidate breakpoints for that file. All injected breakpoints are conditional, so invalidated breakpoints never cause R to break in the debugger. The goal is to prevent stepping through an editor using a stale source view. After invalidation, the user must re-source the breakpoints.
LSP session switching code is implicated. Should still work as before.
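The edit-invalidation item in the list above relies on injected breakpoints being conditional. A minimal sketch of that idea, assuming a hypothetical `BreakpointRegistry` keyed by document URI (the type and its method names are illustrative, not Ark's API): editing a document drops its breakpoints, and the condition checked by an injected breakpoint then reports that it should not stop.

```rust
use std::collections::HashMap;

/// Hypothetical registry of breakpoints that are allowed to stop R,
/// keyed by document URI.
#[derive(Debug, Default)]
pub struct BreakpointRegistry {
    active: HashMap<String, Vec<i64>>,
}

impl BreakpointRegistry {
    /// Records that breakpoint `id` in document `uri` may stop R.
    pub fn activate(&mut self, uri: &str, id: i64) {
        let ids = self.active.entry(uri.to_string()).or_default();
        if !ids.contains(&id) {
            ids.push(id);
        }
    }

    /// Called when the document at `uri` is edited. Returns the IDs that
    /// were dropped so the caller can send unverified events for them.
    pub fn invalidate_document(&mut self, uri: &str) -> Vec<i64> {
        self.active.remove(uri).unwrap_or_default()
    }

    /// The condition an injected breakpoint would check before forcing
    /// its `browser()` argument.
    pub fn should_break(&self, uri: &str, id: i64) -> bool {
        self.active.get(uri).map_or(false, |ids| ids.contains(&id))
    }
}
```

Because the check happens when the breakpoint is reached rather than at injection time, code that was already evaluated with injected calls stays inert after an edit until it is sourced again.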
Testing
Backend tests (Ark)
The backend has extensive unit tests for code annotation in `console_annotate.rs` (43 tests):
- `#line` directive injection, line/character offsets, handling of existing whitespace and comments.
- `{}`, verification call injection, top-level breakpoints.
Integration tests in `kernel-debugger.rs` cover browser REPL behavior (entering/exiting debugger, nested debugging, stepping, etc.) but don't yet cover breakpoint-specific scenarios.
These tests currently do not cover the `Breakpoint` events emitted by the backend (verification status, adjusted lines, reason for invalidity, etc.).
Planned: DAP protocol tests
We plan to add protocol-level tests that communicate with the DAP server directly to verify breakpoint events:
- `SetBreakpoints` response contains correct initial state (unverified, with correct IDs)
- `Breakpoint` events are sent when breakpoints become verified
- `Breakpoint` events are sent when breakpoints become invalid (with correct reason message)
Suggested: Frontend UI tests
Important UI scenarios to cover with automated tests:
See also all the tricky usage patterns in QA notes above.