Bug Report
Description
dvc status fails to recognize .dvc files when .gitignore uses ** globbing patterns with negation rules (e.g., data/raw/** followed by !data/raw/**/*.dvc). DVC reports "There are no data or pipelines tracked in this project yet" even though:
.dvc files are correctly committed to Git
dvc add succeeds without errors
The cache is properly populated
Individual file queries work (dvc status file.dvc)
This causes dvc push to silently skip pushing data to remotes, as DVC believes no files are tracked.
Reproduce
Initialize a Git repository and DVC:
git init
dvc init
Create a .gitignore with the following patterns:
# Ignore all data files
data/raw/**
data/interim/**
data/processed/**
# But keep DVC metafiles
!data/raw/**/*.dvc
!data/interim/**/*.dvc
!data/processed/**/*.dvc
.dvc/cache/
Add a data file:
mkdir -p data/raw
echo "test data" > data/raw/example.nc
dvc add data/raw/example.nc
git add data/raw/example.nc.dvc .gitignore
git commit -m "Add data file"
Check DVC status:
dvc status
dvc status --cloud
Verify the issue:
# This shows the file is NOT in DVC's index
dvc status
# Output: "There are no data or pipelines tracked in this project yet."
# But individual queries work
dvc status data/raw/example.nc.dvc
# Output: Shows file status correctly
# Git correctly tracks the file
git ls-files "*.dvc"
# Output: data/raw/example.nc.dvc
# Push appears to succeed but does nothing
dvc push
# Output: "Everything is up to date." (even if remote is empty)
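To separate Git's view of the ignore rules from DVC's, one option is to ask Git which .gitignore rule it applies to the metafile. The sketch below wraps git check-ignore -v --no-index (--no-index is needed because the file is already tracked) and parses its "source:line:pattern<TAB>path" output. The helper names parse_check_ignore and explain_ignore are hypothetical, and this has not been run against the repository in this report.

```python
import re
import subprocess

# Left-hand side of a `git check-ignore -v` line: <source>:<linenum>:<pattern>
_RULE = re.compile(r"^(?P<source>.*?):(?P<lineno>\d+):(?P<pattern>.*)$")


def parse_check_ignore(line):
    """Parse one verbose check-ignore line into a dict, or None if it does not match."""
    rule, _, path = line.rstrip("\r\n").partition("\t")
    match = _RULE.match(rule)
    if match is None:
        return None
    pattern = match.group("pattern")
    return {
        "source": match.group("source"),
        "lineno": int(match.group("lineno")),
        "pattern": pattern,
        "path": path,
        # A leading "!" marks a negation rule, i.e. the path is re-included.
        "negated": pattern.startswith("!"),
    }


def explain_ignore(path, repo_dir="."):
    """Return the rule Git matches for `path`, or None if no rule matches."""
    result = subprocess.run(
        ["git", "check-ignore", "-v", "--no-index", "--", path],
        cwd=repo_dir, capture_output=True, text=True, check=False,
    )
    first_line = result.stdout.splitlines()[0] if result.stdout.strip() else ""
    return parse_check_ignore(first_line)
```

Calling explain_ignore("data/raw/example.nc.dvc") in the reproduction repository should report the rule Git matched. If "negated" is true, Git itself treats the metafile as not ignored, which would point at DVC's stage collection rather than at the .gitignore file.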
Expected
dvc status should recognize all .dvc files tracked in Git and report their status correctly, regardless of the globbing patterns used in .gitignore. The command should output the status of tracked data files, and dvc push should push data to the remote when needed.
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 3.66.1 (conda)
---------------------------
Platform: Python 3.12.12 on Windows-11-10.0.26200-SP0
Subprojects:
dvc_data = 3.18.2
dvc_objects = 5.2.0
dvc_render = 1.0.2
dvc_task = 0.40.2
scmrepo = 3.6.1
Supports:
http (aiohttp = 3.13.3, aiohttp-retry = 2.9.1),
https (aiohttp = 3.13.3, aiohttp-retry = 2.9.1)
Config:
Global: C:\Users\username\AppData\Local\iterative\dvc
System: C:\ProgramData\iterative\dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: local
Workspace directory: NTFS on C:\
Repo: dvc, git
Additional Information (if any):
Package manager: Pixi 0.63.2
Git version: 2.45.2.windows.1
Cache type configured: copy
Verbose output showing DVC doesn't find stages:
$ dvc status --cloud -vv
2026-02-05 17:22:39,507 TRACE: Namespace(...)
2026-02-05 17:22:40,518 TRACE: 1.61 ms in collecting stages from [project directories]
There are no data or pipelines tracked in this project yet.
Python verification that the index is empty:
>>> import dvc.repo
>>> repo = dvc.repo.Repo('.')
>>> repo.index.stages
[] # Empty despite .dvc files existing in Git
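The empty index can be cross-checked against Git. The sketch below lists the .dvc files Git tracks and reports those absent from a given collection of indexed paths. The function names are hypothetical, and the usage note assumes that each stage in repo.index.stages exposes a relpath attribute pointing at its metafile.

```python
import subprocess
from pathlib import PurePosixPath


def git_tracked_dvc_files(repo_dir="."):
    """Return the .dvc files Git tracks, as repo-relative paths."""
    out = subprocess.run(
        ["git", "ls-files", "-z", "--", "*.dvc"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    # -z separates entries with NUL bytes, which is safe for unusual file names.
    return sorted(p for p in out.split("\0") if p)


def _normalize(path):
    """Convert Windows separators and drop "./" so paths compare equal."""
    return PurePosixPath(str(path).replace("\\", "/")).as_posix()


def missing_from_index(git_paths, indexed_paths):
    """Return paths that Git tracks but that are not among the indexed paths."""
    indexed = {_normalize(p) for p in indexed_paths}
    return sorted(p for p in map(_normalize, git_paths) if p not in indexed)
```

With DVC installed, passing git_tracked_dvc_files() and [stage.relpath for stage in dvc.repo.Repo(".").index.stages] to missing_from_index would be expected to return data/raw/example.nc.dvc in the failing setup described here, and an empty list once the workaround is applied.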
Workaround that fixes the issue:
Replace the ** globbing patterns with specific file patterns:
# Instead of:
# data/raw/**
# !data/raw/**/*.dvc
# Use:
data/raw/*.nc
data/interim/*.parquet
data/processed/*.csv
After changing .gitignore:
dvc add data/raw/*.nc --force # Re-cache files
dvc status # Now correctly shows tracked files
dvc push # Successfully pushes to remote
Impact: This is a critical silent failure that affects users following common patterns for ignoring data directories while keeping DVC metadata. The issue took 4 hours to debug because there are no warnings or errors indicating that .dvc files exist but aren't being tracked by DVC's index.