Docker - No such container #14

@alonsnir

I'm trying to get TAO Toolkit up and running using this tutorial, and I'm having issues with Docker.

!tao model yolo_v4 dataset_convert -d $SPECS_DIR/yolo_v4_tfrecords_kitti_train.txt \
                             -o $DATA_DOWNLOAD_DIR/yolo_v4/tfrecords/train \
                             -r $USER_EXPERIMENT_DIR/
                             #--gpus 1 --debug/

2025-01-28 22:58:19,933 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2025-01-28 22:58:20,025 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2025-01-28 22:58:20,064 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True

What's next:
    Try Docker Debug for seamless, persistent debugging tools in any container or image → docker debug f228ac51bcb901c9206c4772e25830c541d1bd3329c1f33c18dc8c0e13acbb4d
    Learn more at https://docs.docker.com/go/debug-cli/
Error response from daemon: No such container: f228ac51bcb901c9206c4772e25830c541d1bd3329c1f33c18dc8c0e13acbb4d
2025-01-28 22:58:22,421 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.

tao info
Configuration of the TAO Toolkit Instance
task_group: ['model', 'dataset', 'deploy']
format_version: 3.0
toolkit_version: 5.5.0
published_date: 08/26/2024


docker login nvcr.io
.......
Login Succeeded

Maybe the following gives a hint as to the cause?

docker run --rm --gpus all ubuntu nvidia-smi
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Now with sudo:

sudo docker run --rm --gpus all ubuntu nvidia-smi      
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4060 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   47C    P4              10W /  55W |      8MiB /  8188MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

Some system info:

$ cat /etc/docker/daemon.json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "/usr/bin/nvidia-container-runtime"
        }
    }
}

$ cat ~/.docker/config.json 
{
	"auths": {
		"nvcr.io": {
			"auth": "************************************************************8"
		}
	},
	"credStore": "desktop",
	"currentContext": "desktop-linux",
	"plugins": {
		"debug": {
			"hooks": "exec"
		},
		"scout": {
			"hooks": "pull,buildx build"
		}
	},
	"features": {
		"hooks": "true"
	}
}

Also, after every restart, credStore somehow becomes credsStore again, which prevents docker login to nvcr.io from working until I change it back.
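
In case it helps anyone reproduce this, here is a minimal sketch for checking which context and credential-store keys a Docker CLI config contains. The function name `summarize_docker_config` is made up for illustration. It only reads the JSON and does not talk to the Docker daemon. As far as I know, `credsStore` is the key Docker documents, and `credStore` is the spelling I end up with after editing. One possible explanation for the sudo difference, not confirmed here, is that the non-sudo CLI uses the `desktop-linux` context while sudo uses the default engine that /etc/docker/daemon.json configures.

```python
import json


def summarize_docker_config(text):
    """Return the context and credential-store keys found in a Docker CLI config.

    `text` is the raw contents of a config.json file. The DOCKER_CONTEXT
    environment variable can override the stored context; that is not
    handled in this sketch.
    """
    cfg = json.loads(text)
    return {
        # The CLI falls back to the "default" context when none is recorded.
        "context": cfg.get("currentContext", "default"),
        # Key documented by Docker for the credential helper.
        "credsStore": cfg.get("credsStore"),
        # Spelling shown in the config pasted above.
        "credStore": cfg.get("credStore"),
    }
```

Passing the contents of ~/.docker/config.json to this function before and after a restart should show which of the two keys is present and whether `context` is still `desktop-linux`.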
