Try DVC as a collaboration workflow in RML #79

@m09

During our recent exploration of ML collaboration tool suites, we came across dvc, a well-established open-source solution developed by, among others, the folks at Open Data Science, a community beloved by Russian-speaking data scientists.

We'd like to give it a try since it fits really well with our values at source{d} and solves the core problems we face when we experiment:

  • it's open source;
  • it relies heavily on git and git-like mechanisms;
  • it doesn't try to solve everything in one huge single-entry-point solution but rather tackles the core problems and leaves us free to handle the rest.

To try dvc, the first step is to use it individually in one or two projects:
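As a rough illustration of what individual use could look like, here is a minimal sketch that only builds the command sequence for tracking one data file with dvc inside an existing git repository. The helper name `individual_workflow` and the path `data/corpus.csv` are hypothetical; the `dvc init` and `dvc add` commands are standard dvc CLI commands, and the sketch assumes `dvc add` writes a `.dvc` pointer file next to the data plus a `.gitignore` in the same directory.

```python
import posixpath
import shlex
from typing import List


def individual_workflow(data_path: str) -> List[List[str]]:
    """Build the commands to start tracking one file with dvc.

    Nothing is executed here; the caller decides whether to run them.
    `data_path` is a hypothetical example path relative to the repo root.
    """
    data_dir = posixpath.dirname(data_path)
    gitignore = posixpath.join(data_dir, ".gitignore")
    return [
        # Initialize dvc in the current git repository
        # (dvc init stages its own config files for commit).
        ["dvc", "init"],
        ["git", "commit", "-m", "Initialize dvc"],
        # Move the file into the dvc cache and create the pointer file.
        ["dvc", "add", data_path],
        # Version the small pointer file with git instead of the data itself.
        ["git", "add", data_path + ".dvc", gitignore],
        ["git", "commit", "-m", "Track " + data_path + " with dvc"],
    ]


if __name__ == "__main__":
    for cmd in individual_workflow("data/corpus.csv"):
        print(shlex.join(cmd))
```

The point of the design is that git only ever sees the small `.dvc` pointer file, so the usual branch and review workflow stays unchanged while the large file lives in the dvc cache.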

The second step is to share large data files and results across the team, which is what makes real collaboration possible. To test this, two things are needed:

  • Set up a dvc remote on our ML Cluster;
  • Use it in a test project to see if it enhances teamwork, probably https://github.com/src-d/tm-experiments since we need to collaborate on it for dev2dev similarity.
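The two items above could be sketched as follows. This is an assumption-laden outline, not a description of our actual cluster setup: the remote name `ml-cluster` and the SSH URL are placeholders, and SSH is only one of the storage types dvc supports. The commands themselves (`dvc remote add -d`, `dvc push`, `dvc pull`) are standard dvc CLI commands; the helpers only build the command lists and do not run anything.

```python
import shlex
from typing import List

# Hypothetical placeholders: the real remote name, host, and path are not decided yet.
REMOTE_NAME = "ml-cluster"
REMOTE_URL = "ssh://ml-cluster.example.com/srv/dvc-storage"


def remote_setup(name: str, url: str) -> List[List[str]]:
    """Commands run once by whoever configures the shared remote."""
    return [
        # -d marks this remote as the default target for push/pull.
        ["dvc", "remote", "add", "-d", name, url],
        # The remote is recorded in .dvc/config, which is shared through git.
        ["git", "add", ".dvc/config"],
        ["git", "commit", "-m", "Add dvc remote " + name],
        # Upload the locally cached data files to the remote.
        ["dvc", "push"],
    ]


def collaborator_sync() -> List[List[str]]:
    """Commands a teammate runs to get the code and the matching data."""
    return [
        ["git", "pull"],  # fetches the .dvc pointer files and the remote config
        ["dvc", "pull"],  # downloads the data those pointer files refer to
    ]


if __name__ == "__main__":
    for cmd in remote_setup(REMOTE_NAME, REMOTE_URL) + collaborator_sync():
        print(shlex.join(cmd))
```

For the test project, the first helper would be run once in the repository, and each collaborator would then only need the two commands from `collaborator_sync` to reproduce the same data state.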
