From e7ba59b27bf559d93732acf3d24de34eed2303ec Mon Sep 17 00:00:00 2001
From: Philip Durbin
Date: Sat, 22 Nov 2025 10:29:43 -0500
Subject: [PATCH 1/3] move the list of features into the guides #11998

---
 doc/sphinx-guides/source/admin/features.md | 229 +++++++++++++++++++++
 doc/sphinx-guides/source/admin/index.rst   |   1 +
 scripts/issues/11998/tsv2md.py             |  55 +++++
 3 files changed, 285 insertions(+)
 create mode 100644 doc/sphinx-guides/source/admin/features.md
 create mode 100755 scripts/issues/11998/tsv2md.py

diff --git a/doc/sphinx-guides/source/admin/features.md b/doc/sphinx-guides/source/admin/features.md
new file mode 100644
index 00000000000..5a80771058b
--- /dev/null
+++ b/doc/sphinx-guides/source/admin/features.md
@@ -0,0 +1,229 @@
+# Features
+
+An overview of Dataverse features can be found at <https://dataverse.org/software-features>. This is a more comprehensive list.
+
+```{contents} Contents:
+:local:
+:depth: 3
+```
+
+
+## Support for FAIR Data Principles
+
+Findable, Accessible, Interoperable, Reusable.
+[More information.](https://scholar.harvard.edu/mercecrosas/presentations/fair-guiding-principles-implementation-dataverse)
+## Data citation for datasets and files
+
+EndNote XML, RIS, or BibTeX format at the dataset or file level.
+{doc}`More information.`
+
+## OAI-PMH (Harvesting)
+
+Gather and expose metadata from and to other systems using standardized metadata formats: Dublin Core, Data Documentation Initiative (DDI), OpenAIRE, etc.
+{doc}`More information.`
+
+## APIs for interoperability and custom integrations
+
+Search API, Data Deposit (SWORD) API, Data Access API, Metrics API, Migration API, etc.
+{doc}`More information.`
+
+## API client libraries
+
+Interact with Dataverse APIs from Python, R, JavaScript, Java, and Ruby.
+{doc}`More information.`
+
+## DataCite integration
+
+DOIs are reserved, and when datasets are published, their metadata is published to DataCite.
+{doc}`More information.`
+
+## Login via Shibboleth
+
+Single Sign On (SSO) using your institution's credentials.
+{doc}`More information.`
+
+## Login via ORCID, Google, GitHub, or Microsoft
+
+Log in using popular OAuth2 providers.
+{doc}`More information.`
+
+## Login via OpenID Connect (OIDC)
+
+Log in using your institution's identity provider or a third party.
+{doc}`More information.`
+
+## Internationalization
+
+The Dataverse software has been translated into multiple languages.
+{ref}`More information.`
+
+## Versioning
+
+History of changes to datasets and files is preserved.
+{doc}`More information.`
+
+## Restricted files
+
+Control who can download files and choose whether or not to enable a "Request Access" button.
+{ref}`More information.`
+
+## Embargo
+
+Make content inaccessible until an embargo end date.
+{ref}`More information.`
+
+## Custom licenses
+
+CC0 by default, but add as many standard licenses as you like or create your own.
+{ref}`More information.`
+
+## Custom terms of use
+
+Custom terms of use can be used in place of a license or disabled by an administrator.
+{ref}`More information.`
+
+## Publishing workflow support
+
+Datasets start as drafts and can be submitted for review before publication.
+{ref}`More information.`
+
+## File hierarchy
+
+Users are able to control dataset file hierarchy and directory structure.
+{doc}`More information.`
+
+## File previews
+
+A preview is available for text, tabular, image, audio, video, and geospatial files.
+{ref}`More information.`
+
+## Preview and analysis of tabular files
+
+Data Explorer allows for searching, charting, and cross-tabulation analysis.
+{ref}`More information.`
+
+## Usage statistics and metrics
+
+Download counters, support for Make Data Count.
+{doc}`More information.`
+
+## Guestbook
+
+Optionally collect data about who is downloading the files from your datasets.
+{ref}`More information.`
+
+## Fixity checks for files
+
+MD5, SHA-1, SHA-256, SHA-512, UNF.
+{ref}`More information.<:FileFixityChecksumAlgorithm>`
+
+## File download in R and TSV format
+
+Proprietary tabular formats are converted into RData and TSV.
+{doc}`More information.`
+
+## Faceted search
+
+Facets are data-driven and customizable per collection.
+{doc}`More information.`
+
+## Customization of collections
+
+Each personal or organizational collection can be customized and branded.
+{ref}`More information.`
+
+## Private URL
+
+Create a URL for reviewers to view an unpublished (and optionally anonymized) dataset.
+{ref}`More information.`
+
+## Widgets
+
+Embed listings of data in external websites.
+{ref}`More information.`
+
+## Notifications
+
+In-app and email notifications for access requests, requests for review, etc.
+{ref}`More information.`
+
+## Schema.org JSON-LD
+
+Used by Google Dataset Search and other services for discoverability.
+{ref}`More information.`
+
+## External tools
+
+Enable additional features not built into the Dataverse software.
+{doc}`More information.`
+
+## External vocabulary
+
+Let users pick from external vocabularies (provided via API/SKOSMOS) when filling in metadata.
+{ref}`More information.`
+
+## Dropbox integration
+
+Upload files stored on Dropbox.
+{doc}`More information.`
+
+## GitHub integration
+
+A GitHub Action is available to upload files from GitHub to a dataset.
+{doc}`More information.`
+
+## Integration with Jupyter notebooks
+
+Datasets can be opened in Binder to run code in Jupyter notebooks, RStudio, and other computational environments.
+{ref}`More information.`
+
+## User management
+
+Dashboard for common user-related tasks.
+{doc}`More information.`
+
+## Curation status labels
+
+Let curators mark datasets with a status label customized to your needs.
+{ref}`More information.<:AllowedCurationLabels>`
+
+## Branding
+
+Your installation can be branded with a custom homepage, header, footer, CSS, etc.
+{ref}`More information.`
+
+## Backend storage on S3 or Swift
+
+Choose between filesystem or object storage, configurable per collection and per dataset.
+{doc}`More information.`
+
+## Direct upload and download for S3
+
+After a permission check, files can pass freely and directly between a client computer and S3.
+{doc}`More information.`
+
+## Export data in BagIt format
+
+For preservation, bags can be sent to the local filesystem, DuraCloud, and Google Cloud.
+{ref}`More information.`
+
+## Post-publication automation (workflows)
+
+Allow publication of a dataset to kick off external processes and integrations.
+{doc}`More information.`
+
+## Pull header metadata from Astronomy (FITS) files
+
+Dataset metadata prepopulated from FITS file metadata.
+{ref}`More information.`
+
+## Provenance
+
+Upload standard W3C provenance files or enter free text instead.
+{ref}`More information.`
+
+## Auxiliary files for data files
+
+Each data file can have any number of auxiliary files for documentation or other purposes (experimental).
+{doc}`More information.`
+
diff --git a/doc/sphinx-guides/source/admin/index.rst b/doc/sphinx-guides/source/admin/index.rst
index a8a543571a7..c6522475088 100755
--- a/doc/sphinx-guides/source/admin/index.rst
+++ b/doc/sphinx-guides/source/admin/index.rst
@@ -13,6 +13,7 @@ This guide documents the functionality only available to superusers (such as "da
 .. toctree::
    :maxdepth: 2
 
+   features
    dashboard
    external-tools
    discoverability
diff --git a/scripts/issues/11998/tsv2md.py b/scripts/issues/11998/tsv2md.py
new file mode 100755
index 00000000000..888cb9b1595
--- /dev/null
+++ b/scripts/issues/11998/tsv2md.py
@@ -0,0 +1,55 @@
+#!/usr/bin/env python
+#
+# Download features.tsv like this:
+# curl -L "https://docs.google.com/spreadsheets/d/1EIFGAfDfZAboFa3_ShRfgoT6xSDpKohDH2_iCyO5MtA/export?gid=729532473&format=tsv" > features.tsv
+#
+# The gid above is a specific tab in this spreadsheet:
+# https://docs.google.com/spreadsheets/d/1EIFGAfDfZAboFa3_ShRfgoT6xSDpKohDH2_iCyO5MtA/edit?usp=sharing
+#
+# Here's the README for the spreadsheet:
+# https://docs.google.com/document/d/1wqLVoEpnD93Y_wQtA2cQEkAuC0QstC6XVs9XlA7yvbM/edit?usp=sharing
+import sys
+from optparse import OptionParser
+import csv
+
+parser = OptionParser()
+options, args = parser.parse_args()
+
+if args:
+    tsv_file = open(args[0])
+else:
+    tsv_file = sys.stdin
+
+print("""# Features
+
+An overview of Dataverse features can be found at <https://dataverse.org/software-features>. This is a more comprehensive list.
+ +```{contents} Contents: +:local: +:depth: 3 +``` + +""") + +reader = csv.DictReader(tsv_file, delimiter="\t") +rows = [row for row in reader] +missing = [] +for row in rows: + title = row["Title"] + description = row["Description"] + url = row["URL"] + dtype = row["DocLinkType"] + target = row["DocLinkTarget"] + print("## %s" % title) + print() + print("%s" % description) + if target == 'url': + print("[More information.](%s)" % (url)) + elif target != '': + print("{%s}`More information.<%s>`" % (dtype, target)) + print() + else: + missing.append(url) +tsv_file.close() +for item in missing: + print(item) From 0fb5356a1ab26f280b62a76c2937284b4d6cf04c Mon Sep 17 00:00:00 2001 From: Philip Durbin Date: Mon, 1 Dec 2025 10:46:31 -0500 Subject: [PATCH 2/3] group by category #11998 --- doc/sphinx-guides/source/admin/features.md | 264 +++++++++++---------- scripts/issues/11998/tsv2md.py | 37 +-- 2 files changed, 163 insertions(+), 138 deletions(-) diff --git a/doc/sphinx-guides/source/admin/features.md b/doc/sphinx-guides/source/admin/features.md index 5a80771058b..9aa6cafe6c2 100644 --- a/doc/sphinx-guides/source/admin/features.md +++ b/doc/sphinx-guides/source/admin/features.md @@ -8,222 +8,238 @@ An overview of Dataverse features can be found at ` +Single Sign On (SSO) using your institution's credentials. +{doc}`More information.` -## OAI-PMH (Harvesting) +### Login via ORCID, Google, GitHub, or Microsoft -Gather and expose metadata from and to other systems using standardized metadata formats: Dublin Core, Data Document Initiative (DDI), OpenAIRE, etc. -{doc}`More information.` +Log in using popular OAuth2 providers. +{doc}`More information.` -## APIs for interoperability and custom integrations +### Login via OpenID Connect (OIDC) -Search API, Data Deposit (SWORD) API, Data Access API, Metrics API, Migration API, etc. -{doc}`More information.` +Log in using your institution's identity provider or a third party. 
+{doc}`More information.` -## API client libraries +### Versioning -Interact with Dataverse APIs from Python, R, Javascript, Java, and Ruby -{doc}`More information.` +History of changes to datasets and files are preserved. +{doc}`More information.` -## DataCite integration +### File previews -DOIs are reserved, and when datasets are published, their metadata is published to DataCite. -{doc}`More information.` +A preview is available for text, tabular, image, audio, video, and geospatial files. +{ref}`More information.` -## Login via Shibboleth +### Preview and analysis of tabular files -Single Sign On (SSO) using your institution's credentials. -{doc}`More information.` +Data Explorer allows for searching, charting and cross tabulation analysis +{ref}`More information.` -## Login via ORCID, Google, GitHub, or Microsoft +### Guestbook -Log in using popular OAuth2 providers. -{doc}`More information.` +Optionally collect data about who is downloading the files from your datasets. +{ref}`More information.` -## Login via OpenID Connect (OIDC) +### File download in R and TSV format -Log in using your institution's identity provider or a third party. -{doc}`More information.` +Proprietary tabular formats are converted into RData and TSV. +{doc}`More information.` -## Internationalization +### Faceted search -The Dataverse software has been translated into multiple languages. -{ref}`More information.` +Facets are data driven and customizable per collection. +{doc}`More information.` -## Versioning +## Administration -History of changes to datasets and files are preserved. -{doc}`More information.` +### Usage statistics and metrics -## Restricted files +Download counters, support for Make Data Count. +{doc}`More information.` -Control who can download files and choose whether or not to enable a "Request Access" button. -{ref}`More information.` +### Private URL -## Embargo +Create a URL for reviewers to view an unpublished (and optionally anonymized) dataset. 
+{ref}`More information.` -Make content inaccessible until an embargo end date. -{ref}`More information.` +### Notifications -## Custom licenses +In app and email notifications for access requests, requests for review, etc. +{ref}`More information.` -CC0 by default but add as many standard licenses as you like or create your own. -{ref}`More information.` +### User management -## Custom terms of use +Dashboard for common user-related tasks. +{doc}`More information.` -Custom terms of use can be used in place of a license or disabled by an administrator. -{ref}`More information.` +### Curation status labels -## Publishing workflow support +Let curators mark datasets with a status label customized to your needs. +{ref}`More information.<:AllowedCurationLabels>` -Datasets start as drafts and can be submitted for review before publication. -{ref}`More information.` +## Customization -## File hierarchy +### Internationalization -Users are able to control dataset file hierarchy and directory structure. -{doc}`More information.` +The Dataverse software has been translated into multiple languages. +{ref}`More information.` -## File previews +### Customization of collections -A preview is available for text, tabular, image, audio, video, and geospatial files. -{ref}`More information.` +Each personal or organizational collection can be customized and branded. +{ref}`More information.` -## Preview and analysis of tabular files +### Widgets -Data Explorer allows for searching, charting and cross tabulation analysis -{ref}`More information.` +Embed listings of data in external websites. +{ref}`More information.` -## Usage statistics and metrics +### Branding -Download counters, support for Make Data Count. -{doc}`More information.` +Your installation can be branded with a custom homepage, header, footer, CSS, etc. +{ref}`More information.` -## Guestbook +## FAIR data publication -Optionally collect data about who is downloading the files from your datasets. 
-{ref}`More information.` +### Support for FAIR Data Principles -## Fixity checks for files +Findable, Accessible, Interoperable, Reusable. +[More information.](https://scholar.harvard.edu/mercecrosas/presentations/fair-guiding-principles-implementation-dataverse) +### Publishing workflow support -MD5, SHA-1, SHA-256, SHA-512, UNF. -{ref}`More information.<:FileFixityChecksumAlgorithm>` +Datasets start as drafts and can be submitted for review before publication. +{ref}`More information.` -## File download in R and TSV format +## File management -Proprietary tabular formats are converted into RData and TSV. -{doc}`More information.` +### Restricted files -## Faceted search +Control who can download files and choose whether or not to enable a "Request Access" button. +{ref}`More information.` -Facets are data driven and customizable per collection. -{doc}`More information.` +### Embargo -## Customization of collections +Make content inaccessible until an embargo end date. +{ref}`More information.` -Each personal or organizational collection can be customized and branded. -{ref}`More information.` +### File hierarchy -## Private URL +Users are able to control dataset file hierarchy and directory structure. +{doc}`More information.` -Create a URL for reviewers to view an unpublished (and optionally anonymized) dataset. -{ref}`More information.` +### Fixity checks for files -## Widgets +MD5, SHA-1, SHA-256, SHA-512, UNF. +{ref}`More information.<:FileFixityChecksumAlgorithm>` -Embed listings of data in external websites. -{ref}`More information.` +### Backend storage on S3 or Swift -## Notifications +Choose between filesystem or object storage, configurable per collection and per dataset. +{doc}`More information.` -In app and email notifications for access requests, requests for review, etc. 
-{ref}`More information.` +### Direct upload and download for S3 -## Schema.org JSON-LD +After a permission check, files can pass freely and directly between a client computer and S3. +{doc}`More information.` -Used by Google Dataset Search and other services for discoverability. -{ref}`More information.` +### Pull header metadata from Astronomy (FITS) files -## External tools +Dataset metadata prepopulated from FITS file metadata. +{ref}`More information.` -Enable additional features not built in to the Dataverse software. -{doc}`More information.` +### Auxiliary files for data files -## External vocabulary +Each data file can have any number of auxiliary files for documentation or other purposes (experimental). +{doc}`More information.` -Let users pick from external vocabularies (provided via API/SKOSMOS) when filling in metadata. -{ref}`More information.` +## Integrations + +### DataCite integration + +DOIs are reserved, and when datasets are published, their metadata is published to DataCite. +{doc}`More information.` + +### External tools + +Enable additional features not built in to the Dataverse software. +{doc}`More information.` -## Dropbox integration +### Dropbox integration Upload files stored on Dropbox. {doc}`More information.` -## GitHub integration +### GitHub integration A GitHub Action is available to upload files from GitHub to a dataset. {doc}`More information.` -## Integration with Jupyter notebooks +### Integration with Jupyter notebooks Datasets can be opened in Binder to run code in Jupyter notebooks, RStudio, and other computation environments. {ref}`More information.` -## User management +## Interoperability -Dashboard for common user-related tasks. -{doc}`More information.` +### OAI-PMH (Harvesting) -## Curation status labels +Gather and expose metadata from and to other systems using standardized metadata formats: Dublin Core, Data Document Initiative (DDI), OpenAIRE, etc. 
+{doc}`More information.` -Let curators mark datasets with a status label customized to your needs. -{ref}`More information.<:AllowedCurationLabels>` +### APIs for interoperability and custom integrations -## Branding +Search API, Data Deposit (SWORD) API, Data Access API, Metrics API, Migration API, etc. +{doc}`More information.` -Your installation can be branded with a custom homepage, header, footer, CSS, etc. -{ref}`More information.` +### API client libraries -## Backend storage on S3 or Swift +Interact with Dataverse APIs from Python, R, Javascript, Java, and Ruby +{doc}`More information.` -Choose between filesystem or object storage, configurable per collection and per dataset. -{doc}`More information.` +### Schema.org JSON-LD + +Used by Google Dataset Search and other services for discoverability. +{ref}`More information.` -## Direct upload and download for S3 +### External vocabulary -After a permission check, files can pass freely and directly between a client computer and S3. -{doc}`More information.` +Let users pick from external vocabularies (provided via API/SKOSMOS) when filling in metadata. +{ref}`More information.` -## Export data in BagIt format +### Export data in BagIt format For preservation, bags can be sent to the local filesystem, Duraclound, and Google Cloud. {ref}`More information.` -## Post-publication automation (workflows) +## Reusability -Allow publication of a dataset to kick off external processes and integrations. -{doc}`More information.` +### Data citation for datasets and files -## Pull header metadata from Astronomy (FITS) files +EndNote XML, RIS, or BibTeX format at the dataset or file level. +{doc}`More information.` -Dataset metadata prepopulated from FITS file metadata. -{ref}`More information.` +### Custom licenses -## Provenance +CC0 by default but add as many standard licenses as you like or create your own. +{ref}`More information.` -Upload standard W3C provenance files or enter free text instead. 
-{ref}`More information.` +### Custom terms of use -## Auxiliary files for data files +Custom terms of use can be used in place of a license or disabled by an administrator. +{ref}`More information.` -Each data file can have any number of auxiliary files for documentation or other purposes (experimental). -{doc}`More information.` +### Post-publication automation (workflows) + +Allow publication of a dataset to kick off external processes and integrations. +{doc}`More information.` + +### Provenance + +Upload standard W3C provenance files or enter free text instead. +{ref}`More information.` diff --git a/scripts/issues/11998/tsv2md.py b/scripts/issues/11998/tsv2md.py index 888cb9b1595..47c65e51f6c 100755 --- a/scripts/issues/11998/tsv2md.py +++ b/scripts/issues/11998/tsv2md.py @@ -11,6 +11,7 @@ import sys from optparse import OptionParser import csv +from itertools import groupby parser = OptionParser() options, args = parser.parse_args() @@ -34,22 +35,30 @@ reader = csv.DictReader(tsv_file, delimiter="\t") rows = [row for row in reader] missing = [] -for row in rows: - title = row["Title"] - description = row["Description"] - url = row["URL"] - dtype = row["DocLinkType"] - target = row["DocLinkTarget"] - print("## %s" % title) +# Sort rows by category +rows.sort(key=lambda x: x["Categories"]) + +# Group by category +for category, group in groupby(rows, key=lambda x: x["Categories"]): + # print('BEGIN') + print("## %s" % category) print() - print("%s" % description) - if target == 'url': - print("[More information.](%s)" % (url)) - elif target != '': - print("{%s}`More information.<%s>`" % (dtype, target)) + for row in group: + title = row["Title"] + description = row["Description"] + url = row["URL"] + dtype = row["DocLinkType"] + target = row["DocLinkTarget"] + print("### %s" % title) print() - else: - missing.append(url) + print("%s" % description) + if target == 'url': + print("[More information.](%s)" % (url)) + elif target != '': + print("{%s}`More 
information.<%s>`" % (dtype, target)) + print() + else: + missing.append(url) tsv_file.close() for item in missing: print(item) From bc232862b8190905662cb26721b06d639caf85f8 Mon Sep 17 00:00:00 2001 From: Philip Durbin Date: Fri, 16 Jan 2026 09:20:26 -0500 Subject: [PATCH 3/3] crosslink "features" and "what is dataverse" pages #11998 --- doc/sphinx-guides/source/admin/features.md | 2 +- doc/sphinx-guides/source/quickstart/what-is-dataverse.md | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/doc/sphinx-guides/source/admin/features.md b/doc/sphinx-guides/source/admin/features.md index 9aa6cafe6c2..59bf7e85fa5 100644 --- a/doc/sphinx-guides/source/admin/features.md +++ b/doc/sphinx-guides/source/admin/features.md @@ -1,6 +1,6 @@ # Features -An overview of Dataverse features can be found at . This is a more comprehensive list. +An overview of Dataverse features can be found at {ref}`core-capabilities` and . This is a more comprehensive list. ```{contents} Contents: :local: diff --git a/doc/sphinx-guides/source/quickstart/what-is-dataverse.md b/doc/sphinx-guides/source/quickstart/what-is-dataverse.md index 6f86473bada..ceb3da0a6ad 100644 --- a/doc/sphinx-guides/source/quickstart/what-is-dataverse.md +++ b/doc/sphinx-guides/source/quickstart/what-is-dataverse.md @@ -10,6 +10,7 @@ A Dataverse repository can host one or more Dataverse collections, which organiz - Data files - Documentation or code +(core-capabilities)= ## Core Capabilities ### 📤 Upload, manage, publish and download data files. @@ -37,4 +38,4 @@ A Dataverse repository can host one or more Dataverse collections, which organiz - Compare versions with the detailed version change overview on dataset-level. ### ✨More features -The Dataverse project is continuously evolving. For an overview of capabilities, visit the [features list](https://dataverse.org/software-features). +The Dataverse project is continuously evolving. 
For an overview of capabilities, see {doc}`/admin/features` in the Admin Guide.
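
The grouping added to tsv2md.py in the second commit works because `itertools.groupby` only merges *adjacent* rows with equal keys, which is why the script sorts by the "Categories" column before grouping. A minimal, self-contained sketch of that sort-then-group pattern — the sample rows below are made up for illustration and use only a subset of the columns the real features.tsv has:

```python
import csv
import io
from itertools import groupby

# Hypothetical rows mirroring a few of the columns tsv2md.py reads.
TSV = (
    "Title\tDescription\tCategories\n"
    "Versioning\tHistory of changes is preserved.\tCuration\n"
    "Guestbook\tCollect data about downloads.\tCuration\n"
    "Widgets\tEmbed listings of data.\tCustomization\n"
)

rows = list(csv.DictReader(io.StringIO(TSV), delimiter="\t"))

# groupby only merges adjacent equal keys, so sort by category first.
rows.sort(key=lambda r: r["Categories"])

# One entry per category, listing the feature titles in that category.
sections = {
    category: [r["Title"] for r in group]
    for category, group in groupby(rows, key=lambda r: r["Categories"])
}

print(sections)
# {'Curation': ['Versioning', 'Guestbook'], 'Customization': ['Widgets']}
```

Without the sort, a category whose rows are scattered through the TSV would be emitted more than once, producing duplicate `##` category headings in the generated markdown.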