diff --git a/doc/release-notes/8424-signposting.md b/doc/release-notes/8424-signposting.md new file mode 100644 index 00000000000..6994ba92cb7 --- /dev/null +++ b/doc/release-notes/8424-signposting.md @@ -0,0 +1,8 @@ +# Signposting for Dataverse + +This release adds [Signposting](https://signposting.org/) support to Dataverse to improve machine discoverability of datasets and files. + +The following MicroProfile Config options are now available (these can be treated as JVM options): + +- dataverse.signposting.level1-author-limit +- dataverse.signposting.level1-item-limit diff --git a/doc/sphinx-guides/source/admin/discoverability.rst b/doc/sphinx-guides/source/admin/discoverability.rst new file mode 100644 index 00000000000..767bb55bce6 --- /dev/null +++ b/doc/sphinx-guides/source/admin/discoverability.rst @@ -0,0 +1,76 @@ +Discoverability +=============== + +Datasets are made discoverable by a variety of methods. + +.. contents:: |toctitle| + :local: + +DataCite Integration +-------------------- + +If you are using `DataCite `_ as your DOI provider, when datasets are published, metadata is pushed to DataCite, where it can be searched. For more information, see :ref:`:DoiProvider` in the Installation Guide. + +OAI-PMH (Harvesting) +-------------------- + +The Dataverse software supports a protocol called OAI-PMH that facilitates harvesting dataset metadata from one system into another. For details on harvesting, see the :doc:`harvestserver` section. + +Machine-Readable Metadata on Dataset Landing Pages +-------------------------------------------------- + +As recommended in `A Data Citation Roadmap for Scholarly Data Repositories `_, the Dataverse software embeds metadata on dataset landing pages in a variety of machine-readable ways. 
+ +Dublin Core HTML Meta Tags +++++++++++++++++++++++++++ + +The HTML source of a dataset landing page includes "DC" (Dublin Core) ```` tags such as the following:: + + {"@context":"http://schema.org","@type":"Dataset","@id":"https://doi.org/... + + +.. _discovery-sign-posting: + +Signposting ++++++++++++ + +The Dataverse software supports `Signposting `_. This allows machines to request more information about a dataset through the `Link `_ HTTP header. + +There are 2 Signposting profile levels, level 1 and level 2. In this implementation, + * Level 1 links are shown `as recommended `_ in the "Link" + HTTP header, which can be fetched by sending an HTTP HEAD request, e.g. ``curl -I https://demo.dataverse.org/dataset.xhtml?persistentId=doi:10.5072/FK2/KPY4ZC``. + The number of author and file links in the level 1 header can be configured as described below. + * The level 2 linkset can be fetched by visiting the dedicated linkset page for + that artifact. The link can be seen in level 1 links with key name ``rel="linkset"``. + +Note: Authors without author link will not be counted nor shown in any profile/linkset. +The following configuration options are available: + +- :ref:`dataverse.signposting.level1-author-limit` + + Sets the max number of authors to be shown in `level 1` profile. + If the number of authors (with identifier URLs) exceeds this value, no author links will be shown in `level 1` profile. + The default is 5. + +- :ref:`dataverse.signposting.level1-item-limit` + + Sets the max number of items/files which will be shown in `level 1` profile. Datasets with + too many files will not show any file links in `level 1` profile. They will be shown in `level 2` linkset only. + The default is 5. + +See also :ref:`signposting-api` in the API Guide. + +Additional Discoverability Through Integrations +----------------------------------------------- + +See :ref:`integrations-discovery` in the Integrations section for additional discovery methods you can enable. 
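The all-or-nothing behaviour of the two level 1 limits documented above can be sketched in Java. This is a minimal illustration under stated assumptions, not the Dataverse implementation: `Level1LimitSketch`, `linksForLevel1`, and `toLinkHeader` are hypothetical names, and the `example.org` author URLs are made-up sample values.

```java
import java.util.List;

public class Level1LimitSketch {

    // Hypothetical helper: keep every URL when the count is within the limit,
    // otherwise drop them all (they would still appear in the level 2 linkset).
    static List<String> linksForLevel1(List<String> urls, int limit) {
        return urls.size() > limit ? List.of() : urls;
    }

    // Hypothetical helper: format URLs as comma-separated "Link" header entries.
    static String toLinkHeader(List<String> urls, String rel) {
        StringBuilder sb = new StringBuilder();
        for (String url : urls) {
            if (sb.length() > 0) {
                sb.append(",");
            }
            sb.append("<").append(url).append(">;rel=\"").append(rel).append("\"");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up author URLs, for illustration only.
        List<String> authors = List.of("https://example.org/author/1", "https://example.org/author/2");
        // Within the limit: both author links are emitted.
        System.out.println(toLinkHeader(linksForLevel1(authors, 5), "author"));
        // Over the limit: no author links at all, so the header fragment is empty.
        System.out.println(toLinkHeader(linksForLevel1(authors, 1), "author").isEmpty());
    }
}
```

Note that exceeding a limit suppresses every link of that type rather than truncating the list, which is why a dataset with six authors and a limit of five shows no author links in the level 1 header.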
diff --git a/doc/sphinx-guides/source/admin/index.rst b/doc/sphinx-guides/source/admin/index.rst index b97d9161d50..ac81aa737a7 100755 --- a/doc/sphinx-guides/source/admin/index.rst +++ b/doc/sphinx-guides/source/admin/index.rst @@ -14,6 +14,7 @@ This guide documents the functionality only available to superusers (such as "da dashboard external-tools + discoverability harvestclients harvestserver metadatacustomization diff --git a/doc/sphinx-guides/source/admin/integrations.rst b/doc/sphinx-guides/source/admin/integrations.rst index 4f919ca6bf2..795c7239aae 100644 --- a/doc/sphinx-guides/source/admin/integrations.rst +++ b/doc/sphinx-guides/source/admin/integrations.rst @@ -179,15 +179,12 @@ Avgidea Data Search Researchers can use a Google Sheets add-on to search for Dataverse installation's CSV data and then import that data into a sheet. See `Avgidea Data Search `_ for details. +.. _integrations-discovery: + Discoverability --------------- -Integration with `DataCite `_ is built in to the Dataverse Software. When datasets are published, metadata is sent to DataCite. You can further increase the discoverability of your datasets by setting up additional integrations. - -OAI-PMH (Harvesting) -++++++++++++++++++++ - -The Dataverse Software supports a protocol called OAI-PMH that facilitates harvesting datasets from one system into another. For details on harvesting, see the :doc:`harvestserver` section. +A number of builtin features related to data discovery are listed under :doc:`discoverability` but you can further increase the discoverability of your data by setting up integrations. 
SHARE +++++ diff --git a/doc/sphinx-guides/source/api/native-api.rst b/doc/sphinx-guides/source/api/native-api.rst index 0c4978b58f0..07cba1efccf 100644 --- a/doc/sphinx-guides/source/api/native-api.rst +++ b/doc/sphinx-guides/source/api/native-api.rst @@ -2084,10 +2084,34 @@ The response is a JSON object described in the :doc:`/api/external-tools` sectio export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV export VERSION=1.0 export TOOL_ID=1 - curl -H "X-Dataverse-key: $API_TOKEN" -H "Accept:application/json" "$SERVER_URL/api/datasets/:persistentId/versions/$VERSION/toolparams/$TOOL_ID?persistentId=$PERSISTENT_IDENTIFIER" +.. _signposting-api: + +Retrieve Signposting Information +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Dataverse supports :ref:`discovery-sign-posting` as a discovery mechanism. +Signposting adds a `Link `__ HTTP header with summary information to the responses to GET and HEAD requests for the dataset page, and provides a separate /linkset API call to retrieve additional information. + +Here is an example of a "Link" header: + +``Link: ;rel="cite-as", ;rel="describedby";type="application/vnd.citationstyles.csl+json",;rel="describedby";type="application/json+ld", ;rel="type",;rel="type", https://demo.dataverse.org/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.5072/FK2/YD5QDG;rel="license", ; rel="linkset";type="application/linkset+json"`` + +The URL for linkset information is discoverable under the ``rel="linkset";type="application/linkset+json"`` entry in the "Link" header, such as in the example above. + +The response includes a JSON object conforming to the `Signposting `__ specification. +Signposting is not supported for draft dataset versions. + +.. 
code-block:: bash + + export SERVER_URL=https://demo.dataverse.org + export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/YD5QDG + export VERSION=1.0 + + curl -H "Accept:application/json" "$SERVER_URL/api/datasets/:persistentId/versions/$VERSION/linkset?persistentId=$PERSISTENT_IDENTIFIER" + Files ----- diff --git a/doc/sphinx-guides/source/developers/configuration.rst b/doc/sphinx-guides/source/developers/configuration.rst index 9ed20299de3..d342c28efc6 100644 --- a/doc/sphinx-guides/source/developers/configuration.rst +++ b/doc/sphinx-guides/source/developers/configuration.rst @@ -93,6 +93,7 @@ sub-scopes first. - All sub-scopes are below that. - Scopes are separated by dots (periods). - A scope may be a placeholder, filled with a variable during lookup. (Named object mapping.) +- The setting should be in kebab case (``signing-secret``) rather than camel case (``signingSecret``). Any consumer of the setting can choose to use one of the fluent ``lookup()`` methods, which hides away alias handling, conversion etc from consuming code. See also the detailed Javadoc for these methods. diff --git a/doc/sphinx-guides/source/installation/config.rst b/doc/sphinx-guides/source/installation/config.rst index e0fd7ebe40c..ac45bf81b56 100644 --- a/doc/sphinx-guides/source/installation/config.rst +++ b/doc/sphinx-guides/source/installation/config.rst @@ -2153,6 +2153,26 @@ See also these related database settings: - :ref:`:Authority` - :ref:`:Shoulder` + +.. _dataverse.signposting.level1-author-limit: + +dataverse.signposting.level1-author-limit ++++++++++++++++++++++++++++++++++++++++++ + +See :ref:`discovery-sign-posting` for details. + +Can also be set via any `supported MicroProfile Config API source`_, e.g. the environment variable ``DATAVERSE_SIGNPOSTING_LEVEL1_AUTHOR_LIMIT``. + +.. _dataverse.signposting.level1-item-limit: + +dataverse.signposting.level1-item-limit ++++++++++++++++++++++++++++++++++++++++ + +See :ref:`discovery-sign-posting` for details. 
+ +Can also be set via any `supported MicroProfile Config API source`_, e.g. the environment variable ``DATAVERSE_SIGNPOSTING_LEVEL1_ITEM_LIMIT``. + + .. _feature-flags: Feature Flags diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java b/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java index e6745247071..f6837333f45 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetPage.java @@ -143,6 +143,8 @@ import edu.harvard.iq.dataverse.search.SearchServiceBean; import edu.harvard.iq.dataverse.search.SearchUtil; import edu.harvard.iq.dataverse.search.SolrClientService; +import edu.harvard.iq.dataverse.settings.JvmSettings; +import edu.harvard.iq.dataverse.util.SignpostingResources; import edu.harvard.iq.dataverse.util.FileMetadataUtil; import java.util.Comparator; import org.apache.solr.client.solrj.SolrQuery; @@ -6046,8 +6048,7 @@ public boolean downloadingRestrictedFiles() { } return false; } - - + //Determines whether this Dataset uses a public store and therefore doesn't support embargoed or restricted files public boolean isHasPublicStore() { return settingsWrapper.isTrueForKey(SettingsServiceBean.Key.PublicInstall, StorageIO.isPublicStore(dataset.getEffectiveStorageDriverId())); @@ -6080,5 +6081,19 @@ public String getWebloaderUrlForDataset(Dataset d) { return null; } } + + /** + * Add Signposting + * @return String + */ + public String getSignpostingLinkHeader() { + if (!workingVersion.isReleased()) { + return null; + } + SignpostingResources sr = new SignpostingResources(systemConfig, workingVersion, + JvmSettings.SIGNPOSTING_LEVEL1_AUTHOR_LIMIT.lookupOptional().orElse(""), + JvmSettings.SIGNPOSTING_LEVEL1_ITEM_LIMIT.lookupOptional().orElse("")); + return sr.getLinks(); + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java b/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java index f3abe222396..d40bc153141 100644 --- 
a/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java @@ -83,6 +83,7 @@ import edu.harvard.iq.dataverse.metrics.MetricsUtil; import edu.harvard.iq.dataverse.makedatacount.MakeDataCountUtil; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key; import edu.harvard.iq.dataverse.util.ArchiverUtil; import edu.harvard.iq.dataverse.util.BundleUtil; import edu.harvard.iq.dataverse.util.EjbUtil; @@ -93,6 +94,8 @@ import edu.harvard.iq.dataverse.util.json.JSONLDUtil; import edu.harvard.iq.dataverse.util.json.JsonLDTerm; import edu.harvard.iq.dataverse.util.json.JsonParseException; +import edu.harvard.iq.dataverse.util.json.JsonPrinter; +import edu.harvard.iq.dataverse.util.SignpostingResources; import edu.harvard.iq.dataverse.util.json.JsonUtil; import edu.harvard.iq.dataverse.search.IndexServiceBean; @@ -156,6 +159,7 @@ import org.glassfish.jersey.media.multipart.FormDataContentDisposition; import org.glassfish.jersey.media.multipart.FormDataParam; import com.amazonaws.services.s3.model.PartETag; +import edu.harvard.iq.dataverse.settings.JvmSettings; @Path("datasets") public class Datasets extends AbstractApiBean { @@ -558,7 +562,38 @@ public Response getVersionMetadataBlock(@Context ContainerRequestContext crc, return notFound("metadata block named " + blockName + " not found"); }, getRequestUser(crc)); } - + + /** + * Add Signposting + * @param datasetId + * @param versionId + * @param uriInfo + * @param headers + * @return + */ + @GET + @AuthRequired + @Path("{id}/versions/{versionId}/linkset") + public Response getLinkset(@Context ContainerRequestContext crc, @PathParam("id") String datasetId, @PathParam("versionId") String versionId, @Context UriInfo uriInfo, @Context HttpHeaders headers) { + if ( ":draft".equals(versionId) ) { + return badRequest("Signposting is not supported on the :draft version"); + } + User user = 
getRequestUser(crc); + return response(req -> { + DatasetVersion dsv = getDatasetVersionOrDie(req, versionId, findDatasetOrDie(datasetId), uriInfo, headers); + return ok(Json.createObjectBuilder().add( + "linkset", + new SignpostingResources( + systemConfig, + dsv, + JvmSettings.SIGNPOSTING_LEVEL1_AUTHOR_LIMIT.lookupOptional().orElse(""), + JvmSettings.SIGNPOSTING_LEVEL1_ITEM_LIMIT.lookupOptional().orElse("") + ).getJsonLinkset() + ) + ); + }, user); + } + @GET @AuthRequired @Path("{id}/modifyRegistration") diff --git a/src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java b/src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java index 370bd631dd9..2b2a8de85f7 100644 --- a/src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java +++ b/src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java @@ -67,6 +67,11 @@ public enum JvmSettings { // API SETTINGS SCOPE_API(PREFIX, "api"), API_SIGNING_SECRET(SCOPE_API, "signing-secret"), + + // SIGNPOSTING SETTINGS + SCOPE_SIGNPOSTING(PREFIX, "signposting"), + SIGNPOSTING_LEVEL1_AUTHOR_LIMIT(SCOPE_SIGNPOSTING, "level1-author-limit"), + SIGNPOSTING_LEVEL1_ITEM_LIMIT(SCOPE_SIGNPOSTING, "level1-item-limit"), // FEATURE FLAGS SETTINGS SCOPE_FLAGS(PREFIX, "feature"), diff --git a/src/main/java/edu/harvard/iq/dataverse/util/SignpostingResources.java b/src/main/java/edu/harvard/iq/dataverse/util/SignpostingResources.java new file mode 100644 index 00000000000..54be3a8765f --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/util/SignpostingResources.java @@ -0,0 +1,269 @@ +package edu.harvard.iq.dataverse.util; + +/** + Eko Indarto, DANS + Vic Ding, DANS + + This file prepares the resources used in Signposting + + Two configurable options allow changing the limit for the number of authors or datafiles (items) allowed in the level-1 header. + If more than this number exists, no entries of that type are included in the level-1 header. 
+ See the documentation for the dataverse.signposting.level1-author-limit, and dataverse.signposting.level1-item-limit + + Also note that per the signposting spec, authors for which no PID/URL has been provided are not included in the signposting output. + + */ + +import edu.harvard.iq.dataverse.*; +import edu.harvard.iq.dataverse.dataset.DatasetUtil; +import javax.json.Json; +import javax.json.JsonArrayBuilder; +import javax.json.JsonObjectBuilder; +import java.util.ArrayList; +import java.util.LinkedList; +import java.util.List; +import java.util.Objects; +import java.util.logging.Logger; + +import static edu.harvard.iq.dataverse.util.json.NullSafeJsonBuilder.jsonObjectBuilder; + +public class SignpostingResources { + private static final Logger logger = Logger.getLogger(SignpostingResources.class.getCanonicalName()); + SystemConfig systemConfig; + DatasetVersion workingDatasetVersion; + static final String defaultFileTypeValue = "https://schema.org/Dataset"; + static final int defaultMaxLinks = 5; + int maxAuthors; + int maxItems; + + public SignpostingResources(SystemConfig systemConfig, DatasetVersion workingDatasetVersion, String authorLimitSetting, String itemLimitSetting) { + this.systemConfig = systemConfig; + this.workingDatasetVersion = workingDatasetVersion; + maxAuthors = SystemConfig.getIntLimitFromStringOrDefault(authorLimitSetting, defaultMaxLinks); + maxItems = SystemConfig.getIntLimitFromStringOrDefault(itemLimitSetting, defaultMaxLinks); + } + + + /** + * Get key, values of signposting items and return as string + * + * @return comma delimited string + */ + public String getLinks() { + List valueList = new LinkedList<>(); + Dataset ds = workingDatasetVersion.getDataset(); + + String identifierSchema = getAuthorsAsString(getAuthorURLs(true)); + if (identifierSchema != null && !identifierSchema.isEmpty()) { + valueList.add(identifierSchema); + } + + if (!Objects.equals(ds.getPersistentURL(), "")) { + String citeAs = "<" + ds.getPersistentURL() + 
">;rel=\"cite-as\""; + valueList.add(citeAs); + } + + List fms = workingDatasetVersion.getFileMetadatas(); + String items = getItems(fms); + if (items != null && !Objects.equals(items, "")) { + valueList.add(items); + } + + String describedby = "<" + ds.getGlobalId().asURL().toString() + ">;rel=\"describedby\"" + ";type=\"" + "application/vnd.citationstyles.csl+json\""; + describedby += ",<" + systemConfig.getDataverseSiteUrl() + "/api/datasets/export?exporter=schema.org&persistentId=" + + ds.getProtocol() + ":" + ds.getAuthority() + "/" + ds.getIdentifier() + ">;rel=\"describedby\"" + ";type=\"application/json+ld\""; + valueList.add(describedby); + + String type = ";rel=\"type\""; + type = ";rel=\"type\",<" + defaultFileTypeValue + ">;rel=\"type\""; + valueList.add(type); + + String licenseString = DatasetUtil.getLicenseURI(workingDatasetVersion) + ";rel=\"license\""; + valueList.add(licenseString); + + String linkset = "<" + systemConfig.getDataverseSiteUrl() + "/api/datasets/:persistentId/versions/" + + workingDatasetVersion.getVersionNumber() + "." 
+ workingDatasetVersion.getMinorVersionNumber() + + "/linkset?persistentId=" + ds.getProtocol() + ":" + ds.getAuthority() + "/" + ds.getIdentifier() + "> ; rel=\"linkset\";type=\"application/linkset+json\""; + valueList.add(linkset); + logger.fine(String.format("valueList is: %s", valueList)); + + return String.join(", ", valueList); + } + + public JsonArrayBuilder getJsonLinkset() { + Dataset ds = workingDatasetVersion.getDataset(); + GlobalId gid = ds.getGlobalId(); + String landingPage = systemConfig.getDataverseSiteUrl() + "/dataset.xhtml?persistentId=" + ds.getProtocol() + ":" + ds.getAuthority() + "/" + ds.getIdentifier(); + JsonArrayBuilder authors = getJsonAuthors(getAuthorURLs(false)); + JsonArrayBuilder items = getJsonItems(); + + String licenseString = DatasetUtil.getLicenseURI(workingDatasetVersion); + + JsonArrayBuilder mediaTypes = Json.createArrayBuilder(); + mediaTypes.add( + jsonObjectBuilder().add( + "href", + gid.asURL().toString() + ).add( + "type", + "application/vnd.citationstyles.csl+json" + ) + ); + + mediaTypes.add( + jsonObjectBuilder().add( + "href", + systemConfig.getDataverseSiteUrl() + "/api/datasets/export?exporter=schema.org&persistentId=" + ds.getProtocol() + ":" + ds.getAuthority() + "/" + ds.getIdentifier() + ).add( + "type", + "application/json+ld" + ) + ); + JsonArrayBuilder linksetJsonObj = Json.createArrayBuilder(); + + JsonObjectBuilder mandatory; + mandatory = jsonObjectBuilder().add("anchor", landingPage) + .add("cite-as", Json.createArrayBuilder().add(jsonObjectBuilder().add("href", ds.getPersistentURL()))) + .add("type", + Json.createArrayBuilder().add(jsonObjectBuilder().add("href", "https://schema.org/AboutPage")) + .add(jsonObjectBuilder().add("href", defaultFileTypeValue))); + + if (authors != null) { + mandatory.add("author", authors); + } + if (licenseString != null && !licenseString.isBlank()) { + mandatory.add("license", jsonObjectBuilder().add("href", licenseString)); + } + if (!mediaTypes.toString().isBlank()) { 
+ mandatory.add("describedby", mediaTypes); + } + if (items != null) { + mandatory.add("item", items); + } + linksetJsonObj.add(mandatory); + + // remove scholarly type as shown already on landing page + for (FileMetadata fm : workingDatasetVersion.getFileMetadatas()) { + DataFile df = fm.getDataFile(); + JsonObjectBuilder itemAnchor = jsonObjectBuilder().add("anchor", getPublicDownloadUrl(df)); + itemAnchor.add("collection", Json.createArrayBuilder().add(jsonObjectBuilder() + .add("href", landingPage))); + linksetJsonObj.add(itemAnchor); + } + + return linksetJsonObj; + } + + /** + * Retrieves all the authors of a DatasetVersion that have a valid URL and puts those URLs in a list + * @param limit - if true, will return an empty list (for level 1) if more than maxAuthors authors with URLs are found + */ + private List<String> getAuthorURLs(boolean limit) { + List<String> authorURLs = new ArrayList<>(maxAuthors); + int visibleAuthorCounter = 0; + + for (DatasetAuthor da : workingDatasetVersion.getDatasetAuthors()) { + logger.fine(String.format("idtype: %s; idvalue: %s, affiliation: %s; identifierUrl: %s", da.getIdType(), + da.getIdValue(), da.getAffiliation(), da.getIdentifierAsUrl())); + String authorURL = getAuthorUrl(da); + if (authorURL != null && !authorURL.isBlank()) { + // when limiting, return empty if the number of visible authors is more than the max allowed + // >= since we're comparing before incrementing visibleAuthorCounter + if (limit && visibleAuthorCounter >= maxAuthors) { + authorURLs.clear(); + break; + } + authorURLs.add(authorURL); + visibleAuthorCounter++; + + + } + } + return authorURLs; + } + + + /** + * Get Authors as string + * For example: + * if author has VIAF + * Link: ; rel="author" + * + * @param datasetAuthorURLs list of the URLs of all DatasetAuthors with a valid URL + * @return all the author links in a string + */ + private String getAuthorsAsString(List<String> datasetAuthorURLs) { + String singleAuthorString; + String identifierSchema = null; + for (String authorURL : datasetAuthorURLs) { + 
singleAuthorString = "<" + authorURL + ">;rel=\"author\""; + if (identifierSchema == null) { + identifierSchema = singleAuthorString; + } else { + identifierSchema = String.join(",", identifierSchema, singleAuthorString); + } + } + logger.fine(String.format("identifierSchema: %s", identifierSchema)); + return identifierSchema; + } + + /* + * Resolves the URL for a single author from its identifier type and value + */ + private String getAuthorUrl(DatasetAuthor da) { + String authorURL = ""; + //If no type and there's a value, assume it is a URL (is this reasonable?) + //Otherwise, get the URL using the type and value + if (da.getIdType() != null && !da.getIdType().isBlank() && da.getIdValue()!=null) { + authorURL = da.getIdValue(); + } else { + authorURL = da.getIdentifierAsUrl(); + } + return authorURL; + } + + private JsonArrayBuilder getJsonAuthors(List<String> datasetAuthorURLs) { + if(datasetAuthorURLs.isEmpty()) { + return null; + } + JsonArrayBuilder authors = Json.createArrayBuilder(); + for (String authorURL : datasetAuthorURLs) { + authors.add(jsonObjectBuilder().add("href", authorURL)); + } + return authors; + } + + private String getItems(List<FileMetadata> fms) { + if (fms.size() > maxItems) { + logger.fine(String.format("maxItems is %s and fms size is %s", maxItems, fms.size())); + return null; + } + + String itemString = null; + for (FileMetadata fm : fms) { + DataFile df = fm.getDataFile(); + if (itemString == null) { + itemString = "<" + getPublicDownloadUrl(df) + ">;rel=\"item\";type=\"" + df.getContentType() + "\""; + } else { + itemString = String.join(",", itemString, "<" + getPublicDownloadUrl(df) + ">;rel=\"item\";type=\"" + df.getContentType() + "\""); + } + } + return itemString; + } + + private JsonArrayBuilder getJsonItems() { + JsonArrayBuilder items = Json.createArrayBuilder(); + for (FileMetadata fm : workingDatasetVersion.getFileMetadatas()) { + DataFile df = fm.getDataFile(); + items.add(jsonObjectBuilder().add("href", getPublicDownloadUrl(df)).add("type", df.getContentType())); + } + + return items; + } + + private String 
getPublicDownloadUrl(DataFile dataFile) { + GlobalId gid = dataFile.getGlobalId(); + return FileUtil.getPublicDownloadUrl(systemConfig.getDataverseSiteUrl(), + ((gid != null) ? gid.asString() : null), dataFile.getId()); + } +} diff --git a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java index 6ec4209336d..ad7e9718727 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java @@ -976,4 +976,23 @@ public static JsonObjectBuilder mapToObject(Map in) { in.keySet().forEach( k->b.add(k, in.get(k)) ); return b; } + + + /** + * Get signposting from Dataset + * @param ds the designated Dataset + * @return json linkset + */ + public static JsonObjectBuilder jsonLinkset(Dataset ds) { + return jsonObjectBuilder() + .add("anchor", ds.getPersistentURL()) + .add("cite-as", Json.createArrayBuilder().add(jsonObjectBuilder().add("href", ds.getPersistentURL()))) + .add("type", Json.createArrayBuilder().add(jsonObjectBuilder().add("href", "https://schema.org/AboutPage"))) + .add("author", ds.getPersistentURL()) + .add("protocol", ds.getProtocol()) + .add("authority", ds.getAuthority()) + .add("publisher", BrandingUtil.getInstallationBrandName()) + .add("publicationDate", ds.getPublicationDateFormattedYYYYMMDD()) + .add("storageIdentifier", ds.getStorageIdentifier()); + } } diff --git a/src/main/webapp/dataset.xhtml b/src/main/webapp/dataset.xhtml index 11b449e815e..d2ed76dfed1 100644 --- a/src/main/webapp/dataset.xhtml +++ b/src/main/webapp/dataset.xhtml @@ -88,6 +88,11 @@ + + + + + diff --git a/src/test/java/edu/harvard/iq/dataverse/api/SignpostingIT.java b/src/test/java/edu/harvard/iq/dataverse/api/SignpostingIT.java new file mode 100644 index 00000000000..e22d0740c48 --- /dev/null +++ b/src/test/java/edu/harvard/iq/dataverse/api/SignpostingIT.java @@ -0,0 +1,106 @@ +package edu.harvard.iq.dataverse.api; + 
+import com.jayway.restassured.RestAssured; +import com.jayway.restassured.http.ContentType; + +import static com.jayway.restassured.RestAssured.given; +import com.jayway.restassured.response.Response; + +import edu.harvard.iq.dataverse.util.json.JsonUtil; + +import static javax.ws.rs.core.Response.Status.CREATED; +import static javax.ws.rs.core.Response.Status.OK; +import static org.junit.Assert.assertTrue; + +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +import javax.json.JsonObject; + +import org.junit.BeforeClass; +import org.junit.Test; + +public class SignpostingIT { + + @BeforeClass + public static void setUpClass() { + RestAssured.baseURI = UtilIT.getRestAssuredBaseUri(); + } + + @Test + public void testSignposting() { + + Response createUser = UtilIT.createRandomUser(); + createUser.then().assertThat().statusCode(OK.getStatusCode()); + String username = UtilIT.getUsernameFromResponse(createUser); + String apiToken = UtilIT.getApiTokenFromResponse(createUser); + Response toggleSuperuser = UtilIT.makeSuperUser(username); + toggleSuperuser.then().assertThat().statusCode(OK.getStatusCode()); + + Response createDataverse = UtilIT.createRandomDataverse(apiToken); + createDataverse.then().assertThat().statusCode(CREATED.getStatusCode()); + String dataverseAlias = UtilIT.getAliasFromResponse(createDataverse); + Integer dataverseId = UtilIT.getDataverseIdFromResponse(createDataverse); + + Response createDataset = UtilIT.createRandomDatasetViaNativeApi(dataverseAlias, apiToken); + createDataset.prettyPrint(); + createDataset.then().assertThat().statusCode(CREATED.getStatusCode()); + + String datasetPid = UtilIT.getDatasetPersistentIdFromResponse(createDataset); + + Response publishDataverse = UtilIT.publishDataverseViaNativeApi(dataverseAlias, apiToken); + publishDataverse.then().assertThat().statusCode(OK.getStatusCode()); + Response publishDataset = UtilIT.publishDatasetViaNativeApi(datasetPid, "major", apiToken); + 
publishDataset.then().assertThat().statusCode(OK.getStatusCode()); + + String datasetLandingPage = RestAssured.baseURI + "/dataset.xhtml?persistentId=" + datasetPid; + System.out.println("Checking dataset landing page for Signposting: " + datasetLandingPage); + Response getHtml = given().get(datasetLandingPage); + + System.out.println("Link header: " + getHtml.getHeader("Link")); + + getHtml.then().assertThat().statusCode(OK.getStatusCode()); + + // Make sure there's Signposting stuff in the "Link" header such as + // the dataset PID, cite-as, etc. + String linkHeader = getHtml.getHeader("Link"); + assertTrue(linkHeader.contains(datasetPid)); + assertTrue(linkHeader.contains("cite-as")); + assertTrue(linkHeader.contains("describedby")); + + Response headHtml = given().head(datasetLandingPage); + + System.out.println("Link header: " + headHtml.getHeader("Link")); + + headHtml.then().assertThat().statusCode(OK.getStatusCode()); + + // Make sure there's Signposting stuff in the "Link" header such as + // the dataset PID, cite-as, etc. 
+ linkHeader = headHtml.getHeader("Link"); + assertTrue(linkHeader.contains(datasetPid)); + assertTrue(linkHeader.contains("cite-as")); + assertTrue(linkHeader.contains("describedby")); + + Pattern pattern = Pattern.compile("<([^<]*)> ; rel=\"linkset\";type=\"application\\/linkset\\+json\""); + Matcher matcher = pattern.matcher(linkHeader); + assertTrue(matcher.find()); + String linksetUrl = matcher.group(1); + + System.out.println("Linkset URL: " + linksetUrl); + + Response linksetResponse = given().accept(ContentType.JSON).get(linksetUrl); + + String responseString = linksetResponse.getBody().asString(); + + JsonObject data = JsonUtil.getJsonObject(responseString).getJsonObject("data"); + JsonObject lso = data.getJsonArray("linkset").getJsonObject(0); + System.out.println("Linkset: " + lso.toString()); + + linksetResponse.then().assertThat().statusCode(OK.getStatusCode()); + + assertTrue(lso.getString("anchor").indexOf("/dataset.xhtml?persistentId=" + datasetPid) > 0); + assertTrue(lso.containsKey("describedby")); + + } + +} diff --git a/tests/integration-tests.txt b/tests/integration-tests.txt index 1e9110be2de..158393791f2 100644 --- a/tests/integration-tests.txt +++ b/tests/integration-tests.txt @@ -1 +1 @@ -DataversesIT,DatasetsIT,SwordIT,AdminIT,BuiltinUsersIT,UsersIT,UtilIT,ConfirmEmailIT,FileMetadataIT,FilesIT,SearchIT,InReviewWorkflowIT,HarvestingServerIT,HarvestingClientsIT,MoveIT,MakeDataCountApiIT,FileTypeDetectionIT,EditDDIIT,ExternalToolsIT,AccessIT,DuplicateFilesIT,DownloadFilesIT,LinkIT,DeleteUsersIT,DeactivateUsersIT,AuxiliaryFilesIT,InvalidCharactersIT,LicensesIT,NotificationsIT,BagIT,MetadataBlocksIT,NetcdfIT 
+DataversesIT,DatasetsIT,SwordIT,AdminIT,BuiltinUsersIT,UsersIT,UtilIT,ConfirmEmailIT,FileMetadataIT,FilesIT,SearchIT,InReviewWorkflowIT,HarvestingServerIT,HarvestingClientsIT,MoveIT,MakeDataCountApiIT,FileTypeDetectionIT,EditDDIIT,ExternalToolsIT,AccessIT,DuplicateFilesIT,DownloadFilesIT,LinkIT,DeleteUsersIT,DeactivateUsersIT,AuxiliaryFilesIT,InvalidCharactersIT,LicensesIT,NotificationsIT,BagIT,MetadataBlocksIT,NetcdfIT,SignpostingIT
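SignpostingIT above extracts the linkset URL with a regular expression that depends on the exact spacing the server currently emits (`> ; rel=`). A client that should not break on whitespace changes might use a more tolerant pattern. The following is a sketch under that assumption: `LinksetUrlSketch` and `linksetUrl` are hypothetical names, and the `cite-as` URL in the sample header is illustrative; only the linkset URL shape is taken from the API documentation in this change.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinksetUrlSketch {

    // Matches "<url>" followed by optional whitespace, ';', optional whitespace,
    // and rel="linkset". The angle-bracket class keeps the match inside one entry.
    private static final Pattern LINKSET =
            Pattern.compile("<([^<>]*)>\\s*;\\s*rel=\"linkset\"");

    // Hypothetical helper: returns the linkset URL from a "Link" header, if any.
    static Optional<String> linksetUrl(String linkHeader) {
        if (linkHeader == null) {
            return Optional.empty();
        }
        Matcher matcher = LINKSET.matcher(linkHeader);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample header; the cite-as target is an illustrative value.
        String header = "<https://doi.org/10.5072/FK2/YD5QDG>;rel=\"cite-as\", "
                + "<https://demo.dataverse.org/api/datasets/:persistentId/versions/1.0/linkset"
                + "?persistentId=doi:10.5072/FK2/YD5QDG> ; rel=\"linkset\";type=\"application/linkset+json\"";
        System.out.println(linksetUrl(header).orElse("no linkset entry found"));
    }
}
```

Returning an `Optional` rather than calling `group(1)` unconditionally avoids an `IllegalStateException` when the header carries no linkset entry, for example on a draft dataset version.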