Skip to content

Signposting#7919

Closed
janvanmansum wants to merge 66 commits intoIQSS:developfrom
DANS-KNAW:signposting
Closed

Signposting#7919
janvanmansum wants to merge 66 commits intoIQSS:developfrom
DANS-KNAW:signposting

Conversation

@janvanmansum
Copy link
Contributor

@janvanmansum janvanmansum commented Jun 3, 2021

Developers:
@vicding-mi
@ekoi

Context:
This Signposting feature was developed by Data Archiving and Networked Services (DANS-KNAW), the Netherlands. It is part of ODISSEI Portal work group and funded by ODISSEI, "Open Data Infrastructure for Social Science and Economic Innovations". ODISSEI has received funding from the Dutch Research Council (NWO).

What this PR does / why we need it:

This PR adds Signposting support to Dataverse 5.5.

Signposting is an approach to make the scholarly web more friendly to machines. It uses Typed Links as a means to clarify patterns that occur repeatedly in scholarly portals. For resources of any media type, these typed links are provided in HTTP Link headers. For HTML resources, they may additionally be provided in HTML link elements.

Which issue(s) this PR closes:
Closes #5962

Special notes for your reviewer:
TODO: when multi-license is merged, should adapt signposting as multi-license provides reliable way to fetch valid license URI which is not available in vanilla Dataverse yet.

Suggestions on how to test this:
cURL or Postman

To get level 1 profile

curl -I http://localhost:8080/dataset.xhtml\?persistentId\=doi:10.17026/dans-xdg-jtew

HTTP/1.1 200 OK
Server: Payara Server  5.2020.7 #badassfish
X-Powered-By: Servlet/4.0 JSP/2.3 (Payara Server  5.2020.7 #badassfish Java/Azul Systems, Inc./11)
Set-Cookie: JSESSIONID=b91e9e3a3cdbb7e583458a0b75d0; Path=/; HttpOnly
Link: 
<https://doi.org/10.17026/dans-xdg-jtew>;rel="cite-as", 
<https://doi.org/10.17026/dans-xdg-jtew>;rel="describedby";type="application/vnd.citationstyles.csl+json",
<http://localhost:8080/api/datasets/export?exporter=schema.org&persistentId=doi:10.17026/dans-xdg-jtew>;rel="describedby";type="application/json+ld", 
<https://schema.org/AboutPage>;rel="type",
<https://schema.org/Dataset>;rel="type", 
https://creativecommons.org/licenses/cc0/;rel="license", 
<http://localhost:8080/api/datasets/:persistentId/versions/1.0/linkset?persistentId=doi:10.17026/dans-xdg-jtew> ; rel="linkset";type="application/linkset+json"
Transfer-Encoding: chunked
Content-Type: text/html;charset=UTF-8
X-Frame-Options: SAMEORIGIN

To get level 2 linkset

Please note this line in level 1 profile, this give the link to linkset

<http://localhost:8080/api/datasets/:persistentId/versions/1.0/linkset?persistentId=doi:10.17026/dans-xdg-jtew> ; rel="linkset";type="application/linkset+json"

curl http://localhost:8080/api/datasets/:persistentId/versions/1.0/linkset\?persistentId\=doi:10.17026/dans-xdg-jtew

Here is a sample response in JSON

{
  "linkset": [
    {
      "anchor": "http://localhost:8080/dataset.xhtml?persistentId=doi:10.17026/dans-xdg-jtew",
      "cite-as": [
        {
          "href": "https://doi.org/10.17026/dans-xdg-jtew"
        }
      ],
      "type": [
        {
          "href": "https://schema.org/AboutPage"
        },
        {
          "href": "https://schema.org/Dataset"
        }
      ],
      "license": {
        "href": "https://creativecommons.org/licenses/cc0/"
      },
      "describedby": [
        {
          "href": "https://doi.org/10.17026/dans-xdg-jtew",
          "type": "application/vnd.citationstyles.csl+json"
        },
        {
          "href": "http://localhost:8080/api/datasets/export?exporter=schema.org&persistentId=doi:10.17026/dans-xdg-jtew",
          "type": "application/json+ld"
        }
      ],
      "item": []
    }
  ]
}

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

No

Is there a release notes update needed for this change?:

Additional documentation:

Please consult Signposting website

Vic Ding and others added 30 commits November 4, 2020 12:32
* uses external js
* adding signposting configuration setting
* signposting poc from eko
* error fix
* move the signposting location in code
* Signposting with MicroSettings cleanup v1 xhtml revert to vanilla
* clean up
* bug fix and add default file type; TODO: Bug on the baseurl (empty string)
* hide type when do not use default
* set default when no config is set for signposting
* modification according to reviews
* move long json string from code to bunddle
* allow empty config on the level 2 profile

Co-authored-by: Kevin Condon <kcondon@hmdc.harvard.edu>
Co-authored-by: ekoi <eko.indarto@dans.knaw.nl>
@pdurbin
Copy link
Member

pdurbin commented Jun 9, 2021

there seem to be additional changed files that have nothing to do with Signposting

@janvanmansum this has been resolved by your recent commits. Thanks!

Some docs would be appreciated. Maybe a new page in the Admin Guide?

Also, if you could add a release note, that would be great. Here's the process: https://guides.dataverse.org/en/5.5/developers/making-releases.html#write-release-notes

Thanks!

@coveralls
Copy link

coveralls commented Jun 9, 2021

Coverage Status

Coverage decreased (-0.06%) to 19.264% when pulling da8f52e on DANS-KNAW:signposting into b61b572 on IQSS:develop.

Vic Ding and others added 2 commits July 1, 2021 10:13
* The level 2 linkset can be fetched by visiting the dedicated linkset page for
that artifact. The link can be seen in level 1 links with key name `rel="linkset"`.

The configuration is stored as JSON string in the `Bundle.properties` file, key name is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see from code below that this is just a default setting and that there is SignpostingConf setting as well. Minimally, the docs should indicate that it is set-able without editing the Bundle. More deeply - I'm not sure the Bundle is the right place for a default, i.e. it isn't something that should be internationalized. If it is easier to get a default for the whole setting, should it go in the code instead? Or?

Comment on lines +561 to +562
if (signpostingConf.isEmpty()) return notFound("Configuration key for signposting is empty [SignpostingConf]");
if (dsv.getId() == null) return notFound("Dataset not found: Id is empty");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering about the error responses - not sure I would want users to know whether the config key exists. One option would just be to send no message or "Signposting not enabled". I also wonder if 501/not implemented or 409/conflict be better for the first?
For the second - I assume the getDatasetVersionOrDie (findDatasetOrDie) above will return before you could hit this?

<p:watermark for="inputText" value="#{dsfv.datasetField.datasetFieldType.watermark}"></p:watermark>

<p:inputTextarea value="#{dsfv.valueForEdit}" id="description" pt:aria-required="#{dsf.required}"
<p:inputTextarea value="#{dsfv.valueForEdit}" id="description" tabindex="#{block.index+1}"
Copy link
Member

@qqmyers qqmyers Jul 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are any of the differences in this file part of signposting?

this.systemConfig = systemConfig;
this.workingDatasetVersion = workingDatasetVersion;
if (jsonSetting == null) {
jsonSetting = BundleUtil.getStringFromBundle("signposting.configuration.SignpostingConf");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related to my comment above - would it be simpler to just have a static string for the default in this class? Or to use defaults for the individual entries below (e,g, as useDefaultFileType defaults to true below - actually several fields get defaults that way already).

));

String authorURL = "";
authorURL = getAuthorUrl(da);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that metadata fields such as the components of the author need to be escaped/sanitized before they are used in html - the DatasetVersion.getJsonLd() method in v5.6+ is a good example of how to get safe values for individual fields and how to sanitize the entire response you're creating. Happy to discuss further.


private String getItems(List<FileMetadata> fms) {
if (fms.size() > maxItems) {
logger.info(String.format("maxItem is %s and fms size is %s", maxItems, fms.size()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop to fine or remove before finalizing the PR...

logger.warning(String.format("Error getting storageID from file; original error message is: %s", e.getLocalizedMessage()));
}

if (storageIO instanceof SwiftAccessIO) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few questions about this:

  • Given that files may be restricted, that an admin could move where files are stored to a different bucket or store type, etc., does it make sense to provide direct URLs to the storage?
  • Is Swift the only store that needs special processing? (S3, remote stores that are coming ~soon?)
  • If needed, should this be optional? (Personally, I think I'd just send the Dataverse filedownload URL and let it handle access control and potential temporary/permanent redirects when that URL is accessed.)

String fileDownloadUrl;
SwiftAccessIO<DataFile> swiftIO = (SwiftAccessIO<DataFile>) storageIO;
try {
swiftIO.open();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not completely sure about swift, but other stores, once you do an open(), you have streams that have to be closed to avoid resource leaks.

* @param sr corresponding SignpostingResources
* @return json linkset
*/
public static JsonArrayBuilder jsonLinkset(SignpostingResources sr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? If you have an sr, can't you call sr.getJsonLinkset() yourself?

*/
public String getSignpostingLinkHeader() {
if (!workingVersion.isReleased())
return "DRAFT";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why DRAFT? is this expected with SignPosting? Or should the header not be included if the dataset is draft?

* @param ds the designated Dataset
* @return json linkset
*/
public static JsonObjectBuilder jsonLinkset(Dataset ds) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used? Also - since JsonPrinter is a singleton object, static doesn't change much.

.add("anchor", ds.getPersistentURL())
.add("cite-as", Json.createArrayBuilder().add(jsonObjectBuilder().add("href", ds.getPersistentURL())))
.add("type", Json.createArrayBuilder().add(jsonObjectBuilder().add("href", "https://schema.org/AboutPage")))
.add("author", ds.getPersistentURL())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

author is the persistentUrl?

@@ -1,4 +1,5 @@
dataverse.db.host=localhost
dataverse.db.host=postgres
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these aren't part of signposting!

Copy link
Member

@qqmyers qqmyers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't attempt to run this or dig too much into the specific contents, but left a few comments that I hope are helpful. I'm not sure if the plan is to wait for multiple licenses to get merged before this goes forward or not, but thought I'd at least give a quick readthrough now.

private String getAuthorUrl(DatasetAuthor da) {
String authorURL = "";
if (da.getIdValue() != null && !da.getIdValue().trim().isEmpty()) {
authorURL = da.getIdValue();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what you want? - any idValue gets sent back even if the getIdentifierAsUrl() would return a value? (Perhaps there shouldn't be an else in the next line?

@qqmyers qqmyers added the GDCC: DANS related to GDCC work for DANS label Sep 8, 2021
@qqmyers
Copy link
Member

qqmyers commented Feb 2, 2022

@vicding-mi - FYI the multi-license PR has just been merged, so if that was a blocker here, it's gone. I haven't looked back through to see if the other comments have been addressed, but if you can make any updates for those and for multi-license that you think are needed, please do and then flag me and/or mark this for review so we can look at it again and move it towards being merged.

@vicding-mi
Copy link
Contributor

Many thanks @qqmyers , I will recheck and proceed.

@pdurbin
Copy link
Member

pdurbin commented Apr 5, 2022

Closing in favor of this newer pull request:

@pdurbin pdurbin closed this Apr 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

GDCC: DANS related to GDCC work for DANS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Signposting support in Dataverse

10 participants