@lfoppiano
Collaborator

This PR adds basic bibliographic data to the response:

{
    "application": "software-mentions",
    "version": "0.8.2-SNAPSHOT",
    "revision": "b2c5e558",
    "date": "2025-10-30T07:57+0000",
    "md5": "7585CD828983A1205E684C8DA31E622C",
    "biblio": {
        "doi": "10.1080/27658511.2023.2299549",
        "title": "Exploring loss and damage from climate change and global perspectives that influence response mechanism in vulnerable communities",
        "authors": "Cyril Joseph Effiong , Jamila Musa Wakawa Zanna, David Hannah and Fraser Sugden Cyril Joseph Effiong Cyril Joseph Effiong"
    },
    "pages": [

The author list still needs to be cleaned up, probably by applying some additional processing, as is done in Grobid.

The same applies to extraction from TEI or JATS.
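To make the author-list problem concrete, here is a minimal sketch of one possible clean-up heuristic. It is not Grobid's processing and not what this PR implements; the function names `split_authors` and `strip_repeated_names` are hypothetical. It splits the raw string on commas and the word "and", then trims any already-seen full name that was glued onto the end of a later entry.

```python
import re


def split_authors(raw: str) -> list[str]:
    """Split a raw author string on commas and the standalone word 'and'."""
    parts = re.split(r",|\band\b", raw)
    return [p.strip() for p in parts if p.strip()]


def strip_repeated_names(names: list[str]) -> list[str]:
    """Drop duplicates and trim already-seen names appended to a later entry.

    Heuristic only (an assumption, not Grobid's method): it would wrongly
    trim a genuine name that happens to end with another author's full name.
    """
    seen: list[str] = []
    for name in names:
        changed = True
        while changed:
            changed = False
            for known in seen:
                if name.endswith(" " + known):
                    # Cut the repeated full name off the end of this entry.
                    name = name[: -len(known)].strip()
                    changed = True
        if name and name not in seen:
            seen.append(name)
    return seen


# Author string copied from the response above.
raw_authors = (
    "Cyril Joseph Effiong , Jamila Musa Wakawa Zanna, David Hannah and "
    "Fraser Sugden Cyril Joseph Effiong Cyril Joseph Effiong"
)
cleaned = strip_repeated_names(split_authors(raw_authors))
print(cleaned)
```

On the string above this yields four distinct names, with the trailing repetitions of the first author removed from the last entry.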

@jameshowison
Contributor

This is cool! So now we can extract everything needed to link papers and mentions from the response file?

@lfoppiano
Collaborator Author

Yes! I need to fix the author list, but it's nearly ready.

lfoppiano marked this pull request as ready for review on November 26, 2025, 11:27
lfoppiano marked this pull request as draft on November 26, 2025, 12:41
lfoppiano marked this pull request as ready for review on November 26, 2025, 14:03
@lfoppiano
Collaborator Author

lfoppiano commented Nov 26, 2025

@jameshowison I believe this is now complete (it would be better if you could test it or ask someone to double-check). We now get the de-duplicated articles from PDF (through Grobid header processing) and the authors from TEI.

Example:

{
	"application": "software-mentions",
	"version": "0.8.2-SNAPSHOT",
	"revision": "b2c5e558",
	"date": "2025-11-26T14:00+0000",
	"md5": "18148FC55E83CD12862CC12341A0D134",
	"biblio": {
		"doi": "10.1111/nph.19682",
		"title": "Dynamics and drivers of mycorrhizal fungi after glacier retreat",
		"authors": "Alexis Carteron, Isabel Cantera, Alessia Guerrieri, Silvio Marta, Aurélie Bonin, Roberto Ambrosini, Fabien Anthelme, Sergio Azzoni, Peter Almond, Pablo Alviz Gazitúa, Sophie Cauvy-Fraunié, Jorge Luis, Ceballos Lievano, Pritam Chand, Chand Sharma, J Clague, Justiniano Alejo, Cochachín Rapre, Chiara Compostella, Cruz Encarnación, Olivier Dangles, Andre Eger, Sergey Erokhin, Andrea Franzetti, Ludovic Gielly, Fabrizio Gili, Mauro Gobbi, Sigmund Hågvar, Norine Khedim, Isela Meneses, Gwendolyn Peyre, Francesca Pittino, Antoine Rabatel, Nurai Urseitova, Yan Yang, Vitalii Zaginaev, Andrea Zerboni, Anaïs Zimmer, Pierre Taberlet, Adele Diolaiuti, Jerome Poulenard, Wilfried Thuiller, Marco Caccianiga, Francesco Gentile, Ficetola"
	},
	"mentions": [
		{
			"type": "software",
			"software-type": "software",
			"software-name": {
				"rawForm": "OBITools",
				"normalizedForm": "OBITools",
				"offsetStart": 78,
				"offsetEnd": 86
[...]
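As a rough illustration of linking papers and mentions from a single response file, the sketch below pairs each mention with the paper's DOI and title. It assumes the response has the top-level `biblio` and `mentions` keys and the `software-name.normalizedForm` field shown in the example above; the function name `link_mentions_to_paper` and the trimmed `sample` payload are hypothetical and only reuse values from that example.

```python
import json


def link_mentions_to_paper(response: dict) -> list[dict]:
    """Return one row per mention, tagged with the paper's DOI and title."""
    biblio = response.get("biblio", {})
    rows = []
    for mention in response.get("mentions", []):
        # "software-name" may be missing on some entries, so default to {}.
        name = mention.get("software-name", {})
        rows.append(
            {
                "doi": biblio.get("doi"),
                "title": biblio.get("title"),
                "software": name.get("normalizedForm"),
            }
        )
    return rows


# Trimmed-down response that reuses only fields from the example above.
sample = json.loads(
    """
    {
        "application": "software-mentions",
        "biblio": {
            "doi": "10.1111/nph.19682",
            "title": "Dynamics and drivers of mycorrhizal fungi after glacier retreat"
        },
        "mentions": [
            {
                "type": "software",
                "software-name": {
                    "rawForm": "OBITools",
                    "normalizedForm": "OBITools",
                    "offsetStart": 78,
                    "offsetEnd": 86
                }
            }
        ]
    }
    """
)

for row in link_mentions_to_paper(sample):
    print(row["doi"], "->", row["software"])
```

Using `.get` with defaults keeps the sketch tolerant of responses where `biblio` or `mentions` is absent, which may happen when no bibliographic data could be extracted.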
