Conversation

@JakobMiesner JakobMiesner commented Sep 15, 2025

@JakobMiesner JakobMiesner force-pushed the feature/ils-library-kpis branch 3 times, most recently from 868574c to 9471bc3 Compare September 17, 2025 15:03
Comment on lines +18 to +20
2. Outsider/Stakeholder Dashboard
The audience of this dashboard is not the librarians, but rather the patrons and management.
It displays simpler KPIs that show if the library is working well, while being less detailed and technical than the internal dashboard.
Contributor

I am not sure which KPIs would fit the internal dashboard and which the external one. Was this voiced as a requirement?

Author

No, but since this RFC only covers the API endpoints, this does not affect its implementation.

- `after_record_insert`
- `after_record_update`
- `after_record_delete`
- aggregate:
Contributor

Could you clarify this part? For loans, we already have all loans indexed with their creation date, so why do we need to generate state events? Couldn't we just query the loan search and get the answer? Is it because we need it as a number per day?

Author

Because:

  1. With the way this will be implemented, we can easily track the creation, update, and deletion of all record types in ILS. While we could also search by the creation date of loans, this is not as easy for updates and deletions. If we also use this stat for loan creations, we stay consistent with how the stat is implemented for the other record types.
  2. Deleted records would no longer show up if we only looked at the creation date. While this is not a problem for loans, as they cannot be deleted, it is a problem for other record types (e.g. documents).
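
To illustrate the second point, here is a minimal sketch with made-up data. It uses plain dicts rather than the invenio-stats API; only the signal names come from the RFC, while the event layout, `pid_type` values, and helper names are hypothetical. Counting from an event log keeps deleted records visible, whereas counting by creation date over the records that still exist does not.

```python
from collections import Counter
from datetime import date

# Hypothetical event log: one entry per received signal (not real invenio-stats events).
events = [
    {"signal": "after_record_insert", "pid_type": "docid", "pid": "1", "day": date(2025, 9, 1)},
    {"signal": "after_record_insert", "pid_type": "docid", "pid": "2", "day": date(2025, 9, 1)},
    {"signal": "after_record_update", "pid_type": "docid", "pid": "1", "day": date(2025, 9, 2)},
    {"signal": "after_record_delete", "pid_type": "docid", "pid": "2", "day": date(2025, 9, 3)},
]


def daily_counts(events):
    """Count events per (day, record type, signal)."""
    return Counter((e["day"], e["pid_type"], e["signal"]) for e in events)


def surviving_records(events):
    """Replay the log and return the records that still exist, keyed to their creation day.

    This approximates what a search over the records index would see.
    """
    alive = {}
    for e in sorted(events, key=lambda e: e["day"]):
        key = (e["pid_type"], e["pid"])
        if e["signal"] == "after_record_insert":
            alive[key] = e["day"]
        elif e["signal"] == "after_record_delete":
            alive.pop(key, None)
    return alive


first_day = date(2025, 9, 1)
inserts_from_events = daily_counts(events)[(first_day, "docid", "after_record_insert")]
inserts_from_search = sum(1 for day in surviving_records(events).values() if day == first_day)

print(inserts_from_events, inserts_from_search)
```

In this toy data set the event log reports two insertions on the first day, while the surviving records only account for one, because the second document was deleted later.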


### The specific KPIs are implemented as follows
1. Turnover rate of the Library collection:
1. number of new loans / number of loanable items
Contributor

It would be great to note somewhere what outcome this is expected to measure. As in: why do we divide by the number of loanable items?

Author

@JakobMiesner JakobMiesner Sep 23, 2025

This KPI measures the rate of use of the collection.
It is described in ISO 11620:2023 A.2.1.1. I will also mention the ISO standard in the RFC.
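
As a worked example of the ratio, here is a small sketch with made-up numbers. The function name and inputs are hypothetical; how the number of loanable items is determined for a given period is not specified here and is treated as a given input.

```python
def turnover_rate(new_loans: int, loanable_items: int) -> float:
    """Return new loans per loanable item for one reporting period."""
    if loanable_items <= 0:
        raise ValueError("loanable_items must be positive")
    return new_loans / loanable_items


# Made-up numbers: 1200 new loans in the period, 4800 loanable items.
rate = turnover_rate(new_loans=1200, loanable_items=4800)
print(rate)  # 0.25, i.e. on average each loanable item was loaned 0.25 times
```

Dividing by the number of loanable items normalizes the loan count by collection size, so collections of different sizes can be compared.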

Contributor

Please note down what we decided for the implementation of the number of loanable items per unit of time.

- aggregate:
- count
- daily
- over composite field `loan_creation_method__document_availability_during_loan_creation`:
Contributor

Let's go through this section together IRL.

- we also add information about the provider to the event for interlibrary loans, so future aggregations can differentiate the waiting time based on the provider


4. Number of changes to the Library collections:
Contributor

I believe this section is missing the curator's ID, which should also be stored: we are frequently asked for stats for an individual curator.

Author

I will ask whether this is wanted.

## Drawbacks

### Periodic Stats
By implementing periodic stats as events, it is easy to run into situations where invenio-stats only ever aggregates one document per time period.
Contributor

not sure I understand this part, let's chat

So a loan would also contain the field `waiting_time`.
This would allow querying the records indices directly for the KPIs, without the need for an additional stats index.
However, this would introduce many fields into the records indices that are only used for KPIs.
Additionally, the dashboard would need to perform a lot of queries and aggregations, which might overload the search system.
Contributor

The stats will also be queried via search. Can you explain how this is different?

Author

The stats served by invenio-stats are already partially aggregated (e.g. on a daily basis).
If we use invenio-stats, a request for the number of new loans in a month leads to an aggregation that has to take a maximum of 31 documents into account.
Without invenio-stats, the query triggers an aggregation over all new loans in those 31 days, which might be a large number of documents.

I updated the RFC to better describe this.
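
A minimal sketch of the difference, using randomly generated loan creation dates (plain Python, not the invenio-stats or search API; all names are hypothetical). The monthly total computed from at most 31 pre-aggregated daily buckets equals the total computed by scanning every raw loan document.

```python
import random
from collections import Counter
from datetime import date, timedelta

random.seed(0)  # deterministic made-up data

# 10,000 hypothetical loan creation dates spread over a 31-day month.
month_start = date(2025, 8, 1)
raw_loan_dates = [month_start + timedelta(days=random.randrange(31)) for _ in range(10_000)]

# Pre-aggregation step: one bucket per day, as a daily aggregation would store it.
daily_buckets = Counter(raw_loan_dates)

# Monthly request served from the daily buckets: touches at most 31 entries.
monthly_from_daily = sum(daily_buckets.values())

# Monthly request served from the raw documents: touches every loan.
monthly_from_raw = len(raw_loan_dates)

print(len(daily_buckets), monthly_from_daily, monthly_from_raw)
```

The result is identical, but the first variant reads a number of entries bounded by the days in the month, independent of how many loans were created.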


### KPIs

#### KPI 3.1 - Extracting for loan creation method
Contributor

Let's discuss this part.


Alternatively, we could just listen to the signal `after_record_insert` from `invenio_records`, filter for loans and only during event generation or preprocessing extract the creation method. (Unsure if possible)

#### Median vs. Average
Contributor

Was this discussed with the librarians? Is it possible to easily have both, or to leave it up to the dashboard "client"?

Author

Average was the one requested in the ticket, but we could also add both. This also depends on this discussion.


#### Aggregation period
We aggregate most stats on a daily basis.
The exceptions are the loan durations and waiting times, which are aggregated monthly.
Contributor

What is the motivation behind monthly aggregation? What do we gain or lose by aggregating daily too?

Author

While it does not really matter for the average, it matters for the median (discussed here).
For the average, we offer two separate queries that allow the dashboard to compute the average x/y:

  • sum of the metric in the index: x
  • count of documents in the index: y

We could aggregate both numbers daily, and the dashboard could still display the average for the whole month by computing $\frac{x_1 + x_2 + \dots + x_{31}}{y_1 + y_2 + \dots + y_{31}}$.
For the median, however, we have to decide the granularity during the aggregation, and daily does not really make sense here.
As we might want to add the median, and the StatAggregator in invenio-stats currently allows only one granularity per aggregation, I decided to go for monthly.
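
A small sketch of why the two behave differently, with made-up waiting times in days (plain Python; the data and variable names are hypothetical). Daily sums and counts can be recombined into the exact monthly average, but daily medians cannot be recombined into the monthly median.

```python
from statistics import median

# Hypothetical waiting times (in days), grouped by the day the loan was created.
daily_values = {
    "2025-09-01": [1, 1, 1, 1, 30],
    "2025-09-02": [10],
    "2025-09-03": [2, 40],
}

# Daily aggregates as the two queries would return them: x_i (sum) and y_i (count).
daily_sum = {day: sum(values) for day, values in daily_values.items()}
daily_count = {day: len(values) for day, values in daily_values.items()}

# The average recombines exactly: (x_1 + ... + x_n) / (y_1 + ... + y_n).
average_from_daily = sum(daily_sum.values()) / sum(daily_count.values())

all_values = [v for values in daily_values.values() for v in values]
average_direct = sum(all_values) / len(all_values)

# The median does not recombine: the median of daily medians differs from the true median.
true_median = median(all_values)
median_of_daily_medians = median(median(values) for values in daily_values.values())

print(average_from_daily, average_direct)    # 10.75 10.75
print(true_median, median_of_daily_medians)  # 1.5 10
```

This is why the granularity only has to be fixed at aggregation time for the median, while the average can be derived at any coarser granularity from daily sums and counts.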

@JakobMiesner JakobMiesner force-pushed the feature/ils-library-kpis branch from 9471bc3 to dd9db5a Compare September 23, 2025 12:22

By the design of `invenio-stats`, all stats are aggregated.
Currently, this aggregation is always done over a certain `field` (a field to group the documents in the events index by).
Some of our KPIs do not have such a `field`, as all documents should be grouped together.
Contributor

Can you please explain more clearly what the expected change is? It is unclear what "all documents should be grouped together" means in practice.

Author

Please see the subsection "Global Aggregation - global-aggregation" and the attached PR.

@JakobMiesner JakobMiesner force-pushed the feature/ils-library-kpis branch from c16a932 to c1ba1eb Compare September 23, 2025 17:12
3 participants