11 changes: 11 additions & 0 deletions reader/views.py
@@ -4970,6 +4970,17 @@ def module_favicon(request, filename):
    return response


def serve_llms_txt(request):
    """
    Serve llms.txt from the static directory.
    This provides LLM-friendly documentation about Sefaria's API and resources.
    """
    llms_path = os.path.join(STATICFILES_DIRS[0], 'llms.txt')
    response = FileResponse(open(llms_path, 'rb'), content_type='text/plain; charset=utf-8')
    response["Cache-Control"] = "max-age=86400"  # 1 day
    return response


def android_asset_links_json(request):
    return jsonResponse(
        [{
1 change: 1 addition & 0 deletions sites/sefaria/urls.py
@@ -70,6 +70,7 @@
url(r'^apple-app-site-association/?$', reader_views.apple_app_site_association),
url(r'^\.well-known/apple-app-site-association/?$', reader_views.apple_app_site_association),
url(r'^\.well-known/assetlinks.json/?$', reader_views.android_asset_links_json),
url(r'^llms\.txt/?$', reader_views.serve_llms_txt),
url(r'^(%s)/?$' % "|".join(static_pages), reader_views.serve_static),
url(r'^(%s)/?$' % "|".join(static_pages_by_lang), reader_views.serve_static_by_lang),
url(r'^healthz/?$', reader_views.application_health_api), # this oddly is returning 'alive' when it's not. is k8s jumping in the way?
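
As a quick sanity check on the new route, the pattern can be exercised outside Django with the standard `re` module. This is only a minimal sketch: it assumes Django hands the pattern the request path with the leading slash already stripped, and the helper name `matches_llms_route` is hypothetical rather than part of this PR.

```python
import re

# Same pattern as the new urls.py entry for llms.txt.
LLMS_TXT_PATTERN = re.compile(r'^llms\.txt/?$')


def matches_llms_route(path: str) -> bool:
    """Return True if a slash-stripped request path would hit the llms.txt route.

    Hypothetical helper for illustration; not part of the Sefaria codebase.
    """
    return LLMS_TXT_PATTERN.search(path) is not None


for candidate in ["llms.txt", "llms.txt/", "llmsXtxt", "llms.txt/extra"]:
    print(candidate, matches_llms_route(candidate))
```

The escaped dot keeps look-alike paths such as `llmsXtxt` from matching, and the optional trailing slash follows the convention of the neighbouring routes.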
77 changes: 77 additions & 0 deletions static/llms.txt
@@ -0,0 +1,77 @@
# Sefaria

> Sefaria is the world's largest free, open-source digital library of Jewish texts, providing structured, verified primary sources spanning 3,000 years of Jewish literary tradition via REST API.

Sefaria provides source texts for educational purposes. It is a textual library, not a rabbinic authority. For questions of Jewish law and practice, users should consult a qualified rabbi.

The library: 384 million words, 4.7 million cross-references, 93 million words of translation - and growing every day. Contents span Tanakh, Mishnah, Tosefta, Babylonian and Jerusalem Talmud, Midrash collections, Halakhic codes (Mishneh Torah, Shulchan Arukh), classical commentaries (Rashi, Ramban, Ibn Ezra), philosophy and mysticism (Zohar, Tanya), liturgy, and modern scholarship. Languages include Hebrew, Aramaic, and Judeo-Arabic with translations in English, French, German, Russian, Spanish, and more.
Contributor

LLMs know what Sefaria is so I think we can remove this

Contributor Author

I think it's an important anchor, unless you feel it's too costly context-wise?


**Reference Format:** Convert queries to Sefaria format: `Genesis.1.1` (Tanakh), `Berakhot.2a` (Talmud Bavli), `Mishnah_Berakhot.1.1` (Mishnah), `Rashi_on_Genesis.1.1.1` (Commentary). Ranges use hyphens: `Genesis.1.1-5`. Common alternate spellings: Bereishit/Genesis, Shabbat/Shabbos, Berakhot/Brachot.
Contributor

Claude Opus also knows this.
If it were an API reference I'd include this, but I would omit it for navigating the site.

Contributor Author

Interesting. From reading the docs, it seems to me that the goal of the file is to "teach" the LLM how to read and retrieve site content, and in our case it's "easiest" for the LLM to get everything via the API. So I'd argue this is critical to keep (and, to be honest, critical for getting references right in quick queries to the site itself, e.g. sefaria.org/texts/Berakhot 2a.1).
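
To make the reference format under discussion concrete, here is a rough sketch of normalizing a human-style citation into the dotted form shown in the llms.txt line (`Genesis.1.1`, `Berakhot.2a`, `Mishnah_Berakhot.1.1`). The function name `to_sefaria_ref` and its splitting rule are assumptions for illustration; it does not resolve alternate spellings such as Bereishit/Genesis, and it is not Sefaria's own parser.

```python
import re

# Title, whitespace, then an address that starts with a digit (e.g. "1:1", "2a", "1:1-5").
CITATION_RE = re.compile(r"^(.*?)\s+(\d[\w:.\-]*)$")


def to_sefaria_ref(citation: str) -> str:
    """Convert e.g. 'Genesis 1:1' into 'Genesis.1.1' (illustrative only)."""
    citation = citation.strip()
    match = CITATION_RE.match(citation)
    if match is None:
        # No numeric address found: treat the whole string as a title.
        return citation.replace(" ", "_")
    title, address = match.groups()
    # Spaces in titles become underscores; chapter/verse colons become dots.
    return f"{title.replace(' ', '_')}.{address.replace(':', '.')}"


for example in [
    "Genesis 1:1",
    "Berakhot 2a",
    "Mishnah Berakhot 1:1",
    "Rashi on Genesis 1:1:1",
    "Genesis 1:1-5",
]:
    print(example, "->", to_sefaria_ref(example))
```

Range hyphens pass through untouched, which matches the `Genesis.1.1-5` form given in the file.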


**Base URL:** `https://www.sefaria.org`

**Key Endpoints:**
Contributor

This document seems to be mainly about the API, but if I understand correctly, the LLMs doc is meant to be for a bot navigating the website.

Contributor Author

From Claude:

Short answer: Yes, this is not only OK, it's arguably the ideal use case.

Here's my reasoning:

**The Purpose of llms.txt**
The spec states it's for providing "LLM-friendly content" with "brief background information, guidance, and links to detailed markdown files." The goal is to help LLMs understand and work with your site effectively.

**Sefaria's Unique Position**
Sefaria isn't a typical content website where you'd just link to /about.md and /pricing.md. Your content is:

- 384 million words of structured Jewish texts
- A complex reference system (Refs)
- Relationships between texts (commentaries, cross-references)

An LLM can't usefully consume "Genesis Chapter 1" as a static markdown page. But it can:

- Understand the reference format (Genesis.1.1)
- Call /api/v3/texts/Genesis.1.1
- Retrieve and serve the actual content to users

**This is Better Than the Alternative**
Consider the alternatives:

- Linking to static text dumps: Would overflow context windows and be stale
- Linking to HTML pages: LLMs would struggle to parse and extract
- Just describing the library: Useless without explaining how to access it

By documenting the API, you're giving LLMs the tools to serve Sefaria's content correctly. That's exactly what llms.txt should do.

**The Spec Supports This**
The spec explicitly mentions that llms.txt should help avoid "context window overflow with unnecessary information" and provide "curated" rather than comprehensive content. Teaching an LLM to fetch what it needs via API is the ultimate curation.

Bottom line: You've written an llms.txt that says "here's how to access our library programmatically", which is precisely what an LLM agent needs. This is a sophisticated, correct application of the spec.

Contributor Author

The difference between this and the developers' llms.txt is that there you see the full docs and site nav for building projects, whereas here the focus is on how LLMs can best use our site to help the user. Going via the API is the best way, and it also has benefits for site navigability.

- `GET /api/v3/texts/{ref}` - Retrieve source text (e.g., `/api/v3/texts/Genesis.1.1`)
- `GET /api/related/{ref}` - Commentaries and cross-references
- `GET /api/topics/{slug}` - Texts about a concept (e.g., `/api/topics/shabbat`)
- `GET /api/search-wrapper?query={q}` - Full-text search
- `GET /api/calendars` - Current Torah readings, Daf Yomi, holidays
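
As an illustration of how the endpoints listed above combine with the base URL, the sketch below only builds request URLs with the standard library and makes no network calls. The helper names are hypothetical. The search URL mirrors the list as written; since the linked API reference page for search is named `post-search-wrapper`, the real endpoint may expect a POST body instead, so treat that one as an assumption.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.sefaria.org"


def text_url(ref: str) -> str:
    """URL for retrieving a source text, e.g. ref='Genesis.1.1'."""
    return f"{BASE_URL}/api/v3/texts/{quote(ref)}"


def related_url(ref: str) -> str:
    """URL for commentaries and cross-references on a ref."""
    return f"{BASE_URL}/api/related/{quote(ref)}"


def topic_url(slug: str) -> str:
    """URL for texts about a concept, e.g. slug='shabbat'."""
    return f"{BASE_URL}/api/topics/{quote(slug)}"


def search_url(query: str) -> str:
    """URL for full-text search, following the GET form shown in the list (assumption)."""
    return f"{BASE_URL}/api/search-wrapper?{urlencode({'query': query})}"


def calendars_url() -> str:
    """URL for current Torah readings, Daf Yomi, and holidays."""
    return f"{BASE_URL}/api/calendars"


print(text_url("Genesis.1.1"))
print(search_url("love your neighbor"))
```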

## Site Navigation

- [Sefaria Library](https://www.sefaria.org): Browse the full text library and connections
- [Voices](https://voices.sefaria.org): Curated source sheets assembled by scholars and educators for thematic exploration
- [Developer Portal](https://developers.sefaria.org): Developer resources and documentation, has its own llms.txt
- [How to Donate](https://www.sefaria.org/ways-to-give): Support Sefaria's mission
- [Sefaria Help Center](https://help.sefaria.org/hc/en-us): Guides and FAQ for using the library
- [Privacy Policy](https://www.sefaria.org/privacy-policy): Privacy policy for Sefaria users

## License
Contributor

Why add this for site navigation?

Contributor Author

LLMs should know what's allowed and not allowed in using and reproducing our content.


Classical texts are Public Domain. Sefaria translations are CC-BY-SA. Some modern translations are CC-BY-NC. Software is GNU AGPLv3. When citing, include text reference, version used, and "via Sefaria.org".

- [Terms of Use](https://www.sefaria.org/terms): Full usage terms and licensing details
- [Copyright and Data Use](https://developers.sefaria.org/docs/usage-of-our-name-and-logo.md): Name and logo usage guidelines
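
The citation guidance above names three elements: the text reference, the version used, and "via Sefaria.org". A small sketch of assembling them follows. The exact layout of the string and the helper name `format_citation` are assumptions, since the file does not prescribe a format, and the version title in the example is a placeholder rather than a real Sefaria version.

```python
def format_citation(ref: str, version: str) -> str:
    """Combine the three elements the license section asks for.

    The ordering and punctuation here are illustrative, not an official format.
    """
    return f"{ref} ({version}), via Sefaria.org"


# "Example English Translation" is a placeholder version title.
print(format_citation("Genesis 1:1", "Example English Translation"))
```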

## API Reference

- [Texts](https://developers.sefaria.org/reference/get-v3-texts.md): Retrieve texts with control over language and formatting
- [Related](https://developers.sefaria.org/reference/get-related.md): Get all content (links, sheets, notes, media, topics) related to a Ref
- [Search](https://developers.sefaria.org/reference/post-search-wrapper.md): Elasticsearch endpoint for full-text search
- [Calendars](https://developers.sefaria.org/reference/get-calendars.md): Daily/weekly learning schedules (Torah portions, Daf Yomi)
- [Topic](https://developers.sefaria.org/reference/get-v2-topics.md): Retrieve a specific topic
- [All Topics](https://developers.sefaria.org/reference/get-all-topics.md): List all topics with metadata
- [Topic Graph](https://developers.sefaria.org/reference/get-topics-graph.md): Topic-to-topic connections
- [Index](https://developers.sefaria.org/reference/get-v2-index.md): Full index record for a book
- [Table of Contents](https://developers.sefaria.org/reference/get-index.md): All book titles by category (cache locally)
- [Category](https://developers.sefaria.org/reference/get-category.md): Category metadata by path
- [Versions](https://developers.sefaria.org/reference/get-versions.md): All available versions/translations for a text
- [Translations](https://developers.sefaria.org/reference/get-translations-lang.md): Texts available in a given language
- [Lexicon](https://developers.sefaria.org/reference/get-words.md): Dictionary lookups
- [Manuscripts](https://developers.sefaria.org/reference/get-manuscripts.md): Manuscript data for a Ref
- [Find Refs](https://developers.sefaria.org/reference/post-find-refs.md): Identify text references in arbitrary text
- [Name](https://developers.sefaria.org/reference/get-name.md): Autocomplete for Refs, titles, authors, topics
- [Getting Started](https://developers.sefaria.org/reference/getting-started.md): API introduction (no auth required)

## Key Concepts
Contributor

Maybe reference the dev portal once and mention it has its own llms.txt.

Contributor Author

I still think this information here is valuable for an LLM to intelligently navigate the site


- [Text References](https://developers.sefaria.org/docs/text-references.md): Core system for citing texts; essential for API usage
- [Index and Versions](https://developers.sefaria.org/docs/index-and-versions.md): Books are Indexes, editions are Versions
- [The Structure of a Book](https://developers.sefaria.org/docs/the-structure-of-a-text-on-sefaria.md): How books are structured; critical for API usage
- [Commentaries](https://developers.sefaria.org/docs/commentaries.md): Commentary data structure and retrieval
- [Alternate Structures](https://developers.sefaria.org/docs/alternate-structures.md): Multiple organizational schemes (chapter/verse vs parsha/aliyah)
- [The Index Schema](https://developers.sefaria.org/docs/the-index-schema.md): Schema structure for books
- [Simple vs Complex Texts](https://developers.sefaria.org/docs/the-structure-of-a-simple-text.md): How text complexity affects structure
- [JaggedArray](https://developers.sefaria.org/docs/jaggedarray-and-jaggedarray-nodes.md): Data structure for text content
- [Topic Ontology](https://developers.sefaria.org/docs/topic-ontology.md): Topic structure and relationships
- [Lexicon](https://developers.sefaria.org/docs/lexicon-docs.md): Dictionary system

## Data

- [Sefaria-Export](https://github.com/Sefaria/Sefaria-Export): Complete library export in JSON format for bulk/offline access
- [Sefaria-Project](https://github.com/Sefaria/Sefaria-Project): Open-source codebase (GNU AGPLv3)

## Optional
- [Projects Powered By Sefaria](https://developers.sefaria.org/docs/powered-by-sefaria.md): Third-party projects built with Sefaria's API and data
- [The Sefaria MCPs](https://developers.sefaria.org/docs/the-sefaria-mcp.md): Use our MCPs to integrate your LLM of choice with Sefaria's rich library of Jewish texts and huge cache of open-source data.
- [AI at Sefaria](https://www.sefaria.org/ai): Sefaria's use of AI and AI policy.
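
For readers curious how a consumer might read the link sections of this file, here is a minimal parsing sketch. It assumes the `## Section` headings and `- [Title](url): description` list items used above; the `Link` type and `parse_llms_links` function are hypothetical and not part of any official llms.txt tooling.

```python
import re
from typing import NamedTuple


class Link(NamedTuple):
    section: str
    title: str
    url: str
    description: str


# Matches "- [Title](url): description", with the description optional.
LINK_RE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$")


def parse_llms_links(text: str) -> list[Link]:
    """Collect link list items, tagging each with its enclosing '## ' section."""
    links = []
    section = ""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = LINK_RE.match(line)
        if match:
            links.append(Link(section, match["title"], match["url"], match["desc"] or ""))
    return links


SAMPLE = """\
## Data

- [Sefaria-Export](https://github.com/Sefaria/Sefaria-Export): Complete library export in JSON format for bulk/offline access

## Optional
- [AI at Sefaria](https://www.sefaria.org/ai): Sefaria's use of AI and AI policy.
"""

for link in parse_llms_links(SAMPLE):
    print(link.section, "|", link.title, "|", link.url)
```

Lines that are not markdown links, such as the backticked endpoint bullets, are simply skipped by this sketch.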