Automated Screenplay Annotation for Extracting Storytelling Knowledge
ScreenPy is a Python package for parsing and analyzing screenplays to extract structured narrative elements and storytelling patterns. Based on research presented at the Intelligent Narrative Technologies Workshop, it provides tools for automated screenplay annotation and knowledge extraction.
- Screenplay Parsing: Parse raw screenplays into structured elements
- Shot Heading Analysis: Extract location, shot type, subject, and time information
- Dialogue Extraction: Identify speakers and their dialogue with parentheticals
- Stage Direction Processing: Parse action descriptions and stage directions
- Verb Sense Disambiguation: Map actions to FrameNet frames and WordNet synsets
- Hierarchical Structure: Maintain scene and sub-scene relationships
- JSON Export: Export parsed screenplays in machine-readable format
This project implements the methodology described in:
"Automated Screenplay Annotation for Extracting Storytelling Knowledge" David R. Winer and R. Michael Young Intelligent Narrative Technologies Workshop (INT17), 2017
- Grammar-based parsing of shot headings following industry standards
- Hierarchical segmentation of screenplay structure
- Verb sense disambiguation for action extraction
- Large-scale corpus analysis of 1000+ IMSDb screenplays
```bash
# Clone the repository
git clone https://github.com/drwiner/ScreenPy.git
cd ScreenPy
# Install package
pip install -e .
# Or install with development dependencies
pip install -e ".[dev]"
# For NLP features
pip install -e ".[nlp]"from screenpy import ScreenplayParser
# Initialize parser
parser = ScreenplayParser()
# Parse a screenplay file
screenplay = parser.parse_file("path/to/screenplay.txt")
# Access structured elements
for segment in screenplay.master_segments:
    print(f"Scene: {segment.heading.raw_text}")
    if segment.heading.location_type:
        print(f" Location: {' - '.join(segment.heading.locations)}")
    if segment.heading.time_of_day:
        print(f" Time: {segment.heading.time_of_day}")
# Export to JSON
screenplay_json = screenplay.to_json()
```
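The serialized output can then go straight to disk for downstream tools. This is a minimal sketch continuing the session above; it assumes `to_json()` returns a JSON-formatted string (if your version returns a dict, use `json.dump` instead):

```python
from pathlib import Path

# Assumption: to_json() returns a JSON string rather than a dict.
Path("screenplay.json").write_text(screenplay.to_json(), encoding="utf-8")
```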
```bash
# Parse a single screenplay
screenpy parse screenplay.txt -o output.json
# Batch process screenplays
screenpy batch data/screenplays/ -o data/outputs/
# Extract verb senses with VSD
screenpy vsd screenplay.txt --frames --synsets
# Generate statistics
screenpy stats data/outputs/ -o stats.csv
```

```text
ScreenPy/
├── src/screenpy/          # Main package code
│   ├── parser/            # Parsing modules
│   │   ├── grammar.py     # Shot heading grammar
│   │   ├── segmenter.py   # Screenplay segmentation
│   │   └── elements.py    # Element extraction
│   ├── vsd/               # Verb Sense Disambiguation
│   │   ├── frames.py      # FrameNet integration
│   │   ├── synsets.py     # WordNet integration
│   │   └── clausie.py     # Clause extraction
│   ├── models.py          # Data models (Pydantic)
│   ├── utils.py           # Utilities
│   └── cli.py             # Command-line interface
├── tests/                 # Test suite
├── data/                  # Data files
│   ├── screenplays/       # Raw screenplay files
│   └── outputs/           # Parsed outputs
├── docs/                  # Documentation
└── examples/              # Example scripts
```
Shot headings follow a standardized grammar:
```text
INT. LOCATION - SHOT_TYPE - SUBJECT - TIME_OF_DAY
```
Examples:
```text
INT. CENTRAL PARK - DAY
EXT. WHITE HOUSE - SOUTH LAWN - CLOSE ON CNN CORRESPONDENT - SUNSET
WIDE SHOT - RACETRACK AND EMPTY STANDS
```
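To make the field layout concrete, here is an illustrative regex-based splitter for headings of this shape. It is only a sketch of the grammar's behavior, not the package's actual grammar.py; the `split_heading` helper and its return keys are hypothetical:

```python
import re

# Sketch only: a simplified take on the heading layout shown above.
# The real grammar module covers many more shot types and edge cases.
PREFIX_RE = re.compile(r"^(?P<prefix>INT\./EXT\.|INT\.|EXT\.)?\s*(?P<body>.+)$")
TIMES = {"DAY", "NIGHT", "MORNING", "EVENING", "SUNSET", "SUNRISE",
         "DUSK", "DAWN", "CONTINUOUS", "LATER"}

def split_heading(raw: str) -> dict:
    """Split a heading into an INT/EXT prefix, hyphen-delimited fields, and a time of day."""
    match = PREFIX_RE.match(raw.strip())
    fields = [field.strip() for field in match.group("body").split(" - ")]
    time_of_day = fields.pop() if fields and fields[-1].upper() in TIMES else None
    return {"prefix": match.group("prefix"), "fields": fields, "time_of_day": time_of_day}

print(split_heading("EXT. WHITE HOUSE - SOUTH LAWN - CLOSE ON CNN CORRESPONDENT - SUNSET"))
# {'prefix': 'EXT.', 'fields': ['WHITE HOUSE', 'SOUTH LAWN', 'CLOSE ON CNN CORRESPONDENT'],
#  'time_of_day': 'SUNSET'}
```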
| Element | Description | Example |
|---|---|---|
| Master Headings | Scene beginnings with INT/EXT | INT. OFFICE - DAY |
| Shot Types | Camera shot specifications | CLOSE, WIDE, TRACKING |
| Stage Direction | Action descriptions | John enters the room. |
| Dialogue | Character speech | JOHN: Hello there! |
| Transitions | Scene changes | CUT TO:, FADE OUT |
| In-line Caps | Emphasized elements | Sound effects, character intros |
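For a feel of how these element types differ on the page, a toy line classifier along these lines already separates most of them. This is purely a heuristic illustration; the hypothetical `classify_line` below is not the package's segmenter, which is grammar-based and far more careful:

```python
import re

def classify_line(line: str) -> str:
    """Very rough heuristics for the element types listed in the table above."""
    stripped = line.strip()
    if re.match(r"^(INT\.|EXT\.|INT\./EXT\.)\s", stripped):
        return "master heading"
    if re.match(r"^(CUT TO:|FADE IN:|FADE OUT|DISSOLVE TO:)", stripped):
        return "transition"
    if stripped.isupper() and len(stripped.split()) <= 4:
        return "character cue / shot type / in-line caps"
    return "stage direction or dialogue text"

for line in ["INT. OFFICE - DAY", "CUT TO:", "JOHN", "John enters the room."]:
    print(f"{line!r:26} -> {classify_line(line)}")
```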
The VSD module maps verbs in stage directions to semantic frames:
```python
from screenpy.vsd import VerbSenseAnalyzer
analyzer = VerbSenseAnalyzer()
# Analyze stage direction
text = "Indy sails through sideways and rolls to a stop"
actions = analyzer.extract_actions(text)
for action in actions:
    print(f"Verb: {action.verb}")
    print(f"Frames: {action.verb_sense.frames}")
print(f"Synsets: {action.verb_sense.synsets}")Analysis of 1000+ IMSDb screenplays:
Analysis of 1000+ IMSDb screenplays:

| Genre | Films | Avg Segments | Avg Headings | Avg Dialogue |
|---|---|---|---|---|
| Action | 272 | 1240 | 621 | 538 |
| Comedy | 310 | 1370 | 582 | 720 |
| Drama | 541 | 1328 | 591 | 667 |
| Horror | 134 | 1150 | 632 | 451 |
| Sci-Fi | 140 | 1161 | 607 | 472 |
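A rough sketch of how counts like these could be recomputed from the batch outputs. It assumes each exported JSON mirrors the in-memory model with a top-level `master_segments` list; check the actual output schema before relying on it:

```python
import json
from pathlib import Path
from statistics import mean

# Assumed layout: one parsed-screenplay JSON per file under data/outputs/.
output_dir = Path("data/outputs")
segment_counts = []
for path in sorted(output_dir.glob("*.json")):
    with path.open(encoding="utf-8") as handle:
        document = json.load(handle)
    # Assumption: the export exposes master segments under this key.
    segment_counts.append(len(document.get("master_segments", [])))

if segment_counts:
    print(f"Films: {len(segment_counts)}, average segments per film: {mean(segment_counts):.0f}")
```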
```bash
# Install development dependencies
pip install -e ".[dev]"
# Set up pre-commit hooks
pre-commit install
# Run tests
pytest
# Format code
black src/ tests/
# Type checking
mypy src/
```

To contribute:

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'Add amazing feature'`)
- Push to branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- IMSDb - Internet Movie Script Database
- FrameNet - Frame semantic annotations
- WordNet - Lexical database
- spaCy - Industrial-strength NLP
- sense2vec - Semantic similarity
If you use ScreenPy in your research, please cite:
```bibtex
@inproceedings{winer2017screenpy,
title={Automated Screenplay Annotation for Extracting Storytelling Knowledge},
author={Winer, David R. and Young, R. Michael},
booktitle={Intelligent Narrative Technologies Workshop (INT17)},
year={2017},
organization={AAAI}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
- David R. Winer - drwiner
- R. Michael Young - Advisor
- University of Utah School of Computing
- Entertainment Arts and Engineering Program
- National Science Foundation Grant No. 1654651
For questions or support, please open an issue on GitHub.