A tool to generate CAST Universal Analyzer extension scaffolds for any programming language. The generator automates object detection and creation, while link detection (both same-technology and cross-technology) requires manual, technology-specific implementation.
- What This Generator Does
- What This Generator Does NOT Do
- Why Generic Link Detection Is Impossible
- 3-Level Architecture
- Quick Start
- Configuration Reference
- Implementing Link Detection (Pass 2)
- Implementing Cross-Technology Links (Application Level)
- CAST SDK Reference
- Using LLMs to Assist Implementation
- Deployment
The generator automatically produces:
| Component | Generated Automatically | Status |
|---|---|---|
| Extension structure | Yes | Complete scaffold |
| MetaModel XML | Yes | Object types and categories |
| Language patterns XML | Yes | File extensions, comments |
| Pass 1: Object detection | Yes | Fully functional |
| Pass 1: Object creation | Yes | Fully functional |
| Pass 2: Skeleton methods | Yes | Empty templates with documentation |
| Application Level: Template | Yes | Empty template with documentation |
| Test scaffolding | Yes | Ready to extend |
| NuGet packaging | Yes | Build scripts included |
Pass 1 is fully automatic. You define regex patterns in the config file, and the generator produces code that:
- Detects classes, functions, methods, and other structures
- Creates CAST objects with proper types and hierarchy
- Sets bookmarks for source code navigation
- Builds a global symbol table for resolution
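To make the Pass 1 idea concrete, here is a minimal sketch of regex-driven detection that fills a symbol table. It is not the generator's actual output: the `PATTERNS` dictionary, the `detect_objects` function, and the `path.name` fullname scheme are assumptions chosen for illustration, and plain tuples stand in for CAST objects.

```python
import re

# Hypothetical patterns, shaped like the `grammar.patterns` config entries.
PATTERNS = {
    "Class": [r"^\s*class\s+(?P<name>\w+)"],
    "Function": [r"^\s*func\s+(?P<name>\w+)\s*\("],
}


def detect_objects(path, source):
    """Scan one file and return (symbols, symbols_by_name).

    symbols maps fullname -> (object type, line number);
    symbols_by_name maps short name -> list of fullnames.
    """
    symbols = {}
    symbols_by_name = {}
    for line_num, line in enumerate(source.splitlines(), 1):
        for obj_type, regexes in PATTERNS.items():
            for regex in regexes:
                match = re.match(regex, line)
                if not match:
                    continue
                name = match.group("name")
                fullname = f"{path}.{name}"  # assumed naming scheme
                symbols[fullname] = (obj_type, line_num)
                symbols_by_name.setdefault(name, []).append(fullname)
    return symbols, symbols_by_name


SOURCE = """class Account {
func deposit(amount) {
}
}
func main() {
}
"""
symbols, symbols_by_name = detect_objects("bank.ml", SOURCE)
```

The short-name index is what later makes resolution possible: a call to `main` can be looked up in `symbols_by_name` without knowing which file defines it.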
The generator does NOT produce:
| Component | Generated | Reason |
|---|---|---|
| Pass 2: Call detection | No (skeleton only) | Requires technology-specific parsing |
| Pass 2: Call resolution | No (skeleton only) | Requires semantic analysis |
| Pass 2: Link creation | No (skeleton only) | Depends on resolution |
| Application Level: Cross-tech links | No (template only) | Requires custom matching logic |
| Import analysis | No (not implemented) | Language-specific semantics |
| Type inference | No (not implemented) | Requires full parser |
Pass 2 methods and Application Level are empty skeletons. You must implement technology-specific logic.
Consider this simple Python code:

```python
def process(data):
    data.add(item)  # Which add() is this?
```

Without knowing the type of `data`, we cannot determine which `add()` method is being called. Is it:
- `set.add()`?
- `CustomClass.add()`?
- `add()` on some other type entirely?
A regex-based parser cannot answer this question. It lacks:
- Type information - Variables carry no declared type in dynamically typed languages
- Import resolution - We don't know what modules are imported
- Control flow analysis - The type might change at runtime
- Inheritance chains - Method resolution depends on class hierarchy
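The limitation can be reproduced in a few lines. The sketch below is illustrative only: `SYMBOLS_BY_NAME` is a hypothetical index in which two unrelated classes both define `add()`, and `candidates_for` is a made-up helper. The regex sees the receiver name `data` but learns nothing about its type, so both candidates remain.

```python
import re

# Hypothetical symbol index: two unrelated classes both define add().
SYMBOLS_BY_NAME = {
    "add": ["collections.ml.Set.add", "billing.ml.Invoice.add"],
    "process": ["main.ml.process"],
}

# Optional "receiver." prefix, then the callee name and an opening parenthesis.
CALL_PATTERN = re.compile(r"(?:(?P<receiver>\w+)\.)?(?P<callee>\w+)\s*\(")


def candidates_for(line):
    """Return {callee: [candidate fullnames]} for each call found on the line."""
    found = {}
    for match in CALL_PATTERN.finditer(line):
        callee = match.group("callee")
        found[callee] = SYMBOLS_BY_NAME.get(callee, [])
    return found


result = candidates_for("data.add(item)")
print(result)
```

With two candidates and no type for `data`, any choice would be a guess, which is exactly the situation the generator refuses to paper over.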
Now consider linking between technologies:
```python
# Python code
def fetch_user_data():
    result = call_api("/api/users")  # Which Java endpoint is this?
```

```java
// Java code
@RequestMapping("/api/users")
public Users getUsers() { ... }

@RequestMapping("/api/user/{id}")
public User getUserById() { ... }
```

To create accurate cross-technology links, you need:
- Technology-specific knowledge - How does Python call Java? REST? gRPC? Direct?
- Naming conventions - Does your project follow specific patterns?
- Multiple conditions - Name matching alone creates false positives
- Context analysis - Parse source code for concrete evidence (URLs, table names, etc.)
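As an illustration of "concrete evidence", the sketch below matches URL literals found in source text against route templates instead of comparing names. Everything here is an assumption made for the example: the `ENDPOINTS` table, the helper names, and the idea that routes were already collected from the Java side.

```python
import re

# Hypothetical endpoint table, as it might be collected from Java annotations.
ENDPOINTS = {
    "/api/users": "getUsers",
    "/api/user/{id}": "getUserById",
}

# A quoted string literal that starts with "/".
URL_LITERAL = re.compile(r"""["'](/[^"']*)["']""")


def route_to_regex(route):
    """Turn '/api/user/{id}' into a regex that matches '/api/user/42'."""
    parts = re.split(r"\{\w+\}", route)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in parts) + "$")


def match_endpoints(source):
    """Return the handler names whose route matches a URL literal in source."""
    matched = []
    for url in URL_LITERAL.findall(source):
        for route, handler in ENDPOINTS.items():
            if route_to_regex(route).match(url):
                matched.append(handler)
    return matched


PYTHON_SOURCE = 'result = call_api("/api/users")'
print(match_endpoints(PYTHON_SOURCE))
```

Anchoring the route regex at both ends matters: `/api/users` must not be accepted as an instance of `/api/user/{id}`, even though the two strings share a long prefix.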
Creating a link from fetch_user_data() to the wrong endpoint or database table is worse than creating no link at all:
- False positives corrupt architectural analysis
- False positives mislead developers about dependencies
- False positives cannot be easily identified and removed
The previous generic implementation attempted heuristics (same-file preference, single-candidate matching), but these still produced unacceptable false positive rates.
Rather than pretend generic link detection works, this generator:
- Fully automates what CAN be done generically (object detection)
- Provides skeletons with detailed documentation for what CANNOT
- Empowers developers to implement correct, technology-specific logic
- Separates concerns between same-technology links (Pass 2) and cross-technology links (Application Level)
CAST analyzers use a 3-level architecture:
```
PASS 1: AUTOMATIC
(Fully implemented by generator)
──────────────────────────────────────────────────────────────
start_file(file)
├── Read source file
├── Parse with regex patterns
├── Detect: Classes, Functions, Methods, etc.
├── Create CAST objects with types and bookmarks
└── Store in library for Pass 2

PASS 2: MANUAL IMPLEMENTATION
(Skeleton only - YOU must implement)
Same-technology link detection
──────────────────────────────────────────────────────────────
end_analysis()
├── full_parse() → Detect calls (YOUR CODE)
├── resolve()    → Resolve targets (YOUR CODE)
└── save_links() → Create links (YOUR CODE)

Available data:
├── self.objects            → All objects in this file
├── self.object_lines       → Line ranges for each object
├── library.symbols         → All objects across all files
└── library.symbols_by_name → Objects indexed by short name

APPLICATION LEVEL: MANUAL IMPLEMENTATION
(Template only - YOU must implement)
Cross-technology link detection
──────────────────────────────────────────────────────────────
end_application(application)
└── Create links between different technologies (YOUR CODE)

Available data:
├── application.search_objects() → Find objects by type/name
├── application.objects()        → All objects (all technologies)
├── application.get_files()      → All analyzed files
└── open_source_file()           → Read source code

Use cases:
├── Your Language → Database Tables
├── Your Language → Java/C# REST APIs
├── Your Language → Configuration files
└── Your Language → Message Queues/Topics
```
For each source file, the generator's code:
- Reads the file content
- Matches regex patterns from your config
- Creates CAST `CustomObject` instances
- Sets parent relationships based on hierarchy config
- Saves bookmarks for F11 code navigation
- Registers objects in a global symbol table
After all files are processed, you must implement:
- Call detection (`full_parse()`) - Find call sites in source code
- Call resolution (`resolve()`) - Match calls to target objects within your technology
- Link creation (`save_links()`) - Use CAST SDK to create links
The generated module file contains detailed documentation in each skeleton method explaining exactly what to implement.
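Call detection has to attribute each call site to the object that contains it. A minimal sketch of that lookup follows, assuming line ranges are stored as `{fullname: (start, end)}` with inclusive bounds, as `self.object_lines` is described in this document. The standalone function name and the "innermost object wins" rule are assumptions, not the generated helper.

```python
def find_caller_for_line(object_lines, line_num):
    """Return the fullname of the innermost object containing line_num.

    object_lines maps fullname -> (start_line, end_line), both inclusive.
    Returns None when the line lies outside every object.
    """
    best = None
    best_span = None
    for fullname, (start, end) in object_lines.items():
        if start <= line_num <= end:
            span = end - start
            # Prefer the smallest enclosing range, e.g. a method over its class.
            if best_span is None or span < best_span:
                best, best_span = fullname, span
    return best


object_lines = {
    "bank.ml.Account": (1, 10),
    "bank.ml.Account.deposit": (2, 5),
    "bank.ml.main": (12, 20),
}
print(find_caller_for_line(object_lines, 3))
```

Returning `None` for lines outside any object lets the caller skip top-level statements instead of attaching them to an arbitrary object.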
After all analyzer-level extensions (Java, SQL, your technology, etc.) have completed, you must:
- Find objects from your technology and other technologies
- Implement matching logic with multiple conditions to avoid false positives
- Create cross-technology links using CAST SDK
The generated *_application_level.py file contains an empty template with documentation and helper methods.
Example scenarios:
- Your language function → SQL Table
- Your language code → Java REST endpoint
- Your language module → Configuration file
- Your language publisher → Message Queue
Create a configuration file (for example `config_mylang.json`):

```json
{
  "language": "MyLang",
  "namespace": "uc",
  "file_no": 50,
  "version": "1.0.0",
  "author": "Your Name",
  "extensions": ["ml", "mylang"],
  "tags": "MyLang Extension",
  "comment": "//",
  "multiline_comment": { "begin": "/*", "end": "*/" },
  "objects": {
    "Program": { "parent": "file", "pattern_keys": [] },
    "Class": { "parent": "Program", "pattern_keys": ["class"] },
    "Function": { "parent": "Program", "pattern_keys": ["function"] },
    "Method": { "parent": "Class", "pattern_keys": ["method"] }
  },
  "grammar": {
    "block_delimiters": "braces",
    "patterns": {
      "class": ["^\\s*class\\s+(?P<name>\\w+)"],
      "function": ["^\\s*func\\s+(?P<name>\\w+)\\s*\\("],
      "method": ["^\\s+func\\s+(?P<name>\\w+)\\s*\\("]
    },
    "keywords": ["if", "else", "for", "while", "return"]
  }
}
```

Before generating the extension, reserve a unique `file_no`:
- Go to the CAST SharePoint UA Corner: https://castsoftware.sharepoint.com/sites/CoffeeMachine/SitePages/UA-Corner.aspx
- Reserve a range of IDs (e.g., `2,193,000 - 2,193,999`)
- Calculate your `file_no`: `file_no = (start_id - 2,000,000) / 1,000`
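The formula can be checked with a small helper. This is a sketch: the function name is made up, and the validation (a start ID at or above 2,000,000 and aligned to a block of 1,000) is an assumption inferred from the formula, not a documented rule.

```python
def compute_file_no(start_id):
    """Derive file_no from the first ID of a reserved 1,000-ID range."""
    # Assumed sanity checks; adjust if your reservation rules differ.
    if start_id < 2_000_000 or start_id % 1_000 != 0:
        raise ValueError(
            "start_id must be a multiple of 1,000 at or above 2,000,000"
        )
    return (start_id - 2_000_000) // 1_000


print(compute_file_no(2_193_000))  # 193
```

For the example range starting at 2,193,000, the result is 193, which is the value to put in the `file_no` field of the config.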
Generate the extension, then run its tests:

```bash
python generate_extension.py config_mylang.json outputs/
cd outputs/com.castsoftware.uc.mylang
python -m pytest tests/
```

Open `mylang_module.py` and implement:
- `full_parse()` - Technology-specific call detection
- `resolve()` - Technology-specific call resolution
- `save_links()` - Link creation using CAST SDK

Open `mylang_application_level.py` and implement:
- `end_application()` - Cross-technology link creation
Build the package:

```powershell
.\plugin-to-nupkg.bat
# Copy .nupkg to CAST extensions folder
```

The configuration file accepts the following fields:

| Field | Type | Description |
|---|---|---|
| `language` | string | Language name (e.g., "Python", "Go") |
| `namespace` | string | "uc", "labs", or "product" |
| `file_no` | integer | Unique ID range (reserve at CAST UA Corner) |
| `version` | string | Semantic version (e.g., "1.0.0") |
| `extensions` | array | File extensions without dot |
| `objects` | object | Object type hierarchy |
| `grammar.patterns` | object | Regex patterns for object detection |

```json
"objects": {
  "Program": { "parent": "file", "pattern_keys": [] },
  "Class": { "parent": "Program", "pattern_keys": ["class"] },
  "Method": { "parent": "Class", "pattern_keys": ["method"] }
}
```

- `parent`: Where this object type lives in the hierarchy
  - `"file"` = Top-level container (Program)
  - Object type name = Nested inside that type
- `pattern_keys`: Which patterns detect this type
  - Empty `[]` for auto-created objects (Program)
  - References pattern names from `grammar.patterns`
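A config can be sanity-checked before it is handed to the generator. The sketch below is not part of the tool: `check_config` is a hypothetical helper that applies only the constraints stated in this reference (allowed namespaces, extensions without a dot, parents that are `"file"` or a defined type, pattern keys that exist in `grammar.patterns`).

```python
ALLOWED_NAMESPACES = {"uc", "labs", "product"}


def check_config(config):
    """Return a list of problems found in a generator config dict."""
    problems = []
    if config.get("namespace") not in ALLOWED_NAMESPACES:
        problems.append("namespace must be one of: uc, labs, product")
    for ext in config.get("extensions", []):
        if ext.startswith("."):
            problems.append(f"extension {ext!r} must not start with a dot")

    objects = config.get("objects", {})
    patterns = config.get("grammar", {}).get("patterns", {})
    for type_name, spec in objects.items():
        parent = spec.get("parent")
        # A parent is either the file itself or another declared object type.
        if parent != "file" and parent not in objects:
            problems.append(f"{type_name}: unknown parent {parent!r}")
        for key in spec.get("pattern_keys", []):
            if key not in patterns:
                problems.append(f"{type_name}: unknown pattern key {key!r}")
    return problems


config = {
    "namespace": "uc",
    "extensions": ["ml", ".mylang"],
    "objects": {
        "Program": {"parent": "file", "pattern_keys": []},
        "Class": {"parent": "Program", "pattern_keys": ["class"]},
        "Method": {"parent": "Klass", "pattern_keys": ["method"]},
    },
    "grammar": {"patterns": {"class": [r"^\s*class\s+(?P<name>\w+)"]}},
}
for problem in check_config(config):
    print(problem)
```

Catching a misspelled parent or a dangling pattern key at this stage is cheaper than discovering later that an object type is never detected.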
```json
"grammar": {
  "block_delimiters": "braces",
  "patterns": {
    "class": ["^\\s*class\\s+(?P<name>\\w+)"],
    "function": ["^\\s*func\\s+(?P<name>\\w+)"]
  },
  "keywords": ["if", "else", "for", "while"]
}
```

- `block_delimiters`: How code blocks end
  - `"braces"` - `{...}` (Go, Java, C)
  - `"end_keyword"` - `end` keyword (Ruby, Lua)
  - `"indentation"` - Whitespace (Python)
- `patterns`: Regex with `(?P<name>...)` capture group
- `keywords`: Reserved words (helper for manual link implementation)
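How the generated code locates the end of a block is not shown in this document. For the `"braces"` setting, one plausible approach is brace counting, sketched below with a hypothetical `find_block_end` function. It deliberately ignores braces inside strings and comments, which a real implementation would have to handle.

```python
def find_block_end(lines, start_index):
    """Return the 0-based index of the line that closes the block opened at
    or after start_index, or None if the braces never balance."""
    depth = 0
    opened = False
    for index in range(start_index, len(lines)):
        for char in lines[index]:
            if char == "{":
                depth += 1
                opened = True
            elif char == "}":
                depth -= 1
        # The block ends on the first line where every opened brace is closed.
        if opened and depth <= 0:
            return index
    return None


lines = [
    "class A {",
    "  func f() {",
    "  }",
    "}",
    "func g() {",
    "}",
]
print(find_block_end(lines, 0), find_block_end(lines, 1))
```

The start line of a detected object together with the returned index gives the kind of `(start, end)` range that line-based caller lookup depends on.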
After Pass 1 completes, you have access to:
```python
# In the module class:
self.objects          # {fullname: CustomObject} - All objects in this file
self.object_lines     # {fullname: (start, end)} - Line ranges
self.source_content   # Raw source code string
self.path             # File path

# In the library (passed to resolve()):
library.symbols          # {fullname: CustomObject} - ALL objects across ALL files
library.symbols_by_name  # {short_name: [fullnames]} - For resolution
```

Detect calls in the source code:
```python
import re

def full_parse(self):
    if not self.source_content:
        return

    lines = self.source_content.splitlines()
    for line_num, line in enumerate(lines, 1):
        # Your technology-specific call detection
        # Example: Match "functionName(" pattern
        for match in re.finditer(r'\b(\w+)\s*\(', line):
            callee_name = match.group(1)

            # Skip language keywords
            if callee_name in self._get_keywords():
                continue

            # Find caller (which function contains this line)
            caller = self._find_caller_for_line(line_num)

            # Store for resolution
            self.pending_links.append({
                'caller': caller,
                'callee': callee_name,
                'line': line_num
            })
```

Match callee names to actual objects:
```python
def resolve(self, library):
    for link_info in self.pending_links:
        callee_name = link_info['callee']

        # Strategy 1: Same-file resolution (highest confidence)
        for fullname, obj in self.objects.items():
            if fullname.endswith('.' + callee_name):
                link_info['resolved_callee'] = obj
                link_info['resolved_callee_fullname'] = fullname
                break

        # Strategy 2: Cross-file resolution (only if unambiguous)
        if 'resolved_callee' not in link_info:
            candidates = library.symbols_by_name.get(callee_name, [])
            if len(candidates) == 1:
                fullname = candidates[0]
                link_info['resolved_callee'] = library.symbols[fullname]
                link_info['resolved_callee_fullname'] = fullname
```

Create links using the CAST SDK:
```python
def save_links(self):
    from cast.analysers import create_link, Bookmark

    links_created = 0
    for link_info in self.pending_links:
        if 'resolved_callee' not in link_info:
            continue

        caller_obj = self.objects.get(link_info['caller'])
        callee_obj = link_info['resolved_callee']
        line = link_info.get('line', 0)

        if caller_obj and callee_obj:
            # Create bookmark for navigation
            bookmark = Bookmark(self.file, line, 1, line, -1)
            # Create the link
            create_link('callLink', caller_obj, callee_obj, bookmark)
            links_created += 1

    return links_created
```

Use Application Level when you need to link your technology's objects to objects from other technologies:
| Scenario | Example |
|---|---|
| Database access | Your language → SQL Table |
| API calls | Your language → Java/C# REST endpoint |
| Message queues | Your language → Queue/Topic |
| Configuration | Your language → XML/JSON config |
| External services | Your language → Web service |
Open the generated *_application_level.py file and implement the matching logic:
```python
def end_application(self, application):
    """
    Create cross-technology links after all analyzers complete.
    """
    from cast.application import create_link

    # Find your technology's objects
    my_functions = list(
        application.search_objects(category='MyLangFunction')
    )

    # Find objects from other technologies
    db_tables = list(application.search_objects(category='Table'))
    java_methods = list(application.search_objects(category='JV_Method'))

    # Implement matching logic with MULTIPLE conditions
    links_created = 0
    for func in my_functions:
        func_name = func.get_name().lower()

        # Example: Link to database tables
        for table in db_tables:
            table_name = table.get_name().lower()

            # Use multiple conditions to avoid false positives!
            if (table_name in func_name and
                    len(table_name) > 3 and  # Avoid short names
                    func_name.startswith(('get_', 'query_', 'fetch_'))):
                create_link('useLink', func, table)
                links_created += 1

    if links_created > 0:
        print(f"[MyLang] Created {links_created} cross-technology links")
```

To keep false positives down:

- Use multiple matching conditions

  ```python
  if (name_matches and pattern_matches and context_matches):
      create_link(...)
  ```

- Validate name length

  ```python
  # Avoids matching "id", "db", etc.
  if table_name in func_name and len(table_name) > 3:
      create_link('useLink', func, table)
  ```

- Parse source code for evidence

  ```python
  from cast.application import open_source_file

  source = open_source_file(func.get_file()).read()
  if 'SELECT * FROM ' + table_name in source:
      create_link('useLink', func, table)
  ```

- Don't match on short names alone
- Don't create links without evidence
- Don't forget error handling
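The evidence and error-handling guidelines can be combined in plain Python before any CAST call is made. The sketch below is an illustration under assumptions: `references_table` and `safe_read` are hypothetical helpers, the list of SQL keywords is deliberately short, and `open_file` stands for any callable that returns a file-like context manager.

```python
import re


def references_table(source, table_name):
    """True if source contains SQL that names table_name after FROM, JOIN,
    INTO or UPDATE, matched as a whole word and case-insensitively."""
    pattern = (
        r"\b(?:FROM|JOIN|INTO|UPDATE)\s+" + re.escape(table_name) + r"\b"
    )
    return re.search(pattern, source, re.IGNORECASE) is not None


def safe_read(path, open_file=open):
    """Read a source file, returning '' instead of raising on I/O errors."""
    try:
        with open_file(path) as handle:
            return handle.read()
    except (OSError, UnicodeDecodeError):
        return ""


code = 'rows = db.execute("SELECT * FROM users WHERE id = 1")'
print(references_table(code, "users"))
```

The word boundary is the important detail: a table named `users` is not considered referenced by a query on `users_archive`, and a function merely called `get_users` provides no evidence at all.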
```python
# Search for objects
objects = application.search_objects(category='ObjectType')
objects = application.search_objects(name='MyObject')

# Iterate all objects
for obj in application.objects():
    print(obj.get_name(), obj.get_type())

# Read source files
from cast.application import open_source_file
source = open_source_file(file_path)
content = source.read()

# Create links
from cast.application import create_link
create_link('callLink', source_obj, target_obj)
```

Complete API documentation: Application Level API Reference
`create_link()` creates a relationship between two objects in the CAST knowledge base.

```python
from cast.analysers import create_link, Bookmark

# Basic usage
create_link('callLink', caller_obj, callee_obj)

# With source location (enables F11 navigation)
bookmark = Bookmark(file, line, col, end_line, end_col)
create_link('callLink', caller_obj, callee_obj, bookmark)
```

Parameters:
- `link_type` (str): Predefined link type (see below)
- `caller` (CustomObject): Source object
- `callee` (CustomObject): Target object
- `bookmark` (Bookmark, optional): Source location
IMPORTANT: You cannot invent new link types in code. Only these predefined types are available:

| Link Type | Meaning | Use Case |
|---|---|---|
| `callLink` | Function/method invocation | `foo()` calls `bar()` |
| `useLink` | Read/write access | Function uses variable |
| `relyonLink` | Dependency | Module depends on library |
| `inheritLink` | Inheritance | Class extends parent |
| `referLink` | Reference | Code mentions constant |

Custom link types can be defined in the MetaModel XML, but this requires advanced knowledge of the CAST metamodel.
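Because a misspelled link type is easy to write, a small guard can reject it before the SDK is called. This is a sketch, not SDK behaviour: `PREDEFINED_LINK_TYPES` contains only the five types listed in the table above, and `checked_link_type` is a hypothetical helper. Extend the set if your MetaModel defines additional types.

```python
# The five link types listed in this document.
PREDEFINED_LINK_TYPES = frozenset(
    {"callLink", "useLink", "relyonLink", "inheritLink", "referLink"}
)


def checked_link_type(link_type):
    """Return link_type unchanged, or raise ValueError if it is not listed."""
    if link_type not in PREDEFINED_LINK_TYPES:
        allowed = ", ".join(sorted(PREDEFINED_LINK_TYPES))
        raise ValueError(f"unknown link type {link_type!r}; expected one of: {allowed}")
    return link_type


print(checked_link_type("callLink"))
```

A call such as `create_link(checked_link_type('callLink'), caller_obj, callee_obj)` then fails immediately on a typo like `'calllink'`.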
A `Bookmark` records a source location and enables F11 navigation to source code.

```python
from cast.analysers import Bookmark

# Bookmark(file, start_line, start_col, end_line, end_col)
bookmark = Bookmark(self.file, 42, 1, 42, -1)
# -1 for end_col means "end of line"
```

Objects are created in Pass 1, but here's the reference:
```python
from cast.analysers import CustomObject

obj = CustomObject()
obj.set_type('MyLangFunction')               # Object type from MetaModel
obj.set_name('myFunction')                   # Short name
obj.set_fullname('path/file.ml.myFunction')  # Unique identifier
obj.set_parent(parent_obj)                   # Parent in hierarchy
obj.set_guid(unique_string)                  # Deterministic GUID
obj.save()                                   # Persist to knowledge base
obj.save_position(bookmark)                  # Set source location
```

Large Language Models can significantly accelerate both Pass 2 and Application Level implementation:
Ask an LLM to explain the calling conventions of your target language:
"Explain all the ways functions can be called in Lua, including method calls, metamethod invocations, and dynamic calls. Provide regex patterns that would match each calling style."
"Write a Python regex pattern that matches Go method calls of the form `receiver.Method()` where the receiver can be a variable, pointer dereference, or type assertion."
"In Ruby, how would you resolve a method call `obj.process()` to its definition, considering: method_missing, included modules, class reopening, and refinements? Which cases can be resolved statically?"
"In a typical Java + Python microservices architecture, what are the common patterns for Python code calling Java REST endpoints? How would you detect these calls in Python source code?"
"What are the common naming conventions for functions that access database tables in Python? Generate matching logic that links Python functions to SQL tables based on these conventions."
"How would you match Python code that reads configuration files to actual XML/JSON configuration objects? What evidence in the source code would indicate a dependency?"
- Generate extension scaffold with this tool
- Use an LLM to understand the target language's semantics
- Implement `full_parse()` with LLM assistance (Pass 2)
- Test on real code, iterate with LLM help
- Implement `resolve()` for unambiguous cases (Pass 2)
- Implement `end_application()` for cross-technology links (Application Level)
- Deploy and validate
```powershell
cd outputs/com.castsoftware.uc.mylang
.\plugin-to-nupkg.bat
```

Copy the `.nupkg` file to:

```
C:\Cast\ProgramData\CAST\AIP-Console-Standalone\data\shared\extensions\
```
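The copy step can be scripted. The sketch below is one way to do it with the standard library only. `copy_packages` is a hypothetical helper, not part of the generator, and both folders are passed in by the caller so nothing about the installation layout is hard-coded.

```python
import shutil
from pathlib import Path


def copy_packages(build_dir, extensions_dir):
    """Copy every .nupkg in build_dir to extensions_dir; return copied names."""
    build_dir, extensions_dir = Path(build_dir), Path(extensions_dir)
    # Refuse to create the target: a missing folder usually means a wrong path.
    if not extensions_dir.is_dir():
        raise FileNotFoundError(f"extensions folder not found: {extensions_dir}")
    copied = []
    for package in sorted(build_dir.glob("*.nupkg")):
        shutil.copy2(package, extensions_dir / package.name)
        copied.append(package.name)
    return copied
```

Call it with the folder that contains the generated `.nupkg` and the extensions folder shown above, for example `copy_packages("outputs/com.castsoftware.uc.mylang", extensions_path)` where `extensions_path` holds that folder's location.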
Since the extension doesn't include a DMT discoverer:
- Run Fast Scan on your application
- Start Deep Analysis
- Go to Config tab → Universal Technology
- Click +ADD to create Analysis Unit
- Fill in: Name, Package (`main_sources`), Language
- Save and continue
Use the UA Package Assistant to validate:
- Open: `C:\ProgramData\Microsoft\Windows\Start Menu\Programs\CAST 8.x\UAPackageAssistant.exe`
- Browse to extension folder
- Check "Validate package MetaModel file only"
- Click "Validate"
A minimal configuration:

```json
{
  "language": "Simple",
  "namespace": "uc",
  "file_no": 99,
  "version": "1.0.0",
  "author": "Developer",
  "extensions": ["sim"],
  "tags": "Simple Extension",
  "comment": "#",
  "objects": {
    "Program": { "parent": "file", "pattern_keys": [] },
    "Function": { "parent": "Program", "pattern_keys": ["function"] }
  },
  "grammar": {
    "block_delimiters": "braces",
    "patterns": {
      "function": ["^\\s*fn\\s+(?P<name>\\w+)"]
    }
  }
}
```

A configuration for Go:

```json
{
  "language": "Go",
  "namespace": "uc",
  "file_no": 32,
  "version": "1.0.0",
  "author": "CAST",
  "extensions": ["go"],
  "tags": "Go Extension",
  "comment": "//",
  "multiline_comment": { "begin": "/*", "end": "*/" },
  "objects": {
    "Program": { "parent": "file", "pattern_keys": [] },
    "Struct": { "parent": "Program", "pattern_keys": ["struct"] },
    "Interface": { "parent": "Program", "pattern_keys": ["interface"] },
    "Function": { "parent": "Program", "pattern_keys": ["function"] },
    "Method": { "parent": "Struct", "pattern_keys": ["method"] }
  },
  "grammar": {
    "block_delimiters": "braces",
    "patterns": {
      "struct": ["^\\s*type\\s+(?P<name>\\w+)\\s+struct\\s*\\{"],
      "interface": ["^\\s*type\\s+(?P<name>\\w+)\\s+interface\\s*\\{"],
      "function": ["^\\s*func\\s+(?!\\()(?P<name>\\w+)\\s*\\("],
      "method": ["^\\s*func\\s+\\(\\w+\\s+\\*?(?P<receiver>\\w+)\\)\\s+(?P<name>\\w+)\\s*\\("]
    },
    "keywords": ["if", "else", "for", "switch", "select", "case", "return", "go", "defer", "func", "type", "struct", "interface", "package", "import", "var", "const"]
  }
}
```

LGPL - See COPYING.LESSER.txt