Supporting non-vision models

Hi,
could you explain how to use this for non-vision models?
Trying to apply this to an audio model I am getting an error because `get_previous_layer` never finds a `Conv2d` or a `BatchNorm2d` and hence, after recursing through all the modules, just returns None, which then fails at `re.sub`.

Could you explain the logic behind "going back to the previous layer of exactly this type"?

the immediate earlier layers are: (add, causing the search) -> [transpose] -> gelu -> transpose -> layer_norm -> transpose -> conv1d -> ...

which layer would you expect to find here? Are you looking for the last layer that learns anything (which would be layer norm) or with actual learnable parameters (then it's conv1d)

---

there is a second case where this happens where the chain is: (add, causing the search) -> [dropout] -> linear -> layer_norm -> transpose -> gelu -> transpose -> layer_norm -> transpose -> conv1d -> ...

again, please help me out which layer should be found. I would guess the linear layer?


---

on a side note: instead of deep recursion, checking the whole list each time, building a tree or doubly-linked-list-like structure seems more appropriate (and easier to debug)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Supporting non-vision models #18

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Supporting non-vision models #18

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions