Adding Pytest features by bobbycxy · Pull Request #34 · LeonGuertler/SuperTinyLanguageModels

bobbycxy · 2024-07-02T09:43:15Z

(Work in progress)

This pull request adds pytest coverage for parts of our repo. The model_shell class's forward function calls the forward functions of the 3 class objects that are passed to it: 1) an embedder class, 2) a core-model class, and 3) a model head class.

Each of these handles one overarching task: 1) embedding, 2) the core model computation, 3) the model head.

Generally...
...the embedder is a class object that takes in a tensor (B, T) and returns a tensor (B, T, H).
...the core model is a class object that takes in a tensor (B, T, H) and returns a tensor (B, T, H).
...the model head is a class object that takes in a tensor (B, T, H) and returns a tensor (B, T, V).

These tests will ensure that the outputs are correctly shaped and contain no NaN values.
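A minimal sketch of the shape and NaN checks described above. The layers here are plain `torch.nn` stand-ins, not the repository's actual embedder, core model, or head, and the sizes and helper names (`check_output`, `run_pipeline`) are assumptions chosen for illustration.

```python
import torch
from torch import nn

# Assumed sizes: batch, context window, hidden size, vocabulary size.
B, T, H, V = 2, 8, 16, 32

# Stand-ins only; the real components in the repo are more involved.
embedder = nn.Embedding(V, H)   # (B, T)    -> (B, T, H)
core_model = nn.Linear(H, H)    # (B, T, H) -> (B, T, H)
model_head = nn.Linear(H, V)    # (B, T, H) -> (B, T, V)


def check_output(tensor, expected_shape):
    """Assert the tensor has the expected shape and contains no NaN entries."""
    assert tuple(tensor.shape) == expected_shape
    assert not torch.isnan(tensor).any()


def run_pipeline():
    """Push random token ids through the three stages, checking each output."""
    token_ids = torch.randint(0, V, (B, T))
    embedded = embedder(token_ids)
    check_output(embedded, (B, T, H))
    hidden = core_model(embedded)
    check_output(hidden, (B, T, H))
    logits = model_head(hidden)
    check_output(logits, (B, T, V))
    return logits
```

Checking after every stage, rather than only at the end, makes it easier to see which component produced a wrongly shaped or NaN output.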

+ general fixes

DylanASHillier · 2024-07-03T03:25:38Z

models/experimental/next_thought/embedding_models.py

-        x = x.mean(dim=-2)
-        return x
+        x = x.mean(dim=-2) 
+        return x # 


remove the weird hash here?

DylanASHillier · 2024-07-03T03:27:37Z

debugging/debugging_models/debugging_components/debugging_model/test_modelshell.py

+    ## 2. ensure the output is not nan
+    assert not torch.isnan(res).all()
+    ## 3. ensure the output shape is correct
+    assert res.shape == (model_cfg['batch_size'], model_cfg['context_window'], embedder.token_embedder.embedding_dim)
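
A side note on the NaN assertion in the diff above: `torch.isnan(res).all()` is true only when every element is NaN, so `assert not torch.isnan(res).all()` still passes when some, but not all, values are NaN. If the intent is "no NaN anywhere", `.any()` is probably the stricter check that was meant. The small sketch below illustrates the difference on a made-up tensor; it is not taken from the PR.

```python
import torch

# A made-up output with one NaN among otherwise valid values.
res = torch.tensor([1.0, float("nan"), 3.0])

# `.all()` is True only if every element is NaN, so this check passes anyway.
passes_with_all = not torch.isnan(res).all()

# `.any()` is True if at least one element is NaN, so this check fails.
passes_with_any = not torch.isnan(res).any()
```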


these aren't really testing the modelshell, they are testing the individual components...

which is fine but should be in separate test scripts. i.e. should have test_embedder, test_core_model, test_lm_head as files

and these are just testing the forward pass. that is fine for the core model and lm head, but the embedder interface has quite a few methods (padding, truncating, inference vs forward, etc.) that should be checked
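
One possible way to check that an embedder exposes the expected methods is sketched below. The method names in `REQUIRED_METHODS` and the two example classes are hypothetical; the actual interface is whatever embedding_models.py defines.

```python
# Hypothetical method names; the real interface in embedding_models.py may differ.
REQUIRED_METHODS = ("forward", "inference", "pad_batch", "truncate")


def missing_methods(embedder_cls, required=REQUIRED_METHODS):
    """Return the required names that embedder_cls does not provide as callables."""
    return [
        name for name in required
        if not callable(getattr(embedder_cls, name, None))
    ]


class CompleteEmbedder:
    """Illustrative class that provides every required method."""

    def forward(self, token_ids):
        return token_ids

    def inference(self, text):
        return text

    def pad_batch(self, batch):
        return batch

    def truncate(self, sequence):
        return sequence


class PartialEmbedder:
    """Illustrative class that only implements forward."""

    def forward(self, token_ids):
        return token_ids
```

A test per embedder class could then assert `missing_methods(SomeEmbedder) == []`. This only confirms the methods exist; their behaviour (padding length, truncation limits, and so on) would still need separate tests.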

DylanASHillier

My basic feedback is this:

the model_shell tests should just test the functions of the model shell interface itself. in particular it should be 3 functions:
test_loglikelihood, test_inference, and test_forward.
This should be done with a MockEmbedder, MockTransformer, and MockHead that return (random) torch tensors of the appropriate shapes expected by the model
For building the embedder tests, it should make sure that the layer matches the interface defined in embedding_models.py i.e. that all the necessary functions are implemented.
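
A rough sketch of the mock-based approach for `test_forward`, under stated assumptions: `ToyModelShell` is a hypothetical stand-in that simply chains the three components, and the sizes are arbitrary. The real model shell and its `inference` and `loglikelihood` signatures are not shown in this thread, so only the forward case is illustrated.

```python
import torch
from torch import nn

# Assumed sizes: batch, context window, hidden size, vocabulary size.
B, T, H, V = 2, 8, 16, 32


class MockEmbedder(nn.Module):
    def forward(self, token_ids):
        # (B, T) -> random (B, T, H)
        return torch.randn(token_ids.shape[0], token_ids.shape[1], H)


class MockTransformer(nn.Module):
    def forward(self, x):
        # (B, T, H) -> random (B, T, H)
        return torch.randn_like(x)


class MockHead(nn.Module):
    def forward(self, x):
        # (B, T, H) -> random (B, T, V)
        return torch.randn(x.shape[0], x.shape[1], V)


class ToyModelShell(nn.Module):
    """Hypothetical stand-in for the repository's model shell."""

    def __init__(self, embedder, core_model, model_head):
        super().__init__()
        self.embedder = embedder
        self.core_model = core_model
        self.model_head = model_head

    def forward(self, token_ids):
        return self.model_head(self.core_model(self.embedder(token_ids)))


def test_forward():
    shell = ToyModelShell(MockEmbedder(), MockTransformer(), MockHead())
    logits = shell(torch.randint(0, V, (B, T)))
    assert logits.shape == (B, T, V)
    assert not torch.isnan(logits).any()
```

Because the mocks return random tensors of the right shape, a failure here points at the shell's own wiring rather than at any individual component.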

bobbycxy · 2024-07-03T06:00:09Z

> My basic feedback is this:
>
> the model_shell tests should just test the functions of the model shell interface itself. in particular it should be 3 functions:
> test_loglikelihood, test_inference, and test_forward.
> This should be done with a MockEmbedder, MockTransformer, and MockHead that return (random) torch tensors of the appropriate shapes expected by the model
>
> For building the embedder tests, it should make sure that the layer matches the interface defined in embedding_models.py i.e. that all the necessary functions are implemented.

Got it, I'll rework this PR.

bobbycxy · 2024-07-12T09:45:12Z

Hey @DylanASHillier, I've added pytest features for the methods of the model_shell (forward, inference, loglikelihood) and byte_model_shell (forward), and split the pytests by type for the embedding_model, core_model and lm_heads. Let me know what needs to be fixed or revised.

Meanwhile, I will add in pytests for the trainer modules next.

bobbycxy · 2024-07-29T08:04:15Z

Adding pytests in the training process.

DylanASHillier · 2024-09-05T07:17:54Z

please refresh and resubmit

DylanASHillier · 2024-09-05T07:18:00Z

my bad

bobbycxy added 2 commits July 2, 2024 17:31

+ initial pytest for model_shell

89f9ec9

+ general fixes

removing the bytemodelshell related tests.

da42ee2

bobbycxy requested a review from DylanASHillier July 2, 2024 09:43

DylanASHillier reviewed Jul 3, 2024

DylanASHillier requested changes Jul 3, 2024

bobbycxy added 2 commits July 12, 2024 17:09

Adding pytest features for each class type for the embedding_model, c…

35ca739

…ore_model, lm_head. In addition, also tests the forward, inference and loglikelihood methods of the model_shell and byte_model shell.

rename pytest file to ensure unique test names

42c3923

bobbycxy added 2 commits July 17, 2024 14:38

Merge branch 'main' into features/pytest

743741c

+ pytest for all loss functions

d98b7ca

bobbycxy wants to merge 6 commits into main from features/pytest

2 participants
