Allocation of units for batch7 #381

This is a work in progress! Template: https://github.com/LDSSA/wiki/issues/330

# Description

The goal of this issue is to assess interest and pre-allocate batch7's
teaching and QA work.

The units, the overall work needed, and the release and delivery dates are listed
below.

# Units

### Admissions

* Unit 
  * Adjustment to the new Python and Pandas versions. 
  * Learning notebooks - minimum change: review 
  * Example notebooks - minimum change: review 
  * Exercise notebook - minimum change: review, new datasets
* To be released on 23 October 2023
* To be ready on: soon

| SLU |	Name | Last year instructor | Batch 7 instructor | Last year QA | 
|---|----|----|----|----|
| SLU01 | Pandas 101 | @majkah0 | @majkah0 | @Jujulian3 | 
| SLU02 | Subsetting Data in Pandas | @jgomes959 |@jgomes959 | @jgerebelo | 
| SLU03 | Visualization with Pandas & Matplotlib | @Gustavo-SF | @kagglekim  | @SaraOGomes | 
| Test | | @fabiocruz |  | @danizao @majkah0 @minhhoang1023 @Gustavo-SF | @cd702  @majkah0 @Gustavo-SF   |

Test on 23 October 2023


- **Batch 7 QA Lead** SLU01/SLU02/SLU03: @FilipaPereira78
- **Batch 7 backup QA** SLU01/SLU02/SLU03: @CaitlinHulse

\* The QA lead is responsible for checking the SLUs.
\*\* The backup QA is responsible for checking the SLUs in case there are major changes (exercise or material updates).
Both QA people verify the test.

### Specialization 1 + Bootcamp
* Project manager: José Rebelo @jgerebelo 
* Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
  * Learning notebook
  * Example notebook
  * Exercise notebook
* To be released on:
  * SLU04 - SLU10 learning notebooks: 19 November 2023
  * SLU04 - SLU10 exercise notebooks: 26 November 2023
  * SLU11 - SLU19 learning and exercise notebooks: 26 November 2023
* To be ready in November

| SLU | Name | Last year instructor | Batch 7 instructor | Last year QA | Batch 7 QA |
|---|----|----|----|----|----|
| SLU04 | Basic Stats with Pandas | @SaraOGomes | @cmm79   | @jgomes959 |  @BG2602 |
| SLU05 | Covariance & Correlation | @kagglekim | @cmm79  | @anaritarc |  @BG2602 |
| SLU06 | Dealing with Data Problems | @majkah0 | @TeignmouthElectron  | @SaraOGomes | @BG2602 |
| SLU07 | Regression with Linear Regression | @jgerebelo | @joaogilsa | @carlacotas |  @Mohamedgaber9 |
| SLU08 | Metrics for Regression | @marianahenriques1 | @joaogilsa | @cd702 | @Mohamedgaber9 |
| SLU09 | Classification with Logistic Regression | @majkah0  | @majkah0 | @carlacotas | @CaitlinHulse |
| SLU10 | Metrics for Classification | @phgui | @majkah0  | @majkah0 |  @CaitlinHulse |
| SLU11 | Tree-Based Models | @anaritarc | @margaridantunes  |@carlacotas | @Mohamedgaber9 |
| SLU12 | Feature Engineering (aka Real World Data) | @danizao  | João Nobre | @anaritarc | @Mohamedgaber9 |
| SLU13 | Bias-Variance tradeoff & Model Selection | @jgerebelo | @rodrigomverissimo  | @anaritarc |  @BG2602 |
| SLU14 | Model complexity & Overfitting | @Gustavo-SF  | @Gustavo-SF  | @Jujulian3 | @BG2602 |
| SLU15 | Hyperparameter Tuning |  @jgomes959 | @jgomes959 | @SaraOGomes |  @BG2602 |
| SLU16 | Workflow | @cimendes  |   | @fabiocruz |  @TeignmouthElectron |
| SLU17 | Ethics & Fairness | @hershaw  | @majkah0   | @Gustavo-SF | @TeignmouthElectron |
| SLU18 | Support Vector Machines (SVM) (optional unit) | @cimendes  | @majkah0  | @Jujulian3   | |
| SLU19 | k-Nearest Neighbors (kNN) (optional unit)| @cimendes  | @majkah0  | @Jujulian3 | |


| Group | SLUs | Batch 7 QA lead* | Batch 7 backup QA** |
|---|----|----|----|
| QA1 | SLU04, SLU05, SLU06 | @BG2602 | Caitlin Hulse |
| QA2 | SLU07, SLU08 | @Mohamedgaber9 | Cora |
| QA3 | SLU09, SLU10 | Caitlin Hulse | |
| QA4 | SLU11, SLU12 | @Mohamedgaber9 | |
| QA5 | SLU13, SLU14, SLU15 | @BG2602 | |
| QA6 | SLU16, SLU17 | @TeignmouthElectron | |

\* The QA lead is responsible for checking the SLUs.
\*\* The backup QA is responsible for checking the SLUs in case there are major changes (exercise or material updates).

### Bootcamp presentations
Bootcamp presentations will be split into two parts and given by senior instructors. This is what is expected from each instructor:
* The presentation should be <= 60 min, including student questions. It should cover concepts and insights for the given topic, not the technical implementation in Python.
* If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y).
* If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~1 hour meeting where they will come with prepared questions.

Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for the meeting with junior instructors.

**Bootcamp part 1, Sunday 26 November 2023**
* Instructor 1: LTPlabs
   * ~ 30 min: Intro to data science, SLU04 - Basic Stats with Pandas, SLU05 - Covariance and Correlation
   * ~ 30 min: SLU06 - Dealing with Data Problems
* Instructor 2: José Rebelo, EDP
   * 30 - 60 min: SLU07 - Regression with Linear Regression, SLU08 - Metrics for Regression
* Instructor 3: LTPlabs
    * 30 - 60 min: SLU09 - Classification with Logistic Regression, SLU10 - Metrics for Classification

**Bootcamp part 2, Sunday 3 December 2023**
* Instructor 4: João Ascensão, Stratio - TBC
    * 45-60 min: SLU11 - Tree-Based Models, SLU12 - Feature Engineering
* Instructor 5: Maria Cristina Dominguez
   * ~60 min: SLU13 - Bias-Variance tradeoff & Model Selection, SLU14 - Model complexity and Overfitting, SLU15 - Hyperparameter Tuning
* Instructor 6: Sam Hopkins, DareData
   * 30-60 min: SLU16 - Workflow, SLU17 - Ethics and Fairness

* Hackathon 1
  * Come up with a new problem for the hackathon
  * Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
  * Create baseline instructor solution
  * Evaluation guidelines doc
  * Overall guidelines for instructors to help out in hackathon
* To be released on 17 December 2023
* To be ready in November
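
The data-split check described above (common approaches should improve over a dummy baseline) could be sketched roughly as follows. This is only a minimal illustration, assuming a binary classification task scored by accuracy; the helper names `majority_baseline_accuracy` and `beats_baseline`, the `margin` parameter, and the toy labels are hypothetical and not part of the hackathon materials.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline_accuracy(y_train, y_test):
    """Accuracy of a dummy model that always predicts the most frequent training label."""
    majority_label = Counter(y_train).most_common(1)[0][0]
    return accuracy(y_test, [majority_label] * len(y_test))


def beats_baseline(y_train, y_test, y_pred, margin=0.0):
    """True if the candidate predictions beat the dummy baseline by more than `margin`."""
    return accuracy(y_test, y_pred) > majority_baseline_accuracy(y_train, y_test) + margin


# Toy labels, made up for illustration only.
y_train = [0, 0, 0, 1, 1]
y_test = [0, 1, 0, 1]
candidate_predictions = [0, 1, 0, 0]  # hypothetical output of a common approach

print(beats_baseline(y_train, y_test, candidate_predictions))
```

If the usual approaches fail a check like this, the dataset or the splits probably need another iteration before release.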

| Work unit | Name | Last year instructor | Batch 7 instructor | Last year QA | Batch 7 QA |
|---|----|----|----|----|----|
| Hackathon 1 | Binary Classification | @wilsonramos1 | @jgomes959 | ? | |

### Specialization 2,  8 January 2024 - 4 February 2024
* Project manager: Kim Pronk @kagglekim 
* Senior instructor: 
   * 1 hour AMA session
   * If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
   * If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
* Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

* Junior instructors
   * Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
   * In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
* To be released on **8 January (BLU01), 15 January (BLU02), 22 January (BLU03)**
* To be ready in **December 2023**

* Hackathon 2
  * Come up with a new problem for the hackathon
  * Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
  * Create baseline instructor solution
  * Evaluation guidelines doc
  * Overall guidelines for instructors to help out in hackathon
* To be released on **4 February (Hackathon 02)**
* To be ready in **mid January**

| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA | 
|---|----|----|----|----|
| **-** | **Spec lead** | @martinb-bb| | | 
| BLU01 | Messy Data | @JerBouma  | | @majkah0| 
| BLU02 | Advanced Wrangling | @minhhoang1023 | | @cd702| 
| BLU03 | Data Sources | @jmaslek | |@anaritarc| 
| Hackathon 2 | Data Wrangling |@martinb-bb @JerBouma @minhhoang1023  @DidierRLopes|||

- **Batch 7 QA Lead**  BLU01/BLU02/BLU03: @AhmedEmad2525 
- **Batch 7 backup QA**  BLU01/BLU02/BLU03:

\* The QA lead is responsible for checking the BLUs.
\*\* The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people verify the hackathon.


### Specialization 3, 5 February - 3 March 2024
* Project manager: Mária Hanulová @majkah0  
* Senior instructor: Telmo Felgueira, Loka / JungleAI
   * 1 hour AMA session
   * If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
   * If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
* Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

* Junior instructors
   * Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
   * In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
* To be released on **5 February (BLU04), 12 February (BLU05), 19 February (BLU06)**
* To be ready in **January**

* Hackathon 3
  * Come up with a new problem for the hackathon
  * Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
  * Create baseline instructor solution
  * Evaluation guidelines doc
  * Overall guidelines for instructors to help out in hackathon
* To be released on **3 March (Hackathon 03)**
* To be ready in **mid February**

| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA | 
|---|----|----|----|----|
| **-** | **Spec lead** | **@TSFelg** | **@TSFelg** ||
| BLU04 | Time Series Concepts | @PedroRibeiro80 ||@Sonia-se|
| BLU05 | Classical Time Series Models | @jgerebelo | |@carlacotas |
| BLU06 | Machine Learning for Time Series | @jdpsc | @TeignmouthElectron | @SaraOGomes|
| Hackathon 3 | Timeseries | @TSFelg | | @Gustavo-SF |

- **Batch 7 QA Lead**  BLU04/BLU05/BLU06: @Mohamedgaber9 
- **Batch 7 backup QA**  BLU04/BLU05/BLU06:

\* The QA lead is responsible for checking the BLUs.
\*\* The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people verify the hackathon.

### Specialization 4, 4 March - 31 March 2024 
* Project manager:
* Senior instructor:
   * 1 hour AMA session
   * If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
   * If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
* Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

* Junior instructors
   * Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
   * In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor. 
* To be released on **4 March (BLU07), 11 March (BLU08), 18 March (BLU09)**
* To be ready in **February**

* Hackathon 4
  * Come up with a new problem for the hackathon
  * Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
  * Create baseline instructor solution
  * Evaluation guidelines doc
  * Overall guidelines for instructors to help out in hackathon
* To be released on **31 March (Hackathon 04)**
* To be ready in **mid March**

| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|----|----|----|----|
| **-** | **Spec lead** | **@CatarinaSilva** | | |
| BLU07 | Feature Extraction | @CatarinaSilva | | @cd702 |
| BLU08 | Dimensionality Reduction | @CatarinaSilva | |@majkah0|
| BLU09 | Information Extraction | @CatarinaSilva | | @carlacotas | 
| Hackathon 4 | NLP | BancoBPI | | |

- **Batch 7 QA Lead**  BLU07/BLU08/BLU09: @CaitlinHulse
- **Batch 7 backup QA**  BLU07/BLU08/BLU09: @BG2602

\* The QA lead is responsible for checking the BLUs.
\*\* The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people verify the hackathon.

### Specialization 5 - this will be an optional specialization
* Project manager:
* Junior instructors
   * Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
* To be released in **March/April**
* To be ready in **March**

| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|----|----|----|----|
| **-** | **Spec lead** || ||
| BLU10 | Non-personalised Recommender | | |@majkah0 @anaritarc |
| BLU11 | Personalized Recommenders | | | @majkah0 @anaritarc|
| BLU12 | Workflow | | | @majkah0 @anaritarc |
| Hackathon 5 | Recommender Systems |  | ||

- **Batch 7 QA Lead**  BLU10/BLU11/BLU12: @TeignmouthElectron 
- **Batch 7 backup QA**  BLU10/BLU11/BLU12: 

\* The QA lead is responsible for checking the BLUs.
\*\* The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people verify the hackathon.


### Specialization 6, 1 April - 28 April 2024
* Project manager:
* Senior instructor: Gustavo Fonseca, LDSA
   * 1 hour AMA session
   * If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
   * If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
* Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

* Junior instructors
   * Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
   * In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor. 
* To be released on **1 April (BLU13), 8 April (BLU14), 15 April (BLU15)**
* To be ready in **March**

* Hackathon 6
  * Come up with a new problem for the hackathon
  * Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
  * Create baseline instructor solution
  * Evaluation guidelines doc
  * Overall guidelines for instructors to help out in hackathon
* To be released on **28 April (Hackathon 06)**
* To be ready in **mid April**

Extra session about Venture Capital: Armilar

| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|----|----|----|----|
| **-** | **Spec lead** | @cimendes | @Gustavo-SF | | 
| BLU13 | Basic model Deployment | @cimendes | | @carlacotas|
| BLU14 | Deployment in the real world | @cimendes | | |
| BLU15 | Model CSI | @cimendes | | @carlacotas |
| Hackathon 6 | Data science in real world | @CatarinaSilva @cimendes @InesPessoa | | | 

- **Batch 7 QA Lead**  BLU13/BLU14/BLU15: 
- **Batch 7 backup QA**  BLU13/BLU14/BLU15:

\* The QA lead is responsible for checking the BLUs.
\*\* The backup QA is responsible for checking the BLUs in case there are major changes (exercise or material updates).
Both QA people verify the hackathon.


### Capstone, 29 April - 15 July 2024

* Preparing a strong dataset and problem
* Helping to build documents, forms, etc.
* Replying to students' Q&A
* Beta-testing/QAing
* Grading capstone
* To be released on **29 April**
* To be ready in **mid April**

| Work unit | Name | Last year instructor(s) | Batch 7 instructor(s) |
|---|----|----|----|
| - | Capstone | @minhhoang1023 @cimendes @fabiocruz  @Gustavo-SF @anaritarc @majkah0 ||

Other possible extra sessions:  
* NOS (LLM, Data Science in Real World);
* AICEP (Classification, Data Science in Real World);
* BPI (Classification, Data Science in Real World)

