This is a work in progress! Template: #330
Description
The goal of this issue is to assess interest and pre-allocate Batch 7's teaching and QA work.
The units, the overall work needed, and the release and delivery dates are listed below.
Units
Admissions
- Unit
- Adjustment to the new Python and Pandas versions.
- Learning notebooks - minimum change: review
- Example notebooks - minimum change: review
- Exercise notebook - minimum change: review, new datasets
- To be released on 23 October 2023
- To be ready on: soon
| SLU | Name | Last year instructor | Batch 7 instructor | Last year QA |
|---|---|---|---|---|
| SLU01 | Pandas 101 | @majkah0 | @majkah0 | @Jujulian3 |
| SLU02 | Subsetting Data in Pandas | @jgomes959 | @jgomes959 | @jgerebelo |
| SLU03 | Visualization with Pandas & Matplotlib | @Gustavo-SF | @kagglekim | @SaraOGomes |
| Test | @fabiocruz | @danizao @majkah0 @minhhoang1023 @Gustavo-SF | | |
| Test on 23 October 2023 | | | | |
- Batch 7 QA Lead SLU01/SLU02/SLU03: @FilipaPereira78
- Batch 7 backup QA SLU01/SLU02/SLU03: @CaitlinHulse
* The QA lead is responsible for checking the SLUs.
** The backup QA is responsible for checking the SLUs if there are major changes (exercise or material updates).
Both QA people verify the test.
Specialization 1 + Bootcamp
- Project manager: José Rebelo @jgerebelo
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- Learning notebook
- Example notebook
- Exercise notebook
- To be released on:
- SLU04 - SLU10 learning notebooks: 19 November 2023
- SLU04 - SLU10 exercise notebooks: 26 November 2023
- SLU11 - SLU19 learning and exercise notebooks: 26 November 2023
- To be ready in November
| SLU | Name | Last year instructor | Batch 7 instructor | Last year QA | Batch 7 QA |
|---|---|---|---|---|---|
| SLU04 | Basic Stats with Pandas | @SaraOGomes | @cmm79 | @jgomes959 | @BG2602 |
| SLU05 | Covariance & Correlation | @kagglekim | @cmm79 | @anaritarc | @BG2602 |
| SLU06 | Dealing with Data Problems | @majkah0 | @TeignmouthElectron | @SaraOGomes | @BG2602 |
| SLU07 | Regression with Linear Regression | @jgerebelo | @joaogilsa | @carlacotas | @Mohamedgaber9 |
| SLU08 | Metrics for Regression | @marianahenriques1 | @joaogilsa | @cd702 | @Mohamedgaber9 |
| SLU09 | Classification with Logistic Regression | @majkah0 | @majkah0 | @carlacotas | @CaitlinHulse |
| SLU10 | Metrics for Classification | @phgui | @majkah0 | @majkah0 | @CaitlinHulse |
| SLU11 | Tree-Based Models | @anaritarc | @margaridantunes | @carlacotas | @Mohamedgaber9 |
| SLU12 | Feature Engineering (aka Real World Data) | @danizao | João Nobre | @anaritarc | @Mohamedgaber9 |
| SLU13 | Bias-Variance tradeoff & Model Selection | @jgerebelo | @rodrigomverissimo | @anaritarc | @BG2602 |
| SLU14 | Model complexity & Overfitting | @Gustavo-SF | @Gustavo-SF | @Jujulian3 | @BG2602 |
| SLU15 | Hyperparameter Tuning | @jgomes959 | @jgomes959 | @SaraOGomes | @BG2602 |
| SLU16 | Workflow | @cimendes | @fabiocruz | @TeignmouthElectron | |
| SLU17 | Ethics & Fairness | @hershaw | @majkah0 | @Gustavo-SF | @TeignmouthElectron |
| SLU18 | Support Vector Machines (SVM) (optional unit) | @cimendes | @majkah0 | @Jujulian3 | |
| SLU19 | k-Nearest Neighbors (kNN) (optional unit) | @cimendes | @majkah0 | @Jujulian3 | |
| Group | SLUs | Batch 7 QA lead* | Batch 7 backup QA** |
|---|---|---|---|
| QA1 | SLU04, SLU05, SLU06 | @BG2602 | Caitlin Hulse |
| QA2 | SLU07, SLU08 | @Mohamedgaber9 | Cora |
| QA3 | SLU09, SLU10 | Caitlin Hulse | |
| QA4 | SLU11, SLU12 | @Mohamedgaber9 | |
| QA5 | SLU13, SLU14, SLU15 | @BG2602 | |
| QA6 | SLU16, SLU17 | @TeignmouthElectron | |
* The QA lead is responsible for checking the SLUs.
** The backup QA is responsible for checking the SLUs if there are major changes (exercise or material updates).
Bootcamp presentations
Bootcamp presentations will be split into two parts. Presentations will be given by senior instructors. This is what is expected from each instructor:
- The presentation should take <= 60 min, including student questions, and should focus on concepts and insights for the given topic, not the technical implementation in Python.
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
- If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
Bootcamp part 1, Sunday 26 November 2023
- Instructor 1: LTPlabs
- ~ 30 min: Intro to data science, SLU04 - Basic Stats with Pandas, SLU05 - Covariance and Correlation
- ~ 30 min: SLU06 - Dealing with Data Problems
- Instructor 2: José Rebelo, EDP
- 30 - 60 min: SLU07 - Regression with Linear Regression, SLU08 - Metrics for Regression
- Instructor 3: LTPlabs
- 30 - 60 min: SLU09 - Classification with Logistic Regression, SLU10 - Metrics for Classification
Bootcamp part 2, Sunday 3 December 2023
- Instructor 4: João Ascensão, Stratio - TBC
- 45-60 min: SLU11 - Tree-Based Models, SLU12 - Feature Engineering
- Instructor 5: Maria Cristina Dominguez
- ~60 min: SLU13 - Bias-Variance tradeoff & Model Selection, SLU14 - Model complexity and Overfitting, SLU15 - Hyperparameter Tuning
- Instructor 6: Sam Hopkins, DareData
- 30-60 min: SLU16 - Workflow, SLU17 - Ethics and Fairness
- Hackathon 1
- Come up with a new problem for the hackathon
- Create the description, find a dataset, and create data splits (this should take some iteration, including validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 17 December 2023
- To be ready in November
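For whoever prepares a hackathon dataset, the "improve over a dummy baseline" check mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: the toy dataset, the threshold `candidate` model, and all helper names (`stratified_split`, `majority_class_baseline`, `accuracy`) are hypothetical and not taken from any existing hackathon material. It uses only the Python standard library; a real check would likely use the actual dataset and the instructor's baseline solution.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split (features, label) rows into train/test lists, keeping label proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for features, label in rows:
        by_label[label].append((features, label))
    train, test = [], []
    for label_rows in by_label.values():
        rng.shuffle(label_rows)
        n_test = int(len(label_rows) * test_fraction)
        test.extend(label_rows[:n_test])
        train.extend(label_rows[n_test:])
    return train, test


def majority_class_baseline(train):
    """Dummy baseline: always predict the most frequent label seen in training."""
    most_common_label = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: most_common_label


def accuracy(predict, rows):
    """Fraction of rows whose label the predictor gets right."""
    return sum(predict(features) == label for features, label in rows) / len(rows)


# Toy stand-in for a hackathon dataset (assumption): one numeric feature, binary label.
rows = [(x, int(x >= 70)) for x in range(100)]
train, test = stratified_split(rows)

baseline = majority_class_baseline(train)
candidate = lambda x: int(x >= 70)  # hypothetical "common approach"

baseline_acc = accuracy(baseline, test)
candidate_acc = accuracy(candidate, test)
print(f"baseline={baseline_acc:.2f} candidate={candidate_acc:.2f}")

# The split is only useful for the hackathon if a sensible model beats the dummy.
assert candidate_acc > baseline_acc, "candidate does not beat the dummy baseline"
```

The same idea applies to the later hackathons: if a reasonable approach cannot beat the dummy baseline on the chosen split, the dataset or the split probably needs another iteration.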
| Work unit | Name | Last year instructor | Batch 7 instructor | Last year QA | Batch 7 QA |
|---|---|---|---|---|---|
| Hackathon 1 | Binary Classification | @wilsonramos1 | @jgomes959 | ? | . |
Specialization 2, 8 January 2024 - 4 February 2024
- Project manager: Kim Pronk @kagglekim
- Senior instructor:
- 1 hour AMA session
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
- If possible, answer questions from junior instructors who will update the corresponding BLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
- Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
- To be released on 8 January (BLU01), 15 January (BLU02), 22 January (BLU03)
- To be ready in **December 2023**
- Hackathon 2
- Come up with a new problem for the hackathon
- Create the description, find a dataset, and create data splits (this should take some iteration, including validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 4 February (Hackathon 02)
- To be ready in mid January
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | @martinb-bb | | |
| BLU01 | Messy Data | @JerBouma | @majkah0 | |
| BLU02 | Advanced Wrangling | @minhhoang1023 | @cd702 | |
| BLU03 | Data Sources | @jmaslek | @anaritarc | |
| Hackathon 2 | Data Wrangling | @martinb-bb @JerBouma @minhhoang1023 @DidierRLopes | | |
- Batch 7 QA Lead BLU01/BLU02/BLU03: @AhmedEmad2525
- Batch 7 backup QA BLU01/BLU02/BLU03:
* The QA lead is responsible for checking the BLUs.
** The backup QA is responsible for checking the BLUs if there are major changes (exercise or material updates).
Both QA people verify the hackathon.
Specialization 3, 5 February - 3 March 2024
- Project manager: Mária Hanulová @majkah0
- Senior instructor: Telmo Felgueira, Loka / JungleAI
- 1 hour AMA session
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
- If possible, answer questions from junior instructors who will update the corresponding BLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
- Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
- To be released on 5 February (BLU04), 12 February (BLU05), 19 February (BLU06)
- To be ready in January
- Hackathon 3
- Come up with a new problem for the hackathon
- Create the description, find a dataset, and create data splits (this should take some iteration, including validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 3 March (Hackathon 03)
- To be ready in mid February
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | @TSFelg | @TSFelg | |
| BLU04 | Time Series Concepts | @PedroRibeiro80 | @Sonia-se | |
| BLU05 | Classical Time Series Models | @jgerebelo | @carlacotas | |
| BLU06 | Machine Learning for Time Series | @jdpsc | @TeignmouthElectron | @SaraOGomes |
| Hackathon 3 | Time Series | @TSFelg | @Gustavo-SF | |
- Batch 7 QA Lead BLU04/BLU05/BLU06: @Mohamedgaber9
- Batch 7 backup QA BLU04/BLU05/BLU06:
* The QA lead is responsible for checking the BLUs.
** The backup QA is responsible for checking the BLUs if there are major changes (exercise or material updates).
Both QA people verify the hackathon.
Specialization 4, 4 March - 31 March 2024
- Project manager:
- Senior instructor:
- 1 hour AMA session
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
- If possible, answer questions from junior instructors who will update the corresponding BLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
- Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
- To be released on 4 March (BLU07), 11 March (BLU08), 18 March (BLU09)
- To be ready in February
- Hackathon 4
- Come up with a new problem for the hackathon
- Create the description, find a dataset, and create data splits (this should take some iteration, including validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 31 March (Hackathon 04)
- To be ready in **mid March**
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | @CatarinaSilva | | |
| BLU07 | Feature Extraction | @CatarinaSilva | @cd702 | |
| BLU08 | Dimensionality Reduction | @CatarinaSilva | @majkah0 | |
| BLU09 | Information Extraction | @CatarinaSilva | @carlacotas | |
| Hackathon 4 | NLP | BancoBPI | | |
- Batch 7 QA Lead BLU07/BLU08/BLU09: @CaitlinHulse
- Batch 7 backup QA BLU07/BLU08/BLU09: @BG2602
* The QA lead is responsible for checking the BLUs.
** The backup QA is responsible for checking the BLUs if there are major changes (exercise or material updates).
Both QA people verify the hackathon.
Specialization 5 - this will be an optional specialization
- Project manager:
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- To be released in March/April
- To be ready in March
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | | | |
| BLU10 | Non-personalised Recommender | @majkah0 @anaritarc | | |
| BLU11 | Personalized Recommenders | @majkah0 @anaritarc | | |
| BLU12 | Workflow | @majkah0 @anaritarc | | |
| Hackathon 5 | Recommender Systems | | | |
- Batch 7 QA Lead BLU10/BLU11/BLU12: @TeignmouthElectron
- Batch 7 backup QA BLU10/BLU11/BLU12:
* The QA lead is responsible for checking the BLUs.
** The backup QA is responsible for checking the BLUs if there are major changes (exercise or material updates).
Both QA people verify the hackathon.
Specialization 6, 1 April - 28 April 2024
- Project manager:
- Senior instructor: Gustavo Fonseca, LDSA
- 1 hour AMA session
- If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
- If possible, answer questions from junior instructors who will update the corresponding BLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
- Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.
- Junior instructors
- Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
- In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
- To be released on 1 April (BLU13), 8 April (BLU14), 15 April (BLU15)
- To be ready in March
- Hackathon 6
- Come up with a new problem for the hackathon
- Create the description, find a dataset, and create data splits (this should take some iteration, including validating that the most common approaches work well, i.e., they improve over a dummy baseline)
- Create a baseline instructor solution
- Evaluation guidelines doc
- Overall guidelines for instructors to help out in the hackathon
- To be released on 28 April (Hackathon 06)
- To be ready in mid April
Extra session about Venture Capital: Armilar
| Work unit | Name | Last year instructor | Batch 7 instructor(s) | Last year QA |
|---|---|---|---|---|
| - | Spec lead | @cimendes | @Gustavo-SF | |
| BLU13 | Basic model Deployment | @cimendes | @carlacotas | |
| BLU14 | Deployment in the real world | @cimendes | | |
| BLU15 | Model CSI | @cimendes | @carlacotas | |
| Hackathon 6 | Data science in real world | @CatarinaSilva @cimendes @InesPessoa | | |
- Batch 7 QA Lead BLU13/BLU14/BLU15:
- Batch 7 backup QA BLU13/BLU14/BLU15:
* The QA lead is responsible for checking the BLUs.
** The backup QA is responsible for checking the BLUs if there are major changes (exercise or material updates).
Both QA people verify the hackathon.
Capstone, 29 April - 15 July 2024
- Preparing a strong dataset and problem
- Helping build documents, forms, etc.
- Replying to student Q&A
- Beta-testing/QAing
- Grading capstone
- To be released on 29 April
- To be ready in mid April
| Work unit | Name | Last year instructor(s) | Batch 7 instructor(s) |
|---|---|---|---|
| - | Capstone | @minhhoang1023 @cimendes @fabiocruz @Gustavo-SF @anaritarc @majkah0 | |
Other possible extra sessions:
- NOS (LLM, Data Science in Real World);
- AICEP (Classification, Data Science in Real World);
- BPI (Classification, Data Science in Real World)