Allocation of units for batch7 #381

@cimendes

This is a work in progress! Template: #330

Description

The goal of this issue is to gauge interest and pre-allocate the teaching and QA work for batch 7.

The units, the overall work needed, and the release and delivery dates are listed
below.

Units

Admissions

  • Unit
    • Adjustment to the new Python and Pandas versions.
    • Learning notebooks - minimum change: review
    • Example notebooks - minimum change: review
    • Exercise notebook - minimum change: review, new datasets
  • To be released on 23 October 2023
  • To be ready on: soon

| SLU | Name | Last year instructor | Batch 7 instructor | Last year QA |
|---|---|---|---|---|
| SLU01 | Pandas 101 | @majkah0 | @majkah0 | @Jujulian3 |
| SLU02 | Subsetting Data in Pandas | @jgomes959 | @jgomes959 | @jgerebelo |
| SLU03 | Visualization with Pandas & Matplotlib | @Gustavo-SF | @kagglekim | @SaraOGomes |

Test @fabiocruz @danizao @majkah0 @minhhoang1023 @Gustavo-SF
Test on 23 October 2023

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case of major changes (exercise or material updates).
Both QA people verify the test.

Specialization 1 + Bootcamp

  • Project manager: José Rebelo @jgerebelo
  • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • Learning notebook
    • Example notebook
    • Exercise notebook
  • To be released on:
    • SLU04 - SLU10 learning notebooks: 19 November 2023
    • SLU04 - SLU10 exercise notebooks: 26 November 2023
    • SLU11 - SLU19 learning and exercise notebooks: 26 November 2023
  • To be ready in November

| SLU | Name | Last year instructor | Batch 7 instructor | Last year QA | Batch 7 QA |
|---|---|---|---|---|---|
| SLU04 | Basic Stats with Pandas | @SaraOGomes | @cmm79 | @jgomes959 | @BG2602 |
| SLU05 | Covariance & Correlation | @kagglekim | @cmm79 | @anaritarc | @BG2602 |
| SLU06 | Dealing with Data Problems | @majkah0 | @TeignmouthElectron | @SaraOGomes | @BG2602 |
| SLU07 | Regression with Linear Regression | @jgerebelo | @joaogilsa | @carlacotas | @Mohamedgaber9 |
| SLU08 | Metrics for Regression | @marianahenriques1 | @joaogilsa | @cd702 | @Mohamedgaber9 |
| SLU09 | Classification with Logistic Regression | @majkah0 | @majkah0 | @carlacotas | @CaitlinHulse |
| SLU10 | Metrics for Classification | @phgui | @majkah0 | @majkah0 | @CaitlinHulse |
| SLU11 | Tree-Based Models | @anaritarc | @margaridantunes | @carlacotas | @Mohamedgaber9 |
| SLU12 | Feature Engineering (aka Real World Data) | @danizao | João Nobre | @anaritarc | @Mohamedgaber9 |
| SLU13 | Bias-Variance tradeoff & Model Selection | @jgerebelo | @rodrigomverissimo | @anaritarc | @BG2602 |
| SLU14 | Model complexity & Overfitting | @Gustavo-SF | @Gustavo-SF | @Jujulian3 | @BG2602 |
| SLU15 | Hyperparameter Tuning | @jgomes959 | @jgomes959 | @SaraOGomes | @BG2602 |

SLU16 Workflow @cimendes @fabiocruz @TeignmouthElectron
SLU17 Ethics & Fairness @hershaw @majkah0 @Gustavo-SF @TeignmouthElectron
SLU18 Support Vector Machines (SVM) (optional unit) @cimendes @majkah0 @Jujulian3
SLU19 k-Nearest Neighbors (kNN) (optional unit) @cimendes @majkah0 @Jujulian3

| Group | SLUs | Batch 7 QA lead* | Batch 7 backup QA** |
|---|---|---|---|
| QA1 | SLU04, SLU05, SLU06 | @BG2602 | Caitlin Hulse |
| QA2 | SLU07, SLU08 | @Mohamedgaber9 | Cora |
| QA3 | SLU09, SLU10 | Caitlin Hulse | |
| QA4 | SLU11, SLU12 | @Mohamedgaber9 | |
| QA5 | SLU13, SLU14, SLU15 | @BG2602 | |
| QA6 | SLU16, SLU17 | @TeignmouthElectron | |

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case of major changes (exercise or material updates).

Bootcamp presentations

Bootcamp presentations will be split into two parts and given by senior instructors. This is what is expected from each instructor:

  • The presentation should take at most 60 min, including student questions. It should cover concepts and insights for the given topic, not the technical implementation in Python.
  • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y).
  • If possible, answer questions from the junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting to which they bring prepared questions.
    Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for the meeting with junior instructors.

Bootcamp part 1, Sunday 26 November 2023

  • Instructor 1: LTPlabs
    • ~ 30 min: Intro to data science, SLU04 - Basic Stats with Pandas, SLU05 - Covariance and Correlation
    • ~ 30 min: SLU06 - Dealing with Data Problems
  • Instructor 2: José Rebelo, EDP
    • 30 - 60 min: SLU07 - Regression with Linear Regression, SLU08 - Metrics for Regression
  • Instructor 3: LTPlabs
    • 30 - 60 min: SLU09 - Classification with Logistic Regression, SLU10 - Metrics for Classification

Bootcamp part 2, Sunday 3 December 2023

  • Instructor 4: João Ascensão, Stratio - TBC
    • 45 - 60 min: SLU11 - Tree-Based Models, SLU12 - Feature Engineering
  • Instructor 5: Maria Cristina Dominguez
    • ~ 60 min: SLU13 - Bias-Variance tradeoff & Model Selection, SLU14 - Model complexity and Overfitting, SLU15 - Hyperparameter Tuning
  • Instructor 6: Sam Hopkins, DareData
    • 30 - 60 min: SLU16 - Workflow, SLU17 - Ethics and Fairness
  • Hackathon 1

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 17 December 2023

  • To be ready in November
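
The requirement that common approaches improve over a dummy baseline can be checked with a short script before a dataset and split are accepted. The sketch below is only an illustration under stated assumptions: it uses the Python standard library and synthetic data, and the names `make_toy_rows`, `split_rows`, `fit_dummy`, `fit_threshold` and `compare_to_dummy` are hypothetical, not part of any existing hackathon tooling. For a real hackathon the same comparison would likely be run on the candidate dataset with the libraries taught in the SLUs.

```python
import random
from collections import Counter


def make_toy_rows(n_per_class=200, seed=0):
    """Synthetic binary data (assumption): one numeric feature, label 0 or 1."""
    rng = random.Random(seed)
    rows = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    rows += [(rng.gauss(3.0, 1.0), 1) for _ in range(n_per_class)]
    return rows


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, rows):
    """Fraction of rows whose label is predicted correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def fit_dummy(train):
    """Dummy baseline: always predict the most frequent training label."""
    majority = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: majority


def class_mean(train, label):
    """Mean feature value of the training rows with the given label."""
    values = [x for x, y in train if y == label]
    return sum(values) / len(values)


def fit_threshold(train):
    """Very simple model: cut halfway between the two class means."""
    mean0, mean1 = class_mean(train, 0), class_mean(train, 1)
    cut = (mean0 + mean1) / 2
    high_label = 1 if mean1 > mean0 else 0
    return lambda x: high_label if x > cut else 1 - high_label


def compare_to_dummy(rows):
    """Return (dummy accuracy, simple model accuracy) on the held-out split."""
    train, test = split_rows(rows)
    return accuracy(fit_dummy(train), test), accuracy(fit_threshold(train), test)


dummy_acc, model_acc = compare_to_dummy(make_toy_rows())
print(f"dummy baseline accuracy: {dummy_acc:.2f}")
print(f"simple model accuracy:   {model_acc:.2f}")
print("split looks usable" if model_acc > dummy_acc else "revisit the dataset or split")
```

If the simple model does not beat the dummy baseline, that is a signal to iterate on the dataset or the split before writing the instructor solution.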

| Work unit | Name | Last year instructor | Batch 7 instructor | Last year QA | Batch 7 QA |
|---|---|---|---|---|---|
| Hackathon 1 | Binary Classification | @wilsonramos1 | @jgomes959 | ? | . |

Specialization 2, 8 January 2024 - 4 February 2024

  • Project manager: Kim Pronk @kagglekim

  • Senior instructor:

    • 1 hour AMA session
    • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
    • If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
  • Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

  • Junior instructors

    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
  • To be released on 8 January (BLU01), 15 January (BLU02), 22 January (BLU03)

  • To be ready in December 2023

  • Hackathon 2

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 4 February (Hackathon 02)

  • To be ready in mid January

Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead @martinb-bb
BLU01 Messy Data @JerBouma @majkah0
BLU02 Advanced Wrangling @minhhoang1023 @cd702
BLU03 Data Sources @jmaslek @anaritarc
Hackathon 2 Data Wrangling @martinb-bb @JerBouma @minhhoang1023 @DidierRLopes
  • Batch 7 QA Lead BLU01/BLU02/BLU03: @AhmedEmad2525
  • Batch 7 backup QA BLU01/BLU02/BLU03:

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case of major changes (exercise or material updates).
Both QA people verify the hackathon.

Specialization 3, 5 February - 3 March 2024

  • Project manager: Mária Hanulová @majkah0

  • Senior instructor: Telmo Felgueira, Loka / JungleAI

    • 1 hour AMA session
    • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
    • If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
  • Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

  • Junior instructors

    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
  • To be released on 5 February (BLU04), 12 February (BLU05), 19 February (BLU06)

  • To be ready in January

  • Hackathon 3

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 3 March (Hackathon 03)

  • To be ready in mid February

Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead @TSFelg @TSFelg
BLU04 Time Series Concepts @PedroRibeiro80 @Sonia-se
BLU05 Classical Time Series Models @jgerebelo @carlacotas
BLU06 Machine Learning for Time Series @jdpsc @TeignmouthElectron @SaraOGomes
Hackathon 3 Timeseries @TSFelg @Gustavo-SF
  • Batch 7 QA Lead BLU04/BLU05/BLU06: @Mohamedgaber9
  • Batch 7 backup QA BLU04/BLU05/BLU06:

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case of major changes (exercise or material updates).
Both QA people verify the hackathon.

Specialization 4, 4 March - 31 March 2024

  • Project manager:

  • Senior instructor:

    • 1 hour AMA session
    • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
    • If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
  • Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

  • Junior instructors

    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
  • To be released on 4 March (BLU07), 11 March (BLU08), 18 March (BLU09)

  • To be ready in February

  • Hackathon 4

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 31 March (Hackathon 04)

  • To be ready in mid March

Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead @CatarinaSilva
BLU07 Feature Extraction @CatarinaSilva @cd702
BLU08 Dimensionality Reduction @CatarinaSilva @majkah0
BLU09 Information Extraction @CatarinaSilva @carlacotas
Hackathon 4 NLP BancoBPI

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case of major changes (exercise or material updates).
Both QA people verify the hackathon.

Specialization 5 - this will be an optional specialization

  • Project manager:
  • Junior instructors
    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
  • To be released in March/April
  • To be ready in March
Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead
BLU10 Non-personalised Recommender @majkah0 @anaritarc
BLU11 Personalized Recommenders @majkah0 @anaritarc
BLU12 Workflow @majkah0 @anaritarc
Hackathon 5 Recommender Systems

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case of major changes (exercise or material updates).
Both QA people verify the hackathon.

Specialization 6, 1 April - 28 April 2024

  • Project manager:

  • Senior instructor: Gustavo Fonseca, LDSA

    • 1 hour AMA session
    • If possible, read the relevant learning notebooks and give high-level feedback on them (e.g. topic X is not relevant, you should add topic Y)
    • If possible, answer questions from junior instructors who will update the corresponding SLUs, e.g. in a ~ 1 hour meeting where they will come with prepared questions.
  • Time requirements: 1 hour for the talk + time to prepare the talk (we can provide slides from last year), <1 hour to read the learning notebooks, 1 hour for meeting with junior instructors.

  • Junior instructors

    • Unit: minimum changes based on issues from last year and this year's QA. Adjustment to the new Python and Pandas versions. This year, we are doing a reverse process - first QA, then unit improvement.
    • In case of doubts, ask the senior instructor. Ideally, prepare all your questions, then meet with the senior instructor.
  • To be released on 1 April (BLU13), 8 April (BLU14), 15 April (BLU15)

  • To be ready in March

  • Hackathon 6

    • Come up with new problem for hackathon
    • Create the description, find a dataset, and create data splits (this should involve some iteration and validating that the most common approaches work well, i.e., they improve over a dummy baseline)
    • Create baseline instructor solution
    • Evaluation guidelines doc
    • Overall guidelines for instructors to help out in hackathon
  • To be released on 28 April (Hackathon 06)

  • To be ready in mid April

Extra session about Venture Capital: Armilar

Work unit Name Last year instructor Batch 7 instructor(s) Last year QA
- Spec lead @cimendes @Gustavo-SF
BLU13 Basic model Deployment @cimendes @carlacotas
BLU14 Deployment in the real world @cimendes
BLU15 Model CSI @cimendes @carlacotas
Hackathon 6 Data science in real world @CatarinaSilva @cimendes @InesPessoa
  • Batch 7 QA Lead BLU13/BLU14/BLU15:
  • Batch 7 backup QA BLU13/BLU14/BLU15:

* Responsible for checking the SLUs.
** Responsible for checking the SLUs in case of major changes (exercise or material updates).
Both QA people verify the hackathon.

Capstone, 29 April - 15 July 2024

  • Preparing a strong dataset and problem
  • Helping to build documents, forms, etc.
  • Replying to students' questions
  • Beta-testing/QAing
  • Grading capstone
  • To be released on 29 April
  • To be ready in mid April
Work unit Name Last year instructor(s) Batch 7 instructor(s)
- Capstone @minhhoang1023 @cimendes @fabiocruz @Gustavo-SF @anaritarc @majkah0

Other possible extra sessions:

  • NOS (LLM, Data Science in Real World);
  • AICEP (Classification, Data Science in Real World);
  • BPI (Classification, Data Science in Real World)

Metadata

Assignees: no one assigned

Labels: Batch 7, QA AOR, Teaching AOR, priority:high
