Jan van Gemert

Doing an MSc thesis at Delft University of Technology (TU Delft) in Computer Vision / Deep learning

I try to give MSc students a sense of doing real, cutting-edge research, and I have published in top venues based on an MSc thesis.
I exclusively supervise MSc thesis topics on visual data (images, video), and I do fundamental research, which means that I focus on understanding underlying concepts and a bit less on specific applications.

First meeting.
For our first meeting, please have a look at the following points and prepare accordingly:

  • Background: Show me that you have good grades for: 1. Deep Learning; and 2. Computer Vision by Deep Learning. Additional Machine Learning, Graphics, or Image Processing courses will be appreciated.
  • Information: I will need your first name, last name, email address, name of your MSc program, formal thesis starting date, expected graduation date, netID, and student number.
  • Thesis at a company?: We need to set up a "3 party agreement" about IP etc. Please contact thesis-eemcs@tudelft.nl.
  • MaRe: (For CS students) Sign up your MSc thesis in MaRe (do this as soon as possible).
  • Brightspace: Sign up for the MSc CS Thesis project EEMCS in Brightspace.
  • Planning: Planning is part of your final grade and you are responsible for making sure that you graduate on time. Add a text file named "planning.txt" to the 'notes' folder in your git, with a timeline (give the expected dates) of when required forms, committee meetings, milestones, green light moments, etc., are due. Note that I cannot keep track of each step for all students, so please do not depend on me and plan this for yourself. The information can be found on Brightspace and MaRe.
  • Git: During this meeting I will create a git project for you. Please create 3 folders: 'code' (for your code), 'thesis' (for your LaTeX files), 'notes' (for our agreements).
  • Committee: For your thesis committee we need at least 1 member from outside the PRB section. At our first meeting, please suggest 2 potential EWI faculty members outside PRB to be on your committee. You can find potential committee members on the "people" sections of the TUD computer science webpages, or they can be faculty members whose courses you enjoyed or with whom you have had contact before. Note: it has to be a faculty member from EWI. Try to submit the "Thesis Committee form" as soon as possible.
  • Marunka: After we have had our first meeting, send an email to Marunka van Stight (M.vanStight@tudelft.nl) to let her know your starting date and that you will start your MSc thesis with me, and ask her for an account on our GPU cluster.
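The folder layout and the "planning.txt" file described above can be set up in one go. The following is only a minimal sketch: the function name `init_thesis_repo`, the milestone wording, and the `YYYY-MM-DD` placeholders are assumptions made for illustration, and the actual list of required forms and deadlines is the one on Brightspace and MaRe.

```python
from pathlib import Path

# Illustrative milestone names based on this page; the dates are placeholders
# that you replace with your own expected dates.
MILESTONES = [
    "Formal thesis start",
    "Thesis Committee form submitted",
    "Go/NoGo (first stage review, around week 10)",
    "Thesis report emailed before the green light meeting",
    "Green light meeting",
    "Defense",
    "Expected graduation date",
]


def init_thesis_repo(root):
    """Create 'code', 'thesis', 'notes' and a planning.txt skeleton.

    Returns the path of notes/planning.txt.
    """
    root = Path(root)
    for name in ("code", "thesis", "notes"):
        (root / name).mkdir(parents=True, exist_ok=True)
    planning = root / "notes" / "planning.txt"
    if not planning.exists():  # never overwrite a plan you already wrote
        lines = [f"YYYY-MM-DD  {milestone}" for milestone in MILESTONES]
        planning.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return planning
```

Calling `init_thesis_repo(".")` from the root of your git project creates the three folders and the skeleton file. Git does not track empty directories, so 'code' and 'thesis' will only show up in the repository once they contain at least one file.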
Thesis start.
Things to do when you start your thesis:

  • Mattermost: Sign up for our Mattermost and join the public "CV-lab-MSCthesis" channel. You can use this channel to get in contact with other MSc students. Writing a thesis is individual, and can be lonely, so please use this opportunity to meet/discuss with other students in a similar situation. Many parts of the research process are the same for each student, even if particular topics vary. Sharing these experiences with others benefits you, as well as your fellow students.
  • Mandatory student meetings: Every 2 weeks we have scheduled MSc student presentations from the PRB section (schedule). You are required to present a few times (follow my presentation guidelines) and you have to attend at least 10 times; if you cannot make it, send an email to Marunka van Stight (M.vanStight@tudelft.nl). The benefit for you is that you can practice your presentation a few times before your final defense and that you get feedback and insight into what others are doing. If you have questions about this, please ask Marunka van Stight.
  • GPU cluster: Please inform yourself about how to use our compute cluster.
  • Cluster folder: On the cluster, please create a folder with your name in "/tudelft.net/staff-umbrella/StudentsCVlab/" and use it for your experiments. This is not a safe place to store original source code: make sure all important code is backed up in your git. Please avoid using many small files, as this slows down access and clogs the file system (the number of inodes is limited); consider using HDF5 instead. Check for common datasets in "/tudelft.net/staff-bulk/ewi/insy/CV-DataSets". If you use conda, run "conda clean --all" after installing. Please regularly remove things you no longer need.
  • Research: Read my research guidelines.
  • Committee: Please submit the "Thesis Committee form" as soon as possible, where we add an EWI faculty member. This has to be an 'external' EWI faculty member, ie: outside of the PRB group (see "first meeting").
During your thesis.
Things to keep in mind when you are doing your thesis research (remember to go back to this page):

  • Go/NoGo: After around 10 weeks we have a Go/NoGo decision (first stage review). The goal is to verify that the topic is suitable: if the topic is not good/fertile, or not a good fit for you, then it is difficult, but really better for you, to find a different topic as soon as possible; otherwise you might not be able to graduate (or you might spend excessive amounts of time). Don't worry: I've never had a "no go" yet, so if you are able to make some progress it will be OK; there is also a 'repair' possibility. (For CS students) We will fill out the MaRe form together, see the MaRe system.
  • Green light: The green-light moment is when your (co)supervisor(s) agree that you have enough results to graduate. This means that the main research questions have been answered. Yet, there are possibly more things to try, and more text to write. When we believe that the main results will no longer change (there will be no more surprises), that is "green light". To determine "green light", you should be able to answer the "Graduation questions" written at the bottom of this document. We also need to determine that you will finish a good quality thesis report in the remaining time, so please email us a thesis report before the green light meeting.
  • Training: Are you training deep networks? Unfortunately, significant time goes to tuning hyper-parameters; please be prepared to follow "A Recipe for Training Neural Networks".
  • Co-supervisor: If you have a co-supervisor: Meet them individually each week.
  • When to meet me: Meet me whenever it is useful for you. This could for example be when you want to brainstorm, are unsure about something, or need to make an informed decision about which direction to go. If you need to, we can meet every week; if you don't need to, we can meet after 3 weeks. Plan it whenever it is useful for you: if you want to meet me just to update me on your progress, then it might not be very useful for you and you can consider postponing the meeting. Please keep in mind that you are responsible for scheduling meetings and for your own progress.
  • How to meet me: Invite your co-supervisor for each meeting and schedule yourself for a slot online.
  • Please try to keep presentations short so that there is enough time to interact.
  • What to discuss with me: Start with a short presentation taking no longer than half of the scheduled time. The first slide should be the current "high level story line"; for what a "story line" is, please refer to my Research paper template. This story line makes sure that we are all still aware of the topic and direction, which is important as the direction (and thus the story line) will change over time. Slide 2: a quick reminder of which part of the story line you will now present, and possibly any agreements that we made last time. The remaining slides are for content. Put the answers to the typical questions I ask on each slide, if relevant. If you present results, make sure to explicitly write down on the slide all conclusions you draw from these results.
  • Update: Do not forget to keep everything in your git up to date.
  • Stuck?: Whenever you are stuck: First re-read my guidelines, especially the typical questions I ask, as you can try to ask them yourself.
  • Committee: Have you already submitted the "Thesis Committee form", where we add an EWI faculty member outside of PRB (see "first meeting")?
  • Clean: Please regularly remove things you no longer need. How is it going with the number of (small) files? Please avoid using very many small files, as this slows down access and clogs the file system (the number of inodes is limited). Consider using HDF5.
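To answer the question about the number of (small) files, it helps to actually count them. The following is a minimal sketch using only the Python standard library; the function name `count_files` and the 64 KiB threshold for "small" are assumptions chosen for illustration, not a cluster policy.

```python
import os


def count_files(root, small_bytes=64 * 1024):
    """Return (total_files, small_files) found below root.

    small_bytes is an arbitrary illustrative threshold for a "small" file.
    """
    total = 0
    small = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.lstat(path).st_size  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += 1
            if size < small_bytes:
                small += 1
    return total, small
```

For example, `count_files("/path/to/your/cluster/folder")` returns the total number of files and how many of them fall below the threshold. If the second number runs into the hundreds of thousands, that is a good moment to pack the data into a few larger files, for instance with HDF5 as suggested above.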
Thesis report.
The core of your MSc Thesis report is a scientific article so that: 1. you learn how to write a scientific article, 2. we can see if you can separate core from detail; and 3. that the research is easy to submit for publication (if the work lends itself for that).

A scientific article is aimed at fellow experts in your research topic. Yet, your thesis should be understandable by a broader audience of knowledgeable non-experts. Thus, in your thesis report you should add the relevant technical context that allows the scientific article to be read by knowledgeable non-experts, such as an external committee member who is an expert on a different topic.

This means your thesis report will need three parts:
  • Part 1: General introduction chapter: The goal of this small chapter is to gently introduce the research and make the full report readable for a non-expert. Thus, you explain the structure (thesis = scientific article + background) and briefly explain the "high level story line" in such a way that a non-expert can understand it; it's great to use visuals here. Mention that part 3 is a scientific article, mention which technical background sections are found in part 2, and explain how the background sections in part 2 relate to the article in part 3.
  • Part 2: Preliminary materials: Technical explanations of core concepts used in the scientific article of part 3. These background explanations should make it possible for a non-expert to understand the technical side of the scientific article in part 3. Rule of thumb: target your MSc student peers; ie: what background knowledge does a non-expert MSc student from your program need to understand the scientific article?
  • Part 3: Scientific article: Preferably in double column CVPR-style LaTeX format. This is written in the same style as a publication in the field.
Here are some example MSc theses done in this format (some have the background as part 3; but it's clearer to have the background as part 2 and the article as part 3): Example, Example, Example, Example.

For the writing: follow my writing guidelines.

Make sure that your thesis answers the questions that are typically asked in a thesis defense (see "Graduation questions" below).

Defense.
The formal requirements (forms, timeline, green-light moment, etc.) vary per faculty. You are responsible for managing these requirements. Please reserve a room for 2 hours to have the defense.

The procedure during the defense is approximately as follows:
  • You give a presentation of around 20 minutes (follow my presentation guidelines). The presentation time is short on purpose: we wish to assess how well you can extract the essentials of your work.
  • Some questions from the audience.
  • Detailed questions from the committee members.
  • The committee retreats and decides on a weighted grade, based on this matrix.
  • The committee motivates the grade to the candidate privately.
  • The diploma ceremony proceeds in public.
Depending on your preference, the detailed defense questions by the committee members can be done in private or in public. If you want to have this in private, then please reserve an additional room, or tell the audience beforehand that they will need to leave the room.

Graduation questions.
Some questions you may expect during your defense, and which thus should also be answered in your thesis and at your Green-Light moment, are given below. Please try to keep Hitchens's razor in mind for each motivation and claim that you make: What can be asserted without evidence can also be dismissed without evidence. Please re-read my research guidelines.
  • What problem does your improvement solve? Does the problem exist? What is your evidence for that? (ie: you need to demonstrate that the problem exists). How bad is this problem? Ie: answer the "so what?" question; what are the consequences of this problem? So, is the problem relevant? Did you demonstrate/argue what the consequences of the problem are?
  • What is "scientific" about your work? (ie: what is interesting about it for other researchers?; what have we now learned?).
  • Can you explain *why* this effect happens? And if so, what is your evidence for this claim?
  • Did you validate your baselines? A baseline is the method that has a certain problem; the problem is the main motivator for your solution. So, you need to validate that your baselines actually have the claimed problem. To do that, we need to be sure that the baselines are correctly implemented. Ie: how sure are you that the baselines that you compare to are correct (eg: well trained)? What is your evidence for this claim? (for example: a reproduction of a published result).
  • Did you validate your competitors? A competitor is a method that solves the problem of the baseline in a different way than you do. How sure are you that the competitors that you compare to are correctly implemented and well trained? What is your evidence for this claim? (for example: a reproduction of a published result).
  • How can we be sure that your explanation is correct? Are you "confusing explanation with speculation" (see: Troubling Trends in Machine Learning Scholarship)? What is your evidence for this?
  • How can we be sure that the improved accuracy comes from what you claim: "failure to identify the sources of empirical gains" (see: Troubling Trends in Machine Learning Scholarship). Note that improved accuracy is nice, but the scientific interest is the detailed, well-motivated, well-investigated, answer to "why does it improve?", and providing evidence for this.
  • When will your method fail? What assumptions does it make? In which cases will your approach not work, and in which cases will it work well? Do you have evidence for this?
  • Did you look at your results? So not just the numbers, but what do you see when you look at the output on individual samples? For some samples the results improve, and for others the results decrease, are there any explainable patterns for the improving/decreasing samples? Can you quantify this?
  • Why did you use this particular evaluation measure? It's nice that others use this, but is this evaluation measure the best choice for what you want to know? And why?
  • In your article you use the term XXX; what does this mean?
  • Your system has several modules; How well does each module perform? Are all modules needed? Can't we remove module XXX? (ie: you need to do an ablation study to answer these questions)
  • If you had to start again, with the knowledge you have now, what would you have done differently?
  • Looking back, what was the most difficult part, and how did you overcome this?
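
The question "Did you look at your results?" asks you to quantify for which samples the results improve and for which they decrease. The following is a minimal sketch of such a per-sample comparison. It assumes you have one score per sample for the baseline and for your method, stored in dictionaries keyed by a sample id, and that a higher score is better; the function name `compare_per_sample` and the `tol` parameter are hypothetical names introduced for illustration.

```python
def compare_per_sample(baseline, method, tol=0.0):
    """Split sample ids into improved / worsened / unchanged.

    baseline and method map a sample id to a score where higher is better.
    Only ids present in both are compared. Differences whose absolute
    value is at most tol count as unchanged.
    """
    shared = sorted(set(baseline) & set(method))
    improved, worsened, unchanged = [], [], []
    for sample_id in shared:
        delta = method[sample_id] - baseline[sample_id]
        if delta > tol:
            improved.append((sample_id, delta))
        elif delta < -tol:
            worsened.append((sample_id, delta))
        else:
            unchanged.append((sample_id, delta))
    # Largest changes first, so the most informative samples come first.
    improved.sort(key=lambda item: -item[1])
    worsened.sort(key=lambda item: item[1])
    return improved, worsened, unchanged
```

The three list lengths give the counts to report, and the first entries of `improved` and `worsened` are the samples worth looking at by eye first. Whether any pattern you then see is a real explanation still needs its own evidence.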
