Daksitha Withanage Don M.Sc.
| Phone: | +49 821 598-2305 |
| Email: | daksitha.withanage.don@uni-a.de |
| Room: | 2044 (N) |
| Address: | Universitätsstraße 6a, 86159 Augsburg |
Research Interests
- Affective Computing
- Artificial Emotional Intelligence
- Socially Interactive Agents
- Generative AI
- Self-supervised Learning
Bachelor/Master Thesis or Project Module
Thesis Guidelines for Prospective Students
If you’re interested in writing a Bachelor’s or Master’s thesis with me, please follow these guidelines:
How to Apply
1. Looking for a Topic
- Explore the Open Topics section below.
- If no topics match your interests, you can propose your own by emailing me:
  - A motivational statement explaining why the topic fits your interests.
  - A timeframe for the thesis (planned start and end dates).
- Note: Supervision depends on my capacity and topic relevance.
2. Have Your Own Topic or External Proposal
- If proposing your own or an external company topic:
  - Include how it aligns with my research.
  - For company topics, attach the original description and any specific requirements (e.g., NDAs).
- If suitable, I will guide you through the next steps.
Next Steps
- If Accepted:
  - We will formalize your topic and discuss project goals.
  - Ensure you meet university-specific requirements (e.g., registration, defense talks).
- If Declined:
  - You may revise your topic or explore alternative supervisors.
Evaluation Criteria
Your thesis will be graded on:
- Literature review.
- Scientific approach and methodology.
- Clear structure and comprehensive documentation.
- Novelty and significance (especially for Master's students).
- Quality of implementation or study design.
Feel free to contact me for further clarification or to apply. Looking forward to working on exciting projects together!
Open Topics
Short description
This thesis focuses on training a lightweight model that generates upper-body gestures from speech input. The model can use speech audio, voice activity, and optional transcript information to predict 3D motion for a virtual character.
Research focus
The main goal is to build a reproducible baseline pipeline for speech-driven gesture generation.
Possible tasks
- Review speech-driven gesture generation literature
- Preprocess audio and 3D motion data
- Train a simple temporal model such as an LSTM, GRU, CNN, or Transformer (see the sketch after this list)
- Generate upper-body gesture sequences
- Evaluate motion smoothness, diversity, and speech alignment (a metric sketch follows the expected outcome below)
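A minimal sketch of what the temporal-model task could look like in PyTorch, assuming speech is preprocessed into per-frame features (e.g. 40-dim MFCCs) and motion into per-frame upper-body pose parameters (63 values, e.g. 21 joints in axis-angle form); all names and dimensions here are illustrative assumptions, not requirements of the topic:

```python
import torch
import torch.nn as nn

class SpeechToGestureGRU(nn.Module):
    """Maps per-frame audio features to per-frame 3D pose parameters."""

    def __init__(self, audio_dim=40, hidden_dim=256, pose_dim=63):
        super().__init__()
        self.encoder = nn.GRU(audio_dim, hidden_dim, num_layers=2,
                              batch_first=True, bidirectional=True)
        self.decoder = nn.Linear(2 * hidden_dim, pose_dim)

    def forward(self, audio_feats):
        # audio_feats: (batch, frames, audio_dim), aligned to the motion frame rate
        hidden, _ = self.encoder(audio_feats)
        return self.decoder(hidden)  # (batch, frames, pose_dim)

# Toy forward/backward pass with random data standing in for real features.
model = SpeechToGestureGRU()
audio = torch.randn(2, 100, 40)    # 2 clips, 100 frames of audio features
target = torch.randn(2, 100, 63)   # placeholder ground-truth poses
loss = nn.functional.mse_loss(model(audio), target)
loss.backward()
```

A bidirectional GRU is used here only because it is a simple, strong baseline; any of the listed architectures could be dropped in instead.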
Related work areas
- Co-speech gesture generation
- Speech-to-motion learning
- Motion representation and SMPL-H
- Automatic evaluation of generated gestures
Expected outcome
A working baseline model and a documented training pipeline for generating 3D gestures from speech.
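For the smoothness part of the evaluation task above, one common proxy is average jerk (the third derivative of joint position); a minimal sketch, assuming generated motion is available as a (frames, joints, 3) position array at a fixed frame rate:

```python
import numpy as np

def average_jerk(positions, fps=30):
    """Mean jerk magnitude over a (frames, joints, 3) position array."""
    dt = 1.0 / fps
    jerk = np.diff(positions, n=3, axis=0) / dt**3  # third finite difference
    return np.linalg.norm(jerk, axis=-1).mean()

# Toy example: a random walk stands in for generated joint positions.
motion = np.cumsum(np.random.randn(120, 21, 3) * 0.01, axis=0)
print(f"average jerk: {average_jerk(motion):.2f} m/s^3")
```

Lower values indicate smoother motion; diversity and speech alignment need separate measures (e.g. comparing distributions of generated motion, or beat-alignment scores).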
Related Links to Read
- GENEA Challenge 2022 paper: https://arxiv.org/abs/2208.10441
- GENEA Challenge 2023 paper: https://arxiv.org/abs/2308.12646
- GENEA Challenge 2022 project page: https://youngwoo-yoon.github.io/GENEAchallenge2022/
- Speech-driven gesture generation with motion matching: https://arxiv.org/abs/2305.11094
- Seamless Interaction dataset: https://github.com/facebookresearch/seamless_interaction
Short description
This thesis develops a tool for visually inspecting generated gestures together with speech audio, transcript timing, voice activity, and character animation.
Research focus
Support researchers in debugging and comparing generated gestures across different virtual characters.
Possible tasks
- Review visual analytics and gesture-evaluation literature
- Design a simple inspection interface
- Visualise audio, transcript, VAD, and motion timelines (see the sketch after this list)
- Show generated gestures on one or more avatars
- Add simple rating or comparison features
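To make the timeline view concrete, here is a minimal matplotlib sketch with synthetic stand-ins for the real inputs; the assumed formats (mono waveform array, VAD as (start, end) second pairs, a per-frame motion-activity curve) are illustrative only:

```python
import numpy as np
import matplotlib.pyplot as plt

sr, fps, dur = 16000, 30, 4.0
audio = np.random.randn(int(sr * dur)) * 0.1               # placeholder waveform
vad_segments = [(0.4, 1.6), (2.1, 3.5)]                    # placeholder VAD spans
motion_activity = np.abs(np.random.randn(int(fps * dur)))  # placeholder motion curve

fig, (ax_audio, ax_motion) = plt.subplots(2, 1, sharex=True, figsize=(8, 4))
ax_audio.plot(np.arange(len(audio)) / sr, audio, linewidth=0.5)
for start, end in vad_segments:                            # shade voiced regions
    ax_audio.axvspan(start, end, alpha=0.3, color="orange")
ax_audio.set_ylabel("audio / VAD")
ax_motion.plot(np.arange(len(motion_activity)) / fps, motion_activity)
ax_motion.set_ylabel("motion activity")
ax_motion.set_xlabel("time (s)")
plt.tight_layout()
plt.show()
```

A real tool would add transcript tokens as annotated spans and synchronised 3D playback on the avatar, but even a static aligned view like this helps spot mistimed gestures.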
Related work areas
- Gesture-generation evaluation
- Visual analytics for motion data
- Speech–gesture alignment
- Human-centered AI tools
Related Links to Read
- Evaluation of gesture generation in large-scale studies: https://arxiv.org/abs/2303.08737
- Seamless Interaction dataset: https://github.com/facebookresearch/seamless_interaction
- MetaHuman documentation: https://dev.epicgames.com/documentation/en-us/metahuman/
- Blender animation documentation: https://docs.blender.org/manual/en/latest/animation/index.html
Expected outcome
A lightweight visual inspection tool for analysing generated gestures and avatar animations.
Short description
This thesis focuses on training a multimodal gesture-generation model for realistic virtual characters such as MetaHumans. The model can use speech audio, transcripts, word-level timestamps, and voice activity to generate upper-body gestures.
Research focus
Investigate how different input modalities improve gesture quality and speech alignment.
Possible tasks
- Review multimodal speech-driven gesture-generation literature
- Preprocess audio, transcript, VAD, and 3D motion data
- Train and compare several model variants (see the fusion sketch after this list)
- Retarget generated gestures to a MetaHuman or similar character
- Evaluate naturalness, smoothness, and speech alignment
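One hedged way to set up the variant comparison: each modality gets its own projection, per-frame encodings are fused by concatenation, and a shared Transformer backbone predicts poses, so ablations fall out naturally by dropping or zeroing inputs. The dimensions and the fusion-by-concatenation choice below are assumptions, not a prescribed architecture:

```python
import torch
import torch.nn as nn

class MultimodalGestureModel(nn.Module):
    """Fuses audio, transcript, and VAD streams to predict pose parameters."""

    def __init__(self, audio_dim=40, text_dim=300, vad_dim=1,
                 d_model=256, pose_dim=63):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, d_model)
        self.text_proj = nn.Linear(text_dim, d_model)
        self.vad_proj = nn.Linear(vad_dim, d_model)
        self.fuse = nn.Linear(3 * d_model, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, pose_dim)

    def forward(self, audio, text, vad):
        # all inputs: (batch, frames, *_dim), aligned to the motion frame rate
        x = torch.cat([self.audio_proj(audio),
                       self.text_proj(text),
                       self.vad_proj(vad)], dim=-1)
        x = self.temporal(self.fuse(x))
        return self.head(x)  # (batch, frames, pose_dim)

model = MultimodalGestureModel()
poses = model(torch.randn(2, 90, 40),   # audio features
              torch.randn(2, 90, 300),  # per-frame word embeddings
              torch.randn(2, 90, 1))    # voice-activity flag
```

Swapping the fusion layer or the backbone (e.g. for a diffusion model) changes only this module, which keeps the modality comparison clean.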
Related work areas
- Multimodal gesture generation
- Speech-driven animation
- Transformer and diffusion models for motion generation
- MetaHumans and embodied conversational agents
Related Links to Read
- MeLaX: Conversation with generative AI: https://dl.acm.org/doi/10.1145/3708557.3716363
- Seamless Interaction project page: https://ai.meta.com/research/seamless-interaction/
- Speech-driven gesture generation with motion matching: https://arxiv.org/abs/2305.11094
- MetaHuman documentation: https://dev.epicgames.com/documentation/en-us/metahuman/
Expected outcome
A complete pipeline from multimodal input to generated and animated gestures on a realistic virtual character.
Short description
This thesis investigates gesture generation in dyadic interaction. Instead of using only the speaker’s own speech, the model also considers the interlocutor’s speech, voice activity, or motion.
Research focus
Study whether interlocutor context improves the timing, naturalness, and social appropriateness of generated gestures.
Possible tasks
- Review dyadic gesture generation and social signal processing literature
- Prepare speaker and interlocutor input features
- Train speaker-only and dyadic-context models (see the sketch after this list)
- Compare generated gestures across different input settings
- Evaluate motion quality and social appropriateness
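A minimal sketch of how the speaker-only and dyadic-context variants could share one architecture, with interlocutor context injected via cross-attention; the feature layout (e.g. audio features plus a VAD flag per frame) and the cross-attention choice are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DyadicGestureModel(nn.Module):
    """Gesture model that can optionally attend to the interlocutor."""

    def __init__(self, feat_dim=41, d_model=256, pose_dim=63,
                 use_interlocutor=True):
        super().__init__()
        self.use_interlocutor = use_interlocutor
        self.speaker_proj = nn.Linear(feat_dim, d_model)
        self.partner_proj = nn.Linear(feat_dim, d_model)
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads=4,
                                                batch_first=True)
        self.temporal = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, pose_dim)

    def forward(self, speaker_feats, partner_feats):
        # both inputs: (batch, frames, feat_dim), frame-aligned
        x = self.speaker_proj(speaker_feats)
        if self.use_interlocutor:               # dyadic-context variant
            kv = self.partner_proj(partner_feats)
            ctx, _ = self.cross_attn(x, kv, kv)
            x = x + ctx                         # residual fusion of partner context
        hidden, _ = self.temporal(x)
        return self.head(hidden)                # (batch, frames, pose_dim)

# Toy comparison of the two input settings on random features.
speaker = torch.randn(2, 90, 41)
partner = torch.randn(2, 90, 41)
dyadic_out = DyadicGestureModel(use_interlocutor=True)(speaker, partner)
baseline_out = DyadicGestureModel(use_interlocutor=False)(speaker, partner)
```

Keeping both variants in one class makes the speaker-only baseline an exact ablation rather than a separate implementation.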
Related work areas
- Dyadic interaction modelling
- Listener-aware behaviour generation
- Turn-taking and backchannel behaviour
- Socially aware virtual agents
Related Links to Read
- Seamless Interaction project page: https://ai.meta.com/research/seamless-interaction/
- Seamless Interaction paper: https://arxiv.org/html/2506.22554v1
- Evaluation of gesture generation in large-scale studies: https://arxiv.org/abs/2303.08737
Expected outcome
A gesture-generation model that considers conversational context and supports more socially responsive virtual characters.
Supervised Theses
- Automated ICEP-R Annotation of Infant-Caregiver Interactions Using V-Jepa Self-Supervised Learning (2024, Ahmed)
- Augmenting Social Interactive Agents: Integrating Long-Term Memory in Large Language Models (2024, Lama)
- Interactive Agent Realism: Mediapipe 3D Blendshapes for Low Resource-Intensive Listening Behavior Modeling (2024, Sarah)