Institute for Media Innovation

The Institute for Media Innovation is interested in conducting research internship projects with prospective interns for a minimum period of 5 months (Batch 1 and Batch 3 periods only).

  • Batch 1 (3rd January 2019 to 31st May 2019)
  • Batch 3 (1st August 2019 to 31st December 2019)

Professor Nadia Thalmann – Skeleton-based Action Recognition using GANs

Skeleton-based action recognition is an actively researched topic. State-of-the-art deep learning methods use relative skeletal joint positions with discriminative models such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks for recognition. Motivated by the success of generative models, this project proposes a Generative Adversarial Network (GAN) based skeleton action recognition model that uses both relative joint positions and relative motion. The main goal is a GAN-based model for skeleton action recognition that can handle occlusion and also incorporate object interactions.

Skills Required: Deep learning knowledge, Python, TensorFlow and computer vision
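As a rough illustration of the input representation mentioned above, the sketch below builds the two feature streams a GAN-based recogniser might consume: joint positions relative to a root joint, and relative motion between frames. The root-joint choice and array shapes are assumptions for the sketch, not the project's actual pipeline.

```python
import numpy as np

def skeleton_features(seq, root=0):
    """Combine relative joint positions and relative motion.

    seq: (T, J, 3) array of T frames, J joints, xyz coordinates.
    Returns (T-1, J, 6): joint positions relative to a root joint
    (e.g. the hip) concatenated with per-joint frame-to-frame
    displacement, the two cues the project description names.
    """
    rel_pos = seq - seq[:, root:root + 1, :]   # positions relative to root joint
    rel_motion = seq[1:] - seq[:-1]            # displacement between consecutive frames
    return np.concatenate([rel_pos[1:], rel_motion], axis=-1)

# toy sequence: 4 frames, 5 joints
seq = np.random.rand(4, 5, 3)
feats = skeleton_features(seq)
print(feats.shape)  # (3, 5, 6)
```

In a GAN setting, tensors like these would feed both the generator (synthesising plausible action sequences) and the discriminator (classifying real versus generated actions).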

Professor Nadia Thalmann – Natural Grasping with Humanoid Hand

This project aims to build a pair of new robotic hands for our humanoid robot, the Nadine robot. To this end, we survey state-of-the-art robotic hands and design our own using CAD software such as AutoCAD, SolidWorks or FreeCAD. The project includes the actuation system for the robotic hand: actuators, controller and software. Motion analysis software is then applied for grasping and motion planning.

Skills Required: 3D modelling with CAD software, motor control with a microcontroller, Matlab and 3D printing.
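To give a flavour of the actuation-software side, the sketch below maps a normalized grasp aperture to a hobby-servo pulse width. The pulse range and the linear mapping are placeholder assumptions; a real tendon-driven finger would need per-joint calibration against the CAD model.

```python
def aperture_to_pulse(aperture, pulse_open=1000, pulse_closed=2000):
    """Map a grasp aperture (1.0 = fully open, 0.0 = fully closed)
    to a servo pulse width in microseconds. Linear interpolation
    between assumed open/closed pulse widths; values are
    illustrative, not calibrated for any specific actuator."""
    aperture = min(max(aperture, 0.0), 1.0)    # clamp to valid range
    return pulse_closed + (pulse_open - pulse_closed) * aperture

print(aperture_to_pulse(0.5))  # 1500.0
```

On the microcontroller side, the resulting pulse width would typically be sent to the servo via a PWM output.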


Professor Nadia Thalmann – Realistic Immersion with Virtual Humans

This project aims to address the challenge of developing novel techniques to improve communication between the real and virtual worlds, using perception-attention-action concepts for virtual humans.

So far, we have developed a comprehensive multi-party virtual reality platform for a Volleyball game that captures real users by integrating data from a head-mounted device and depth cameras. Furthermore, we plan to develop technology that allows virtual humans to be aware of their social context and selectively pay attention to relevant input. To that end, we seek to develop natural, realistic reactions of virtual humans to real users in the virtual world. The specific tasks include developing the action responses of virtual humans. For example, in the virtual Volleyball game, if a real user waves or points a finger at the virtual players, they respond with appropriate actions.

Skills Required: C++, Unity, Machine learning.
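A minimal sketch of the perception-attention-action idea described above: a perceived gesture only triggers a response when an attention score marks it as socially salient. All names (gestures, responses, the threshold) are illustrative, not the platform's API.

```python
# hypothetical gesture → response table for the virtual Volleyball scene
RESPONSES = {
    "wave": "wave_back",
    "point": "look_at_user",
    "serve": "ready_stance",
}

def react(gesture, attention_score, threshold=0.5):
    """Return the virtual human's action for a perceived gesture,
    gated by a social-attention score so that only salient events
    get a reaction. Unrecognized gestures fall back to idling."""
    if attention_score < threshold:
        return "idle"                      # not attended to: no reaction
    return RESPONSES.get(gesture, "idle")  # attended: look up a response

print(react("wave", 0.9))  # wave_back
```

In the actual platform the responses would of course be animations driven from Unity rather than strings.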


Professor Nadia Thalmann – Autonomous Humanoid Robot Arm Motion with Visual Recognition Camera

The research involves autonomous motion of a humanoid robot arm, implementing inverse kinematics in the control system and interaction with a visual detection camera. By integrating the control system and vision camera, the existing humanoid arm can recognize a target object and perform real-time, precise tracking motion. The ultimate goal of this project is a complete system combining object detection with natural grasping by the humanoid arm.

Skills Required: C/C++, Arduino, experience in robotics and visual object recognition
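To illustrate the inverse-kinematics component in its simplest form, the sketch below solves a planar two-link arm in closed form. The link lengths are placeholders, not the real arm's dimensions, and a full humanoid arm would need a multi-DOF solver rather than this textbook case.

```python
import math

def two_link_ik(x, y, l1=0.3, l2=0.25):
    """Closed-form inverse kinematics for a planar 2-link arm.
    Returns (shoulder, elbow) joint angles in radians, or None if
    the target (x, y) is out of reach. Link lengths l1, l2 are
    illustrative placeholders."""
    r2 = x * x + y * y
    c2 = (r2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)  # law of cosines
    if abs(c2) > 1.0:
        return None                                # target unreachable
    t2 = math.acos(c2)                             # elbow angle
    t1 = math.atan2(y, x) - math.atan2(l2 * math.sin(t2),
                                       l1 + l2 * math.cos(t2))
    return t1, t2
```

In the tracking loop, the camera's detected object position would be fed to a solver like this at each frame, and the resulting joint angles sent to the arm's controller.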

Professor Nadia Thalmann – TweetBot: Automated Text Summarization

Digital media holds a large amount of opinion, but due to the character limit on Twitter, users tend to use short forms and abbreviations. This project aims at developing text generation algorithms for Twitter (within 280 characters). The candidate will be responsible for developing deep learning frameworks for text generation and embedding them in an automated system.

Skills Required: Python programming language, Deep Learning (Tensorflow, Keras), Twitter APIs, Natural Language Understanding (NLU) and Natural Language Generation (NLG)
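As a small sketch of the 280-character constraint mentioned above, the helper below trims generated text at a word boundary. In practice the length budget would be enforced at decoding time inside the generation model; this is only a minimal post-processing fallback, with illustrative names.

```python
def fit_tweet(text, limit=280, ellipsis="…"):
    """Trim generated text to Twitter's character limit, cutting at
    the last word boundary before the limit and appending an
    ellipsis. Returns the text unchanged if it already fits."""
    if len(text) <= limit:
        return text
    budget = limit - len(ellipsis)
    cut = text.rfind(" ", 0, budget)   # last space within budget
    if cut == -1:
        cut = budget                   # no space found: hard cut
    return text[:cut] + ellipsis
```

A deep-learning TweetBot would call something like this as a safety net after the generator produces its candidate summary.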


Professor Nadia Thalmann – Nadine: Social Robotics Platform

Nadine’s platform is implemented as a classic Perception-Decision-Action architecture. The perception layer is composed of a Microsoft Kinect V2 and a microphone, and includes face recognition, gesture recognition and some understanding of social situations. For decision, our platform includes emotion and memory models as well as social attention. Finally, the action layer consists of a dedicated robot controller which includes emotional expression, lip synchronization and online gaze generation. Controlling and monitoring this many modules requires a tool that can quickly identify any issue. The aim of this project is to develop a module that binds these functions and represents a scalable social robotics platform. The outcomes of this project would be the following:

  1. Understanding of integration of deep learning, robotics and computer graphics platforms which constitute our social robot Nadine
  2. Research on Human-robot interaction
  3. Generalized architecture of a social robotics platform (publication expected)


Skills Required: Good knowledge of data structures, algorithms, operating systems, computer networks, C++, Python expected. Experience in robotics, machine learning or computer graphics is a bonus.
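To make the "module that binds these functions" concrete, here is a minimal sketch of a Perception-Decision-Action loop that wires independent modules behind one interface. The class and module names are illustrative, not the Nadine platform's actual code.

```python
class SocialRobotPlatform:
    """Minimal sketch of a Perception-Decision-Action loop binding
    pluggable modules, in the spirit of the scalable architecture
    this project asks for. All wiring below is hypothetical."""

    def __init__(self, perceivers, decide, actuators):
        self.perceivers = perceivers   # e.g. face / gesture recognizers
        self.decide = decide           # e.g. emotion, memory, attention
        self.actuators = actuators     # e.g. gaze, lip sync, expression

    def step(self, raw_input):
        # perception: every module sees the same raw input
        percepts = {name: p(raw_input) for name, p in self.perceivers.items()}
        # decision: one action chosen from the fused percepts
        action = self.decide(percepts)
        # action: every actuator executes the chosen action
        return [act(action) for act in self.actuators]

# toy wiring: greet when a wave is perceived
robot = SocialRobotPlatform(
    perceivers={"gesture": lambda x: x.get("gesture", "none")},
    decide=lambda p: "greet" if p["gesture"] == "wave" else "idle",
    actuators=[lambda a: f"controller:{a}"],
)
print(robot.step({"gesture": "wave"}))  # ['controller:greet']
```

Because modules are plain callables behind a fixed interface, new recognizers or actuators can be added without touching the loop, which is the scalability property the project targets.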

