Interviewer 0:00:00
Good day. I'm Arjun. I've had a look through your background and your professional record. I'm interested in hearing more about your work and how you see yourself fitting in here with our faculty. Let's begin.
Mr. Subhayu Ghosh 0:04:27
I am Dr. Subhayu Ghosh. I obtained my PhD from the Department of Computer Science and Engineering at the National Institute of Technology, Durgapur. The title of my thesis was Generative AI based Audiovisual Speech Synthesis and Some Advanced Approaches. I worked under Dr. Jana, associate professor in the Department of CSE, and I defended my thesis on 23rd March 2026. Before my PhD, I obtained my B.Tech from NIT Durgapur. My research interests cover domains such as audio and speech processing, biomedical image processing, multimodal AI, generative AI, and deep learning. I currently have 30 publications: 11 journal papers and 19 international conference papers. Now, after defending my PhD thesis, I am looking for a faculty position for my career. I applied to VIT and have received this call from VIT, so I am interested in working here.
Interviewer 0:04:31
Professor, based on your experience with both undergraduate and graduate students, which teaching method have you found most effective for introducing complex concepts in multimedia or AI to beginners?
Mr. Subhayu Ghosh 0:05:45
After completing my B.Tech with a GPA of 8 and being GATE-qualified, I was eligible for a direct PhD, and I joined the PhD program at NIT Durgapur, the same institute. After that, I worked on generative AI based audiovisual speech synthesis. It involves audio signal processing as well as computer vision applications, and because both modalities are used, it is multimodal AI. My application domain is the audiovisual domain, but what I have actually developed are deep learning architectures and learning strategies for generative AI based work. I have also applied my understanding of deep learning architectures to biomedical image processing. I have several publications in high-impact-factor journals such as IEEE Transactions on Multimedia and Expert Systems with Applications, as well as in good tier-one conferences like ICAPS, Accords command, IGCNL, etc. So right now I have 30 publications: 11 journal papers and 19 conference papers.
Interviewer 0:05:46
Thank you for outlining your academic journey, Professor. In your experience, when teaching a foundational topic like deep learning to students new to the subject, do you prefer to begin with theoretical frameworks or hands-on coding exercises?
Mr. Subhayu Ghosh 0:06:10
Yes, I am comfortable with theoretical development as well as hands-on work.
Interviewer 0:06:13
You've published extensively on audio-visual speech synthesis and multimodal AI. Could you describe how you translate the advanced methods from your IEEE Transactions on Multimedia paper into material that is accessible for undergraduate classroom teaching?
Interviewer 0:06:16
Understood. When you incorporate hands-on exercises for students, what tools or platforms do you typically use to help them engage with generative AI or deep learning concepts?
Mr. Subhayu Ghosh 0:09:08
Actually, my teaching philosophy begins with self-learning first and, after that, teaching the students. I also prefer to use real-life examples while delivering my topics to my students. So when I explain generative AI based work, I will start with an example. Suppose I take the example of my IEEE Transactions on Multimedia paper. In that paper, we proposed a new learning strategy, a multi-discriminative learning framework. What is this? I can explain it this way. Suppose there are three students in my classroom. One is strong in mathematics, another is strong in chemistry, and another is strong in physics. I want to train one new student, who will take the mathematics part from the student who is strong in mathematics, the physics part from the student who is strong in physics, and the chemistry part from the student who is strong in chemistry. So I am taking the strength of every individual student and training the new student. The philosophy is the same in our multimedia work. There I trained one new discriminator by incorporating three discriminators: one capsule network based, one vision transformer based, and one convolutional neural network based. The three neural network architectures have different strengths. I utilized the individual strength of every discriminator and prepared a new discriminator from the efficiency and strength of all the individual discriminators. So this is the philosophy of my multimedia work. When I present my topics to my students, I will definitely give this type of real-life example. As for tools and hands-on experience, I will show some small prototypes. By prototypes I mean generated samples, which I have: how I obtain the results, and what the samples look like, whether original or generated.
This is our main purpose: we want to synthesize generated output that looks like the original. So I will show the synthesized output to my students, along with the philosophy. In this way the students will get the main motivation as well as the application.
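The three-expert analogy in the answer above can be illustrated with a small toy sketch. It is only an illustration under stated assumptions: the three "expert" discriminators are hypothetical one-feature scorers, their scores are combined by a plain average, and the new discriminator is a simple logistic model fitted to that average. None of this is taken from the paper, whose actual architectures and combination rule are not described in the interview.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Three hypothetical "expert" discriminators (assumption, not the paper's models).
# Each one judges a 3-dimensional sample using a different feature and
# returns a realness score in (0, 1).
def expert_a(x):
    return sigmoid(4.0 * x[0])


def expert_b(x):
    return sigmoid(4.0 * x[1])


def expert_c(x):
    return sigmoid(4.0 * x[2])


EXPERTS = [expert_a, expert_b, expert_c]


def teacher_score(x):
    """Assumed combination rule: plain average of the three expert scores."""
    return sum(expert(x) for expert in EXPERTS) / len(EXPERTS)


def student_score(params, x):
    """Logistic 'student' discriminator with weights w and bias b."""
    w, b = params
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_student(samples, epochs=100, lr=0.5):
    """Fit the student to the averaged expert scores with plain SGD."""
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for x in samples:
            # Gradient of cross-entropy with a soft target, w.r.t. the logit.
            err = student_score((w, b), x) - teacher_score(x)
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


random.seed(0)
data = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(200)]
student = train_student(data)
```

After training, the single student scorer responds to all three features, even though each expert looked at only one of them, which is the point of the classroom analogy.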
Interviewer 0:09:26
Let's move on to your research trajectory and its potential institutional impact. Please outline your three-year research roadmap—specifically, how you plan to secure external funding or collaborate across disciplines, as VIT places a strong emphasis on both.
Mr. Subhayu Ghosh 0:11:58
Yes, sure. What I have done so far is basically the application of generative AI in the audiovisual speech synthesis domain, so multimodal AI is also associated with it. My primary research interests lie in multimodal AI, with a strong emphasis on audiovisual speech processing and on generative deep learning architectures and generative models. My doctoral research focused on this topic, and the publications are there. I have mainly used advanced deep learning paradigms such as GAN, autoencoder, and transformer architectures. In my future research plan at VIT, my research will focus on developing robust, data-efficient, and interpretable multimodal learning frameworks for audio, visual, and biomedical signals. The key directions will include improving the speech synthesis framework under low-resource and noisy conditions. These research directions are well aligned with undergraduate and postgraduate student projects, which will enable effective student participation in applied research. In the long term, I aim to expand my research towards interdisciplinary AI applications, particularly in healthcare and assistive technologies. I plan to collaborate closely with faculty members and to mentor students to produce high-quality research output in reputed journals and conferences. An emphasis will be placed on hands-on experimentation and reproducible research outcomes. At VIT, I intend to actively involve my B.Tech and M.Tech students in research through projects, internships, and supervision. I also plan to pursue external research funding from agencies like SERB, DST, MeitY, etc., while contributing to institute-level research initiatives. My long-term objective is to establish a student-centric, application-oriented research program and a new lab that will complement my teaching responsibilities,
as well as contributing to the academic growth of the students and the societal impact of the institute.
Interviewer 0:12:00
Thank you for that detailed roadmap. If you were to scale a student-led research lab at VIT to accommodate twice as many students and projects, what specific changes would you implement to maintain both research quality and fiscal sustainability?
Mr. Subhayu Ghosh 0:14:14
Yes. If I am to scale a research lab at VIT for many projects, I would actually not prefer too many changes. Using the existing infrastructure, I will start by involving the B.Tech and M.Tech students, for their higher education as well as for research. From the B.Tech level itself, I will try to engage the students in a research-centric environment, using the existing infrastructure. The student projects are already there. In the current scenario, what do students do? They get a project and show some output or submit a report. My research trajectory will focus on having the students analyze the problem from the basics. They will understand the problem, they will work out what to do, and after that they will implement it. Once they are involved in the project, they will produce high-quality output and reproducible work aimed at journal- and conference-type publications. This applies to both B.Tech and M.Tech students, for their projects as well as their internships. As for funding, if I get external funding at the SERB, DST, or MeitY level, I will contribute it to the new lab: some new infrastructure, purchasing new GPUs and other equipment, so that the students can access them. The students will then generate high-quality output using the existing data, and by developing themselves they also support the development and growth of the institute. The publishable work will improve the students' profiles as well as the institute's profile, and it will also have an impact on society. So this is my philosophy on involving students, and this is how I want to involve them.
Interviewer 0:14:17
Appreciate your insights. Now, Professor, imagine you are teaching a large, introductory AI course without relying on traditional slide-based lectures. Please describe in detail how you would implement an active learning strategy—such as a flipped classroom or peer instruction model—to maximize student engagement and understanding.
Mr. Subhayu Ghosh 0:16:14
Sure. I strongly believe that effective teaching in computer science and engineering should integrate sound theoretical foundations with practical, hands-on learning. So my teaching philosophy emphasizes three things: firstly, the interactive classroom; secondly, laboratory-oriented instruction; and thirdly, project-based assessment. These enable students to develop both conceptual understanding and problem-solving skills for real-world applications. During my doctoral tenure, I served as a TA for several UG and PG courses. Right now, I am fully willing and prepared to teach any course assigned by the institute, but my interest is in AI and machine learning related courses, and also in game theory, image processing, and introductory signal processing. I am comfortable with these. My teaching approach will encourage active student participation, continuous evaluation, and project-based learning, with an emphasis on clarity, fairness, and inclusivity. I also aim to mentor students in technical writing, research methodology, and the ethical use of AI. Through my teaching, I aspire to create a motivating, inclusive, and academically rigorous learning environment, aligned with the educational objectives of VIT.
Interviewer 0:16:16
Thank you, Professor. As VIT encourages mentoring and guidance, what specific strategies would you employ to inspire and academically support students pursuing individual research or scholarly projects under your supervision?
Mr. Subhayu Ghosh 0:17:44
Yes. My teaching approach will encourage active student participation through continuous evaluation and project-based learning. I will give students some small-scale projects. They will analyze a real-life problem, they will understand what to do, and then they will implement it. There will be an emphasis on clarity, fairness, and inclusivity, as well as the ethical use of AI. I also aim to mentor students in technical writing, research methodology, etc., and I aspire to create a motivating, inclusive, academically rigorous learning environment. While mentoring students, my long-term objective is to establish a student-centric research program, mainly an application-oriented, student-centric research program, that will complement the teaching responsibilities with a research perspective. Emphasis will be placed on hands-on experimentation, reproducible research practices, and open-source development, which are essential for fostering a strong and sustainable research culture at VIT. So this is my goal. This is my target.
Interviewer 0:17:47
Let's discuss your approach to departmental stewardship and accreditation. Suppose VIT is seeking to improve its research per faculty metric as part of an upcoming accreditation review. What systematic steps would you take within your department to raise this metric while ensuring research quality is not compromised?
Mr. Subhayu Ghosh 0:19:45
To enrich the department, my work will focus on more reproducible work. What happens in many cases? Students do the project, they submit the report, they get the marks, and they are finished. My target is to convert their work, whether small-scale or large-scale, into publication-based outcomes. If it is small-scale work done over two or three months, we can go for a conference presentation. If they have a good contribution in their work, we can go for journal-level publications. We will target top-tier journals and top-tier conferences with the work from the students. We will also target some patents. There are many routes, for design patents as well as utility patents. The number of patents per institute or per department will enhance the department's profile, so I will target some patents, as well as externally funded projects, such as sponsored research projects or consultancy projects. Many industries are also seeking consultancy with academics. Mainly they go to the IITs or top NITs, so we can take consultancy projects from good industries and involve our students in those projects. In that way, if we can show consultancy projects, sponsored projects, good patents, and journal and conference publications, the growth of the department will obviously be achieved.
Interviewer 0:19:47
That’s clear, Professor. In your experience, how have you ensured faculty buy-in and participation when introducing new documentation or research reporting processes tied to accreditation?
Mr. Subhayu Ghosh 0:21:52
Actually, involvement from all faculty members is essential when we are talking about the growth of the department. So what can we do? Group projects. Collaboration with other faculty members in our department will also be essential. Suppose I have five students whom I am mentoring, and I want to get outcomes from them. There is also another faculty member who has three or four students. I will tie up and collaborate with that faculty member, and the students together can be divided into three or four groups. Suppose that faculty member is good in IoT and I am good in deep learning. We can collaborate in this way: the students will build some IoT prototypes with his expertise, and they will implement some deep learning architectures with insights from me. That way, two professors collaborate. Or, in another case, I am proficient in speech processing, that is, audio signal processing, and some professors are proficient at the basic device level, the hardware level. If they are implementing a device, I can use my audio data in their device. So if we collaborate among ourselves and engage our students in groups, we can definitely produce outcomes for publishable research work, and we can go for patents if the contribution of the work is very high. We can also jointly seek external projects from the Government of India, as well as some consultancy work. So in this way, collaborative work can definitely enrich the department, according to my philosophy.
Interviewer 0:21:55
Let's discuss a scenario involving both academic integrity and institutional expectations. Suppose a student alleges grading bias in your course, but the administration is pressuring faculty to maintain a certain pass percentage. How would you handle this situation to ensure fairness and uphold both ethical standards and institutional goals?
Mr. Subhayu Ghosh 0:23:50
Definitely. First of all, my goal is to engage every student in the academic subjects. When I am teaching the students a particular subject, I will involve them in the daily classes and take feedback from them on where I have to improve myself. I will also give them daily assignments and home tasks, check them, and monitor their daily performance. If, after two or three months, I find that a particular student is not doing well, I will handle this specially. I will talk with him and clear his doubts so that he can perform well. Also, if possible, I will prepare different sets of assignments for the strong students and for the weaker students, and I will evaluate them. If the scenario that you mentioned does happen, I will first approach the student and tell him that he has to do better in the next examination. I will also maintain the existing criteria that the institute has given to me, and I will inspire the students to do better in the daily classroom as well as after the exam.
Interviewer 0:23:52
Thank you for outlining your steps. In a situation like this, how would you document and communicate your grading decisions to both the student and administration to ensure transparency and avoid future disputes?
Mr. Subhayu Ghosh 0:25:19
First of all, I believe in continuous assessment. So every week or every 15 days, I will conduct one internal assessment for every class, and I will show the marks to the students, so that they can see how they are performing and know before the final examination that they have to improve. After the examination, if a student has not improved, I will talk to him personally, show him the total marks he has obtained, and motivate him to work hard and do better in the next semester or next examination. In this respect, I will not force or request the administration or the institute to relax the grading criteria, because I will maintain what the institute has given to me. I will talk to the students, and I will of course maintain transparency, so that the student can understand his fault or deficiency, and the institute can also learn from me the particular deficiency of the student.
Interviewer 0:25:22
Professor, considering your professional network and industry connections, what concrete steps would you take to create internship or employment pathways for VIT students in the multimedia or AI sectors?
Mr. Subhayu Ghosh 0:26:59
Definitely. I have a wide professional network, in India as well as abroad. There are professors in many top IITs and NITs whom I can approach personally. First of all, when I take the class or assess the students, I will definitely find out which students are interested in future research or higher education. For those who are good at research, who are interested, and whose families allow them to do external research, I will communicate with the professors at the IITs and also with the professors abroad, so that the students can do an internship at their institute, for a short term or for six months or a similar time frame. My industry connections are also there. During the placement process, I can communicate with my industry partners or industry friends and give a reference for a particular student, so that they can hire him because he is proficient in a particular area. So both in academia, through the IIT professors and the connections abroad that I have, I will help my students do an internship or secure higher studies at their institutes, and in industry, wherever applicable, I will refer my students to my industry friends so that they can hire them.
Interviewer 0:27:02
Let's discuss your experience with student evaluation and exam responsibilities. Can you describe a time when you had to design or implement an assessment strategy that balanced fairness, rigor, and timely feedback?
Mr. Subhayu Ghosh 0:28:52
Yes, sure. While I was a teaching assistant during my PhD tenure, most recently I was taking the deep learning laboratory. At that time, the assessment I gave them was project based. There were 20 master's students, and with four students in a group, I divided them into five groups. I gave one real-life problem to each group, so there were five problems in total: three problems were generative AI based, and two were image segmentation problems. Every group got one problem and a 15-day timeline. Their task was to prepare the dataset through preprocessing and other steps, then implement one deep learning model, then carry out some baseline comparisons and prepare a report. So the students would understand where the problem comes from, what the problem is, what the limitations of their approach are, how the model can be built, and how the evaluation of the model can be done. I prepared all the projects myself, so that they were all real-life problems and the students would get interested in them. So I have this type of experience. I believe that when I join VIT, I will definitely contribute through this type of mini project and the other assessment criteria that I want to develop in their ecosystem.
Interviewer 0:28:54
Thanks for sharing that example. How do you ensure your communication style and teaching approach remain structured and clear, especially when explaining complex AI or multimedia concepts to students with varying backgrounds?
Mr. Subhayu Ghosh 0:30:30
Sure. I will learn it myself first, because I may be an expert, but that does not mean I know everything. So before every class I have to prepare myself, and while preparing I will start from zero, as if I am a student who does not know AI and does not know anything. I prepare myself from a layman's perspective, and after that I will have understood everything. When I teach my students or explain anything to them, I will assume that the students do not know anything. I will start from zero, go from zero to the basic level, then from basic to intermediate, and from intermediate to the advanced level. I will go step by step. So the students who have no background in AI can understand, and the students who are proficient in AI can also understand the advanced material. I will also try to set different sets of assignments. For students who have expertise in AI as well as those who do not, I will fix the quality or level of the assignments in such a way that all the students can work at the level of their choice and all the students can get something from my class. This is my goal.
Interviewer 0:30:31
Understood. Can you outline your doctoral research area and explain how your specialization aligns with the requirements for this position at VIT?
Mr. Subhayu Ghosh 0:32:31
Sure. My primary research interest has been in multimodal and generative AI. In my doctoral phase, I worked on generative AI based audiovisual speech synthesis. That includes audio and speech processing, ideas from computer vision, multimodal AI, generative AI, and deep learning architectures and models. So what did we do in my doctoral research? We developed new deep learning architectures, new learning strategies, and hyperparameter optimization strategies for building deep learning models efficiently. The application domain was audiovisual speech processing, so the work covers audiovisual processing as well. First of all, we take the baseline models: from the literature survey we identified the existing research, compared our model with the existing models, and showed that we can improve the quality and the naturalness of the synthesis. That is mainly the doctoral part, the generative AI based audiovisual synthesis. However, apart from my doctoral research and outside the topic of my thesis, I have also done extra work on biomedical image processing, and some on protein structure design, etc. These are all application domains of deep learning, so I have a lot of expertise in deep learning, including the recent, updated deep learning models that are being used, and I have worked with audio, visual, and biomedical signals. My key direction includes improving the speech synthesis framework while building on the existing research. This is my work.
Interviewer 0:32:32
Thank you for the overview. Can you share an example of a research paper you published in a reputed journal, describing the core contribution and its impact?
Mr. Subhayu Ghosh 0:34:36
In one journal paper, published in Applied Intelligence, I developed a vision transformer based model. The vision transformer is a fairly new, innovative architecture that is mainly used in the NLP, natural language processing, domain. We incorporated the vision transformer architecture into the autoencoder part of our audiovisual synthesis work. In audiovisual speech synthesis, there are basically two models. One is the voice conversion model, which converts one person's voice to another person's. The second is the audiovisual synthesis model, which synthesizes the video of the second person according to the converted voice. I incorporated the vision transformer within the autoencoder framework so that the synthesis quality can be better. I also used two different types of loss function: one is the cycle consistency loss, which is used for identity preservation of the source speaker, and the other is the reconstruction loss, which improves the quality of the synthesized output. An ensemble of loss functions was used, combining the cycle consistency and reconstruction losses, for better identity preservation without compromising the synthesis quality. So the contributions were: first, an autoencoder model for the voice conversion part; second, an autoencoder model for the video synthesis part; third, incorporating the vision transformer inside the autoencoder, which is why we called the model ViTAE; fourthly, the ensemble of loss functions for better identity preservation and synthesis quality; and fifthly, we experimentally showed that our model surpasses all the existing literature in terms of naturalness and quality, from both the audio and the video perspective.
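The idea of combining a cycle consistency loss with a reconstruction loss, as described in the answer above, can be sketched on toy data. This is a minimal illustration under stated assumptions: the L1 distance, the weights `lambda_cyc` and `lambda_rec`, and the scalar "converter" and "autoencoder" functions are all hypothetical stand-ins, not the loss definitions or models from the published paper.

```python
def l1(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b)) / len(a)


def cycle_consistency_loss(x, forward, backward):
    """Map source -> target -> source and measure drift from the input.

    A small value means the round trip preserved the source identity.
    """
    return l1(backward(forward(x)), x)


def reconstruction_loss(x, encode, decode):
    """Encode and decode a sample and measure how well it is restored."""
    return l1(decode(encode(x)), x)


def ensemble_loss(x, forward, backward, encode, decode,
                  lambda_cyc=10.0, lambda_rec=1.0):
    """Weighted sum of both terms; the weights are illustrative assumptions."""
    return (lambda_cyc * cycle_consistency_loss(x, forward, backward)
            + lambda_rec * reconstruction_loss(x, encode, decode))


# Toy stand-ins for the learned mappings (hypothetical, for illustration only).
def to_target(x):
    return [2.0 * v for v in x]


def to_source_lossy(y):
    return [0.4 * v for v in y]   # imperfect inverse of to_target


def encode(x):
    return [0.5 * v for v in x]


def decode_lossy(z):
    return [1.5 * v for v in z]   # imperfect inverse of encode


x = [1.0, 2.0, 3.0, 4.0]
total = ensemble_loss(x, to_target, to_source_lossy, encode, decode_lossy)
```

In a real training loop both terms would be computed on network outputs and minimized jointly; raising `lambda_cyc` pushes the optimizer towards identity preservation, while raising `lambda_rec` favors output fidelity.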
Interviewer 0:34:37
That's quite innovative, Professor. Can you briefly describe any industry projects or consultancy work you have been involved with, and how those experiences might benefit your teaching and research at VIT?