Show simple item record
AI Powered Commentary and Camera Direction in E-Sports
| dc.contributor.author | Narayanan, Swathi Jamjala | |
| dc.contributor.author | Joseph, Kevin Winston | |
| dc.contributor.author | Sirohi, Devansh | |
| dc.contributor.author | Chaudhary, Harsh | |
| dc.contributor.author | Shivkumar, Hitesh | |
| dc.date | 2026-03-26 | |
| dc.date.accessioned | 2026-03-09T16:21:30Z | |
| dc.date.available | 2026-03-09T16:21:30Z | |
| dc.identifier.citation | S. J. Narayanan, K. W. Joseph, D. Sirohi, H. Chaudhary, H. Shivkumar. AI Powered Commentary and Camera Direction in E-Sports, International Journal of Interactive Multimedia and Artificial Intelligence, vol. 9, no. 6, pp. 116-125, 2026, http://doi.org/10.9781/ijimai.2026.6566 | es_ES |

AI Powered Commentary and Camera Direction in E-Sports
Swathi Jamjala Narayanan*, Kevin Winston Joseph, Devansh Sirohi, Harsh Chaudhary, Hitesh Shivkumar
School of Computer Science and Engineering, Vellore Institute of Technology, Vellore (India)
* Corresponding author: jnswathi@vit.ac.in
Received 6 May 2024 | Accepted 15 July 2025 | Published 19 February 2026
DOI: 10.9781/ijimai.2026.6566

I. Introduction

As elucidated in [1], digital prowess and virtual battles have ushered in a new era of competitive entertainment, one that has undergone a seismic shift in recent years. The traditional competitive gaming landscape has been upended by the rise of e-sports, or electronic sports, which have captured the attention of millions of people worldwide. In contrast to traditional sports, which are played on fields or courts, e-sports harness the power of digital platforms: participants compete fiercely across a wide range of video and computer games and genres, including League of Legends and Counter-Strike: Global Offensive. Once regarded as a niche activity, e-sports has developed into a multi-billion-dollar industry that challenges the dominance of traditional sports and captivates audiences with its unique combination of skill, strategy, and spectacle. Its popularity transcends generational and cultural divides, attracting a broad base of fans, players, and spectators: millions tune in to watch fierce bouts unfold in virtual arenas, even if older generations may find the appeal of virtual battles hard to grasp. The economic impact is likewise hard to overstate, with e-sports revenues rivaling, and in some measures surpassing, those of traditional entertainment industries such as music and film. Digging deeper into the dynamics of this emerging industry makes clear that e-sports is a cultural phenomenon that is here to stay and is drastically reshaping the competitive entertainment landscape.

The real-time generation of engaging commentary and intelligent camera control present significant technical challenges in e-sports broadcasting. Commentary requires understanding complex game states and their strategic implications, and generating natural language in real time. Camera control, meanwhile, demands rapid identification of important game events across multiple locations. These traditionally human-operated tasks are increasingly difficult to scale as the e-sports industry grows. Recent advances in computer vision and large language models offer promising solutions: vision models can track game state and identify key moments, while generative AI can produce contextual commentary. Combined, these technologies could potentially match or exceed human capabilities in capturing the dynamic nature of e-sports competitions.
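The scene-change detection the paper relies on for re-synchronizing commentary can be illustrated with simple frame differencing: flag a cut when consecutive grayscale frames differ, on average, by more than a threshold. This is a minimal stand-in sketch, not the paper's actual algorithm; the function name and the threshold value are assumptions.

```python
import numpy as np

def detect_scene_change(prev_frame: np.ndarray, curr_frame: np.ndarray,
                        threshold: float = 30.0) -> bool:
    """Flag a scene change when the mean absolute pixel difference
    between consecutive grayscale frames exceeds a threshold."""
    diff = np.abs(curr_frame.astype(np.float32) - prev_frame.astype(np.float32))
    return bool(diff.mean() > threshold)

# Illustrative frames: a static scene versus an abrupt cut.
static = np.full((72, 128), 100, dtype=np.uint8)   # unchanged view
cut = np.full((72, 128), 200, dtype=np.uint8)      # sudden brightness jump

print(detect_scene_change(static, static))  # False: no change
print(detect_scene_change(static, cut))     # True: abrupt cut detected
```

A production detector would typically smooth over several frames or use histogram comparison to avoid false positives from in-game effects, but the thresholded-difference idea is the same.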
| dc.identifier.uri | https://reunir.unir.net/handle/123456789/19151 | |
| dc.description.abstract | Real-time, AI-driven commentary and camera direction provide revolutionary possibilities to improve spectator engagement and comprehension of live events in the rapidly advancing world of e-sports. This paper proposes an autonomous system designed to both generate dynamic commentary as well as control the spectator camera for live-streamed e-sports matches, specifically focusing on League of Legends (LoL), a popular Multiplayer Online Battle Arena (MOBA) game. It incorporates the use of GPT-4o with Vision and OpenAI’s TTS API. Synchronization of commentary with real-time camera movements is one of the major challenges tackled. This is done using a camera tracking and scene change detection algorithm that effectively adjusts the commentary to changing scenes in real-time by utilizing computer vision techniques. Further, two neural architectures for AI-driven camera control: a 2D Convolutional-LSTM (Conv-LSTM) model that concentrates on independent spatial and temporal analysis, and a 3D CNN model that combines these features to forecast camera movements in a more comprehensive way are presented. Evaluations on fluency, relevance, and strategic depth metrics, show that our integrated system improves viewer experience by providing deep and coherent narratives that are contextually aligned with the game dynamics. The proposed models are evaluated quantitatively in capturing spectator camera movement patterns. | es_ES |
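The abstract contrasts two routes for consuming a video clip: per-frame spatial features followed by temporal aggregation (the 2D Conv-LSTM) versus a single joint spatio-temporal convolution (the 3D CNN). The numpy sketch below shows only the tensor-shape difference between the two paths; the fixed averaging kernels and the running mean standing in for an LSTM are illustrative assumptions, not the paper's trained layers.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Toy clip: T grayscale frames of size H x W.
T, H, W = 8, 16, 16
clip = np.random.default_rng(0).random((T, H, W)).astype(np.float32)
k = 3  # kernel size

# --- 2D Conv-LSTM style: spatial filtering per frame, then a temporal pass ---
kernel2d = np.ones((k, k), np.float32) / (k * k)
per_frame = np.stack([
    (sliding_window_view(f, (k, k)) * kernel2d).sum(axis=(-1, -2))
    for f in clip
])                                  # (T, H-2, W-2): spatial features, frame by frame
# An LSTM would consume these T feature maps sequentially; a running mean
# stands in for that temporal aggregation here.
temporal = per_frame.mean(axis=0)   # (H-2, W-2)

# --- 3D CNN style: one kernel spans time and space jointly ---
kernel3d = np.ones((k, k, k), np.float32) / (k ** 3)
joint = (sliding_window_view(clip, (k, k, k)) * kernel3d).sum(axis=(-1, -2, -3))
# (T-2, H-2, W-2): spatio-temporal features in a single operation

print(per_frame.shape, temporal.shape, joint.shape)
```

The shapes make the architectural trade-off concrete: the 2D path keeps space and time factored (cheaper, but temporal context enters only through the recurrent stage), while the 3D path couples them from the first layer at the cost of a larger kernel.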
| dc.language.iso | eng | es_ES |
| dc.publisher | UNIR | es_ES |
| dc.relation.uri | https://www.ijimai.org/index.php/ijimai/article/view/6566 | es_ES |
| dc.rights | openAccess | es_ES |
| dc.subject | AI-driven Commentary | es_ES |
| dc.subject | Camera Control | es_ES |
| dc.subject | Computer Vision | es_ES |
| dc.subject | E-sports Analytics | es_ES |
| dc.subject | Neural Architectures | es_ES |
| dc.title | AI Powered Commentary and Camera Direction in E-Sports | es_ES |
| dc.type | article | es_ES |
| reunir.tag | ~IJIMAI | es_ES |
| dc.identifier.doi | https://doi.org/10.9781/ijimai.2026.6566 | |
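One way to read the synchronization challenge highlighted in the abstract: commentary generated for a scene the camera has already left should never be spoken. The hypothetical buffer below illustrates that policy in plain Python; the class name and its behavior are assumptions for illustration, not the paper's design.

```python
from collections import deque

class CommentaryBuffer:
    """Queue of pending commentary lines, each tagged with the scene it
    describes. On a scene change, stale lines are dropped so speech
    never lags behind what the camera is showing (illustrative logic)."""

    def __init__(self):
        self._lines = deque()
        self._scene = 0

    def push(self, text: str) -> None:
        # Tag each line with the scene that was live when it was generated.
        self._lines.append((self._scene, text))

    def on_scene_change(self) -> None:
        # Advance the scene counter and discard commentary for earlier scenes.
        self._scene += 1
        self._lines = deque((s, t) for s, t in self._lines if s == self._scene)

    def next_line(self):
        # Return the oldest still-relevant line, or None when the queue is empty.
        return self._lines.popleft()[1] if self._lines else None

buf = CommentaryBuffer()
buf.push("Blue team groups mid...")
buf.on_scene_change()               # camera cuts away; the stale line is dropped
buf.push("Baron fight breaks out!")
print(buf.next_line())              # prints "Baron fight breaks out!"
```

In a live system the drop decision would likely be softer, e.g. letting a nearly finished TTS utterance complete, but the core invariant, that spoken text must match the current scene, is what the camera-tracking and scene-change components enforce.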





