Roy Saurabh is the CTO at UNESCO MGIEP, with 13+ years of experience in data science, specifically in the context of capacity building within diverse learner communities. Before MGIEP, Roy headed the National Skill Mission, leveraging ML to influence the learning paths of 25 million+ learners spread globally.
28 February 2019 | 15:00 - 15:30 | English | AIED & Education Datasets: What is at Stake?
The increasing use of technology in education, from online learning platforms and the digital content they host to machine learning and natural language processing (NLP), has not only offered scalable solutions to longstanding problems in the classroom but also raised urgent new questions about how to manage the interactions between new technologies and initiatives in education. Artificial intelligence offers the possibility of using data to provide crucial insight into student learning behavior: AI could analyze a student’s learning and offer solutions and interventions suited to the student’s strengths, while also accounting for their weaknesses and without compromising the student’s privacy. In this way, the elusive dream of individualized learning may finally become a reality. It is important, however, to acknowledge the critical role data plays in ensuring that AI systems work as intended: if the legacy data used to train an AI system is biased or otherwise morally tainted, the AI will perpetuate that bias. In light of these opportunities and challenges, it is imperative to identify and account for the various ways in which data is collected and used. Data can be treated as a private or a public good; the final decision will depend on how society chooses to manage this pool of data. A “commons” approach is suggested, whereby the data is managed by the collective and its use is regulated by an institution (rules and norms) owned and managed by a multi-stakeholder community. Lessons from the literature on managing natural-resource commons offer some insights into how we can design such a “commons” for the data that is generated by students and used by AI to improve learning while protecting the learner.