Associate Professor Yun Sing Koh

PhD (University of Otago); Master of Software Engineering (University of Malaya); BSc (First Class Hons) Computer Science (University of Malaya)

Biography

Yun Sing Koh is an Associate Professor at the School of Computer Science, The University of Auckland, New Zealand. Her main research areas are Artificial Intelligence (AI) and Machine Learning (ML). Specifically, she focuses on several research strands: continual learning and adaptation, transfer learning, anomaly detection, and data stream mining. Yun Sing is passionate about using machine learning for social good, and her research has been applied to interdisciplinary problems in the environment and health domains. She has published 100+ peer-reviewed papers in top conferences and journals, including IJCAI, IEEE ICDE, IEEE ICDM, Machine Learning Journal and Journal of Artificial Intelligence. She was awarded New Zealand Royal Society Marsden Fast-Start funding (2018) and a United States Office of Naval Research Grant (2019). Yun Sing has been active in the research community, serving as General Co-Chair of the IEEE International Conference on Data Mining 2021, Workshop Co-Chair of the ECML/PKDD conference 2021, Program Co-Chair of the Australasian Data Mining Conference 2018, and Workshop Co-Chair of the 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining.

Research | Current

My current research interests lie in machine learning and artificial intelligence, in particular data stream mining, continual learning, transfer learning, and anomaly detection.

 

Examples of research:

Adaptive Predictive System for Life-long Learning on Data Streams.

PhD Student: Ben Halstead. Pollution from wood burners has serious health implications for residents of rural towns, even in developed countries. Monitoring the level of airborne particulate matter, PM2.5, in these areas often requires making inferences about missing or corrupted readings. Air quality inference in these cases poses two key challenges. Firstly, air quality displays non-linear spatio-temporal relationships that depend on many factors. Secondly, these factors can evolve over time, changing the distribution of the data. For example, changing wind directions can have a large impact on which neighbouring sensors are most relevant to inference. Methods that incorporate environmental factors to capture these changes, e.g. weather, traffic and points of interest, have found success in urban environments. However, many locations have access to few, if any, of these features, so inference methods must employ alternative approaches to detect and adapt to changes. We propose a data stream based system, called AirStream, to infer missing PM2.5 levels that is able to detect and adapt to changes in unknown features. Such changes in the distribution of data are known as concept drift. By treating the data set as a stream and learning incrementally, AirStream can use current concept drift detection methods to detect concept drift. We adapt to concept drift by selecting a new classifier suitable for the emerging conditions. In reality, this detection and adaptation process may be affected by noise, causing errors in these classifier transitions. We propose a repairing algorithm to identify and correct these errors. We deployed our approach on two air quality studies in New Zealand rural towns, and also tested it on a Beijing benchmark data set. We found gains in inference performance when comparing AirStream against seven baseline methods. We further investigated the relationship between the changes we detected and changes in underlying weather conditions, and discovered a strong predictive link between the state of our system and current meteorological conditions. This project is part of the Royal Society Marsden Fast-Start. Supervisors: Assoc Prof Yun Sing Koh, Dr Pat Riddle, Prof Mykola Pechenizkiy (TU/e Eindhoven), Prof Albert Bifet (Waikato).

Keywords: Air Pollution, Data Stream Mining, Continual Learning

Acknowledgement: Dr Guy Coulson and Gustavo Olivares | NIWA

 

Machine Learning for Extreme Event Detection.

PhD Student: Olivier Graffeuille. The monitoring of water quality is an important field with impacts on local ecosystems, aquaculture and human health. An efficient way of monitoring water quality is to estimate the concentration of water constituents using remote sensing data, such as satellite data. However, this task is difficult due to (1) the limited labels available to train models, (2) its ill-posed nature, whereby different combinations of water constituents can produce the same optical signal, and (3) the limited transferability of models between water bodies with different characteristics. Our research aims to develop machine learning techniques to overcome these challenges. This project is part of the MBIE Taiao Programme, https://taiao.ai/. Supervisors: Assoc Prof Yun Sing Koh, Dr Jörg Wicker, Dr Moritz K Lehmann (Xerra, Waikato).

Keywords: Water Quality, Semi-supervised learning, Transfer Learning

 

Prediction in Evolving Data Stream Using an Adaptive System

PhD Student: Ocean Wu (Current). MS Data Science: Johnson Zhou (2021). Postdoc: Thomas Lacombe (2019). Many applications deal with data streams. A data stream can be viewed as a continuous sequence of data instances, often arriving at a high rate. In data streams, the underlying data distribution may change over time, causing the predictive ability of machine learning models to decay. This phenomenon is known as concept drift. Moreover, previously seen concepts commonly recur in real-world data streams; this is known as recurrent concept drift. If a concept reappears, for example a particular weather pattern, previously learnt classifiers can be reused, improving the performance of the learning algorithm.
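The idea of detecting drift and reusing a previously learnt classifier can be illustrated with a minimal Python sketch. It is not the project's algorithm or any library's API: the class and function names (BinClassifier, run_stream, make_stream), the sliding-window error threshold used as a drift detector, and all parameter values are assumptions chosen for illustration on a noise-free synthetic stream.

```python
import random
from collections import deque


class BinClassifier:
    """Toy incremental learner (hypothetical): majority label per bin of x in [0, 1)."""

    def __init__(self, n_bins=2):
        self.n_bins = n_bins
        self.counts = [[0, 0] for _ in range(n_bins)]  # [label-0 count, label-1 count]

    def _bin(self, x):
        return min(int(x * self.n_bins), self.n_bins - 1)

    def predict(self, x):
        zeros, ones = self.counts[self._bin(x)]
        return 1 if ones > zeros else 0

    def learn(self, x, y):
        self.counts[self._bin(x)][y] += 1


def accuracy(model, instances):
    return sum(model.predict(x) == y for x, y in instances) / len(instances)


def run_stream(stream, window_size=50, drift_error=0.5, eval_size=20, reuse_accuracy=0.8):
    """Process (x, y) pairs one at a time, switching models when drift is flagged."""
    pool = [BinClassifier()]            # every classifier learnt so far
    active = 0                          # index of the classifier currently in use
    recent = deque(maxlen=window_size)  # most recent labelled instances
    errors = deque(maxlen=window_size)  # 0/1 mistakes of the active classifier
    history = []                        # active classifier index at each step
    for x, y in stream:
        model = pool[active]
        errors.append(int(model.predict(x) != y))  # test first, then train
        recent.append((x, y))
        model.learn(x, y)
        # Assumed drift signal: error rate over a full window exceeds a threshold.
        if len(errors) == window_size and sum(errors) / window_size > drift_error:
            newest = list(recent)[-eval_size:]  # instances most likely from the new concept
            scores = [accuracy(m, newest) for m in pool]
            best = max(range(len(pool)), key=scores.__getitem__)
            if best != active and scores[best] >= reuse_accuracy:
                active = best                   # recurring concept: reuse a stored classifier
            else:
                pool.append(BinClassifier())    # unseen concept: start a new classifier
                active = len(pool) - 1
            errors.clear()
        history.append(active)
    return pool, history


def make_stream(seed=0, segment=300):
    """Synthetic stream: concept A, then its inverse B, then A again."""
    rng = random.Random(seed)
    stream = []
    for concept in ("A", "B", "A"):
        for _ in range(segment):
            x = rng.random()
            y = int(x >= 0.5) if concept == "A" else int(x < 0.5)
            stream.append((x, y))
    return stream


pool, history = run_stream(make_stream())
switches = [i for i in range(1, len(history)) if history[i] != history[i - 1]]
print("classifiers learnt:", len(pool), "switch points:", switches)
```

In this sketch the second occurrence of concept A is handled by switching back to the first classifier instead of learning a third one, which is the saving that reuse under recurrent drift is meant to provide. Real streams are noisy, so practical systems rely on statistically grounded drift detectors and richer concept representations than this fixed threshold.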

Scikit-ika is an open-source implementation of methods for handling recurrent concept drifts. It continuously models evolving data streams and provides accurate predictions in real time, using probabilistic networks and meta-information to proactively predict a change in the data stream. The code developed for this project is available on GitHub and released as part of an open-source Python library, as stated in the initial proposal: https://scikit-ika.github.io/. This project is funded by ONRG Global. Supervisors: Assoc Prof Yun Sing Koh, Prof Gillian Dobbie.

 

Graph-Based Deep Learning Models for Brain Network Analysis

PhD Student: Callum Cory (Current). The research aims to develop cutting-edge graph analytics approaches for modelling and analysing brain networks. Existing research has demonstrated the capabilities of network-based methods in understanding brains; however, graph analytics for brain networks is still in its infancy. The research will develop novel graph-based deep learning models. Compared to shallow models, deep models are more effective in capturing the highly non-linear structures in networks and modelling subtle characteristics. Supervisors: Assoc Prof Yun Sing Koh, Assoc Prof Kelly Ke (NTU, Singapore), Dr Miao Qiao, Dr Diana Benavides-Prado.

 

Please get in touch about PhD, MSc, or Honours projects.

 

Teaching | Current

COMPSCI361 Machine Learning

Postgraduate supervision

Current PhD Supervision / Co-Supervision

  • Di Zhao (2022) Transfer Learning for Data Stream (co-supervised with Prof Gill Dobbie)
  • Callum Cory (2021) Graph-Based Deep Learning Models for Brain Network Analysis (co-supervised with Dr Miao Qiao, AProf Kelly Ke, Dr Diana Benavides-Prado)
  • Olivier Graffeuille (2020) Machine Learning for Extreme Event Detection (co-supervised with Dr Jörg Wicker, Dr Moritz Lehmann)
  • Wernsen Wong (2020) Transfer Deep Learning for Data Stream (co-supervised with Prof Gill Dobbie)
  • Peter Devine (2020) What do users say? Using unsupervised machine learning to robustly analyse multi-platform user feedback (primary supervisor: Dr Kelly Blincoe)
  • Aaron Keesing (2019) Emotion Recognition in Speech (co-supervised with Prof Michael Witbrock, A/Prof Ian Watson)
  • Ocean Wu (2019) Continual Learning and Adaptation for Evolving Data Streams (co-supervised with Prof Gill Dobbie)
  • Ben Halstead (2019) Adaptive Predictive System for Life-long Learning (co-supervised with Dr Pat Riddle)
  • Shuxiang Zhang (2018) Concept Drift Detection (co-supervised with Prof Gill Dobbie)

Current masters students

  • Sameer Khanal (2021) Enhanced Memory Replay Method for Continual Learning (Yun Sing Koh and Diana Benavides-Prado)
  • Bowen Chen (2021) Difficulty and diversity of learning (Yun Sing Koh, Ben Halstead (Advisor))

 

 

Distinctions/Honours

Funding and Awards

  • MBIE 2020 Catalyst Strategic NZ-Singapore Data Science Programme “Advanced Graph Analytics for Human Brain Connectivity” 2020 – 2023, Key Researcher
  • MBIE Data Science Programme “Time-Evolving Data Science / Artificial Intelligence for Advanced Open Environmental Science” 2020 – 2027, Key Researcher
  • MBIE Endeavour Research Programme “Our Generation, our Voices, all our Futures” 2020 – 2025, Key Individual
  • Office of Naval Research Grant “Prediction in Evolving Data Streams using an Adaptive System” 2019 – 2020, PI
  • Marsden Fast-Start 2018 “An Adaptive Predictive System for Life-long Learning on Data Streams” 2019 – 2021, PI
  • Precision Driven Health “A deep learning platform for GP referral triage” 2018 – 2020, Named collaborator
  • Vice-Chancellor’s Strategic Development Fund “Building resilience at the weakest link: A user-aware interactive and intelligent security system” 2018 – 2019, PIs: G. Russello, P. Corballis, Y.S. Koh, D. Lottridge
  • AUT University Vice Chancellor Emerging Researcher Award (2009), Auckland University of Technology

Areas of expertise

Machine learning, specifically in the areas of unsupervised learning, data stream mining, anomaly detection, and transfer learning.

Committees/Professional groups/Services

General Chair: IEEE ICDM 2021

Workshop and Tutorial Chair: ECML/PKDD 2021

Steering Committee Member: AusDM

Senior PC Member: AAAI

PC Member: SIGKDD, PAKDD, ECML/PKDD, IJCAI

Selected publications and creative works (Research Outputs)

  • Huggard, H., Koh, Y. S., Dobbie, G., & Zhang, E. (2020). Detecting Concept Drift in Medical Triage. SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 10.1145/3397271.3401228
    Other University of Auckland co-authors: Gill Dobbie
  • Zhang, S., Jung Huang, D. T., Dobbie, G., & Koh, Y. S. (2020). SLED: Semi-supervised locally-weighted ensemble detector. Proceedings - International Conference on Data Engineering. 10.1109/ICDE48307.2020.00183
    Other University of Auckland co-authors: Gill Dobbie
  • Zhuo, S., Sherlock, L., Dobbie, G., Koh, Y. S., Russello, G., & Lottridge, D. (2020). Real-Time Smartphone Activity Classification Using Inertial Sensors-Recognition of Scrolling, Typing, and Watching Videos While Sitting or Walking. Sensors (Basel, Switzerland), 20(3). 10.3390/s20030655
    Other University of Auckland co-authors: Gill Dobbie, Giovanni Russello, Danielle Lottridge
  • Zhao, D., & Koh, Y. S. (2020). Feature drift detection in evolving data streams. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 10.1007/978-3-030-59051-2_23
  • Benavides-Prado, D., Koh, Y. S., & Riddle, P. (2020). Towards Knowledgeable Supervised Lifelong Learning Systems. Journal of Artificial Intelligence Research, 68, 159-224.
    Other University of Auckland co-authors: Patricia Riddle
  • Wu, O., Koh, Y. S., Dobbie, G., & Lacombe, T. (2020). PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 10.1007/978-3-030-47436-2_2
    Other University of Auckland co-authors: Ocean Wu, Gill Dobbie, Thomas Lacombe
  • Anderson, R., Koh, Y. S., Dobbie, G., & Bifet, A. (2019). Recurring concept meta-learning for evolving data streams. Expert Systems with Applications, 138. 10.1016/j.eswa.2019.112832
    Other University of Auckland co-authors: Gill Dobbie
  • Fournier-Viger, P., Zhang, Y., Lin, J. C.-W., Fujita, H., & Koh, Y. S. (2019). Mining local and peak high utility itemsets. Information Sciences, 481, 344-367. 10.1016/j.ins.2018.12.070

Contact details

Primary office location

SCIENCE CENTRE 303S - Bldg 303S
Level 4, Room 479
38 PRINCES ST
AUCKLAND CENTRAL
AUCKLAND 1010
New Zealand
