Deep Learning Publications

One-Vs-Rest Neural Network English Grapheme Segmentation: A Linguistic Perspective.

    Grapheme-to-Phoneme (G2P) correspondences form foundational frameworks of tasks such as text-to-speech (TTS) synthesis or automatic speech recognition. The G2P process involves taking words in their written form and generating their pronunciation. In this paper, we critique the status quo definition of a grapheme, currently a forced alignment process relating a single character to either a phoneme or a blank unit, that underlies the majority of modern approaches. We develop a linguistically motivated redefinition from simple concepts such as vowel and consonant count and word length and offer a proof-of-concept implementation based on a multi-binary neural classification task. Our model achieves competitive results with a 31.86% Word Error Rate on a standard benchmark, while generating linguistically meaningful grapheme segmentations.
  • 2024 Proceedings of the 28th Conference on Computational Natural Language Learning (CoNLL), Miami, USA.
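
To make the Word Error Rate figure in the abstract above easier to interpret, the following minimal sketch shows how a word-level error rate is commonly computed for G2P output: a word counts as an error if its predicted phoneme sequence differs from the reference anywhere. The function name and the toy phoneme data are illustrative assumptions, not the paper's evaluation code or benchmark.

```python
def word_error_rate(predictions, references):
    """Percentage of words whose predicted phoneme sequence is not an exact match.

    Both arguments are lists of phoneme sequences, aligned word by word.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one word is required")
    errors = sum(1 for p, r in zip(predictions, references) if list(p) != list(r))
    return 100.0 * errors / len(references)


# Hypothetical toy data: four words, one of which has a wrong vowel.
refs = [["K", "AE", "T"], ["D", "AO", "G"], ["F", "IH", "SH"], ["B", "ER", "D"]]
preds = [["K", "AE", "T"], ["D", "AA", "G"], ["F", "IH", "SH"], ["B", "ER", "D"]]
wer = word_error_rate(preds, refs)
```

Under this definition a single wrong phoneme makes the whole word count as an error, which is why word-level rates are typically much higher than phoneme-level ones.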

Understanding Slang with LLMs: Modelling Cross-Cultural Nuances through Paraphrasing.

    In the realm of social media discourse, the integration of slang enriches communication, reflecting the sociocultural identities of users. This study investigates the capability of large language models (LLMs) to paraphrase slang within climate-related tweets from Nigeria and the UK, with a focus on identifying emotional nuances. Using DistilRoBERTa as the baseline model, we observe its limited comprehension of slang. To improve cross-cultural understanding, we gauge the effectiveness of leading LLMs: ChatGPT 4, Gemini, and LLaMA3 in slang paraphrasing. While ChatGPT 4 and Gemini demonstrate comparable effectiveness in slang paraphrasing, LLaMA3 shows less coverage, with all LLMs exhibiting limitations in coverage, especially of Nigerian slang. Our findings underscore the necessity for culturally-sensitive LLM development in emotion classification, particularly in non-anglocentric regions.
  • Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), Miami, USA.

Using Large Language Models to Recommend Repair Actions for Offshore Wind Maintenance.

    The Offshore Wind (OSW) industry is experiencing significant expansion, resulting in increased Operations & Maintenance (O&M) costs. Intelligent alarm systems offer the prospect of swift detection of component failures and process anomalies, enabling timely and precise interventions that could yield reductions in resource expenditure, as well as scheduled and unscheduled downtime. This paper introduces an innovative approach to tackle this challenge by capitalising on Large Language Models (LLMs). We present a specialised conversational agent that incorporates statistical techniques to calculate distances between sentences for the detection and filtering of hallucinations and unsafe output. This potentially enables improved interpretation of alarm sequences and the generation of safer repair action recommendations by the agent. Preliminary findings are presented with the approach applied to ChatGPT-4 generated test sentences. The limitation of using ChatGPT-4 and the potential for enhancement of this agent through re-training with specialised OSW datasets are discussed.
  • 2024 J. Phys.: Conf. Ser. 2875 012025.

SafeLLM: Domain-Specific Safety Monitoring for Large Language Models: A Case Study for Offshore Wind Maintenance.

    The Offshore Wind (OSW) industry is experiencing significant expansion, resulting in increased Operations & Maintenance (O&M) costs. Intelligent alarm systems offer the prospect of swift detection of component failures and process anomalies, enabling timely and precise interventions that could yield reductions in resource expenditure, as well as scheduled and unscheduled downtime. This paper introduces an innovative approach to tackle this challenge by capitalising on Large Language Models (LLMs). We present a specialised conversational agent that incorporates statistical techniques to calculate distances between sentences for the detection and filtering of hallucinations and unsafe output. This potentially enables improved interpretation of alarm sequences and the generation of safer repair action recommendations by the agent. Preliminary findings are presented with the approach applied to ChatGPT-4 generated test sentences. The limitation of using ChatGPT-4 and the potential for enhancement of this agent through re-training with specialised OSW datasets are discussed.
  • 2024 Preprint.


Safety Monitoring for Large Language Models: A Case Study of Offshore Wind Maintenance.

    It has been forecasted that a quarter of the world’s energy usage will be supplied from Offshore Wind (OSW) by 2050 (Smith 2023). Given that up to one third of Levelised Cost of Energy (LCOE) arises from Operations and Maintenance (O&M), the motive for cost reduction is enormous. In typical OSW farms hundreds of alarms occur within a single day, making manual O&M planning without automated systems costly and difficult. Increased pressure to ensure safety and high reliability in progressively harsher environments motivates the exploration of Artificial Intelligence (AI) and Machine Learning (ML) systems as aids to the task. We recently introduced a specialised conversational agent trained to interpret alarm sequences from Supervisory Control and Data Acquisition (SCADA) and recommend comprehensible repair actions (Walker et al. 2023). Building on recent advancements in Large Language Models (LLMs), we expand on this earlier work, fine-tuning LLAMA (Touvron 2018) using available maintenance records from EDF Energy. An issue presented by LLMs is the risk of responses containing unsafe actions, or irrelevant hallucinated procedures. This paper proposes a novel framework for safety monitoring of OSW, combining previous work with additional safety layers. Generated responses of this agent are filtered to prevent raw responses endangering personnel and the environment. The algorithm represents such responses in embedding space to quantify dissimilarity to pre-defined unsafe concepts using the Empirical Cumulative Distribution Function (ECDF). A second layer identifies hallucination in responses by exploiting probability distributions to analyse against stochastically generated sentences. Combining these layers, the approach fine-tunes individual safety thresholds based on categorised concepts, providing a unique safety filter. The proposed framework has the potential to utilise the O&M planning for OSW farms using state-of-the-art LLMs as well as equipping them with safety monitoring that can increase technology acceptance within the industry.
  • 2024 Proc. of the Safety Critical Systems Symposium SSS'24, Bristol, UK
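
The first safety layer described in the abstract above (embedding-space dissimilarity to unsafe concepts, scored with an ECDF) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the use of cosine distance, the 2-D toy vectors standing in for sentence embeddings, the set of known-safe reference sentences, the 0.05 threshold, and the function names. The paper's actual embedding model, distance measure and thresholds are not given in this listing.

```python
import math
from bisect import bisect_right


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def ecdf(sample, x):
    """Empirical CDF: fraction of sample values less than or equal to x."""
    ordered = sorted(sample)
    return bisect_right(ordered, x) / len(ordered)


def is_unsafe(response_vec, unsafe_vec, safe_reference_vecs, threshold=0.05):
    """Flag a response that sits unusually close to an unsafe concept.

    The ECDF is built from distances between known-safe sentences and the
    unsafe concept; a response whose distance falls in the lowest tail of
    that distribution is treated as unsafe.
    """
    reference = [cosine_distance(v, unsafe_vec) for v in safe_reference_vecs]
    distance = cosine_distance(response_vec, unsafe_vec)
    return ecdf(reference, distance) < threshold


# Hypothetical 2-D "embeddings"; real sentence embeddings are much larger.
unsafe_concept = [1.0, 0.0]
safe_references = [[0.0, 1.0], [0.1, 1.0], [0.2, 1.0], [-0.1, 1.0]]

flag_close = is_unsafe([1.0, 0.1], unsafe_concept, safe_references)  # near the unsafe concept
flag_far = is_unsafe([0.05, 1.0], unsafe_concept, safe_references)   # typical of safe text
```

Using a per-concept threshold on the ECDF value, rather than on the raw distance, is one way to read the abstract's mention of individual thresholds for categorised concepts.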

User Engagement Triggers in Social Media Discourse on Biodiversity Conservation.

    Studies in digital conservation have increasingly used social media in recent years as a source of data to understand the interactions between humans and nature, model and monitor biodiversity, and analyse online discourse about the conservation of species. Current approaches to digital conservation are for the most part purely frequentist, i.e. focused on easily trackable and quantifiable features, or purely qualitative, which allows a deeper level of interpretation, but is less scalable. Our approach aims to evaluate the applicability of recent advances in deep learning in combination with semi-automatic analysis. We present a multimodal neural learning framework that experiments with different combinations of linguistic and visual features and metadata of tweets to predict user engagement from a function of likes and retweets. Experimental results show that text is the single most effective modality for prediction when a large amount of training data is available. For smaller datasets, drawing information from multiple modalities can boost performance. Notably, we find a negative effect of large pre-trained language models when dealing with substantially unbalanced datasets. A qualitative analysis into the triggers of user engagement with tweets reveals that it emerges from a combination of online discourse topic and sentiment, and is often amplified by user activity, e.g. when content originates from an influencer account. We find clear evidence of existing sub-communities around specific topics, including animal photography and sightings, illegal wildlife trade and trophy hunting, deforestation and destruction of nature and climate change and action in a broader sense.
  • 2024 ACM Transactions on Social Computing

Redefining Digital Twins – A Wind Energy Operations and Maintenance Perspective.

    Digital Twin (DT) technology has seen an explosion in popularity, with wind energy no exception. This is particularly true for Operations & Maintenance (O&M) applications. However, this expanded use has been accompanied by loose, conflicting definitions that threaten to reduce the term to a buzzword and prevent the technology from meeting its full potential. A number of attempts have been made to better define and classify DTs; however, these either oversimplify the term or tighten criteria, leading to the exclusion of many DT applications. A new definition framework dubbed the Digital Twin Family Tree is therefore proposed. This widens "Digital Twin" to a general umbrella term for the technology, accompanied by specific definitions. DT Tags are also used to provide individualised characteristics for implementations. A sector-specific definition was devised for component and system monitoring and predictions in wind energy O&M dubbed a CS-DT and suitable DT Tags created. The proposed framework was used to review existing research in literature, demonstrating the potential for increased understanding, explainability, and accessibility of DTs for expert and non-expert stakeholders.
  • 2024 J. Phys.: Conf. Ser. 2767 052001

BDA at SemEval-2024 Task 4: Detection of Persuasion in Memes Across Languages with Ensemble Learning and External Knowledge.

    This paper outlines our multimodal ensemble learning system for identifying persuasion techniques in memes. We contribute an approach which utilises the novel inclusion of consistent named visual entities extracted using the Google Vision API as an external knowledge source, joined to our multimodal ensemble via late fusion. As well as detailing our experiments in ensemble combinations, fusion methods and data augmentation, we explore the impact of including external data and summarise post-evaluation improvements to our architecture based on analysis of the task results.
  • 2024 SEMEVAL 2024 Shared Task on "Multilingual Detection of Persuasion Techniques in Memes", at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Mexico City, Mexico

Towards Interactive Anomaly Detection using Natural Language.

    When training models for visual anomaly detection, typically, a dataset is collected and then annotated offline. Even if collecting raw data is relatively cheap, annotations are expensive, especially if they require human expertise. We therefore propose a novel interactive learning framework that combines active learning with natural language interaction to minimise the amount of annotated training data and allow for refined human expert feedback that may be leveraged in the learning process. In our initial experiments on wind turbine drone images, we demonstrate the effectiveness of active learning for anomaly detection when using ground truth labels, and assess the impact on learning when collecting labels from ‘experts’ versus ‘non-experts’ using our dialogue system. In addition to anomaly labels with confidence scores, we collect and analyse natural language explanations, which may be used to improve both anomaly detection performance and explainability.
  • 2024 The 14th International Workshop on Spoken Dialogue Systems Technology, Sapporo, Japan

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task.

    Traditional predictive simulations and remote sensing techniques for forecasting floods are based on fixed and spatially restricted physics-based models. These models are computationally expensive and can take many hours to run, resulting in predictions made based on outdated data. They are also spatially fixed, and unable to scale to unknown areas. By modelling the task as an image segmentation problem, an alternative approach using artificial intelligence to approximate the parameters of a physics-based model in 2D is demonstrated, enabling rapid predictions to be made in real-time.
  • 2024 Northern Lights Deep Learning Conference, Tromsø, Norway

Linguistic Pattern Analysis in the Climate Change-Related Tweets from UK and Nigeria.

    To understand the global trends of human opinion on climate change in specific geographical areas, this research proposes a framework to analyse linguistic features and cultural differences in climate-related tweets. Our study combines transformer networks with linguistic feature analysis to address small dataset limitations and gain insights into cultural differences in tweets from the UK and Nigeria. Our study found that Nigerians use more leadership language and informal words in discussing climate change on Twitter compared to the UK, as these topics are treated as an issue of salience and urgency. In contrast, the UK’s discourse about climate change on Twitter is characterised by more formal, logical, and longer words per sentence compared to Nigeria. We also confirm the geographical identifiability of tweets through a classification task using DistilBERT, which achieves 83% accuracy.
  • 2023 Proceedings of the CLASP Conference on Learning with Small Data (LSD), Gothenburg, Sweden

Intelligent digital twin -- machine learning system for real-time wind turbine wind speed and power generation forecasting.

    Wind power is a key pillar in efforts to decarbonise energy production. However, variability in wind speed and resultant wind turbine power generation poses a challenge for power grid integration. Digital Twin (DT) technology provides intelligent service systems, combining real-time monitoring, predictive capabilities and communication technologies. Current DT research for wind turbine power generation has focused on providing wind speed and power generation predictions reliant on Supervisory Control and Data Acquisition (SCADA) sensors, with predictions often limited to the timeframe of datasets. This research looks to expand on this, utilising a novel framework for an intelligent DT system powered by k-Nearest Neighbour (kNN) regression models to upscale live wind speed forecasts to higher wind turbine hub-height and then forecast power generation. As there is no live link to a wind turbine, the framework is referred to as a “Simulated Digital Twin” (SimTwin). 2019-2020 SCADA and wind speed data are used to evaluate this, demonstrating that the method provides suitable predictions. Furthermore, full deployment of the SimTwin framework is demonstrated using live wind speed forecasts. This may prove useful for operators by reducing reliance on SCADA systems and provides a research and development tool where live data is limited.
  • 2023 The 6th International Conference on Renewable Energy and Environment Engineering (REEE 2023)
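
The kNN regression step mentioned in the abstract above can be sketched in a few lines. This is a generic one-feature k-nearest-neighbour regressor written from scratch, assuming a single input (forecast wind speed at a lower height) and a single target (hub-height wind speed). The function name, the value of `k` and the toy numbers are hypothetical and are not taken from the paper or its data.

```python
def knn_regress(train_x, train_y, query, k=3):
    """Predict a target as the mean of the k training targets nearest to query."""
    if len(train_x) != len(train_y):
        raise ValueError("train_x and train_y must have the same length")
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Indices of the k training inputs closest to the query value.
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    return sum(train_y[i] for i in nearest) / k


# Hypothetical paired observations in m/s: low-level forecast vs hub-height speed.
low_level_speed = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
hub_height_speed = [4.0, 5.5, 6.5, 8.0, 9.0, 10.5]

# Upscale a new low-level forecast of 5.2 m/s to hub height.
estimate = knn_regress(low_level_speed, hub_height_speed, query=5.2, k=3)
```

A second regressor of the same kind, trained on hub-height speed against measured power, would give the power forecasting stage the abstract describes.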

Real-time social media sentiment analysis for rapid impact assessment of floods.

    Traditional approaches to flood modelling mostly rely on hydrodynamic physical simulations. While these simulations can be accurate, they are computationally expensive and prohibitively so when thinking about real-time prediction based on dynamic environmental conditions. Alternatively, social media platforms such as Twitter are often used by people to communicate during a flooding event, but discovering which tweets hold useful information is the key challenge in extracting information from posts in real time. In this article, we present a novel model for flood forecasting and monitoring that makes use of a transformer network that assesses the severity of a flooding situation based on sentiment analysis of the multimodal inputs (text and images). We also present an experimental comparison of a range of state-of-the-art deep learning methods for image processing and natural language processing. Finally, we demonstrate that information induced from tweets can be used effectively to visualise fine-grained geographical flood-related information dynamically and in real-time.
  • 2023 Computers & Geosciences

Domain-invariant icing detection on wind turbine rotor blades with generative artificial intelligence for deep transfer learning.

    Wind energy’s ability to liberate the world from conventional sources of energy relies on lowering the significant costs associated with the maintenance of wind turbines. Since icing events on turbine rotor blades are a leading cause of operational failures, identifying icing in advance is critical. Some recent studies have utilized deep learning (DL) techniques to predict icing events with high accuracy by leveraging rotor blade images, but these studies only focus on specific wind parks and fail to generalize to unseen scenarios (e.g., new rotor blade designs). In this paper, we aim to facilitate ice prediction in the face of a lack of ice images in new wind parks. We propose the utilization of synthetic data augmentation via a generative artificial intelligence technique, the neural style transfer algorithm, to improve the generalization of existing ice prediction models. We also compare the proposed technique with the CycleGAN as a baseline. We show that training standalone DL models with augmented data that captures domain-invariant icing characteristics can help improve predictive performance across multiple wind parks. Through efficient identification of icing, this study can support preventive maintenance of wind energy sources by making them more reliable toward tackling climate change.
  • 2023 Environmental Data Science, Cambridge University Press

Multi-channel Convolutional Neural Network for Precise Meme Classification.

    This paper proposes a multi-channel convolutional neural network (MC-CNN) for classifying memes and non-memes. Our architecture is trained and validated on a challenging dataset that includes non-meme formats with textual attributes, which are also circulated online but rarely accounted for in meme classification tasks. Alongside a transfer learning base, two additional channels capture low-level and fundamental features of memes that make them unique from other images with text. We contribute an approach which outperforms previous meme classifiers specifically in live data evaluation, and one that is better able to generalise ‘in the wild’. Our research aims to improve accurate collation of meme content to support continued research in meme content analysis, and meme-related sub-tasks such as harmful content detection.
  • 2023 Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR). Thessaloniki, Greece.

This new conversational AI model can be your friend, philosopher, and guide ... and even your worst enemy.

    We explore the recently released ChatGPT model, one of the most powerful conversational AI models that has ever been developed. This opinion provides a perspective on its strengths and weaknesses and a call to action for the AI community (including academic researchers and industry) to work together on preventing potential misuse of such powerful AI models in our everyday lives.
  • 2023 Patterns Volume 4, Issue 1, Opinion Article

Generalized Ice Detection on Wind Turbine Rotor Blades with Neural Style Transfer.

    Wind energy’s ability to liberate the world from conventional sources of energy relies on lowering the significant costs associated with the maintenance of wind turbines. Since icing events on turbine rotor blades are a leading cause of operational failures, identifying icing in advance is critical. Some recent studies focus on specific wind parks and fail to generalize to unseen scenarios (e.g. new rotor blade designs). We propose the utilisation of synthetic data augmentation via neural style transfer to improve the generalization of existing ice prediction models. We show that training models with augmented data that captures domain-invariant icing characteristics can help improve predictive performance across multiple wind parks. Through efficient identification of icing, this study can support preventive maintenance of wind energy sources by making them more reliable towards tackling climate change.
  • 2022 Climate Change AI Workshop, NeurIPS, New Orleans, USA

A Deep Learning Framework for Wind Turbine Repair Action Prediction Using Alarm Sequences and Long Short Term Memory Algorithms.

    With an increasing emphasis on driving down the costs of Operations and Maintenance (O&M) in the Offshore Wind (OSW) sector comes the requirement to explore new methodology and applications of Deep Learning (DL) to the domain. Condition-based monitoring (CBM) has been at the forefront of recent research developing alarm-based systems and data-driven decision making. This paper provides a brief insight into the research being conducted in this area, with a specific focus on alarm sequence modelling and the associated challenges faced in its implementation. The paper proposes a novel idea to predict a set of relevant repair actions from input alarm sequences, comparing Long Short-term Memory (LSTM) and Bidirectional LSTM (biLSTM) models. Achieving training accuracy of up to 80.23% and test accuracy of up to 76.01% with biLSTM gives a strong indication of the potential benefits of the proposed approach, which can be furthered in future research. The paper introduces a framework that integrates the proposed approach into O&M procedures and discusses the potential benefits, which include the reduction of a confusing plethora of alarms, as well as unnecessary vessel transfers to the turbines for fault diagnosis and correction.
  • 2022 8th International Symposium on Model-Based Safety Assessment, Munich, Germany

Automated Question-Answering for Interactive Decision Support in Operations & Maintenance of Wind Turbines.

    Intelligent question-answering (QA) systems have witnessed increased interest in recent years, particularly in their ability to facilitate information access, data interpretation or decision support. The wind energy sector is one of the most promising sources of renewable energy, yet turbines regularly suffer from failures and operational inconsistencies, leading to downtimes and significant maintenance costs. Addressing these issues requires rapid interpretation of complex and dynamic data patterns under time-critical conditions. In this article, we present a novel approach that leverages interactive, natural language-based decision support for operations & maintenance (O&M) of wind turbines. The proposed interactive QA system allows engineers to pose domain-specific questions in natural language, and provides answers (in natural language) based on the automated retrieval of information on turbine sub-components, their properties and interactions, from a bespoke domain-specific knowledge graph (KG). As data for specific faults is often sparse, we propose the use of paraphrase generation as a way to augment the existing dataset. Our QA system leverages encoder-decoder models to generate Cypher queries to obtain domain-specific facts from the KG database in response to user-posed natural language questions. Experiments with an attention-based sequence-to-sequence (Seq2Seq) model and a transformer show that the transformer accurately predicts up to 89.75% of responses to input questions, outperforming the Seq2Seq model marginally by 0.76%, while being 9.46 times more computationally efficient. The proposed QA system can help support engineers and technicians during O&M to reduce turbine downtime and operational costs, thus improving the reliability of wind energy as a source of renewable energy.
  • 2022 IEEE Access Vol 10.

Multimodal Approach to Early Detection of Harmful Algal Blooms.

    A rise in anomalous ecological events will be observed due to climate change. One such event is the harmful algal bloom, which occurs due to an increase in nutrients from anthropogenic activities and has economic and ecological effects. Algae thrive in warmer temperatures, which will lead to an increase in the frequency of harmful algal blooms. To cope with this increasing frequency, early detection tools are essential. Deep learning and frequent monitoring have been used to detect this phenomenon with a focus on unimodal approaches. In this work, we propose using multiple sources of satellite and in-situ data for detecting algal blooms with a joint multimodal learning approach, focusing on the North Sea and the Irish Sea. This work will aid domain experts to monitor potential changes to the ecosystem caused by human interference and to take action when necessary.
  • 2022 ECML/PKDD Workshop on Machine Learning for Earth Observation

Facilitating a smoother transition to renewable energy with AI.

    Artificial intelligence (AI) can help facilitate wider adoption of renewable energy globally. We organized a social event for the AI and renewables community to discuss these aspects at the International Conference on Learning Representations (ICLR), a leading AI conference. This opinion reflects on the key messages and provides a call for action on leveraging AI for transition toward net zero.
  • 2022 Patterns Opinion Vol 3, Issue 6.

RELATE: Generating a linguistically inspired Knowledge Graph for fine-grained emotion classification.

    Several existing resources are available for sentiment analysis (SA) tasks that are used for learning sentiment specific embedding (SSE) representations. These resources are either large, common-sense knowledge graphs (KG) that cover a limited amount of polarities/emotions or they are smaller in size, such as lexicons, which require costly human annotation and cover fine-grained emotions. Therefore, using knowledge resources to learn SSE representations is either limited by the low coverage of polarities/emotions or the overall size of a resource. In this paper, we first introduce a new directed KG called ‘RELATE’, which is built to overcome both the issue of low coverage of emotions and the issue of scalability. RELATE is the first KG of its size to cover Ekman’s six basic emotions that are directed towards entities. It is based on linguistic rules to incorporate the benefit of semantics without relying on costly human annotation. The performance of ‘RELATE’ is evaluated by learning SSE representations using a Graph Convolutional Neural Network (GCN).
  • 2022 13th Language Resources and Evaluation Conference (LREC).

Towards Contextually Sensitive Analysis of Memes: Meme Genealogy and Knowledge Base.

    As online communication grows, memes have continued to evolve and circulate as succinct multimodal forms of communication. However, computational approaches applied to meme-related tasks lack the same depth and contextual sensitivity of non-computational approaches and struggle to interpret intra-modal dynamics and referentiality. This research proposes a ‘meme genealogy’ of key features and relationships between memes to inform a knowledge base constructed from meme-specific online sources and embed connotative meaning and contextual information in memes. The proposed methods provide a basis to train contextually sensitive computational models for analysing memes and applications in automated meme annotation.
  • 2022 IJCAI Doctoral Consortium.

Imputation of Partially Observed Water Quality Data Using Self-Attention LSTM.

    Possible sensor failures in monitoring systems result in partially filled data, which may lead to erroneous statistical conclusions and affect critical systems such as pollutant detectors and anomaly activity detectors. Imputation therefore becomes necessary to decrease error. This work addresses the missing data problem by experimenting with various methods in the context of a water quality dataset with high miss rates. The compared models, which make different assumptions about the data, are Generative Adversarial Networks, Multiple Imputation by Chained Equations, Variational Auto-Encoders, and Recurrent Neural Networks. A novel recurrent neural network architecture with self-attention is proposed in which imputation is done in a single pass. The proposed model performs with a lower root mean square error, ranging between 0.012 and 0.28, in three of the four locations. The self-attention components increase the interpretability of the imputation process at each stage of the network, providing information to domain experts.
  • 2022 IEEE International Joint Conference on Neural Networks (IJCNN). Padua, Italy.
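
The root mean square error reported in the abstract above is a standard way to score imputation: known values are hidden, the model fills them in, and the error is measured only on the hidden positions. The sketch below illustrates that evaluation with mean imputation as a stand-in for the paper's model. The function name, the mask convention and the toy series are assumptions for illustration only.

```python
import math


def masked_rmse(truth, imputed, mask):
    """RMSE over the positions where mask is True (values hidden before imputation)."""
    squared = [(t - p) ** 2 for t, p, m in zip(truth, imputed, mask) if m]
    if not squared:
        raise ValueError("mask selects no positions")
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical normalised sensor readings; positions 1 and 3 are artificially hidden.
truth = [0.2, 0.4, 0.6, 0.8]
mask = [False, True, False, True]

# Stand-in imputer: replace hidden values with the mean of the observed ones.
observed = [t for t, m in zip(truth, mask) if not m]
observed_mean = sum(observed) / len(observed)
imputed = [observed_mean if m else t for t, m in zip(truth, mask)]

score = masked_rmse(truth, imputed, mask)
```

Scoring only the masked positions keeps the observed values, which any imputer reproduces exactly, from diluting the error.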

Rapid assessment of offshore monopile fatigue using machine learning.

    Hide/Show Full Abstract Offshore wind turbine monopiles require structural health monitoring throughout their lifespan, yet direct structural measurements are limited. This paper combines numerical modeling and machine learning to present an approach to obtain rapid estimations of monopile fatigue using hourly metocean conditions. Aero-hydro-servo-elastic numerical simulations for a reference turbine provide the meta-model training dataset that encompasses wind-wave conditions applicable to the North Sea. Analysis reveals conditions whereby higher-order fully non-linear wave kinematics produce larger damage values compared to linear waves. This increase in damage is absent when implementing a simple probabilistic data lumping method. The prototype meta-model is developed based on convolutional neural networks to determine the monopile damage from measured wind-wave conditions at high temporal frequency. The proof-of-concept meta-model provides a step-change that demonstrates a promising approach to estimate monopile fatigue accumulation at high temporal resolution with scope for development to specific real-world offshore wind farms where validation data is available.
  • 2022 European Workshop on Structural Health Monitoring (EWSHM), Palermo, Italy.

Physics-informed machine learning for rapid fatigue assessments in offshore wind farms.

    Hide/Show Full Abstract Accurate and efficient assessment of offshore wind turbine monopile fatigue is required to inform maintenance and decommissioning decision making. However, direct field-based measurements are limited and current industry-standard approaches are often devoid of fully non-linear waves, thus omitting critically important resonance effects. Here, numerical modelling is combined with machine learning to develop a meta-model capable of rapidly estimating monopile damage and fatigue. Fully non-linear wave kinematics were numerically modelled using higher-order boundary element methods to represent conditions recorded in the North Sea. These environmental simulations were implemented within numerical aero-hydro-servo-elastic engineering modelling of a reference turbine (NREL 5MW) with monopile foundations, for both operational and parked turbine configurations across a range of incoming wind conditions. The modelled fore-aft tower base bending moments are used to estimate the corresponding structural damage using rainflow-counting methods, enabling identification of conditions associated with the largest damage loads. These data are applied in the development of a meta-model based on convolutional neural networks to provide rapid assessment of monopile damage associated with any given environmental and operational condition.
  • 2022 Supergen ORE Hub Fourth Annual Assembly.

Modelling Phytoplankton Behaviour in the North and Irish Sea with Transformer Networks.

    Hide/Show Full Abstract Climate change will affect how water sources are managed and monitored. Continuous monitoring of water quality is crucial to detect pollution, to ensure that various natural cycles are not disrupted by anthropogenic activities and to assess the effectiveness of beneficial management measures taken under defined protocols. One such disruption is algal blooms, in which the population of phytoplankton increases rapidly, affecting biodiversity in marine environments. The frequency of algal blooms will increase with climate change as it presents favourable conditions for the reproduction of phytoplankton. Machine learning has been used for early detection of algal blooms previously, with the focus mostly on single closed bodies of water in Far East Asia with short time ranges. In this work, we study four locations around the North Sea and the Irish Sea with different characteristics, predicting activity over longer time-spans and explaining the importance of the input for the decision making process of the prediction model. This work helps domain experts to monitor potential changes to the ecosystem caused by human interference over longer time ranges and to take action when necessary.
  • 2022 Northern Lights Deep Learning Conference (NLDL).

Using Multimodal Data and AI to Dynamically Map Flood Risks.

    Hide/Show Full Abstract Classical measurements and modelling that underpin present flood warning and alert systems are based on fixed and spatially restricted static sensor networks. Computationally expensive physics-based simulations are often used that can’t react in real-time to changes in environmental conditions. We want to explore contemporary artificial intelligence (AI) for predicting flood risk in real time by using a diverse range of data sources. By combining heterogeneous data sources, we aim to nowcast rapidly changing flood conditions and gain a greater understanding of urgent humanitarian needs.
  • 2022 AAAI Doctoral Consortium (AAAI-DC).

Scientometric review of artificial intelligence for operations & maintenance of wind turbines: The past, present and future.

    Hide/Show Full Abstract Wind energy has emerged as a highly promising source of renewable energy in recent times. However, wind turbines regularly suffer from operational inconsistencies, leading to significant costs and challenges in operations and maintenance (O&M). Condition-based monitoring (CBM) and performance assessment/analysis of turbines are vital aspects for ensuring efficient O&M planning and cost minimisation. Data-driven decision making techniques have witnessed rapid evolution in the wind industry for such O&M tasks during the last decade, from applying signal processing methods in early 2010 to artificial intelligence (AI) techniques, especially deep learning in 2020. In this article, we utilise statistical computing to present a scientometric review of the conceptual and thematic evolution of AI in the wind energy sector, providing evidence-based insights into present strengths and limitations of data-driven decision making in the wind industry. We provide a perspective into the future and on current key challenges in data availability and quality, lack of transparency in black box-natured AI models, and prevailing issues in deploying models for real-time decision support, along with possible strategies to overcome these problems. We hope that a systematic analysis of the past, present and future of CBM and performance assessment can encourage more organisations to adopt data-driven decision making techniques in O&M towards making wind energy sources more reliable, contributing to the global efforts of tackling climate change.
  • 2021. Renewable and Sustainable Energy Reviews 144.

A divide-and-conquer approach to neural natural language generation from structured data.

    Hide/Show Full Abstract Current approaches that generate text from linked data for complex real-world domains can face problems including rich and sparse vocabularies as well as learning from examples of long varied sequences. In this article, we propose a novel divide-and-conquer approach that automatically induces a hierarchy of “generation spaces” from a dataset of semantic concepts and texts. Generation spaces are based on a notion of similarity of partial knowledge graphs that represent the domain and feed into a hierarchy of sequence-to-sequence or memory-to-sequence learners for concept-to-text generation. An advantage of our approach is that learning models are exposed to the most relevant examples during training which can avoid bias towards majority samples. We evaluate our approach on two common benchmark datasets and compare our hierarchical approach against a flat learning setup. We also conduct a comparison between sequence-to-sequence and memory-to-sequence learning models. Experiments show that our hierarchical approach overcomes issues of data sparsity and learns robust lexico-syntactic patterns, consistently outperforming flat baselines and previous work by up to 30%. We also find that while memory-to-sequence models can outperform sequence-to-sequence models in some cases, the latter are generally more stable in their performance and represent a safer overall choice.
  • 2021. Neurocomputing 433, 300-309.

Hierarchical Multiscale Recurrent Neural Networks for Detecting Suicide Notes.

    Hide/Show Full Abstract Recent statistics in suicide prevention show that people are increasingly posting their last words online, and with the unprecedented availability of textual data from social media platforms, researchers have the opportunity to analyse such data. Furthermore, psychological studies have shown that our state of mind can manifest itself in the linguistic features we use to communicate. In this paper, we investigate whether it is possible to automatically distinguish suicide notes from other types of social media blogs in two document-level classification tasks. The first task aims to distinguish suicide notes from depressed posts and blog posts in a balanced dataset, whilst the second experiment looks at how well suicide notes can be classified when there is a vast amount of neutral text data, which makes the task more applicable to real-world scenarios. Furthermore, we perform a linguistic analysis using LIWC (Linguistic Inquiry and Word Count). We present a learning model for modelling long sequences in two experiment series. We achieve an F1-score of 88.26% over the baselines of 0.60 in experiment 1 and 96.1% over the baseline in experiment 2. Finally, we show through visualisations which features the learning model identifies; these include emotions such as love and personal pronouns.
  • 2021. IEEE Transactions on Affective Computing.

XAI4Wind: A Multimodal Knowledge Graph Database for Explainable Decision Support in Operations & Maintenance of Wind Turbines.

    Hide/Show Full Abstract Condition-based monitoring (CBM) has been widely utilised in the wind industry for monitoring operational inconsistencies and failures in turbines, with techniques ranging from signal processing and vibration analysis to artificial intelligence (AI) models using Supervisory Control & Data Acquisition (SCADA) data. However, existing studies do not present a concrete basis to facilitate explainable decision support in operations and maintenance (O&M), particularly for automated decision support through recommendation of appropriate maintenance action reports corresponding to failures predicted by CBM techniques. Knowledge graph databases (KGs) model a collection of domain-specific information and have played an intrinsic role for real-world decision support in domains such as healthcare and finance, but have seen very limited attention in the wind industry. We propose XAI4Wind, a multimodal knowledge graph for explainable decision support in real-world operational turbines and demonstrate through experiments several use-cases of the proposed KG towards O&M planning through interactive query and reasoning and providing novel insights using graph data science algorithms. The proposed KG combines multimodal knowledge like SCADA parameters and alarms with natural language maintenance actions, images etc. By integrating our KG with an Explainable AI model for anomaly prediction, we show that it can provide effective human-intelligible O&M strategies for predicted operational inconsistencies in various turbine sub-components. This can help instil better trust and confidence in conventionally black-box AI models. We make our KG publicly available and envisage that it can serve as the building ground for providing autonomous decision support in the wind industry.
  • arXiv preprint arXiv:2012.10489, 2020.

Transparency, Interpretability and Data Availability: Key Challenges for Tackling Climate Change with AI.

    Hide/Show Full Abstract With growing natural disasters, rise in carbon emissions and faltering ecosystems, the need for furthering research in climate change has become integral. Recent studies have shown that data science can play a vital role in better understanding natural phenomena and discovering novel insights. Although no silver bullet, machine learning (ML) has been successfully utilised in an array of applications, ranging from prediction and assessment of droughts and floods, energy control in grids, water quality modelling, operations & maintenance (O&M) of renewable energy sources such as wind and solar energy etc. However, the existing studies suffer from 2 prime challenges: (1) Lack of data availability - domain specific information e.g. from wind turbines, is often commercially sensitive, making it difficult to procure large amounts of useable data - especially new kinds of data which can possibly generate significant new insights. Transfer learning techniques can help learn from little or no labelled data, ensuring accuracy and helping algorithms to generalise better. (2) The black-box nature of (deep) ML models makes them suffer from the problem of transparency, wherein, although predictions can often be made with high accuracy, confidence and trust in the model decisions is difficult. A human intelligible diagnosis of when, why, what and how a model performs (or not) is essential. Hybrid ML techniques can bridge the gap between transparency and accuracy, and causal inference can help discover hidden insights from data. Natural language generation can further help in generating informative reports and descriptions of natural disasters and O&M strategies for renewable energy sources. We propose a perspective to tackle some of these challenges in ensuring reliable decision making and envisage that making data-driven decision support systems intelligent and transparent would have a significant impact in tackling climate change.
  • 2020. Workshop on Data Science in Climate and Climate Impact Research, ETH Zurich, Switzerland.

Explainable AI for Intelligent Decision Support in Operations & Maintenance of Wind Turbine.

    Hide/Show Full Abstract As global efforts in transitioning to sustainable energy sources rise, wind energy has become a leading renewable energy resource. However, turbines are complex engineering systems and rely on effective operations & maintenance (O&M) to prevent catastrophic failures in sub-components (gearbox, generator, etc.). Wind turbines have multiple sensors embedded within their sub-components which regularly measure key internal and external parameters (generator bearing temperature, rotor speed, wind speed etc.) in the form of Supervisory Control & Data Acquisition (SCADA) data. While existing studies have focused on applying ML techniques towards anomaly prediction in turbines based on SCADA data, they have not been supported with transparent decisions, owing to the inherent black box nature of ML models. In this project, we aim to explore transparent and intelligent decision support in O&M of turbines, by predicting faults and providing human-intelligible maintenance strategies to avert and fix the underlying causes. We envisage that in contributing to explainable AI for the wind industry, our method would help make turbines more reliable, encouraging more organisations to switch to renewable energy sources for combating climate change.
  • 2020. Proceedings of the European Conference on Artificial Intelligence (ECAI)’s Doctoral Consortium, Santiago, Spain, August.

Improving the Transparency of Deep Neural Networks using Artificial Epigenetic Molecules.

    Hide/Show Full Abstract Artificial gene regulatory networks (AGRNs) are connectionist architectures inspired by biological gene regulation capable of solving tasks within complex dynamical systems. The implementation of an operational layer inspired by epigenetic mechanisms has been shown to improve the performance of AGRNs, and improve their transparency by providing a degree of explainability. In this paper, we apply artificial epigenetic layers (AELs) to two trained deep neural networks (DNNs) in order to gain an understanding of their internal workings, by determining which parts of the network are required at a particular point in time, and which nodes are not used at all. The AEL consists of artificial epigenetic molecules (AEMs) that dynamically interact with nodes within the DNNs to allow for the selective deactivation of parts of the network.
  • 2020. Proceedings of the 12th International Joint Conference on Computational Intelligence (IJCCI).

Deep reinforcement learning for maintenance planning of offshore vessel transfer.

    Hide/Show Full Abstract Offshore wind farm operators need to make short-term decisions on planning vessel transfers to turbines for preventive or corrective maintenance. These decisions can play a pivotal role in ensuring maintenance actions are carried out in a timely and cost-effective manner. The present optimization of offshore vessel transfer uses mathematical models rather than learning decisions from historical data. In this paper, we design a simulated environment for an offshore wind farm based on Supervisory Control & Data Acquisition (SCADA) data and alarm logs of historical faults in an operational turbine. Firstly, we utilise a state-of-the-art decision tree model to predict fault types using SCADA features, and provide explainable decisions. Next, we apply deep reinforcement learning to automatically learn maintenance priorities corresponding to different fault types for ensuring prioritized vessel transfers for critical conditions, and deciding on optimal vessel fleet size. This can lead to significant savings in maintenance costs for the offshore wind industry.
  • Developments in Renewable Energies Offshore: Proceedings of the 4th International Conference on Renewable Energies Offshore (RENEW 2020, 12-15 October 2020, Lisbon, Portugal).

Temporal Causal Inference in Wind Turbine SCADA Data Using Deep Learning for Explainable AI.

    Hide/Show Full Abstract Machine learning techniques have been widely used for condition-based monitoring of wind turbines using Supervisory Control & Data Acquisition (SCADA) data. However, many machine learning models, including neural networks, operate as black boxes: despite performing suitably well as predictive models, they are not able to identify causal associations within the data. For data-driven systems to approach human-level intelligence in generating effective maintenance strategies, it is integral to discover hidden knowledge in the operational data. In this paper, we apply deep learning to discover causal relationships between multiple features (confounders) in SCADA data for faults in various sub-components from an operational turbine using convolutional neural networks (CNNs) with attention. Our technique overcomes the black box nature of conventional deep learners and identifies hidden confounders in the data through the use of temporal causal graphs. We demonstrate the effects of SCADA features on a wind turbine’s operational status, and show that our technique contributes to explainable AI for wind energy applications by providing transparent and interpretable decision support.
  • Journal of Physics: Conference Series, 2020.

Deep learning with knowledge transfer for explainable anomaly prediction in wind turbines.

    Hide/Show Full Abstract The last decade has witnessed an increased interest in applying machine learning techniques to predict faults and anomalies in the operation of wind turbines. These efforts have lately been dominated by deep learning techniques which, as in other fields, tend to outperform traditional machine learning algorithms given sufficient amounts of training data. An important shortcoming of deep learning models is their lack of transparency: they operate as black boxes and typically do not provide rationales for their predictions, which can lead to a lack of trust in predicted outputs. In this article, a novel hybrid model for anomaly prediction in wind farms is proposed that combines a recurrent neural network approach for accurate classification with an XGBoost decision tree classifier for transparent outputs. Experiments with an offshore wind turbine show that our model achieves a classification accuracy of up to 97%. The model is further able to generate detailed feature importance analyses for any detected anomalies, identifying exactly those components in a wind turbine that contribute to an anomaly. Finally, the feasibility of transfer learning is demonstrated for the wind domain by porting our “offshore” model to an unseen dataset from an onshore wind farm. The latter model achieves an accuracy of 65% and is able to detect 85% of anomalies in the unseen domain. These results are encouraging for application to wind farms for which no training data is available, e.g. because they have not been in operation for long.
  • 2020. Wind Energy 23(8).

A Dual Transformer Model for Intelligent Decision Support for Maintenance of Wind Turbines.

    Hide/Show Full Abstract Wind energy is one of the fastest-growing sustainable energy sources in the world but relies crucially on efficient and effective operations and maintenance to generate sufficient amounts of energy and reduce downtime of wind turbines and associated costs. Machine learning has been applied to fault prediction in wind turbines, but these predictions have not been supported with suggestions on how to avert and fix faults. We present a data-to-text generation system utilising transformers for generating corrective maintenance strategies for faults using SCADA data capturing the operational status of turbines. We achieve this in two stages: a first stage identifies faults based on SCADA input features and their relevance. A second stage performs content selection for the language generation task and creates maintenance strategies based on phrase-based natural language templates. Experiments show that our dual transformer model achieves an accuracy of up to 96.75% for alarm prediction and up to 75.35% for its choice of maintenance strategies during content-selection. A qualitative analysis shows that our generated maintenance strategies are promising. We make our human-authored maintenance templates publicly available, and include a brief video explaining our approach.
  • 2020 International Joint Conference on Neural Networks (IJCNN).

The Promise of Causal Reasoning in Reliable Decision Support for Wind Turbines.

    Hide/Show Full Abstract The global pursuit of sustainable development is leading to increased adoption of renewable energy sources. Wind turbines are promising sources of clean energy, but regularly suffer from failures and down-times, primarily due to the complex environments and unpredictable conditions wherein they are deployed. While various studies have earlier utilised machine learning techniques for fault prediction in turbines, their black-box nature hampers explainability and trust in decision making. We propose the application of causal reasoning in operations & maintenance of wind turbines using Supervisory Control & Data Acquisition (SCADA) data, and harness attention-based convolutional neural networks (CNNs) to identify hidden associations between different parameters contributing to failures in the form of temporal causal graphs. By interpreting these non-obvious relationships (many of which may have potentially been disregarded as noise), engineers can plan ahead for unforeseen failures, helping make wind power sources more reliable.
  • Fragile Earth Workshop, KDD, August 2020, San Diego, CA

Hybrid approaches to fine-grained emotion detection in social media data.

    Hide/Show Full Abstract This paper states the challenges in fine-grained target-dependent Sentiment Analysis for social media data using recurrent neural networks. Firstly, we outline the problem statement and give a brief overview of related work in the area. Then we outline progress and results achieved to date, a brief research plan and future directions of this work.
  • To appear. In AAAI-2020 Doctoral Consortium. New York, USA.

Bidirectional Dilated LSTM with Attention for Fine-grained Emotion Classification in Tweets.

    Hide/Show Full Abstract We propose a novel approach for fine-grained emotion classification in tweets using a Bidirectional Dilated LSTM (BiDLSTM) with attention. Conventional LSTM architectures can face problems when classifying long sequences, which is problematic for tweets, where crucial information is often attached to the end of a sequence, e.g. an emoticon. We show that by adding a bidirectional layer, dilations and attention mechanism to a standard LSTM, our model overcomes these problems and is able to maintain complex data dependencies over time. We present experiments with two datasets, the 2018 WASSA Implicit Emotions Shared Task and a new dataset of 240,000 tweets. Our BiDLSTM with attention achieves a test accuracy of up to 81.97% outperforming competitive baselines by up to 10.52% on both datasets. Finally, we evaluate our data against a human benchmark on the same task.
  • To appear. In Proceedings of AAAI-2020 Workshop on Affective Content Analysis. New York, USA

Transparent Deep Learning and Transductive Transfer Learning: A New Dimension for Wind Energy Research.

    Hide/Show Full Abstract Wind turbines suffer from operational inconsistencies due to a variety of factors, ranging from environmental changes to intrinsic anomalies in specific components, such as gearbox, generator, pitch system etc. Condition monitoring of wind turbines has been a critical research area in the last decade, wherein the Supervisory Control & Data Acquisition (SCADA) data is used to analyse the operational behaviour of the turbine and predict any incipient faults to prevent catastrophic losses caused by unexpected failures. Machine learning models have formed a large part of the data-analytics based methods used for learning from historical failures through supervised learning, but they lack the ability to learn with little labelled data, or for that matter, no labelled faults in a different domain. Deep learning has shown immense success in areas where time-series data is to be modelled. In this paper, we propose a hybrid deep learning model combining a Long short-term memory network (LSTM) with XGBoost, a decision tree-based classifier, to provide the benefits of accuracy through deep learning and transparency through traditional decision trees. Our study shows that transfer learning allows us to make predictions with increasing accuracy on unseen data, which is useful for simulations of new operations, new wind farms or other cases of non-available training data. This can help reduce downtime of turbines through predictive maintenance, by predicting incipient faults, or support corrective maintenance, by assisting the engineers and technicians to analyse the root causes behind the failure, thus contributing to the reliability and uptake of wind energy as a sustainable and promising domain.
  • 2019. In WindEurope Offshore, Copenhagen, Denmark.

Natural Language Generation for Operations and Maintenance in Wind Turbines.

    Hide/Show Full Abstract Wind energy is one of the fastest-growing sustainable energy sources in the world but relies crucially on efficient and effective operations and maintenance to generate sufficient amounts of energy and reduce downtime of wind turbines and associated costs. Machine learning has been applied to fault prediction in wind turbines, but these predictions have not been supported with suggestions on how to avert and fix faults. We present a data-to-text generation system using transformers to produce event descriptions from SCADA data capturing the operational status of turbines and proposing maintenance strategies. Experiments show that our model learns feature representations that correspond to expert judgements. In making a contribution to the reliability of wind energy, we hope to encourage organisations to switch to sustainable energy sources and help combat climate change.
  • 2019. In NeurIPS 2019 Workshop on Tackling Climate Change with Machine Learning. Vancouver, Canada.

Dilated LSTM with ranked units for classification of suicide notes.

    Hide/Show Full Abstract Recent statistics in suicide prevention show that people are increasingly posting their last words online and with the unprecedented availability of textual data from social media platforms researchers have the opportunity to analyse such data. Furthermore, psychological studies have shown that our state of mind can manifest itself in the linguistic features we use to communicate. In this paper, we investigate whether it is possible to automatically identify suicide notes from other types of social media blogs in a document-level classification task. Also, we present a learning model for modelling long sequences, achieving an f1-score of 0.84 over the baselines of 0.53 and 0.80 (best competing model). Finally, we also show through visualisations which features the learning model identifies.
  • 2019. In Proceedings of AI for Social Good workshop at NeurIPS (2019), Vancouver, Canada.

Dilated LSTM with attention for Classification of suicide notes.

    Hide/Show Full Abstract In this paper we present a dilated LSTM with attention mechanism for document-level classification of suicide notes, last statements and depressed notes. We achieve an accuracy of 87.34% compared to competitive baselines of 80.35% (Logistic Model Tree) and 82.27% (Bi-directional LSTM with Attention). Furthermore, we provide an analysis of both the grammatical and thematic content of suicide notes, last statements and depressed notes. We find that the use of personal pronouns, cognitive processes and references to loved ones are most important. Finally, we show through visualisations of attention weights that the Dilated LSTM with attention is able to identify the same distinguishing features across documents as the linguistic analysis.
  • 2019. In Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019) at EMNLP. Hong Kong.

Modularity Within Artificial Gene Regulatory Networks.

    Hide/Show Full Abstract Modularity is a feature found in biological systems where it is common for functionally related processes to evolve to be individually discrete units. Such traits are prevalent in prokaryotic genomes. This work aims to understand to what extent artificial gene regulatory networks (AGRNs), which take inspiration from gene regulation in nature, will self-divide into modular task-specific sub-networks consisting of multiple interacting nodes when solving multiple complex tasks. To investigate this, we evolve AGRNs to solve three different tasks with ranging dynamics simultaneously and evaluate the network structure. From this we aim to build an understanding of whether modularity in AGRNs is fundamental to solving multiple tasks and what effect the nature of the tasks being solved has on modularity within the networks.
  • 2019. IEEE Congress on Evolutionary Computation, Wellington, New Zealand.

A Deep Learning Approach Towards Prediction of Faults in Wind Turbines.

    Hide/Show Full Abstract With the rising costs of conventional sources of energy, the world is moving towards sustainable energy sources including wind energy. Wind turbines consist of several electrical and mechanical components and experience an enormous amount of irregular loads, making their operational behaviour at times inconsistent. Operations and Maintenance (O&M) is a key factor in monitoring such inconsistent behaviour of the turbines in order to predict and prevent any incipient faults which may occur in the near future.
  • 2019. Extended Abstract in Northern Lights Deep Learning Workshop (NLDL), Tromso, Norway.

Evolutionary Constraint in Artificial Gene Regulatory Networks.

    Hide/Show Full Abstract Evolutionary processes such as convergent evolution and rapid adaptation suggest that there are constraints on how organisms evolve. Without constraint, such processes would most likely not be possible in the time frame in which they are seen. This paper investigates how artificial gene regulatory networks (GRNs), a connectionist architecture designed for computational problem solving, may too be constrained in their evolutionary pathway. To understand this further, GRNs are applied to two different computational tasks and the way their underlying genes evolve over time is observed. From this, rules about how often genes are evolved and how this correlates with their connectivity within the GRN are deduced. By generating and applying these rules, we can build an understanding of how GRNs are constrained in their evolutionary path, and build measures to exploit this to improve evolutionary performance and speed.
  • 2018. In Proceedings of the 18th Annual UK Workshop on Computational Intelligence, Nottingham, UK. Volume 840 of the Advances in Intelligent Systems and Computing.

Unsupervised suicide note classification.

    Hide/Show Full Abstract With the greater availability of linguistic data from public social media platforms and the advancements of natural language processing, a number of opportunities have arisen for researchers to analyse this type of data. Research efforts have mostly focused on detecting the polarity of textual data, evaluating whether there is positive, negative or sometimes neutral content. The use of neural networks in particular has recently yielded significant results in polarity detection experiments. In this paper we present a more fine-grained approach to detecting sentiment in textual data, particularly analysing a corpus of suicide notes, depressive notes and love notes. We achieve a classification accuracy of 71.76% when classifying based on text and sentiment features, and an accuracy of 69.41% when using the words present in the notes alone. We discover that while emotions in all three datasets overlap, each of them has a unique ‘emotion profile’ which allows us to draw conclusions about the potential mental state that it reflects. Using the emotion sequences only, we achieve an accuracy of 75.29%. The results from unannotated data, while worse than the other models, nevertheless represent an encouraging step towards being able to flag potentially harmful social media posts online and in real time. We provide a high-level corpus analysis of the data sets in order to demonstrate the grammatical and emotional differences.
  • 2018. In Proceedings of the 7th KDD Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM), co-located with the Knowledge Discovery and Data Mining (KDD), London, UK.

Domain Transfer for Deep Natural Language Generation from Abstract Meaning Representations.

    Hide/Show Full Abstract Stochastic natural language generation systems that are trained from labelled datasets are often domain-specific in their annotation and in their mapping from semantic input representations to lexical-syntactic outputs. As a result, learnt models fail to generalize across domains, heavily restricting their usability beyond single applications. In this article, we focus on the problem of domain adaptation for natural language generation. We show how linguistic knowledge from a source domain, for which labelled data is available, can be adapted to a target domain by reusing training data across domains. As a key to this, we propose to employ abstract meaning representations as a common semantic representation across domains. We model natural language generation as a long short-term memory recurrent neural network encoder-decoder, in which one recurrent neural network learns a latent representation of a semantic input, and a second recurrent neural network learns to decode it to a sequence of words. We show that the learnt representations can be transferred across domains and can be leveraged effectively to improve training on new unseen domains. Experiments in three different domains and with six datasets demonstrate that the lexical-syntactic constructions learnt in one domain can be transferred to new domains and achieve up to 75-100% of the performance of in-domain training. This is based on objective metrics such as BLEU and semantic error rate and a subjective human rating study. Training a policy from prior knowledge from a different domain is consistently better than pure in-domain training by up to 10%.
  • 2017. IEEE Computational Intelligence Magazine: Special Issue on Natural Language Generation with Computational Intelligence.

Transparency Of Execution Using Epigenetic Networks.

    Hide/Show Full Abstract This paper describes how the recurrent connectionist architecture epiNet, which is capable of dynamically modifying its topology, is able to provide a form of transparent execution. EpiNet, which is inspired by eukaryotic gene regulation in nature, is able to break its own architecture down into sets of smaller interacting networks. This allows for autonomous complex task decomposition, and by analysing these smaller interacting networks, it is possible to provide a real world understanding of why specific decisions have been made. We expect this work to be useful in fields where the risk of improper decision making is high, such as medical simulations, diagnostics and financial modelling. To test this hypothesis we apply epiNet to two data sets within UCI’s machine learning repository, each of which requires a specific set of behaviours to solve. We then perform analysis on the overall functionality of epiNet in order to deduce the underlying rules behind its functionality and in turn provide transparency of execution.
  • 2017. In Proceedings of the European Conference on Artificial Life (ECAL), Lyon, France.

Deep text generation - Using hierarchical decomposition to mitigate the effect of rare data points.

    Hide/Show Full Abstract Deep learning has recently been adopted for the task of natural language generation (NLG) and shown remarkable results. However, learning can go awry when the input dataset is too small or not well balanced with regards to the examples it contains for various input sequences. This is relevant to naturally occurring datasets such as many that were not prepared for the task of natural language processing but scraped off the web and originally prepared for a different purpose. As a mitigation to the problem of unbalanced training data, we therefore propose to decompose a large natural language dataset into several subsets that “talk about” the same thing. We show that the decomposition helps to focus each learner’s attention during training. Results from a proof-of-concept study show 73% times faster learning over a flat model and better results.
  • 2017. In Proceedings of Language, Data and Knowledge (LDK), Galway, Ireland. Proceedings in: Springer Lecture Notes in Computer Science (LNCS).

DEFIne: A Fluent Interface DSL for Deep Learning Applications.

    Hide/Show Full Abstract Recent years have seen a surge of interest in deep learning models that outperform other machine learning algorithms on benchmarks across many disciplines. Most existing deep learning libraries facilitate the development of neural nets by providing a mathematical framework that helps users implement their models more efficiently. This still represents a substantial investment of time and effort, however, when the intention is to compare a range of competing models quickly for a specific task. We present DEFIne, a fluent interface DSL for the specification, optimisation and evaluation of deep learning models. The fluent interface is implemented through method chaining. DEFIne is embedded in Python and is built on top of its most popular deep learning libraries, Keras and Theano. It extends these with common operations for data pre-processing and representation as well as visualisation of datasets and results. We test our framework on three benchmark tasks from different domains: heart disease diagnosis, hand-written digit recognition and weather forecast generation. Results in terms of accuracy, runtime and lines of code show that our DSL achieves equivalent accuracy and runtime to state-of-the-art models, while requiring only about 10 lines of code per application.
  • 2017. In Proceedings of the 2nd International Workshop on Real World Domain Specific Languages (RWDSL), co-located with the International Symposium on Code Generation and Optimisation (CGO’17). Austin, Texas. In: ACM Digital Library, International Conference Proceedings Series (ICPS).

Proceedings of the 4th International Workshop on Machine Learning for Interactive Systems. Co-located with the International Conference on Machine Learning (ICML), Lille, France.

  • Cuayáhuitl, H., Dethlefs, N., Frommberger, L., van Otterlo, M., Pietquin, O.
  • Link to proceedings
    Hide/Show Full Abstract Learning systems or robots that interact with their environment by perceiving, acting or communicating often face a challenge in how to bring these different concepts together. This challenge arises because core concepts are typically studied within their respective communities, such as the computer vision, robotics and natural language processing communities, among others. A commonality across communities is the use of machine learning techniques and algorithms. In this way, machine learning is crucial in the development of truly intelligent systems, not just by providing techniques and algorithms, but also by acting as a unifying factor across communities, encouraging communication, discussion and exchange of ideas. [...]
  • 2015. Proceedings in Journal of Machine Learning Research (JMLR): Workshop and Conference Proceedings.

Introduction to the Special Issue on Machine Learning for Multiple Modalities in Interactive Systems and Robots.

  • Cuayáhuitl, H., Frommberger, L., Dethlefs, N., Raux, A., Marge, M., Zender, H.
  • Link to article
    Hide/Show Full Abstract This special issue highlights research articles that apply machine learning to robots and other systems that interact with users through more than one modality, such as speech, gestures, and vision. For example, a robot may coordinate its speech with its actions, taking into account (audio-)visual feedback during their execution. Machine learning provides interactive systems with opportunities to improve performance not only of individual components but also of the system as a whole. However, machine learning methods that encompass multiple modalities of an interactive system are still relatively hard to find. The articles in this special issue represent examples that contribute to filling this gap.
  • 2014. ACM Transactions on Interactive Intelligent Systems (ACM-TiiS).

Proceedings of the Third Workshop on Machine Learning for Interactive Systems (MLIS-2014): Bridging the Gap Between Perception, Action and Communication.

  • Cuayáhuitl, H., Frommberger, L., Dethlefs, N., van Otterlo, M.
  • PDF
    Hide/Show Full Abstract The AAAI-14 Workshop program was held Sunday and Monday, July 27–28, 2014, at the Québec City Convention Centre in Québec, Canada. The AAAI-14 workshop program included 15 workshops covering a wide range of topics in artificial intelligence. The titles of the workshops were Artificial Intelligence and Robotics; Artificial Intelligence Applied to Assistive Technologies and Smart Environments; Cognitive Computing for Augmented Human Intelligence; Computer Poker and Imperfect Information; Discovery Informatics; Incentives and Trust in Electronic Communities; Intelligent Cinematography and Editing; Machine Learning for Interactive Systems: Bridging the Gap Between Perception, Action, and Communication; Modern Artificial Intelligence for Health Analytics; Multiagent Interaction Without Prior Coordination; Multidisciplinary Workshop on Advances in Preference Handling; Semantic Cities — Beyond Open Data to Models, Standards, and Reasoning; Sequential Decision Making with Big Data; Statistical Relational AI; and the World Wide Web and Public Health Intelligence. This article presents short summaries of those events.
  • 2014. Co-located with the 28th Conference on Artificial Intelligence (AAAI), Quebec City, Canada.

Proceedings of the Second Workshop on Machine Learning for Interactive Systems (MLIS’2013): Bridging the Gap Between Perception, Action and Communication.

  • Cuayáhuitl, H., Frommberger, L., Dethlefs, N., van Otterlo, M.
  • Link to proceedings
    Hide/Show Full Abstract Intelligent systems or robots that interact with their environment by perceiving, acting or communicating often face a challenge in how to bring these different concepts together. One of the main reasons for this challenge is the fact that the core concepts in perception, action and communication are typically studied by different communities: the computer vision, robotics and natural language processing communities, among others, without much interchange between them. Learning systems that encompass perception, action and communication in a unified and principled way are still rare. As machine learning lies at the core of these communities, it can act as a unifying factor in bringing the communities closer together. Unifying these communities is highly important for understanding how state-of-the-art approaches from different disciplines can be combined (and applied) to form generally interactive intelligent systems. MLIS-2013 aims to bring researchers from multiple disciplines together that are in some way or another affected by the gap between perception, action and communication. Our goal is to provide a forum for interdisciplinary discussion that allows researchers to look at their work from new perspectives that go beyond their core community and develop new interdisciplinary collaborations.
  • 2013. Co-located with the 23rd International Joint Conference on Artificial Intelligence (IJCAI). Beijing, China.

Machine Learning for Interactive Systems and Robots: A Brief Introduction.

  • Cuayáhuitl, H., van Otterlo, M., Dethlefs, N., Frommberger, L.
  • PDF
    Hide/Show Full Abstract Research on interactive systems and robots, i.e. interactive machines that perceive, act and communicate, has applied a multitude of different machine learning frameworks in recent years, many of which are based on a form of reinforcement learning (RL). In this paper, we will provide a brief introduction to the application of machine learning techniques in interactive learning systems. We identify several dimensions along which interactive learning systems can be analyzed. We argue that while many applications of interactive machines seem different at first sight, sufficient commonalities exist in terms of the challenges faced. By identifying these commonalities between (learning) approaches, and by taking interdisciplinary approaches towards the challenges, we anticipate more effective design and development of sophisticated machines that perceive, act and communicate in complex, dynamic and uncertain environments.
  • 2013. In Proceedings of the 2nd Workshop on Machine Learning for Interactive Systems (MLIS-2013): Bridging the Gap between Perception, Action and Communication. ACM International Conference Proceedings Series. Co-located with IJCAI. Beijing, China.

Proceedings of the First Workshop on Machine Learning for Interactive Systems (MLIS’2012): Bridging the Gap Between Language, Motor Control and Vision.

  • Cuayáhuitl, H., Frommberger, L., Dethlefs, N., Sahli, H.
  • PDF
    Hide/Show Full Abstract Intelligent interactive agents that are able to communicate with the world through more than one channel of communication face a number of research questions, for example: how to coordinate them in an effective manner? This is especially important given that perception, action and interaction can often be seen as mutually related disciplines that affect each other. We believe that machine learning plays and will keep playing an important role in interactive systems. Machine Learning provides an attractive and comprehensive set of computer algorithms for making interactive systems more adaptive to users and the environment and has been a central part of research in the disciplines of interaction, motor control and computer vision in recent years. This workshop aims to bring researchers together that have an interest in more than one of these disciplines and who have explored frameworks which can offer a more unified perspective on the capabilities of sensing, acting and interacting in intelligent systems and robots.
  • 2012. Co-located with the 20th European Conference on Artificial Intelligence (ECAI). Montpellier, France.