LDR Publications

When Words Move Markets: Interpretable Behavioural and Robustness Analysis of LLMs for Financial Sentiment Reasoning via Local Perturbation Explanations.

S. Verma, K. Aslansefat, J. Chatterjee, A. Marar and A. Ekundayo
Link

Sentiment analysis plays an integral role in the financial sector in identifying ongoing and emerging market trends. Whilst LLMs perform efficiently in tasks like sentiment prediction, the token-level and contextual understanding behind their decisions remains under-explored, which limits their adoption in regulated domains. This study aims to decipher how LLMs judge sentiment by taking predictive stability, token-level evidence and contextual cues into account. We apply the Generative Statistical Model-Agnostic Interpretability (GSMILE) technique to examine how a given sentence influences the model output distributions at the local level. We fine-tuned three open-source models to compare their behaviour in this study: Gemma-3-270M by Google, Mistral-7B-Instruct-v0.1 and Qwen-2.5-0.5B-Instruct. Our analysis shows that LLM sentiment prediction is shaped by how importance is distributed across tokens and their interactions in a sentence. Moreover, the model predictions are driven more by contextual cues than by lexical sentiment cues. These findings suggest that high efficiency alone is insufficient to enable trust in LLM-based predictions, underscoring the importance of interpretability and transparency when using LLMs in financial analytics.

2026. Proceedings of International Conference on Applications of Natural Language to Information Systems.

Structuring the Last Mile with ReActV: Tool-Augmented Delivery Planning with Verification.

J. Chatterjee, D. Gamtenadze, A. Marar and P. Agarwal
Link

Last-mile deliveries form a critical part of logistics and supply chain management across both business-to-business (B2B) and business-to-customer (B2C) segments. Fulfilment of such deliveries is frequently hindered by incomplete delivery instructions, unreliable location signals, language barriers, and ambiguous addresses. We propose ReActV (Reason–Act–Verify), a prompt engineering methodology inspired by the classical ReAct and Chain-of-Verification (CoVe) paradigms, by incorporating an explicit verification stage and deterministic tools within the prompting loop for grounding courier actions with structured operational context for Large Language Models (LLMs). We present a multi-agent prototype implemented with the DSPy framework and applied to a real-world dataset, which transforms free-form text delivery instructions into structured and actionable insights for couriers. Our focus is on combining stepwise reasoning, action tools and verification tools for delivery planning. The paper provides a comparative evaluation of ReActV against Zero-Shot, Chain-of-Thought-only, and ReAct baselines – with ReActV demonstrating strong performance in verification quality and delivery issue coverage. The proposed approach can be viewed as a context-aware adaptive assistant. It can help address logistics challenges under low-resource settings, particularly for novice couriers in the present-day gig economy, by providing practical insights to enrich human–AI collaboration during last-mile deliveries while improving customer satisfaction and reducing operational costs.

2026. Proceedings of WebSci Companion '26: Companion Publication of the 2026 18th ACM Web Science Conference.

Natural Language Processing Deep Learning

Clustering Internet Memes with Metric Learning and Dynamic Modality Weighting.

V. Sherratt, S. Elayan and N. Dethlefs
Link

This paper presents a two-stage metric learning approach for large-scale clustering of internet memes into pre-defined knowledge categories. We also introduce a novel dynamic modality weighting step to adaptively balance the influence of image and text attributes, which outperforms other multimodal approaches. We train and evaluate the pipeline across 678,734 memes from KnowYourMeme.com and achieve an F1 score of 91% when assigning unseen memes from a different source to KnowYourMeme.com categories. Our proposed approach incorporates more meme types than prior research, enabling the alignment of individual memes to crucial knowledge sources for information retrieval tasks, with further applications in meme analysis, misinformation detection, hateful meme detection and internet cultural studies.

2026. Proceedings of the International AAAI Conference on Web and Social Media (ICWSM 2026), Los Angeles, USA

Natural Language Processing Deep Learning

SemioMeme: A Symbolic–Subsymbolic Knowledge Graph Dataset for Multimodal Meme Analysis.

V. Sherratt, S. Elayan and N. Dethlefs
Link

Internet memes present a challenge for computational analysis as their meaning derives from cultural context external to their observable features; however, visually similar memes carry distinct cultural meanings, whilst semantically related memes may share no perceptual similarity resulting in a decoupling of format and meaning. To support analysis requiring both, we present SemioMeme, a knowledge graph providing symbolic representations of meme concepts with their cultural connections alongside subsymbolic vision and text embeddings connected via a dedicated property between both. This supports hybrid queries that can surface cultural associations accruing through graph proximity, often invisible to similarity search or explicit labelling alone. The resource, including source data and code, is made openly available and covers 16,707 meme concepts, 507K meme instances with multimodal embeddings, and 7.2M RDF triples.

2026. Proceedings of the International AAAI Conference on Web and Social Media (ICWSM 2026), Los Angeles, USA

Natural Language Processing Deep Learning

Laying the foundations for context-aware and AI-ready fault diagnosis with the Operations Ontology

Jonsson, C., Hansen, M.S., Wilson, J., Marykovskiy, Y., Dethlefs, N., Chatterjee, C., Farren, D., Draper, J., Dimopoulos, A., Wiens, M., Barber, S.
Link

In wind farm operations, the full value of operational data is not realised due to obstacles in understanding and integration. With vast amounts of data in operational silos, reference ontologies provide a common framework for describing and connecting heterogeneous data, establishing a shared meaning that is machine-readable, human-understandable and a foundation for applying AI. As part of IEA Wind Task 43 and within the WeDoWind ecosystem, we have formed a public working group of experts across multiple disciplines to develop a foundational ontology of operations. Starting with foundational entities in the field of operations and maintenance, such as maintenance process and alarm system, we develop a section of the Operations Ontology (OpOn), focusing on a demonstration use case involving diagnostics and troubleshooting of rotor over-speed protection alarms. Here we illustrate how data annotation and integration become more intuitive and efficient with the use of the ontology, and we describe how this builds a solid foundation for the use of modern AI.

Journal of Physics: Conference Series, IOP Publishing, May 2026.

Sustainability Natural Language Processing

Early Multimodal Prediction of Cross-Lingual Meme Virality on Reddit: A Time-Window Analysis.

S. Dogan, N. Dethlefs and D. Chakraborty
Link

Memes are a central part of online culture, yet their virality remains difficult to predict, especially in cross-lingual settings. We present a large-scale, time-series dataset of 46,578 Reddit memes collected from 25 meme-centric subreddits across eight language groups, with more than one million engagement tracking points. We propose a data-driven definition of virality based on a Hybrid Score that normalises engagement by community size and integrates dynamic features such as velocity and acceleration. This approach directly addresses the field's reliance on static, simple volume-based thresholds with arbitrary cut-offs. Building on this target, we construct a multimodal feature set that combines Visual, Textual, Contextual, Network, and Temporal signals, including structured annotations from a multimodal LLM to scale cross-lingual content labelling in a consistent way. We benchmark interpretable baselines (XGBoost, MLP) against end-to-end deep models (BERT, InceptionV3, CLIP) across early observation windows from 30 to 420 minutes. Our best model, a multimodal XGBoost classifier, achieves a PR AUC of 0.43 at 30 minutes and 0.80 at 420 minutes, indicating that early prediction of meme virality is feasible even under strong class imbalance. The results reveal a clear Content Ceiling, where content-only and deep multimodal baselines plateau at low PR AUC, while structural Network and Temporal features are necessary to surpass this limit. A SHAP-based temporal analysis further uncovers an evidentiary transition, where early predictions are dominated by network priors (author and community context), and later predictions increasingly rely on temporal dynamics (velocity, acceleration) as engagement accumulates. Overall, we reframe meme virality as a dynamic, path-dependent process governed by exposure and early interaction patterns rather than by intrinsic content alone.

2026. Proceedings of the 18th ACM Web Science Conference 2026. Braunschweig, Germany.

Natural Language Processing Deep Learning

SLANG-GraphRAG: Multi-Layered Retrieval with Domain-Specific Knowledge for Low Resource Social Media Conversations.

I. Wuraola, D. Marciniak and N. Dethlefs
Link

Emotion classification on social media is especially difficult when texts include informal, culturally grounded language like slang. Standard NLP benchmarks often miss these nuances, particularly in low-resource settings. We present SLANG-GraphRAG, a retrieval-augmented framework that integrates a culture-specific slang knowledge graph into large language models via one-shot prompting. Using multiple retrieval strategies, we incorporate slang definitions, regional usage, and conversational context. Our results show that incorporating structured cultural knowledge into the retrieval process leads to significant improvements, improving accuracy by up to 31% and F1 score by 28%, outperforming traditional and unstructured retrieval methods. To better evaluate model behavior, we propose a probabilistic metric that reflects the distribution of human annotations, providing a more nuanced measure of performance. This highlights the value of culturally sensitive applications and more balanced evaluation in subjective NLP tasks.

2026. Findings of the European Chapter of the Association for Computational Linguistics (EACL), Rabat, Morocco.

Natural Language Processing Deep Learning

Speech-Controlled Smart Speaker for Accurate, Real-Time Health and Care Record Management.

J. Carrick, N. Dethlefs, L. Greaves, V. Gunturi, R. Kureshi and Y. Cheng
Link

To help alleviate the pressures felt by care workers, we have begun new research into improving the efficiency of care plan management by advancing recent developments in automatic speech recognition. Our novel approach adapts off-the-shelf tools in a purpose-built application for the speech domain, addressing challenges of accent adaption, real-time processing and speech hallucinations. We augment the speech-recognition scope of OpenAI’s Whisper model through fine-tuning, reducing word error rates (WERs) from 16.8 to 1.0 on a range of British dialects. Addressing the speech-hallucination side effect of adapting to real-time recognition by enforcing a signal-to-noise ratio threshold and audio stream checks, we achieve a WER of 5.1, compared to 14.9 with Whisper’s original model. These ongoing research efforts tackle challenges that are necessary to build the speech-control basis for a custom smart speaker system that is both accurate and timely.

International Workshop on Spoken Dialogue Systems Technology (IWSDS), Bilbao, Spain, 2025.

Natural Language Processing Deep Learning

One-Vs-Rest Neural Network English Grapheme Segmentation: A Linguistic Perspective.

S. Rose, C. Kambhampati and N. Dethlefs
Link

Grapheme-to-Phoneme (G2P) correspondences form foundational frameworks of tasks such as text-to-speech (TTS) synthesis or automatic speech recognition. The G2P process involves taking words in their written form and generating their pronunciation. In this paper, we critique the status quo definition of a grapheme, currently a forced alignment process relating a single character to either a phoneme or a blank unit, that underlies the majority of modern approaches. We develop a linguistically-motivated redefinition from simple concepts such as vowel and consonant count and word length and offer a proof-of-concept implementation based on a multi-binary neural classification task. Our model achieves competitive results with a 31.86% Word Error Rate on a standard benchmark, while generating linguistically meaningful grapheme segmentations.

Proceedings of the 28th Conference on Computational Natural Language Learning (CoNLL), Miami, USA.

Deep Learning Natural Language Processing

Understanding Slang with LLMs: Modelling Cross-Cultural Nuances through Paraphrasing.

I. Wuraola, N. Dethlefs and D. Marciniak
Link

In the realm of social media discourse, the integration of slang enriches communication, reflecting the sociocultural identities of users. This study investigates the capability of large language models (LLMs) to paraphrase slang within climate-related tweets from Nigeria and the UK, with a focus on identifying emotional nuances. Using DistilRoBERTa as the baseline model, we observe its limited comprehension of slang. To improve cross-cultural understanding, we gauge the effectiveness of leading LLMs: ChatGPT 4, Gemini, and LLaMA3 in slang paraphrasing. While ChatGPT 4 and Gemini demonstrate comparable effectiveness in slang paraphrasing, LLaMA3 shows less coverage, with all LLMs exhibiting limitations in coverage, especially of Nigerian slang. Our findings underscore the necessity for culturally-sensitive LLM development in emotion classification, particularly in non-anglocentric regions.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), Miami, USA.

Deep Learning Natural Language Processing Sustainability

Using Large Language Models to Recommend Repair Actions for Offshore Wind Maintenance.

C. Walker, C. Rothon, K. Aslansefat, Y. Papadopoulos and N. Dethlefs
Link

The Offshore Wind (OSW) industry is experiencing significant expansion, resulting in increased Operations & Maintenance (O&M) costs. Intelligent alarm systems offer the prospect of swift detection of component failures and process anomalies, enabling timely and precise interventions that could yield reductions in resource expenditure, as well as scheduled and unscheduled downtime. This paper introduces an innovative approach to tackle this challenge by capitalising on Large Language Models (LLMs). We present a specialised conversational agent that incorporates statistical techniques to calculate distances between sentences for the detection and filtering of hallucinations and unsafe output. This potentially enables improved interpretation of alarm sequences and the generation of safer repair action recommendations by the agent. Preliminary findings are presented with the approach applied to ChatGPT-4 generated test sentences. The limitation of using ChatGPT-4 and the potential for enhancement of this agent through re-training with specialised OSW datasets are discussed.

2024 J. Phys.: Conf. Ser. 2875 012025.

Sustainability Natural Language Processing

SafeLLM: Domain-Specific Safety Monitoring for Large Language Models: A Case Study for Offshore Wind Maintenance.

C. Walker, C. Rothon, K. Aslansefat, Y. Papadopoulos and N. Dethlefs
Link

The Offshore Wind (OSW) industry is experiencing significant expansion, resulting in increased Operations & Maintenance (O&M) costs. Intelligent alarm systems offer the prospect of swift detection of component failures and process anomalies, enabling timely and precise interventions that could yield reductions in resource expenditure, as well as scheduled and unscheduled downtime. This paper introduces an innovative approach to tackle this challenge by capitalising on Large Language Models (LLMs). We present a specialised conversational agent that incorporates statistical techniques to calculate distances between sentences for the detection and filtering of hallucinations and unsafe output. This potentially enables improved interpretation of alarm sequences and the generation of safer repair action recommendations by the agent. Preliminary findings are presented with the approach applied to ChatGPT-4 generated test sentences. The limitation of using ChatGPT-4 and the potential for enhancement of this agent through re-training with specialised OSW datasets are discussed.

2024 Preprint.

Sustainability Natural Language Processing

Safety Monitoring for Large Language Models: A Case Study of Offshore Wind Maintenance.

C. Walker, C. Rothon, K. Aslansefat, Y. Papadopoulos and N. Dethlefs
Link

It has been forecasted that a quarter of the world’s energy usage will be supplied from Offshore Wind (OSW) by 2050 (Smith 2023). Given that up to one third of Levelised Cost of Energy (LCOE) arises from Operations and Maintenance (O&M), the motive for cost reduction is enormous. In typical OSW farms hundreds of alarms occur within a single day, making manual O&M planning without automated systems costly and difficult. Increased pressure to ensure safety and high reliability in progressively harsher environments motivates the exploration of Artificial Intelligence (AI) and Machine Learning (ML) systems as aids to the task. We recently introduced a specialised conversational agent trained to interpret alarm sequences from Supervisory Control and Data Acquisition (SCADA) and recommend comprehensible repair actions (Walker et al. 2023). Building on recent advancements on Large Language Models (LLMs), we expand on this earlier work, fine tuning LLAMA (Touvron 2018), using available maintenance records from EDF Energy. An issue presented by LLMs is the risk of responses containing unsafe actions, or irrelevant hallucinated procedures. This paper proposes a novel framework for safety monitoring of OSW, combining previous work with additional safety layers. Generated responses of this agent are being filtered to prevent raw responses endangering personnel and the environment. The algorithm represents such responses in embedding space to quantify dissimilarity to pre-defined unsafe concepts using the Empirical Cumulative Distribution Function (ECDF). A second layer identifies hallucination in responses by exploiting probability distributions to analyse against stochastically generated sentences. Combining these layers, the approach finetunes individual safety thresholds based on categorised concepts, providing a unique safety filter. 
The proposed framework has potential to utilise the O&M planning for OSW farms using state-of-the-art LLMs as well as equipping them with safety monitoring that can increase technology acceptance within the industry.

2024 Proc. of the Safety Critical Systems Symposium SSS'24, Bristol, UK

Natural Language Processing Sustainability

User Engagement Triggers in Social Media Discourse on Biodiversity Conservation.

N. Dethlefs and H. Cuayahuitl
Link

Studies in digital conservation have increasingly used social media in recent years as a source of data to understand the interactions between humans and nature, model and monitor biodiversity, and analyse online discourse about the conservation of species. Current approaches to digital conservation are for the most part purely frequentist, i.e. focused on easily trackable and quantifiable features, or purely qualitative, which allows a deeper level of interpretation, but is less scalable. Our approach aims to evaluate the applicability of recent advances in deep learning in combination with semi-automatic analysis. We present a multimodal neural learning framework that experiments with different combinations of linguistic and visual features and metadata of tweets to predict user engagement from a function of likes and retweets. Experimental results show that text is the single most effective modality for prediction when a large amount of training data is available. For smaller datasets, drawing information from multiple modalities can boost performance. Notably, we find a negative effect of large pre-trained language models when dealing with substantially unbalanced datasets. A qualitative analysis into the triggers of user engagement with tweets reveals that it emerges from a combination of online discourse topic and sentiment, and is often amplified by user activity, e.g. when content originates from an influencer account. We find clear evidence of existing sub-communities around specific topics, including animal photography and sightings, illegal wildlife trade and trophy hunting, deforestation and destruction of nature and climate change and action in a broader sense.

2024 ACM Transactions on Social Computing

Natural Language Processing Sustainability

BDA at SemEval-2024 Task 4: Detection of Persuasion in Memes Across Languages with Ensemble Learning and External Knowledge.

This paper outlines our multimodal ensemble learning system for identifying persuasion techniques in memes. We contribute an approach which utilises the novel inclusion of consistent named visual entities extracted using the Google Vision API as an external knowledge source, joined to our multimodal ensemble via late fusion. As well as detailing our experiments in ensemble combinations, fusion methods and data augmentation, we explore the impact of including external data and summarise post-evaluation improvements to our architecture based on analysis of the task results.

2024 SEMEVAL 2024 Shared Task on "Multilingual Detection of Persuasion Techniques in Memes", at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Mexico City, Mexico

Natural Language Processing Deep Learning

Towards Interactive Anomaly Detection using Natural Language.

Rothon, C., Keizer, S., Doddipatla, R., Dethlefs, N.
Link

When training models for visual anomaly detection, typically, a dataset is collected and then annotated offline. Even if collecting raw data is relatively cheap, annotations are expensive, especially if they require human expertise. We therefore propose a novel interactive learning framework that combines active learning with natural language interaction to minimise the amount of annotated training data and allow for refined human expert feedback that may be leveraged in the learning process. In our initial experiments on wind turbine drone images, we demonstrate the effectiveness of active learning for anomaly detection when using ground truth labels, and assess the impact on learning when collecting labels from ‘experts’ versus ‘non-experts’ using our dialogue system. In addition to anomaly labels with confidence scores, we collect and analyse natural language explanations, which may be used to improve both anomaly detection performance and explainability.

2024 The 14th International Workshop on Spoken Dialogue Systems Technology, Sapporo, Japan

Natural Language Processing Sustainability Deep Learning

Linguistic Pattern Analysis in the Climate Change-Related Tweets from UK and Nigeria.

Wuraola, I., Dethlefs, N., Marciniak, D.
Link

To understand the global trends of human opinion on climate change in specific geographical areas, this research proposes a framework to analyse linguistic features and cultural differences in climate-related tweets. Our study combines transformer networks with linguistic feature analysis to address small dataset limitations and gain insights into cultural differences in tweets from the UK and Nigeria. Our study found that Nigerians use more leadership language and informal words in discussing climate change on Twitter compared to the UK, as these topics are treated as an issue of salience and urgency. In contrast, the UK’s discourse about climate change on Twitter is characterised by more formal and logical language and longer words per sentence compared to Nigeria. We also confirm the geographical identifiability of tweets through a classification task using DistilBERT, which achieves 83% accuracy.

2023 Proceedings of the CLASP Conference on Learning with Small Data (LSD), Gothenburg, Sweden

Natural Language Processing Deep Learning

Real-time social media sentiment analysis for rapid impact assessment of floods.

Bryan-Smith, L., Godsall, J., George, F., Egode, K., Dethlefs, N., Parsons, D.
Link

Traditional approaches to flood modelling mostly rely on hydrodynamic physical simulations. While these simulations can be accurate, they are computationally expensive and prohibitively so when thinking about real-time prediction based on dynamic environmental conditions. Alternatively, social media platforms such as Twitter are often used by people to communicate during a flooding event, but discovering which tweets hold useful information is the key challenge in extracting information from posts in real time. In this article, we present a novel model for flood forecasting and monitoring that makes use of a transformer network that assesses the severity of a flooding situation based on sentiment analysis of the multimodal inputs (text and images). We also present an experimental comparison of a range of state-of-the-art deep learning methods for image processing and natural language processing. Finally, we demonstrate that information induced from tweets can be used effectively to visualise fine-grained geographical flood-related information dynamically and in real-time.

2023 Computers & Geosciences

Deep Learning Sustainability Natural Language Processing

This new conversational AI model can be your friend, philosopher, and guide ... and even your worst enemy.

Chatterjee, J., Dethlefs, N.
Link

We explore the recently released ChatGPT model, one of the most powerful conversational AI models that has ever been developed. This opinion provides a perspective on its strengths and weaknesses and a call to action for the AI community (including academic researchers and industry) to work together on preventing potential misuse of such powerful AI models in our everyday lives.

2023 Patterns Volume 4, Issue 1, Opinion Article

Deep Learning Natural Language Processing

Automated Question-Answering for Interactive Decision Support in Operations & Maintenance of Wind Turbines.

Chatterjee, J., Dethlefs, N.
Link

Intelligent question-answering (QA) systems have witnessed increased interest in recent years, particularly in their ability to facilitate information access, data interpretation or decision support. The wind energy sector is one of the most promising sources of renewable energy, yet turbines regularly suffer from failures and operational inconsistencies, leading to downtimes and significant maintenance costs. Addressing these issues requires rapid interpretation of complex and dynamic data patterns under time-critical conditions. In this article, we present a novel approach that leverages interactive, natural language-based decision support for operations & maintenance (O&M) of wind turbines. The proposed interactive QA system allows engineers to pose domain-specific questions in natural language, and provides answers (in natural language) based on the automated retrieval of information on turbine sub-components, their properties and interactions, from a bespoke domain-specific knowledge graph. As data for specific faults is often sparse, we propose the use of paraphrase generation as a way to augment the existing dataset. Our QA system leverages encoder-decoder models to generate Cypher queries to obtain domain-specific facts from the KG database in response to user-posed natural language questions. Experiments with an attention-based sequence-to-sequence (Seq2Seq) model and a transformer show that the transformer accurately predicts up to 89.75% of responses to input questions, outperforming the Seq2Seq model marginally by 0.76%, though being 9.46 times more computationally efficient. The proposed QA system can help support engineers and technicians during O&M to reduce turbine downtime and operational costs, thus improving the reliability of wind energy as a source of renewable energy.

2022 IEEE Access Vol 10.

Deep Learning Sustainability Natural Language Processing

RELATE: Generating a linguistically inspired Knowledge Graph for fine-grained emotion classification.

Schoene, A., Dethlefs, N., Ananiadou, S.
PDF

Several existing resources are available for sentiment analysis (SA) tasks that are used for learning sentiment specific embedding (SSE) representations. These resources are either large, common-sense knowledge graphs (KG) that cover a limited amount of polarities/emotions or they are smaller in size, such as lexicons, which require costly human annotation and cover fine-grained emotions. Therefore using knowledge resources to learn SSE representations is either limited by the low coverage of polarities/emotions or the overall size of a resource. In this paper, we first introduce a new directed KG called ‘RELATE’, which is built to overcome both the issue of low coverage of emotions and the issue of scalability. RELATE is the first KG of its size to cover Ekman’s six basic emotions that are directed towards entities. It is based on linguistic rules to incorporate the benefit of semantics without relying on costly human annotation. The performance of ‘RELATE’ is evaluated by learning SSE representations using a Graph Convolutional Neural Network (GCN).

2022 13th Language Resources and Evaluation Conference (LREC).

Deep Learning Natural Language Processing

Towards Contextually Sensitive Analysis of Memes: Meme Genealogy and Knowledge Base.

Sherratt, V.

As online communication grows, memes have continued to evolve and circulate as succinct multimodal forms of communication. However, computational approaches applied to meme-related tasks lack the depth and contextual sensitivity of non-computational approaches and struggle to interpret intra-modal dynamics and referentiality. This research proposes a ‘meme genealogy’ of key features and relationships between memes to inform a knowledge base constructed from meme-specific online sources and embed connotative meaning and contextual information in memes. The proposed methods provide a basis to train contextually sensitive computational models for analysing memes and applications in automated meme annotation.

2022 IJCAI Doctoral Consortium.

Deep Learning Natural Language Processing

Using Multimodal Data and AI to Dynamically Map Flood Risks.

Bryan-Smith, L.

Classical measurements and modelling that underpin present flood warning and alert systems are based on fixed and spatially restricted static sensor networks. Computationally expensive physics-based simulations are often used that can’t react in real-time to changes in environmental conditions. We want to explore contemporary artificial intelligence (AI) for predicting flood risk in real time by using a diverse range of data sources. By combining heterogeneous data sources, we aim to nowcast rapidly changing flood conditions and gain a greater understanding of urgent humanitarian needs.

2022 AAAI Doctoral Consortium (AAAI-DC).

Deep Learning Sustainability Natural Language Processing

Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue.

Haizhou Li, Gina-Anne Levow, Zhou Yu, Chitralekha Gupta, Berrak Sisman, Siqi Cai, David Vandyke, Nina Dethlefs, Yan Wu, Junyi Jessy

2021 SIGDIAL.

Natural Language Processing

A divide-and-conquer approach to neural natural language generation from structured data.

Dethlefs, N., Schoene, A., Cuayahuitl, H.

Current approaches that generate text from linked data for complex real-world domains can face problems including rich and sparse vocabularies as well as learning from examples of long varied sequences. In this article, we propose a novel divide-and-conquer approach that automatically induces a hierarchy of “generation spaces” from a dataset of semantic concepts and texts. Generation spaces are based on a notion of similarity of partial knowledge graphs that represent the domain and feed into a hierarchy of sequence-to-sequence or memory-to-sequence learners for concept-to-text generation. An advantage of our approach is that learning models are exposed to the most relevant examples during training which can avoid bias towards majority samples. We evaluate our approach on two common benchmark datasets and compare our hierarchical approach against a flat learning setup. We also conduct a comparison between sequence-to-sequence and memory-to-sequence learning models. Experiments show that our hierarchical approach overcomes issues of data sparsity and learns robust lexico-syntactic patterns, consistently outperforming flat baselines and previous work by up to 30%. We also find that while memory-to-sequence models can outperform sequence-to-sequence models in some cases, the latter are generally more stable in their performance and represent a safer overall choice.

2021. Neurocomputing 433, 300-309.

Deep Learning Natural Language Processing

Hierarchical Multiscale Recurrent Neural Networks for Detecting Suicide Notes.

Schoene, A., Turner, A., de Mel, G., Dethlefs, N.

Recent statistics in suicide prevention show that people are increasingly posting their last words online, and with the unprecedented availability of textual data from social media platforms, researchers have the opportunity to analyse such data. Furthermore, psychological studies have shown that our state of mind can manifest itself in the linguistic features we use to communicate. In this paper, we investigate whether it is possible to automatically distinguish suicide notes from other types of social media blogs in two document-level classification tasks. The first task aims to distinguish suicide notes from depressed posts and blog posts in a balanced dataset, whilst the second experiment looks at how well suicide notes can be classified when there is a vast amount of neutral text data, which makes the task more applicable to real-world scenarios. Furthermore, we perform a linguistic analysis using LIWC (Linguistic Inquiry and Word Count). We present a learning model for modelling long sequences in two experiment series. We achieve an F1-score of 88.26% compared with the baselines of 0.60 in experiment 1 and 96.1% compared with the baseline in experiment 2. Finally, we show through visualisations which features the learning model identifies; these include emotions such as love, as well as personal pronouns.

2021. IEEE Transactions on Affective Computing.

Deep Learning Natural Language Processing

XAI4Wind: A Multimodal Knowledge Graph Database for Explainable Decision Support in Operations & Maintenance of Wind Turbines.

Chatterjee, J., Dethlefs, N.

Condition-based monitoring (CBM) has been widely utilised in the wind industry for monitoring operational inconsistencies and failures in turbines, with techniques ranging from signal processing and vibration analysis to artificial intelligence (AI) models using Supervisory Control and Data Acquisition (SCADA) data. However, existing studies do not present a concrete basis to facilitate explainable decision support in operations and maintenance (O&M), particularly for automated decision support through recommendation of appropriate maintenance action reports corresponding to failures predicted by CBM techniques. Knowledge graph databases (KGs) model a collection of domain-specific information and have played an intrinsic role for real-world decision support in domains such as healthcare and finance, but have seen very limited attention in the wind industry. We propose XAI4Wind, a multimodal knowledge graph for explainable decision support in real-world operational turbines and demonstrate through experiments several use-cases of the proposed KG towards O&M planning through interactive query and reasoning and providing novel insights using graph data science algorithms. The proposed KG combines multimodal knowledge like SCADA parameters and alarms with natural language maintenance actions, images etc. By integrating our KG with an Explainable AI model for anomaly prediction, we show that it can provide effective human-intelligible O&M strategies for predicted operational inconsistencies in various turbine sub-components. This can help instil better trust and confidence in conventionally black-box AI models. We make our KG publicly available and envisage that it can serve as the building ground for providing autonomous decision support in the wind industry.

arXiv preprint arXiv:2012.10489, 2020.

Natural Language Processing Sustainability

A Dual Transformer Model for Intelligent Decision Support for Maintenance of Wind Turbines.

Chatterjee, J., Dethlefs, N.

Wind energy is one of the fastest-growing sustainable energy sources in the world but relies crucially on efficient and effective operations and maintenance to generate sufficient amounts of energy and reduce downtime of wind turbines and associated costs. Machine learning has been applied to fault prediction in wind turbines, but these predictions have not been supported with suggestions on how to avert and fix faults. We present a data-to-text generation system utilising transformers for generating corrective maintenance strategies for faults using SCADA data capturing the operational status of turbines. We achieve this in two stages: a first stage identifies faults based on SCADA input features and their relevance. A second stage performs content selection for the language generation task and creates maintenance strategies based on phrase-based natural language templates. Experiments show that our dual transformer model achieves an accuracy of up to 96.75% for alarm prediction and up to 75.35% for its choice of maintenance strategies during content-selection. A qualitative analysis shows that our generated maintenance strategies are promising. We make our human-authored maintenance templates publicly available, and include a brief video explaining our approach.

2020 International Joint Conference on Neural Networks (IJCNN).

Sustainability Natural Language Processing Deep Learning

Hybrid approaches to fine-grained emotion detection in social media data.

Schoene, A.

This paper states the challenges in fine-grained target-dependent Sentiment Analysis for social media data using recurrent neural networks. Firstly, we outline the problem statement and give a brief overview of related work in the area. Then we outline progress and results achieved to date, a brief research plan and future directions of this work.

To appear. In AAAI-2020 Doctoral Consortium. New York, USA.

Natural Language Processing Deep Learning

Bidirectional Dilated LSTM with Attention for Fine-grained Emotion Classification in Tweets.

Schoene, A., Turner, A., Dethlefs, N.

We propose a novel approach for fine-grained emotion classification in tweets using a Bidirectional Dilated LSTM (BiDLSTM) with attention. Conventional LSTM architectures can face problems when classifying long sequences, which is problematic for tweets, where crucial information is often attached to the end of a sequence, e.g. an emoticon. We show that by adding a bidirectional layer, dilations and an attention mechanism to a standard LSTM, our model overcomes these problems and is able to maintain complex data dependencies over time. We present experiments with two datasets, the 2018 WASSA Implicit Emotions Shared Task and a new dataset of 240,000 tweets. Our BiDLSTM with attention achieves a test accuracy of up to 81.97%, outperforming competitive baselines by up to 10.52% on both datasets. Finally, we evaluate our data against a human benchmark on the same task.

To appear. In Proceedings of AAAI-2020 Workshop on Affective Content Analysis. New York, USA

Natural Language Processing Deep Learning

Natural Language Generation for Operations and Maintenance in Wind Turbines.

Chatterjee, J., Dethlefs, N.

Wind energy is one of the fastest-growing sustainable energy sources in the world but relies crucially on efficient and effective operations and maintenance to generate sufficient amounts of energy and reduce downtime of wind turbines and associated costs. Machine learning has been applied to fault prediction in wind turbines, but these predictions have not been supported with suggestions on how to avert and fix faults. We present a data-to-text generation system using transformers to produce event descriptions from SCADA data capturing the operational status of turbines and proposing maintenance strategies. Experiments show that our model learns feature representations that correspond to expert judgements. In making a contribution to the reliability of wind energy, we hope to encourage organisations to switch to sustainable energy sources and help combat climate change.

2019. In NeurIPS 2019 Workshop on Tackling Climate Change with Machine Learning. Vancouver, Canada.

Natural Language Processing Deep Learning Sustainability

Dilated LSTM with ranked units for classification of suicide notes.

Schoene, A., Turner, A., Dethlefs, N.

Recent statistics in suicide prevention show that people are increasingly posting their last words online, and with the unprecedented availability of textual data from social media platforms, researchers have the opportunity to analyse such data. Furthermore, psychological studies have shown that our state of mind can manifest itself in the linguistic features we use to communicate. In this paper, we investigate whether it is possible to automatically distinguish suicide notes from other types of social media blogs in a document-level classification task. We also present a learning model for modelling long sequences, achieving an F1-score of 0.84 compared with the baselines of 0.53 and 0.80 (best competing model). Finally, we show through visualisations which features the learning model identifies.

2019. In Proceedings of AI for Social Good workshop at NeurIPS (2019), Vancouver, Canada.

Natural Language Processing Deep Learning

Dilated LSTM with attention for Classification of suicide notes.

Schoene, A., Lacey, G., Turner, A., Dethlefs, N.

In this paper we present a dilated LSTM with attention mechanism for document-level classification of suicide notes, last statements and depressed notes. We achieve an accuracy of 87.34% compared to competitive baselines of 80.35% (Logistic Model Tree) and 82.27% (Bi-directional LSTM with Attention). Furthermore, we provide an analysis of both the grammatical and thematic content of suicide notes, last statements and depressed notes. We find that the use of personal pronouns, cognitive processes and references to loved ones are most important. Finally, we show through visualisations of attention weights that the Dilated LSTM with attention is able to identify the same distinguishing features across documents as the linguistic analysis.

2019. In Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019) at EMNLP. Hong Kong.

Natural Language Processing Deep Learning

Cross-dialectal speech processing

Whettam, D., Gargett, A., Dethlefs, N.

Despite advances in technology, language diversity remains a challenge to the speech processing community, but there is also an opportunity to rise to this challenge through research and innovation. Pluricentric languages play an important role in such work, particularly where these languages are better resourced. Across several decades, dedicated researchers have steadily contributed resources for some language varieties, increasing general availability of a range of data archives...

2019. INTERSPEECH Satellite Workshop on Pluricentric Languages in Speech Technology, Graz, Austria.

Natural language processing Publications

When Words Move Markets: Interpretable Behavioural and Robustness Analysis of LLMs for Financial Sentiment Reasoning via Local Perturbation Explanations.

Structuring the Last Mile with ReActV: Tool-Augmented Delivery Planning with Verification.

Clustering Internet Memes with Metric Learning and Dynamic Modality Weighting.

SemioMeme: A Symbolic–Subsymbolic Knowledge Graph Dataset for Multimodal Meme Analysis.

Laying the foundations for context-aware and AI-ready fault diagnosis with the Operations Ontology

Early Multimodal Prediction of Cross-Lingual Meme Virality on Reddit: A Time-Window Analysis.

SLANG-GraphRAG: Multi-Layered Retrieval with Domain-Specific Knowledge for Low Resource Social Media Conversations.

Speech-Controlled Smart Speaker for Accurate, Real-Time Health and Care Record Management.

One-Vs-Rest Neural Network English Grapheme Segmentation: A Linguistic Perspective.

Understanding Slang with LLMs: Modelling Cross-Cultural Nuances through Paraphrasing.

Using Large Language Models to Recommend Repair Actions for Offshore Wind Maintenance.

SafeLLM: Domain-Specific Safety Monitoring for Large Language Models: A Case Study for Offshore Wind Maintenance.

Safety Monitoring for Large Language Models: A Case Study of Offshore Wind Maintenance.

User Engagement Triggers in Social Media Discourse on Biodiversity Conservation.

BDA at SemEval-2024 Task 4: Detection of Persuasion in Memes Across Languages with Ensemble Learning and External Knowledge.

Towards Interactive Anomaly Detection using Natural Language.

Linguistic Pattern Analysis in the Climate Change-Related Tweets from UK and Nigeria.

Real-time social media sentiment analysis for rapid impact assessment of floods.

This new conversational AI model can be your friend, philosopher, and guide ... and even your worst enemy.

Automated Question-Answering for Interactive Decision Support in Operations & Maintenance of Wind Turbines.

RELATE: Generating a linguistically inspired Knowledge Graph for fine-grained emotion classification.

Towards Contextually Sensitive Analysis of Memes: Meme Genealogy and Knowledge Base.

Using Multimodal Data and AI to Dynamically Map Flood Risks.

Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue.

A divide-and-conquer approach to neural natural language generation from structured data.

Hierarchical Multiscale Recurrent Neural Networks for Detecting Suicide Notes.

XAI4Wind: A Multimodal Knowledge Graph Database for Explainable Decision Support in Operations & Maintenance of Wind Turbines.

A Dual Transformer Model for Intelligent Decision Support for Maintenance of Wind Turbines.

Hybrid approaches to fine-grained emotion detection in social media data.

Bidirectional Dilated LSTM with Attention for Fine-grained Emotion Classification in Tweets.

Natural Language Generation for Operations and Maintenance in Wind Turbines.

Dilated LSTM with ranked units for classification of suicide notes.

Dilated LSTM with attention for Classification of suicide notes.

Cross-dialectal speech processing

Unsupervised suicide note classification.

Domain Transfer for Deep Natural Language Generation from Abstract Meaning Representations.

Deep text generation - Using hierarchical decomposition to mitigate the effect of rare data points.

Natural language-based presentation of cognitive stimulation to people with dementia in assistive technology: a pilot study.

Extrinsic vs Intrinsic Evaluation of Natural Language Generation for Spoken Dialogue Systems and Social Robotics.

Automatic Identification of Suicide Notes from Linguistic and Sentiment Features.

Information Density and Overlaps in Spoken Dialogue.

Why bother? Is evaluation of NLG in an end-to-end Spoken Dialogue System worth it?

Hierarchical Reinforcement Learning for Situated Language Generation.

Cluster-Based Prediction of User Ratings for Stylistic Surface Realisation.

A Semi-Supervised Clustering Approach for Semantic Slot Labelling.

Training a Statistical Surface Realiser from Automatic Slot Labelling.

The PARLANCE Mobile App for Interactive Search in English and Mandarin.

Non-Strict Hierarchical Reinforcement Learning for Interactive Systems and Robots.

Context-Sensitive Natural Language Generation: From Knowledge-Driven to Data-Driven Techniques.

Introduction to the Special Issue on Machine Learning for Multiple Modalities in Interactive Systems and Robots.

A Joint Learning Approach for Situated Language Generation.

Getting to Know Users: Accounting for the Variability in User Ratings.

Two Alternative Frameworks for Deploying Spoken Dialogue Systems to Mobile Platforms for Evaluation “in the Wild”.

Conditional Random Fields for Responsive Surface Realisation Using Global Features.

Hierarchical Joint Learning for Natural Language Generation.

Impact of ASR N-Best Information on Bayesian Dialogue Act Recognition.

Proceedings of the Young Researcher’s Roundtable on Spoken Dialogue Systems.

Barge-in Effects in Bayesian Dialogue Act Recognition and Simulation.

Demonstration of the PARLANCE System: A Data-Driven, Incremental, Spoken Dialogue System for Interactive Search.

Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems.

Optimising Incremental Generation for Spoken Dialogue Systems: Reducing the Need for Fillers.

Hierarchical Dialogue Policy Learning Using Flexible State Transitions and Linear Function Approximation.

Comparing HMMs and Bayesian Networks for Surface Realisation.

Hierarchical Multiagent Reinforcement Learning for Coordinating Verbal and Nonverbal Actions in Robots.

Towards Optimising Modality Allocation for Multimodal Output Generation in Incremental Dialogue.

Dialogue Systems Using Online Learning: Beyond Empirical Methods.

Incremental Spoken Dialogue Systems: Tools and Data.

Optimising Incremental Generation for Information Presentation of Mobile Search Results.

Spatially-Aware Dialogue Control Using Hierarchical Reinforcement Learning.

Generation of Adaptive Route Descriptions in Urban Environments.

Hierarchical Reinforcement Learning and Hidden Markov Models for Task-Oriented Natural Language Generation.

Optimizing Situated Dialogue Management in Unknown Environments.

Optimising Natural Language Generation Decision Making for Situated Dialogue.

Combining Hierarchical Reinforcement Learning and Bayesian Networks for Natural Language Generation.

The Bremen System for the GIVE-2.5 Challenge.

Position Paper in the Young Researchers’ Roundtable on Spoken Dialogue Systems (YRRSDS).