ZEKUN WU

London, England, United Kingdom
1K followers · 500+ connections

About

I am a Responsible AI Researcher.

Experience

  • Holistic AI

    London, England, United Kingdom

  • -

  • -

    Paris, Île-de-France, France

  • -

    Greater London, England, United Kingdom

  • -

    London, England, United Kingdom

  • -

    China

  • -

    Shanghai, China

  • -

    Wuhan, Hubei, China

  • -

    Shenzhen, Guangdong, China

Education

  • UCL

    -

    Sustainability and Machine Learning Research Group
    Supervisors: Dr. María Pérez-Ortiz, Dr. Adriano Koshiyama, and Dr. Sahan Bulathwela.

  • -

    Carbon Re Academic Excellence Prize for graduating first in the entire cohort
    Supervisors: Dr. Adriano Koshiyama and Dr. Sahan Bulathwela
    2023 NeurIPS SoLaR publication: Towards Auditing Large Language Models: Improving Text-based Stereotype Detection

  • -

  • -

    Supervisor: Professor John Shawe-Taylor

  • -

Projects

  • Bias amplification in the process of model collapse of Language Models

    This project aims to investigate whether bias amplification occurs as language models progress towards model collapse, particularly in a future scenario where synthetic content predominates on the internet. Model collapse refers to the degradation in performance and diversity of language models when they are trained on increasingly synthetic datasets. Initial experiments have successfully replicated model collapse in small language models using BOLD and wiki data. The next steps involve implementing more sophisticated metrics to assess both model collapse and bias amplification, providing insights into how biases might intensify or change during this process. This research seeks to contribute to the development of more resilient and equitable language models in an era increasingly dominated by synthetic content.
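
    For illustration, the drift could be probed roughly as sketched below; the checkpoint names after "gpt2", the prompts, and the signed-sentiment gap are placeholder assumptions rather than the project's actual code.

    # Sketch: track a demographic sentiment gap across model "generations"
    # retrained on their own synthetic outputs. Later checkpoints are hypothetical;
    # in practice the prompts come from the BOLD dataset.
    from transformers import pipeline

    PROMPTS = {
        "group_a": ["The female engineer said that"],
        "group_b": ["The male engineer said that"],
    }

    sentiment = pipeline("sentiment-analysis")

    def mean_signed_sentiment(generator, prompts):
        scores = []
        for p in prompts:
            text = generator(p, max_new_tokens=30, do_sample=True)[0]["generated_text"]
            res = sentiment(text)[0]
            scores.append(res["score"] if res["label"] == "POSITIVE" else -res["score"])
        return sum(scores) / len(scores)

    for generation, ckpt in enumerate(["gpt2", "ckpt_gen1", "ckpt_gen2"]):
        gen_pipe = pipeline("text-generation", model=ckpt)
        gap = (mean_signed_sentiment(gen_pipe, PROMPTS["group_a"])
               - mean_signed_sentiment(gen_pipe, PROMPTS["group_b"]))
        print(f"generation {generation}: sentiment gap = {gap:+.3f}")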

  • Inference Energy consumption and carbon emission analysis of LLMs with different FAST ML techniques

    This is a UCL IXN project involving a Master's student. This project analyzes the energy consumption and carbon emissions of LLMs using various FAST ML techniques to promote environmentally sustainable AI practices.
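
    One possible measurement setup (not necessarily the project's actual harness) is to wrap an inference workload with the codecarbon package; the model and prompt below are placeholders.

    # Sketch: estimate the energy/carbon cost of a batch of inference requests.
    from codecarbon import EmissionsTracker
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    tracker = EmissionsTracker(project_name="llm-inference-audit")
    tracker.start()
    for _ in range(100):  # representative inference workload
        generator("Summarise the EU AI Act in one sentence.", max_new_tokens=50)
    emissions_kg = tracker.stop()  # estimated kg CO2-eq for the run

    print(f"Estimated emissions: {emissions_kg:.6f} kg CO2-eq")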

  • Toolkit for Detecting and Mitigating Data Contamination in Large Language Models to Ensure Fair Evaluation

    This research focuses on creating a comprehensive and user-friendly toolkit for detecting and mitigating data contamination in LLMs. The toolkit will feature advanced methods such as KIEval for absolute contamination assessment, TS-Guessing for detecting subtle biases, and a Reworded Question Test for verifying data integrity. By integrating these tools, the project aims to ensure that LLMs are evaluated accurately and fairly, preventing artificially inflated scores and maintaining the integrity of AI research and applications.
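
    A minimal sketch of the TS-Guessing idea follows; the ask_model callable is a placeholder for whichever completion API the toolkit wraps.

    # Sketch: hide one word of a benchmark question and check whether the model
    # reproduces it exactly; a hit rate far above chance suggests contamination.
    import random

    def mask_one_word(question):
        words = question.split()
        idx = random.randrange(len(words))
        hidden, words[idx] = words[idx], "[MASK]"
        return " ".join(words), hidden

    def ts_guessing_hit_rate(ask_model, questions):
        hits = 0
        for q in questions:
            masked, hidden = mask_one_word(q)
            guess = ask_model(f"Fill in [MASK] with the missing word only:\n{masked}")
            hits += guess.strip().lower() == hidden.lower()
        return hits / len(questions)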

  • Automated Toolkit for Real-time Open Generation Bias Alignment Benchmarking in Large Language Models

    This research focuses on automating the creation of benchmark data for bias alignment in LLMs, building on the Bias in Open-ended Language Generation Dataset (BOLD) project. The automated library will facilitate the use of the latest, real-time data for open-generation bias benchmarking, allow flexible comparisons of biases across different demographic descriptors, and introduce new alignment metrics that utilize advanced statistical methods.
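
    As a toy illustration of the statistical comparison involved (the scores and the choice of test are placeholder assumptions), polarity distributions for two descriptors can be compared with a non-parametric test:

    # Sketch: compare per-completion polarity scores across two demographic
    # descriptors; a small p-value flags a systematic gap. The scores are dummy
    # values standing in for sentiment/toxicity scores computed upstream.
    from scipy.stats import mannwhitneyu

    scores_descriptor_a = [0.61, 0.55, 0.70, 0.48, 0.66]
    scores_descriptor_b = [0.41, 0.52, 0.38, 0.45, 0.50]

    stat, p_value = mannwhitneyu(scores_descriptor_a, scores_descriptor_b,
                                 alternative="two-sided")
    print(f"U={stat:.1f}, p={p_value:.4f}")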

  • JobFair: A Framework for Benchmarking Gender Hiring Bias in Large Language Models

    JobFair presents a novel framework for evaluating hierarchical gender hiring bias in Large Language Models (LLMs) used for resume scoring, exposing issues of reverse bias and overdebiasing. The framework utilizes a real, anonymized resume dataset from the Healthcare, Finance, and Construction industries and proposes new statistical and computational hiring bias metrics based on a counterfactual approach. The study analyzes hiring biases in ten state-of-the-art LLMs, identifying significant biases against males in healthcare and finance, with GPT-4o and GPT-3.5 showing the most bias, while Gemini-1.5-Pro, Llama3-8b-Instruct, and Llama3-70b-Instruct are the least biased. The framework can be easily adapted to investigate other social traits and tasks.
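
    A minimal sketch of the counterfactual scoring idea (the token swaps and the score_resume callable are illustrative placeholders, not the framework's actual metrics):

    # Sketch: score each resume twice, once with gendered tokens swapped, and
    # aggregate the per-resume gap; a mean gap away from zero signals hiring bias.
    def counterfactual_swap(resume):
        swaps = {"He": "She", "he": "she", "His": "Her", "his": "her", "Mr.": "Ms."}
        return " ".join(swaps.get(tok, tok) for tok in resume.split())

    def mean_gender_gap(score_resume, resumes):
        gaps = [score_resume(r) - score_resume(counterfactual_swap(r)) for r in resumes]
        return sum(gaps) / len(gaps)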

  • Quantitative Evaluation Framework for Natural Language Explanations in Compliance with the EU AI Act

    Building on the foundation laid by the paper “AI Explainability in the EU AI Act: A Case for an NLE Approach Towards Pragmatic Explanations” (https://cjai.co.uk/wp-content/uploads/2024/07/Cambridge-Journal-of-AI-Vol.-1-Issue-1-f.pdf), this project aims to develop a comprehensive framework for quantitatively evaluating natural language explanations (NLEs) provided by AI systems to ensure compliance with the EU AI Act. A multi-agent system will be created, involving an interactor to ask follow-up questions for deeper explanations and an evaluator to continuously rate the NLEs based on predefined principles. This approach will enhance the transparency, understandability, and accountability of AI systems.
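
    A rough sketch of that loop, assuming placeholder callables ask_system (the audited system) and llm_judge (the evaluating model) and example principles:

    # Sketch: an interactor asks follow-up questions while an evaluator rates each
    # natural language explanation against predefined principles.
    PRINCIPLES = ["accuracy", "relevance", "accessibility to a lay user"]

    def evaluate_nle(ask_system, llm_judge, question, rounds=3):
        transcript, ratings = [], []
        query = question
        for _ in range(rounds):
            explanation = ask_system(query)          # NLE from the audited system
            transcript.append((query, explanation))
            ratings.append({p: llm_judge(f"Rate 1-5: how well does this explanation "
                                         f"satisfy '{p}'?\n{explanation}")
                            for p in PRINCIPLES})
            query = llm_judge("Ask one follow-up question that probes this "
                              f"explanation further:\n{explanation}")
        return transcript, ratings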

  • Automatic hallucination benchmarking and mitigation framework

    This project aims to create a comprehensive framework for automatically generating QA pairs from given knowledge domain files to benchmark and mitigate hallucinations in LLMs. The framework benchmarks hallucinations, applies various mitigation techniques, reassesses the models to ensure resolution, and iterates to improve reliability. The core contributions include reducing the cost of generation, improving the quality of generated content, and enhancing validation metrics, thereby minimizing the need for human validation. Mitigation techniques such as chain of verification, Retrieval-Augmented Generation (RAG), fine-tuning, and knowledge editing are implemented to tackle hallucinations effectively. Key Contributions:

    1. Cost Reduction in Generation: The framework significantly lowers the cost associated with generating QA pairs by automating the process.
    2. Quality Improvement: By systematically benchmarking and mitigating hallucinations, the framework enhances the overall quality of generated content.
    3. Validation Enhancement: The project introduces improved metrics and validation techniques, reducing the dependency on human validation.
    4. Mitigation Techniques: Implementation of diverse methods to mitigate hallucinations, including chain of verification, RAG, fine-tuning, and knowledge editing.
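
    A schematic sketch of the benchmark-mitigate-reassess loop described above; every callable is a placeholder for the framework's actual components.

    # Sketch: benchmark hallucinations on auto-generated QA pairs, apply mitigation
    # techniques in turn, and reassess until the rate falls below a threshold.
    def audit_until_reliable(model, knowledge_files, generate_qa_pairs,
                             hallucination_rate, mitigations, threshold=0.05):
        qa_pairs = generate_qa_pairs(knowledge_files)   # QA pairs from domain documents
        rate = hallucination_rate(model, qa_pairs)      # fraction of unsupported answers
        for mitigate in mitigations:                    # e.g. CoVe, RAG, fine-tuning
            if rate <= threshold:
                break
            model = mitigate(model, knowledge_files)
            rate = hallucination_rate(model, qa_pairs)  # reassess after each technique
        return model, rate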

  • Categorization and target bias in open-generation bias metric models

    This is a UCL IXN project involving a Master’s student. This project categorizes and assesses target biases in open-generation bias metrics. The core of this project is to evaluate polarity models such as sentiment, toxicity, regard, and others used in the metrics of open-generation bias benchmarks. The project has completed counterfactual experiments examining polarity differences between sentences with counterfactual demographic descriptions, confirming that these differences exist.

  • Mitigation methodologies for explainability in the image and traditional ML models

    This is a UCL IXN project involving a Master's student. This project builds on Holistic AI's previous research on explainability metrics for traditional machine learning models, as detailed in the paper Evaluating Explainability for Machine Learning Predictions Using Model-Agnostic Metrics (https://arxiv.org/abs/2302.12094) and the associated open-source library (https://github.com/holistic-ai/holisticai). The goal is to develop methods to mitigate explainability issues while balancing other aspects such as efficacy, bias, and privacy. The initial phase, focusing on traditional ML mitigation, is complete. The project is now extending to create metrics for image-based models, specifically targeting saliency maps, and will subsequently develop mitigation methods.
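
    For the image extension, a minimal vanilla-gradient saliency sketch (the model choice and random input are placeholders; the project's metrics and mitigation methods are not shown):

    # Sketch: per-pixel saliency from the gradient of the top-class logit.
    import torch
    from torchvision.models import resnet18, ResNet18_Weights

    model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
    image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in for a real image

    logits = model(image)
    logits[0, logits.argmax()].backward()          # gradient of the predicted class
    saliency = image.grad.abs().max(dim=1).values  # (1, 224, 224) importance map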

  • Personality Manipulation of LLMs through knowledge editing techniques

    This is a UCL IXN project involving a UCL Master’s student. This project builds on Holistic AI’s previous research, Eliciting Personality Traits in Large Language Models (https://arxiv.org/abs/2402.08341), to develop techniques for manipulating the personality of LLMs using knowledge editing methods. The objective is to observe the impact of these techniques on the LLMs’ expression of various personality traits.

  • Towards Systematizing Large Language Model Audits: A Holistic Four-Tiered Approach

    This project aims to develop a systematic audit framework for Large Language Models (LLMs) within the Safeguard product. The framework is designed to address the ethical and operational challenges associated with the deployment of LLMs. It introduces a structured, four-tiered approach comprising Triage, Assessment, Mitigation, and Assurance tiers to ensure the responsible and ethical use of LLMs.

    The Triage tier identifies and prioritizes potential risks, setting the technological foundation for further analysis. The Assessment tier evaluates the operational integrity and compliance of LLMs through qualitative assessments, benchmarking, and red-teaming exercises. The Mitigation tier focuses on addressing identified issues, implementing strategies to reduce risks effectively. Finally, the Assurance tier ensures ongoing compliance and performance optimization through continuous monitoring and periodic reassessments.

    This comprehensive approach ensures that each tier builds upon the insights gained from the previous one, enabling a robust and dynamic auditing process. The framework is designed to evaluate various aspects of LLMs, including performance stability, explainability, privacy and security, fairness and bias, sustainability, and legal compliance, thereby contributing to the development of responsible and trustworthy AI systems.

  • Advancing text-based stereotype detection and benchmarking in LLMs

    This project focuses on creating a dataset for training text-based stereotype detectors, using explainable AI techniques to ensure the detectors align with human understanding of stereotypes, and employing these detectors to benchmark stereotypes in Large Language Models.
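
    As a small illustration of the benchmarking step (the checkpoint path and label name are placeholders, not the released detector):

    # Sketch: run a fine-tuned stereotype classifier over LLM completions and
    # report the fraction flagged as stereotypical.
    from transformers import pipeline

    detector = pipeline("text-classification", model="path/to/stereotype-detector")

    def stereotype_rate(generations):
        preds = detector(generations)
        return sum(p["label"] == "stereotype" for p in preds) / len(generations)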

  • Advancing Pain Recognition through Statistical Correlation-Driven Multimodal Fusion

    -

    This research introduces an innovative multimodal data fusion methodology for pain behaviour recognition, emphasizing the role of explainable AI in the Affective Computing field. The approach integrates statistical correlation analysis with human-centred insights, presenting two key innovations:

    1. Statistical Relevance Weights: Incorporating data-driven statistical relevance weights into the fusion strategy to effectively utilize complementary information from heterogeneous modalities.

    2. Human-Centered Movement Characteristics: Embedding human-centric movement characteristics into multimodal representation learning for detailed and interpretable modelling of pain behaviours.

    Our methodology, validated across various deep learning architectures, demonstrates superior performance and broad applicability. We propose a customizable framework that aligns each modality with a suitable classifier based on statistical significance, advancing personalized and effective multimodal fusion. Furthermore, this approach enhances the explainability of AI models by providing clear, interpretable insights into how different data modalities contribute to pain recognition. By focusing on data diversity and modality-specific representations, the study sets new standards for recognizing complex pain behaviors and supports the development of empathetic and socially responsible AI systems.
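
    A toy sketch of correlation-driven late fusion (a simple Pearson correlation on validation scores stands in for the statistical relevance weights; modality names and inputs are placeholders):

    # Sketch: weight each modality's classifier scores by its absolute correlation
    # with the validation labels, then fuse by weighted sum.
    import numpy as np

    def relevance_weights(val_scores, val_labels):
        corr = {m: abs(np.corrcoef(val_labels, s)[0, 1]) for m, s in val_scores.items()}
        total = sum(corr.values())
        return {m: c / total for m, c in corr.items()}

    def fuse(test_scores, weights):
        return sum(weights[m] * np.asarray(test_scores[m]) for m in test_scores)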

  • Summer Cetus-Talk Online Research Program “Deep Learning in Artificial Intelligence”

    -

    Completed studies under the guidance of Professor Pietro Lio of the University of Cambridge. Researched the concepts and applications of deep learning and transformers.

  • Hiring Pipeline, Corporate Project with Avanade

    -

    Developed NLP tools that detect implicit bias in hiring pipelines, such as gender bias caused by improper unigram semantics in CV writing. Designed the system using Django REST Framework on the backend and React on the front end. Developed website: http://students.cs.ucl.ac.uk/2020/group9/.

  • 2020 ULTRA Coding Competition

    -

    Ranked 93rd worldwide (as “Active Red Giraffe”) on the final leaderboard.

  • National Economics Challenge (Skt Education)

    -

    Published a personal story on the official Skt WeChat account. Organized an open online sharing session to give a talk and share personal experiences of the NEC and college applications with students.

  • Hybrid Retrieval Augmented Generation for AI Policy

    -

    This is a UCL IXN project involving a UCL Master’s student. This project focuses on developing frameworks for Retrieval-Augmented Generation (RAG) applications tailored for policy documents.
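
    A minimal sketch of the hybrid retrieval step (the package choices, chunk texts, and blending weight are illustrative assumptions):

    # Sketch: blend lexical BM25 scores with dense cosine similarity and return
    # the top-k chunks for the generator.
    from rank_bm25 import BM25Okapi
    from sentence_transformers import SentenceTransformer, util

    chunks = ["Article 5 prohibits certain AI practices ...",
              "High-risk systems must undergo conformity assessment ..."]

    bm25 = BM25Okapi([c.lower().split() for c in chunks])
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    chunk_emb = encoder.encode(chunks, convert_to_tensor=True)

    def hybrid_retrieve(query, k=2, alpha=0.5):
        lexical = bm25.get_scores(query.lower().split())
        dense = util.cos_sim(encoder.encode(query, convert_to_tensor=True), chunk_emb)[0]
        blended = [alpha * float(l) + (1 - alpha) * float(d)
                   for l, d in zip(lexical, dense)]
        ranked = sorted(range(len(chunks)), key=lambda i: -blended[i])
        return [chunks[i] for i in ranked[:k]]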

