Binyan Xu
AI Security · Vision-Language Models

Ph.D. student in Information Engineering at The Chinese University of Hong Kong. My research explores how vision-language models such as CLIP, and the embodied agents built on them, can act as both attackers and defenders in AI security.

Deep neural network security
Backdoor & adversarial attacks / defenses
Vision-language & embodied AI safety
Short bio
I am a Ph.D. student in the Department of Information Engineering at The Chinese University of Hong Kong, advised by Prof. Kehuan Zhang. Before that, I received my B.Sc. in Automation (Honors) from Xi’an Jiaotong University and spent a semester as an exchange student in EECS at UC Berkeley. My research focuses on making modern AI systems more trustworthy in adversarial settings, with an emphasis on backdoor attacks and defenses, universal adversarial perturbations, and the security of vision-language and embodied agents.
Research interests
  • Backdoor attacks & defenses for DNNs
  • Universal & transferable adversarial attacks
  • Using CLIP / VLMs / LLMs for AI security
  • Security of embodied agents & robotic systems

Research Overview

Friend and Foe: Exploring the dual role of vision-language models in AI safety.

Friend and Foe: Vision-Language Models as Both Attackers and Defenders
Ph.D. pre-candidacy thesis proposal · Department of Information Engineering, CUHK
Vision-language models (VLMs) such as CLIP enable powerful cross-modal reasoning, but they also introduce new attack surfaces. My research proposal studies this dual role across three directions:
  1. UnivIntruder: Using a single public VLM to craft universal, transferable, and targeted adversarial perturbations that can hijack fully black-box image classifiers.
  2. CLIP-Guided Defense: Leveraging CLIP as an external semantic inspector to separate poisoned from clean data and defend against a wide range of backdoor attacks.
  3. Environment-Driven Jailbreaks (in progress): Studying how carefully designed physical environments can lead embodied agents to bypass their safety mechanisms via the perception pipeline.

Selected Publications

First-author works on AI security, vision-language models, and backdoor robustness.

One Surrogate to Fool Them All: Universal, Transferable, and Targeted Adversarial Attacks with CLIP
ACM Conference on Computer and Communications Security (CCS) 2025 · Oral
Adversarial Attack
This work shows that a single, publicly available vision-language model (CLIP) can serve as a universal surrogate for generating targeted, universal perturbations that transfer to diverse, fully black-box vision models without accessing their architectures, parameters, or data. A minimal sketch of the core idea follows below.
Universal perturbation · Transferability · Black-box models · CLIP
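The recipe can be sketched in a few lines of PyTorch: optimize one L∞-bounded perturbation so that CLIP pushes any perturbed image toward the text embedding of an attacker-chosen class. This is a minimal illustration only, not UnivIntruder’s actual objective or pipeline; the target prompt, budget, and learning rate below are placeholder assumptions.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model.eval()

# CLIP's input normalization, applied after the perturbation is added.
mean = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
std = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

# Attacker-chosen target concept (placeholder prompt).
with torch.no_grad():
    tgt = model.encode_text(clip.tokenize(["a photo of a goldfish"]).to(device))
    tgt = tgt / tgt.norm(dim=-1, keepdim=True)

eps = 8 / 255                                   # assumed L-infinity budget
delta = torch.zeros(1, 3, 224, 224, device=device, requires_grad=True)
opt = torch.optim.Adam([delta], lr=1e-2)

def attack_step(batch):
    """batch: public images in [0, 1], shape (B, 3, 224, 224)."""
    adv = (batch + delta).clamp(0, 1)           # one perturbation, shared by all
    feat = model.encode_image((adv - mean) / std)
    feat = feat / feat.norm(dim=-1, keepdim=True)
    loss = -(feat @ tgt.T).mean()               # pull every image toward the target text
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)                 # keep the perturbation bounded
    return loss.item()
```

Since nothing about the victim is used during optimization, the resulting delta can be applied unchanged to unseen black-box classifiers; how well such perturbations transfer is what the paper studies.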
CLIP-Guided Backdoor Defense through Entropy-Based Poisoned Dataset Separation
ACM Multimedia (MM) 2025 · Oral
Backdoor Defense
We propose CLIP-Guided Defense (CGD), which uses CLIP to identify and separate likely poisoned and clean samples, then retrains the victim model using CLIP’s logits as guidance. CGD achieves low attack success rates (often below 1%) while preserving clean accuracy across multiple datasets and diverse backdoor attacks, including clean-label and clean-image settings. A simplified sketch of the separation step appears below.
Data separation · Entropy-based scoring · Clean-image backdoors · Robust retraining
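As a rough illustration of the separation idea, the snippet below scores each training sample by the entropy of CLIP’s zero-shot distribution over the task’s label space. CGD’s actual scoring, threshold selection, and retraining differ; the CIFAR-10 class list and the quantile rule are placeholder assumptions.

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
model.eval()

# Zero-shot "inspector" prompts for the victim task's label space
# (CIFAR-10 here, purely as an example).
class_names = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]
with torch.no_grad():
    txt = model.encode_text(
        clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device))
    txt = txt / txt.norm(dim=-1, keepdim=True)

@torch.no_grad()
def entropy_scores(images):
    """images: CLIP-preprocessed batch (B, 3, 224, 224).
    Returns the entropy of CLIP's zero-shot distribution per sample."""
    img = model.encode_image(images)
    img = img / img.norm(dim=-1, keepdim=True)
    probs = (100.0 * img @ txt.T).softmax(dim=-1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)

# Illustrative split: samples where CLIP's evidence is decisive go to the
# "clean" pool, the ambiguous rest to a suspicious pool for inspection.
# scores = entropy_scores(batch)
# clean_mask = scores <= scores.quantile(0.9)   # placeholder rule, not CGD's
```

The retraining stage, where CLIP’s logits guide the victim model on the separated data, is the part of CGD this sketch omits.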
Breaking the Stealth–Potency Trade-off in Clean-Image Backdoors with Generative Trigger Optimization
AAAI Conference on Artificial Intelligence (AAAI) 2026 · Oral
Backdoor Attack
This work introduces a generative framework for clean-image backdoors: the attacker only relabels training images while keeping their visual content unchanged, yet still plants powerful backdoors with high success rates and minimal clean-accuracy degradation. By optimizing over a latent trigger space, we show that high stealth and high potency can coexist, and we demonstrate the attack across classification, regression, and segmentation tasks; a conceptual sketch of the relabeling step follows.
Clean-image backdoors · Generative models · InfoGAN
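Conceptually, the poisoning step reduces to a label-only wrapper around the training set, as in the hedged sketch below. The predicate `has_trigger` is entirely a placeholder standing in for the paper’s generatively optimized latent trigger.

```python
from torch.utils.data import Dataset

class CleanImagePoison(Dataset):
    """Label-only poisoning: pixels are never touched; samples whose content
    fires the (attacker-chosen) trigger predicate are relabeled."""

    def __init__(self, base, has_trigger, target_class):
        self.base = base                  # any (image, label) dataset
        self.has_trigger = has_trigger    # hypothetical predicate, stands in
                                          # for the optimized latent trigger
        self.target_class = target_class

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        x, y = self.base[i]
        if self.has_trigger(x):           # image x is returned unchanged
            y = self.target_class
        return x, y

# At inference the attacker presents (or synthesizes) an input that satisfies
# the same predicate, and the trained model outputs `target_class`.
```

The stealth–potency trade-off the paper addresses lives in the choice of trigger: a predicate that fires on too many images is conspicuous in the label statistics, while one that fires on too few is hard for the model to learn.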

Selected Project

Industry research experience at Microsoft Research Asia (MSRA).

AI-Aided Aesthetic Graphic Design Generation
Microsoft Research Asia · Stars of Tomorrow Internship
Industry collaboration
During my internship at Microsoft Research Asia (MSRA), I co-developed an AI-aided aesthetic graphic design system that combines:
  • an autoregressive vision-language layout generator that produces structured, constraint-aware designs,
  • a diffusion-based module that refines style and texture while preserving layout semantics.
The system achieves higher aesthetic quality and fewer constraint violations than prior approaches, and demonstrates how generative models can augment professional design workflows in a controllable way.
Diffusion models · Layout generation · Vision-language · Human–AI collaboration

Awards & Education

Academic training and recognitions along the way.

Education
Ph.D. in Information Engineering
The Chinese University of Hong Kong · Sep. 2023 – Aug. 2027 (expected)
GPA: 3.86. Research on AI security, backdoor attacks/defenses, and vision-language / embodied AI.
Exchange Program in EECS
University of California, Berkeley · Aug. 2021 – Dec. 2021
GPA: 4.0. Coursework and projects in computer science and artificial intelligence.
B.Sc. in Automation (Honors)
Xi’an Jiaotong University · Sep. 2019 – Jul. 2023
GPA: 3.92. Graduated with honors under the Qian Xuesen Program.
Honors Youth Program (西安交大少年班)
Xi’an Jiaotong University · Sep. 2017 – Jul. 2019
Rank: top 10%. A preparatory track admitting students to university directly from middle school.
Selected awards & scholarships
  • Finalist Award, Mathematical Contest in Modeling (MCM)
  • Qian Xuesen Program Honorary Graduate
  • Outstanding Graduates Scholarship
  • First-class Funding for Studying Abroad (~¥100k)

Contact & Links

The fastest way to reach me is via email.

Contact

I am always happy to discuss AI security, backdoor attacks/defenses, and vision-language or embodied AI. If you are interested in collaboration, feel free to send me a short email describing your idea.

Email: binyxu@ie.cuhk.edu.hk
