Abstract
This paper develops an edge-device collaborative Generative Semantic Communications (Gen SemCom) framework that leverages pre-trained Multi-modal/Vision Language Models (M/VLMs) for ultra-low-rate semantic communication via textual prompts. The proposed framework optimizes the use of M/VLMs across the wireless edge and devices to generate high-fidelity textual prompts through visual captioning/question answering, which are then transmitted over the wireless channel for SemCom. Specifically, we develop a multiuser Gen SemCom framework using pre-trained M/VLMs and formulate a joint optimization problem over prompt generation offloading and communication and computation resource allocation, aiming to minimize latency while maximizing the resulting semantic quality. Since the problem is non-convex with highly coupled discrete and continuous variables, we decompose it into a two-level problem and propose a low-complexity swap/leaving/joining (SLJ)-based matching algorithm. Simulation results demonstrate significant performance gains over conventional semantic-unaware and non-collaborative generation offloading benchmarks.

Index Terms—Pre-trained multi-modal/vision language models (M/VLMs), semantic communication, zero/few-shot captioning, collaborative edge-device generative AI.