Introduction
Over the past nine years, the landscape of stream processing research has evolved significantly, driven by advancements in real-time analytics, AI integration, and distributed architectures. In response to these shifts, we are pleased to present the 9th Workshop on Stream Processing, Stream-based AI & Stream Data Management in Big Data at IEEE Big Data 2025, bringing together the latest contributions in the field.
Stream processing remains a cornerstone of the Big Data ecosystem, with its applications expanding across IoT, AI-driven decision-making, financial technology, and cybersecurity. Innovations in serverless computing, edge stream analytics, continual learning, evolving graphs, and event-driven data mesh architectures are now defining the future of real-time computing, reflecting the dynamic progress in this domain.
Key Evolutions in Stream Processing
Stream processing systems have become core components of modern data-intensive architectures, where they manage and analyze continuous data flows [18, 19]. These systems increasingly integrate specialized frameworks for data management and analytics, enabling efficient real-time integration across diverse applications [20].
A key evolution in this space is the incorporation of advanced data management functionalities, allowing seamless interaction between transactional and analytical workloads [21]. One notable example is the deep integration of stateful stream processing with Hybrid Transactional and Analytical Processing (HTAP). This paradigm enables real-time analytics on streaming data with minimal latency by unifying "hot" (real-time) and "cold" (historical) data streams. Concepts such as Stream Tables and Materialized Views are leveraged to facilitate low-latency decision-making, demonstrating how stream processing is evolving beyond traditional event processing into a foundational pillar of modern data infrastructure [1,2].
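As a rough illustration of the materialized-view idea, the following minimal Python sketch seeds a per-key aggregate from historical ("cold") rows and then folds in real-time ("hot") events one at a time, so that reads see both through a single view. The class name `RunningTotalView`, the account keys, and the amounts are hypothetical and chosen only for illustration; this is not the API of any particular HTAP or streaming system.

```python
from collections import defaultdict


class RunningTotalView:
    """Toy materialized view: per-key totals maintained incrementally.

    Hypothetical illustration only, not the API of any real system.
    """

    def __init__(self, historical_rows):
        # Seed the view from "cold" (historical) rows of (key, amount).
        self.totals = defaultdict(float)
        for key, amount in historical_rows:
            self.totals[key] += amount

    def apply(self, key, amount):
        # Fold one "hot" (real-time) event into the view in O(1),
        # instead of recomputing the aggregate from scratch.
        self.totals[key] += amount

    def query(self, key):
        # Reads see historical and streaming data through the same view.
        return self.totals.get(key, 0.0)


# Made-up historical rows followed by two streaming events.
view = RunningTotalView([("acct-1", 100.0), ("acct-2", 40.0)])
for event in [("acct-1", 25.0), ("acct-3", 5.0)]:
    view.apply(*event)
```

The point of the sketch is the update path: each event adjusts the stored result directly, which is what keeps query latency low as the stream grows.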
New initiatives in data mesh architectures are also driving real-time data products and self-serve streaming infrastructures [3].
In parallel, AI-powered stream processing has reached a new level of maturity, with adaptive streaming machine learning models that address concept drift, support incremental learning, and perform real-time anomaly detection [4, 6, 7]. Frameworks such as Apache Flink and Ray Streaming [9, 5] extend streaming pipelines to natively support machine learning workflows.
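To make real-time anomaly detection with incremental updates concrete, here is a minimal sketch, assuming a single numeric stream and a simple z-score rule. It keeps a running mean and variance with Welford's online algorithm and flags values that fall far from what has been seen so far. The class name `OnlineZScoreDetector`, the threshold, the warm-up length, and the sample readings are all assumptions made for illustration; the cited works use more sophisticated models.

```python
import math


class OnlineZScoreDetector:
    """Flags outliers in a numeric stream using running statistics.

    Hypothetical example; threshold and warmup values are arbitrary.
    """

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold
        self.warmup = warmup
        self.n = 0        # number of values seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations

    def update(self, x):
        """Score x against past values, then learn from it."""
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True
        # Welford's update: one pass, constant memory per stream.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly


# Made-up sensor readings; only the last one is far from the rest.
readings = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 25.0]
detector = OnlineZScoreDetector()
flags = [detector.update(x) for x in readings]
```

Because the statistics are updated after every value, the detector never needs to store or rescan the stream. One design caveat of this simple version is that flagged values are still folded into the statistics, which a production detector might choose to avoid.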
Graph-Based Stream Processing and Temporal Analytics
Graph-based stream processing is a crucial component of real-time analytics, particularly for applications requiring temporal graph analytics, such as fraud detection, cybersecurity, social network analysis, and supply chain optimization [10, 11]. Advances in this field enable real-time anomaly detection, link prediction, and dynamic knowledge graph construction, enhancing the scalability of streaming event processing.
Recent breakthroughs in graph neural networks (GNNs) have further expanded real-time stream processing capabilities, supporting tasks such as social network monitoring [12], dynamic fraud detection, and network anomaly detection.
Federated Learning, Edge Stream Processing, and Privacy-Aware Architectures
Federated learning and edge computing are reshaping real-time analytics, reducing reliance on centralized cloud architectures. Privacy-preserving stream analytics has become essential [8], enabling IoT devices to process data locally while still contributing to secure, distributed intelligence.
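One common aggregation step in federated learning is a sample-weighted average of client model parameters, in the style of federated averaging. The sketch below assumes each device reports only a parameter vector and a local sample count, never its raw data. The function name `federated_average` and the numbers are hypothetical, and this is not presented as the method of the cited work.

```python
def federated_average(client_updates):
    """Combine client parameters, weighted by local sample counts.

    client_updates: list of (weights, num_samples) pairs, where weights
    is a list of floats. Hypothetical helper for illustration only.
    """
    total = sum(num_samples for _, num_samples in client_updates)
    if total == 0:
        raise ValueError("no samples reported by any client")
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, num_samples in client_updates:
        for i, w in enumerate(weights):
            # Clients with more local data contribute proportionally more.
            averaged[i] += w * num_samples / total
    return averaged


# Two made-up edge devices with different amounts of local data.
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
global_weights = federated_average(updates)
```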
Continual Learning in Stream-Based Applications
Continual learning [15, 16, 17] enables models to assimilate new information while preserving previously acquired knowledge. In the context of stream processing and real-time analytics, this capability is used for adapting to evolving data patterns and concept drift, that is, situations where the statistical properties of the data change over time. Implementing continual learning within streaming architectures allows systems to update predictive models on the fly, ensuring sustained accuracy and responsiveness. This adaptability is particularly useful in dynamic environments such as financial markets, cybersecurity, and IoT applications, where real-time decision-making hinges on the ability to learn from continuous data flows.
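The adaptation side of this idea can be sketched with a one-weight model that is updated after every sample, using a least-mean-squares gradient step. The stream below is synthetic: the true slope changes from 2.0 to -1.0 halfway through, standing in for concept drift. All names (`OnlineLinearModel`, `make_stream`) and values are assumptions for illustration. Note that this sketch only shows on-the-fly adaptation; it makes no attempt to preserve earlier knowledge, which is the harder part of continual learning.

```python
class OnlineLinearModel:
    """One-weight model y ~ w * x, updated one sample at a time."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.w = 0.0

    def predict(self, x):
        return self.w * x

    def learn_one(self, x, y):
        # Single gradient step on the squared error of one (x, y) pair.
        error = y - self.predict(x)
        self.w += self.learning_rate * error * x
        return error


def make_stream(slope, n):
    # Deterministic toy stream; inputs cycle through three fixed values.
    inputs = [0.5, 1.0, 1.5]
    for i in range(n):
        x = inputs[i % 3]
        yield x, slope * x


model = OnlineLinearModel()

# Phase 1: the underlying relationship is y = 2x.
for x, y in make_stream(slope=2.0, n=200):
    model.learn_one(x, y)
weight_before_drift = model.w

# Phase 2: concept drift, the relationship becomes y = -x.
for x, y in make_stream(slope=-1.0, n=200):
    model.learn_one(x, y)
weight_after_drift = model.w
```

In this toy setting the weight settles near 2.0 in the first phase and then moves toward -1.0 after the drift, without any retraining from scratch.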
New Frontiers: Digital Twins, Decentralized Streaming, and Smart Systems
Stream processing in digital twins is a rapidly growing field, enabling real-time data pipelines for industrial automation, smart cities, and healthcare applications [13]. Similarly, decentralized streaming architectures are gaining traction, particularly in blockchain monitoring, decentralized finance (DeFi) analytics [14], and secure streaming frameworks.
With these evolving trends, the 2025 workshop will serve as a premier venue for researchers and industry leaders to discuss the latest advances in real-time streaming, event processing, and AI-driven stream analytics. We invite contributions on the following topics:
- Stateful stream processing & hybrid transactional-analytical models
- AI-powered stream mining, anomaly detection, and federated learning
- Temporal graph streaming, graph neural networks, and dynamic event detection
- Edge-based streaming analytics and privacy-preserving architectures
- Real-time data products in data mesh and event-driven architectures
- Streaming digital twins for industrial automation and smart cities
- Decentralized streaming for blockchain, cybersecurity, and DeFi analytics
This workshop will bring together leading researchers and practitioners to explore cutting-edge advancements and shape the future of real-time data streaming.