Smart home technology is advancing rapidly, and one of its most impactful applications is Human Activity Recognition (HAR). HAR enables smart systems to monitor daily activities such as cooking, sleeping, or exercising, providing essential support in domains like healthcare and assisted living. However, while deep learning models have significantly improved HAR accuracy, they often operate as “black boxes,” offering little transparency into their decision-making process.
To address this, researchers from the University of Milan—Michele Fiori, Davide Mor, Gabriele Civitarese, and Claudio Bettini—have introduced GNN-XAR, the first explainable Graph Neural Network (GNN) for smart home activity recognition. This innovative model not only improves HAR performance but also generates human-readable explanations for its predictions.
The need for explainable AI in smart homes
Most existing HAR systems rely on deep learning models such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). While effective, these models struggle with explainability, making it difficult for users—including medical professionals and data scientists—to understand why a specific activity was detected. Explainable AI (XAI) seeks to mitigate this by providing insights into model decisions, enhancing trust and usability in real-world applications.
Graph Neural Networks (GNNs) have emerged as a powerful tool for modeling time-series sensor data in smart homes, as they can capture both spatial and temporal relationships between sensor readings. However, existing GNN-based HAR approaches lack built-in explainability. This is where GNN-XAR differentiates itself, offering an innovative solution that combines graph-based HAR with interpretability mechanisms, making it the first of its kind in the field.
How GNN-XAR works
GNN-XAR introduces a novel graph-based approach to sensor data processing. Instead of treating sensor readings as isolated events, it constructs dynamic graphs that model relationships between different sensors over time. Each graph is processed using a Graph Convolutional Network (GCN), which identifies the most probable activity being performed. To ensure transparency, an adapted XAI technique specifically designed for GNNs highlights the most relevant nodes (sensor readings) and edges (temporal dependencies) that contributed to the final prediction.
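To make the processing step concrete, here is a minimal sketch of a graph-level activity classifier built with PyTorch Geometric. The two-layer architecture, feature dimensions, and mean pooling are illustrative assumptions, not the paper’s exact configuration.

```python
# A minimal sketch of graph-level activity classification with a GCN.
# Feature sizes, layer count, and pooling are assumptions for illustration.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool


class ActivityGCN(torch.nn.Module):
    def __init__(self, num_sensor_features: int, num_activities: int, hidden: int = 64):
        super().__init__()
        self.conv1 = GCNConv(num_sensor_features, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.classifier = torch.nn.Linear(hidden, num_activities)

    def forward(self, x, edge_index, batch):
        # Message passing over sensor-event nodes and their temporal/spatial edges.
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        # Pool node embeddings into one embedding per time window, then classify.
        g = global_mean_pool(h, batch)
        return self.classifier(g)  # logits over candidate activities
```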
The graph construction process is a key innovation in GNN-XAR. Sensor events—such as motion detections, appliance usage, and door openings—are represented as nodes, while edges capture their temporal and spatial relationships. The system distinguishes between two sensor types:
- Explicit interaction sensors (e.g., cabinet door sensors), which generate both ON and OFF events.
- Passive sensors (e.g., motion detectors), for which only activation events matter; the system computes the duration of each activation.
To maintain structure and efficiency, the system introduces super-nodes that group related sensor events. This allows the GNN model to process complex sensor interactions while keeping computations manageable.
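The construction described above can be sketched roughly as follows. The event fields, sensor identifiers, and edge rules are simplified assumptions for illustration; the paper’s actual graph-building procedure is more elaborate.

```python
# A hedged sketch of graph construction: sensor events become nodes,
# consecutive events get temporal edges, and a super-node groups each
# sensor's events. Sensor IDs and edge rules are assumptions.
from dataclasses import dataclass

EXPLICIT = {"cabinet_door", "fridge_door"}   # ON/OFF interaction sensors (assumed IDs)
PASSIVE = {"motion_kitchen", "motion_hall"}  # activation-only sensors (assumed IDs)

@dataclass
class Node:
    sensor: str
    timestamp: float
    duration: float = 0.0  # filled in for passive sensors

def build_graph(events):
    """events: list of (sensor_id, state, timestamp) tuples, sorted by time."""
    nodes, edges = [], []
    active_since = {}  # sensor -> activation time, for duration computation
    for sensor, state, ts in events:
        if sensor in PASSIVE:
            if state == "ON":
                active_since[sensor] = ts
            else:
                # Only the activation matters; store its measured duration.
                start = active_since.pop(sensor, ts)
                nodes.append(Node(sensor, start, duration=ts - start))
        else:
            # Explicit interaction sensors keep both ON and OFF events as nodes.
            nodes.append(Node(sensor, ts))
    # Temporal edges between consecutive events.
    edges += [(i, i + 1) for i in range(len(nodes) - 1)]
    # Super-nodes: one extra node per sensor, linked to all of its events.
    for sensor in {n.sensor for n in nodes}:
        super_idx = len(nodes)
        nodes.append(Node(sensor, -1.0))  # placeholder super-node
        edges += [(i, super_idx) for i, n in enumerate(nodes[:super_idx])
                  if n.sensor == sensor]
    return nodes, edges
```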
How GNN-XAR explains its decisions
Unlike traditional deep learning models, which provide only classification outputs, GNN-XAR uses GNNExplainer, a specialized XAI method tailored for graph-based models. This method identifies the most important nodes and edges that influenced a prediction. The key innovation in GNN-XAR is its adaptation of GNNExplainer to work seamlessly with smart home data, ensuring that explanations are both accurate and human-readable.
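As a rough illustration, this is how GNNExplainer can be applied to a graph-level classifier through PyTorch Geometric’s Explainer API. The toy inputs and hyperparameters are assumptions, and GNN-XAR’s adapted version differs from this off-the-shelf usage.

```python
# A sketch of off-the-shelf GNNExplainer via PyTorch Geometric's Explainer
# API; GNN-XAR adapts this idea to smart home data. Inputs below are toy
# placeholders, not real sensor data.
import torch
from torch_geometric.explain import Explainer, GNNExplainer

model = ActivityGCN(num_sensor_features=16, num_activities=8)  # from the sketch above
x = torch.randn(10, 16)                     # 10 sensor-event nodes (toy features)
edge_index = torch.randint(0, 10, (2, 20))  # placeholder temporal/spatial edges
batch = torch.zeros(10, dtype=torch.long)   # a single time-window graph

explainer = Explainer(
    model=model,
    algorithm=GNNExplainer(epochs=200),
    explanation_type="model",
    node_mask_type="attributes",
    edge_mask_type="object",
    model_config=dict(
        mode="multiclass_classification",
        task_level="graph",  # explain the window-level activity prediction
        return_type="raw",
    ),
)

explanation = explainer(x, edge_index, batch=batch)
# Per-node and per-edge importance scores; the highest-scoring elements
# become the evidence cited in the natural-language explanation.
node_scores = explanation.node_mask.sum(dim=-1)
top_nodes = node_scores.topk(3).indices
top_edges = explanation.edge_mask.topk(3).indices
```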
For example, if the system predicts “meal preparation,” it may highlight events such as repeated fridge openings followed by stove activation, providing a logical and understandable rationale for its classification. The model then converts this explanation into natural language, making it accessible to non-expert users.
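The final translation step can be as simple as a template over the top-ranked events. The function below is a hypothetical simplification; the paper’s actual text-generation step may be more sophisticated.

```python
# A hypothetical template for turning top-ranked events into a sentence.
def explain_in_words(activity: str, top_events: list[str]) -> str:
    evidence = ", followed by ".join(top_events)
    return f'The activity "{activity}" was recognized mainly because of {evidence}.'

print(explain_in_words("meal preparation",
                       ["repeated fridge openings", "stove activation"]))
# -> The activity "meal preparation" was recognized mainly because of
#    repeated fridge openings, followed by stove activation.
```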
Experimental results
GNN-XAR was tested on two public smart home datasets—CASAS Milan and CASAS Aruba—which contain sensor data from real homes. The model was evaluated against DeXAR, a state-of-the-art explainable HAR system that uses CNN-based methods. The results showed that GNN-XAR not only provided more accurate predictions but also generated more meaningful explanations compared to existing XAI-based HAR methods.
Key findings include:
- Slightly higher recognition accuracy than DeXAR, especially for activities with strong temporal dependencies (e.g., “leaving home”).
- Superior explainability, as measured by an evaluation method that uses Large Language Models (LLMs) to assess explanation clarity and relevance (a minimal sketch of this judging setup follows the list).
- Improved handling of complex sensor relationships, enabling more reliable HAR performance.
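As referenced above, here is a minimal sketch of such an LLM-as-judge setup using the OpenAI Python SDK. The prompt wording, rating scale, and judge model are assumptions, not the paper’s exact evaluation protocol.

```python
# A hedged sketch of LLM-based explanation scoring. Prompt, scale, and
# model name are assumptions; only the general idea follows the paper.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def rate_explanation(activity: str, explanation: str) -> str:
    prompt = (
        f'Activity predicted: "{activity}".\n'
        f'Explanation given: "{explanation}".\n'
        "On a scale of 1-5, rate how clear and relevant this explanation "
        "is for a non-expert user. Reply with the number and one sentence."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed judge model
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```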