In light of the aforementioned requirements, we present here “Mining Minds”, a novel framework aimed at comprehensively mining people’s daily life data generated from heterogeneous resources to produce personalized health and wellness support. The Mining Minds philosophy revolves around the concepts of data, information, knowledge and service curation, which refer to the discovery, processing, adaptation and evolution of both contents and mechanisms for the provision of high-quality support services. Motivated by these concepts, a multilayer architecture is devised for Mining Minds (Fig. 1). In a nutshell, the data curation layer (DCL) is in charge of processing and persisting the data obtained from the multimodal data sources (MDS), an abstraction of the possible sources of user health and wellness data. These include, but are not limited to, social networks, questionnaires, wearable biomedical devices and ambient intelligence systems. The data processed by DCL is primarily used by the information curation layer (ICL) to infer low-level and high-level person-centric information. This information mainly describes the user context and behavior and, to some extent, their physical, mental and social state. The information extracted by ICL is leveraged by the knowledge curation layer (KCL) to nurture and evolve the health and wellness knowledge primarily created by human experts. Data, information and knowledge are used by the service curation layer (SCL) to create intelligent health and wellness support services, mostly in the form of smart coaching and support recommendations. Security and privacy for all contents and processes are handled by the supporting layer (SL), which also provides analysis of user experience, feedback and trends to improve personalization.
Data curation layer
The data curation layer is responsible for acquiring, curating and persisting the data obtained from MDS so that it can be processed for higher-level understanding. To that end, DCL relies on two main modules: Sensory Data Processing and Curation, and Big Data Storage. Within the former, Data Acquisition supports the acquisition and synchronization of raw sensory data obtained from diverse sources, both in real time and offline, as generic data streams. Because the data is heterogeneous, it is acquired asynchronously in real time and temporarily cached in data buffers. One data buffer is initialized per data source in the Data Acquisition component. The contents of all data buffers are synchronized and communicated to ICL for the determination of the associated low- and high-level contexts. In parallel, the synchronized data is stored in Big Data Storage for non-volatile persistence.
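The buffering and synchronization step can be illustrated with a minimal sketch. It assumes that synchronization means grouping samples from all sources into fixed-length time windows; the class name `DataAcquisition`, the `window_s` parameter and the source names are hypothetical and are not taken from the Mining Minds implementation.

```python
from collections import defaultdict, deque


class DataAcquisition:
    """Keeps one buffer per data source and aligns samples on time windows."""

    def __init__(self, sources, window_s=1.0):
        self.window_s = window_s  # window length in seconds (assumed value)
        self.buffers = {name: deque() for name in sources}

    def push(self, source, timestamp, value):
        """Cache one sample; sources may deliver samples at any time."""
        self.buffers[source].append((timestamp, value))

    def synchronize(self):
        """Drain every buffer and group the samples by time window.

        Returns a list of (window_start, {source: [values]}) sorted by time.
        """
        windows = defaultdict(dict)
        for source, buffer in self.buffers.items():
            while buffer:
                timestamp, value = buffer.popleft()
                index = int(timestamp // self.window_s)
                windows[index].setdefault(source, []).append(value)
        return [(i * self.window_s, windows[i]) for i in sorted(windows)]


acquisition = DataAcquisition(["accelerometer", "gps"], window_s=1.0)
acquisition.push("accelerometer", 0.10, 9.8)
acquisition.push("gps", 0.40, (37.2, 127.0))
acquisition.push("accelerometer", 1.20, 9.6)
synchronized = acquisition.synchronize()
```

In this sketch each synchronized window would be handed to ICL and, in parallel, written to persistent storage.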
Upon receiving the context information determined by ICL, the context instances are curated by the Representation and Mapping component as a time-based log registering the detected human behaviors. This time-based log is termed the user Life-Log, or simply Life-Log, and is persisted in the Intermediate Database so that it can be shared with other layers and applications. The stream of life-log instances is analyzed by a monitoring component called the life-log monitor (LLM). The responsibility of the LLM is to perform time-based monitoring of the different attributes and variables hosted in the Life-Log, and to support trigger-based mechanisms that notify SCL of the occurrence of an abnormal or special event related to a given user. These abnormal events normally represent risky or unhealthy behaviors and are here defined as “situation events” or, in general, “situations”. Situations are described through diverse constraints, e.g., age, gender or medical conditions, and monitorable variables, e.g., the intensity of a particular activity or its duration. Situation events can be generated both statically at design time and dynamically at run time upon request from KCL.
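A minimal sketch of the trigger mechanism follows. It assumes a situation combines a constraint on the user profile with a duration threshold on one monitored activity; the `Situation` and `LifeLogMonitor` names, the field names and the `notify` callback are illustrative assumptions rather than the actual LLM interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Situation:
    name: str
    constraint: Callable[[dict], bool]  # predicate on the user profile
    activity: str                       # monitored activity
    threshold_min: float                # continuous duration that triggers it


class LifeLogMonitor:
    """Tracks continuous activity duration per user and notifies on situations."""

    def __init__(self, notify):
        self.notify = notify    # stands in for the notification sent to SCL
        self.situations = []
        self._current = {}      # user id -> (activity, accumulated minutes)

    def register(self, situation):
        """Situations can be added at design time or later, at run time."""
        self.situations.append(situation)

    def on_entry(self, profile, activity, duration_min):
        """Process one life-log entry for the user described by `profile`."""
        user_id = profile["id"]
        last_activity, total = self._current.get(user_id, (None, 0.0))
        # The accumulated time restarts whenever the activity changes.
        total = total + duration_min if activity == last_activity else duration_min
        self._current[user_id] = (activity, total)
        for s in self.situations:
            if s.activity == activity and s.constraint(profile) and total >= s.threshold_min:
                self.notify(user_id, s.name)


events = []
monitor = LifeLogMonitor(notify=lambda user_id, name: events.append((user_id, name)))
monitor.register(Situation("prolonged_sitting", lambda p: p["age"] >= 18, "sitting", 60))
user = {"id": "u1", "age": 30}
monitor.on_entry(user, "sitting", 40)  # below the threshold, nothing happens
monitor.on_entry(user, "sitting", 30)  # 70 minutes in total, SCL is notified
```

Note that this simplified monitor keeps notifying on every further entry once the threshold is exceeded; a real monitor would likely debounce repeated events.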
The life-log data persisted in the Intermediate Database is regularly synchronized with the Big Data Storage. Big Data Storage also provides read access to raw sensory and life-log data. When historic data is required by SL for analytics or by KCL for data-driven rule generation, Big Data Storage provides queries for data streaming and intermediate data generation. These queries can be customized on request to return specific data based on the attributes selected by KCL and SL. Security and privacy components from SL are also involved in these processes: they request authentication and encrypt data streams before the data is persisted or shared.
Information curation layer
The information curation layer represents the Mining Minds core for the inference and modeling of the user context [26]. ICL is composed of two main modules, namely, low level context awareness (LLCA) and high level context awareness (HLCA). LLCA is in charge of converting the wide spectrum of data obtained from the user interaction with the real and cyber worlds into abstract concepts or categories, such as physical activities, emotional states, locations and social patterns. These categories are combined and processed at HLCA in order to identify more meaningful semantic representations of the user context.
Low level context awareness is composed of four key components: Activity Recognizer, Emotion Recognizer, Location Detector and SNS Analyzer. The identification of the user’s physical actions is performed by the Activity Recognizer. This component may build on several sensing modalities, depending on which are available to the user, such as wearable inertial sensors, video and audio. The output of this component corresponds to elementary activity categories such as “sitting” or “walking”. The Emotion Recognizer is defined to infer user emotional states, such as “surprise” or “sadness”, by using video and audio data as well as more sophisticated sources exploring human physiological variations and responses. The user location is determined by the Location Detector, which essentially builds on the data collected through indoor and outdoor positioning sensors, such as video and GPS. The SNS Analyzer is in charge of processing the information generated by the user during their interactions in regular social networks, including posts, mentions, traces and even global social trends, in the form of both text and multimedia data. From here, personal and general interests, conducts and sentiments may be determined. All these components require compatible multimodal sensory data to operate. The provisioning of the necessary data is performed through the Input Adapter, which receives the data curated by DCL and routes it to each LLCA component depending on its nature. Once new low-level context categories are identified from the analysis of this data, the Output Adapter serves them to DCL for persistence and to HLCA for further processing.
High level context awareness makes use of two components, namely, High-Level Context Builder and High-Level Context Reasoner, to represent, verify, classify and categorize the user’s high-level context. Context representation and verification are performed through ontologies, adopted in the past as a unified conceptual backbone for modeling context, while classification and categorization are done through ontological inference and reasoning. Whenever new information is received from LLCA, a new ontological instance is created by the High-Level Context Builder and categorized into one of the considered high-level contexts by the High-Level Context Reasoner. For example, based on the actual time (e.g., midday), location (e.g., restaurant) and inferred activities (e.g., sitting), this component can determine the user context (e.g., lunch).
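The lunch example can be sketched with a plain rule table. This is a deliberately simplified stand-in: the layer described above relies on ontological inference and reasoning, not on a lookup table. Only the “lunch” rule comes from the text; the hour ranges and the other two rules are hypothetical.

```python
# Each rule: (start_hour, end_hour, location, activity, high-level context).
# Hour ranges and all rules except "lunch" are illustrative assumptions.
RULES = [
    (11, 14, "restaurant", "sitting", "lunch"),
    (9, 18, "office", "sitting", "office work"),
    (6, 22, "gym", "running", "exercising"),
]


def infer_high_level_context(hour, location, activity):
    """Return the first high-level context whose rule matches the inputs."""
    for start, end, rule_location, rule_activity, context in RULES:
        if start <= hour < end and location == rule_location and activity == rule_activity:
            return context
    return "unknown"


context = infer_high_level_context(hour=12, location="restaurant", activity="sitting")
```

An ontology-based reasoner generalizes this idea: instead of exact matches on strings, it classifies an instance by the concepts and relations it satisfies.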
Knowledge curation layer
The knowledge curation layer is devised to enable the creation and evolution of health and wellness knowledge. The knowledge is created by a domain expert or knowledge engineer using either expert-driven or data-driven approaches. The Expert-Driven module provides a set of rule authoring components that allow specialists to describe, in a logical form, causes or premises and effects or conclusions, e.g., “if gender is male and age lower than 65 then activity level should be moderate”. The authoring process is further supported through evidence materials and domain vocabularies to confirm the viability of the rules and facilitate their elaboration. The Data-Driven module leverages the contents of the life-log for the automatic generation of rules. To that end, a data broker interface is defined to glean the contents of interest from the data persisted in DCL based on the features or attributes chosen by the expert, e.g., gender, emotional state and activity level. The process is automated by selecting and learning diverse mining models to discover and represent the underlying relationships among the considered health and wellness factors.
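The example rule quoted above can be written down as a small data structure. The sketch below assumes that a rule is a conjunction of attribute comparisons plus one conclusion; the `Condition` and `Rule` classes and the attribute names are illustrative and do not reflect the actual authoring tool’s rule format.

```python
import operator
from dataclasses import dataclass

OPERATORS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


@dataclass(frozen=True)
class Condition:
    attribute: str
    op: str
    value: object

    def holds(self, facts):
        """True if the fact is present and satisfies the comparison."""
        if self.attribute not in facts:
            return False
        return OPERATORS[self.op](facts[self.attribute], self.value)


@dataclass(frozen=True)
class Rule:
    conditions: tuple  # all conditions must hold (logical AND)
    conclusion: tuple  # (attribute, recommended value)

    def applies_to(self, facts):
        return all(condition.holds(facts) for condition in self.conditions)


# "if gender is male and age lower than 65 then activity level should be moderate"
rule = Rule(
    conditions=(Condition("gender", "==", "male"), Condition("age", "<", 65)),
    conclusion=("activity_level", "moderate"),
)
matches = rule.applies_to({"gender": "male", "age": 40})
```

Keeping rules as data rather than code makes it straightforward to check them for consistency and redundancy before they are stored.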
In both the expert-driven and data-driven cases, the generated rules are verified for consistency and validated to avoid potential violations of, or redundancy with, existing rules prior to being stored in the Knowledge Bases. KCL rules are not only persisted in traditional knowledge bases but also indexed according to their salient conditions, also called “causes” or “situations”. These situations refer to particular attributes of the rules that can be monitored by the platform and used for triggering the execution of specific rules. Accordingly, during the rule creation process the expert can select these condition attributes for monitoring at DCL. Categorizing the knowledge bases through these indexes is intended to enhance the performance of the reasoning processes hosted in SCL. In fact, once a situation is detected, only its associated rules are shared with SCL through the Knowledge-Sharing Interface, upon request of this layer.
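The situation-based indexing can be sketched as an inverted index from situation names to rule identifiers. The sketch assumes rules are stored as plain strings and that redundancy means an identical rule text; the `KnowledgeBase` class, its method names and the rule texts are hypothetical.

```python
from collections import defaultdict


class KnowledgeBase:
    """Stores rules and indexes them by the situations that trigger them."""

    def __init__(self):
        self._rules = {}                        # rule id -> rule text
        self._by_situation = defaultdict(list)  # situation -> [rule id]

    def add(self, rule_id, rule_text, situations):
        """Store a rule unless its id or text already exists, then index it."""
        if rule_id in self._rules or rule_text in self._rules.values():
            return False  # rejected as redundant
        self._rules[rule_id] = rule_text
        for situation in situations:
            self._by_situation[situation].append(rule_id)
        return True

    def rules_for(self, situation):
        """Return only the rules associated with the detected situation."""
        return [self._rules[rule_id] for rule_id in self._by_situation.get(situation, [])]


kb = KnowledgeBase()
kb.add("r1", "if sitting > 60 min then suggest a short walk", ["prolonged_sitting"])
kb.add("r2", "if steps < 2000 by evening then suggest stretching", ["low_activity"])
shared_with_scl = kb.rules_for("prolonged_sitting")
```

With such an index the reasoner in SCL receives a handful of relevant rules instead of scanning the whole knowledge base.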
The evolution of the knowledge is achieved through two main mechanisms. On the one hand, the expert creation process can be considered a form of maintenance in itself: rules may be dynamically updated or replaced based on new health and wellness findings from experts. On the other hand, rules can be added, replaced or modified through the data-driven approach, using new life-log contents collected from different users.
Service curation layer
The service curation layer provides the means to transform the data, information and knowledge curated by DCL, ICL and KCL into actual health and wellness support services. The services are managed by the Service Orchestrator, which is in charge of attending to incoming requests, invoking the necessary services and coordinating the processes involved in the curation of the services. The requests may be of three types: scheduled on time (e.g., “every day at 8 am”), triggered by direct user queries (e.g., “suggest me an exercise plan for today’s workout”) or based on events (e.g., “user arrives at home”). The last type of request relates to the concept of situation described in previous sections: the LLM component from DCL triggers SCL whenever a situation event is identified in order to generate a new recommendation for the user.
The services needed to satisfy a given request are invoked from an extensible catalog containing reference and auxiliary services. A major reference service is devised for this architecture for the generation of personalized health and wellness recommendations. This service consists of two parts. First, generalized recommendations are developed by the Recommendation Builder component through reasoning on the user profile and life-log data provided by DCL and the knowledge supplied by KCL for the specific domain of the service. When handling a request derived from a situation detection, the indexed rules hosted by KCL are employed. Second, the recommendations undergo a personalization process through the Recommendation Interpreter in order to deliver the one that best fits the user’s interests and demands. Through this component all the potential recommendations are filtered based on the user’s preferences, conditions and possessions, as well as their actual context. Thus, for example, when the objective of the recommendation is to encourage the user to exercise, cycling would be avoided if the user does not own a bike, and a visit to the regular gym would be omitted if the person is on a business trip. Similarly, this component can delay the delivery of a given recommendation when the moment is not convenient for the user, e.g., if the person is in the middle of a meeting. Prior to being communicated to the user, the recommendation is refined for easier interpretation by including multimedia contents, and motivational and engagement strategies are incorporated to foster the user’s interest and attention.
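The filtering performed by the Recommendation Interpreter can be sketched as follows, using the bike, business-trip and meeting examples from the text. The dictionary keys (`requires`, `blocked_by`, `possessions`, `context`, `preferences`) and the preference scores are assumptions made for illustration only.

```python
def interpret(candidates, user):
    """Pick the feasible candidate the user prefers most.

    Returns (recommendation name or None, whether to deliver it now).
    """
    possessions = user.get("possessions", set())
    context = user.get("context", set())
    preferences = user.get("preferences", {})

    feasible = [
        c for c in candidates
        # Drop options that need something the user does not own ...
        if c.get("requires", set()) <= possessions
        # ... or that are ruled out by the current context.
        and not (c.get("blocked_by", set()) & context)
    ]
    if not feasible:
        return None, False

    best = max(feasible, key=lambda c: preferences.get(c["name"], 0))
    deliver_now = "in_meeting" not in context  # postpone during meetings
    return best["name"], deliver_now


candidates = [
    {"name": "cycling", "requires": {"bike"}},
    {"name": "gym session", "blocked_by": {"business_trip"}},
    {"name": "brisk walk"},
]
user = {
    "possessions": set(),                      # the user does not own a bike
    "context": {"business_trip", "in_meeting"},
    "preferences": {"cycling": 3, "gym session": 2, "brisk walk": 1},
}
recommendation, deliver_now = interpret(candidates, user)
```

Here cycling and the gym session are filtered out, so the remaining option is selected but held back until the meeting is over.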
Supporting layer
The role of SL is to enrich the overall Mining Minds functionalities through advanced analytics, interactive and personalized UI/UX, implicit and explicit feedback analysis, and adequate privacy and security mechanisms.
The Analytics module is in charge of mining, in a multi-dimensional and retrospective manner, the data sets collected and curated from multiple users to reveal population health and wellness associations, patterns and trends. These trends may refer to current facts as well as expected or future tendencies. The exploration of present trends is performed through the Descriptive Analytics, which employs statistical techniques to relate explanatory variables of the persisted data. For example, based on the analysis of the inferred lifestyles of people, it may be found that there is a growing use of hot beverages among adolescents, which in turn relates to a marked increase in stress patterns. The discovery of potential future facts is carried out by the Predictive Analytics, which builds on the outcomes of the Descriptive Analytics to make forecasts by using regression and machine learning models. Descriptive and predictive analytics contents are organized by the Visualization Enabler, which adjusts the style of the information to be communicated to the users based on their expertise and role.
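As a toy illustration of the predictive step, the sketch below fits an ordinary least-squares line to a short series and extrapolates one step ahead. The weekly values are made-up numbers chosen for readability, not Mining Minds data, and simple linear regression is only one of the regression and machine learning models the text alludes to.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical series: average daily hot beverages per adolescent, by week.
weeks = [0, 1, 2, 3]
avg_cups = [1.0, 1.5, 2.0, 2.5]

slope, intercept = fit_line(weeks, avg_cups)
forecast_week_4 = slope * 4 + intercept
```

The slope summarizes the present trend (the descriptive part), while evaluating the fitted line at a future week gives the forecast (the predictive part).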
Evaluating the services supported by Mining Minds requires feedback from the users, which is handled here by the Feedback Analysis component. The sources of feedback may be of a diverse nature, ranging from explicit feedback provided by the user, for example, through questionnaires, to implicit feedback obtained from the user’s behavioral responses. The analysis of implicit and explicit feedback is organized around three aspects: functionality, content and presentation. Functionality-based feedback refers to the findings obtained by comparing, for example, the system recommendations and the behavioral reaction of the user to those recommendations. Content-based feedback measures the user’s satisfaction with the specific information provided as part of the delivered services. Finally, presentation-based feedback measures the human-computer interaction with respect to the user interface (UI), which is of particular utility to understand the user experience (UX). All these types of feedback are devised to help assess the level of interest and adherence of users to the services provided through Mining Minds, as well as to evolve and maintain the internal contents and processes handled by the platform.
The UI/UX module adapts the interface of the end-user applications to the user’s preferences, habits or mood. This adaptation adjusts the human-computer interaction experience with respect to font size, theme or audio levels, among other characteristics. Two main components are involved in this process. First, the UI Interaction Tracker collects the data from the interaction between the person and the application to analyze the user’s ability to understand and use the system, e.g., the readability of the contents or the perceptibility of the controls. Then, the UX component measures the satisfaction level based on the analysis of the collected data. The immediate result is a dynamic adaptation of the UI based on the measurements extracted from the evaluation of the UX.
Given the sensitivity of the collected user data, privacy and security need to be assured and demonstrated, not only for storage but also during the processing and delivery of services. To that end, state-of-the-art cryptographic primitives along with home-grown protocols are considered. For secure storage, the AES standard is used, whereas for oblivious processing, homomorphic encryption and private matching are used. Considering the intensive data flow between end-user applications and Mining Minds, data randomization techniques are used to ensure high entropy and thus minimal leakage of information. An authorization model ensures that personal data and services are disclosed only to legitimate users. Slow processing of information is a common byproduct of encryption; thus, to partially offset this slowdown, sensitive and non-sensitive information is decoupled where required. Anonymization procedures are also considered to enable the use of the collected and mined users’ data by third-party agents, e.g., for research purposes.
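The decoupling of sensitive and non-sensitive information can be sketched with the Python standard library alone. The sketch assumes a fixed set of sensitive field names and uses a keyed hash (HMAC-SHA256) to replace the user identifier; this is pseudonymization, a simpler stand-in for the anonymization procedures mentioned above, and the AES encryption of the sensitive part is deliberately left out. All field names and the key are hypothetical.

```python
import hashlib
import hmac

# Hypothetical classification of record fields.
SENSITIVE_FIELDS = {"name", "heart_rate", "location"}


def pseudonymize(user_id, secret_key):
    """Derive a stable pseudonym from the user id with a keyed hash."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def decouple(record, secret_key):
    """Split a record into a sensitive part and a non-sensitive part.

    Both parts carry the same pseudonym so that they can be rejoined by an
    authorized party; the raw user id is kept in neither part.
    """
    pid = pseudonymize(record["user_id"], secret_key)
    sensitive = {k: v for k, v in record.items() if k in SENSITIVE_FIELDS}
    non_sensitive = {
        k: v for k, v in record.items()
        if k not in SENSITIVE_FIELDS and k != "user_id"
    }
    sensitive["pid"] = pid
    non_sensitive["pid"] = pid
    return sensitive, non_sensitive


key = b"demo-key-not-for-production"
record = {"user_id": "u1", "name": "Alice", "heart_rate": 72, "activity": "walking"}
sensitive_part, non_sensitive_part = decouple(record, key)
```

Only the sensitive part would then go through the slower encrypted path, while the non-sensitive part can be processed directly.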