Analytics is a very general term for correlating and processing raw data to get meaningful results. The analysis algorithms can be as simple as data reduction or averaging a stream of sensor readings, or as complex as the most sophisticated artificial intelligence and machine learning (AI / ML) systems. Today, analytics is most often performed in the cloud, which is typically the most scalable and cost-effective solution.
However, in the future, analytics will increasingly be distributed across the cloud, edge computing, and end devices to improve latency, network bandwidth, security, and reliability. Here we will discuss some of the architectures and trade-offs involved in distributing analytics beyond the confines of the traditional cloud.
How Distributed Analytics Creates Added Value
Simple analyses involve data reduction, correlation, and averaging, resulting in an output data stream that is much smaller than the input data. For example, consider a system that supplies fresh water to a large building. It might be helpful to know the pressure and flow values at different points in the system to optimize the pumps and monitor consumption. Such a system could include several pressure and flow sensors distributed along the distribution pipes. The software periodically polls the sensors, adjusts the pump settings, and creates a consumption report for the building manager.
However, the raw data from the sensors can be misleading; for example, a momentary pressure drop when a fixture is flushed. Analysis algorithms can average the readings from a given sensor over time and combine and correlate the readings from multiple sensors to create a more accurate and helpful picture of conditions in the pipelines. These readings could be sent to an analysis unit in the cloud. Still, a much more efficient architecture would have the sensors do some of the averaging themselves and local edge computers do the correlation and reporting.
This is called distributed analytics, and it can improve the efficiency, accuracy, and cost of many analytical systems.
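To make the water-supply example concrete, the following is a minimal sketch of how the work could be split: each sensor smooths its own readings, and an edge node correlates the smoothed streams before anything is sent upstream. The class names `PressureSensor` and `EdgeCorrelator` are hypothetical and used only for illustration.

```python
from collections import deque
from statistics import mean


class PressureSensor:
    """Keeps a short moving average so momentary drops (e.g., a flushed
    fixture) do not distort the reported pressure."""

    def __init__(self, window: int = 10):
        self.readings = deque(maxlen=window)

    def add_reading(self, value: float) -> None:
        self.readings.append(value)

    def smoothed(self) -> float:
        return mean(self.readings)


class EdgeCorrelator:
    """Combines smoothed values from several sensors into one small report,
    so only the summary travels upstream to the cloud."""

    def __init__(self, sensors: list[PressureSensor]):
        self.sensors = sensors

    def report(self) -> dict:
        values = [s.smoothed() for s in self.sensors]
        return {"average_pressure": mean(values),
                "min_pressure": min(values),
                "max_pressure": max(values)}


sensor = PressureSensor()
for value in (3.1, 3.0, 0.4, 3.2):   # 0.4 is a momentary drop
    sensor.add_reading(value)
print(EdgeCorrelator([sensor]).report())
```

The design point is simply that the raw, noisy stream never has to leave the building: only a small, already-correlated report does.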
Analytics becomes more complicated when AI / ML techniques are used. AI / ML usually works in two phases:
- A modeling phase in which large amounts of data are distilled to create a model for the AI / ML system
- An inference phase in which this model is applied to the data flowing in a system to produce the desired results, often in real-time
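A minimal sketch of these two phases, assuming a scikit-learn style model and synthetic data, might look like this: the modeling phase runs offline on the full training set, and the inference phase then applies the fitted model to individual readings as they arrive.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# --- Modeling phase (offline, typically in the cloud) ---
X_train = np.random.rand(1000, 4)             # historical feature data
y_train = (X_train[:, 0] > 0.5).astype(int)   # labels distilled from the data
model = LogisticRegression().fit(X_train, y_train)

# --- Inference phase (often near real time, close to the data) ---
def classify(reading: np.ndarray) -> int:
    """Apply the precomputed model to one incoming sample."""
    return int(model.predict(reading.reshape(1, -1))[0])

print(classify(np.array([0.7, 0.1, 0.3, 0.9])))
```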
With today's systems, the models are almost always created in large server farms or in the cloud, often as an offline process. The resulting AI / ML models are then packaged and sent to the various devices that carry out the inference phase on live data and generate the desired results. The inference phase can run in the cloud, but it has recently been moving toward the edge to improve latency, network bandwidth, reliability, and security. There are trade-offs to consider when deciding which level of compute resources to use for each phase.
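The key enabler for this split is that the finished model is tiny compared with the training data. The sketch below illustrates the idea with a hand-rolled logistic scoring function; the JSON layout and the `edge_infer` helper are assumptions for illustration, not a specific product's deployment format.

```python
import json
import math

# Cloud side: the finished model is just a handful of parameters
model_params = {"weights": [0.8, -1.2, 0.4], "bias": 0.1}
payload = json.dumps(model_params)   # small enough to push to every device

# Edge-device side: reconstruct the model and score live data locally
params = json.loads(payload)

def edge_infer(features):
    """Logistic-regression-style scoring using the shipped parameters."""
    z = params["bias"] + sum(w * x for w, x in zip(params["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid turns the score into a probability

print(edge_infer([0.5, 0.2, 0.9]))
```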
Inference Phase Of AI / ML
The inference phase of AI / ML can be distributed relatively easily, either between several processors at the peer level or up and down a hierarchy of processing layers. Because the models have already been precomputed, the data feeding the AI / ML algorithms can be divided between several processors and processed in parallel. Dividing the workload across multiple processors at the peer level offers capacity, performance, and scaling advantages, because more computing resources can be brought in as the workload increases. In addition, system reliability can be improved, because neighboring processors are still available to do the job if one processor fails.
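A minimal sketch of peer-level parallelism, assuming the model is already precomputed: incoming samples are simply partitioned across worker processes. The `score` function is a placeholder for whatever inference routine the system actually runs.

```python
from multiprocessing import Pool

def score(sample):
    # Placeholder inference: apply the precomputed model to one sample
    return sum(sample) / len(sample)

if __name__ == "__main__":
    batch = [[0.1, 0.2], [0.5, 0.4], [0.9, 0.8], [0.3, 0.7]]
    with Pool(processes=4) as pool:
        results = pool.map(score, batch)   # samples are scored in parallel
    print(results)
```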
Inference can also be split across several levels of a hierarchy, with different parts of the algorithm running at different levels of processing. This allows the AI / ML algorithms to be partitioned logically so that each level of the hierarchy executes the subset of the algorithm it handles most efficiently. For example, in an AI / ML system for video analytics, the intelligence in the camera could perform adaptive contrast enhancement, relay this data to edge computers for feature extraction, send the results to nearby data centers for object detection, and finally let the cloud carry out high-level functions such as threat detection or the creation of heat maps. This can be a very efficient partitioning.
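The hierarchical split just described can be sketched as a simple pipeline. Each function below is a hypothetical stand-in for the processing that would run at that tier; the point is that only ever-smaller summaries move upstream.

```python
def camera_stage(frame):
    """In-camera: adaptive contrast enhancement on the raw frame."""
    lo, hi = min(frame), max(frame)
    return [(p - lo) / (hi - lo or 1) for p in frame]

def edge_stage(enhanced_frame):
    """Edge computer: extract compact features from the enhanced frame."""
    return {"mean": sum(enhanced_frame) / len(enhanced_frame),
            "peak": max(enhanced_frame)}

def datacenter_stage(features):
    """Nearby data center: object detection on the extracted features."""
    return {"object_detected": features["peak"] > 0.9, **features}

def cloud_stage(detections):
    """Cloud: high-level functions such as threat detection or heat maps."""
    return "ALERT" if detections["object_detected"] else "OK"

frame = [12, 80, 255, 40, 200, 7]   # toy single-channel frame
print(cloud_stage(datacenter_stage(edge_stage(camera_stage(frame)))))
```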
Learning Phase Of AI / ML Algorithms
The learning phase of AI / ML algorithms is more challenging to distribute. The problem is the amount of data the learning process must see at once. To build a model, the AI / ML system takes enormous amounts of training data and processes it with the complex algorithms of the learning phase to generate a model that can then be executed relatively quickly in the inference phase. If only part of the training data is available on a particular compute node, the algorithms have difficulty generalizing the model. For this reason, training is usually carried out in the cloud, where storage space and computing capacity are practically unlimited.
However, certain scenarios require that the training algorithms be distributed across multiple compute nodes at the peer level or up and down the cloud-to-edge hierarchy. Learning at the edge, in particular, allows the learning process to collect a lot of training data from nearby sensors and act on it without the cloud being involved, which improves latency, reliability, security, and network bandwidth. Advanced distributed learning algorithms are currently being developed to address these challenges.
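One way such distributed learning can work is a federated-averaging style scheme: each edge node fits a model on its local data, and only the model parameters are averaged centrally, so the raw training data never leaves the edge. The sketch below uses a numpy least-squares fit as a stand-in for the local learner; it is an illustration of the idea, not any particular framework's algorithm.

```python
import numpy as np

def local_fit(X, y):
    """Train on one node's local slice of the data (a linear model here)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Each edge node sees only the data gathered from its own nearby sensors
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
node_models = []
for _ in range(3):
    X = rng.random((100, 2))
    y = X @ true_w + 0.01 * rng.standard_normal(100)
    node_models.append(local_fit(X, y))

# Central step: average the per-node parameters instead of pooling raw data
global_w = np.mean(node_models, axis=0)
print(global_w)   # close to the underlying weights despite partitioned data
```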
Conclusion
AI / ML is an essential capability for almost all future electronic systems. Understanding the options for how these systems' inference and training workloads can be partitioned across a hierarchy of computing resources is key to future success.