Designing a new way of optimizing complex coordinated systems

Coordinating complicated interactive systems, whether the different modes of transportation in a city or the various components that must work together to make an effective and efficient robot, is an increasingly important subject for software designers. Now researchers have developed an entirely new way of approaching these complex problems, using simple diagrams as a tool to reveal better approaches to software optimization in deep-learning models.

The researchers say the new method makes addressing these complex tasks so simple that it can be reduced to a drawing that would fit on the back of a napkin.

The new approach is described in the journal Transactions on Machine Learning Research, in a paper by incoming doctoral student Vincent Abbott and Professor Gioele Zardini of MIT's Laboratory for Information and Decision Systems (LIDS).

“We designed a new language to talk about these new systems,” says Zardini. This new diagram-based “language” is heavily based on something called category theory, he explains.

It all has to do with designing the underlying architecture of computer algorithms: the programs that will actually end up sensing and controlling the various parts of the system that is being optimized. “The components are different pieces of an algorithm, and they have to talk to each other, exchange information, but also account for energy usage, memory consumption, and so on.” Such optimizations are notoriously difficult, because each change in one part of the system can cause changes in other parts, which can in turn affect still other parts, and so on.

The researchers decided to focus on the particular class of deep-learning algorithms, which are currently a hot topic of research. Deep learning is the basis of large artificial intelligence models, including large language models such as ChatGPT and image-generation models such as Midjourney. These models manipulate data through a “deep” series of matrix multiplications interspersed with other operations. The numbers within the matrices are parameters, and they are updated during long training runs, allowing complex patterns to be found. Models consist of billions of parameters, which makes computation expensive and hence makes improved resource usage and optimization invaluable.
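To make the description above concrete, here is a minimal Python sketch of a “deep” series of matrix multiplications interspersed with another operation. The ReLU nonlinearity, the layer sizes, the random initialization and all helper names (matmul, relu, make_layers, forward, count_parameters) are assumptions chosen for illustration and do not come from the paper. Training, which is what actually updates the parameters, is left out.

```python
import random


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    """Element-wise max(0, v): the 'other operation' between multiplications."""
    return [[max(0.0, v) for v in row] for row in m]


def make_layers(sizes, seed=0):
    """Create one weight matrix per consecutive pair of layer sizes (hypothetical setup)."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(x, layers):
    """Apply matmul then ReLU for every layer except the last, which is matmul only."""
    for w in layers[:-1]:
        x = relu(matmul(x, w))
    return matmul(x, layers[-1])


def count_parameters(layers):
    """Every number stored in a weight matrix is one parameter."""
    return sum(len(w) * len(w[0]) for w in layers)


layers = make_layers([4, 8, 8, 2])      # three weight matrices
batch = [[1.0, 0.5, -0.5, 2.0]]         # one input row with 4 features
output = forward(batch, layers)         # one output row with 2 values
n_params = count_parameters(layers)     # 4*8 + 8*8 + 8*2 = 112
```

Even this toy network holds 112 parameters. Production models apply the same pattern at the scale of billions of parameters, which is why resource usage matters so much.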

The diagrams can represent the details of the parallelized operations that deep-learning models consist of, revealing the relationships between algorithms and the parallelized graphics processing unit (GPU) hardware they run on, supplied by companies such as NVIDIA. “I’m very excited about this,” says Zardini, because “we seem to have found a language that very nicely describes deep-learning algorithms, explicitly representing all the important things, which is the operators you use,” for example the energy consumption, the memory allocation, and any other parameter that you are trying to optimize for.

Much of the progress within deep learning has stemmed from resource-efficiency optimizations. The latest DeepSeek model showed that a small team can compete with top models from OpenAI and other major labs by focusing on resource efficiency and the relationship between software and hardware. Typically, in deriving these optimizations, he says, “people need a lot of trial and error to discover new architectures.” For example, a widely used optimization program called FlashAttention took more than four years to develop, he says. But with the new framework they developed, “we can really approach this problem in a more formal way.” And all of this is represented visually in a precisely defined graphical language.

But the methods that have been used to find these improvements “are very limited,” he says. “I think this shows that there’s a major gap, in that we don’t have a formal systematic method of relating an algorithm to either its optimal execution, or even really understanding how many resources it will take to run.” But now, with the new diagram-based method they devised, such a system exists.

Category theory, which underlies this approach, is a way of mathematically describing the different components of a system and how they interact in a generalized, abstract manner. Different perspectives can be related. For example, mathematical formulas can be related to the algorithms that implement them and use resources, or descriptions of systems can be related to robust “string diagrams.”

“Category theory can be thought of as the mathematics of abstraction and composition,” says Abbott. “Any compositional system can be described using category theory, and the relationship between compositional systems can then also be studied.” Algebraic rules that are typically associated with functions can also be represented as diagrams, he says. “Then, a lot of the visual tricks we can do with diagrams, we can relate to algebraic tricks and functions. So, it creates this correspondence between these different systems.”

As a result, he says, “this solves a very important problem, which is that we have these deep-learning algorithms, but they’re not clearly understood as mathematical models.” But by representing them as diagrams, it becomes possible to approach them formally and systematically, he says.

One thing this enables is a clear visual understanding of the way parallel real-world processes can be represented by parallel processing in multicore computer GPUs. “In this way,” says Abbott, “diagrams can both represent a function, and then reveal how to optimally execute it on a GPU.”

The “attention” algorithm is used by deep-learning algorithms that require general, contextual information, and it is a key phase of the serialized blocks that make up large language models such as ChatGPT. FlashAttention is an optimization that took years to develop, but it resulted in a sixfold improvement in the speed of attention algorithms.
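For readers unfamiliar with attention, the following Python sketch computes it for a single query in two ways: directly, and block by block while keeping a running maximum and a running normalizer. The blockwise arithmetic is the “online softmax” idea commonly associated with FlashAttention. It is offered only as an illustration under stated assumptions: the function names, the block size and the toy data are hypothetical, the usual 1/sqrt(d) scaling is omitted, and this is neither the diagram notation from the paper nor an actual GPU kernel.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_reference(q, keys, values):
    """Standard attention for one query: softmax of q.k scores, then a weighted sum of values."""
    scores = [dot(q, k) for k in keys]
    m = max(scores)                               # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]


def attention_blockwise(q, keys, values, block_size=2):
    """Same result, but keys and values are visited one block at a time.

    Only a running max, a running normalizer and a running weighted sum are kept,
    so the full list of scores never has to be stored at once.
    """
    dim = len(values[0])
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * dim
    for start in range(0, len(keys), block_size):
        k_block = keys[start:start + block_size]
        v_block = values[start:start + block_size]
        scores = [dot(q, k) for k in k_block]
        new_max = max(running_max, max(scores))
        # Rescale earlier partial results to the new max (the factor is 0.0 on the first block).
        scale = math.exp(running_max - new_max)
        running_sum *= scale
        acc = [a * scale for a in acc]
        for s, v in zip(scores, v_block):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * vj for a, vj in zip(acc, v)]
        running_max = new_max
    return [a / running_sum for a in acc]


q = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0], [0.5, 0.5]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]

ref = attention_reference(q, keys, values)
blk = attention_blockwise(q, keys, values, block_size=2)
same = all(abs(a - b) < 1e-9 for a, b in zip(ref, blk))
```

Both functions return the same vector up to floating-point rounding. The real FlashAttention kernels additionally tile the queries and manage GPU memory explicitly, which this sketch does not attempt.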

Applying their method to the well-established FlashAttention algorithm, Zardini says that “here we are able to derive it, literally, on a napkin.” He then adds, “OK, maybe it’s a large napkin.” But to drive home the point about how much their new approach can simplify dealing with these complex algorithms, they titled their formal research paper on the work “FlashAttention on a Napkin.”

This method, Abbott says, “allows for optimization to be really quickly derived, in contrast to prevailing methods.” While they initially applied this approach to the already existing FlashAttention algorithm, thus verifying its effectiveness, “we hope to now use this language to automate the detection of improvements,” says Zardini, who, in addition to being a principal investigator in LIDS, is the Rudge and Nancy Allen Assistant Professor of Civil and Environmental Engineering and an affiliate faculty member with the Institute for Data, Systems, and Society.

The plan is that ultimately, he says, they will develop the software to the point that “the researcher uploads their code, and with the new algorithm you automatically detect what can be improved, what can be optimized, and you return an optimized version of the algorithm to the user.”

In addition to automating algorithm optimization, Zardini notes that a robust analysis of how deep-learning algorithms relate to hardware resource usage allows for systematic co-design of hardware and software. This line of work integrates with Zardini’s focus on categorical co-design, which uses the tools of category theory to simultaneously optimize various components of engineered systems.

Abbott says that “this whole field of optimized deep-learning models, I believe, is quite critically unaddressed, and that’s why these diagrams are so exciting. They open the doors to a systematic approach to this problem.”

“I’m very impressed by the quality of this research. … The new approach to diagramming deep-learning algorithms used by this paper could be a very significant step,” says Jeremy Howard, founder and CEO of Answer.AI. “This paper is the first time I’ve seen such a notation used to deeply analyze the performance of a deep-learning algorithm in practice.”

“This is a beautifully executed piece of theoretical research, which also aims for high accessibility to uninitiated readers, a trait rarely seen in papers of this kind,” says Petar Velickovic, a senior research scientist at Google DeepMind and a lecturer at Cambridge University, who was not associated with this work. These researchers, he says, “are clearly excellent communicators, and I cannot wait to see what they come up with next!”

The new diagram-based language, having been posted online, has already attracted great attention and interest from software developers. A reviewer of Abbott’s prior paper introducing the diagrams noted that “the proposed neural circuit diagrams look great from an artistic standpoint (as far as I am able to judge this).” “It’s technical research, but it’s also flashy!” Zardini says.
