The IT field gets regularly polluted with buzzwords to fuel “emerging tech” that vows to be the next big thing, often with suspicious premises to be honest, and among them one is trying to find its way under the term “Granular Computing”. If you’ve never heard of it, do not panic: you’re not missing out on anything new really. Software developers have been doing it under different forms since the very first day they started programming, even without knowing it. In fact, it is strongly related to the concept of abstraction. The mere act of thinking can be seen as a granular process. Granular Computing is not a technology but more of a philosophical concept, although many works have tried to formalize it to some extent. Generally it is described as an approach to computing for dealing with complex information entities. The next sections are an attempt to illustrate what it is all about, with a specific focus on its relationship to software engineering.

## An overview of Granular Computing

Granular Computing is a broad term encompassing a variety of methodologies from several fields that share many commonalities, with AI probably being the most notable one. Apparently, it seems to have its roots in the fuzzy set theory, and L.A.Zadeh (the father of fuzzy logic) is credited for having coined the term. The word “granular” derives from the three fundamental concepts of this paradigm: granules, granularity and granulation. Granules are the basic components for the representation of knowledge at different levels of detail. An information granule is a semantically meaningful group of elements that share some similarity. They arise naturally in the application of abstraction methods as the building blocks of complex systems or in the processing of data. Clusters, subsets and classes are typical examples of granules grouped by some form of similarity such as distance, affinity and functionality.

Granulation refers to the process of generating information granules and the associated structures by means of specific algorithms, which may vary according to the domain of application. However, two common operations can generally be defined in the granulation process: composition, where multiple granules with similar features are combined into a single granule, and decomposition, where a granule is broken down into multiple smaller granules. Composition is a process of synthesis as it aggregates information at a finer level of details and produces more abstract entities, making it a bottom-up method. Decomposition, on the other hand, is a process of analysis wherein coarse granules are broken down into smaller entities at a finer level of detail, thus making it a top-down approach.

The process of granulation, whether top-down or bottom-up, generates level hierarchy structures, where each level represents a granulated view of the entity object of study at a specific degree of detail (granularity). Figure 1 shows an example of granular hierarchy produced by an application of these methods. At each level of granularity the single granules possess a determined set of attributes and are associated to other units by means of specific relationships, usually based on some form of similarity. The relation $$R$$ is a bidirectional correspondence that represents the link between granules at different levels. In the top-down direction this relation can be seen as the explicitation of a granule, that is a decomposition operation, while in the bottom-up direction as a generalization of associated granules, that is a composition operation.

More formally, a granule $$G$$ is a composite entity made of smaller objects $$g$$ (sub-granules) each represented by a set of attributes $$\left\langle a_1,…,a_p \right\rangle = A_g$$, where the configuration of attributes determines the state (or behavior) of the granule. The relationships between granules at the same level (also known as intra-relationships) can be expressed in terms of some similarity function $$s(g_i, g_j)$$ that measures how they are close to each other. The relationships between granules at different levels (also known as inter-relationships) can be expressed in terms of the bi-directional correspondence R, defined as follows

$$R \overset{\Delta}{=} \begin{cases} g_{i}^{\ell-1} \leftarrow R_\uparrow(\{g_1^\ell,…,g_n^\ell\}) = R_\uparrow(G^\ell) \\ G^\ell = \{g_1^\ell,…,g_n^\ell\} \leftarrow R_\downarrow(g_i^{\ell-1}) \end{cases} \tag{1}$$

where $$R_\uparrow : L_\ell \to L_{\ell-1}$$ indicates the bottom-up relation that synthesizes a granule $$g$$ at level $$L_{\ell-1}$$ from granules at level $$L_\ell$$, aggregated by similarity, and $$R_\downarrow : L_{\ell-1} \to L_\ell$$ indicates the top-bottom relation that decomposes a granule $$g_i$$ at level $$L_{\ell-1}$$ into a finer granule $$G$$ at level $$L_\ell$$. These operations are also called coarsening and refinement respectively.

The choice of the similarity function determines how the granules are grouped together, and thus which attributes they ultimately have in common. Since a granule is characterized by the attributes shared by its sub-granules, this means that the choice of a particular similarity function entails looking at a specific aspect of a system while ignoring others. The consequence is that information may be purposely discarded and R can become a “lossy” operation. In other words, if can define an operator $$\Vert G^\ell \Vert$$ that represents the “information content” of a granule at some level (e.g. number of attributes, instructions, etc.), then the following relation holds in general

$$\Vert G^{\ell-1} \Vert \leq \Vert G^\ell \Vert \tag{2}$$

Obviously, the above expression of an object being more or less granular than another only makes sense in relation to the choice of R, which exactly defines the correspondence between granules at different levels. Also, from a semantic point of view $$G^{\ell-1}$$ and $$G^\ell$$ must be equivalent. That is, if there existed an operation $$S(A_g)$$ that measures the semantic content of a granule with respect to some attributes at any given level, then the relation R must preserve such measure, or in other words, R must be an isomorphism with respect to S.

Put it simply, although coarser granules may contain less information than the corresponding finer granules, they both must represent the same “concept”, otherwise the implementation wouldn’t be correct. This implies that there is a limit on the amount (and type) of information that can be discarded from granules, which must still contain sufficient data to maintain semantic consistency. The lowest level of the hierarchy represents the implementation of the solution to the problem. At this level the granules are considered atomic, that is they can’t be further decomposed.

The structure in Figure 1 describes a system in terms of detail, that is the amount of information used to represent it at a specific level of granularity. However, we can also use an expanded view in terms of concreteness, which indicates the level of physicality of a representation in the detail dimension, that is how close it is to a real-world realization using physical objects. This structure can then be augmented by adding another dimension orthogonal to that of detail, as depicted in Figure 2.

Moving to a plane downwards will increase the level of concreteness up to the limit where the system is represented by a physical realization. Conversely, moving to planes upwards will increase the level of generalization up to the limit of pure computational level where the system is represented using pure abstract constructs (e.g. mathematical equations, informal language).

The level in between is an algorithmic representation where the system is defined using a series of instructions expressed in a symbolic language that precisely describe how the output is generated from the input. Each granule at this level can be associated to many computational and physical representations. For example, a system can be described using different mathematical equations or specification languages and be physically implemented on a variety of hardware architectures. Note that although only three levels are depicted in the representation of Figure 2, in real-world scenarios there may be other intermediate levels that could still be classified as one of those three types.

## Granular computing in software engineering

In software engineering, all these ideas occur in the application of the method of abstraction, which is one of the fundamental concept in computer programming. Abstracting the idiosyncrasies of the machines has been the main endeavor of software development since the dawn of digital computers and the concept of information granules is seamlessly applicable in this context. In fact, from an algorithmic point of view, all constructs used in computer programs can be seen as granules arranged together according to some similarity criterion. At the lowest implementation level these granules take the form of bits, conveniently grouped in bytes, machine words and instructions. Instructions are functionally grouped in procedures which, in turn, are semantically grouped in modules and, at the highest level, logically in programs.

On the algorithmic plane, details are increasingly discarded (hidden, to be precise) at coarser levels of granularity in order to translate the implementation of algorithms from the way human think to the way machines operate. At the same time, the design of a software system proceeds at varying level of concreteness from the most abstract computational level describing the “what”, by means of general specifications, to the algorithmic level defining the “how”, where the solution to the problem is precisely formalized in terms of operations to be performed, and down to the physical realization on specific hardware. As an example, consider the problem of improving the quality of a signal being output by a device prior to feeding it to another device. One possible computational representation can be the following

$$s_d(t) = F(s(t) + n(t)) = \begin{cases} \int_{-\infty}^{+\infty} h(\tau)s_n(t-\tau) d\tau \\ h(t) = ke^{-t^2 / 2\sigma ^ 2} \end{cases}$$

which is in mathematical form, or in equivalent plain English specifications as follows

1. The signal is corrupted by normally distributed additive noise
2. The signal is processed with a Gaussian noise reduction filter

At the topmost algorithmic level, the “denoise signal” solution could be very coarsely expressed using a high-level domain-specific language that, in a very expressive manner, describes the steps to be performed. By using a lower level general purpose language, such as C, more detail can be added to represent the algorithms with increased granularity. At an even more detailed degree, a very low level representation may be achieved using an Assembly language, until we reach the bottom level of the machine code where granules represent single bits. Figure 3 shows the granular hierarchy of this example.

It is evident how the semantics encoded in the algorithm becomes more cryptic the more we get closer to the implementation level, where it represents the nature of the machine. However, such semantics are preserved between levels, even if the granules (instructions or block of instructions with associated variables) have different representations. The two loop blocks represent the creation of a Gaussian filter and a convolution operation respectively, regardless of whether they are expressed in C, Assembly or machine code. And within these blocks, single instructions represent equations, algebraic operations and conditions equally well at each level. The similarity here is expressed in terms of “computational proximity” (all instructions needed to carry out a procedure are sequentially grouped together) and the relation R is a translation operation from one language to another.

Granular Computing plays a big role in the simplification of software development by reducing the friction in the human-machine interface. Humans perceive and interact with a complex fuzzy world that they model and manipulate through abstract concepts represented by a powerful (natural) language. This contrast with the deterministic simple nature of computers that only understand two symbols. There is a huge impedance mismatch between these two worlds. Manipulating complex abstract concepts with such a poorly expressive language is an enormous mental effort. The development of high level languages has provided increasing levels of simplification by introducing abstract constructs that the programmer can use to easily manipulate complex real-world entities at different degrees of detail, without even thinking about bits and registers.

A different incarnation of this computing paradigm can be found in object-oriented programming. Class hierarchies, abstract data types and polymorphism are all forms of granular computing, although their purpose is not that of “humanizing” the interface to machines but rather of organizing computer programs in such a way that it becomes easy to develop, use and maintain them. A class with its associated attributes and methods is essentially a granule where the intrarelationships are expressed in terms of similar functionality that are logically grouped together. A consequence of discriminating by functionality is the separation of concerns that allows defining clear operational boundaries between objects, thus avoiding “spaghetti code”. Not only OOP in particular, but structured programming in general is a method of “computing with granules”.

As an example, Figure 4 shows a possible object-oriented representation of media objects using a class hierarchy. The granular structure of this representation is immediately apparent and arises naturally in the process of refactoring functionalities common to formally different but semantically identical entities. In fact, audio, video, image and text have different forms but they all represent the same concept of “media object”, that is something that can be read, written, rendered, encoded and decoded. The relation R in this case is an inheritance operation, which is called specialization when top-bottom and generalization (or abstraction) when bottom-up.

But Granular Computing goes beyond the implementation aspect of software development. In fact, another, and probably more interesting, definition of Granular Computing is as a way of problem solving. It allows structured thinking at varying scales of specificity that facilitates the search for a solution from different points of view when dealing with complex entities. For example, business analysts can reason in terms of components of the system, their interfaces and interaction patterns, with business processes in mind, while coders on a more detailed scale can reason in terms of computer processes, threads, resource management, data structures, networking, etc. with focus on specific technologies. This allows the creation of formal representations of the abstracted entity describing its behavior and structure (i.e. models) at a level of granularity that’s the most appropriate to solve the problem.

Another interesting aspect of the application of Granular Computing lies in the fact that the representation of information granules at different levels of specificity means that the interaction patterns also follow the same level of detail. That is, we can observe different dynamics of a system at different scales by changing the level of granularity. This consequence is known as the emergent behavior effect and is of great importance in the study of large complex systems. In software development this effect is exploited for debugging and/or profiling by observing the response of computational entities with varying granularity, such as functions, objects, modules, threads, processes, etc.

And last, but not least, Granular Computing also occurs in software testing. Quality control of software systems, in fact, is done at different scales. At the lowest level of granularity unit testing is performed by controlling the response of code blocks (functions/methods) using code-related primitives as metrics (code coverage, exceptions, etc.) to guarantee structural integrity. On a higher level of detail, functional testing ignores code blocks and concerns with logical components to guarantee that application logic is carried out as per specifications using coarser metrics such as defect rate, usability, accessibility, etc.. And at the highest level of granularity, behavioral tests are performed on the system as a whole to guarantee the adherence to the business scope using business metrics such as visualizations, sales, installations, signups, etc.

## Conclusion

The concepts behind Granular Computing are not really new, as they have historically occurred in many fields in various forms, such as clustering in data mining, scale-space representation in DSP, dimension reduction in statistics, deep learning in AI, speech analysis in NLP, and many others. It is not a novel paradigm but rather the organization of these common ideas that have different implementations in different domains into a discipline for structured thinking and design. The examples given in this article show how Granular Computing is present in software engineering too in many shapes.

As technology to create artificial intelligence (AI) progresses, software systems will be transitioning from a machine-centric nature to a human-centric one. These intelligent systems will be inherently knowledge-intensive and characterized by the ability to understand and manipulate abstract concepts that are naturally used by humans. Inevitably, software development will be impacted in the approach to information processing as the manipulation of knowledge requires handling more complexity. Granular Computing certainly represents an interesting philosophy from which to draw inspiration to design such systems.

Published inSoftware Engineering