Dubbo’s Cloud-Native Transformation: Analysis of Application-Level Service Discovery

By Liu Jun (Lugui), Apache Dubbo PMC


Since this new mechanism is so important, how does it work? Today, I will explain it in detail. In the initial community version, we gave this mechanism a mysterious name, that is, service introspection. I will further explain the origin of this name in the following sections and use “service introspection” to refer to this new application-level service discovery mechanism.

Developers familiar with Dubbo know that it has always defined services around Remote Procedure Call (RPC) methods. This is also the basis for Dubbo’s friendly and powerful governance features. So why do we need an additional application-level service discovery mechanism? How does it work, and how does it differ from the existing one? What benefits does it bring, particularly for cloud-native adaptation and performance?

With all these questions in mind, let’s begin.

What Is Service Introspection?

  1. Which kind of model embodies application-level service discovery, and what is the difference between this model and the existing Dubbo service discovery model?
  2. Why do we call it service introspection?

The so-called “application and instance granularity” or “RPC service granularity” stresses a data organization format for address discovery.

Take Dubbo’s current address discovery data format as an example. It is an “RPC service granularity” format that uses an RPC service as the key and the instance list as the value to organize data:

{
  "RPC Service1": [
    {"name":"instance1", "ip":"", "metadata":{"timeout":1000}},
    {"name":"instance2", "ip":"", "metadata":{"timeout":2000}},
    {"name":"instance3", "ip":"", "metadata":{"timeout":3000}}
  ],
  "RPC Service2": [Instance list of RPC Service2],
  "RPC ServiceN": [Instance list of RPC ServiceN]
}

The new “application granularity-based service discovery” mechanism uses an application name as the key and the list of instances deployed by the application as the value. As a result, this introduces two differences:

  1. The data mapping relationship changes from RPC Service -> Instance to Application -> Instance.
  2. Less data is involved, and the registration center does not include the RPC service and its configuration information.

{
  "application1": [
    {"name":"instance1", "ip":"", "metadata":{}},
    {"name":"instance2", "ip":"", "metadata":{}},
    {"name":"instanceN", "ip":"", "metadata":{}}
  ]
}

To further understand the changes brought about by the new model, let’s take a look at the relationship between applications and RPC services. Typically, one application defines multiple RPC services. Therefore, Dubbo’s previous service discovery granularity is finer, and the number of data entries generated in the registration center is proportional to the number of RPC services. This also results in some data redundancy, because every RPC service repeats the instance list of its application.
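To make the difference in data volume concrete, the following minimal sketch builds both registry layouts for one application. It is plain Python rather than Dubbo code, and the application, service, and instance names are assumptions chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical deployment: one application exposing three RPC services
# on two instances. None of these names come from a real registry.
APP_SERVICES = {"demo-app": ["DemoService1", "DemoService2", "DemoService3"]}
APP_INSTANCES = {"demo-app": ["instance1", "instance2"]}


def interface_level_registry(app_services, app_instances):
    """RPC service -> instance list (the existing Dubbo layout)."""
    registry = defaultdict(list)
    for app, services in app_services.items():
        for service in services:
            # Every service repeats the full instance list of its application.
            registry[service].extend(app_instances[app])
    return dict(registry)


def application_level_registry(app_instances):
    """Application -> instance list (the new layout)."""
    return {app: list(instances) for app, instances in app_instances.items()}


def entry_count(registry):
    """Total number of instance entries stored in a registry layout."""
    return sum(len(instances) for instances in registry.values())


by_interface = interface_level_registry(APP_SERVICES, APP_INSTANCES)
by_application = application_level_registry(APP_INSTANCES)
print(entry_count(by_interface), entry_count(by_application))  # prints "6 2"
```

With three services on two instances, the interface-level layout stores six instance entries while the application-level layout stores two, which is the redundancy described above.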

After we briefly go through the basic working mechanism of application-level service discovery, let’s see why it is called “service introspection.”

To answer this, let’s again begin with the working principle. As previously mentioned, the data model of application-level service discovery introduces the following changes: the data volume of the registration center declines, RPC service-related data is removed from the registration center, and only application-level data and instance-level data are retained. To ensure that the consumer side can still correctly perceive the RPC service data that is now absent, we have established a separate communication channel between the consumer and the provider. In this channel, the consumer and the provider exchange information through specific ports. We regard the provider actively exposing its own information as an introspection mechanism. From this perspective, we name the entire mechanism “service introspection”.

Why Do We Need Service Introspection?

  • The model is aligned with mainstream microservice models in the industry, such as Spring Cloud and Kubernetes Native Service.
  • The model helps improve performance and scalability. The reorganization (reduction) of the registration center data can minimize the storage and push pressure on the registration center, reducing the address calculation pressure on the Dubbo consumer. Meanwhile, the cluster size becomes predictable and assessable. The size is independent of the number of RPC interfaces and only dependent on the scale of instance deployment.

1. Align With Mainstream Microservice Models

  • One is the automatic synchronization of the instance address because the service consumer needs to know the address to establish a connection.
  • The other is the automatic synchronization of the RPC method definition. The service consumer needs to know the specific definition of the RPC service, regardless of whether the service mode is representational state transfer (REST) or remote method invocation (RMI).

For data synchronization between RPC instances with the help of the registration center, REST has defined an interesting maturity model, which is described in a separate reference article.

According to the definition of 4-level maturity in the referenced article, Dubbo’s current interface-level model corresponds to level 4.

Next, let’s see how Dubbo, Spring Cloud, and Kubernetes are designed around the goal of automated instance address discovery.

2. Spring Cloud

Currently, RPC service information is negotiated by offline agreement or offline management systems. The pros and cons of this schema are summarized below:

  • Advantages: The deployment structure is clear and the workload of the address push is low.
  • Disadvantages: Address subscription requires the consumer to specify the provider application name, so provider application changes (splits) must be made known to the consumer. In addition, RPC calls cannot be synchronized automatically.

3. Dubbo

4. Dubbo + Kubernetes

  • Service registration is taken over by the platform. As a result, the provider no longer needs to care about service registration.
  • Service discovery on the consumer side will be Dubbo’s focus. By interfacing with API Server and the Domain Name System (DNS) at the platform layer, the Dubbo client can query a set of endpoints (a group of pods that run the provider) through a service name (usually corresponding to the application name) and trigger Dubbo’s built-in load balancing capability by mapping the endpoints to Dubbo’s internal address list.

A Kubernetes service is an abstract concept, and how to map it to Dubbo is worth discussing.

In the case of mapping Service Name to Application Name, Dubbo applications have a one-to-one correspondence to Kubernetes services. Moreover, these applications are transparent to microservice O&M and construction and are decoupled from the development stage.

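apiVersion: v1
kind: Service
metadata:
  name: provider-app-name
spec:
  selector:
    app: provider-app-name
  ports:
    - protocol: TCP
      port: 9376   # assumed value; the required "port" field is missing from the listing
      targetPort: 9376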

In the case of mapping Service Name to Dubbo RPC Service, Kubernetes maintains the binding of the scheduled service and the application’s built-in RPC service, so the number of services to be maintained increases.

apiVersion: v1
kind: Service
metadata:
  name: rpc-service-1
spec:
  selector:
    app: provider-app-name
  ports: ##
---
apiVersion: v1
kind: Service
metadata:
  name: rpc-service-2
spec:
  selector:
    app: provider-app-name
  ports: ##
---
apiVersion: v1
kind: Service
metadata:
  name: rpc-service-N
spec:
  selector:
    app: provider-app-name
  ports: ##

Based on the analysis of the preceding different microservice framework models, we can find that in the abstract definition of microservices, Dubbo is quite different from other products, such as Spring Cloud and Kubernetes. Spring Cloud and Kubernetes adopt similar microservice model abstraction methods. The two products only care about the synchronization of instance addresses. If we look into some other service framework products, we will find that most of them are designed in the same way, that is, at level 3 in the REST maturity model.

In contrast, Dubbo is special as its design aims at the granularity of RPC services. It corresponds to level 4 in the REST maturity model.

As shown in the detailed analysis of each model, each model has its pros and cons. The reason why we believed that Dubbo had to make changes and align itself with other microservice discovery models was that when we first determined Dubbo’s cloud-native solution, we found that Dubbo needed to support Kubernetes Native Service, which requires model alignment as a prerequisite. Another reason is the demand from the user side for Dubbo’s scenario-based engineering practices. Thanks to Dubbo’s support for multi-registration and multi-protocol capabilities, Dubbo can connect different microservice systems. However, the inconsistency of service discovery models has become one of the obstacles.

5. Microservice Clusters at a Larger Scale: Solve Performance Bottlenecks

The left side of the figure shows the typical workflow of a microservices framework. In this framework, the provider and consumer implement automated address notification through the registration center. The table in the figure shows the provider instance information.

The application DEMO contains three interfaces, DemoService 1, 2, and 3. The IP address of the current instance is

  • For Spring Cloud and Kubernetes models, the registration center stores only one piece of data, DEMO —
  • For the old Dubbo model, the registration center stores three pieces of interface-level data corresponding to interfaces DemoService 1, 2, and 3. In this case, lots of repeated address data occur.

We can conclude that the amount of data stored and pushed by the application granularity-based model is proportional to the number of applications and instances. Only when the number of applications or the number of application instances increases will the pressure of address push increase.

For an interface granularity-based model, the amount of data is positively correlated with the number of interfaces. Given that an application usually carries multiple interfaces, the data volume of the interface-level model is roughly that of the application-level model multiplied by the number of interfaces per application. Another key point is that interface granularity makes the cluster size hard to evaluate. The growth in the number of instances and applications is usually included in O&M planning, whereas defining interfaces is an internal behavior of the service side, so it can bypass that evaluation and still put pressure on the cluster.

Take a consumer-side service subscription as an example. According to my rough statistics on some medium- to large-scale leading Dubbo users in the community, a typical consumer application consumes (subscribes to) more than 10 provider applications, and the number of interfaces it consumes (subscribes to) reaches 30. On average, three of the interfaces that a consumer subscribes to come from the same provider application. Therefore, if application granularity is used as the basic unit of address notification and addressing, the average address push and calculation volume will drop by more than 60%.

In extreme cases, when more consumer-side consumption interfaces come from the same application, the address push and memory consumption volume will be further reduced, with a potential reduction of even more than 80%.

A typical scenario is the gateway application in the Dubbo system. Some gateway applications consume (subscribe to) more than 100 applications, while the number of consumed (subscribed) services is more than 1,000. On average, 10 interfaces come from the same application. If we change the granularity of address push and calculation to the application level, the amount of address push will change from n × 1,000 to n × 100, a reduction of nearly 90%.
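The percentages above can be reproduced with a small calculation. This is a sketch under one simplifying assumption that the article does not state: every provider application has the same number of instances n, so n cancels out and only the ratio of applications to interfaces matters. The function name is hypothetical.

```python
def push_reduction(subscribed_interfaces, provider_applications):
    """Fraction of address pushes saved by moving to application granularity.

    Assumes every provider application has the same number of instances n,
    so the interface-level volume is n * interfaces and the application-level
    volume is n * applications.
    """
    if subscribed_interfaces <= 0 or provider_applications <= 0:
        raise ValueError("counts must be positive")
    return 1 - provider_applications / subscribed_interfaces


# Figures taken from the two scenarios in the text.
typical = push_reduction(subscribed_interfaces=30, provider_applications=10)
gateway = push_reduction(subscribed_interfaces=1000, provider_applications=100)
print(f"typical consumer: {typical:.0%}, gateway: {gateway:.0%}")
# prints "typical consumer: 67%, gateway: 90%"
```

The results agree with the estimates of "more than 60%" for a typical consumer and "nearly 90%" for a gateway.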

Working Principle

1. Design Guidelines

In my opinion, we must continue to adhere to the following design guidelines during service model migration:

  • The migration to the new service discovery model must be imperceptible to existing Dubbo consumer-side developers. Dubbo must remain oriented toward RPC service programming and RPC service governance, and the change must be completely transparent to the user side.
  • Meanwhile, Dubbo needs to develop an automatic RPC service metadata coordination mechanism between the consumer and the provider to solve the problem that traditional microservice models cannot synchronize RPC-level interface configurations.

2. Detailed Explanation of Basic Principles

The main differences are listed below:

  • The registration center data is organized in the format of the “application-instance list” and no longer contains RPC service information.

The following example shows the metadata of each instance. The general principle is that the metadata contains only information related to the current instance node and excludes RPC service-level information.

The general information mainly contains these items: the instance address, instance environment variables, the metadata of MetadataService, and several other necessary properties.

{
  "name": "provider-app-name",
  "id": "",
  "address": "",
  "port": 20880,
  "sslPort": null,
  "payload": {
    "id": null,
    "name": "provider-app-name",
    "metadata": {
      "metadataService": "{\"dubbo\":{\"version\":\"1.0.0\",\"dubbo\":\"2.0.2\",\"release\":\"2.7.5\",\"port\":\"20881\"}}",
      "endpoints": "[{\"port\":20880,\"protocol\":\"dubbo\"}]",
      "storage-type": "local",
      "revision": "6785535733750099598"
    }
  },
  "registrationTimeUTC": 1583461240877,
  "serviceType": "DYNAMIC",
  "uriSpec": null
}

  • The client and the server negotiate the RPC method information by themselves.

After the registration center no longer synchronizes the RPC service information, service introspection sets up a built-in RPC service information negotiation mechanism between the service consumer and the provider. This reflects the origin of the name “service introspection”. The server-side instance exposes a predefined MetadataService RPC service, and the consumer obtains the configuration information related to the RPC method of each instance by calling MetadataService.

Currently, the format of data returned by MetadataService is:

"dubbo:// anyhost=true&application=demo-provider&deprecated=false&dubbo=2.0.2&dynamic=true&generic=false&interface=org.apache.dubbo.demo.DemoService&methods=sayHello&pid=9585&release=2.7.5&side=provider&timestamp=1583469714314",
"dubbo:// anyhost=true&application=demo-provider&deprecated=false&dubbo=2.0.2&dynamic=true&generic=false&interface=org.apache.dubbo.demo.DemoService&methods=sayHello&pid=9585&release=2.7.5&side=provider&timestamp=1583469714314",
"dubbo:// anyhost=true&application=demo-provider&deprecated=false&dubbo=2.0.2&dynamic=true&generic=false&interface=org.apache.dubbo.demo.DemoService&methods=sayHello&pid=9585&release=2.7.5&side=provider&timestamp=1583469714314"

Developers familiar with Dubbo’s RPC service granularity-based service discovery model will notice that the service introspection mechanism splits the uniform resource locator (URL) that used to be transmitted by the registration center into two parts:

  • One part of the data related to the instance is still kept in the registration center, such as the IP address, port number, and machine identifier.
  • The other part of data related to RPC methods is removed from the registration center and exposed to the consumer through MetadataService.

Ideally, a URL could be strictly divided into instance-related data and RPC service-related data. However, the current implementation still contains redundant data, and some data is not divided cleanly. This is especially true for MetadataService: as the sample above shows, it returns a list of complete URLs that still contain the full data.
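As a rough illustration of the ideal split, the sketch below divides the parameters of the sample URL into an instance-related part and an RPC service-related part. The choice of which keys count as service-level is an assumption made for this example, not Dubbo’s actual classification, and the function name is hypothetical.

```python
from urllib.parse import parse_qsl

# Query string copied from the sample URL above; the host part is elided there.
SAMPLE_QUERY = (
    "anyhost=true&application=demo-provider&deprecated=false&dubbo=2.0.2"
    "&dynamic=true&generic=false&interface=org.apache.dubbo.demo.DemoService"
    "&methods=sayHello&pid=9585&release=2.7.5&side=provider"
    "&timestamp=1583469714314"
)

# Assumption: this key set is a guess for illustration only.
SERVICE_LEVEL_KEYS = {"interface", "methods", "deprecated", "generic", "dynamic"}


def split_parameters(query):
    """Return (instance_params, service_params) for one URL query string."""
    instance_params, service_params = {}, {}
    for key, value in parse_qsl(query):
        target = service_params if key in SERVICE_LEVEL_KEYS else instance_params
        target[key] = value
    return instance_params, service_params


instance_params, service_params = split_parameters(SAMPLE_QUERY)
print(sorted(service_params))
# prints ['deprecated', 'dynamic', 'generic', 'interface', 'methods']
```

In a strict division of this kind, only the first dictionary would travel through the registration center, and the second would be served by MetadataService.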

The following figure shows the complete workflow for service introspection, which details the collaboration process among service registration, service discovery, MetadataService, and RPC calls.

  1. When the service provider starts, it first parses the “ordinary services” defined by the application and registers them one by one as RPC services. Then, it registers the built-in MetadataService and finally opens the TCP listening port.
  2. After the service provider is started, instance information (which includes only instance-related data, such as the IP address and port number) is registered to the registration center. At this point, the startup of the provider is completed.
  3. When the service consumer starts, it queries the address list in the registration center according to the application name of the provider to be consumed and completes the subscription to implement the automatic notification of subsequent address changes.
  4. Once the consumer obtains the address list, it initiates a call to MetadataService. The returned result contains all the “ordinary services” defined by the application and their related configuration information.
  5. At this point, the consumer can receive external traffic and initiate Dubbo RPC calls to the provider.

The preceding workflow covers only the case where everything goes smoothly. In a more detailed design or implementation, we must strictly define framework behavior for unexpected cases. For example, if the consumer fails to call MetadataService, it cannot receive external traffic until a retry succeeds.
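Steps 3 to 5, together with the retry rule, can be sketched as follows. This is not Dubbo’s implementation: the function, its parameters, and the exception type are hypothetical, the registry and provider are in-memory stand-ins, and rotating through instances on failure is a design choice made only for this example.

```python
class MetadataUnavailable(Exception):
    """Raised when a call to the provider's metadata endpoint fails."""


def start_consumer(lookup_instances, fetch_metadata, app_name, max_attempts=3):
    """Hypothetical consumer startup following steps 3 to 5 above.

    lookup_instances(app_name) -> list of instance addresses (registry query)
    fetch_metadata(address)    -> list of service definitions (MetadataService call)
    Returns the service definitions; until this returns, the consumer is
    not ready to receive external traffic.
    """
    instances = lookup_instances(app_name)
    if not instances:
        raise LookupError(f"no instances registered for {app_name}")
    last_error = None
    for attempt in range(max_attempts):
        # Rotate through instances so one bad instance does not block startup.
        address = instances[attempt % len(instances)]
        try:
            return fetch_metadata(address)
        except MetadataUnavailable as error:
            last_error = error
    raise RuntimeError("metadata unavailable; consumer stays closed") from last_error


# In-memory stand-ins for the registry and the provider (assumed data).
registry = {"provider-app-name": ["instance1", "instance2"]}
calls = []


def flaky_fetch(address):
    calls.append(address)
    if address == "instance1":
        raise MetadataUnavailable(address)
    return ["org.apache.dubbo.demo.DemoService"]


services = start_consumer(registry.get, flaky_fetch, "provider-app-name")
print(services, calls)
```

Here the first metadata call fails, the retry against the second instance succeeds, and only then does the consumer obtain the service list it needs before accepting traffic.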

Key Mechanisms in Service Introspection

1. The Metadata Synchronization Mechanism

  • Built-in MetadataService: MetadataService is exposed through the standard Dubbo protocol. It returns the “ordinary service” configuration in the memory to the consumer according to the query conditions. This step occurs before the consumer address is selected and called.
  • Metadata center: The metadata center introduced in version 2.7 is reused. After the provider instance starts, it tries to publish its internal RPC service definitions to the metadata center as a piece of metadata. The consumer actively queries the metadata center each time it receives a push update from the registration center.

Note: The consumer queries the metadata center only after it receives an address update notification from the registration center. The data issued by the registration center indicates when the metadata of an instance has been updated, and only then does the consumer need to query the metadata center.
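One way to picture this timing is a consumer-side cache that queries the metadata center only when a pushed instance carries a version marker it has not seen before. The sketch assumes that the `revision` field shown in the instance metadata earlier is what signals a change; the class, its method names, and the query signature are all hypothetical.

```python
class MetadataCache:
    """Caches service metadata per revision (hypothetical consumer-side helper)."""

    def __init__(self, query_metadata_center):
        # query_metadata_center(app_name, revision) -> metadata; assumed signature.
        self._query = query_metadata_center
        self._by_revision = {}
        self.queries = 0

    def on_address_push(self, app_name, instances):
        """Handle a registry push; instances are dicts with 'name' and 'revision'."""
        result = {}
        for instance in instances:
            key = (app_name, instance["revision"])
            if key not in self._by_revision:
                # Only an unseen revision triggers a metadata center query.
                self._by_revision[key] = self._query(app_name, instance["revision"])
                self.queries += 1
            result[instance["name"]] = self._by_revision[key]
        return result


cache = MetadataCache(lambda app, revision: f"metadata@{revision}")
push = [
    {"name": "instance1", "revision": "r1"},
    {"name": "instance2", "revision": "r1"},
]
cache.on_address_push("provider-app-name", push)
cache.on_address_push("provider-app-name", push)  # unchanged revision: no new query
push[1]["revision"] = "r2"                        # instance2 now reports a new revision
latest = cache.on_address_push("provider-app-name", push)
print(cache.queries)  # prints 2
```

Three pushes lead to only two queries, because instances that share a revision share one cached result.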

2. The Two-Way Relationship Between RPC Services and Application Mapping

This is sample code from the legacy consumer development and configuration practice:

<!-- The framework directly queries or subscribes to the address list in the registration center through RPC Service 1/2/N. -->
<dubbo:registry address="zookeeper://"/>
<dubbo:reference interface="RPCService1" />
<dubbo:reference interface="RPCService2" />
<dubbo:reference interface="RPCServiceN" />

This is sample code from the new consumer development and configuration practice:

<!-- The framework can only query or subscribe to the address list in the registration center through RPC Service 1/2/N and additional provided-by="provider-app-x". -->
<dubbo:reference interface="RPC Service 1" provided-by="provider-app-x" />
<dubbo:reference interface="RPC Service 2" provided-by="provider-app-x" />
<dubbo:reference interface="RPC Service N" provided-by="provider-app-y" />

The method of specifying the provider application name in the preceding example is the current practice of Spring Cloud. It requires the developer on the consumer side to explicitly specify the provider application to be consumed.

The root cause of the preceding problem is that the registration center holds no information about RPC services. As a result, addresses can only be queried by application name.

To make the entire development process more transparent to legacy Dubbo users while avoiding the impact of the specified provider on scalability (see below for details), we have designed a set of mapping relationships between RPC services and application names to automatically complete the conversion from RPC services to provider application names on the consumer side.

The reason for establishing a mapping relationship between interfaces and applications in Dubbo is that the relationship between services and applications is not fixed. A typical scenario is application-service splitting. For example, the preceding configuration defines RPC Service 2 as a service in provider-app-x. In the future, developers may split the service into another application, such as provider-app-x-1. Without the mapping, this split must be perceived by all consumers of RPC Service 2, and their applications must be modified and upgraded accordingly, which is costly.
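The idea of such a mapping can be sketched as a small lookup table that providers update and consumers query. This in-memory class is a hypothetical illustration only; it does not show where or how Dubbo stores the mapping.

```python
class ServiceNameMapping:
    """Hypothetical RPC service -> provider application mapping."""

    def __init__(self):
        self._apps_by_service = {}

    def register(self, service, app_name):
        """Record that an application exposes the given RPC service."""
        self._apps_by_service.setdefault(service, set()).add(app_name)

    def unregister(self, service, app_name):
        """Record that an application no longer exposes the service."""
        self._apps_by_service.get(service, set()).discard(app_name)

    def resolve(self, service):
        """Return the applications a consumer should subscribe to."""
        return sorted(self._apps_by_service.get(service, ()))


mapping = ServiceNameMapping()
mapping.register("RPC Service 2", "provider-app-x")
before_split = mapping.resolve("RPC Service 2")

# Later the service is split out into another application.
mapping.register("RPC Service 2", "provider-app-x-1")
mapping.unregister("RPC Service 2", "provider-app-x")
after_split = mapping.resolve("RPC Service 2")
print(before_split, after_split)
# prints ['provider-app-x'] ['provider-app-x-1']
```

The consumer keeps referring to RPC Service 2 throughout, so the split changes only the mapping and not the consumer configuration.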

Whether to use the Dubbo framework to help developers solve this problem transparently or leave the problem to developers is only a matter of strategic choice. Currently, both options are available in Dubbo 2.7.5 and later versions. I prefer to leave it to service developers by leveraging organizational constraints. This approach can further reduce the complexity of the Dubbo framework and improve runtime stability.

Summary and Prospects

We hope that Dubbo will retain its strengths in simple programming and service governance based on the new model. Meanwhile, we must note that the application granularity-based model increases complexity and requires further optimization and enhancement. In addition, beyond address storage and push, application granularity deserves further exploration because it may also help Dubbo with address selection.
