Getting Started with Service Mesh: Origin, Development, and Current Status


  • In the past, only a single monolithic application needed to be deployed and managed. Now that it has been split into many parts, O&M costs rise sharply.
  • Originally, modules could interact directly through function calls within the application (in-process communication). Now, they are split into different processes or even different nodes and can only communicate through remote procedure calls (RPCs).

What Is Service Mesh?

The Birth of Service Mesh

Service Mesh

  • Dedicated infrastructure layer: A service mesh is not designed to solve business issues, but is a dedicated infrastructure layer (middleware).
  • Service-to-service communication: A service mesh is designed to handle service-to-service communication.
  • Reliable delivery of requests: Why is special processing necessary for service-to-service communication? A service mesh aims to ensure the reliable delivery of requests between services even when the network is unreliable.
  • Cloud native application: From the beginning, service mesh was created for modern cloud native applications and targeted future technology development trends.
  • Network proxies: Typically, a service mesh is implemented as an array of lightweight network proxies, and the application does not need to be aware of them.
  • Deployed alongside application code: Each proxy must be deployed alongside an application instance, one proxy per instance. If the hop between the application and its proxy were itself remote and unreliable, the problem would only have been moved, not solved.

The Pattern of Service Mesh

Why Do We Need Service Mesh?

The Rise of Microservices

  • Single responsibility: After an application is split up, a single microservice is usually only responsible for a single highly cohesive and self-contained function. Therefore, it is easy to develop, understand, and maintain.
  • Flexible architecture: Different microservice applications are basically independent in terms of technology selection, allowing each to select the most suitable technology stack.
  • Isolated deployment: In contrast to a giant monolithic application, an individual microservice application has a much smaller codebase and build artifact, facilitating continuous integration and rapid deployment. At the same time, through process-level isolation, microservice applications can continue to function if a peer application encounters a fault. This makes them more fault tolerant than monolithic applications.
  • Independent expansion: In the era of monolithic applications, if a module had a resource bottleneck (such as CPU or memory), it could only be scaled by scaling the entire application, resulting in significant resource waste. In the era of microservices, we can scale individual microservices as needed, which is far more precise.

How Can We Find Service Providers?

How Can We Ensure the Reliability of RPCs?

How Can We Reduce the Service Call Latency?

How Can We Ensure the Security of Service Calls?

Service Communication: Stone Age

  • Service discovery: helps you find services to call.
  • Circuit breaker: mitigates unreliable dependencies between services.
  • Load balancing: delivers requests more promptly by evenly distributing traffic.
  • Secure communication: involves transport layer security (TLS), identification (certificates and signatures), and role-based access control (RBAC).
  • Reinventing the wheel: How can we concentrate on business innovation if we need to write and maintain a large amount of non-functional code?
  • Business coupling: Service communication logic and business code logic are mixed together, resulting in some strange distributed bugs.
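To make the "reinventing the wheel" point concrete, the following is a minimal sketch of the kind of circuit breaker a team would have had to write and maintain inside its own business code. The class name, parameters, and thresholds are hypothetical and chosen only for illustration; this is not taken from any particular library.

```python
import time


class CircuitBreaker:
    """Hypothetical hand-rolled circuit breaker (illustration only)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so the logic can be tested
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit again
        return result
```

Every remote call in the business code would then have to be wrapped, for example `breaker.call(fetch_order, order_id)`, where `fetch_order` stands for some hypothetical RPC. This wrapping is exactly the coupling between communication logic and business logic that the last bullet describes.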

Service Communication: Modern Times

  • Not completely transparent: Programmers still need to understand and use these libraries correctly, so the learning cost and the probability of errors remain high.
  • Limited technology selection: After adopting these libraries, an application is often bound to the corresponding language and framework (vendor lock-in).
  • High maintenance costs: When a library is upgraded, the application must be rebuilt and redeployed. This is tedious and can introduce faults.

Service Communication: The New Age

Mainstream Implementations of Service Mesh

Overview of Mainstream Implementations

  • Linkerd: Developed by Buoyant in Scala. It was first released on January 15, 2016, and joined CNCF on January 23, 2017. Linkerd 1.4.0 was released on May 1, 2018.
  • Envoy: Developed by Lyft in C++11. It was first released on September 13, 2016, and joined CNCF on September 14, 2017. Envoy 1.6.0 was released on March 21, 2018.
  • Istio: Developed by Google and IBM in Go. It was first released on May 10, 2017. Istio 0.7.1 was released on March 31, 2018.
  • Conduit: Also developed by Buoyant, in Rust and Go. It was first released on December 5, 2017. Conduit 0.4.1 was released on April 27, 2018.

Linkerd

  • Dynamic routing: The downstream target service can be determined based on the upstream service request parameters. In addition to conventional service routing policies, Linkerd can support canary release, A/B testing, environment isolation, and other such scenarios through its dynamic routing capabilities.
  • Service discovery: After the target service is determined, the next step is to obtain the address list of the corresponding instances (such as by querying the service registry).
  • Load balancing: If there are multiple addresses in the list, Linkerd selects one suitable low-latency instance through a load balancing algorithm (such as Least Loaded or Peak EWMA).
  • Request execution: Send a request to the instance selected in the preceding step and record the latency and response.
  • Retry: If the request does not receive a response, select another instance and retry it. This is safe only if Linkerd knows that the request is idempotent.
  • Circuit breaking: If requests sent to an instance often fail, the instance is automatically removed from the address list.
  • Timeout: If the request times out (no result is returned before the given deadline), a failure response is returned automatically.
  • Observability: Linkerd continuously collects and reports behavioral data, including metrics and tracing.
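The load balancing, request execution, and retry steps above can be sketched roughly as follows. This is a simplified illustration rather than Linkerd's actual implementation: the `Instance` class, the `send` function, and the `transport` callable are hypothetical names, and the selection rule is a plain power-of-two-choices comparison of in-flight request counts, used here as a stand-in for a least-loaded policy.

```python
import random


class Instance:
    """One address returned by service discovery (hypothetical model)."""

    def __init__(self, address):
        self.address = address
        self.in_flight = 0  # requests currently outstanding on this instance


def pick_least_loaded(instances, rng=random):
    """Sample two instances at random and keep the less loaded one."""
    if len(instances) == 1:
        return instances[0]
    a, b = rng.sample(instances, 2)
    return a if a.in_flight <= b.in_flight else b


def send(instances, request, transport, idempotent, max_attempts=3):
    """Load-balance a request; retry on another instance only if idempotent."""
    attempts = max_attempts if idempotent else 1
    candidates = list(instances)
    last_error = None
    for _ in range(attempts):
        if not candidates:
            break
        target = pick_least_loaded(candidates)
        target.in_flight += 1
        try:
            return transport(target.address, request)
        except ConnectionError as err:
            last_error = err
            candidates.remove(target)  # never retry the instance that failed
        finally:
            target.in_flight -= 1
    raise last_error or ConnectionError("no instances available")
```

The `idempotent` flag reflects the constraint noted in the retry step: a request that is not known to be idempotent gets exactly one attempt, because replaying it could apply the same side effect twice.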

Envoy

  • High performance: Envoy is written in native code (C++11), making it faster than the Scala-based Linkerd.
  • Extensible: Both L4 and L7 proxy functions are based on a pluggable filter chain mechanism (similar to Netfilter and servlet filters).
  • Protocol upgrade: It supports transparent, bidirectional proxying between HTTP/1 and HTTP/2.
  • Other features: service discovery (ensures eventual consistency), load balancing (supports region awareness), stability (supports retry, timeout, circuit breaking, rate limiting, and anomaly detection), observability (supports statistics, logs, and tracing), and easy debugging.
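The pluggable filter chain idea can be illustrated with a small sketch. This is a conceptual model only, assuming requests and responses are plain dictionaries; the `FilterChain` class and the example filters are hypothetical and do not correspond to Envoy's C++ filter interfaces.

```python
class FilterChain:
    """Runs a request through an ordered list of pluggable filters."""

    def __init__(self, filters, terminal):
        self.filters = list(filters)
        self.terminal = terminal  # final handler, e.g. the upstream call

    def handle(self, request):
        def run(index, req):
            if index == len(self.filters):
                return self.terminal(req)
            # A filter may modify the request, short-circuit with its own
            # response, or pass control to the rest of the chain.
            return self.filters[index](req, lambda r: run(index + 1, r))

        return run(0, request)


def reject_unauthenticated(request, next_filter):
    """Short-circuits the chain when no credentials are present."""
    if "authorization" not in request["headers"]:
        return {"status": 401, "body": "missing credentials"}
    return next_filter(request)


def add_request_id(request, next_filter):
    """Adds a (fixed, demo-only) request ID header before forwarding."""
    headers = dict(request["headers"])
    headers.setdefault("x-request-id", "demo-1")
    return next_filter({**request, "headers": headers})


def upstream(request):
    """Stand-in for the real upstream service."""
    return {"status": 200, "body": request["headers"]["x-request-id"]}
```

A chain is assembled as `FilterChain([reject_unauthenticated, add_request_id], upstream)`. New behavior is added by inserting another filter into the list, without touching the existing ones, which is the property that makes this style of proxy easy to extend.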

Istio

  • Envoy: forms the data plane (other components form the control plane) and can be replaced by other proxies (such as Linkerd or nginMesh).
  • Pilot: is responsible for traffic management and provides the platform-specific service model definition, APIs, and implementations.
  • Mixer: is responsible for policy control. Its core functions include precondition checking (PreCheck), quota management, and telemetry reporting.
  • Istio-Auth: supports RBAC permission control at multiple granularities and mutual TLS authentication, including identification, communication security, and key management capabilities.
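As a rough illustration of what precondition checking and quota management mean in practice, the sketch below denies unknown callers first and then enforces a per-caller request quota over a fixed time window. All names here (`QuotaPolicy`, `precheck`, the allow-list) are hypothetical; this is not Mixer's API, only a model of the kind of decision a policy component makes before a request is allowed through.

```python
class QuotaPolicy:
    """Hypothetical per-caller request quota over a fixed time window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # caller -> (window_start, count)

    def allow(self, caller, now):
        start, count = self.counters.get(caller, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # the previous window expired
        if count >= self.limit:
            self.counters[caller] = (start, count)
            return False
        self.counters[caller] = (start, count + 1)
        return True


def precheck(caller, allowed_callers, quota, now):
    """Return (ok, reason): check the allow-list first, then the quota."""
    if caller not in allowed_callers:
        return False, "caller not permitted"
    if not quota.allow(caller, now):
        return False, "quota exhausted"
    return True, "ok"
```

With `QuotaPolicy(limit=2, window_seconds=60)`, a permitted caller gets two requests per minute and the third is rejected until the window rolls over. Centralizing such checks in one component means policies can change without redeploying the services themselves.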

Istio Components — Pilot

Istio Component — Mixer

Istio Component — Auth

Conduit

  • Ultralight and blazingly fast: The data plane of Conduit is written in Rust, making it incredibly small, fast, and secure. Compared with C and C++, the biggest advantage of Rust is its memory safety. Each proxy requires less than 10 MB of memory (RSS) and has sub-millisecond p99 latency, providing the functionality of service mesh without the cost.
  • Security from the start: From Rust’s memory safety to TLS by default, Conduit is built to provide secure cloud-native environments from the ground up.
  • End-to-end visibility: Conduit automatically measures and aggregates service success rates, latencies and request volumes, giving you an unfettered view of service behavior across your infrastructure without having to change the application code.
  • Kubernetes enhanced: Conduit adds reliability, visibility, and security to your Kubernetes cluster, giving you control of the runtime behavior of your applications.
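The figures mentioned above (success rate, request volume, and p99 latency) are simple aggregates over per-request records. The sketch below shows one way to compute them, assuming each record is a hypothetical `(latency_ms, succeeded)` tuple and using the nearest-rank definition of a percentile; it is not how Conduit itself aggregates telemetry.

```python
def summarize(records):
    """Aggregate (latency_ms, succeeded) tuples for one service."""
    records = list(records)
    if not records:
        return {"requests": 0, "success_rate": None, "p99_ms": None}
    latencies = sorted(latency for latency, _ in records)
    successes = sum(1 for _, ok in records if ok)
    n = len(latencies)
    # Nearest-rank p99: the smallest sample with at least 99% of all
    # samples at or below it. Integer math avoids float rounding issues.
    rank = (99 * n + 99) // 100
    return {
        "requests": n,
        "success_rate": successes / n,
        "p99_ms": latencies[rank - 1],
    }
```

Because a sidecar proxy sees every request and response, it can record these tuples on the application's behalf, which is why this kind of visibility needs no change to the application code.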

