Understanding AMBA CHI: The Backbone of Modern Coherent SoCs

- December 15, 2025

The AMBA CHI protocol has changed how complex Systems-on-Chip (SoCs) are designed, especially those that need high performance and hardware cache coherency. Let's dive into what makes CHI stand out and why many SoC designers choose it.

What is AMBA CHI?

AMBA Coherent Hub Interface (CHI) is a high-performance, cache-coherent interconnect protocol developed by Arm as part of its AMBA 5 standard. Introduced in 2013, CHI was engineered to meet the escalating demands of modern SoCs, which often feature multiple processing cores, shared caches, and intricate memory systems. Its core mission is to ensure seamless and efficient communication between these components.

CHI vs. ACE: A Leap Forward

CHI evolved from its predecessor, AMBA ACE (AXI Coherency Extensions), bringing significant architectural and performance enhancements. Here's why CHI is generally preferred:

Packet-Based Design: Unlike ACE's signal-level approach, CHI uses a packet-based architecture. This makes it far more scalable and efficient for complex, large-scale SoCs.
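Reduced Complexity & Wire Count: CHI defines four channel types (REQ, SNP, DAT, RSP), where ACE uses the five AXI channels plus three snoop channels (AC, CR, CD). Packing each message into a flit on one of these channels streamlines communication and can reduce the wiring that has to be routed through the interconnect.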
Layered Architecture: CHI offers a well-defined layered structure (Protocol, Network, Link), allowing for flexibility and optimization in different parts of the SoC.
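Efficient Flow Control: At the link level, CHI uses credits instead of ACE's valid/ready handshakes. A sender transmits a flit only when it holds a credit from the receiver, so every flit sent is guaranteed buffer space and never stalls on the wire. Protocol-level flow control mechanisms sit on top of this to prevent deadlocks and keep data moving.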
Advanced Coherency Features: CHI supports a detailed set of cache coherency states, and from CHI-B onward it adds atomic transactions (for example AtomicLoad, which can carry out the add behind Arm's LDADD instruction) and cache stashing, enabling finer control and better performance.
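
To make the credit idea from the flow control point concrete, here is a minimal sketch of credit-based link flow control. It is a toy model, not CHI RTL: the class name CreditedLink, its methods, and the two-slot buffer are all assumptions made for illustration.

```python
from collections import deque


class CreditedLink:
    """Toy model of one channel direction with credit-based flow control."""

    def __init__(self, receiver_slots: int):
        # The receiver grants one credit per free buffer slot (assumed policy).
        self.credits = receiver_slots
        self.receiver_buffer = deque()

    def send(self, flit: str) -> bool:
        """Transmit a flit only if the sender holds a credit."""
        if self.credits == 0:
            return False  # sender keeps the flit; nothing is dropped
        self.credits -= 1
        self.receiver_buffer.append(flit)
        return True

    def receiver_consume(self) -> str:
        """Receiver drains one flit and hands the credit back to the sender."""
        flit = self.receiver_buffer.popleft()
        self.credits += 1
        return flit


link = CreditedLink(receiver_slots=2)
results = [link.send(f"flit{i}") for i in range(3)]
print(results)             # [True, True, False]
link.receiver_consume()    # frees a slot, returning one credit
print(link.send("flit2"))  # True
```

The third send fails only because no credit is left, and it succeeds once the receiver frees a slot. The sender never needs to probe the receiver with a per-transfer handshake.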

Why Top SoC Companies Choose CHI

The advantages of CHI translate directly into tangible benefits for SoC designers:

Scalability: Its packet-based and layered design makes it ideal for increasingly complex SoCs with many cores and accelerators.
Performance: Optimized flow control, efficient coherency management, and features like atomic operations lead to higher throughput and lower latency.
Power Efficiency: The streamlined architecture and reduced wire count can contribute to lower power consumption.
Flexibility: The separation of protocol and transport layers allows designers to tailor the interconnect to specific performance, power, and area targets.

This makes CHI a go-to choice for applications in mobile, automotive, networking, and data centers, where performance and efficiency are paramount. Extensions like CHI C2C are also being developed to enable chiplet architectures, further enhancing its applicability in future designs.

CHI Protocol Architecture: Nodes, Channels, and Interfaces

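In CHI, components are called Nodes and connect through Interfaces, replacing the traditional master/slave terminology. Request Nodes (RN) such as CPU cores start transactions, Home Nodes (HN) inside the interconnect order them and manage coherency, and Subordinate Nodes (SN) such as memory controllers service them. Communication flows over distinct channels, each serving a specific purpose: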

REQ (Request) Channel: Carries transaction requests from a requester, such as reads, writes, and cache maintenance operations, together with their address and attributes.
SNP (Snoop) Channel: Essential for maintaining cache coherency. It carries snoop requests from the Home Node in the interconnect to Request Nodes that hold caches, asking them to report, downgrade, or invalidate their copy of a cache line.
DAT (Data) Channel: Carries every transfer that includes data: write data heading to memory or caches, read data returning to the requester, and snoop responses that include data.
RSP (Response) Channel: Carries responses that have no data attached, in both directions: completion and buffer-grant responses (such as Comp and DBIDResp) toward the requester, and acknowledgements and dataless snoop responses (such as CompAck and SnpResp) back toward the interconnect.

These channels work in concert within the layered communication model to ensure robust and efficient data transfer.
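
One way to keep the channel roles straight is a small lookup from message name to channel. The sketch below covers only a handful of CHI message names and is far from exhaustive. The Channel enum, the MESSAGE_CHANNEL table, and the channel_for helper are hypothetical names introduced for illustration.

```python
from enum import Enum


class Channel(Enum):
    REQ = "request"
    SNP = "snoop"
    DAT = "data"
    RSP = "response"


# A small, non-exhaustive subset of CHI message names and the channel each uses.
MESSAGE_CHANNEL = {
    "ReadNoSnp": Channel.REQ,
    "ReadShared": Channel.REQ,
    "SnpShared": Channel.SNP,
    "CompData": Channel.DAT,           # read data plus completion
    "NonCopyBackWrData": Channel.DAT,  # write or atomic operand data
    "SnpRespData": Channel.DAT,        # snoop response that carries data
    "Comp": Channel.RSP,
    "DBIDResp": Channel.RSP,
    "CompAck": Channel.RSP,
    "SnpResp": Channel.RSP,            # snoop response without data
}


def channel_for(message: str) -> Channel:
    """Return the channel a message travels on (KeyError if not in the table)."""
    return MESSAGE_CHANNEL[message]


for name in ("ReadNoSnp", "SnpShared", "CompData", "CompAck"):
    print(f"{name:>10} -> {channel_for(name).name}")
```

The table makes one pattern visible: anything carrying a data payload goes on DAT, even when it is logically a response, while RSP is reserved for dataless messages.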

CHI Protocol Transaction Example: A Simple Read

Let's illustrate a basic read transaction that doesn't require snooping, often called a ReadNoSnp.

Initiation: A Requester (e.g., a CPU core) wants to read data from a specific memory address. It sends a ReadNoSnp request packet on the REQ channel to the interconnect. This packet includes the address and the size of the data to be read.
Processing: The interconnect routes this request to the appropriate destination, such as a memory controller. The memory controller then fetches the requested data.
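Data Transfer: The fetched data returns to the Requester on the DAT channel as a CompData message, which carries the data and the completion status together.
Confirmation: Because CompData already signals completion, a basic read needs no separate RSP packet from the memory controller. If the request asked for a completion acknowledgement (the ExpCompAck field), the Requester closes the transaction by sending CompAck on the RSP channel.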
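
The read flow can be sketched as an ordered message trace. This is a simplified model under stated assumptions: data and completion are modeled as one CompData message on DAT (the CHI name for the combined response), the data returns through the Home Node, and the optional CompAck is sent only when requested. The Message class, the read_no_snp function, and the dictionary standing in for memory are hypothetical. RN, HN, and SN stand for Request, Home, and Subordinate Node.

```python
from dataclasses import dataclass


@dataclass
class Message:
    channel: str          # "REQ", "DAT" or "RSP"
    opcode: str           # CHI message name
    src: str              # sending node
    dst: str              # receiving node
    payload: object = None


def read_no_snp(memory: dict, addr: int, exp_comp_ack: bool = False) -> list:
    """Return the ordered message trace for one non-snooping read."""
    trace = [
        # Requester issues the read to the Home Node in the interconnect.
        Message("REQ", "ReadNoSnp", "RN", "HN", {"addr": addr}),
        # Home Node forwards it to the memory-side Subordinate Node.
        Message("REQ", "ReadNoSnp", "HN", "SN", {"addr": addr}),
        # Data and completion travel together as CompData on DAT.
        Message("DAT", "CompData", "SN", "HN", memory[addr]),
        Message("DAT", "CompData", "HN", "RN", memory[addr]),
    ]
    if exp_comp_ack:
        # Optional acknowledgement from the Requester, sent on RSP.
        trace.append(Message("RSP", "CompAck", "RN", "HN"))
    return trace


memory = {0x1000: 0xDEADBEEF}
for msg in read_no_snp(memory, 0x1000, exp_comp_ack=True):
    print(f"{msg.src} -> {msg.dst}  {msg.channel}  {msg.opcode}")
```

Real implementations have other options, such as returning data directly from the memory side to the Requester, so treat the exact hop sequence here as one possible arrangement.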

Atomic operations, such as a load-add where a value is read, an operand is added, and the result is written back, take a few more steps. For an LDADD-style operation, the Requester sends an atomic request (AtomicLoad in CHI terms) with the address on the REQ channel. The receiving node replies on the RSP channel with a DBIDResp to signal that it has buffer space, and the Requester then sends the value to add on the DAT channel. The Home Node in the interconnect, or a node closer to memory, performs the addition, writes the new value back, and returns the original value to the Requester as CompData on the DAT channel. To the rest of the system, the whole read-modify-write appears as a single, atomic transaction.
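
The functional result of a load-add is easy to state in code. The sketch below models only what the operation computes at the node that executes it, not the channel messages. The atomic_load_add function, the dictionary standing in for memory, and the width_bits parameter are assumptions made for illustration.

```python
def atomic_load_add(memory: dict, addr: int, operand: int, width_bits: int = 64) -> int:
    """Add operand to memory[addr] and return the value that was there before.

    In hardware the executing node serializes accesses to the address, so no
    other agent can observe a state between the read and the write. This
    single-threaded model gets that property for free.
    """
    old_value = memory[addr]
    # The stored result wraps around at the operand width.
    memory[addr] = (old_value + operand) % (1 << width_bits)
    return old_value


memory = {0x80: 10}
returned = atomic_load_add(memory, 0x80, 5)
print(returned, memory[0x80])  # 10 15
```

The Requester receives the old value (10) while memory holds the new one (15), which is what makes load-add useful for shared counters and ticket-style locks.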


Brave Ogu
