Understanding Memory Performance

Memory and Cache

Memory and Latency

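  1. The Intel(R) Xeon(R) Platinum 8163 CPU, running at 2.50 GHz, has a 32 KB L1D cache, a 32 KB L1I cache, a 1 MB L2 cache, and a 32 MB L3 cache.
  2. Each cache level has a stable latency: as long as the working set fits within a given level, the access latency stays roughly constant.
  3. Latency increases roughly exponentially from one cache level to the next, and again when accesses fall through to main memory.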
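To make the latency plateaus concrete, the sketch below maps a working-set size to the smallest cache level that can hold it, using the capacities listed above. The function name and the rule that a working set "fits" whenever it is no larger than the capacity are assumptions made for illustration. Real behavior also depends on associativity, on how L3 is shared between cores, and on hardware prefetching.

```python
# Cache capacities from the list above (Xeon Platinum 8163), in bytes.
KIB = 1024
MIB = 1024 * KIB
CACHE_LEVELS = [
    ("L1D", 32 * KIB),
    ("L2", 1 * MIB),
    ("L3", 32 * MIB),
]


def smallest_fitting_level(working_set_bytes):
    """Return the first cache level whose capacity covers the working set.

    Simplified model (an assumption, not a measurement): it ignores
    associativity, L3 sharing between cores, and prefetching.
    """
    for name, capacity in CACHE_LEVELS:
        if working_set_bytes <= capacity:
            return name
    return "DRAM"


if __name__ == "__main__":
    for size in (16 * KIB, 256 * KIB, 8 * MIB, 256 * MIB):
        print(f"{size:>12} bytes -> {smallest_fitting_level(size)}")
```

A latency test that sweeps the working-set size would be expected to show one plateau per level returned here, with a step at each boundary.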

Memory Bandwidth

-------------------------------------------------------------
Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:             128728.5    0.134157    0.133458    0.136076
Scale:            128656.4    0.134349    0.133533    0.137638
Add:              144763.0    0.178851    0.178014    0.181158
Triad:            144779.8    0.178717    0.177993    0.180214
-------------------------------------------------------------
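The table can be sanity-checked with a little arithmetic. The sketch below assumes the standard STREAM accounting: 8-byte doubles, two arrays touched by Copy and Scale, three by Add and Triad, and 1 MB counted as 10^6 bytes. The array length is not stated in the article, so the value recovered here is an inference from the printed rates and minimum times, not a reported setting.

```python
BYTES_PER_DOUBLE = 8

# Arrays read or written per element by each kernel (standard STREAM counting).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# (best rate in MB/s, min time in seconds), copied from the table above.
RESULTS = {
    "Copy": (128728.5, 0.133458),
    "Scale": (128656.4, 0.133533),
    "Add": (144763.0, 0.178014),
    "Triad": (144779.8, 0.177993),
}


def implied_elements(kernel):
    """Back out the array length implied by one row of the table."""
    rate_mb_s, min_time_s = RESULTS[kernel]
    bytes_moved = rate_mb_s * 1e6 * min_time_s
    return bytes_moved / (ARRAYS_TOUCHED[kernel] * BYTES_PER_DOUBLE)


if __name__ == "__main__":
    for kernel in RESULTS:
        n = implied_elements(kernel)
        gib_per_array = n * BYTES_PER_DOUBLE / 2**30
        print(f"{kernel:<6} ~{n:.4e} elements, ~{gib_per_array:.2f} GiB per array")
```

Under these assumptions all four rows point to arrays of about 2^30 elements, or roughly 8 GiB each, which is far larger than the 32 MB L3 cache.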
  1. The memory array must be far larger than the L3 cache; otherwise, the results reflect cache throughput rather than memory bandwidth.
  2. The number of CPUs matters. One or two cores generally cannot saturate the memory bandwidth, so bandwidth can be measured accurately only when all of the machine's CPUs take part. However, you can test memory latency by running the STREAM algorithm on a single core.
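As a rough check for the first point, the sketch below applies the sizing guideline from the STREAM source comments, which asks for each array to be at least four times the total cache size. The helper names and the `sockets` parameter are hypothetical, and the article does not say how many sockets the test machine has, so the two-socket case is only an example.

```python
BYTES_PER_DOUBLE = 8
MIB = 1024 * 1024


def min_stream_elements(l3_bytes_per_socket, sockets=1, factor=4):
    """Smallest array length (in doubles) that satisfies the 4x-cache guideline."""
    min_bytes = factor * l3_bytes_per_socket * sockets
    # Ceiling division so the result never falls below the guideline.
    return -(-min_bytes // BYTES_PER_DOUBLE)


def array_is_large_enough(n_elements, l3_bytes_per_socket, sockets=1):
    """True if an array of n_elements doubles meets the guideline."""
    return n_elements >= min_stream_elements(l3_bytes_per_socket, sockets)


if __name__ == "__main__":
    l3 = 32 * MIB  # L3 size quoted in the article
    print(min_stream_elements(l3))             # one socket
    print(min_stream_elements(l3, sockets=2))  # hypothetical two-socket machine
    print(array_is_large_enough(10**6, l3))    # far too small: measures cache
```

With arrays below this threshold, a large share of the accesses can be served from L3, which is why the reported figure then reflects cache throughput.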

Other

  1. Relationship between memory and NUMA: NUMA-aware placement, where a thread accesses memory attached to its own node, can effectively improve memory throughput and reduce memory latency.
  2. Choice of compiler for the STREAM algorithm: compiling with ICC can noticeably improve the memory bandwidth score. The reason is that Intel has optimized code generation for its CPUs, using instruction vectorization and prefetching to accelerate data reads and writes and instruction execution. ICC can also be used to compile other C code to improve instruction efficiency.
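For the NUMA point, one way to keep a process on a single node is sketched below. It is Linux-only and assumes the usual sysfs layout (`/sys/devices/system/node/nodeN/cpulist`); the function names are hypothetical. Note that setting CPU affinity does not bind memory by itself. It relies on the kernel's default policy of allocating pages on the node where they are first touched.

```python
import os


def parse_cpulist(text):
    """Parse a sysfs cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def node_cpus(node):
    """Read the CPUs that belong to one NUMA node (assumed sysfs path)."""
    path = f"/sys/devices/system/node/node{node}/cpulist"
    with open(path) as f:
        return parse_cpulist(f.read())


def pin_to_node(node):
    """Restrict the current process to the CPUs of one NUMA node (Linux only)."""
    os.sched_setaffinity(0, node_cpus(node))


if __name__ == "__main__":
    print(sorted(parse_cpulist("0-3,8-11")))
```

When launching a benchmark binary, `numactl --cpunodebind=0 --membind=0 ./stream` achieves a similar effect and also binds the memory explicitly; the binary name here is a placeholder.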

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com
