# THE METHOD OF CONTROLLING COUNTER RESTRUCTURE IN PARALLEL PROCESSING SYSTEM

Chu Duc Toan\*, Electric Power University

#### ABSTRACT

In parallel processing systems, the efficient use of system resources is an important requirement. Improving performance and increasing speed involve many issues in both hardware and software [1, 2]. Analysis of processing system operation shows what limits performance and speed: when referencing memory, the processor uses only one instruction cycle to issue a read or write request, and then must wait for the memory cycle to complete before the next memory access. As a result, the CPU speed is not fully exploited, and memory access conflicts occur when two or more components access the same memory location simultaneously. This paper proposes a method of controlling counter restructuring to meet the requirements of information processing speed. The model used is a restructure controller built with FPGA technology. The speed increase is achieved by keeping the chain of memory access requests at its maximum length.

**Keywords**: restructure controller with FPGA technology; speed; parallel processing system; performance; parallel memory control mechanism.

#### INTRODUCTION

Current processing systems show a large gap between processor speed and memory speed. The ratio is generally from 5 to 15 [4, 5]. To take full advantage of processor time, the memory is organized in parallel as an interleaved model with the S-access memory architecture. This resolves memory conflicts in the parallel memory models used in parallel processing systems.

The S-access model using low-order address interleaving is described in Figure 1. The S-access method allows all modules to be accessed simultaneously. Each module is associated with a data latch. The data from each module is delivered through the latch and a multiplexer to a single data channel.
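The low-order interleaving described above can be sketched as follows. This is a minimal illustration, assuming word addressing and $M = 16$ modules; the function name `split_address` is hypothetical and does not come from the paper.

```python
def split_address(addr: int, num_modules: int = 16) -> tuple[int, int]:
    """Split a word address into (module index, offset inside the module).

    Assumes low-order interleaving: the low-order part of the address
    selects the module, so consecutive addresses fall in consecutive
    modules and wrap around after num_modules.
    """
    if addr < 0 or num_modules <= 0:
        raise ValueError("addr must be >= 0 and num_modules must be > 0")
    return addr % num_modules, addr // num_modules


if __name__ == "__main__":
    # Addresses 0..17 visit modules 0..15 and then wrap to modules 0 and 1.
    for a in (0, 1, 15, 16, 17):
        print(a, split_address(a))
```

With this mapping, a run of consecutive addresses touches every module once before any module is reused, which is what lets all modules be accessed simultaneously.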

Figure 2 shows a timing diagram for multi-word read accesses using the S-access configuration. Suppose the memory access time is $T_a$ and the latch delay time is $\tau$; then the time to access a single data word is $T_a+\tau$. The total time to access a string of $k$ consecutive words, starting at module $i$, is $T_a+k\tau$ if $i+k \le M$, and $2T_a+(i+k-M)\tau$ otherwise. In both cases, $M$ is the ratio of CPU speed to memory speed. The condition for accessing vectors efficiently is $M\tau \le T_a$; otherwise data overflow will occur [3].
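The two timing cases can be checked numerically with a short sketch. It assumes zero-based module indices ($0 \le i < M$) and $1 \le k \le M$, which the text does not state explicitly; the function names and the sample values of $T_a$ and $\tau$ are illustrative only.

```python
def s_access_time(k: int, i: int, M: int, T_a: int, tau: int) -> int:
    """Total time to read k consecutive words starting at module i.

    Assumption (not stated in the paper): modules are numbered 0..M-1
    and 1 <= k <= M.
    """
    if not (0 <= i < M and 1 <= k <= M):
        raise ValueError("expected 0 <= i < M and 1 <= k <= M")
    if i + k <= M:
        # The whole string fits in one memory cycle.
        return T_a + k * tau
    # The string wraps around, so a second memory cycle is needed.
    return 2 * T_a + (i + k - M) * tau


def vector_access_ok(M: int, T_a: int, tau: int) -> bool:
    """Condition M * tau <= T_a for efficient vector access."""
    return M * tau <= T_a


if __name__ == "__main__":
    # Illustrative values: M = 16 modules, T_a = 160, tau = 10 (same time unit).
    print(s_access_time(k=4, i=0, M=16, T_a=160, tau=10))   # 200
    print(s_access_time(k=4, i=14, M=16, T_a=160, tau=10))  # 340
    print(vector_access_ok(M=16, T_a=160, tau=10))          # True
```

Starting near the last module forces the wrap-around case, which costs almost a full extra memory cycle for the same number of words.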



Figure 1. S-access memory configuration

\* Tel. 0982917093, Email: toancd@epu.edu.vn



Figure 2. Timing diagram for the S-access configuration



Figure 3. Parallel memory control structure using the combination method with factor M



Figure 4. Rearrangement of control block architecture in FPGA technology for memory module  $M = 16 = 2^4$ 

## THE STRUCTURE OF PARALLEL MEMORY CONTROL SYSTEM

Consider a typical, sufficiently general structure for the system, shown in Figure 3. Assume that the ratio of CPU speed to memory speed is $M$, that $\chi$ is the probability that the next request accesses the next memory module, and that $\theta$ is the probability that the next request accesses a specific memory module other than the next one.

To improve data-flow performance, and taking the characteristics of the system into account, FPGA technology can be used to create a software architecture for information collection and information processing. The parallel memory control structure, which follows the combination method with factor $M$, and the restructure control block in FPGA technology for $M = 16$ memory modules are illustrated in Figure 3 and Figure 4.

### Processing mode:

In this mode, the specific control partitioning of the address channel and the system control channel is described in detail in Table 1.

Here, the Addr Multiplexer points to the memory module corresponding to each request, and the SCAN Multiplexer grants access according to the scanning cycle in order to guarantee the memory recovery time.

### Information collection:

When the system switches to information collection, the specific control partitioning of the address channel and the system control channel is described in detail in Table 2.

Table 1. Address processing mode control

| Circuit connection | E<sub>0</sub> | Addr Multiplexer a<sub>0</sub> | Addr Multiplexer a<sub>1</sub> | Addr Multiplexer a(2<sup>m</sup>-1) | Memory module A<sub>0</sub> | Memory module A<sub>1</sub> |
|---|---|---|---|---|---|---|
| Address channel and system control channel | 1 | A<sub>0</sub> | A<sub>1</sub> | A(2<sup>m</sup>-1) | A(2<sup>m</sup>) | A(2<sup>m</sup>+1) |

Table 2. Address information collecting mode control

| Circuit connection | E<sub>0</sub> | Addr Multiplexer a<sub>0</sub> | Addr Multiplexer a<sub>1</sub> | Addr Multiplexer a(2<sup>m</sup>-1) | Memory module A<sub>0</sub> | Memory module A<sub>1</sub> | Memory module A(2<sup>m</sup>-1) | Memory module A(2<sup>m</sup>) | Memory module A(2<sup>m</sup>+1) |
|---|---|---|---|---|---|---|---|---|---|
| Address channel and system control channel | 0 |  |  | × | A<sub>0</sub> | A<sub>1</sub> | A(2<sup>m</sup>-1) | A(2<sup>m</sup>) | A(2<sup>m</sup>+1) |



Figure 5. Timing diagram for recording information into memory



Figure 6. Timing diagram for reading information from memory

After collection is complete, the data are read with a two-step algorithm:

Step 1. Copy the data from the 16 memory modules into the corresponding positions of one another. As a result, there are 16 data regions with identical content.

Step 2. Compose the address channel as in the table, using FPGA technology. As a result, the memory space is organized into 16 standard parallel memory modules and reading proceeds as normal.

Suppose data are retrieved by a memory access request sequence with addresses 00, 02, 04, 06, 08, 10, 12, 14, 16, ... The system then skips modules #1, #3, #5, #7, #9, #11, #13, and #15, even though they hold a full copy of the original data.

In this case the length of the request string $k$ is only 8, half of the maximum value of $k$ (= 16).
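The effect of the address stride on the request string length can be reproduced with a small sketch. It assumes low-order interleaving over 16 modules and takes $k$ to be the number of distinct modules visited before the access pattern repeats; the function name `request_string_length` is hypothetical.

```python
from math import gcd


def request_string_length(stride: int, num_modules: int = 16) -> int:
    """Count the distinct modules visited by addresses 0, stride, 2*stride, ...

    Assumes low-order interleaving (module = address mod num_modules).
    """
    seen = set()
    addr = 0
    while addr % num_modules not in seen:
        seen.add(addr % num_modules)
        addr += stride
    return len(seen)


if __name__ == "__main__":
    # Stride 2 (addresses 00, 02, 04, ...) reaches only the even modules.
    print(request_string_length(2))  # 8
    # Stride 1 reaches all 16 modules.
    print(request_string_length(1))  # 16
    # For positive strides the count equals num_modules // gcd(stride, num_modules).
    print(16 // gcd(2, 16))          # 8
```

Any stride that shares a factor with the number of modules shortens the request string in the same way, which is the situation the restructuring is meant to avoid.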

#### CONCLUSION

This paper proposes a degradation processing system. When the processing task runs in only one direction, the system can be optimized almost completely with the aid of FPGA technology and the system architecture rearrangement technique. Architecture rearrangement control by FPGA always requires the chain of data read/write requests to reach its maximum length, k = max = constant.

#### REFERENCES

- [1]. Barry W. (1996), "Computer architecture design and performance", University of North Carolina, Prentice Hall, New York.
- [2]. Chou Y., Fahs B., and Abraham S. (2004), "Microarchitecture optimizations for exploiting memory-level parallelism", ACM pp. 29-70.
- [3]. Hamacher, C., Vranesic, Z., Zaky, S. (2002), Computer Organization, McGraw-Hill, Inc., New York.
- [4]. Mehdi R. Zargham (2001), Computer Architecture: Single and Parallel Systems, Southern Illinois University, Prentice-Hall, Inc., London.
- [5]. Rao G. S. (1998), "Performance Analysis of Cache Memories", Journ. of Assoc. of Comp. Mach., vol. 25, no. 3, pp. 378-397.

## A METHOD OF CONTROLLING COUNTER RESTRUCTURE IN PARALLEL PROCESSING SYSTEMS

Chu Đức Toàn

Electric Power University

### SUMMARY

In parallel processing systems, using system resources efficiently is a very important requirement. Raising performance and speed involves many issues related to both hardware and software [1, 2]. Analysis of the operation of a processing system shows the causes that affect its performance and speed: when referencing memory, the processor uses only one instruction cycle to request a data read or write, and then must wait for the memory cycle to finish before it can access memory again. Consequently, the speed of the CPU is not fully exploited, and memory access conflicts occur when two or more components simultaneously access the same memory location. The paper proposes a counter restructure control method to meet the required information processing speed. The model used is a restructure controller based on FPGA technology; the speed increase is achieved by keeping the chain of memory access requests always at its maximum.

Keywords: restructure control with FPGA technology; speed; parallel processing system; performance; parallel memory control mechanism.

Received: 24/2/2012; reviewed: 14/3/2012; accepted for publication: 12/6/2012
