Abstract
A multi-stage algorithm is a computing algorithm consisting of a sequence of nested loop constructs to be executed sequentially. In this paper, a systematic approach to address the multi-stage systolic mapping problem is proposed. To reduce the inter-stage data communication overhead, we argue that the adjacent stages should have matched I/O interface. For this, the conditions of I/O matching between two stage's mapping are established. A systematic method to derive the I/O matched mapping is also presented. To improve the performance degradation due to theinitiation andconclusion phases of computation in systolic array, a technique calledchaining which tries to overlap part of the computations in successive stages and thus effectively reduces the computation latency is employed. With these results, the multi-stage mapping problem is formulated as an optimization problem and a heuristic search based multi-stage systolic mapping (MSSM) tool is developed. Several design examples are presented to illustrate the potential use of MSSM.
Similar content being viewed by others
References
R. Kuhn, “Transforming algorithms for single stage and VLSI architectures,” inWorkshop on Interconnection Networks for Parallel and Distributed Processing, 1980. IEEE Press, Piscataway, NJ, pp. 11–19.
D. Moldovan, “On the analysis and synthesis of VLSI algorithms,”IEEE Trans. on Computers, vol. C-31, 1982, pp. 1121–1126.
D. Moldovan, “On the design of algorithms for VLSI systolic arrays,”Proc. of IEEE, vol. 71, 1983, pp. 113–119.
D. Moldovan and J. Fortes, “Partitioning and mapping algorithms into fixed size systolic arrays,”IEEE Trans. on Computers, vol. C-35, 1986, pp. 1–12.
W. Miranker and A. Winkler, “Spacetime representations of computational structures,”Computing, vol. 32, 1984, pp. 93–114.
P. Quinton, “Automatic synthesis of systolic arrays from uniform recurrent equations,” inProc. 11th Annu. Symp. Comput. Architecture, 1984, pp. 208–214.
P. Quinton, “Mapping recurrences on parallel architectures,” inProc. Third Int. Conf. Supercomputing, International Supercomputing Institute, Inc. Florda, May 1988, pp. 1–8.
G. Li and B. Wah, “The design of optimal systolic arrays,”IEEE Trans. on Computers, vol. C-34, 1985, pp. 66–77.
M. Chen and C. Mead, “Concurrent algorithms as space-time recursion equations,” inProc. USC Workshop VLSI Modern Signal Processing, Los Angeles, CA, 1982, pp. 31–52.
S. Kung, “On supercomputing with systolic/wavefront array processors,”Proc. of IEEE, vol. 72, 1984, pp. 867–884.
H. Kung and W. Lin, “An algebra for VLSI algorithm design,” inConf. on Elliptic problem solvers, 1983.
P. Cappello and K. Steiglits, “Unifying VLSI array designs with geometric transformation,” inInt'l. Conf. on Parallel Processing, IEEE Press, Piscataway, NJ, 1983, pp. 448–457.
P. Lee and Z. Kedem, “Synthesizing linear array algorithms from nested for loop algorithms,”IEEE Trans. on Computers, vol. 37, 1988, pp. 1578–1598.
M. Lam and J. Mostow, “A transformational model of VLSI systolic design,”Computer, 1985, pp. 42–52.
J. Fortes, K. Fu, and B. Wah, “Systematic approaches to the design of algorithmically specified systolic arrays,” inInt'l. Conf. on Design Automation, IEEE, 1985, pp. 300–303.
S. Rao and T. Kailath, “What is a systolic algorithm,”SPIE, vol. 614, 1986, pp. 34–48.
A. Athavale, J. JáJá, and J. Rowlett, “Compiling programs for systolic arrays,” inVLSI Signal Processing III, IEEE Press, Piscataway, NJ, 1988, pp. 509–519.
S. Kung,VLSI Array Processors, Englewood Cliffs, NJ: Prentice Hall, 1987.
A. Schrijver,Theory of Integer and Linear Programming, New York: Wiley and Sons, 1988.
X. Zhong, I. Wong, and S. Rajopadhye, “Bounds on the number of linear allocation functions,” inVLSI Signal Processing IV, IEEE Press, 1990, pp. 85–94.
Y. Wong and J.-M. Delosme, “Optimization of processor count for systolic arrays,” Technical Report YALEU/DCS/RR-697, Yale University, 1989.
Y. Hwang and Y. Hu, “Parameterized dependence graph and its applications to multi-stage systolic array mapping,” inProc. ICASSP, IEEE Press, Piscataway, NJ, 1989, pp. 1029–1032.
S. Kung and S. Jean, “A VLSI array compiler system (VACS) for array design,” inVLSI Signal Processing III, (E.R. Brodersen and H.S. Moscovitz, eds.), IEEE Press, Piscataway, NJ, 1988, pp. 495–508.
C. Paige and M. Saunders, “Least squares estimation of discrete linear dynamic systems using orthogonal transformation,”SIAM J. Numerical Analysis, 1977, pp. 180–193.
K. Liu, S. Hsieh, and K. Yao, “Comparisons of parallel SVD in VLSI array processors,” inVLSI Signal Processing IV, IEEE Press, Piscataway, NJ, 1990, pp. 135–146.
Author information
Authors and Affiliations
Rights and permissions
About this article
Cite this article
Hwang, YT., Hu, Y.H. MSSM—A design aid for multi-stage systolic mapping. J VLSI Sign Process Syst Sign Image Video Technol 4, 125–145 (1992). https://doi.org/10.1007/BF00925118
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/BF00925118