DOI: 10.1145/384197.384216
Article

Software Pipelining Irregular Loops On the TMS320C6000 VLIW DSP Architecture

Published: 01 August 2001

Abstract

The TMS320C6000 architecture is a leading family of Digital Signal Processors (DSPs). To achieve peak performance, this VLIW architecture relies heavily on software pipelining. Traditionally, software pipelining has been restricted to regular (FOR) loops. More recently, software pipelining has been extended to irregular (WHILE) loops, but only on architectures that provide special-purpose hardware such as rotating (predicate and general-purpose) register files, specific instructions for filling/draining software pipelined loops, and possibly hardware support for speculative code motion. In contrast, the TMS320C6000 family has a limited, static register file and no specialized hardware beyond the ability to predicate instructions using a few static registers. In this paper, we describe our experience extending a production compiler for the TMS320C6000 family to software pipeline irregular loops. We discuss our technique for preprocessing irregular loops so that they can be handled by the existing software pipeliner. Our approach is much simpler than previous approaches and works very well in the presence of the DSP applications and the target architecture which characterize our environment. With this optimization, we achieve impressive speedups on several key DSP and non-DSP algorithms.
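
To make the FOR/WHILE distinction in the abstract concrete, the following C sketch contrasts a regular loop, an irregular loop, and a hand-overlapped version of the irregular loop. It is an illustration only, not the preprocessing technique the paper describes. The function names are hypothetical, and the overlapped version assumes the array has one readable padding element after its zero terminator, because it loads one element ahead before the exit test is resolved.

```c
#include <stddef.h>

/* Regular (FOR) loop: the trip count n is known before the loop starts. */
int sum_regular(const int *a, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Irregular (WHILE) loop: the exit test depends on data loaded in the loop,
 * so the trip count is unknown until the terminator is actually read. */
int sum_until_zero(const int *a) {
    int sum = 0;
    while (*a != 0) {
        sum += *a;
        a++;
    }
    return sum;
}

/* Hand-overlapped variant (hypothetical illustration): the load for a later
 * iteration is issued before the exit test of the current one is known.
 * ASSUMPTION: a[] has one readable padding element after the 0 terminator,
 * because the final load reads one element past it and discards the value. */
int sum_until_zero_overlapped(const int *a) {
    int sum = 0;
    int cur = a[0];   /* prolog: value tested by the first iteration */
    int next = a[1];  /* prolog: loaded before we know whether cur == 0 */
    size_t i = 2;
    while (cur != 0) {
        sum += cur;   /* work for the current iteration */
        cur = next;   /* advance the pipeline by one stage */
        next = a[i++];/* speculative load, possibly for an iteration that never runs */
    }
    return sum;
}
```

The overlapped variant shows why irregular loops are harder to pipeline than regular ones: work for a later iteration starts before the loop knows whether that iteration will run, so the extra load must be safe to execute or its result must be discarded.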

    Information

    Published In

    LCTES '01: Proceedings of the ACM SIGPLAN workshop on Languages, compilers and tools for embedded systems
    August 2001
    250 pages
    ISBN:1581134258
    DOI:10.1145/384197
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. VLIW architectures
    2. WHILE loops
    3. digital signal processors
    4. software pipelining

    Qualifiers

    • Article

    Conference

    LCTES01

    Acceptance Rates

    Overall Acceptance Rate 116 of 438 submissions, 26%

    Cited By

    • (2008) Timing optimization via nest-loop pipelining considering code size. Microprocessors & Microsystems 32:7 (351-363). DOI: 10.1016/j.micpro.2008.02.002. Online publication date: 1-Oct-2008
    • (2007) A Modulo Scheduling Algorithm for a Coarse-Grain Reconfigurable Array Template. 2007 IEEE International Parallel and Distributed Processing Symposium (1-8). DOI: 10.1109/IPDPS.2007.370371. Online publication date: Mar-2007
    • (2007) Reducing the Code Size of Retimed Software Loops under Timing and Resource Constraints. Embedded System Design: Topics, Techniques and Trends (255-268). DOI: 10.1007/978-0-387-72258-0_22. Online publication date: 2007
    • (2006) Design optimization and space minimization considering timing and code size via retiming and unfolding. Microprocessors and Microsystems 30:4 (173-183). DOI: 10.1016/j.micpro.2005.11.002. Online publication date: Jun-2006
