programming languages
19. A language that doesn't affect the way you think about programming, is not worth knowing. Alan Perlis, Epigrams on Programming, ACM's SIGPLAN Notices Volume 17, No. 9, September 1982, pages 7-13
direct programming
Originally computers were programmed directly in a language that the computer understood.
This direct programming could involve directly wiring the program into the computer. In some cases, this involved a soldering iron. In other cases there was some kind of plug-board to make it easier to change the programmed instructions. This method was known as hard wiring.
Large telegraph networks and later large telephone networks became so complex as to essentially be a computer on a system-wide basis. Many of the ideas (especially logic circuits) that were later necessary to create computers were first developed for large scale telegraph and telephone systems.
In some early computers the programming could be accomplished with a set of switches. The use of front panel switches (and corresponding indicator lights) continued as an option on many mainframe and minicomputer systems. Some microcomputer systems intended for hobbyists and for dedicated systems also had some kind of front panel switches.
Another method was the use of punched cards. This was a technology originally developed for controlling early industrial age factories, particularly large looms. The designs or patterns for the cloth would be programmed using punched cards. This made it easy to switch to new designs. Some of the large looms became so complex that they were essentially computers, although that terminology wasn't used at the time.
machine code and object code
Both the front panel switch and the punched card methods involved the use of numeric codes. Each numeric code indicated a different machine instruction. The numbers used internally are known as machine code. The numbers on some external media, such as punched cards (or disk files) are known as object code.
assembly and assemblers
One of the early developments was a symbolic assembler. Instead of writing down a series of binary numbers, the programmer would write down a list of machine instructions, using human-readable symbols. A special program, the assembler, would convert these symbolic instructions into object or machine code.
Assembly languages have the advantage that they are easier to understand than raw machine code, but still give access to all of the power of the computer (as each assembler symbol translates directly into a specific machine instruction).
Assembly languages have the disadvantage that they are still very close to machine language. Assembly programs can be difficult for a human to follow and understand, and time-consuming to write. Also, programs written in assembly are tied to a specific kind of computer hardware and can't be reused on another kind of computer.
The human readable version of assembly code is known as source code (it is the source that the assembler converts into object code). All programs written in high level languages are also called source code.
high level languages
High level languages are designed to be easier to understand than assembly languages and allow a program to run on multiple different kinds of computers.
The source code written in high level languages needs to be translated into object code. The two basic approaches are compilers and interpreters. Some programming languages are available in both interpreted and compiled versions.
High level languages have usually been designed to meet the needs of some particular kind of programming. For example, FORTRAN was originally intended for scientific programming. COBOL was originally intended for business programming and data processing. SQL was originally intended for data base queries. C was originally intended for systems programming. LISP was originally intended for list processing. PHP was originally intended for web scripting. Ada was originally intended for embedded systems. BASIC and Pascal were originally intended as teaching languages.
Some high level languages were intended to be general purpose programming languages. Examples include PL/I and Modula-2. Some languages that were originally intended for a specific purpose have turned into general purpose programming languages, such as C and Pascal.
compilers
Compilers convert a finished program (or section of a program) into object code. This is often done in steps. Some compilers convert high level language instructions into assembly language instructions and then an assembler is used to create the finished object code.
Some compilers convert high level language instructions into an intermediate language. This intermediate language is platform-independent (it doesn't matter which actual computer hardware is eventually used). The intermediate language is then converted into object code for a specific kind of computer. This approach makes it easier to move (or port) a compiler from one kind of computer to another. Only the last step (or steps) needs to be rewritten, while the main compiler is reused.
Compiled code almost always runs faster than interpreted code. An optimizing compiler examines a high level program and figures out ways to optimize the program so that it runs even faster.
A C program is considered to be strictly conforming to ANSI C if the program only uses features and libraries as they are described in the ANSI standard (with no additional optional features or extensions).
A conforming hosted implementation accepts any strictly conforming program. This applies to a program that is intended to run on an operating system.
A conforming freestanding implementation accepts any strictly conforming program that doesn't use any library facilities other than those in the header files float.h, limits.h, stdarg.h, and stddef.h. This applies to a program that is intended to run on an embedded system or other environment with minimal operating system support (such as no file system).
compilers and assemblers
A brief explanation of the difference between compilers and assemblers.
An assembler converts symbolic (human readable text) instructions into their corresponding machine or object instructions. There is generally a one-to-one correspondence between assembler instructions and machine instructions (at the macro-machine level).
A compiler converts a high level language into machine or object code. Typically there are many machine instructions for every high level language instruction. There are some exceptions: some older languages, such as COBOL and FORTRAN, had several instructions that translated directly into a single machine instruction, but even in those cases, most of the useful portions of the language were translated into many machine instructions.
An example from the C programming language:
if (x==0) z=3; /* test to see if x is zero, if x is zero, then set z to 3 */
The same example in 8080 assembly language (everything after the semicolon ; is a comment to help you understand):
LXI H,$E050 ; point to location of variable x (Load Double Immediate into HL register pair)
MOV A,M ; load value of x into the accumulator (MOVe to A register from Memory)
ORA A ; test the value of the accumulator (OR the A register with itself, which sets the zero flag when A is zero)
JNZ @1 ; if not zero, then skip variable assignment (Jump if Not Zero)
MVI A,#3 ; load three (new value for variable z) into accumulator (MoVe Immediate into A register the number three)
LXI H,$E060 ; point to location of variable z (Load Double Immediate into HL register pair)
MOV M,A ; store three into variable z (MOVe to Memory from A register)
@1 NOP ; drop through or skip to here to continue program (No OPeration)
DS $E050 ; reserve memory for variable x (Data Storage)
DS $E060 ; reserve memory for variable z (Data Storage)
The same example in 8080 machine code (the comments after the semicolon ; wouldn't normally be included, but are added to help you follow along; in a real case of object code it would be just binary/hexadecimal numbers):
21 ; LXI H
50 ; data address (low byte)
E0 ; data address (high byte)
7E ; MOV A,M
B7 ; ORA A
C2 ; JNZ
0E ; code address (low byte of @1, assuming the code starts at address $0000)
00 ; code address (high byte)
3E ; MVI A
03 ; data
21 ; LXI H
60 ; data address (low byte)
E0 ; data address (high byte)
77 ; MOV M,A
00 ; NOP
and later in memory: the data storage
$E050 ; variable x, unknown contents
$E060 ; variable z, becomes three (3) if x is zero
You will notice that there is one machine instruction for each assembly instruction (some instructions are followed by data or addresses), while there are many assembly or machine instructions for one C instruction.
linkers
As programs grow in size, requiring teams of programmers, there is a need to break them up into separate files so that different team members can work on their individual assignments without interfering with the work of others. Each file is compiled separately and then combined later.
Linkers are programs that combine the various parts of a large program into a single object program. Linkers also bring in support routines from libraries. These libraries contain utility and other support code that is reused over and over for lots of different programs.
Historically, linkers also served additional purposes that are no longer necessary, such as resolving relocatable code on early hardware (so that more than one program could run at the same time).
loaders
A loader is a program that loads programs into main memory so that they can be run. In the past, a loader would have to be explicitly run as part of a job. In modern times the loader is hidden away in the operating system and called automatically when needed.
interpreters
Interpreters convert each high level instruction into a series of machine instructions and then immediately run (or execute) those instructions. In some cases, the interpreter has a library of routines and looks up the correct routine from the library to handle each high level instruction.
Interpreters inherently run more slowly than the same software compiled. In the early days of computing this was a serious problem. Since the mid-1980s, computers have become fast enough that interpreters run fine for most purposes.
Most of the scripting languages common on the web and servers are interpreted languages. This includes JavaScript, Perl, PHP, Python, and Ruby.
Note that some modern programming languages (including Java and Python) translate the raw text into an intermediate numeric code (usually a byte code) for a virtual machine. This method is generally faster than older traditional methods of interpreting scripts and has the advantage of providing a platform-independent stored code.
editors
An editor is a program that is used to edit (or create) the source files for programming. Editors rarely have the advanced formatting and other features of a regular word processor, but sometimes include special tools and features that are useful for programming.
Two important editors are emacs and vi from the UNIX world. I personally use Tom Bender's Tex-Edit Plus, which is available in multiple different languages (Danish, English, French, German, Italian, Japanese, Spanish).
command line interface
A command line interface is an old-style computer interface where the programmer (or other person) controls the computer by typing lines of text. The text lines are used to give instructions (or commands) to the computer. The most famous example of a command line interface is the UNIX shell.
In addition to built-in commands, command line interfaces could be used to run programs. Additional information could be passed to a program, such as names of files to use and various program switches that would modify how a program operated.
See the information on how to use the shell.
development environment
A development environment is an integrated set of programs (or sometimes one large monolithic program) that is used to support writing computer software. Development environments typically include an editor, compiler (or compilers), linkers, and various additional support tools. Development environments may include their own limited command line interface specifically intended for programmers.
The term development environment can also be used to mean the collection of programs used for writing software, even if they aren't integrated with each other.
Because there are a huge number of different development environments and a complete lack of any standardization, the methods used for actually typing in, compiling, and running a program are not covered by this book. Please refer to your local documentation for details.
The development environments for UNIX, Linux, and Mac OS X are discussed in the chapter on shell programming.
Stanford introduction
Stanford CS Education Library This [the following section until marked as end of Stanford University items] is document #101, Essential C, in the Stanford CS Education Library. This and other educational materials are available for free at http://cslibrary.stanford.edu/. This article is free to be used, reproduced, excerpted, retransmitted, or sold so long as this notice is clearly reproduced at its beginning. Copyright 1996-2003, Nick Parlante, nick.parlante@cs.stanford.edu.
The C Language
C is a professional programmer's language. It was designed to get in one's way as little as possible. Kernighan and Ritchie wrote the original language definition in their book, The C Programming Language (below), as part of their research at AT&T. Unix and C++ emerged from the same labs. For several years I used AT&T as my long distance carrier in appreciation of all that CS research, but hearing "thank you for using AT&T" for the millionth time has used up that good will.
Some languages are forgiving. The programmer needs only a basic sense of how things work. Errors in the code are flagged by the compile-time or run-time system, and the programmer can muddle through and eventually fix things up to work correctly. The C language is not like that.
The C programming model is that the programmer knows exactly what they want to do and how to use the language constructs to achieve that goal. The language lets the expert programmer express what they want in the minimum time by staying out of their way.
C is simple in that the number of components in the language is small -- if two language features accomplish more-or-less the same thing, C will include only one. C's syntax is terse and the language does not restrict what is allowed -- the programmer can pretty much do whatever they want.
C's type system and error checks exist only at compile-time. The compiled code runs in a stripped down run-time model with no safety checks for bad type casts, bad array indices, or bad pointers. There is no garbage collector to manage memory. Instead the programmer manages heap memory manually. All this makes C fast but fragile.
Analysis -- Where C Fits
Because of the above features, C is hard for beginners. A feature can work fine in one context, but crash in another. The programmer needs to understand how the features work and use them correctly. On the other hand, the number of features is pretty small.
Like most programmers, I have had some moments of real loathing for the C language. It can be irritatingly obedient -- you type something incorrectly, and it has a way of compiling fine and just doing something you don't expect at run-time. However, as I have become a more experienced C programmer, I have grown to appreciate C's straight-to-the-point style. I have learned not to fall into its little traps, and I appreciate its simplicity.
Perhaps the best advice is just to be careful. Don't type things in you don't understand. Debugging takes too much time. Have a mental picture (or a real drawing) of how your C code is using memory. That's good advice in any language, but in C it's critical.
Perl and Java are more portable than C (you can run them on different computers without a recompile). Java and C++ are more structured than C. Structure is useful for large projects. C works best for small projects where performance is important and the programmers have the time and skill to make it work in C. In any case, C is a very popular and influential language. This is mainly because of C's clean (if minimal) style, its lack of annoying or regrettable constructs, and the relative ease of writing a C compiler.
Other Resources
The C Programming Language, 2nd ed., by Kernighan and Ritchie. The thin book which for years was the bible for all C programmers. Written by the original designers of the language. The explanations are pretty short, so this book is better as a reference than for beginners.
end of Stanford introduction
Names and logos of various OSs are trademarks of their respective owners.