Programming language
A programming language is a formal sign system designed for writing computer programs.
A programming language defines a set of lexical, syntactic, and semantic rules that determine the form of a program and the actions that the executor (usually a computer) performs under its control.
Since the creation of the first programmable machines, more than eight thousand programming languages have been devised (including non-standard, visual, and esoteric languages) [1].
Every year their number increases.
Some languages are used only by a handful of their own developers, while others become known to millions of people.
Professional programmers may be proficient in a dozen or more programming languages.
A programming language is intended for writing computer programs: sets of rules that allow a computer to carry out a particular computational process, manage various objects, and so on.
A programming language differs from natural languages in that it is designed for human interaction with a computer, while natural languages are used for people to communicate with each other.
Most programming languages use special constructs to define and manipulate data structures and control the calculation process.
As a rule, a programming language exists in several significantly different forms:
a language standard: a set of specifications that define its syntax and semantics; a standard can evolve over time (see standardization for more details);
implementations of the standard: the actual software tools that provide operation according to a particular version of the language standard; such tools differ in vendor, brand, and variant (version), release date, completeness of standard support, and additional features; an implementation may contain errors or peculiarities that affect the practice of using the language or even its standard.
History
Main article: History of programming languages
Early stages of development
It can be said that the first programming languages appeared even before modern electronic computers: as early as the 19th century, devices were invented that can, with some reservations, be called programmable, such as player pianos and looms.
They were controlled by sets of instructions that, in modern terms, can be considered prototypes of domain-specific programming languages.
Notable among these is the "language" in which Ada Augusta, Countess of Lovelace, wrote a program for computing Bernoulli numbers on Charles Babbage's Analytical Engine, which, had it been built, would have been the world's first computer, albeit a mechanical, steam-powered one.
In the 1930s and 1940s, A. Church, A. Turing, and, in the USSR, A. Markov developed mathematical abstractions for formalizing algorithms: the lambda calculus, the Turing machine, and normal algorithms, respectively.
At the same time, in the 1940s, electronic digital computers appeared, and a language was developed that can be considered the first high-level programming language: Plankalkül, created by the German engineer K. Zuse between 1943 and 1945 [2].
Programmers of early 1950s computers, notably the UNIVAC and the IBM 701, wrote programs directly in machine code, which consisted of ones and zeros and is considered the first generation of programming languages.
Machines from different manufacturers used different codes, so a program had to be rewritten when moving to another computer.
This method of programming was soon replaced by second-generation languages, which were still tied to the specifications of particular machines but were easier for people to use thanks to mnemonics (symbolic names for machine instructions) and the ability to map names to addresses in machine memory.
They are traditionally known as assembly languages and autocodes.
However, a program written in assembly language had to be translated into machine code before it could be executed, and special programs, also called assemblers, were developed for this purpose.
Problems also remained with porting a program from one computer architecture to another, and the programmer still had to think in low-level terms (a cell, an address, an instruction) when solving a problem.
Later, second-generation languages were improved by adding support for macros.
From the mid-1950s, third-generation languages such as Fortran, Lisp, and Cobol began to appear [3].
Languages of this type are more abstract (they are also called high-level languages) and universal; they do not depend strictly on a specific hardware platform or its machine instructions.
A program in a high-level language can be executed (at least in theory; in practice there are usually several implementation-specific versions or dialects of the language) on any computer that has a translator for the language, that is, a tool that converts the program into machine language so that the processor can execute it.
Updated versions of these languages are still in use in software development, and each of them has had a certain impact on the subsequent development of programming languages[4].
At the same time, in the late 1950s, Algol appeared, which also served as the basis for a number of further developments in this area.
The format and use of early programming languages were largely shaped by interface constraints [5].
Improvement
In the 1960s and 1970s, the main programming paradigms in use today were developed, although in many respects this process only refined the ideas and concepts introduced in the first third-generation languages.
The APL language influenced functional programming and became the first language that supported array processing [6].
The PL/I language (NPL) was developed in the 1960s to combine the best features of Fortran and Cobol.
The Simula language, which appeared around the same time, was the first to include support for object-oriented programming.
In the mid-1970s, a group of specialists introduced Smalltalk, a language that was object-oriented throughout.
Between 1969 and 1973, the C language was developed; it remains popular to this day [7] and became the basis for many later languages, including C++ and Java.
In 1972, Prolog was created: the best-known (although not the first, and far from the only) logic programming language.
In 1973, the ML language introduced an extended polymorphic type system, which marked the beginning of typed functional programming languages.
Each of these languages has spawned a family of descendants, and most modern programming languages are ultimately based on one of them.
In addition, in the 1960s and 1970s, there were active debates about the need to support structured programming in certain languages[8].
In particular, the Dutch computer scientist E. Dijkstra argued in print for completely abandoning GOTO statements in all high-level languages.
Techniques aimed at reducing the volume of programs and increasing the productivity of the programmer and the user were also developed.
Unification and development
The 1980s can loosely be called a period of consolidation.
The C++ language combined object-oriented and systems programming. The US government standardized Ada, a language derived from Pascal and intended for embedded control systems in military equipment. In Japan and other countries, significant investments were made in studying the prospects of so-called fifth-generation languages, which would include logic programming constructs [9].
The functional-language community adopted ML and Lisp as standards.
On the whole, this period was characterized by building on the foundation laid in the previous decade rather than by developing new paradigms.
An important trend in programming languages for large-scale systems was the focus on modules, that is, large units of code organization.
Although some languages, such as PL/I, already supported such functionality, module systems also found their way into Modula-2, Oberon, Ada, and ML.
Module systems were often combined with generic programming constructs [10].
An important area of work is visual (graphical) programming languages, in which "writing" a program as text is replaced by "drawing" it (constructing the program as a diagram) on a computer screen.
Visual languages make the program logic clearer and easier for a person to grasp.
In the 1990s, with the rapid growth of the Internet, languages for scripting web pages became widespread, chiefly Perl, which grew out of a scripting tool for Unix systems, and Java.
The popularity of virtualization technologies also increased.
These changes, however, were not fundamental innovations either, but rather refinements of existing paradigms and languages (in the latter case, mainly the C family).
Currently, the development of programming languages is moving towards improving security and reliability, creating new forms of modular code organization and integration with databases.
Standardization of programming languages
International standards have been created for many widely used programming languages.
Special organizations regularly update and publish specifications and formal definitions of the corresponding language.
Such committees continue to develop and modernize programming languages and decide on extending or supporting existing and new language constructs.
Data types
Main article: Data types
Modern digital computers are binary, and data is stored in binary code (although implementations in other number systems are also possible).
This data usually reflects information from the real world (names, bank accounts, measurements, etc.), representing high level concepts.
A special system by which data is organized in a program is the type system of a programming language; the development and study of type systems is known as type theory.
Languages can be divided into statically typed and dynamically typed languages, as well as untyped languages (for example, Forth).
Statically typed languages can be further subdivided into languages with mandatory declarations, where every variable and function declaration must state a type, and languages with inferred types.
Dynamically typed languages are sometimes called latently typed.
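The idea of dynamic (latent) typing can be illustrated with a minimal Python sketch. Python is used here only as a convenient dynamically typed language; the names `describe` and `add_one` are hypothetical and do not come from the article. The sketch shows that the type belongs to the value rather than to the variable, and that a type error surfaces only when the offending operation is actually executed.

```python
from typing import Union

def describe(value: Union[int, str, list]) -> str:
    """Return the name of the run-time type carried by the value."""
    return type(value).__name__

# The type belongs to the value, not the variable, so one name
# can be rebound to values of different types.
x = 42
first = describe(x)
x = "forty-two"
second = describe(x)

def add_one(v):
    return v + 1

# A type error appears only when the offending operation actually runs.
try:
    add_one("forty-two")
    error_seen = False
except TypeError:
    error_seen = True

print(first, second, error_seen)
```

The annotations on `describe` are optional hints; Python itself does not enforce them at run time, whereas a statically typed language would reject a mismatched call before the program runs.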
Data structures
Main article: Data Structures
Type systems in high-level languages allow you to define complex, composite types, so-called data structures.
As a rule, structured data types are formed as a Cartesian product of basic (atomic) types and previously defined composite types.
Basic data structures (lists, queues, hash tables, binary trees, and pairs) are often represented by special syntactic constructs in high level languages.
Such data is structured automatically.
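The statement that a composite type is a Cartesian product of its component types can be sketched as follows. The types `Color` and `Pixel` are hypothetical examples chosen for illustration; the sketch enumerates every value of a two-field record and checks that their number equals the product of the sizes of the field types.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product

class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

@dataclass(frozen=True)
class Pixel:
    lit: bool      # 2 possible values
    color: Color   # 3 possible values

# Every Pixel is one pair from the Cartesian product bool x Color.
all_pixels = [Pixel(lit, color) for lit, color in product([False, True], Color)]

print(len(all_pixels))
```

Because `Pixel` has one `bool` field and one `Color` field, the number of distinct values is 2 × 3.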
Semantics of programming languages
Main article: Semantics (programming)
There are several approaches to defining the semantics of programming languages.
The three most widespread are operational, axiomatic (derivational), and denotational (mathematical) semantics.
In the operational approach, the execution of language constructs is usually described in terms of some imaginary (abstract) computer.
Axiomatic semantics describes the effects of executing language constructs using the language of logic, by stating preconditions and postconditions.
Denotational semantics works with concepts typical of mathematics: sets, mappings, as well as judgments, statements, and so on.
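The difference between the operational and axiomatic views can be sketched on a toy language. Everything here is an assumption made for illustration: the instruction format, and the helper names `run` and `check_triple`. The function `run` plays the role of the abstract machine, while `check_triple` tests a precondition/postcondition pair for a single starting store, which is a spot check and not a proof.

```python
# A toy language (hypothetical): a program is a list of
# ("assign", name, expr) instructions, where expr maps the store to an int.

def run(program, store):
    """Operational view: an abstract machine steps through the instructions,
    updating a copy of the variable store after each one."""
    state = dict(store)
    for op, name, expr in program:
        if op != "assign":
            raise ValueError(f"unknown instruction: {op}")
        state[name] = expr(state)
    return state

def check_triple(pre, program, post, store):
    """Axiomatic view: if the precondition holds before the program runs,
    the postcondition must hold afterwards (checked for one store only)."""
    if not pre(store):
        return True  # the triple says nothing about stores that violate pre
    return post(run(program, store))

increment = [("assign", "x", lambda s: s["x"] + 1)]

final = run(increment, {"x": 0})
holds = check_triple(lambda s: s["x"] >= 0, increment,
                     lambda s: s["x"] > 0, {"x": 5})
print(final, holds)
```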
Programming paradigm
Main article: Programming paradigm
A programming language is built around a particular underlying model of computation and a programming paradigm.
Although most languages are oriented toward the imperative model of computation defined by the von Neumann architecture, there are other approaches.
Examples include languages with a stack-based model of computation (Forth, Factor, PostScript, and others), functional languages (Lisp, Haskell, ML, F#, and Refal, the last based on the model of computation introduced by the Soviet mathematician A. A. Markov Jr.), and logic programming languages (Prolog).
Declarative and visual programming languages are also under active development, as are methods and tools for building domain-specific languages (see Language-oriented programming).
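The stack-based model of computation mentioned above can be sketched with a small postfix evaluator. This is not Forth itself; the function `eval_rpn` and its tiny vocabulary (`+`, `-`, `*`, `dup`) are assumptions made for illustration. Each token either pushes a number or pops its operands from an explicit data stack and pushes the result.

```python
def eval_rpn(source):
    """Evaluate space-separated postfix tokens on an explicit data stack."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    stack = []
    for token in source.split():
        if token in ops:
            b = stack.pop()  # top of the stack is the right operand
            a = stack.pop()
            stack.append(ops[token](a, b))
        elif token == "dup":
            stack.append(stack[-1])  # duplicate the top element
        else:
            stack.append(int(token))
    return stack

print(eval_rpn("2 3 + 4 *"))  # computes (2 + 3) * 4
print(eval_rpn("7 dup *"))    # squares 7
```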
Ways to implement languages
Programming languages can be implemented as compiled, interpreted, and embedded.
A program in a compiled language is converted (compiled) by a compiler (a special program) into machine code (a set of instructions) for a particular type of processor and then linked into an executable module that can be run as a separate program.
In other words, the compiler translates the program's source code from a high-level programming language into the binary codes of processor instructions.
If the program is written in an interpreted language, the interpreter directly executes (interprets) the source text without prior translation.
At the same time, the program remains in the source language and cannot be run without an interpreter.
In this sense, the computer's processor can be called an interpreter of machine code.
The division into compiled and interpreted languages is a matter of convention.
For any traditionally compiled language, such as Pascal, an interpreter can be written.
In addition, most modern "pure" interpreters do not execute language constructs directly but first compile them into some high-level intermediate representation (for example, with variables dereferenced and macros expanded).
Conversely, for any interpreted language a compiler can be created; for example, Lisp, originally interpreted, can be compiled without any restrictions.
Code created while the program runs can also be compiled dynamically during execution.
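The last two points can be sketched in Python, whose reference implementation is usually described as an interpreter yet first translates source text into an intermediate code object. The sketch uses the built-in functions `compile` and `eval`; the variable names and the expression are hypothetical.

```python
# Source text that exists only at run time, e.g. assembled by the program.
source = "base * 2 + offset"

# Stage 1: translate the text into a code object (an intermediate form).
code = compile(source, "<generated>", "eval")

# Stage 2: execute the already-compiled code object, possibly many times,
# without re-parsing the source text.
results = [eval(code, {"base": b, "offset": 1}) for b in range(3)]

print(type(code).__name__, results)
```

Separating the two stages means the cost of parsing is paid once, even if the expression is evaluated repeatedly.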
As a rule, compiled programs run faster and do not require additional programs to run, since they are already translated into machine language.
At the same time, every time the program text changes, it needs to be recompiled, which slows down the development process.
In addition, a compiled program can only be executed on the same type of computer and, as a rule, under the same operating system for which the compiler was designed.
To create an executable file for a different type of machine, a new compilation is required.
Interpreted languages have some specific additional capabilities (see above); in addition, programs written in them can be run immediately after a change, which makes development easier.
A program in an interpreted language can often be run on different types of machines and operating systems without additional effort.
However, interpreted programs run noticeably slower than compiled ones, and they cannot be executed without an interpreter.
Some languages, such as Java and C#, occupy a middle ground between compiled and interpreted.
Namely, the program is compiled not into machine language but into machine-independent low-level code, called bytecode.
The bytecode is then executed by a virtual machine.
Bytecode is usually interpreted, although parts of it can be translated into machine code during program execution by just-in-time (JIT) compilation to speed the program up.
For Java, the bytecode is executed by the Java Virtual Machine (JVM); for C#, by the Common Language Runtime.
This approach, in a sense, combines the advantages of both interpreters and compilers.
There are also languages that have both an interpreter and a compiler (Forth).
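The bytecode scheme can be observed directly in CPython, which is used here as an analogous illustration and is not one of the platforms named above. The sketch relies on the standard `dis` module; the function `area` is a hypothetical example. It shows that a function body is stored as a sequence of virtual-machine instructions and not as machine code.

```python
import dis

def area(width, height):
    return width * height

# The function body is stored as bytecode inside a code object.
raw = area.__code__.co_code  # raw bytes of virtual-machine instructions
names = [ins.opname for ins in dis.get_instructions(area)]

print(len(raw) > 0)
print(names)  # the exact opcode names differ between CPython versions
```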
Low-level programming languages
Main article: Low level programming language
The first computers had to be programmed in binary machine code.
However, programming this way is time-consuming and difficult.
To simplify the task, low-level programming languages began to appear, which allowed machine instructions to be written in a form understandable to humans.
Special translator programs were created to convert them into binary code.
An example of a low-level language is assembly language.
Low-level languages are oriented toward a specific type of processor and take its features into account; therefore, to port an assembly-language program to another hardware platform, it must be almost completely rewritten.
There are certain differences in the syntax of programs for different compilers.
However, CPUs from AMD and Intel are largely compatible and differ only in some specific instructions.
Specialized processors for other devices, such as video cards and phones, differ significantly.
Low-level languages are usually used for writing small system programs, device drivers, and modules interfacing with non-standard equipment, and for programming specialized microprocessors, where the main requirements are compactness, speed, and direct access to hardware resources.
High-level programming languages
Main article: High level programming language
High-level languages do not take the features of specific computer architectures into account, so applications written in them are easily ported from one computer to another.
In most cases, it is enough to simply recompile the program for a specific computer architecture and operating system.
Developing programs in such languages is much easier, and fewer errors are made.
The program development time is significantly reduced, which is especially important when working on large software projects.
Developers today tend to regard programming languages that have direct access to memory and registers, or that allow inline assembly, as languages with a low level of abstraction.
Consequently, most of the languages considered high-level before 2000 are no longer regarded as such.
High-level languages include: Address programming language, Fortran, Cobol, Algol, Pascal, PascalABC, Python, Java, C, Basic, C++, Objective-C, Smalltalk, C#, Delphi.
The disadvantage of some high level languages is the large size of programs in comparison with programs in low level languages.
On the other hand, for algorithmically and structurally complex programs, when using supercompilation, the advantage may be on the side of high level languages.
The source text of a program in a high-level language is shorter; measured in bytes of resulting code, however, a program originally written in assembly language will be more compact.
High-level languages are therefore used mainly to develop software for computers and devices with a large amount of memory.
Various assembly languages are used to program other devices, where program size is critical.
Symbols used
Modern programming languages are designed to use ASCII; that is, the availability of all ASCII graphic characters is a necessary and sufficient condition for writing any language construct.
ASCII control characters are used only sparingly: only carriage return (CR), line feed (LF), and horizontal tab (HT) are allowed (sometimes also vertical tab (VT) and form feed (FF)).
For more information on this topic, see: Portable character set.
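The rule above can be expressed as a small check on source text. The function name `disallowed_chars` is hypothetical, and the sketch takes the strict reading of the rule: it accepts ASCII graphic characters, the space, and CR, LF, and HT, and treats VT and FF as disallowed even though some languages permit them.

```python
ALLOWED_CONTROLS = {"\r", "\n", "\t"}  # CR, LF, HT

def disallowed_chars(source):
    """Return the characters of `source` outside the permitted set:
    ASCII graphic characters (0x21-0x7E), space, and CR, LF, HT."""
    bad = []
    for ch in source:
        graphic_or_space = 0x20 <= ord(ch) <= 0x7E
        if not graphic_or_space and ch not in ALLOWED_CONTROLS:
            bad.append(ch)
    return bad

print(disallowed_chars("x = 1\n\ty = x + 2\r\n"))
print(disallowed_chars("price = 5 \u20ac\x0b"))
```

The second call reports the euro sign and the vertical tab, since neither belongs to the strict set.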
Early languages that emerged in the era of 6 bit characters used a more limited set.
For example, the Fortran alphabet includes 49 characters (including the space): A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 = + - * / ( ) . , $ ' :
A notable exception is the APL language, which uses a lot of special characters.
The use of characters outside of ASCII (for example, KOI8-R characters or Unicode characters) depends on the implementation: sometimes they are allowed only in comments and character or string constants, and sometimes also in identifiers.
In the USSR, there were languages in which all keywords were written in Russian letters, but such languages did not gain much popularity (the exception is the built-in programming language of 1C:Enterprise).
For more information on this topic, see: Non-English-based programming languages.
The expansion of the set of symbols used is hindered by the fact that many software development projects are international.
It would be very difficult to work with code in which some variable names are written in Russian letters, others in Arabic, and others in Chinese characters.
At the same time, for working with text data, new generation programming languages (Delphi 2006, C#, Java) support Unicode.
Categories of programming languages
Functional
Procedural (imperative)
Stack-based
Aspect-oriented
Declarative
Dynamic
Educational
Interface description
Prototype-based
Object-oriented
Reflective (that is, supporting reflection)
Logic
Scripting
Esoteric
Mathematically based programming languages
A number of well-known authors [11] [12] single out a special category of "mathematically derived languages".
Alan Kay likewise distinguishes languages that are a "crystallization of style" from languages that are an "agglutination of features" [13].
Mathematically derived languages are those whose semantics directly embody a certain mathematical model, slightly adapted (without compromising its integrity) to make the language more practical for developing real programs.
Only a few languages fall into this category; most are designed primarily with efficient translation to a Turing machine in mind and include only a certain subset that embodies a particular mathematical model, from arithmetic to concurrency primitives (for example, occam-π is occam supplemented with a set of constructs that embody the π-calculus).
Examples of mathematically based languages and mathematical models implemented by them:
Agda, Epigram, Idris: Martin-Löf's intuitionistic type theory.
APL and its descendants (J, K): an original, unnamed semantics embodying Iverson notation for array calculus (the term "array languages" is often used).
Coq: the calculus of inductive constructions.
Erlang: process calculus (initially in the form of the actor model; a calculus-based justification was built later [14]).
Forth: a stack machine and a concatenative programming language.
Haskell: category theory (including a Cartesian closed category embodying the lambda calculus; a category of monads for modeling side effects; an extension of the Hindley-Milner type system; a system of kinds; and so on).
Joy: function composition and homomorphism (in other words, a purely concatenative and, as a consequence, purely functional programming language).
Lisp: Church's lambda calculus (including the S-expression language embodying Church's pair notation).
Scheme: a "refined" Lisp dialect (more strongly typed, more homoiconic, restricted to hygienic macro definitions, and observing the numerical tower), supplemented with continuation notation.
ML: typed lambda calculus, that is, lambda calculus supplemented by the Hindley-Milner type system.
Prolog: predicate calculus.
Mercury: predicate calculus, supplemented by the Hindley-Milner type system.
Smalltalk: set theory [15] (observing the numerical tower).
SQL: tuple calculus (a variant of relational calculus, which is in turn based on first-order predicate calculus).
SGML and its descendants (HTML, XML): tree notation (an important special case of graphs).
Unlambda: combinatory logic.
Regular expressions.
Refal: Turchin's original semantics, called the "Refal machine" or "Refal automaton", created on the basis of Markov's normal algorithm and combining automata theory, pattern matching, and term rewriting.
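To give a feel for what it means for a language to embody a mathematical model, the lambda calculus named in the Lisp and ML entries can be imitated with nothing but single-argument functions. The following is a minimal sketch of Church numerals written with Python lambdas; the names `zero`, `succ`, `add`, `mul`, and `to_int` are chosen for illustration and are not taken from any of the languages listed.

```python
# Church numerals: the number n is the function that applies f to x n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))

def to_int(n):
    """Convert a Church numeral to a Python int by counting applications."""
    return n(lambda k: k + 1)(0)

one = succ(zero)
two = succ(one)
three = add(one)(two)
six = mul(two)(three)

print(to_int(three), to_int(six))
```

Numbers and arithmetic are built here from function abstraction and application alone, which is the sense in which a language can be a direct embodiment of a calculus.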
A mathematical foundation for a language can guarantee (or at least promise with very high probability) some or all of the following positive properties:
A significant increase in the stability of programs.
In some cases, this is achieved by constructing a proof of soundness for the language itself (see type safety), by significantly simplifying the formal verification of programs, or even by obtaining a language that is itself an automated proof system (Coq, Agda).
In other cases, it comes from early detection of errors on the very first trial runs of programs (Forth and regular expressions).
Potentially higher efficiency of programs.
Even if the semantics of a language is far from the architecture of the target compilation platform, formal methods of global program analysis can be applied to it (although writing even a trivial translator may be more difficult).
For example, for Scheme and Standard ML there are mature whole-program optimizing compilers and supercompilers whose output can confidently compete in speed with the low-level C language and even outperform it (although the compilers themselves are far more resource-intensive).
One of the fastest DBMSs, KDB [16], is written in the K language.
The Scala language (which inherited its mathematics from ML) provides higher speed on the JVM platform than Java, the platform's "native" language.
On the other hand, Forth has a reputation as one of the least resource-demanding languages (less demanding than C) and is used for developing real-time applications for the lowest-powered computers; in addition, a Forth translator is one of the least laborious to implement in assembly language.
A limit, known in advance (either unbounded or, conversely, clearly delineated), on the growth in complexity of the software components, systems, and complexes that can be expressed in the language while preserving quality indicators [11][17].
Languages that lack a mathematical foundation (and these are the ones most often used in the mainstream: C++, Java, C#, Delphi, and others) in practice limit the functionality that can be implemented and/or reduce quality as the system grows more complex [18], since they are characterized by exponential complexity growth curves, both for the work of a single person and for the management of the project as a whole [19][20].
The predicted complexity of a system leads either to a step-by-step decomposition of the project into many smaller tasks, each solved in a suitable language, or to language-oriented programming when the task the language addresses is precisely the description of semantics and/or symbolic computation (Lisp, ML, Haskell, Refal, regular expressions).
Languages with no limit on the growth of program complexity are often called metalanguages (which is incorrect under a literal reading of the term but holds in practice, since any mini-language chosen to solve a subtask of a larger problem can be represented as a syntactic and semantic subset of such a language, without requiring translation [21]).
Convenience for a person solving the problems toward which the language is inherently oriented (see problem-oriented language), which to some extent can also (indirectly) improve the stability of the resulting programs by raising the likelihood of detecting errors in the source code and reducing code duplication.
It should be borne in mind that languages descended from mathematically derived languages will not necessarily have these properties.
For example, Python combines several of the models mentioned, but there is no formal justification for the combination, so it cannot be considered mathematically derived, and as a result only the last of these properties applies to it.
See also
Computer language
Phrase structure grammar
Structured programming
Typed lambda calculus
High-level programming language
Programming
Hello, world!
Coding style standard
Notes
1. List of programming languages (in English). Archived from the original on August 22, 2011.
2. Rojas, Raúl, et al. (2000). "Plankalkül: The First High Level Programming Language and its Implementation". Institut für Informatik, Freie Universität Berlin, Technical Report B 3/2000.
3. Linda Null, Julia Lobur. The Essentials of Computer Organization and Architecture, 2nd edition. Jones & Bartlett Publishers, 2006. ISBN 0-7637-3769-0, p. 435.
4. O'Reilly Media. History of programming languages (PDF). Retrieved October 5, 2006. Archived from the original on May 10, 2013.
5. Frank da Cruz. IBM Punch Cards. Columbia University Computing History.
6. Richard L. Wexelblat. History of Programming Languages. Academic Press, 1981, chapter XIV.
7. François Labelle. Programming Language Usage Graph. SourceForge. Retrieved June 21, 2006. Archived from the original on May 10, 2013.
8. "The Semicolon Wars" (2006). American Scientist 94 (4): 299–303.
9. Tetsuro Fujise, Takashi Chikayama, Kazuaki Rokusawa, Akihiko Nakase (December 1994). "KLIC: A Portable Implementation of KL1"
