This article explains the differences between statically typed and dynamically typed languages, examines the concepts of "strong" and "weak" typing, and compares the power of typing systems in different languages. Recently, there has been a clear movement towards more strict and powerful typing systems in programming, so it is important to understand what we are talking about when talking about types and typing.



A type is a collection of possible values. An integer can have the values 0, 1, 2, 3, and so on. A Boolean can be true or false. You can come up with your own type, for example a “HighFive” type, in which the possible values are “high” and “5”, and nothing else. It's not a string or a number, it's a new, separate type.


Statically typed languages restrict the types of variables: the language might know, for example, that x is an Integer. In that case the programmer is prohibited from writing x = true; it is incorrect code. The compiler will refuse to compile it, so we won't even be able to run it. Different static type systems have different expressive power, and none of the popular ones can express our HighFive type (though many can express other, more sophisticated ideas).


Dynamically typed languages ​​mark values ​​with types: the language knows that 1 is an integer, 2 is an integer, but it cannot know that the variable x always contains an integer.


The language runtime checks these labels at different points in time. If we try to add two values, it can check whether they are numbers, strings, or arrays. Then it will add these values, glue them together, or throw an error, depending on the type.
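A minimal Python sketch of the idea above: values carry type tags, variable names do not, and + consults the tags when it runs. The variable names are illustrative only.

```python
# Values carry their own type tags; variable names do not.
x = 1
tag_before = type(x)          # the value 1 is tagged as int
x = "one"                     # the same name may later hold a string
tag_after = type(x)           # the value "one" is tagged as str

# "+" inspects the tags of both operands at run time and picks a behaviour.
number_sum = 1 + 2            # integer addition
joined_text = "a" + "b"       # string concatenation
joined_list = [1] + [2]       # list concatenation

# Mismatched tags are detected only when the expression is evaluated.
try:
    "a" + 1
    outcome = "no error"
except TypeError:
    outcome = "TypeError"
```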

Statically typed languages

Static languages check the types in a program at compile time, before the program runs. Any program in which types violate the rules of the language is considered incorrect. For example, most static languages will reject the expression "a" + 1 (C is an exception to this rule). The compiler knows that "a" is a string and 1 is an integer, and that + only works when the left and right sides are of the same type, so it doesn't have to run the program to realize there's a problem. Every expression in a statically typed language has a specific type, which can be determined without running the code.


Many statically typed languages require type annotations. The Java function public int add(int x, int y) takes two integers and returns a third integer. Other statically typed languages can infer types automatically. The same addition function in Haskell looks like this: add x y = x + y. We don't tell the language the types, but it can figure them out for itself: it knows that + only works on numbers, so x and y must be numbers, so the add function takes two numbers as arguments.


This does not reduce the "static" nature of the type system. Haskell's type system is renowned for being static, strict, and powerful, and Haskell is ahead of Java on all these fronts.

Dynamically typed languages

Dynamically typed languages do not require types to be specified, and they do not infer them either. The types of variables are unknown until they hold specific values at run time. For example, a function in Python


def f(x, y): return x + y

can add two integers, concatenate strings, lists, and so on, and we cannot tell what exactly is happening until we run the program. It's possible that function f will be called with two strings at one point and with two numbers at another. In that case, x and y will hold values of different types at different times. This is why values in dynamic languages are said to have a type, but variables and functions do not. The value 1 is definitely an integer, but x and y can be anything.
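To make this concrete, here is the same f called with three different kinds of arguments. The list names results and result_types are introduced only for this illustration.

```python
def f(x, y):
    return x + y

# The same function body adds, concatenates strings, or concatenates lists,
# depending on the values it happens to receive.
results = [f(1, 2), f("ab", "cd"), f([1], [2, 3])]

# The result type is a property of each value, not of f itself.
result_types = [type(r).__name__ for r in results]
```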

Comparison

Most dynamic languages ​​will throw an error if types are used incorrectly (JavaScript is a notable exception; it tries to return a value for any expression, even when it doesn't make sense). When using dynamically typed languages, even a simple error like "a" + 1 can occur in a production environment. Static languages ​​prevent such errors, but of course the degree of prevention depends on the strength of the type system.


Static and dynamic languages are built on fundamentally different ideas about program correctness. In a dynamic language, "a" + 1 is a valid program: the code will run and the error will appear at run time. In most statically typed languages, however, the expression "a" + 1 is not a program at all: it will not compile and will not run. It is incorrect code, just as a string of random characters like !&%^@*&%^@* is incorrect code. This additional notion of correctness and incorrectness has no equivalent in dynamic languages.
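The distinction can be seen in Python, used here purely as an example of a dynamic language. The function name broken is hypothetical. Defining a type-incorrect function succeeds; only calling it fails. The only thing rejected before any code runs is invalid syntax.

```python
def broken():
    # Type-incorrect, yet accepted: nothing checks this line until it runs.
    return "a" + 1

# Defining the function succeeded; the failure appears only when it is called.
try:
    broken()
    call_result = "no error"
except TypeError:
    call_result = "TypeError at run time"

# Syntax is the only thing rejected before any code runs.
try:
    compile("!&%^@*&%^@*", "<example>", "exec")
    compile_result = "compiled"
except SyntaxError:
    compile_result = "SyntaxError before running"
```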

Strong and weak typing

The concepts of “strong” and “weak” are very ambiguous. Here are some examples of their use:

  • Sometimes "strong" means "static".
    That is simple enough, but it is better to use the term "static", because most people use and understand it.

  • Sometimes "strong" means "does not do implicit type conversion".
    For example, JavaScript allows you to write "a" + 1, which can be called "weak typing". But almost all languages provide some level of implicit conversion, such as the automatic conversion from integers to floating-point numbers in 1 + 1.1. In practice, most people use the word "strong" to mark the boundary between acceptable and unacceptable conversions. There is no generally accepted boundary; each one is imprecise and depends on the opinion of a particular person.

  • Sometimes "strong" means that it is impossible to bypass the language's typing rules.

  • Sometimes "strong" means memory-safe.
    C is an example of a memory-unsafe language. If xs is an array of four numbers, then C will happily execute xs[5] or xs[1000], returning whatever value happens to sit in memory beyond the end of xs.

Let's stop. Here's how some languages ​​meet these definitions. As you can see, only Haskell is consistently "strong" in all respects. Most languages ​​are not that clear.



(The "When as" in the "Implicit Conversions" column means that the division between strong and weak depends on what conversions we consider acceptable).


Often the terms "strong" and "weak" refer to a vague combination of various definitions above, and others not shown here. All this confusion makes the words "strong" and "weak" almost meaningless. When you want to use these terms, it is better to describe what exactly is meant. For example, you could say that "JavaScript returns a value when a string with a number is added, but Python returns an error." In this case, we will not waste our energy trying to come to agreement about the multiple meanings of the word “strong.” Or, even worse: we will end up with unresolved misunderstandings due to terminology.


Most of the time, the terms "strong" and "weak" on the Internet are vague and poorly defined opinions of specific individuals. They are used to call a language "bad" or "good", and this opinion turns into technical jargon.



Strong Typing: A type system that I love and am comfortable with.

Weak typing: A type system that bothers me or that I am not comfortable with.

Gradual typing

Is it possible to add static types to dynamic languages? In some cases, yes. In others it is difficult or impossible. The most obvious problem is eval and similar features of dynamic languages. Evaluating 1 + eval("2") in Python yields 3. But what does 1 + eval(read_from_the_network()) yield? It depends on what arrives over the network at the time of execution. If we get a number, the expression is correct. If we get a string, it is not. There is no way to know before running, so the type cannot be analyzed statically.


The unsatisfying solution used in practice is to give the eval() expression the type Any, which is similar to Object in some object-oriented programming languages or interface{} in Go: it is a type that any value satisfies.


Values ​​of type Any are unrestricted, so the type system's ability to help us with eval code disappears. Languages ​​that have both eval and a type system must give up type safety whenever eval is used.
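A small sketch of the eval problem in Python. Here read_from_the_network is a hypothetical stand-in that returns a fixed string, and add_one is a name made up for this example. The Any annotation tells a static checker to accept whatever comes out of eval, so the mistake surfaces only at run time.

```python
from typing import Any

def read_from_the_network() -> str:
    # Hypothetical stand-in for real network input; returns a fixed string here.
    return "2"

def add_one(source: str) -> Any:
    # eval can produce any kind of value, so the best static type for it is Any.
    value: Any = eval(source)
    return 1 + value

# Works when the evaluated text happens to be a number.
ok = add_one(read_from_the_network())

# Fails at run time when the evaluated text happens to be a string.
try:
    add_one("'two'")
    failure = "no error"
except TypeError:
    failure = "TypeError"
```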


Some languages ​​have optional or gradual typing: they are dynamic by default, but allow some static annotations to be added. Python recently added optional types; TypeScript is a superset of JavaScript that has optional types; Flow performs static analysis of good old JavaScript code.


These languages ​​provide some of the benefits of static typing, but they will never provide the absolute guarantees of truly static languages. Some functions will be statically typed and some will be dynamically typed. A programmer should always be aware and wary of the difference.
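As an illustration of optional typing in Python, the sketch below puts an annotated function next to an unannotated one. The function names are made up for this example. The annotations are available to external static checkers, but the interpreter itself does not enforce them.

```python
def add(x: int, y: int) -> int:
    # Annotated: a static checker can verify the callers of this function.
    return x + y

def add_untyped(x, y):
    # Unannotated: a static checker treats these parameters as dynamic.
    return x + y

# The annotations are stored on the function object for tools to read.
annotations = add.__annotations__

# The interpreter does not enforce them: this call runs and concatenates.
runtime_result = add("a", "b")
```

The last call is exactly the kind of mistake a static checker would flag, while the program itself keeps running.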

Compiling Statically Typed Code

When compiling statically typed code, the syntax is first checked, as in any compiler. Then the types are checked. This means that a static language may first complain about one syntax error, and after fixing it, complain about 100 typing errors. The syntax error fix did not create those 100 typing errors. The compiler simply had no way of detecting type errors until the syntax was corrected.


Compilers for static languages can usually generate faster code than compilers for dynamic languages. For example, if the compiler knows that the add function accepts integers, it can use the CPU's native ADD instruction. A dynamic language has to check the types at run time, choosing among several add routines depending on the types (add integers or floats, or concatenate strings or maybe lists?), or deciding that an error has occurred because the types do not match. All these checks take time. Dynamic languages use various optimization tricks, such as just-in-time (JIT) compilation, where the code is recompiled at run time once the necessary type information has been gathered. Still, no dynamic language can match the speed of carefully written static code in a language like Rust.
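The run-time dispatch described above can be sketched roughly as follows. The function dynamic_add is hypothetical and only mimics what a dynamic runtime does internally; in a real runtime each branch would lead to a different low-level routine.

```python
def dynamic_add(a, b):
    # Every call pays for these tag checks before any real work happens.
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b      # numeric addition
    if isinstance(a, str) and isinstance(b, str):
        return a + b      # string concatenation
    if isinstance(a, list) and isinstance(b, list):
        return a + b      # list concatenation
    # No matching pair of tags: report a type error at run time.
    raise TypeError(f"cannot add {type(a).__name__} and {type(b).__name__}")
```

A statically typed add over integers needs none of these branches, which is where much of the speed difference comes from.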

Arguments for static and dynamic types

Proponents of a static type system point out that without a type system, simple errors can lead to problems in production. This is, of course, true. Anyone who has used a dynamic language has experienced this first hand.


Proponents of dynamic languages point out that code seems to be easier to write in such languages. This is definitely true for some kinds of code we write from time to time, like that eval code. For everyday work it is a debatable claim, and here it makes sense to remember how vague the word “easy” is. Rich Hickey did a great job talking about the word “easy” and its connection to the word “simple.” After watching that talk, you will understand that it is not easy to use the word “easy” correctly. Beware of "easy".


The pros and cons of static and dynamic typing systems are still poorly understood, but they certainly depend on the language and the specific problem being solved.


JavaScript tries to continue even if it means a meaningless conversion (like "a" + 1 resulting in "a1"). Python, on the other hand, tries to be conservative and often raises errors, as in the case of "a" + 1.


These are different approaches with different levels of safety, but Python and JavaScript are both dynamically typed languages.



Haskell will not allow you to add an integer and a float without an explicit conversion first. C and Haskell are both statically typed, despite such big differences.


There are many variations of dynamic and static languages. Any blanket statement like "static languages ​​are better than dynamic languages ​​when it comes to X" is almost guaranteed to be nonsense. This may be true in the case of specific languages, but then it is better to say "Haskell is better than Python when it comes to X."

Variety of static typing systems

Let's take a look at two famous examples of statically typed languages: Go and Haskell. Go's original type system had no generic types, that is, types with "parameters" that are other types (generics were only added much later, in Go 1.18). For example, we can create our own list type, MyList, which can store any data we need. We want to be able to create a MyList of integers, a MyList of strings, and so on, without changing the source code of MyList. The compiler must keep an eye on the types: if there is a MyList of integers and we accidentally add a string to it, the compiler must reject the program.


Go was specifically designed to not allow types like MyList to be defined. The best that can be done is to create a MyList of "empty interfaces": MyList can contain objects, but the compiler simply does not know their type. When we retrieve objects from MyList, we need to tell the compiler their type. If we say “I’m getting a string,” but in reality the value is a number, then there will be a runtime error, as is the case with dynamic languages.


Go also doesn't have many of the other features found in modern statically typed languages ​​(or even some systems from the 1970s). Go's creators had their reasons for these decisions, but outsiders' opinions on the matter can sometimes sound harsh.


Now let's compare with Haskell, which has a very powerful type system. If we define a MyList type, then the type of a "list of numbers" is simply MyList Integer. Haskell will prevent us from accidentally adding a string to the list, and will make sure that we don't put an element from the list into a string variable.
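For readers more familiar with Python, a rough analogue of a parameterized MyList can be written with optional annotations. This is only a sketch: the class and its methods are invented for illustration, and unlike Haskell, the element type is verified only by an external checker such as mypy, not by the interpreter.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")  # the type parameter: the element type of the list

class MyList(Generic[T]):
    def __init__(self) -> None:
        self._items: List[T] = []

    def append(self, item: T) -> None:
        self._items.append(item)

    def get(self, index: int) -> T:
        return self._items[index]

# A MyList of integers: the element type is part of the list's type.
numbers: MyList[int] = MyList()
numbers.append(1)
first: int = numbers.get(0)
# numbers.append("two")  # a static checker such as mypy would report this line
```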


Haskell can express much more complex ideas directly with types. For example, Num a => MyList a means "MyList of values ​​that belong to the same type of numbers." It could be a list of integers, floats, or fixed-precision decimals, but it will definitely never be a list of strings, which is checked at compile time.


You can write an add function that works with any numeric type. This function will have type Num a => (a -> a -> a) . It means:

  • a can be any numeric type (Num a =>).
  • The function takes two arguments of type a and returns type a (a -> a -> a).
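A loose Python approximation of this signature uses a constrained type variable. The name Number and the choice of int, float and Fraction are assumptions made for this sketch; Haskell's Num class is open-ended and is enforced by the compiler, whereas here the constraint is visible only to an optional static checker.

```python
from fractions import Fraction
from typing import TypeVar

# A type variable restricted to a few numeric types (an assumed, closed set).
Number = TypeVar("Number", int, float, Fraction)

def add(x: Number, y: Number) -> Number:
    # Both arguments and the result are annotated with the same numeric type.
    return x + y

int_sum = add(1, 2)
float_sum = add(1.5, 2.0)
fraction_sum = add(Fraction(1, 2), Fraction(1, 3))
```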

The last example. If the function type is String -> String , then it accepts a string and returns a string. But if it's String -> IO String , then it also does some I/O. This could be accessing a disk, accessing a network, reading from a terminal, and so on.


If the function's type contains no IO, then we know that it performs no I/O operations. In a web application, for example, you can tell whether a function modifies the database simply by looking at its type. No dynamic language and almost no static language can do this. It is a feature of the languages with the most powerful type systems.


In most languages, we would have to wade through a function and all the functions that are called from it, and so on, trying to find something that changes the database. This is a tedious and error-prone process. The Haskell type system can answer the question simply and reliably.


Compare this power to Go, which is unable to express the simple idea of ​​MyList, let alone "a function that takes two arguments, both numeric and of the same type, and that does I/O."


The Go approach makes it easier to write Go programming tools (in particular, the compiler implementation can be simple). Plus, there are fewer concepts to learn. How these benefits compare with the significant limitations is a subjective question. However, there is no arguing that Haskell is more difficult to learn than Go, and that Haskell's type system is much more powerful, and that Haskell can prevent many more types of bugs when compiling.


Go and Haskell are such different languages that grouping them into the same class of "static languages" can be misleading, even though the term is used correctly. In terms of practical safety benefits, Go is closer to dynamic languages than to Haskell.


On the other hand, some dynamic languages are safer than some static languages (Python is generally considered much safer than C). When you want to make generalizations about static or dynamic languages as groups, do not forget the huge number of differences between individual languages.

Specific examples of differences in the capabilities of typing systems

More powerful typing systems can specify constraints at smaller levels. Here are some examples, but don't get too hung up on them if the syntax isn't clear.


In Go you can say "the add function takes two integers and returns an integer":


func add(x int, y int) int {
    return x + y
}

In Haskell you can say "a function takes any numeric type and returns a number of the same type":


add :: Num a => a -> a -> a
add x y = x + y

In Idris you can say "the function takes two integers and returns an integer, but the first argument must be less than the second argument":


add : (x : Nat) -> (y : Nat) -> {auto smaller : LT x y} -> Nat
add x y = x + y

If you try to call the function add 2 1 where the first argument is greater than the second, the compiler will reject the program at compile time. It is impossible to write a program where the first argument is greater than the second. Rarely does a language have this capability. In most languages, this check happens at runtime: we would write something like if x >= y: raise SomeError() .
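For comparison, this is roughly what the run-time version of the same constraint looks like in Python. ValueError stands in for the SomeError mentioned above. The rule "x is less than y" cannot be stated in the signature, so it is checked on every call.

```python
def add(x: int, y: int) -> int:
    # The "x < y" rule is not part of the type, so it is checked at run time.
    if x >= y:
        raise ValueError(f"expected x < y, got x={x}, y={y}")
    return x + y

valid = add(1, 2)

# The invalid call is accepted by the language and fails only when executed.
try:
    add(2, 1)
    rejected = False
except ValueError:
    rejected = True
```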


There is no Haskell equivalent to the type in the Idris example above, and there is no Go equivalent to either the Haskell or the Idris example. As a result, Idris can prevent many bugs that Haskell cannot prevent, and Haskell can prevent many bugs that Go will not notice. In both cases, this requires additional type system features that make the language more complex.

Typing systems of some static languages

Here is a rough list of some languages' typing systems in order of increasing power. This list will give you a general idea of ​​the power of the systems, do not treat it as an absolute truth. Languages ​​collected in one group can be very different from each other. Every typing system has its quirks, and most of them are very complex.

  • C (1972), Go (2009): As originally designed, these systems are not powerful at all and have no support for generic types. It is not possible to define a MyList type that would mean "list of integers", "list of strings", and so on. Instead, you have to make a "list of untyped values". The programmer must manually state "this is a string" every time a value is retrieved from the list, and this may result in a runtime error.
  • Java (1995), C# (2000): Both languages support generic types, so you can write MyList<String> and get a list of strings that the compiler knows about and can enforce type rules on. The elements of the list will be of type String, and the compiler will enforce the type rules as usual, so runtime errors are less likely.
  • Haskell (1990), Rust (2010), Swift (2014): All of these languages have several advanced features, including generic types, algebraic data types (ADTs), and type classes or something similar (type classes, traits, and protocols, respectively). Rust and Swift are more popular than Haskell and are promoted by large organizations (Mozilla and Apple, respectively).
  • Agda (2007), Idris (2011): These languages support dependent types, allowing you to create types like "a function that takes two integers x and y, where y is greater than x." Even the "y is greater than x" constraint is enforced at compile time. At run time, y will never be less than or equal to x, no matter what happens. Very subtle but important properties of a system can be checked statically in these languages. Very few programmers study them, but these languages arouse great enthusiasm among those who do.

There is a clear movement towards more powerful typing systems, especially as measured by the popularity of languages ​​rather than the mere fact of languages ​​existing. A notable exception is Go, which explains why many proponents of static languages ​​consider it a step backwards.


Group two (Java and C#) are mainstream languages, mature and widely used.


Group three is on the cusp of entering the mainstream, with great support from Mozilla (Rust) and Apple (Swift).


Group four (Idris and Agda) are far from the mainstream, but that may change over time. Group three languages ​​were far from the mainstream ten years ago.

This article contains the necessary minimum of those things that you simply need to know about typing, so as not to call dynamic typing evil, Lisp a typeless language, and C a language with strong typing.

The full version contains a detailed description of all kinds of typing, seasoned with code examples, links to popular programming languages, and illustrative pictures.

I recommend reading the short version of the article first, and then the full version if you wish.

Short version

Based on typing, programming languages ​​are usually divided into two large camps - typed and untyped (typeless). The first includes, for example, C, Python, Scala, PHP and Lua, and the second includes assembly language, Forth and Brainfuck.

Since “typeless typing” is about as simple as it gets, it is not divided into any further kinds. Typed languages, however, are divided into several more overlapping categories:

  • Static/dynamic typing. Static typing means that the final types of variables and functions are fixed at compile time. That is, the compiler is already 100% sure which type is where. In dynamic typing, all types are discovered during program execution.

    Examples:
    Static: C, Java, C#;
    Dynamic: Python, JavaScript, Ruby.

  • Strong/weak typing (also sometimes called strict/loose). Strong typing means that the language does not allow values of different types to be mixed in expressions and does not perform automatic implicit conversions; for example, you cannot subtract a set from a string. Weakly typed languages perform many implicit conversions automatically, even though precision may be lost or the conversion may be ambiguous.

    Examples:
    Strong: Java, Python, Haskell, Lisp;
    Weak: C, JavaScript, Visual Basic, PHP.

  • Explicit/implicit typing. Explicitly typed languages ​​differ in that the type of new variables/functions/their arguments must be specified explicitly. Accordingly, languages ​​with implicit typing shift this task to the compiler/interpreter.

    Examples:
    Explicit: C++, D, C#
    Implicit: PHP, Lua, JavaScript

It should also be noted that all these categories overlap: for example, C has static, weak, explicit typing, and Python has dynamic, strong, implicit typing.

However, there are no languages ​​with static and dynamic typing at the same time. Although, looking ahead, I’ll say that I’m lying here - they really exist, but more on that later.

Detailed version

If the short version wasn't enough for you, that's fine. I wrote the detailed one for a reason. The main thing is that it was simply impossible to fit all the useful and interesting information into the short version, while the detailed one would probably be too long for everyone to read without effort.

Typeless typing

In typeless programming languages, all entities are considered simply sequences of bits of varying lengths.
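Python itself is typed, but its struct module can illustrate what "just a sequence of bits" means: the same four bytes can be read as a floating-point number or as an unsigned integer, and nothing in the bytes says which reading is intended. The variable names are illustrative.

```python
import struct

# Four bytes that encode the 32-bit float 1.0 (little-endian).
raw = struct.pack("<f", 1.0)

# The same bytes, read back under two different interpretations.
as_float = struct.unpack("<f", raw)[0]   # read as a 32-bit float
as_uint = struct.unpack("<I", raw)[0]    # read as a 32-bit unsigned integer
```

In a typeless language both readings are equally valid, and choosing the right one is entirely up to the programmer.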

Typeless typing is usually inherent in low-level (assembly language, Forth) and esoteric (Brainfuck, HQ9+, Piet) languages. However, along with its disadvantages, it also has some advantages.

Advantages

  • Allows you to write at an extremely low level, and the compiler/interpreter will not interfere with any type checks. You are free to perform any operations on any kind of data.
  • The resulting code is usually more efficient.
  • Transparency of instructions. If you know the language, there is usually no doubt about what a given piece of code does.

Disadvantages

  • Complexity. There is often a need to represent complex values such as lists, strings, or structures. This can be inconvenient.
  • Lack of checks. Any meaningless action, such as subtracting a pointer to an array from a character, is considered completely normal, which is fraught with subtle errors.
  • Low level of abstraction. Working with any complex data type is no different from working with numbers, which of course creates many difficulties.

Strong typeless typing?

Yes, this exists. For example, in assembly language (for the x86/x86-64 architecture, I don't know the others) a program will not assemble if you try to load data from the eax register (32 bits) into the cx register (16 bits).

mov cx, eax ; assembly time error

So it turns out that assembler still has typing? I believe that these checks are not enough. And your opinion, of course, depends only on you.

Static and dynamic typing

The main thing that distinguishes static typing from dynamic typing is that all type checking is performed at compile time, not runtime.

Some people may think that static typing is too restrictive (in fact it is, but this has long been mitigated by certain techniques). Others say that dynamically typed languages are playing with fire, but what features make them stand out? Do both kinds really deserve to exist? If not, then why are there so many statically typed and so many dynamically typed languages?

Let's figure it out.

Benefits of Static Typing

  • Type checking occurs only once, at the compilation stage. This means that the program does not have to check constantly whether we are trying to divide a number by a string (and either raise an error or perform a conversion).
  • Execution speed. It follows from the previous point that statically typed languages are almost always faster than dynamically typed ones.
  • Under some additional conditions, potential errors can be detected already at the compilation stage.

Benefits of Dynamic Typing

  • It is easy to create universal collections, heaps of anything and everything (the need rarely arises, but when it does, dynamic typing helps out).
  • Generic algorithms are convenient to describe (for example, an array sort that works not only on a list of integers, but also on a list of real numbers and even on a list of strings).
  • Easy to learn: dynamically typed languages are usually very good for getting started with programming.

Generic programming

Okay, the most important argument for dynamic typing is the convenience of describing generic algorithms. Let's imagine a problem - we need a function to search through several arrays (or lists) - an array of integers, an array of reals and an array of characters.

How are we going to solve it? Let's solve it in 3 different languages: one with dynamic typing and two with static typing.

I will use one of the simplest search algorithms - brute force. The function will receive the element being searched for, the array (or list) itself and return the index of the element, or if the element is not found - (-1).

Dynamic solution (Python):

def find(required_element, items):
    # Linear search: return the index of the first match, or -1 if absent.
    for index, element in enumerate(items):
        if element == required_element:
            return index
    return -1

As you can see, everything is simple and there are no problems with the fact that the list can contain numbers, lists, or other arrays. Very good. Let's go further - solve the same problem in C!
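A short usage sketch of the Python version, repeated here so the snippet is self-contained. The parameter name items and the sample data are chosen for illustration. The same function handles integers, reals, characters, and even a mixed list.

```python
def find(required_element, items):
    # Linear search: return the index of the first match, or -1 if absent.
    for index, element in enumerate(items):
        if element == required_element:
            return index
    return -1

int_index = find(3, [1, 2, 3])              # list of integers
float_index = find(2.5, [0.5, 1.5, 2.5])    # list of reals
char_index = find("b", ["a", "b", "c"])     # list of characters
missing_index = find(7, [1, 2, 3])          # element not present
mixed_index = find("x", [1, 2.0, "x"])      # mixed list works as well
```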

Static solution (C):

/* Linear search over an int array: returns the index of the match, or -1. */
int find_int(int required_element, const int array[], unsigned int size) {
    for (unsigned int i = 0; i < size; ++i)
        if (required_element == array[i])
            return (int)i;
    return -1;
}

/* The same search for an array of float. */
int find_float(float required_element, const float array[], unsigned int size) {
    for (unsigned int i = 0; i < size; ++i)
        if (required_element == array[i])
            return (int)i;
    return -1;
}

/* The same search for an array of char. */
int find_char(char required_element, const char array[], unsigned int size) {
    for (unsigned int i = 0; i < size; ++i)
        if (required_element == array[i])
            return (int)i;
    return -1;
}

Well, each function individually is similar to the Python version, but why are there three of them? Has static typing really lost?

Yes and no. There are several programming techniques, one of which we will now consider. It's called generic programming and the C++ language supports it quite well. Let's take a look at the new version:

Static solution (generic programming, C++):

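#include <cstddef>
#include <vector>

// Generic linear search: works for any element type T that supports ==.
template <class T>
int find(const T& required_element, const std::vector<T>& array) {
    for (std::size_t i = 0; i < array.size(); ++i)
        if (required_element == array[i])
            return static_cast<int>(i);
    return -1;
}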

Fine! It doesn't look much more complicated than the Python version and doesn't require much writing. In addition, we got an implementation for all arrays, not just the 3 needed to solve the problem!

This version seems to be exactly what we need - we get both the advantages of static typing and some of the advantages of dynamic typing.

It's great that this is possible at all, but it could be even better. Firstly, generic programming can be more convenient and elegant (for example, in Haskell). Secondly, besides generic programming you can also use polymorphism (the result will be worse), function overloading (likewise), or macros.

Statics in dynamics

It should also be mentioned that many static languages ​​allow dynamic typing, for example:

  • C# supports the dynamic pseudo-type.
  • F# supports syntactic sugar in the form of the ? operator, on the basis of which imitation of dynamic typing can be implemented.
  • Haskell - dynamic typing is provided by the Data.Dynamic module.
  • Delphi - through a special Variant type.

Also, some dynamically typed languages ​​allow you to take advantage of static typing:

  • Common Lisp - type declarations.
  • Perl - since version 5.6, quite limited.

Strong and weak typing

Strongly typed languages do not allow entities of different types to be mixed in expressions and do not perform any automatic conversions. They are also called "strictly typed languages". The English term for this is strong typing.

Weakly typed languages, on the contrary, let the programmer mix different types in one expression, and the compiler itself reduces everything to a single type. They are also called "loosely typed languages". The English term for this is weak typing.

Weak typing is often confused with dynamic typing, which is completely wrong. A dynamically typed language can be either weakly or strongly typed.

However, few people attach importance to typing strictness. It is often stated that if a language is statically typed, then you can catch many potential errors when compiling. They are lying to you!

The language must also have strong typing. Indeed, if the compiler, instead of reporting an error, simply adds a string to a number or, even worse, subtracts one array from another, what good is it to us that all the type "checks" happen at the compilation stage? That's right: weak static typing is even worse than strong dynamic typing! (Well, that's my opinion.)

So does weak typing have no advantages at all? It may look like this, but despite the fact that I am an ardent supporter of strong typing, I must agree that weak typing also has advantages.

Want to know which ones?

Benefits of Strong Typing

  • Reliability: you get an exception or a compilation error instead of incorrect behavior.
  • Speed: instead of hidden conversions, which can be quite expensive, with strong typing you have to write them explicitly, which forces the programmer to at least know that this piece of code may be slow.
  • Understanding how the program works: again, instead of implicit type casting, the programmer writes everything himself, which means he roughly understands that comparing a string and a number does not happen by itself or by magic.
  • Certainty: when you write conversions by hand, you know exactly what you are converting and to what. You will also always be aware that such conversions may result in loss of precision and incorrect results.

Benefits of Weak Typing

  • Convenience of using mixed expressions (for example, of integers and real numbers).
  • Abstracting away from typing and focusing on the task.
  • Brevity of notation.

Okay, we figured it out, it turns out that weak typing also has advantages! Are there ways to transfer the advantages of weak typing to strong typing?

It turns out there are even two.

Implicit type casting, in unambiguous situations and without data loss

Wow... Quite a long heading. Let me shorten it to “limited implicit conversion”. So what do “unambiguous situation” and “data loss” mean?

An unambiguous situation is a conversion or operation whose meaning is immediately clear. For example, adding two numbers is an unambiguous situation. But converting a number to an array is not (perhaps an array of one element will be created, perhaps an array of that length filled with default elements, and perhaps the number will be converted to a string and then to an array of characters).
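The three readings of "convert a number to an array" can be written out explicitly. The following is only a Python sketch with names chosen for illustration; it shows why an implicit conversion here would have to guess.

```python
n = 3

one_element = [n]            # a list holding the number itself
default_filled = [0] * n     # a list of length n filled with a default value
digit_chars = list(str(n))   # the number as text, split into characters

print(one_element)      # [3]
print(default_filled)   # [0, 0, 0]
print(digit_chars)      # ['3']
```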

Data loss is even simpler. If we convert the real number 3.5 to an integer, we lose part of the data (in fact, this operation is also ambiguous: how will the rounding be done? Up? Down? By dropping the fractional part?).
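To make the rounding question concrete, here is how Python's standard conversions treat the same values. This is just an illustration of the ambiguity; other languages make other choices.

```python
import math

x = 3.5
print(int(x))          # 3: drops the fractional part
print(math.floor(x))   # 3: rounds down
print(math.ceil(x))    # 4: rounds up
print(round(x))        # 4: rounds half to the nearest even integer
print(round(2.5))      # 2: the same rule, which surprises many people

# Truncation and rounding down differ for negative numbers.
print(int(-3.5), math.floor(-3.5))   # -3 -4
```

Four reasonable functions give different answers, which is exactly why a silent conversion from real to integer is risky.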

Conversions in ambiguous situations and conversions with loss of data are very, very bad. There is nothing worse than this in programming.

If you don't believe me, study the PL/I language or even just look up its specification. It has rules for converting between ALL data types! This is just hell!

Okay, back to limited implicit conversion. Are there such languages? Yes, for example in Pascal you can convert an integer to a real number, but not vice versa. There are also similar mechanisms in C#, Groovy and Common Lisp.
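Python behaves in a broadly similar way, although it checks at run time rather than at compile time: an integer is widened to a float in mixed arithmetic, but a float is not silently accepted where an integer is required, such as a list index. The helper index_with below is invented for this sketch.

```python
def index_with(items, position):
    """Return items[position], or the error text if the index type is rejected."""
    try:
        return items[position]
    except TypeError as exc:
        return f"TypeError: {exc}"

# int -> float happens implicitly in mixed arithmetic.
total = 1 + 2.5
print(total, type(total).__name__)   # 3.5 float

# float -> int does not happen implicitly.
letters = ["a", "b", "c"]
print(index_with(letters, 1))     # b
print(index_with(letters, 1.0))   # a TypeError message
```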

Okay, I said that there is still a way to get a couple of advantages of weak typing in a strong language. And yes, it exists and is called constructor polymorphism.

I will explain it using the example of the wonderful Haskell language.

Polymorphic constructors arose from the observation that safe implicit conversions are most often needed when using numeric literals.

For example, in the expression pi + 1, you don't want to write pi + 1.0 or pi + float(1). You just want to write pi + 1!

And this is done in Haskell, thanks to the fact that the literal 1 does not have a concrete type. It is neither an integer, nor a real, nor a complex number. It's just a number!

As a result, when writing a simple function sum x y, adding up all numbers from x to y (with an increment of 1), we get several versions at once - sum for integers, sum for reals, sum for rationals, sum for complex numbers and even sum for all those numeric types that you yourself defined.

Of course, this technique only helps with mixed expressions that contain numeric literals, and this is just the tip of the iceberg.
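For comparison, here is a loose Python analogue of one function serving several numeric types. The mechanism is different from Haskell's: in Python the literals 0 and 1 are plain integers, and the mixing works because float and Fraction define arithmetic with int. The name sum_range is an assumption made for this sketch.

```python
from fractions import Fraction

def sum_range(x, y):
    """Add up x, x + 1, x + 2, ... for as long as the term does not exceed y."""
    total = 0
    term = x
    while term <= y:
        total = total + term
        term = term + 1
    return total

print(sum_range(1, 4))                             # 10
print(sum_range(0.5, 2.5))                         # 4.5
print(sum_range(Fraction(1, 2), Fraction(5, 2)))   # 9/2
```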

Thus, we can say that the best solution is to balance on the edge between strong and weak typing. But no language yet strikes the perfect balance, so I lean more towards strongly typed languages (such as Haskell, Java, C#, Python) rather than weakly typed ones (such as C, JavaScript, Lua, PHP).

Explicit and Implicit Typing

An explicitly typed language requires the programmer to specify the types of all variables and functions that it declares. The English term for this is explicit typing.

An implicitly typed language, on the other hand, encourages you to forget about types and leave the task of inferring types to the compiler or interpreter. The English term for this is implicit typing.

At first, you might think that implicit typing is equivalent to dynamic, and explicit typing is equivalent to static, but later we will see that this is not so.

Does each kind of typing have advantages, and again, can they be combined, and are there languages that support both?

Benefits of Explicit Typing
  • Having each function have a signature (for example, int add(int, int)) makes it easy to determine what the function does.
  • The programmer immediately writes down what type of values can be stored in a particular variable, eliminating the need to remember it.
Benefits of Implicit Typing
  • Shorthand notation - def add(x, y) is clearly shorter than int add(int x, int y).
  • Resistance to change. For example, if a temporary variable in a function has the same type as the input argument, then in an explicitly typed language changing the type of the input argument also means changing the type of the temporary variable.

Okay, it's clear that both approaches have both pros and cons (who expected anything else?), so let's look for ways to combine these two approaches!

Explicit typing by choice

There are languages with implicit typing by default and the ability to specify the type of values if necessary. The compiler will infer the actual type of the expression automatically. One of these languages is Haskell; let me give a simple example for clarity:

    -- Without explicit type specification
    add (x, y) = x + y

    -- Explicit type specification
    add :: (Integer, Integer) -> Integer
    add (x, y) = x + y

Note: I intentionally used an uncurried function, and also intentionally wrote a specialized signature instead of the more general add :: (Num a) => a -> a -> a, because I wanted to show the idea without explaining Haskell syntax.

Hm. As we can see, it is very nice and short. Writing a function takes only 18 characters on one line, including spaces!

However, automatic type inference is quite a complex thing, and even in such a cool language as Haskell, it sometimes fails (an example is the monomorphism restriction).
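Python offers a rough analogue of "explicit typing by choice" through optional annotations. The sketch below uses made-up function names; note that the interpreter stores the annotations but does not enforce them, so checking them is left to a separate tool such as mypy.

```python
def add_untyped(x, y):
    # No types written down; any values that support "+" are accepted.
    return x + y

def add_typed(x: int, y: int) -> int:
    # Types are stated for the reader and for external type checkers.
    return x + y

print(add_untyped(1, 2))           # 3
print(add_typed(1, 2))             # 3
print(add_typed.__annotations__)   # the recorded parameter and return types

# The annotations are not checked at run time, so this still works.
print(add_typed("a", "b"))         # ab
```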

Are there languages with explicit typing by default and implicit typing if necessary? Of course.

Implicit typing by choice

The new C++ language standard, called C++11 (previously called C++0x), introduced the auto keyword, which allows the compiler to infer the type based on context:

Let's compare:

    // Manually specifying the type
    unsigned int a = 5;
    unsigned int b = a + 3;

    // Automatic type inference
    unsigned int a = 5;
    auto b = a + 3;

Not bad. But the code has not become much shorter. Let's look at an example with iterators (if you don't understand it, don't worry; the main thing to note is that the code becomes much shorter thanks to automatic type inference):

    // Manually specifying the type
    std::vector<int> vec = randomVector(30);
    for (std::vector<int>::const_iterator it = vec.cbegin(); ...) { ... }

    // Automatic type inference
    auto vec = randomVector(30);
    for (auto it = vec.cbegin(); ...) { ... }

Wow! Now that is a real reduction. Okay, but is it possible to do something like Haskell, where the return type depends on the types of the arguments?

And again the answer is yes, thanks to the decltype keyword in combination with auto:

    // Manual type
    int divide(int x, int y) { ... }

    // Automatic type inference
    auto divide(int x, int y) -> decltype(x / y) { ... }

This form of notation may not seem very good, but when combined with generic programming (templates/generics), implicit typing or automatic type inference works wonders.

Some programming languages according to this classification

I will give a small list of popular languages and note where each one falls in every “typing” category.

  • JavaScript - Dynamic / Weak / Implicit
  • Ruby - Dynamic / Strong / Implicit
  • Python - Dynamic / Strong / Implicit
  • Java - Static / Strong / Explicit
  • PHP - Dynamic / Weak / Implicit
  • C - Static / Weak / Explicit
  • C++ - Static / Semi-Strong / Explicit
  • Perl - Dynamic / Weak / Implicit
  • Objective-C - Static / Weak / Explicit
  • C# - Static / Strong / Explicit
  • Haskell - Static / Strong / Implicit
  • Common Lisp - Dynamic / Strong / Implicit

Perhaps I made a mistake somewhere, especially with CL, PHP and Obj-C, if you have a different opinion on some language, write in the comments.

Everything is very simple. It's like the difference between a hotel and a private apartment.

Only those who are registered there live in the apartment. If, say, the Sidorov family lives in it, then the Pupkin family, for the life of us, cannot live there. At the same time, Petya Sidorov can live in this apartment, then Grisha Sidorov can move there (sometimes they can even live there at the same time - this is an array). This is static typing.

In the hotel, the Sidorov family can stay for a while. They don't even always have to register there. Then they will leave, and the Pupkins will move in. And then the Kuznetsovs. And then someone else. This is dynamic typing.

If we return to programming, the first case (static typing) is found in, say, C, C++, C#, Java and others. Before you assign a value to a variable for the first time, you must say what you will store there: integers, floating point numbers, strings, etc. (the Sidorovs will live in this apartment). Dynamic typing, on the other hand, does not require this. When you assign a value, you simultaneously give the variable its type (Vasya Pupkin from the Pupkin family now lives in this hotel room). This is found in languages such as PHP, Python and JavaScript.
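The hotel room can be sketched in Python, where one name is free to hold values of different types over time. The variable names are invented for the illustration.

```python
room = 42                             # an integer moves in
guests = [type(room).__name__]

room = "Vasya Pupkin"                 # now a string lives in the same "room"
guests.append(type(room).__name__)

room = [1.5, 2.5]                     # and now a list
guests.append(type(room).__name__)

print(guests)   # ['int', 'str', 'list']
```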

Both approaches have their advantages and disadvantages. Which one is better or worse depends on the problems being solved. You can read more in detail, say, on Wikipedia.

With static typing, you know the exact type of each variable while you are writing the program and designing the algorithm, and you take this into account. That is, if you said that the variable G is a four-byte unsigned integer, then in the algorithm it will always be a four-byte unsigned integer. If you need something else, you must convert it explicitly or know how the translator converts it in a certain range of situations; but for the most part a type mismatch is an error in the algorithm, and the compiler will at least warn you. With static typing you can't put the string “Vasya the fool” into a number, and no extra checks of the form “is there really a number in there?” are needed before using the variable. You validate the data when it enters the program or wherever the algorithm itself requires it.

With dynamic typing, the type of the same variable is generally unknown to you and can change while the program is running, and you have to take this into account. No one will warn you about a potential error in the algorithm caused by a type mismatch. For example, when designing the algorithm you assumed that G is an integer, but the user entered a floating point number or, worse, a string; or after an arithmetic operation you ended up with a floating point number instead of an integer, and in the next step you try to use bit operations on it. On the other hand, you don't have to bother with many little things.
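The bit-operation scenario is easy to reproduce in Python. In this sketch, low_bit and describe are hypothetical helpers: the first is written on the assumption that G is an integer, and the second only reports what happens when that assumption breaks at run time.

```python
def low_bit(g):
    """Return the lowest bit of g; written assuming g is an integer."""
    return g & 1

def describe(g):
    """Call low_bit and report a type error instead of crashing."""
    try:
        return low_bit(g)
    except TypeError as exc:
        return f"TypeError: {exc}"

print(describe(6))        # 0
print(describe(6 / 2))    # "/" always yields a float in Python 3, so "&" is rejected
print(describe("6"))      # a string left over from user input is rejected too
print(describe(int("6"))) # 0, once the input is validated and converted on entry
```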

When you learn programming languages, you often hear phrases like “statically typed” or “dynamically typed” in conversations. These concepts describe the process of type checking, and static and dynamic type checking belong to different type systems. A type system is a set of rules that assign a property, called a “type,” to various entities in a program (variables, expressions, functions, or modules) with the ultimate goal of reducing errors by ensuring that data is represented correctly.

Don't worry, I know this all sounds confusing, so we'll start with the basics. What is “type checking” and what is a type anyway?

Type

Code that has undergone dynamic type checking is generally less optimized; in addition, runtime errors are possible and, as a consequence, the code needs to be tested before each launch. However, dynamic typing opens the door to other, powerful programming techniques, such as metaprogramming.

Common Misconceptions

Myth 1: Static/dynamic typing == strong/weak typing

A common misconception is that all statically typed languages are strongly typed and all dynamically typed languages are weakly typed. This is not true, and here's why.

A strongly typed language is one in which variables are bound to specific data types and which raises a type error if the expected and actual types do not match, regardless of when the check is performed. The easiest way to think of a strongly typed language is as a highly type-safe language. For example, in the piece of code already used above, a strongly typed language will produce an explicit type error, which will interrupt the execution of the program:

x = 1 + "2"
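In Python, one of the dynamically typed languages this article classifies as strong, the expression is rejected at run time. A minimal sketch that catches the error so the message can be inspected:

```python
try:
    x = 1 + "2"
except TypeError as exc:
    message = str(exc)
    print("rejected:", message)
```

The check happens only when the line is executed, which is the dynamic part; the refusal to convert either operand is the strong part.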

We often associate statically typed languages such as Java and C# with being strongly typed (which they are) because the data type is explicitly set when a variable is initialized - as in this Java example:

String foo = new String("hello world");

However, Ruby and Python (both dynamically typed) are also strongly typed, although the developer does not need to specify the type of a variable when declaring it. Let's look at the same example, but written in Ruby:

foo = "hello world"

Both languages are strongly typed but use different type checking techniques. Languages such as Ruby, Python and JavaScript do not require explicit type definitions due to type inference - the ability to programmatically infer the type of a variable from its value. Type inference is a separate property of the language and is not part of the type system.

A weakly typed language is a language in which variables are not tied to a specific data type; they still have a type, but the type safety restrictions are much weaker. Consider the following PHP code example:

    $foo = "x";
    $foo = $foo + 2; // not an error
    echo $foo; // 2

Since PHP is weakly typed, there is no error in this code (at least in PHP versions before 8; PHP 8 raises a TypeError when a non-numeric string is used in arithmetic). As with the previous assumption, not all weakly typed languages are dynamically typed: PHP is a dynamically typed language, but C, also a weakly typed language, is statically typed.

The myth has been destroyed.

Although the static / dynamic and strong / weak distinctions are different, they both deal with type safety. The easiest way to put it is that the first tells you when type safety is checked, and the second tells you how.

Myth 2: Static/dynamic typing == compiled/interpreted languages

It is true to say that most statically typed languages are usually compiled and dynamically typed languages are interpreted, but this statement cannot be generalized, and there is a simple example of this.

When we talk about language typing, we talk about the language as a whole. For example, it doesn't matter what version of Java you use - it will always be statically typed. This is different from the question of whether the language is compiled or interpreted, since in that case we are talking about a specific implementation of the language. In theory, any language can be either compiled or interpreted. The most popular implementation of the Java language compiles to bytecode, which is interpreted by the JVM - but there are other implementations of this language that compile directly to machine code or interpret the source as is.
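The same point can be seen from the other side. CPython, the reference implementation of the dynamically typed Python language, first compiles source code to bytecode and then interprets that bytecode. The sketch below makes the two steps visible using the built-in compile and exec functions and the standard dis module; the source string is made up for the example.

```python
import dis

source = "x = 1\nx = 'now a string'\n"

code = compile(source, "<demo>", "exec")   # step 1: source text to a bytecode object
namespace = {}
exec(code, namespace)                      # step 2: the interpreter runs the bytecode

print(type(code).__name__)   # code
print(namespace["x"])        # now a string
dis.dis(code)                # lists the bytecode instructions; details vary by version
```

Compilation succeeds even though x changes type, because the typing discipline belongs to the language and not to the way it is executed.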

If this is still not clear, I advise you to read this series.

Conclusion

I know there was a lot of information in this article, but I believe you made it through. I would have liked to move the material on strong/weak typing into a separate article, but it is not that important a topic; besides, it was necessary to show that this kind of typing is unrelated to type checking.

There is no clear answer to the question “which typing is better?” - each has its own advantages and disadvantages. Some languages - such as Perl and C# - even allow you to choose between static and dynamic type checking. Understanding these systems will help you better understand the nature of the errors that occur and make them easier to deal with.

  • Dynamic typing is a technique widely used in programming languages and specification languages, in which a variable is associated with a type at the moment a value is assigned, not at the moment the variable is declared. Thus, in different parts of the program, the same variable can take on values of different types. Examples of languages with dynamic typing are Smalltalk, Python, Objective-C, Ruby, PHP, Perl, JavaScript, Lisp, xBase, Erlang, Visual Basic.

    The opposite technique is static typing.

    In some languages with weak dynamic typing, there is a problem with comparing values. For example, PHP has the comparison operators “==”, “!=” and “===”, “!==”, where the second pair compares both the values and the types of the variables. The operation “===” gives true only on a complete match, in contrast to “==”, which considers the following expression true: (1=="1"). It is worth noting that this is not a problem with dynamic typing in general, but with specific programming languages.
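For contrast, Python never treats a number and a string as equal, but its "==" still compares across numeric types. The helper strict_equal below is a hypothetical function that only roughly mimics the idea behind PHP's "===" by requiring the types to match as well.

```python
def strict_equal(a, b):
    """Hypothetical helper: equal only if both the type and the value match."""
    return type(a) is type(b) and a == b

print(1 == "1")               # False: no string-to-number conversion
print(1 == 1.0)               # True: numeric types are compared by value
print(strict_equal(1, 1.0))   # False: int and float are different types
print(strict_equal(1, 1))     # True
```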
