Manjusaka

Manjusaka

Detailed Explanation of Swift's Type Checker

This article will revolve around some Swift compiler error messages that have constantly forced me to rewrite code:

Error: Your expression is too complex; please break it down into simpler expressions. (Translator's note: The original text is error: expression was too complex to be solved in reasonable time; consider breaking up the expression into distinct sub-expressions)

I will look at the example that triggered the error and discuss the negative impact of other compilation errors caused by the same underlying issues. I will guide you through what happens during the compilation process and then tell you how to resolve these errors in a short amount of time.

I will design a linear algorithm for the compiler to replace the original exponential algorithm to completely solve this problem without resorting to other more complex methods.

Compilation Errors in Valid Code#

If you try to compile this code in Swift 3, you will encounter the following error message:

let a: Double = -(1 + 2) + -(3 + 4) + 5

This code is valid and correct in every way; theoretically, during the compilation process, it should be optimized into a fixed value.

However, this code cannot pass Swift's type checking during compilation. The compiler will tell you that this code is too complex. But wait, this code doesn't seem complex at all, does it? It contains 5 variables, 4 addition operations, 2 negation operations, and one forced conversion to Double.

But how can the compiler say that this statement, which contains only 12 elements, is quite complex?

There are many expressions that encounter the same problem during compilation. Most expressions contain some variables, basic data operations, and possibly some overloads. The following expressions will face the same error message during compilation:

let b = String(1) + String(2) + String(3) + String(4)

let c = 1 * sqrt(2.0) * 3 * 4 * 5 * 6 * 7

let d = ["1" + "2"].reduce("3") { "4" + String($0) + String($1) }

let e: [(Double) -> String] = [
  { v in String(v + v) + "1" },
  { v in String(-v) } + "2",
  { v in String(Int(v)) + "3" }
]

The above code adheres to Swift syntax and programming rules, but they cannot pass type checking during compilation.

Needlessly Long Compile Times#

Compilation errors are just one of the side effects of the Swift type checker's flaws. For example, you can try the following example:

let x = { String("\($0)" + "") + String("\($0)" + "") }(0)

This code does not produce a compilation error, but on my computer, using Swift 2.3 it takes 4s to compile, and using Swift 3 it takes 15s. A significant amount of time is spent on type checking during compilation.

Now, you may not encounter too many issues that take this long, but in a large Swift project, you will face many error messages like expression was too complex to be solved in reasonable time.

Unpredictable Operations#

Next, I will discuss a feature of the Swift type checker: the type checker chooses to resolve non-generic overloads whenever possible. The comments in the compiler code that handle this specific behavior provide an explanation; it is an optimization technique to avoid performance issues that cause expression was too complex errors.

Here are some specific examples:

let x = -(1)

This code will fail to compile, and we will receive an error message: Ambiguous use of operator ‘-‘.

This code is not very ambiguous; the compiler will understand that we want to use a variable of integer type, treating 1 as an Int, while selecting the following overload from the standard library:

prefix public func -<T : SignedNumber>(x: T) -> T

However, Swift can only perform non-generic overloads. In this case, the implementations for Float, Double, and Float80 are not complete, and the compiler cannot choose which implementation to use based on context, leading to this error message.

Certain specific optimizations can optimize operators, but they may lead to some issues:

func f(_ x: Float) -> Float { return x }
func f<I: Integer>(_ x: I) -> I { return x }

let x = f(1)

prefix operator %% {}
prefix func %%(_ x: Float) -> Float { return x }
prefix func %%<I: Integer>(_ x: I) -> I { return x }

let y = %%1

In this code, we define two functions (f and a custom operator prefix %%). Each function has two overloads, one for (Float) -> Float and the other for <I: Integer>(I) -> I.

When calling f(1), the implementation <I: Integer>(I) -> If(1) will be chosen, and x will be treated as Int. This is what you would expect.

When calling %%1, the implementation (Float) -> Float will be used, and y will be treated as Float, which is the opposite of what we expect. During compilation, the compiler chooses to treat 1 as Float instead of Int, even though treating it as Int would also work normally. The reason for this situation is that the compiler determines the variable's type before performing generic overloads. This is not a context-consistent approach; it is a compromise by the compiler to avoid errors like expression was too complex to be solved and for performance optimization.

Working Around the Problem in Our Code#

Generally speaking, the defect of Swift showing code as too complex is not a significant issue, provided you do not use two or more of the following features in a single expression:

  • Method overloading (including operator overloading)
  • Constants
  • Ambiguous type closures
  • Expressions that lead Swift to perform incorrect type conversions

In general, if you do not use the features mentioned above, you will not encounter error messages like expression was too complex. However, if you choose to use the features mentioned above, you may face some confusing issues. Typically, when writing a sufficiently large method and other regular code, it is easy to use these features, which means that sometimes we may need to carefully consider how to avoid excessive use of these features.

You certainly want to pass compilation with just a few minor modifications rather than making extensive changes to your code. The following small suggestions may help a bit.

When the aforementioned compilation error message appears, the compiler may suggest that you break the original expression into different sub-expressions:

let x_1: Double = -(1 + 2)
let x_2: Double = -(3 + 4)
let x: Double = x_1 + x_2 + 5

Well, from the result, this modification is effective, but it can be a bit annoying, especially when breaking it into sub-expressions significantly disrupts code readability.

Another suggestion is to reduce the number of times the compiler needs to select methods and operator overloads during compilation through explicit type conversion.

let x: Double = -(1 + 2) as Double + -(3 + 4) as Double + 5

This approach avoids the need for the compiler to look up the corresponding negation overload when using (Float) -> Float or (Float80) -> Float80. This method effectively reduces the compiler's lookup process from 6 times to 4 times during compilation.

One point to note in the above handling method: unlike other languages, in Swift, Double(x) is not equivalent to x as Double. Constructors typically behave like ordinary methods, and when there is a need for different parameter overloads, the compiler will still include various overloads of the constructor in the search space (even though these overloads may be in different positions in the code). In the previous example, using Double for explicit type conversion before the parentheses will solve part of the problem (this method is beneficial for the compiler's type checking), while in some cases, this method may lead to other issues (see the example about String mentioned at the beginning of this article). Ultimately, using the as operator is the best way to solve such problems without increasing complexity. Fortunately, the priority of the as operator is higher than most binary operators, allowing us to use it in most cases.

Another method is to use a separately named custom function:

let x: Double = myCustomDoubleNegation(1 + 2) + myCustomDoubleNegation(3 + 4) + 5

This method can solve a series of problems caused by method overloading. However, using this approach in a series of lightweight code will make our code look particularly ugly.

Now, let's talk about the last method; in many cases, you can replace methods and operators based on the situation:

let x: Double = (1 + 2).negated() + (3 + 4).negated() + 5

Using the corresponding method instead of common arithmetic operators effectively reduces the number of overloads, and using the . operator is more efficient than directly calling methods, so this method can effectively solve the problems mentioned earlier.

Swift Type Constraint System Analysis#

The expression was too complex error that occurs during compilation is thrown by the Swift compiler's semantic analysis system. The purpose of the semantic analysis system is to resolve type issues throughout the code, ensuring that the types of input expressions are correct and safe.

Most importantly, the entire error message is defined by the constraints system solver (CSSolver.cpp) written in the semantic analysis system. The type constraint system constructs a graph of types and methods from Swift expressions and constrains the code based on the relationships between the nodes. The constraint system will infer for each node until each node has a clear type constraint.

To be honest, the above may be too abstract; let's look at some concrete examples.

let a = 1 + 2

The type constraint system parses the expression into the following:

a simple constraints graph

Each node's name starts with T (indicating that a clear type needs to be determined), and they represent type constraints or method overloads that need to be resolved. In this graph, these nodes are constrained by the following rules:

  1. T1 is of type ExpressibleByIntegerLiteral
  2. T2 is of type ExpressibleByIntegerLiteral
  3. T0 is a method that takes (T1,T2) and returns T3
  4. T0 is infix +, which has 28 implementations in Swift
  5. T3 can be exchanged with T4

Tip: In Swift 2.X, the alternative to ExpressibleByIntegerLiteral is IntegerLiteralConvertible

In this system, the type constraint system follows the principle of minimal separation. The separated units are constrained by a rule that each unit is an individual with a set of independent values. In the above example, there is actually only one minimal unit: in constraint 4, T0 is overloaded. When overloaded, the compiler chooses the first implementation in the infix + implementation list: the one with the signature (Int, Int) -> Int.

Through this minimal unit, the type constraint system begins to constrain the types of elements: based on constraint 3, T1, T2, and T3 are determined to be of type Int, and based on constraint 4, T4 is also confirmed to be of type Int.

Once T1 and T2 are determined to be Int (initially they were considered ExpressibleByIntegerLiteral), the overload for infix + has been determined, and at this point, the compiler no longer needs to consider other possibilities and treats it as the final solution. After determining the corresponding types for each node, we can choose the overload method we need.

Let's Look at a More Complex Example!#

So far, nothing unexpected has occurred; you might not imagine that when expressions start to become complex, the Swift compilation system will begin to produce error messages continuously. Let's modify the previous example: first, put 2 in parentheses, second, add a negation operator, and third, specify the return value as Double type.

let a: Double = 1 + -(2)

The entire node structure is shown in the following diagram:

a slightly more complex constraints graph

The node constraints are as follows:

  1. T1 is of type ExpressibleByIntegerLiteral
  2. T3 is of type ExpressibleByIntegerLiteral
  3. T2 is a method that takes T3 and returns T4
  4. T2 is prefix -, which has 6 implementations in Swift
  5. T0 is a method that takes T1, T4, and returns T5
  6. T0 is infix +, which has 28 implementations in Swift
  7. T5 is of type Double

Compared to the previous example, there are two additional constraints; let's see how the type constraint system will handle this example.

The first step: choose the minimal separation unit. This time it is constraint 4: "T2 is prefix -, which has 6 implementations in Swift." The system finally chooses the implementation with the signature (Float) -> Float.

The second step: like the first step, choose the minimal separation unit; this time it is constraint 6: "T0 is infix +, which has 28 implementations in Swift." The system chooses the implementation with the signature (Int, Int) -> Int.

The last step: use the above type constraints to determine the types of all nodes.

However, there is a problem here: the signature (Float) -> Float implementation we chose in the first step conflicts with the signature (Int, Int) -> Int implementation we chose in the second step and our constraint 5 (T0 is a method that takes T1, T4, and returns T5). The solution is to abandon the current choice and roll back to the second step for T0.

Ultimately, the system will traverse all infix + implementations and find that none of the implementations satisfy both constraint 5 and constraint 7 (T5 is of type Double).

Thus, the type constraint system will roll back to the first step and choose the implementation with the signature (Double, Double) -> Double for T2. Finally, this implementation also satisfies the constraints of T0.

However, upon discovering that the Double type and ExpressibleByIntegerLiteral do not match, the type constraint system will continue to roll back, searching for suitable overload methods.

T2 has a total of 6 implementations, but the last 3 implementations cannot be optimized (because they are generic implementations, thus having a higher priority than the implementation that explicitly declares parameters as Double).

In the type constraint system, this special optimization is one of the features of rapid overload mentioned in Unexpected behaviors.

Thanks to this special "optimization," the type constraint system requires 76 queries to find a reasonable solution. If we add some additional overloads, this number can become unimaginable. For example, if we add another infix + operator in the example, like let a: Double = 0 + 1 + -(2), it will require 1190 queries to find a reasonable solution.

The process of querying solutions is a typical operation with exponential time complexity. The search range for separation units is called “Cartesian product”, and for n separation units in the graph, the algorithm will search within the n-dimensional Cartesian product (which is also an operation with exponential space complexity).

Based on my tests, having 6 separation units within a single statement is enough to trigger the expression was too complex error in Swift.

Linearizing the Type Constraint System#

The best solution to the problem repeatedly mentioned in this article is to fix it in the compiler.

The type constraint system uses an exponential time complexity algorithm to solve method overloading issues because Swift needs to traverse and search elements in the n-dimensional “Cartesian product” space generated by method overloads to determine a suitable option (this should be the best solution until a better one is found).

To avoid generating an n-dimensional Cartesian product space, we need to design a method to achieve the independence of related logic implementations without letting them depend on each other.

Before we begin, I must give you an important reminder:

Friendly reminder, these things only represent my personal opinion: The following discussions are my theoretical analysis of how to solve function overloading issues in Swift's type constraint system. I have not written anything to prove my proposed solutions, which may mean I will overlook some very important aspects.

Premise#

We want to achieve the following two goals:

  1. A node should not depend on or reference other nodes.
  2. The separation units derived from the previous method should intersect with those derived from the latter method and further simplify the two constraints of the separation units.

The first goal can be achieved by restricting the constraint paths of nodes. In Swift, the constraints of each node are bidirectional, and the constraints of each node start from every branch of the expression, then propagate from the main trunk to the linear traversal of child nodes. During this process, we can selectively merge the same constraint logic to combine these constraints instead of referencing the corresponding type constraints from other nodes.

In the second goal, supporting the previous method by reducing the complexity of constraint propagation further simplifies the relevant constraints. The most important intersection point between the separation units of each overloaded method is the output of an overloaded function, which may serve as the input for another overloaded function. This operation should be calculated based on the 2D Cartesian product generated by the two overloaded methods whose parameters intersect. For other possible intersection points, providing a mathematically rigorous intersection proof is very difficult, and such proof is unnecessary; we only need to replicate the greedy strategy used by Swift in complex situations for type selection.

Let's Revisit the Previous Example#

Let's see what the type constraint system would look like if we achieved the two goals mentioned earlier. First, let's review the node graph generated previously:

let a: Double = 1 + -(2)

a slightly more complex constraints graph

Then let's also review the following node constraints:

  1. T1 is of type ExpressibleByIntegerLiteral
  2. T3 is of type ExpressibleByIntegerLiteral
  3. T2 is a method that takes T3 and returns T4
  4. T2 is one of the 6 implementations of prefix -, and to satisfy the principle that special operator overloads have a higher priority than generic overloads in Swift, the types Double, Float, or Float80 are prioritized.
  5. T4 is one of the six return values of prefix -, and similarly, to satisfy the principle that special operator overloads have a higher priority than generic overloads in Swift, the types Double, Float, or Float80 are prioritized.
  6. T0 is a method that takes T1, T4, and returns T5
  7. T0 is one of the 6 implementations of infix +, and the parameters coming from the right side are any of the return values from prefix -, and to satisfy the principle that special operator overloads have a higher priority than generic overloads in Swift, the types Double, Float, or Float80 are prioritized.
  8. T5 is of type Double

Passing Node Constraints from Right to Left#

We traverse from right to left (from leaf nodes to the main trunk).

When the node constraints propagate from T3 to T2, a new constraint is added: "T2's input value must be a value converted from ExpressibleByIntegerLiteral." Now, with the new constraint rule and the original rules acting simultaneously, once we confirm that all nodes with T2 are successfully constrained by the new rule, or if there is a conflict with the rule "specific operation overloads take precedence over generic operation overloads" (for example, in prefix -, Double, Float, or Float80 will be prioritized), we can discard the newly established node constraint rule. During the propagation of node constraints from T2 to T4, a new constraint is added: "T4 must be one of the 6 types returned by prefix -, with Double, Float, or Float80 being prioritized." During the propagation of node constraints from T4 to T0, a new constraint is added: "T0's second parameter must be any of the 6 types returned by prefix -, with Double, Float, or Float80 being prioritized." Combining with the existing node constraints of T0, the node constraint for T0 becomes: "T0 is one of the 6 implementations of infix +, and the parameter coming from the right side is any of the return parameters from prefix -, with Double, Float, or Float80 being prioritized." When the node constraints propagate from T1 to T0, no new constraints need to be added (here, T0 has already been strictly constrained by the conditions we added, and the originally used ExpressibleByIntegerLiteral type has been replaced by any of the types Double, Float, or Float80). When the node constraints propagate from T0 to T5, a new constraint is added: "T5 is one of the 6 return values of infix +, and the second parameter of infix + is the return value from prefix -, with Double, Float, or Float80 being prioritized." Under the combined action of the above constraints, we can ultimately confirm that the type of T5 is Double.

After the above process, the entire set of node constraints iterates into the following:

  1. T1 is of type ExpressibleByIntegerLiteral
  2. T3 is of type ExpressibleByIntegerLiteral
  3. T2 is a method that takes T3 and returns T4
  4. T2 is one of the 6 implementations of prefix -, and to satisfy the principle that special operator overloads have a higher priority than generic overloads in Swift, the types Double, Float, or Float80 are prioritized.
  5. T4 is one of the six return values of prefix -, and similarly, to satisfy the principle that special operator overloads have a higher priority than generic overloads in Swift, the types Double, Float, or Float80 are prioritized.
  6. T0 is a method that takes T1, T4, and returns T5
  7. T0 is one of the 6 implementations of infix +, and the parameters coming from the right side are any of the return values from prefix -, and to satisfy the principle that special operator overloads have a higher priority than generic overloads in Swift, the types Double, Float, or Float80 are prioritized.
  8. T5 is of type Double

Passing Node Constraints from Left to Right#

Now we start traversing from left to right (first traversing the main trunk, then the leaf nodes).

We start traversing from T5, where constraint 5 states: "T5 is of type Double." At this point, we add a new constraint to T0: "T0's return value type must be Double." Once this constraint takes effect, we can exclude all overloads of infix + except for (Double, Double) -> Double. The node constraints continue to propagate from T0 to T1, and based on the parameter requirements of infix +, we create a new constraint for T1: T1 must be of type Double. Under the influence of multiple constraints, the previously mentioned " T1 is of type ExpressibleByIntegerLiteral" becomes " T1 is of type Double." When the node constraints propagate from T0 to T4, based on the requirements of the second parameter of infix +, we determine that the type of T4 is Double. During the propagation of node constraints from T4 to T2, we add a new constraint: "T2's return value must be of type Double." Under the combined action of the above rules, we can determine that T2 is of type (Double) -> Double for prefix -. Finally, based on the above constraints, we can determine that T3 is of type Double.

In the end, the entire type constraint system becomes the following:

  1. T1 is of type Double.
  2. T3 is of type Double.
  3. T2 is of type (Double) -> Double for prefix -.
  4. T0 is of type (Double, Double) -> Double for infix +.
  5. T5 is of type Double.

Now, the entire type constraint operation has come to an end.

Well, I propose this algorithm to improve the operations related to method overloading, so I will denote the number of method overloads as n. Then I will denote the average number of function overloads as m.

As I mentioned earlier, in Swift, the compiler determines the final result by searching within an n-dimensional Cartesian product space. Its time complexity is O(m^n).

The algorithm I propose searches within a 2D space for n-1 separation units. Its execution time is m^2*n. Since m is associated with n, we can derive its final time complexity as O(n).

Generally speaking, when n is large, linear complexity algorithms are more adaptable than exponential time complexity algorithms, but we need to clarify what kind of situation can be considered as n being a large number. In this example, 3 is already a very "large" number. As I mentioned earlier, the built-in type constraint system in Swift will perform 1190 searches to confirm the final result. Meanwhile, the algorithm I designed only requires 336 searches. This clearly reduces the final time consumption.

I conducted an interesting experiment: in the previously mentioned example let a: Double = 1 + -(2), both the type constraint system in Swift and the algorithm I designed search within a 2D Cartesian product space containing 168 possibilities.

The type constraint algorithm currently used in Swift selects 76 out of the 168 possibilities generated in the 2D Cartesian product space from the overloads of prefix - and infix +. However, doing so results in 567 calls to ConstraintSystem::matchTypes, of which 546 are used to search for suitable overload functions.

My designed algorithm searches all 168 possibilities, but according to my analysis, it ultimately only generates 22 calls to ConstraintSystem::matchTypes.

Determining a non-public algorithm requires many guesses, so knowing the specific details of an algorithm is very difficult. However, I believe that my algorithm performs better than or at least matches the existing algorithms under any magnitude.

Will Swift Soon Improve Its Type System?#

While I would love to say, "I did all the work myself; look how perfectly this code runs," that can only be a thought. A whole system consists of thousands of logical units and cannot be discussed by isolating a single node.

Do you think the Swift development team is trying to linearize the type constraint system? I hold a negative view on this.

In this article, “[swift-dev] A type-checking performance case study” indicates that the official developers believe it is normal for the type constraint system to use an exponential time complexity algorithm. Rather than spending time optimizing the algorithm, it would be better to refactor the standard library to make it more reasonable.

A little complaint:

  • It seems that the first two chapters of this article are completely useless; I should quietly delete them.
  • I believe my idea is correct; the type constraint system should undergo significant improvements so that we won't be troubled by the issues mentioned above.

Friendly reminder: Theoretically, the type constraint system is not the most critical part of the entire language, so if it undergoes improvements, it should be released in a minor version iteration rather than a major version update.

Conclusion#

In my experience with Swift, the error expression was too complex to be solved in reasonable time occurs frequently, and I do not consider it a simple error. If you use a large number of methods or mathematical operations in a single example, you should regularly refer to this article.

The exponential time complexity algorithm used in Swift may also lead to longer compilation times. Although I do not have precise statistics on the time allocation throughout the compilation process, it is likely that the system spends most of its time on calculations related to the type constraint system.

This problem can be avoided when we write our code, but honestly, it is unnecessary to do so. If the compiler could adopt the linear time algorithm I proposed, I am confident that these issues would no longer be problems.

Until the compiler makes specific changes, the issues mentioned in this article will continue to trouble us, and the struggle with the compiler will persist.

Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.