Pure functions

Pure functions resemble mathematical functions. They do nothing other than compute a result based on their input. A pure function has two important characteristics:

It has no side effects. Just like a mathematical function, a pure function only returns a value.
It is consistent. Its result is determined only by its input arguments. Given the same input, it always returns the same output.

An example of a pure function is:

    public static string Greet(Person person, string greeting = "Hello")
        => $"{greeting}, {person?.FirstName} {person?.LastName}";

    var john = new Person("John", "Doe");
    Greet(john);

Note that the only things Greet uses to calculate its result are its arguments: the Person object and the greeting. It has no side effects and does not change any state.

What are side effects? A side effect is any interaction with the outside world from within a function. That could be changing a variable that exists outside the function or calling another function that itself has side effects⁽¹⁾. Some examples of side effects are:

Changing state outside the function, for example an instance or static field.
Changing arguments or outer variables.
Performing I/O, such as an HTTP request.
Getting the current time.

In contrast, impure functions may cause side effects or use something other than their input arguments to calculate the output value.
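The kinds of impurity listed above can be sketched as follows. This is only an illustration: the class, method, and field names are hypothetical and do not come from any library.

```csharp
using System;
using System.Collections.Generic;

public static class ImpureExamples
{
    // Hypothetical shared state, used only for illustration.
    private static int _callCount;

    // Impure: changes a static field that outlives the call.
    public static int NextId()
    {
        _callCount++;
        return _callCount;
    }

    // Impure: mutates the list passed in by the caller.
    public static void AddDefault(List<string> names)
    {
        names.Add("Unknown");
    }

    // Impure: performs I/O by writing to the console.
    public static void Log(string message)
    {
        Console.WriteLine(message);
    }

    // Impure: the result depends on the system clock, not on an argument.
    public static bool IsWeekend()
    {
        DayOfWeek day = DateTime.Now.DayOfWeek;
        return day == DayOfWeek.Saturday || day == DayOfWeek.Sunday;
    }
}
```

None of these functions can be understood from its signature alone: each one either changes something the caller can observe or reads something the caller did not pass in.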

Let's assume we have a function MultiplyByTwo(int x) which takes an argument called x and multiplies it by 2. To use this function, we simply provide a value for x: MultiplyByTwo(3). This means exactly the same thing as writing 6. Anywhere you see MultiplyByTwo(3), you can substitute 6. So, Console.WriteLine(MultiplyByTwo(3)) is the same as Console.WriteLine(6). This property is called referential transparency, and it holds only if MultiplyByTwo is a pure function. If it had side effects, such as printing the result to the console, we couldn't replace all calls to MultiplyByTwo(3) with 6 without losing the console output.
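A minimal sketch of the two variants described above. MultiplyByTwo is the function from the text; MultiplyByTwoAndLog and the Multiply class are hypothetical names introduced here for the impure counterpart.

```csharp
using System;

public static class Multiply
{
    // Pure: the result depends only on x and nothing else happens.
    public static int MultiplyByTwo(int x) => x * 2;

    // Impure (hypothetical variant): returns the same value,
    // but also writes a line to the console.
    public static int MultiplyByTwoAndLog(int x)
    {
        int result = x * 2;
        Console.WriteLine($"MultiplyByTwo({x}) = {result}");
        return result;
    }
}
```

Replacing Multiply.MultiplyByTwo(3) with 6 changes nothing about the program. Replacing Multiply.MultiplyByTwoAndLog(3) with 6 keeps the value but silently drops a line of console output, so the two programs are no longer equivalent.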

Predictability

Pure functions are predictable in the sense that they will always return the same output for a given input. If we pass john to the Greet function from above, it will always return "Hello, John Doe".

    Greet(john); // => Hello, John Doe
    Greet(john); // => Hello, John Doe
    Greet(john); // => Hello, John Doe

Of course, not all functions are "predictable" in this way:

    public static string GreetWithTime(Person person, string greeting = "Hello")
        => $"{greeting}, {person?.FirstName} {person?.LastName}. Current time is {DateTime.Now}";

    GreetWithTime(john); // => Hello, John Doe. Current time is 6/11/2018 7:50:58 AM
    GreetWithTime(john); // => Hello, John Doe. Current time is 6/11/2018 7:51:27 AM
    GreetWithTime(john); // => Hello, John Doe. Current time is 6/11/2018 7:52:01 AM

GreetWithTime returns a different result each time we call it, even though we pass the same john object. This is because GreetWithTime relies on external state: the current time. Since the time changes between calls, the return value changes as well. This makes the function unpredictable.
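One common way to restore predictability is to pass the time in as an argument, so the caller decides where it comes from. The sketch below assumes a minimal Person class (the original type is not shown) and uses the hypothetical names Greeter and GreetAt. It also formats the time with the invariant culture, because the default DateTime formatting depends on the current culture, which is another piece of external state.

```csharp
using System;
using System.Globalization;

// Assumed shape of Person; the original definition is not shown in the text.
public sealed class Person
{
    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }
    public string LastName { get; }
}

public static class Greeter
{
    // Pure: the time is an ordinary argument, so the same inputs
    // always produce the same string.
    public static string GreetAt(Person person, DateTime now, string greeting = "Hello")
    {
        string time = now.ToString("O", CultureInfo.InvariantCulture);
        return $"{greeting}, {person?.FirstName} {person?.LastName}. Current time is {time}";
    }
}
```

The impurity has not disappeared; it has moved to the caller, which can write Greeter.GreetAt(john, DateTime.Now) at the edge of the program while tests pass a fixed DateTime.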

Pure functions and parallelization

Let's take a look at the following simple example:

    var names = new List<string>() { "John", "Melcy", "Thor" };
    int counter = 1;

    string Format(string s) => $"{counter++}) {s}";

    List<string> result = names
        .Select(x => x.ToUpper())
        .Select(Format)
        .ToList();

We create a list of names and an integer counter. Then we iterate through the list, convert the names to upper case, and use a local function to prepend the current value of the counter. The problem is Format, which is an impure function because it reads and modifies the shared counter variable.

What if we want to perform the same operation on a bigger data set in parallel? We can use PLINQ and simply call AsParallel() on the list.

    List<string> result = names
        .AsParallel()
        .Select(x => x.ToUpper())
        .Select(Format)
        .ToList();

The parallel version has multiple threads reading and updating the counter at the same time, which makes the result unpredictable. One way to fix this problem is to avoid the shared variable altogether.

    List<string> result = names
        .AsParallel()
        .Select(x => x.ToUpper())
        .Zip(ParallelEnumerable.Range(1, names.Count), (name, c) => $"{c}) {name}")
        .ToList();

There is no explicit mutation of shared state in this version.

Mutating function arguments

Here is an example of an impure function that changes the contents of its parameter:

    public static void Concat(StringBuilder sb, string append)
    {
        sb.Append('-' + append);
    }

    StringBuilder sb = new StringBuilder("StringOne");
    Concat(sb, "StringTwo");

We can implement Concat as a pure function like so:

    public static string Concat(string s, string append)
    {
        return s + '-' + append;
    }

    string s1 = "StringOne";
    string s2 = Concat(s1, "StringTwo");

This version produces the same text, but it returns the concatenated value as a new string, stored here in s2, and leaves s1 untouched. It meets both requirements for a pure function: its result depends only on its arguments, and it has no side effects because it does not modify any existing data.
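To make the difference visible from the caller's side, the two versions can be placed next to each other as overloads. This is a sketch only; the ConcatDemo class name is hypothetical.

```csharp
using System;
using System.Text;

public static class ConcatDemo
{
    // Impure: modifies the StringBuilder owned by the caller.
    public static void Concat(StringBuilder sb, string append)
    {
        sb.Append('-' + append);
    }

    // Pure: builds and returns a new string, leaving its inputs untouched.
    public static string Concat(string s, string append)
    {
        return s + '-' + append;
    }
}
```

After calling the first overload on a StringBuilder holding "StringOne", the caller's object now contains "StringOne-StringTwo". After calling the second overload with s1 = "StringOne", s1 is still "StringOne" and only the returned value holds the combined text.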

Functions with local side effects

For a function to be pure it must not mutate any state. But what about functions that always return the same result even though they change state internally?

    public static int Sum(int to)
    {
        int sum = 0;
        for (int i = 0; i < to; i++)
            sum += i;
        return sum;
    }

The sum variable is changed in every iteration of the loop. The function has a local side effect – it is not visible from the outside and the user of the function does not care. Therefore, this function is pure.

Memoization

With pure functions, we only need to compute the output once for any given input. Caching and reusing the result of a computation is called memoization, and it can be done safely with pure functions.
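A minimal sketch of memoization, assuming a dictionary-backed cache. Memoizer and Memoize are hypothetical names, not part of the .NET base class library, and this version is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

public static class Memoizer
{
    // Wraps f so that each distinct input is computed only once.
    // This is safe only when f is pure: an impure f would have its
    // side effects skipped, or return stale results, on cache hits.
    public static Func<TIn, TOut> Memoize<TIn, TOut>(Func<TIn, TOut> f)
    {
        var cache = new Dictionary<TIn, TOut>();
        return input =>
        {
            if (!cache.TryGetValue(input, out var result))
            {
                result = f(input);      // first call for this input: compute
                cache[input] = result;  // and remember the result
            }
            return result;
        };
    }
}
```

Wrapping an expensive pure function once, for example var cachedSquare = Memoizer.Memoize<int, int>(x => x * x), gives a function that returns the same values as the original but runs the underlying computation at most once per input.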

1) Calling a pure function is not a side effect and the calling function is still pure.

