<p>VSadov’s Blog: Random pile of my own opinions. Vladimir Sadov, <a href="http://mustoverride.com">http://mustoverride.com</a></p>
<h1>C# Local Functions vs. Lambda Expressions.</h1>
<p>Vladimir Sadov, 2017-04-08. <a href="http://mustoverride.com/local_functions">http://mustoverride.com/local_functions</a></p>
<p>C# Local Functions are often viewed as a further enhancement of lambda expressions. While the features are related, there are also major differences.</p>
<p>Local Functions is the C# implementation of <a href="https://en.wikipedia.org/wiki/Nested_function">Nested function</a> feature. It is a bit unusual for a language to get support for nested functions several versions after supporting lambdas. Usually it is the other way around.</p>
<p>Lambdas, or first-class functions in general, require local variables that are not allocated on the stack and whose lifetimes are tied to the function objects that need them. It is nearly impossible to implement them correctly and efficiently without relying on garbage collection or placing the burden of variable ownership on the user via solutions such as capture lists. That was a serious blocking issue for some early languages.<br />
A simple implementation of nested functions does not run into such complications, so it is more common for a language to support only nested functions and not lambdas.</p>
<p>Anyway, since C# has had lambdas for a long time, it does make sense to look at Local Functions in terms of differences and similarities.</p>
<h2 id="lambda-expressions">Lambda expressions.</h2>
<p>Lambda expressions like <code class="highlighter-rouge">x => x + x</code> are expressions that abstractly represent a piece of code and how it binds to parameters and variables in its lexical environment. Being an abstract representation of code, a lambda expression cannot be used on its own. In order to use values produced by a lambda expression, it needs to be converted to something more material such as a delegate or an expression tree.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="nn">System</span><span class="p">;</span>
<span class="k">using</span> <span class="nn">System.Linq.Expressions</span><span class="p">;</span>
<span class="k">class</span> <span class="nc">Program</span>
<span class="p">{</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// can't do much with the lambda expression directly</span>
<span class="c1">// (x => x + x).ToString(); // error</span>
<span class="c1">// can assign to a variable of delegate type and invoke</span>
<span class="n">Func</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">></span> <span class="n">f</span> <span class="p">=</span> <span class="p">(</span><span class="n">x</span> <span class="p">=></span> <span class="n">x</span> <span class="p">+</span> <span class="n">x</span><span class="p">);</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="nf">f</span><span class="p">(</span><span class="m">21</span><span class="p">));</span> <span class="c1">// prints "42"</span>
<span class="c1">// can assign to a variable of expression type and introspect</span>
<span class="n">Expression</span><span class="p"><</span><span class="n">Func</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>></span> <span class="n">e</span> <span class="p">=</span> <span class="p">(</span><span class="n">x</span> <span class="p">=></span> <span class="n">x</span> <span class="p">+</span> <span class="n">x</span><span class="p">);</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">e</span><span class="p">);</span> <span class="c1">// prints "x => (x + x)"</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>There are several things that are worth noting:</p>
<ul>
<li>
<p>lambdas are expressions that produce functional values.</p>
</li>
<li>
<p>lambda values have unbounded lifetimes: a value lives from the execution of the lambda expression for as long as any reference to it exists. That implies that any local variables used, or “captured”, by the lambda from the enclosing method must be allocated on the heap. Since the lifetime of the lambda value is not limited by the lifetime of the stack frame where it was produced, the variables cannot be allocated on that stack frame.</p>
</li>
<li>
<p>a lambda expression requires that all external variables used in its body are definitely assigned at the time the lambda expression is executed. The moments of the first and the last use of a lambda are rarely deterministic, so the language assumes that lambda values can be used right after creation and for as long as they are reachable.<br />
As a result, a lambda value must be fully functional at the point of its creation, and all outer variables that it uses must be definitely assigned.</p>
</li>
</ul>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kt">int</span> <span class="n">x</span><span class="p">;</span>
<span class="c1">// ERROR: 'x' is not definitely assigned</span>
<span class="n">Func</span><span class="p"><</span><span class="kt">int</span><span class="p">></span> <span class="n">f</span> <span class="p">=</span> <span class="p">()</span> <span class="p">=></span> <span class="n">x</span><span class="p">;</span>
</code></pre></div></div>
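<p>The unbounded lifetime of captured variables described in the list above can be sketched as follows. This is only an illustration; the names <code class="highlighter-rouge">CaptureLifetimeSketch</code> and <code class="highlighter-rouge">MakeCounter</code> are made up for this example and do not come from any library.</p>

```csharp
using System;

static class CaptureLifetimeSketch
{
    // Hypothetical helper: returns a delegate that keeps using 'count'
    // after MakeCounter itself has returned.
    public static Func<int> MakeCounter()
    {
        // 'count' is captured by the lambda below, so it has to outlive
        // the stack frame of MakeCounter.
        int count = 0;

        // Every invocation of the returned delegate increments the same variable.
        return () => ++count;
    }
}
```

<p>Two delegates obtained from two separate calls to <code class="highlighter-rouge">MakeCounter()</code> count independently, because each call captures its own <code class="highlighter-rouge">count</code>.</p>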
<ul>
<li>lambdas do not have names and cannot be referred to symbolically. In particular lambda expressions cannot be declared recursively.</li>
</ul>
<p>NOTE: It is possible to <em>make</em> a recursive lambda by invoking a variable to which the lambda is assigned or by passing it to a higher-order method which self-applies its parameter (see: <a href="https://blogs.msdn.microsoft.com/wesdyer/2007/02/02/anonymous-recursion-in-c/">Anonymous Recursion in C#</a>), but that does not make such expressions truly self-referential.</p>
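<p>A minimal sketch of the first workaround from the note, recursion through the variable that holds the lambda. The names <code class="highlighter-rouge">RecursiveLambdaSketch</code> and <code class="highlighter-rouge">Factorial</code> are hypothetical and used only for illustration.</p>

```csharp
using System;

static class RecursiveLambdaSketch
{
    public static int Factorial(int n)
    {
        // The lambda body reads 'fac', so 'fac' must be definitely assigned
        // before the lambda expression is executed. Hence the two-step setup.
        Func<int, int> fac = null;

        // The lambda does not refer to itself; it invokes whatever delegate
        // the variable 'fac' holds at the time of the call.
        fac = a => a <= 1 ? 1 : a * fac(a - 1);

        return fac(n);
    }
}
```

<p>The recursion goes through the variable, so reassigning <code class="highlighter-rouge">fac</code> later would change what the “recursive” call invokes. That is the sense in which the lambda is not truly self-referential.</p>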
<h2 id="local-functions">Local functions.</h2>
<p>A local function is basically just a method declared inside another method as a way of reducing the visibility of the method to the scope within which it is declared.</p>
<p>Naturally, the code in a local function has access to everything that is accessible in its containing scope: local variables, the enclosing method’s parameters, type parameters, and other local functions. That is just normal lexical scoping and it works the same as in lambdas. A notable exception is labels: labels of the enclosing method are not visible in a local function.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="nn">System</span><span class="p">;</span>
<span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="kt">object</span> <span class="n">o</span><span class="p">;</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">M1</span><span class="p">(</span><span class="kt">int</span> <span class="n">p</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">l</span> <span class="p">=</span> <span class="m">123</span><span class="p">;</span>
<span class="c1">// lambda has access to o, p, l,</span>
<span class="n">Action</span> <span class="n">a</span> <span class="p">=</span> <span class="p">()=></span> <span class="n">o</span> <span class="p">=</span> <span class="p">(</span><span class="n">p</span> <span class="p">+</span> <span class="n">l</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">M2</span><span class="p">(</span><span class="kt">int</span> <span class="n">p</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">l</span> <span class="p">=</span> <span class="m">123</span><span class="p">;</span>
<span class="c1">// Local Function has access to o, p, l,</span>
<span class="k">void</span> <span class="nf">a</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">o</span> <span class="p">=</span> <span class="p">(</span><span class="n">p</span> <span class="p">+</span> <span class="n">l</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The obvious difference from lambdas is that local functions have names and can be used without any indirection. Local functions can be recursive.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">int</span> <span class="nf">Fac</span><span class="p">(</span><span class="kt">int</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="nf">FacRecursive</span><span class="p">(</span><span class="kt">int</span> <span class="n">a</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="n">a</span> <span class="p"><=</span> <span class="m">1</span> <span class="p">?</span>
<span class="m">1</span> <span class="p">:</span>
<span class="n">a</span> <span class="p">*</span> <span class="nf">FacRecursive</span><span class="p">(</span><span class="n">a</span> <span class="p">-</span> <span class="m">1</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">return</span> <span class="nf">FacRecursive</span><span class="p">(</span><span class="n">arg</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The main semantic difference from lambda expressions is that local functions are not expressions; they are declaration statements. Declarations are very passive entities when it comes to code execution. In fact, declarations do not really get “executed”. Similarly to other declarations like labels, local function declarations simply introduce the functions into the containing scope without running any code.</p>
<p>What is more important is that neither declarations by themselves nor regular invocations of a nested function result in an indefinite capture of the environment. In simple and common cases, like an ordinary invoke/return scenario, the captured locals do not need to be heap-allocated.</p>
<p>Example:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">M</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">num</span> <span class="p">=</span> <span class="m">123</span><span class="p">;</span>
<span class="c1">// has access to num</span>
<span class="k">void</span> <span class="nf">Nested</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">num</span><span class="p">++;</span>
<span class="p">}</span>
<span class="nf">Nested</span><span class="p">();</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">num</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The code above is emitted as the rough equivalent of the following (decompiled):</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="c1">// A struct to hold "num" variable.</span>
<span class="c1">// We are not storing it on the heap,</span>
<span class="c1">// so it does not need to be a class</span>
<span class="k">private</span> <span class="k">struct</span> <span class="err"><></span><span class="nc">c__DisplayClass0_0</span>
<span class="p">{</span>
<span class="k">public</span> <span class="kt">int</span> <span class="n">num</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">M</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// reserve storage for "num" in a display struct on the _stack_</span>
<span class="n">C</span><span class="p">.<></span><span class="n">c__DisplayClass0_0</span> <span class="n">env</span> <span class="p">=</span> <span class="k">default</span><span class="p">(</span><span class="n">C</span><span class="p">.<></span><span class="n">c__DisplayClass0_0</span><span class="p">);</span>
<span class="c1">// num = 123</span>
<span class="n">env</span><span class="p">.</span><span class="n">num</span> <span class="p">=</span> <span class="m">123</span><span class="p">;</span>
<span class="c1">// Nested()</span>
<span class="c1">// note - passes env as an extra parameter</span>
<span class="n">C</span><span class="p">.<</span><span class="n">M</span><span class="p">></span><span class="nf">g__a0_0</span><span class="p">(</span><span class="k">ref</span> <span class="n">env</span><span class="p">);</span>
<span class="c1">// System.Console.WriteLine(num)</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">env</span><span class="p">.</span><span class="n">num</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// implementation of "Nested()".</span>
<span class="c1">// note - takes env as an extra parameter</span>
<span class="c1">// env is passed by reference so its instance is shared</span>
<span class="c1">// with the caller "M()"</span>
<span class="k">internal</span> <span class="k">static</span> <span class="k">void</span> <span class="p"><</span><span class="n">M</span><span class="p">></span><span class="nf">g__a0_0</span><span class="p">(</span><span class="k">ref</span> <span class="n">C</span><span class="p">.<></span><span class="n">c__DisplayClass0_0</span> <span class="n">env</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">env</span><span class="p">.</span><span class="n">num</span> <span class="p">+=</span> <span class="m">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Note that the code above calls the implementation of “Nested()” directly (not via a delegate indirection) and does not introduce an allocation of display storage on the heap (as a lambda would have). The locals are stored in a struct instead of a class. The lifetime of <code class="highlighter-rouge">num</code> was not altered by its use in <code class="highlighter-rouge">Nested()</code>, so it can still be allocated on the stack. <code class="highlighter-rouge">M()</code> could just pass <code class="highlighter-rouge">num</code> by reference, but the compiler uses a struct for packaging, so that it can pass all locals like <code class="highlighter-rouge">num</code> using just one env parameter.</p>
<p>Another interesting point is that Local Functions can be used as long as they are visible in a given scope. This is an important fact that makes recursive and mutually recursive scenarios possible. That also makes the exact location of the local function declaration in the source largely unimportant.</p>
<p>For example, all the variables of the enclosing method that a Local Function reads must be definitely assigned at the <em>invocation</em> of the Local Function, not at its declaration. Indeed, imposing that requirement at the declaration would not do any good if an invocation can happen earlier.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">void</span> <span class="nf">M</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// error here -</span>
<span class="c1">// Use of unassigned local variable 'num'</span>
<span class="nf">Nested</span><span class="p">();</span>
<span class="kt">int</span> <span class="n">num</span><span class="p">;</span>
<span class="c1">// whether 'num' is assigned here or not is irrelevant</span>
<span class="k">void</span> <span class="nf">Nested</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">num</span><span class="p">++;</span>
<span class="p">}</span>
<span class="n">num</span> <span class="p">=</span> <span class="m">123</span><span class="p">;</span>
<span class="c1">// no error here - 'num' is assigned</span>
<span class="nf">Nested</span><span class="p">();</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">num</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Also, if a local function is never used, it is no better than a piece of unreachable code, and any variable that it would otherwise use does not need to be assigned.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">void</span> <span class="nf">M</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">num</span><span class="p">;</span>
<span class="c1">// warning - Nested() is never used.</span>
<span class="k">void</span> <span class="nf">Nested</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// no errors on unassigned 'num'.</span>
<span class="c1">// this code never runs.</span>
<span class="n">num</span><span class="p">++;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
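<p>Returning to the point that a local function can be used wherever it is visible, here is a small sketch of mutual recursion between two local functions that are declared after the statement that calls them. The type and method names are invented for this illustration, and the even/odd check is deliberately naive.</p>

```csharp
using System;

static class MutualRecursionSketch
{
    // Hypothetical helper: reports whether a non-negative number is even.
    public static bool IsEven(int n)
    {
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));

        // Both local functions are in scope here even though
        // their declarations appear further down in the source.
        return Even(n);

        // Even and Odd refer to each other; neither needs to be declared first.
        bool Even(int k) => k == 0 || Odd(k - 1);
        bool Odd(int k) => k != 0 && Even(k - 1);
    }
}
```

<p>Neither local function captures any variables of <code class="highlighter-rouge">IsEven</code>, so no definite assignment questions arise at the call site.</p>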
<h2 id="so-what-is-the-purpose-of-local-functions">So, what is the purpose of Local Functions?</h2>
<p>The main value proposition of local functions, in comparison to lambdas, is that local functions are simpler, both conceptually and in terms of run time overhead.</p>
<p>Lambdas serve their role as <a href="https://en.wikipedia.org/wiki/First-class_function">first-class functions</a> very well, but sometimes you only need a simple helper. A lambda assigned to a local variable could do the job, but there is the overhead of indirection, allocation of a delegate and possibly a closure. A private method works too and is cheaper to call, but there is an issue with encapsulation, or lack thereof. Such a helper would be visible to everyone in the containing type. Too many helpers like this can result in a serious mess.</p>
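<p>The options for a small helper can be put side by side in a sketch. All names here are hypothetical; the three public methods compute the same value and differ only in where the helper lives.</p>

```csharp
using System;

static class HelperOptionsSketch
{
    // Private method: cheap to call, but visible to every member of the type.
    private static int Square(int v) => v * v;

    public static int ViaPrivateMethod(int a, int b) => Square(a) + Square(b);

    public static int ViaLambda(int a, int b)
    {
        // Lambda in a local: scoped to this method, but every call goes
        // through a delegate.
        Func<int, int> sq = v => v * v;
        return sq(a) + sq(b);
    }

    public static int ViaLocalFunction(int a, int b)
    {
        // Local function: scoped to this method and called directly.
        return Sq(a) + Sq(b);

        int Sq(int v) => v * v;
    }
}
```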
<p>A Local Function fits this scenario nicely. The overhead of calling a Local Function is comparable to that of a call to a private method, but there is no issue with polluting the containing type with a method that nothing else should call.</p>
<h1>C# Tuples. Conversions.</h1>
<p>Vladimir Sadov, 2017-02-11. <a href="http://mustoverride.com/tuples_conversions">http://mustoverride.com/tuples_conversions</a></p>
<p>In a statically typed language like C#, every new kind of type or expression needs to define how it fits into the framework of type conversions. Tuples are not an exception.</p>
<p>Truth be told, initially it was believed that it would be better for tuples to have only very limited support for conversions. That was mostly out of fear that a composite type such as a tuple would run into contradictory scenarios with conversion classification. For example, if <code class="highlighter-rouge">(int, object)</code> needs to convert to <code class="highlighter-rouge">(object, int)</code>, do we have an implicit or explicit conversion or something in between? Is it boxing, or unboxing, or both?<br />
Forcing the user to deconstruct/reconstruct into a tuple of a desired type would avoid the issues, but it was quickly found to be inconvenient.</p>
<p><strong>“Distributing” behavior of tuple conversions.</strong></p>
<p>The overall guiding principle is that a tuple conversion is a composite conversion consisting of <code class="highlighter-rouge">N</code> underlying conversions, one per element. Classification questions are “distributed” to the underlying conversions, which may themselves be tuple conversions, in which case the classification is recursive.</p>
<p>This uncomplicated principle allows tuple conversions to be a relatively low-friction feature, but the design has some interesting details.</p>
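<p>A rough sketch of how the classification distributes over elements, using conversions between tuple <em>types</em>. The type and method names are made up for this example.</p>

```csharp
static class TupleConversionSketch
{
    // int -> long and string -> object are both implicit,
    // so the composite tuple conversion is implicit as well.
    public static (long, object) Widen((int, string) source) => source;

    // long -> int and object -> string are explicit conversions,
    // so the composite conversion is explicit and needs a cast.
    public static (int, string) Narrow((long, object) source) => ((int, string))source;

    // The second element is itself a tuple; its conversion is classified
    // recursively from string -> object and float -> double.
    public static (long, (object, double)) WidenNested((int, (string, float)) source) => source;
}
```

<p>If any single element required an explicit conversion, the whole tuple conversion would be explicit, which is why <code class="highlighter-rouge">Narrow</code> cannot drop the cast.</p>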
<p><strong>Tuple Literal Conversions and Target Typing.</strong></p>
<p>C# distinguishes conversions <em>from expression</em> and conversions <em>from type</em>.</p>
<p>Conversions from <em>expression</em> are used when expression results are coerced to be of a particular type:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// conversion from expression is used to turn int into long</span>
<span class="kt">long</span> <span class="n">x</span> <span class="p">=</span> <span class="kt">int</span><span class="p">.</span><span class="n">MaxValue</span><span class="p">;</span>
</code></pre></div></div>
<p>Conversions from <em>type</em> are used in analysis that operates with types - like when determining the best overload resolution candidate.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">void</span> <span class="nf">M1</span><span class="p">(</span><span class="kt">int</span> <span class="n">val</span><span class="p">)</span> <span class="p">=></span> <span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"int"</span><span class="p">);</span>
<span class="k">void</span> <span class="nf">M1</span><span class="p">(</span><span class="kt">long</span> <span class="n">val</span><span class="p">)</span> <span class="p">=></span> <span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"long"</span><span class="p">);</span>
<span class="c1">// overload resolution analyses parameter types of the applicable candidates</span>
<span class="c1">// M1(int) is selected: `int` converts implicitly to `long` but not vice versa,</span>
<span class="c1">// which makes `int` the "better" conversion target</span>
<span class="nf">M1</span><span class="p">(</span><span class="kt">short</span><span class="p">.</span><span class="n">MaxValue</span><span class="p">);</span>
</code></pre></div></div>
<p>Existence of a conversion from a type generally entails that similar conversion from an expression of such type to the same target type also exists. The opposite is not always true though.<br />
There are several reasons for the distinction:</p>
<ul>
<li>some expressions do not have a natural type at all.<br />
<em>Example:</em> <code class="highlighter-rouge">(x)=>x</code> does not have any type on its own, but converts to <code class="highlighter-rouge">Func<int, int></code></li>
<li>some expressions have a natural type, but their value fits other types.<br />
<em>Example:</em> <code class="highlighter-rouge">42</code> has type <code class="highlighter-rouge">int</code>, but is implicitly convertible to <code class="highlighter-rouge">byte</code>, even though the type <code class="highlighter-rouge">int</code> is not.</li>
<li>
<p>some types have special behaviors.<br />
<em>Example:</em> expressions of type <code class="highlighter-rouge">dynamic</code> implicitly convert to any type, but the <em>type</em> <code class="highlighter-rouge">dynamic</code> does not have such conversions.</p>
<p>Indeed, since most types are implicitly convertible to <code class="highlighter-rouge">dynamic</code>, having an equal conversion the other way would make an overload that takes <code class="highlighter-rouge">dynamic</code> always ambiguous. An implicit conversion from a <code class="highlighter-rouge">dynamic</code> <em>expression</em>, though, just means that conversions of <code class="highlighter-rouge">dynamic</code> values are statically acceptable, with the actual dynamic conversions happening at run time.</p>
</li>
</ul>
<p>Tuple conversions transparently have the same distinction. There are tuple conversions that exist only from tuple literals, but not from tuple types. That happens when there are conversions from the argument expressions of the literal to the target element types, but not from the types of those arguments.</p>
<p>Examples of tuple literal conversions:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// the RHS tuple literal does not have a natural type at all</span>
<span class="c1">// because some of the argument expressions do not have a type.</span>
<span class="c1">// Yet, it is implicitly convertible to the LHS type</span>
<span class="c1">// because every argument _expression_ is implicitly convertible</span>
<span class="p">(</span><span class="n">Func</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>,</span> <span class="kt">string</span><span class="p">,</span> <span class="kt">object</span><span class="p">)</span> <span class="n">t1</span> <span class="p">=</span> <span class="p">((</span><span class="n">x</span><span class="p">)=></span><span class="n">x</span><span class="p">,</span> <span class="k">null</span><span class="p">,</span> <span class="m">1</span><span class="p">);</span>
<span class="c1">// RHS has natural type (int, (int, int)),</span>
<span class="c1">// but is implicitly convertible to (byte, (short, object))</span>
<span class="c1">// because element-wise implicit conversions from argument expressions exist.</span>
<span class="p">(</span><span class="kt">byte</span><span class="p">,</span> <span class="p">(</span><span class="kt">short</span><span class="p">,</span> <span class="kt">object</span><span class="p">))</span> <span class="n">t2</span> <span class="p">=</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="p">(</span><span class="m">2</span><span class="p">,</span> <span class="m">3</span><span class="p">));</span>
</code></pre></div></div>
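<p>For contrast, a small sketch of the same target type reached from a tuple <em>literal</em> and from a value of a tuple <em>type</em>. The names are hypothetical; the commented-out line is the one that is expected not to compile.</p>

```csharp
static class LiteralVsTypeSketch
{
    public static (byte, short) FromLiteral()
    {
        // The constant expressions 1 and 2 convert implicitly to byte and short,
        // so the literal as a whole converts implicitly.
        return (1, 2);
    }

    public static (byte, short) FromTyped((int, int) source)
    {
        // (byte, short) bad = source; // error: no implicit conversion from the type (int, int)

        // The type int has only explicit conversions to byte and short,
        // so the tuple conversion from type is explicit too.
        return ((byte, short))source;
    }
}
```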
<p><strong>Target-typing and evaluation order</strong></p>
<p>Conversion of a literal to the target type is often called ‘target-typing’. That is because the RHS is never materialized in its natural type, instead an instance of the target type is directly created from the RHS value. Indeed, the RHS may not even have a natural type so an instance of such type would not be possible to create.</p>
<p>All the same rules apply to tuple literal conversions, just in a “distributed” manner.</p>
<p>Example:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="n">Func</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>,</span> <span class="kt">string</span><span class="p">)</span> <span class="n">a1</span> <span class="p">=</span> <span class="p">((</span><span class="n">x</span><span class="p">)=></span><span class="n">x</span><span class="p">,</span> <span class="k">null</span><span class="p">);</span> <span class="c1">// is the same as</span>
<span class="p">(</span><span class="n">Func</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>,</span> <span class="kt">string</span><span class="p">)</span> <span class="n">a2</span> <span class="p">=</span> <span class="p">((</span><span class="n">Func</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>)((</span><span class="n">x</span><span class="p">)=></span><span class="n">x</span><span class="p">),</span> <span class="p">(</span><span class="kt">string</span><span class="p">)</span><span class="k">null</span><span class="p">);</span>
<span class="p">(</span><span class="kt">byte</span><span class="p">,</span> <span class="kt">short</span><span class="p">)</span> <span class="n">b1</span> <span class="p">=</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="m">2</span><span class="p">);</span> <span class="c1">// is the same as</span>
<span class="p">(</span><span class="kt">byte</span><span class="p">,</span> <span class="kt">short</span><span class="p">)</span> <span class="n">b2</span> <span class="p">=</span> <span class="p">((</span><span class="kt">byte</span><span class="p">)</span><span class="m">1</span><span class="p">,</span> <span class="p">(</span><span class="kt">short</span><span class="p">)</span><span class="m">2</span><span class="p">);</span>
</code></pre></div></div>
<p>The evaluation order of target typing in tuple literals is observable when both arguments and conversions have side effects:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">using</span> <span class="nn">System</span><span class="p">;</span>
<span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// literal tuple conversion is "distributed" to the arguments of the tuple.</span>
<span class="c1">// I.E. every argument is individually target-typed.</span>
<span class="c1">// An instance of (int, int, int) is never created.</span>
<span class="c1">//</span>
<span class="c1">// This is very similar to a constructor call:</span>
<span class="c1">// (C1 a, C1 b, C1 c) t = new ValueTuple<C1,C1, C1>(NextInt(), NextInt(), NextInt());</span>
<span class="c1">//</span>
<span class="p">(</span><span class="n">C1</span> <span class="n">a</span><span class="p">,</span> <span class="n">C1</span> <span class="n">b</span><span class="p">,</span> <span class="n">C1</span> <span class="n">c</span><span class="p">)</span> <span class="n">t</span> <span class="p">=</span> <span class="p">(</span><span class="nf">NextInt</span><span class="p">(),</span> <span class="nf">NextInt</span><span class="p">(),</span> <span class="nf">NextInt</span><span class="p">());</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"result: "</span> <span class="p">+</span> <span class="n">t</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">private</span> <span class="k">static</span> <span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
<span class="k">static</span> <span class="kt">int</span> <span class="nf">NextInt</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"produced: "</span> <span class="p">+</span> <span class="n">i</span><span class="p">);</span>
<span class="k">return</span> <span class="n">i</span><span class="p">++;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">class</span> <span class="nc">C1</span>
<span class="p">{</span>
<span class="k">private</span> <span class="kt">int</span> <span class="n">val</span><span class="p">;</span>
<span class="k">public</span> <span class="nf">C1</span><span class="p">(</span><span class="kt">int</span> <span class="n">val</span><span class="p">)</span> <span class="p">=></span> <span class="k">this</span><span class="p">.</span><span class="n">val</span> <span class="p">=</span> <span class="n">val</span><span class="p">;</span>
<span class="k">public</span> <span class="k">static</span> <span class="k">implicit</span> <span class="k">operator</span> <span class="nf">C1</span><span class="p">(</span><span class="kt">int</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"converted: "</span> <span class="p">+</span> <span class="n">arg</span><span class="p">);</span>
<span class="k">return</span> <span class="k">new</span> <span class="nf">C1</span><span class="p">(</span><span class="n">arg</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">public</span> <span class="k">override</span> <span class="kt">string</span> <span class="nf">ToString</span><span class="p">()</span>
<span class="p">{</span>
<span class="k">return</span> <span class="n">val</span><span class="p">.</span><span class="nf">ToString</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">===</span> <span class="n">prints</span><span class="p">:</span>
<span class="n">produced</span><span class="p">:</span> <span class="m">0</span>
<span class="n">converted</span><span class="p">:</span> <span class="m">0</span>
<span class="n">produced</span><span class="p">:</span> <span class="m">1</span>
<span class="n">converted</span><span class="p">:</span> <span class="m">1</span>
<span class="n">produced</span><span class="p">:</span> <span class="m">2</span>
<span class="n">converted</span><span class="p">:</span> <span class="m">2</span>
<span class="n">result</span><span class="p">:</span> <span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="m">2</span><span class="p">)</span>
</code></pre></div></div>
<p><strong>Implicit and Explicit conversions</strong></p>
<p>Tuple conversions can be implicit or explicit.<br />
Naturally, a tuple type/expression:</p>
<ul>
<li>has an implicit tuple conversion to the target type if all elements have implicit conversions.</li>
<li>has an explicit tuple conversion to the target type if all elements have explicit conversions.</li>
</ul>
<p>Example:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">)</span> <span class="n">ti</span> <span class="p">=</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="m">1</span><span class="p">);</span>
<span class="c1">// type `int` has implicit conversion to `dynamic`, so this works</span>
<span class="p">(</span><span class="kt">dynamic</span><span class="p">,</span> <span class="kt">dynamic</span><span class="p">)</span> <span class="n">td</span> <span class="p">=</span> <span class="n">ti</span><span class="p">;</span>
<span class="c1">// `dynamic` type has _explicit_ conversion to `int`, so this works</span>
<span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">)</span> <span class="n">ti1</span> <span class="p">=</span> <span class="p">((</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">))</span><span class="n">td</span><span class="p">;</span>
<span class="c1">// Method((long, long)) is preferred</span>
<span class="c1">// since `(long, long)` is implicitly convertible to `(dynamic, dynamic)`,</span>
<span class="c1">// but `(dynamic, dynamic)` has no implicit conversion to `(long, long)`</span>
<span class="nf">Method</span><span class="p">(</span><span class="n">ti</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Method</span><span class="p">((</span><span class="kt">long</span><span class="p">,</span> <span class="kt">long</span><span class="p">)</span> <span class="n">ll</span><span class="p">){}</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Method</span><span class="p">((</span><span class="kt">dynamic</span><span class="p">,</span> <span class="kt">dynamic</span><span class="p">)</span> <span class="n">dd</span><span class="p">){}</span>
</code></pre></div></div>
<p>The existence of a tuple conversion whenever the underlying element conversions exist is very similar to lifted conversions in the case of nullable types: as long as <code class="highlighter-rouge">T</code> converts to <code class="highlighter-rouge">U</code>, the same conversion exists between <code class="highlighter-rouge">T?</code> and <code class="highlighter-rouge">U?</code>. The main difference for tuples is that they have more than one underlying conversion, and the classification of the overall tuple conversion is performed conservatively, based on <em>all</em> the underlying conversions.</p>
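<p>As a minimal sketch of that conservative classification (the variable names are illustrative and not part of the original example): a single element that converts only explicitly, such as <code class="highlighter-rouge">long</code> to <code class="highlighter-rouge">int</code>, is enough to make the whole tuple conversion explicit.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

class Program
{
    static void Main()
    {
        (long, int) source = (1L, 2);

        // Does not compile: `long` to `int` is an explicit-only conversion,
        // so the tuple conversion as a whole is explicit as well.
        // (int, int) narrowed = source;

        // An explicit cast works, since every element has at least an explicit conversion.
        (int, int) narrowed = ((int, int))source;
        Console.WriteLine(narrowed); // prints: (1, 2)
    }
}
</code></pre></div></div>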
<p>It is actually possible for a conversion to be lifted into both nullable and tuple conversions.</p>
<p>Example of a conversion lifted into a nullable, a tuple and then a nullable conversion again:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="kt">int</span><span class="p">?,</span> <span class="kt">int</span><span class="p">?)?</span> <span class="n">nubTupleOfNubs</span> <span class="p">=</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="m">1</span><span class="p">);</span>
<span class="c1">// `int` has implicit conversion to `long`, thus</span>
<span class="c1">// `(int?, int?)?` has implicit conversion to `(long?, long?)?`</span>
<span class="p">(</span><span class="kt">long</span><span class="p">?,</span> <span class="kt">long</span><span class="p">?)?</span> <span class="n">td</span> <span class="p">=</span> <span class="n">nubTupleOfNubs</span><span class="p">;</span>
</code></pre></div></div>
<p><strong>Tuple Conversions are Standard Conversions, unconditionally.</strong></p>
<p>User-defined conversions are, perhaps, the most complicated aspect of C# conversions.</p>
<p>To define composition with user-defined operators, the C# language has a concept of Standard Conversions. Standard Conversions are specially privileged conversions: they can “stack” with user-defined conversion <em>operators</em> to form user-defined <em>conversions</em>. The reason such a set of conversions exists is to widen the applicability of user-defined conversions to more cases than are covered by the operator itself. The reason for the set to be small, and in particular to not include user-defined conversions, is to limit the number of combinations that can result in a conversion.</p>
<p>For example, if there is a user-defined conversion operator from type <code class="highlighter-rouge">C1</code> to <code class="highlighter-rouge">byte</code>, then an instance of type <code class="highlighter-rouge">C1</code> is also convertible to <code class="highlighter-rouge">short</code>. Since there is a standard conversion from <code class="highlighter-rouge">byte</code> to <code class="highlighter-rouge">short</code>, the compiler can stitch one user-defined operator and one standard conversion into a user-defined conversion from <code class="highlighter-rouge">C1</code> to <code class="highlighter-rouge">short</code>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C1 --- [implicit user defined operator] ---> byte --- [implicit numeric conversion] ---> short
</code></pre></div></div>
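<p>The chain above can be tried out with a small sketch. The body of the operator and the value <code class="highlighter-rouge">42</code> are assumptions made purely for illustration; only the shape of the conversion comes from the text.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

class C1
{
    // The only user-defined operator: C1 to byte.
    public static implicit operator byte(C1 arg) => 42;
}

class Program
{
    static void Main()
    {
        // No operator from C1 to short is declared. The compiler combines
        // the C1-to-byte operator with the standard byte-to-short conversion.
        short s = new C1();
        Console.WriteLine(s); // prints: 42
    }
}
</code></pre></div></div>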
<p>Note that the chain of conversions is never longer than 2 - one user-defined operator and, optionally, one standard conversion on either end. With such constraints the algorithm for finding conversion chains stays fairly simple.</p>
<p>Consider that we are looking for a conversion from <code class="highlighter-rouge">T1</code> to <code class="highlighter-rouge">T2</code>. Since any user-defined operator involved would need to be defined in either <code class="highlighter-rouge">T1</code> or <code class="highlighter-rouge">T2</code>, these are the only types we would look into. We would collect all the user-defined operators defined in these types that convert from <code class="highlighter-rouge">T1</code> or to <code class="highlighter-rouge">T2</code>. Now, for those operators that go “half way”, from <code class="highlighter-rouge">T1</code> to <code class="highlighter-rouge">S1</code> or from <code class="highlighter-rouge">S2</code> to <code class="highlighter-rouge">T2</code>, we would look for a <em>standard</em> conversion that would “complete” the conversion, from <code class="highlighter-rouge">S1</code> to <code class="highlighter-rouge">T2</code> or from <code class="highlighter-rouge">T1</code> to <code class="highlighter-rouge">S2</code>. If exactly one such conversion is found, then we can build a conversion from <code class="highlighter-rouge">T1</code> to <code class="highlighter-rouge">T2</code>; if more than one is found, then we have an ambiguity.</p>
<p>The point is that the search space has a strict upper bound. If, for example, the conversion that stacks with a user-defined operator could be another user-defined conversion, we would need to look at potentially endless chains of conversions involving an unlimited number of intermediate types.</p>
<p>The question is whether tuple conversions belong to “The Exclusive Club of Standard Conversions” or not. It was decided that <em>tuple conversions are, in fact, standard conversions</em>.</p>
<p>The convenience is obvious - if, for example, there is a user-defined implicit conversion operator from <code class="highlighter-rouge">C1</code> to <code class="highlighter-rouge">(int, int)</code>, then we can implicitly convert <code class="highlighter-rouge">C1</code> to <code class="highlighter-rouge">(long, long)</code> as well.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C1 --- [implicit user defined op.] ---> (int, int) --- [implicit tuple conv.] ---> (long, long)
</code></pre></div></div>
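<p>A minimal sketch of this scenario follows. The operator body and the values <code class="highlighter-rouge">(1, 2)</code> are assumptions chosen for illustration.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

class C1
{
    // The only user-defined operator: C1 to (int, int).
    public static implicit operator (int, int)(C1 arg) => (1, 2);
}

class Program
{
    static void Main()
    {
        // The implicit tuple conversion (int, int) to (long, long) acts as
        // the standard conversion that completes the chain.
        (long, long) t = new C1();
        Console.WriteLine(t); // prints: (1, 2)
    }
}
</code></pre></div></div>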
<p>A curious part is that tuple conversions are standard conversions regardless of whether their underlying conversions are standard or not. The underlying conversions could even be user-defined conversions themselves.<br />
This is a case where the conversion classification is <em>not</em> “distributed” to the underlying conversions. It turns out that for the purpose of limiting the search space, such a requirement is unnecessary: at the top we still have a chain of conversions no longer than 2, and underlying element conversions, even if user-defined, cannot nest indefinitely, because tuples cannot nest indefinitely.</p>
<p>It does, however, allow for some interesting scenarios.</p>
<p>Example:<br />
(implicit expanding into nested tuples of any level)</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="nn">System</span><span class="p">;</span>
<span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">C1</span> <span class="n">y</span> <span class="p">=</span> <span class="k">new</span> <span class="nf">C1</span><span class="p">();</span>
<span class="c1">// `C1` converts to `(byte, C1)`, and thus to `(int, C1)` too.</span>
<span class="c1">// `C1` converts to `(byte, byte)`, and thus to `(int, int)` too.</span>
<span class="c1">// as a result `C1` converts to types like `(int, (int, ...))`</span>
<span class="c1">// regardless of how deeply they are nested</span>
<span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">)))))))))))</span> <span class="n">x12</span> <span class="p">=</span> <span class="n">y</span><span class="p">;</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">x12</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">class</span> <span class="nc">C1</span>
<span class="p">{</span>
<span class="k">private</span> <span class="kt">byte</span> <span class="n">x</span><span class="p">;</span>
<span class="k">static</span> <span class="k">public</span> <span class="k">implicit</span> <span class="k">operator</span> <span class="p">(</span><span class="kt">byte</span><span class="p">,</span> <span class="n">C1</span><span class="p">)(</span><span class="n">C1</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="p">((</span><span class="kt">byte</span><span class="p">)(</span><span class="n">arg</span><span class="p">.</span><span class="n">x</span><span class="p">++),</span> <span class="n">arg</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">static</span> <span class="k">public</span> <span class="k">implicit</span> <span class="k">operator</span> <span class="p">(</span><span class="kt">byte</span> <span class="n">c</span><span class="p">,</span> <span class="kt">byte</span> <span class="n">d</span><span class="p">)(</span><span class="n">C1</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="p">((</span><span class="kt">byte</span><span class="p">)(</span><span class="n">arg</span><span class="p">.</span><span class="n">x</span><span class="p">++),</span> <span class="p">(</span><span class="kt">byte</span><span class="p">)(</span><span class="n">arg</span><span class="p">.</span><span class="n">x</span><span class="p">++));</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">prints</span><span class="p">:</span>
<span class="p">(</span><span class="m">0</span><span class="p">,</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="p">(</span><span class="m">2</span><span class="p">,</span> <span class="p">(</span><span class="m">3</span><span class="p">,</span> <span class="p">(</span><span class="m">4</span><span class="p">,</span> <span class="p">(</span><span class="m">5</span><span class="p">,</span> <span class="p">(</span><span class="m">6</span><span class="p">,</span> <span class="p">(</span><span class="m">7</span><span class="p">,</span> <span class="p">(</span><span class="m">8</span><span class="p">,</span> <span class="p">(</span><span class="m">9</span><span class="p">,</span> <span class="p">(</span><span class="m">10</span><span class="p">,</span> <span class="m">11</span><span class="p">)))))))))))</span>
</code></pre></div></div>
<p><strong>Tuple conversions and extension methods</strong></p>
<p>Another interesting example of “distributed” conversion classification in tuples involves applicability checks for extension method receivers.</p>
<p>Generally, an expression is acceptable as a receiver of an extension method call if the extension method targets the type of that expression or any of its base types or implemented interfaces.</p>
<p>From a more formal point of view, an expression is applicable as a receiver of an extension method if it is convertible to the type of the instance parameter via:</p>
<ul>
<li>identity conversion</li>
<li>implicit reference conversion</li>
<li>implicit boxing conversion</li>
</ul>
<p>Based on that alone, an extension method defined on <code class="highlighter-rouge">object</code> would be applicable to an expression of type <code class="highlighter-rouge">(int[], int[])</code>. However, an extension defined on <code class="highlighter-rouge">(IEnumerable<int>, IEnumerable<int>)</code> would not be applicable. Early users of the feature indicated that such a limitation is unexpected and inconvenient (see <a href="https://github.com/dotnet/roslyn/issues/16159">bug 16159</a>).</p>
<p>The solution was to add implicit tuple conversions to the set of allowed instance conversions, but require that all underlying element conversions are valid instance conversions. That is, the instance conversion rule became distributed and recursive in the case of tuples.</p>
<p>Examples:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">using</span> <span class="nn">System</span><span class="p">;</span>
<span class="k">using</span> <span class="nn">System.Collections.Generic</span><span class="p">;</span>
<span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// ok, </span>
<span class="c1">// `string` has implicit reference conversion to `IEnumerable<char>`</span>
<span class="p">(</span><span class="s">"hello"</span><span class="p">,</span> <span class="s">"hi"</span><span class="p">).</span><span class="nf">M1</span><span class="p">();</span>
<span class="c1">// ok</span>
<span class="c1">// `int` has implicit boxing conversion to `object`</span>
<span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="p">(</span><span class="m">2</span><span class="p">,</span> <span class="m">3</span><span class="p">)).</span><span class="nf">M2</span><span class="p">();</span>
<span class="c1">// ok</span>
<span class="c1">// the first element is convertible as a whole</span>
<span class="c1">// the second element is convertible recursively</span>
<span class="p">((</span><span class="s">"hi"</span><span class="p">,</span> <span class="s">"hello"</span><span class="p">),</span> <span class="p">(</span><span class="m">2</span><span class="p">,</span> <span class="m">3</span><span class="p">)).</span><span class="nf">M2</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">static</span> <span class="k">class</span> <span class="nc">C1</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">static</span> <span class="k">void</span> <span class="nf">M1</span><span class="p">(</span><span class="k">this</span> <span class="p">(</span><span class="n">IEnumerable</span><span class="p"><</span><span class="kt">char</span><span class="p">>,</span> <span class="n">IEnumerable</span><span class="p"><</span><span class="kt">char</span><span class="p">>)</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"M1"</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">public</span> <span class="k">static</span> <span class="k">void</span> <span class="nf">M2</span><span class="p">(</span><span class="k">this</span> <span class="p">(</span><span class="kt">object</span><span class="p">,</span> <span class="p">(</span><span class="kt">object</span><span class="p">,</span> <span class="kt">object</span><span class="p">))</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"M2"</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p><strong>Why so complicated?</strong></p>
<p>Conversions are a pervasive and complicated aspect of the language. Some degree of complexity is unavoidable when a feature needs to work with conversions and behave in a consistent and predictable manner.</p>
<p>Integration with conversions is often cited as contributing a good portion of the famous “<a href="https://blogs.msdn.microsoft.com/ericgu/2004/01/12/minus-100-points/">minus 100 points</a>” penalty that applies to every new language feature and needs to be balanced out with benefits.</p>
Vladimir Sadov
http://mustoverride.com
In a statically typed language like C#, every new kind of type or a new expression needs to define how it fits into the framework of type conversions. Tuples are not an exception.
C# Tuples. More about element names.
2017-01-28T00:00:00+00:00
2017-01-28T00:00:00+00:00
http://mustoverride.com/tuples_names
<p>C# tuples can have optional element names. Here are some interesting details about tuple element names and how they are treated by the language.</p>
<p>The matter of allowing named elements was a major choice in the design of C# tuples. It was definitely attractive to allow element names when tuples are used in APIs.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="kt">int</span> <span class="n">CustomerID</span><span class="p">,</span> <span class="kt">int</span> <span class="n">Orders</span><span class="p">)</span> <span class="nf">GetRecord</span><span class="p">(){...}</span>
</code></pre></div></div>
<p>is clearly more descriptive and less error prone than</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// NOTE: the first element is CustomerID, second is Orders</span>
<span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">)</span> <span class="nf">GetRecord</span><span class="p">(){...}</span>
</code></pre></div></div>
<p>On the other hand, names could become an obstacle when implementing abstract operations that operate on tuples.<br />
If a dictionary factory is implemented in terms of Key and Value tuples, would it work with Customers and Orders?</p>
<p>What about completely generic algorithms?<br />
If I have <code class="highlighter-rouge">(int X, int Y)</code> and <code class="highlighter-rouge">int Z</code>, can I apply the following?</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">(</span><span class="n">T</span><span class="p">,</span> <span class="n">U</span><span class="p">,</span> <span class="n">V</span><span class="p">)</span> <span class="n">Append</span><span class="p"><</span><span class="n">T</span><span class="p">,</span> <span class="n">U</span><span class="p">,</span> <span class="n">V</span><span class="p">>((</span><span class="n">T</span><span class="p">,</span> <span class="n">U</span><span class="p">)</span> <span class="n">tu</span><span class="p">,</span> <span class="n">V</span> <span class="n">v</span><span class="p">)</span> <span class="p">=></span> <span class="p">(</span><span class="n">tu</span><span class="p">.</span><span class="n">Item1</span><span class="p">,</span> <span class="n">tu</span><span class="p">.</span><span class="n">Item2</span><span class="p">,</span> <span class="n">v</span><span class="p">);</span>
</code></pre></div></div>
<p>If users can’t use tuples in generic/abstract scenarios just because of the element names, they’d be inclined to avoid the names altogether, making the whole support of names questionable.</p>
<p>C# designers wanted both the expressiveness of the names and the assurance that names do not “stand in the way” when tuples are used as structural types. So the guiding principle was set to be:</p>
<p><strong>Element names are semantically insignificant except when used directly.</strong></p>
<p>The tuple types with element names are really the same as ones without. The only addition is the presence of “friendly names”.<br />
In particular, all tuple elements have the default <code class="highlighter-rouge">Item1</code>, <code class="highlighter-rouge">Item2</code>, …, <code class="highlighter-rouge">ItemN</code> names, even those that have “friendly” element names. It is allowed for friendly names to be the same as the default names, but only as long as they are in the right position.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Item2 causes an error here, since it is in a wrong position.</span>
<span class="c1">// Item2 name is essentially already taken by the element #2</span>
<span class="p">(</span><span class="kt">int</span> <span class="n">Item1</span><span class="p">,</span> <span class="kt">int</span> <span class="n">X</span><span class="p">,</span> <span class="kt">int</span> <span class="n">Item2</span><span class="p">)</span> <span class="n">v</span><span class="p">;</span>
</code></pre></div></div>
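<p>For illustration, a small sketch (the names <code class="highlighter-rouge">point</code> and <code class="highlighter-rouge">size</code> are made up for this example) showing that the default names stay usable alongside friendly names, and that differing friendly names do not get in the way of assignment:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

class Program
{
    static void Main()
    {
        (int X, int Y) point = (1, 2);

        // `X` is only a friendly alias for the element also known as `Item1`.
        Console.WriteLine(point.X == point.Item1); // prints: True

        // Tuple types that differ only in element names are the same type,
        // so this assignment is an identity conversion.
        (int Width, int Height) size = point;
        Console.WriteLine(size.Height); // prints: 2
    }
}
</code></pre></div></div>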
<p>Another consequence is that overloaded methods whose signatures differ only in tuple element names are disallowed.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">Ext</span><span class="p">((</span><span class="kt">int</span> <span class="n">X</span><span class="p">,</span> <span class="kt">int</span> <span class="n">Y</span><span class="p">)</span> <span class="n">arg</span><span class="p">){}</span>
<span class="c1">// error CS0111: Type 'C' already defines a member called 'Ext' with the same parameter types</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">Ext</span><span class="p">((</span><span class="kt">int</span> <span class="n">V</span><span class="p">,</span> <span class="kt">int</span> <span class="n">W</span><span class="p">)</span> <span class="n">arg</span><span class="p">){}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Conversely, overload resolution will not consider element names when selecting the target of an invocation.<br />
The following call is ambiguous since, ignoring element names, both <code class="highlighter-rouge">Ext</code> methods have the same signature.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">M</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">var</span> <span class="n">v</span> <span class="p">=</span> <span class="k">default</span><span class="p">((</span><span class="kt">int</span> <span class="n">X</span><span class="p">,</span> <span class="kt">int</span> <span class="n">Y</span><span class="p">));</span>
<span class="c1">// error CS0121: The call is ambiguous between the following. . .</span>
<span class="n">v</span><span class="p">.</span><span class="nf">Ext</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">static</span> <span class="k">class</span> <span class="nc">Ext1</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">static</span> <span class="k">void</span> <span class="nf">Ext</span><span class="p">(</span><span class="k">this</span> <span class="p">(</span><span class="kt">int</span> <span class="n">X</span><span class="p">,</span> <span class="kt">int</span> <span class="n">Y</span><span class="p">)</span> <span class="n">arg</span><span class="p">){}</span>
<span class="p">}</span>
<span class="k">static</span> <span class="k">class</span> <span class="nc">Ext2</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">static</span> <span class="k">void</span> <span class="nf">Ext</span><span class="p">(</span><span class="k">this</span> <span class="p">(</span><span class="kt">int</span> <span class="n">V</span><span class="p">,</span> <span class="kt">int</span> <span class="n">W</span><span class="p">)</span> <span class="n">arg</span><span class="p">){}</span>
<span class="p">}</span>
</code></pre></div></div>
<p><strong>The dynamic type of a tuple variable is just the underlying ValueTuple.</strong></p>
<p>Essentially, the “tuple” part of these types, including their element names, is a compile-time decoration that the compiler understands, uses and propagates through expressions.</p>
<p>The <a href="https://en.wikipedia.org/wiki/Type_erasure">erasure</a> of tuple-related information can be observed by checking the type of boxed instances or the static type as tracked by the CLR type system.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Program</span>
<span class="p">{</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// tuple instances do not know they are tuples</span>
<span class="kt">object</span> <span class="n">instance</span> <span class="p">=</span> <span class="p">(</span><span class="n">Alice</span><span class="p">:</span> <span class="m">1</span><span class="p">,</span> <span class="n">Bob</span><span class="p">:</span> <span class="m">2</span><span class="p">);</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">instance</span><span class="p">.</span><span class="nf">GetType</span><span class="p">());</span>
<span class="c1">// CLR does not track tuple types either.</span>
<span class="nf">PrintStaticType</span><span class="p">((</span><span class="n">Alice</span><span class="p">:</span> <span class="m">1</span><span class="p">,</span> <span class="n">Bob</span><span class="p">:</span> <span class="m">2</span><span class="p">));</span>
<span class="p">}</span>
<span class="k">static</span> <span class="k">void</span> <span class="n">PrintStaticType</span><span class="p"><</span><span class="n">T</span><span class="p">>(</span><span class="n">T</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="k">typeof</span><span class="p">(</span><span class="n">T</span><span class="p">));</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">The</span> <span class="n">output</span> <span class="k">is</span><span class="p">:</span>
<span class="n">System</span><span class="p">.</span><span class="n">ValueTuple</span><span class="err">`</span><span class="m">2</span><span class="p">[</span><span class="n">System</span><span class="p">.</span><span class="n">Int32</span><span class="p">,</span> <span class="n">System</span><span class="p">.</span><span class="n">Int32</span><span class="p">]</span>
<span class="n">System</span><span class="p">.</span><span class="n">ValueTuple</span><span class="err">`</span><span class="m">2</span><span class="p">[</span><span class="n">System</span><span class="p">.</span><span class="n">Int32</span><span class="p">,</span> <span class="n">System</span><span class="p">.</span><span class="n">Int32</span><span class="p">]</span>
</code></pre></div></div>
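<p>The same erasure can be seen from another angle: two boxed tuples that differ only in element names should be indistinguishable at run time. The following is a minimal sketch, not part of the original sample; the names <code class="highlighter-rouge">ErasureDemo</code>, <code class="highlighter-rouge">SameRuntimeType</code> and <code class="highlighter-rouge">EqualDespiteNames</code> are made up for illustration.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

static class ErasureDemo
{
    // Both tuples box to the same ValueTuple type, whatever names the source used.
    public static bool SameRuntimeType()
    {
        object named = (Alice: 1, Bob: 2);
        object plain = (1, 2);
        return named.GetType() == plain.GetType();
    }

    // Equality is implemented by ValueTuple and compares element values only.
    public static bool EqualDespiteNames()
    {
        object first = (Alice: 1, Bob: 2);
        object second = (X: 1, Y: 2);
        return first.Equals(second);
    }

    static void Main()
    {
        Console.WriteLine(SameRuntimeType());
        Console.WriteLine(EqualDespiteNames());
    }
}
</code></pre></div></div>
<p>Both methods are expected to return <code class="highlighter-rouge">true</code>, since nothing in the boxed instances records the names.</p>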
<p><strong>Representing element names in metadata</strong></p>
<p>Since CLR types themselves do not store tuple information, the compiler emits extra information to specify tuple element names in member signatures.<br />
The encoding is rather simple: <code class="highlighter-rouge">TupleElementNamesAttribute</code> contains an array of element name strings in the pre-order depth-first traversal order of the parts of the corresponding type. In other words, as you walk through the type declaration, every tuple element consumes one string from the attribute, and an unnamed element consumes a <code class="highlighter-rouge">null</code>. If no tuple element names are present, the attribute does not need to be emitted.</p>
<p>Example:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// "C" and "F" are intentionally missing - will be encoded as "null" strings.</span>
<span class="k">static</span> <span class="n">Dictionary</span><span class="p"><(</span><span class="kt">int</span> <span class="n">A</span><span class="p">,</span> <span class="kt">int</span> <span class="n">B</span><span class="p">),</span> <span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span> <span class="n">D</span><span class="p">)?></span> <span class="nf">Test</span><span class="p">((</span><span class="kt">int</span><span class="p">[]</span> <span class="n">E</span><span class="p">,</span> <span class="kt">int</span><span class="p">)[]</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="k">null</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Emitted as an equivalent of:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="na">[return: TupleElementNames(new string[]</span><span class="p">{</span><span class="s">"A"</span><span class="p">,</span><span class="s">"B"</span><span class="p">,</span><span class="k">null</span><span class="p">,</span><span class="s">"D"</span><span class="p">})]</span>
<span class="k">private</span> <span class="k">static</span> <span class="n">Dictionary</span><span class="p"><</span><span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>,</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>?></span> <span class="nf">Test</span>
<span class="p">([</span><span class="nf">TupleElementNames</span><span class="p">(</span><span class="k">new</span> <span class="kt">string</span><span class="p">[]{</span><span class="s">"E"</span><span class="p">,</span><span class="k">null</span><span class="p">})]</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">[],</span> <span class="kt">int</span><span class="p">>[]</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="k">null</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>As explained in an <a href="/tuples_valuetuple/">earlier post</a>, ValueTuple types that match a tuple pattern are promoted into corresponding tuple types during metadata import. In addition, the element names are “rehydrated” from a TupleElementNames attribute, if one is specified for the given part of a member signature.</p>
<p>Note that in terms of cross-language interoperability, understanding the <code class="highlighter-rouge">TupleElementNames</code> attribute or the tuple encoding pattern is optional.<br />
If the consuming language does not care about element names (like F#), it can ignore the attribute and just see the signature with “nameless” tuples. If the consuming language does not understand tuples at all (like C# 6), it can still interoperate by using ValueTuple structs.</p>
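<p>A tool that does care about the names can read them back through reflection. The sketch below assumes a hypothetical class <code class="highlighter-rouge">NamesDemo</code> with a sample method <code class="highlighter-rouge">Sample</code>; only <code class="highlighter-rouge">TupleElementNamesAttribute</code> and its <code class="highlighter-rouge">TransformNames</code> property come from the framework. It is an illustration of the encoding, not a description of how the compiler itself imports metadata.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;
using System.Reflection;
using System.Runtime.CompilerServices;

static class NamesDemo
{
    // Hypothetical sample member; the compiler attaches the attribute to its return value.
    public static (int A, int B) Sample() => (1, 2);

    // Reads the names recorded for the return type of a public static method of this class.
    public static string ReadReturnNames(string methodName)
    {
        MethodInfo method = typeof(NamesDemo).GetMethod(methodName);
        var attribute = (TupleElementNamesAttribute)method.ReturnParameter
            .GetCustomAttribute(typeof(TupleElementNamesAttribute));

        // No attribute means no element names were specified in the signature.
        return attribute == null ? "(none)" : string.Join(",", attribute.TransformNames);
    }

    static void Main()
    {
        Console.WriteLine(ReadReturnNames(nameof(Sample)));
    }
}
</code></pre></div></div>
<p>A consumer that skips this lookup still sees a perfectly usable signature; it just works with <code class="highlighter-rouge">Item1</code> and <code class="highlighter-rouge">Item2</code> instead of the names.</p>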
<p><strong>Compile time propagation of tuple types</strong></p>
<p>Compile-time propagation of tuple types can go quite far, including through generic type inference. At compile time, tuple types are “real types”.</p>
<p>Example of a tuple type with element names propagated through several levels of type inference:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// The only argument with a natural type is the "42"</span>
<span class="c1">// T infers its type from "42"</span>
<span class="c1">// U has dependency on T which is resolved via lambda inference once we know T</span>
<span class="c1">// U[ ] is the return type of Apply and is known once we know U</span>
<span class="c1">// type of 'r' is inferred to be the same as the return type of 'Apply'</span>
<span class="kt">var</span> <span class="n">r</span> <span class="p">=</span> <span class="nf">Apply</span><span class="p">(</span><span class="m">42</span><span class="p">,</span> <span class="p">(</span><span class="n">val</span><span class="p">)</span> <span class="p">=></span> <span class="p">(</span><span class="n">Alice</span><span class="p">:</span> <span class="n">val</span><span class="p">,</span> <span class="n">Bob</span><span class="p">:</span> <span class="n">val</span><span class="p">.</span><span class="nf">ToString</span><span class="p">()));</span>
<span class="c1">// As a result</span>
<span class="c1">// r has type: (int Alice, string Bob)[ ]</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">r</span><span class="p">[</span><span class="m">0</span><span class="p">].</span><span class="n">Alice</span><span class="p">);</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">r</span><span class="p">[</span><span class="m">0</span><span class="p">].</span><span class="n">Bob</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">static</span> <span class="n">U</span><span class="p">[]</span> <span class="n">Apply</span><span class="p"><</span><span class="n">T</span><span class="p">,</span> <span class="n">U</span><span class="p">>(</span><span class="n">T</span> <span class="n">arg</span><span class="p">,</span> <span class="n">Func</span><span class="p"><</span><span class="n">T</span><span class="p">,</span> <span class="n">U</span><span class="p">></span> <span class="n">f</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="k">new</span> <span class="n">U</span><span class="p">[]</span> <span class="p">{</span> <span class="nf">f</span><span class="p">(</span><span class="n">arg</span><span class="p">)</span> <span class="p">};</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The element names are not always involved in the inference. In scenarios where tuple arguments match tuple parameters of the same cardinality, the inference works in a purely structural way and element names are ignored.</p>
<p>Naturally, when type parameters are inferred from the element types of an argument, the names of those elements cannot take part in the inference.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">var</span> <span class="n">v</span> <span class="p">=</span> <span class="p">(</span><span class="n">Alice</span><span class="p">:</span> <span class="s">"hi"</span><span class="p">,</span> <span class="n">Bob</span><span class="p">:</span> <span class="s">"there"</span><span class="p">);</span>
<span class="c1">// T is inferred to be 'string'</span>
<span class="c1">// so is the type of r</span>
<span class="kt">var</span> <span class="n">r</span> <span class="p">=</span> <span class="nf">Test</span><span class="p">(</span><span class="n">v</span><span class="p">).</span><span class="n">Result</span><span class="p">;</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">r</span><span class="p">.</span><span class="nf">ToUpper</span><span class="p">());</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="nf">Append</span><span class="p">(</span><span class="n">t</span><span class="p">:</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="m">2</span><span class="p">),</span> <span class="n">third</span><span class="p">:</span> <span class="m">3</span><span class="p">));</span>
<span class="p">}</span>
<span class="c1">// T is inferred from the first element type of the argument tuple</span>
<span class="k">static</span> <span class="k">async</span> <span class="n">Task</span><span class="p"><</span><span class="n">T</span><span class="p">></span> <span class="n">Test</span><span class="p"><</span><span class="n">T</span><span class="p">,</span> <span class="n">U</span><span class="p">>((</span><span class="n">T</span><span class="p">,</span> <span class="n">U</span><span class="p">)</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// just await something</span>
<span class="k">await</span> <span class="n">Task</span><span class="p">.</span><span class="nf">Yield</span><span class="p">();</span>
<span class="k">return</span> <span class="n">arg</span><span class="p">.</span><span class="n">Item1</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// T, U are inferred from element types of 2-tuple argument</span>
<span class="c1">// and used as element types of 3-tuple result</span>
<span class="c1">// element names are unrelated and unimportant for inference purposes here</span>
<span class="k">static</span> <span class="p">(</span><span class="n">T</span> <span class="n">First</span><span class="p">,</span> <span class="n">U</span> <span class="n">Second</span><span class="p">,</span> <span class="n">V</span> <span class="n">Third</span><span class="p">)</span> <span class="n">Append</span><span class="p"><</span><span class="n">T</span><span class="p">,</span> <span class="n">U</span><span class="p">,</span> <span class="n">V</span><span class="p">>((</span><span class="n">T</span> <span class="n">First</span><span class="p">,</span> <span class="n">U</span> <span class="n">Second</span><span class="p">)</span> <span class="n">t</span><span class="p">,</span> <span class="n">V</span> <span class="n">third</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="p">(</span><span class="n">t</span><span class="p">.</span><span class="n">First</span><span class="p">,</span> <span class="n">t</span><span class="p">.</span><span class="n">Second</span><span class="p">,</span> <span class="n">third</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p><strong>Tuple type merging and dropping of element names</strong></p>
<p>When inferring tuple names from multiple sources, a situation may arise where multiple names would be inferred for the same element. In such a case the conflicting names are “dropped”, leaving the corresponding tuple element unnamed.</p>
<p>Indeed, there are only two design choices here: drop the conflicting names or make the whole scenario an error. However, making it an error would contradict the idea that the presence of element names is semantically insignificant.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">var</span> <span class="n">x</span> <span class="p">=</span> <span class="p">(</span><span class="n">Alice</span><span class="p">:</span> <span class="s">"hi"</span><span class="p">,</span> <span class="n">Bob</span><span class="p">:</span> <span class="s">"there"</span><span class="p">);</span>
<span class="kt">var</span> <span class="n">y</span> <span class="p">=</span> <span class="p">(</span><span class="n">Alpha</span><span class="p">:</span> <span class="s">"bye"</span><span class="p">,</span> <span class="n">Beta</span><span class="p">:</span> <span class="s">"bye"</span><span class="p">);</span>
<span class="c1">// T is inferred to be</span>
<span class="c1">// (string Alice, string Bob) and also</span>
<span class="c1">// (string Alpha, string Beta)</span>
<span class="c1">//</span>
<span class="c1">// To resolve apparent ambiguity conflicting names are dropped.</span>
<span class="c1">// T is just: (string, string)</span>
<span class="kt">var</span> <span class="n">z</span> <span class="p">=</span> <span class="nf">OneOrAnother</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">DateTime</span><span class="p">.</span><span class="n">Now</span><span class="p">.</span><span class="n">DayOfWeek</span> <span class="p">==</span> <span class="n">DayOfWeek</span><span class="p">.</span><span class="n">Friday</span><span class="p">);</span>
<span class="c1">// this would be an error</span>
<span class="c1">// Console.WriteLine(z.Alice);</span>
<span class="c1">// this is still ok</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">z</span><span class="p">.</span><span class="n">Item1</span><span class="p">);</span>
<span class="kt">var</span> <span class="n">x1</span> <span class="p">=</span> <span class="p">(</span><span class="n">Alice</span><span class="p">:</span> <span class="s">"bye"</span><span class="p">,</span> <span class="n">Todd</span><span class="p">:</span> <span class="s">"bye"</span><span class="p">);</span>
<span class="c1">// only ambiguous names are dropped</span>
<span class="c1">// z1 has type: (string Alice, string)</span>
<span class="kt">var</span> <span class="n">z1</span> <span class="p">=</span> <span class="n">DateTime</span><span class="p">.</span><span class="n">Now</span><span class="p">.</span><span class="n">DayOfWeek</span> <span class="p">==</span> <span class="n">DayOfWeek</span><span class="p">.</span><span class="n">Friday</span> <span class="p">?</span>
<span class="n">x</span> <span class="p">:</span>
<span class="n">x1</span><span class="p">;</span>
<span class="c1">// this is ok</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">z1</span><span class="p">.</span><span class="n">Alice</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// T is inferrable from both x and y</span>
<span class="k">static</span> <span class="n">T</span> <span class="n">OneOrAnother</span><span class="p"><</span><span class="n">T</span><span class="p">>(</span><span class="n">T</span> <span class="n">x</span><span class="p">,</span> <span class="n">T</span> <span class="n">y</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">flag</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="n">flag</span> <span class="p">?</span> <span class="n">x</span> <span class="p">:</span> <span class="n">y</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p><strong>Can element names become “semantically significant” through lambda inference?</strong></p>
<p>There is an interesting scenario which seemingly demonstrates that element names <em>can</em> have an effect on overload resolution when combined with lambda inference. The example below is able to steer overload resolution to one of the candidates by using specific tuple element names.<br />
However, on closer examination, the element names are actually <em>used directly</em> in this scenario, so of course they make a difference. It is not a case where two tuple types compete for better applicability; it is a case where two reified lambdas compete, and one of them would have compile errors.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// calls the first Select -</span>
<span class="c1">// the only case where ".Bob" would not be an error</span>
<span class="kt">var</span> <span class="n">r</span> <span class="p">=</span> <span class="nf">Select</span><span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="m">2</span><span class="p">,</span> <span class="n">t</span> <span class="p">=></span> <span class="n">t</span><span class="p">.</span><span class="n">Bob</span><span class="p">);</span>
<span class="c1">// ambiguity error: lambda can be applied in either case</span>
<span class="c1">// var r1 = Select(1, 2, t => t.Alice);</span>
<span class="p">}</span>
<span class="k">delegate</span> <span class="n">TResult</span> <span class="n">Selector1</span><span class="p"><</span><span class="n">TArg</span><span class="p">,</span> <span class="n">TResult</span><span class="p">>(</span><span class="n">TArg</span> <span class="n">arg</span><span class="p">);</span>
<span class="k">static</span> <span class="n">T</span> <span class="n">Select</span><span class="p"><</span><span class="n">T</span><span class="p">>(</span><span class="n">T</span> <span class="n">x</span><span class="p">,</span> <span class="n">T</span> <span class="n">y</span><span class="p">,</span> <span class="n">Selector1</span><span class="p"><(</span><span class="n">T</span> <span class="n">Alice</span><span class="p">,</span> <span class="n">T</span> <span class="n">Bob</span><span class="p">),</span> <span class="n">T</span><span class="p">></span> <span class="n">selector</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"first overload"</span><span class="p">);</span>
<span class="k">return</span> <span class="nf">selector</span><span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">));</span>
<span class="p">}</span>
<span class="k">delegate</span> <span class="n">TResult</span> <span class="n">Selector2</span><span class="p"><</span><span class="n">TArg</span><span class="p">,</span> <span class="n">TResult</span><span class="p">>(</span><span class="n">TArg</span> <span class="n">arg</span><span class="p">);</span>
<span class="k">static</span> <span class="n">T</span> <span class="n">Select</span><span class="p"><</span><span class="n">T</span><span class="p">>(</span><span class="n">T</span> <span class="n">x</span><span class="p">,</span> <span class="n">T</span> <span class="n">y</span><span class="p">,</span> <span class="n">Selector2</span><span class="p"><(</span><span class="n">T</span> <span class="n">Alice</span><span class="p">,</span> <span class="n">T</span> <span class="n">Todd</span><span class="p">),</span> <span class="n">T</span><span class="p">></span> <span class="n">selector</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"second overload"</span><span class="p">);</span>
<span class="k">return</span> <span class="nf">selector</span><span class="p">((</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>
<p><strong>Diagnostics on element name mismatches</strong></p>
<p>Considering how easily element names can be cast aside, the language designers had concerns that the compiler would be less than helpful against certain kinds of mistakes. Some name mismatches could be indicative of confusion or a typo.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// Warning!!</span>
<span class="c1">//</span>
<span class="c1">// "Boook" is ignored. Likely a typo.</span>
<span class="nf">M1</span><span class="p">((</span><span class="n">Boook</span><span class="p">:</span> <span class="m">1</span><span class="p">,</span> <span class="n">Chapter</span><span class="p">:</span> <span class="m">2</span><span class="p">));</span>
<span class="c1">// Warnings!!</span>
<span class="c1">//</span>
<span class="c1">// "First" and "Last" are mismatched causing both to be dropped.</span>
<span class="c1">// That is highly suspicious</span>
<span class="kt">var</span> <span class="n">r</span> <span class="p">=</span> <span class="n">DateTime</span><span class="p">.</span><span class="n">Now</span><span class="p">.</span><span class="n">DayOfWeek</span> <span class="p">==</span> <span class="n">DayOfWeek</span><span class="p">.</span><span class="n">Friday</span> <span class="p">?</span>
<span class="p">(</span><span class="n">ID</span><span class="p">:</span> <span class="m">1</span><span class="p">,</span> <span class="n">First</span><span class="p">:</span> <span class="s">"F"</span><span class="p">,</span> <span class="n">Last</span><span class="p">:</span> <span class="s">"L"</span><span class="p">)</span> <span class="p">:</span>
<span class="p">(</span><span class="n">ID</span><span class="p">:</span> <span class="m">2</span><span class="p">,</span> <span class="n">Last</span><span class="p">:</span> <span class="s">"L"</span><span class="p">,</span> <span class="n">First</span><span class="p">:</span> <span class="s">"F"</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">M1</span><span class="p">((</span><span class="kt">int</span> <span class="n">Book</span><span class="p">,</span> <span class="kt">int</span> <span class="n">Chapter</span><span class="p">)</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// . . .</span>
<span class="p">}</span>
</code></pre></div></div>
<p>While the language is pretty clear on the semantics of the above samples, the code is likely to be unintentional.</p>
<p>Determining which scenarios should result in warnings is not an easy task. A scenario must be far more likely to result from a mistake than from intent. In addition, there should be reasonable and obvious ways to fix the violations. In the initial release the warnings are produced under the following conditions:</p>
<ul>
<li>It is an identity conversion from a tuple literal.</li>
<li>Some names specified in the literal are dropped as a result of conversion.</li>
</ul>
<p>The mistake in such scenarios is fairly clear: the name is explicitly specified and immediately ignored due to a mismatch, which is at the very least redundant. The simplest fix is to correct the name to match the destination or to remove it entirely.</p>
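<p>To make the two conditions concrete, here is a small sketch that contrasts a case expected to warn with two that are not. The class <code class="highlighter-rouge">MismatchDemo</code> and its methods are hypothetical, and the comments describe the behavior implied by the conditions listed above rather than verified compiler output.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

static class MismatchDemo
{
    public static int FromLiteral()
    {
        // Expected to warn: "Boook" is written in the literal
        // and immediately dropped by the conversion to the target type.
        (int Book, int Chapter) target = (Boook: 1, Chapter: 2);
        return target.Book;
    }

    public static int FromVariable()
    {
        var source = (Boook: 1, Chapter: 2);
        // Not a conversion from a tuple literal, so no warning is expected here.
        (int Book, int Chapter) target = source;
        return target.Book;
    }

    public static int Fixed()
    {
        // The name now matches the destination, so nothing is dropped.
        (int Book, int Chapter) target = (Book: 1, Chapter: 2);
        return target.Book;
    }

    static void Main()
    {
        Console.WriteLine(FromLiteral());
        Console.WriteLine(FromVariable());
        Console.WriteLine(Fixed());
    }
}
</code></pre></div></div>
<p>All three methods return the same value; the warning only points at a name that was typed and then had no effect.</p>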
<p>There are plans to improve the name mismatch analysis. Some of those plans are captured in this <a href="https://github.com/dotnet/roslyn/issues/14217">work item</a>. More data and statistics on the real-world use of tuples would help to improve the analysis as well.</p>
<p><strong>Element names must match when overriding or implementing.</strong></p>
<p>Some language designers felt particularly strongly about overriding and implementing scenarios. There was some discussion of whether changing element names upon overriding/implementing is a bad enough pattern that it must be a compile error, or whether a warning would do.<br />
What tipped the scales towards making this an error is that if the error is found to be too strict, it can be relaxed later without being a compatibility issue. A change in the opposite direction would be breaking.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">Animal</span> <span class="n">a</span> <span class="p">=</span> <span class="k">new</span> <span class="nf">Dog</span><span class="p">();</span>
<span class="n">a</span><span class="p">.</span><span class="nf">M1</span><span class="p">().</span> <span class="p">???</span> <span class="c1">// AnimalName or DogName ?</span>
<span class="p">}</span>
<span class="k">abstract</span> <span class="k">class</span> <span class="nc">Animal</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">abstract</span> <span class="p">(</span><span class="kt">int</span> <span class="n">ID</span><span class="p">,</span> <span class="kt">string</span> <span class="n">AnimalName</span><span class="p">)</span> <span class="nf">M1</span><span class="p">();</span>
<span class="p">}</span>
<span class="c1">// Changing element names when overriding could be confusing to the caller.</span>
<span class="k">class</span> <span class="nc">Dog</span><span class="p">:</span> <span class="n">Animal</span>
<span class="p">{</span>
<span class="c1">// Error: cannot change tuple element names when overriding.</span>
<span class="k">public</span> <span class="k">override</span> <span class="p">(</span><span class="kt">int</span> <span class="n">ID</span><span class="p">,</span> <span class="kt">string</span> <span class="n">DogName</span><span class="p">)</span> <span class="nf">M1</span><span class="p">()</span>
<span class="p">{</span>
<span class="k">return</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="s">"Spot"</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
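<p>For contrast, a sketch of the accepted form, where the override repeats the element names of the base member. The helper class <code class="highlighter-rouge">OverrideDemo</code> is made up for illustration; <code class="highlighter-rouge">Animal</code> and <code class="highlighter-rouge">Dog</code> mirror the sample above.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

abstract class Animal
{
    public abstract (int ID, string AnimalName) M1();
}

class Dog : Animal
{
    // Same element names as the overridden member, so this is accepted.
    public override (int ID, string AnimalName) M1()
    {
        return (1, "Spot");
    }
}

static class OverrideDemo
{
    // The caller sees one set of names regardless of the runtime type.
    public static string NameViaBase()
    {
        Animal a = new Dog();
        return a.M1().AnimalName;
    }

    static void Main()
    {
        Console.WriteLine(NameViaBase());
    }
}
</code></pre></div></div>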
<p>Note that these restrictions get validated and reported <em>after</em> the semantic analysis. The element names are ignored while determining overriding/implementing relationships, but when it is done, it is enforced that element names match.</p>Vladimir Sadovhttp://mustoverride.comC# tuples can have optional element names. Here are some interesting details about tuple element names and how they are treated by the language.C# Tuples. How tuples are related to ValueTuple.2017-01-16T00:00:00+00:002017-01-16T00:00:00+00:00http://mustoverride.com/tuples_valuetuple<p>As a matter of implementation details, C# tuples are implemented on top of ValueTuple types. Here are some details about their relationship.</p>
<h2 id="what-is-actually-emitted-when-tuples-are-used-in-the-code">What is actually emitted when tuples are used in the code.</h2>
<p>The underlying implementation of C# tuples is fairly simple. Tuples of cardinality 2 through 7 are mapped directly to the <a href="https://github.com/dotnet/corefx/blob/15d00331e54e6a2d051c9a939fe1deb72b200e26/src/System.ValueTuple/src/System/ValueTuple/ValueTuple.cs"><code class="highlighter-rouge">ValueTuple</code></a> type of the corresponding generic arity. For example, <code class="highlighter-rouge">(int, int)</code> is represented by <code class="highlighter-rouge">ValueTuple<int, int></code>.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">)</span> <span class="n">n</span> <span class="p">=</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">1</span><span class="p">);</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">n</span><span class="p">.</span><span class="n">Item1</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>is emitted as</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">></span> <span class="n">n</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>(</span><span class="m">1</span><span class="p">,</span> <span class="m">1</span><span class="p">);</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">n</span><span class="p">.</span><span class="n">Item1</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>At 8+ elements things get more interesting. Since arities of <code class="highlighter-rouge">ValueTuple</code> types go only up to 8, the compiler resorts to nesting. The first 7 elements are stored as-is and the remaining elements are stored as a tuple in the <code class="highlighter-rouge">Rest</code> field of <code class="highlighter-rouge">ValueTuple`8</code>.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">var</span> <span class="n">n1</span> <span class="p">=</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">3</span><span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">6</span><span class="p">,</span><span class="m">7</span><span class="p">,</span><span class="m">8</span><span class="p">);</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">n1</span><span class="p">.</span><span class="n">Item8</span><span class="p">);</span>
<span class="kt">var</span> <span class="n">n2</span> <span class="p">=</span> <span class="p">(</span><span class="m">1</span><span class="p">,</span><span class="m">2</span><span class="p">,</span><span class="m">3</span><span class="p">,</span><span class="m">4</span><span class="p">,</span><span class="m">5</span><span class="p">,</span><span class="m">6</span><span class="p">,</span><span class="m">7</span><span class="p">,</span><span class="m">8</span><span class="p">,</span><span class="m">9</span><span class="p">);</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">n2</span><span class="p">.</span><span class="n">Item9</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>is emitted as</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">var</span> <span class="n">n1</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">>>(</span><span class="m">1</span><span class="p">,</span> <span class="m">2</span><span class="p">,</span> <span class="m">3</span><span class="p">,</span> <span class="m">4</span><span class="p">,</span> <span class="m">5</span><span class="p">,</span> <span class="m">6</span><span class="p">,</span> <span class="m">7</span><span class="p">,</span> <span class="k">new</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">>(</span><span class="m">8</span><span class="p">));</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">n1</span><span class="p">.</span><span class="n">Rest</span><span class="p">.</span><span class="n">Item1</span><span class="p">);</span>
<span class="kt">var</span> <span class="n">n2</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>>(</span><span class="m">1</span><span class="p">,</span> <span class="m">2</span><span class="p">,</span> <span class="m">3</span><span class="p">,</span> <span class="m">4</span><span class="p">,</span> <span class="m">5</span><span class="p">,</span> <span class="m">6</span><span class="p">,</span> <span class="m">7</span><span class="p">,</span> <span class="k">new</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>(</span><span class="m">8</span><span class="p">,</span> <span class="m">9</span><span class="p">));</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">n2</span><span class="p">.</span><span class="n">Rest</span><span class="p">.</span><span class="n">Item2</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The encoding scheme used here is recursive. In a 15-element case, <code class="highlighter-rouge">Item15</code> will be mapped to <code class="highlighter-rouge">outer.Rest.Rest.Item1</code>. That is, every level of nesting stores 7 elements plus the remaining tail.</p>
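<p>A minimal sketch of the 15-element case, assuming a C# 7 compiler with <code class="highlighter-rouge">System.ValueTuple</code> available. The class name <code class="highlighter-rouge">Program</code> and the variable <code class="highlighter-rouge">t</code> are only illustrative.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

public class Program
{
    public static void Main()
    {
        // 15 elements = 7 + 7 + 1, so two levels of nesting are needed
        var t = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);

        // Item8 lives one level down, in the first slot of the tail
        Console.WriteLine(t.Item8 == t.Rest.Item1);       // prints True

        // Item15 lives two levels down
        Console.WriteLine(t.Item15 == t.Rest.Rest.Item1); // prints True
    }
}
</code></pre></div></div>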
<p>Importantly, the tail is always wrapped in a ValueTuple, even if it is just 1 element. The idea is that if the 8th element is itself a tuple, as in <code class="highlighter-rouge">(int,int,int,int,int,int,int,(int,int))</code>, the element would be wrapped and thus it could not be confused with a flat tuple that has N more elements, as in <code class="highlighter-rouge">(int,int,int,int,int,int,int,int,int)</code>.<br />
This clever encoding scheme is not actually new. It is exactly the same approach as has been used by F# tuples for a long time.</p>
<p>Another interesting observation is that this encoding makes it necessary to have <code class="highlighter-rouge">ValueTuple<T></code>, even though 1-element tuples by themselves are not expressible in the language.</p>
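<p>The effect of always wrapping the tail can be observed at run time. The sketch below, with illustrative names <code class="highlighter-rouge">nested</code> and <code class="highlighter-rouge">flat</code>, compares an 8-element tuple whose last element is a pair with a flat 9-element tuple.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

public class Program
{
    public static void Main()
    {
        // 8 elements, the last one is itself a 2-tuple
        var nested = (1, 2, 3, 4, 5, 6, 7, (8, 9));
        // 9 flat elements
        var flat = (1, 2, 3, 4, 5, 6, 7, 8, 9);

        // the tail of the flat tuple is a plain 2-tuple
        Console.WriteLine(flat.Rest.GetType() == typeof(ValueTuple&lt;int, int&gt;)); // prints True

        // the tail of the nested tuple is a 1-tuple wrapping the 2-tuple
        Console.WriteLine(nested.Rest.GetType() == typeof(ValueTuple&lt;ValueTuple&lt;int, int&gt;&gt;)); // prints True

        // so the two shapes never share an underlying type
        Console.WriteLine(nested.GetType() == flat.GetType()); // prints False
    }
}
</code></pre></div></div>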
<h2 id="what-happens-if-valuetuple-is-used-in-c7-sources-directly">What happens if ValueTuple is used in C#7 sources directly?</h2>
<p>The backward compatibility requirements dictate that <code class="highlighter-rouge">ValueTuple</code> structs are allowed in C#7 code, and code that worked in C#6 should continue working in C#7.<br />
In addition to that, considering that tuples are emitted as <code class="highlighter-rouge">ValueTuple</code>, the underlying functionality will unavoidably leak through boxing, interop, dynamic, reflection and other scenarios, so why not just make tuple types be “compatible” with the functionality of the underlying types - including fields, properties, methods, implemented interfaces?</p>
<p>There are two ways in which this kind of “compatible” could be formalized in C#:</p>
<ul>
<li>
<p>exactly the same type.<br />
Basically it means that the same type has two syntaxes and wherever syntactically possible, one type reference can be replaced with another with no changes to the meaning of the program. <br />
Example: <code class="highlighter-rouge">System.Nullable<System.Int32></code> and <code class="highlighter-rouge">int?</code> both refer to exactly the same type.<br />
Anything that can be done with <code class="highlighter-rouge">System.Nullable<System.Int32></code> can be done with <code class="highlighter-rouge">int?</code>.</p>
</li>
<li>
<p>identity convertible.<br />
Here the language would track distinct types with different static capabilities, but the runtime representation is indistinguishable, so a variable of one type can be reinterpreted as a variable of another type.<br />
Example: <code class="highlighter-rouge">List<dynamic></code> and <code class="highlighter-rouge">List<object></code><br />
<code class="highlighter-rouge">myList[0].Blah()</code> would work with the first, but would not compile with the second. However, a variable of one type can be aliased as a variable of the other.</p>
</li>
</ul>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// 'lo' is a list of objects</span>
<span class="c1">// this is the only "real" variable we have here</span>
<span class="kt">var</span> <span class="n">lo</span> <span class="p">=</span> <span class="k">new</span> <span class="n">List</span><span class="p"><</span><span class="kt">object</span><span class="p">>()</span> <span class="p">{</span><span class="m">1</span><span class="p">,</span> <span class="m">2</span><span class="p">};</span>
<span class="c1">// 'ld' is an alias to 'lo', typed as a list of dynamic</span>
<span class="c1">// can do this since these types are identity convertible</span>
<span class="c1">//</span>
<span class="c1">// could also pass 'lo' as a 'ref List<dynamic>' parameter</span>
<span class="c1">// but ref locals make example more compact</span>
<span class="k">ref</span> <span class="n">List</span><span class="p"><</span><span class="kt">dynamic</span><span class="p">></span> <span class="n">ld</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">lo</span><span class="p">;</span>
<span class="c1">// this compiles with ld</span>
<span class="c1">// GetTypeCode _can_ be called on dynamic</span>
<span class="n">ld</span><span class="p">[</span><span class="m">0</span><span class="p">].</span><span class="nf">GetTypeCode</span><span class="p">();</span>
<span class="c1">// this would not compile</span>
<span class="c1">// GetTypeCode _cannot_ be called on object</span>
<span class="c1">// lo[0].GetTypeCode(); // error if uncommented</span>
<span class="p">}</span>
</code></pre></div></div>
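<p>For comparison, the “exactly the same type” option can be sketched with the <code class="highlighter-rouge">Nullable</code> example from the list above. The names below are illustrative only.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

public class Program
{
    public static void Main()
    {
        // two syntaxes, one type
        Console.WriteLine(typeof(int?) == typeof(Nullable&lt;int&gt;)); // prints True

        // either spelling can be used for the same variable
        Nullable&lt;int&gt; a = 5;
        int? b = a;
        Console.WriteLine(b.HasValue); // prints True
    }
}
</code></pre></div></div>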
<p>So, what happens with tuples?</p>
<p>You cannot do <code class="highlighter-rouge">t.Alice</code> when <code class="highlighter-rouge">t</code> is typed as <code class="highlighter-rouge">ValueTuple<int, int></code>, but you can when it is typed as <code class="highlighter-rouge">(int Alice, int Bob)</code>. Tuple types with element names are clearly separate types. The matter of tuples with element names is worth a separate post, but in short - yes, tuples with element names are <strong>identity convertible</strong> to corresponding <code class="highlighter-rouge">ValueTuple<></code> types.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// the only "real" variable here</span>
<span class="kt">var</span> <span class="n">ii</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>(</span><span class="m">1</span><span class="p">,</span><span class="m">2</span><span class="p">);</span>
<span class="c1">// make an alias typed as '(int Alice, int Bob)'</span>
<span class="c1">// can do this since these types are identity convertible</span>
<span class="k">ref</span> <span class="p">(</span><span class="kt">int</span> <span class="n">Alice</span><span class="p">,</span> <span class="kt">int</span> <span class="n">Bob</span><span class="p">)</span> <span class="n">ab</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">ii</span><span class="p">;</span>
<span class="c1">// '(int Alice, int Bob)' has an element 'Bob'</span>
<span class="c1">// it is the same variable as 'ii.Item2'</span>
<span class="n">ab</span><span class="p">.</span><span class="n">Bob</span> <span class="p">=</span> <span class="m">42</span><span class="p">;</span>
<span class="c1">// prints 42</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">ii</span><span class="p">.</span><span class="n">Item2</span><span class="p">);</span>
<span class="c1">// prints 42</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">ab</span><span class="p">.</span><span class="n">Item2</span><span class="p">);</span>
<span class="c1">// prints 42</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">ab</span><span class="p">.</span><span class="n">Bob</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>On the other hand, the semantic differences between <code class="highlighter-rouge">(int, int)</code> and <code class="highlighter-rouge">ValueTuple<int, int></code> would be so subtle that it was decided to just make them the <strong>same type</strong>. It does mean that <code class="highlighter-rouge">ValueTuple<int, int></code> is treated a bit specially by the language. In addition to all the properties common to similar generic types, <code class="highlighter-rouge">ValueTuple<int, int></code> has all the additional functionality of <code class="highlighter-rouge">(int, int)</code>.</p>
<p>This special treatment is hard to notice (and that is the point).<br />
The easiest way to observe it is through the presence of <code class="highlighter-rouge">ItemN</code> elements beyond the first 7:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// this type matches the pattern of 8-ple (int, int, int, int, int, int, int, int)</span>
<span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">>></span> <span class="n">vt</span> <span class="p">=</span>
<span class="k">new</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">>></span>
<span class="p">(</span><span class="m">1</span><span class="p">,</span> <span class="m">2</span><span class="p">,</span> <span class="m">3</span><span class="p">,</span> <span class="m">4</span><span class="p">,</span> <span class="m">5</span><span class="p">,</span> <span class="m">6</span><span class="p">,</span> <span class="m">7</span><span class="p">,</span> <span class="k">new</span> <span class="n">ValueTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">>(</span><span class="m">8</span><span class="p">));</span>
<span class="c1">// surely it has 'Item8' element</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">vt</span><span class="p">.</span><span class="n">Item8</span><span class="p">);</span>
<span class="c1">// that is actually emitted as</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">vt</span><span class="p">.</span><span class="n">Rest</span><span class="p">.</span><span class="n">Item1</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>From the implementation perspective, when the compiler sees a <code class="highlighter-rouge">ValueTuple<></code> type whose shape matches the underlying layout of a tuple type, it “upgrades” the type reference to mean the actual tuple type. The transformation applies equally to type references in source and in metadata. As a result, conforming <code class="highlighter-rouge">ValueTuple<></code> types behave exactly as the corresponding tuple types.</p>
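<p>Both relationships can be checked in a few lines. This is only a sketch with illustrative variable names, assuming a C# 7 compiler.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

public class Program
{
    public static void Main()
    {
        // same type: no conversion is involved in this assignment
        (int, int) plain = new ValueTuple&lt;int, int&gt;(1, 2);
        Console.WriteLine(typeof((int, int)) == typeof(ValueTuple&lt;int, int&gt;)); // prints True

        // element names exist only at compile time,
        // the runtime type is still the underlying ValueTuple
        var named = (Alice: 1, Bob: 2);
        Console.WriteLine(named.GetType() == typeof(ValueTuple&lt;int, int&gt;)); // prints True

        // identity conversion from the named tuple type
        ValueTuple&lt;int, int&gt; raw = named;
        Console.WriteLine(raw.Item2 + plain.Item1); // prints 3
    }
}
</code></pre></div></div>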
<p>Overall there are just two distinct groups of tuple types - with element names and without. The tuple type relationship looks like this:
<img src="http://mustoverride.com/images/Conv.jpg" alt="Tuple Type relationship" /></p>
C# Tuples. Why mutable structs? 2017-01-07T00:00:00+00:00 http://mustoverride.com/tuples_structs
<p>C# defines tuples as mutable value types. Considering the general guidance against mutable structs, it may look like a peculiar design choice.</p>
<p>Indeed, why? And in particular, why did the existing family of <code class="highlighter-rouge">System.Tuple<></code> classes not fit?</p>
<h2 id="why-value-types">Why value types.</h2>
<p>As with many other features, the design of C# tuples started with investigating existing pain points that the new feature was supposed to fix. Special attention was paid to the tuple-like types in existing code bases. By tuple-like, here I mean an abstract datatype used just to bundle up several values without giving them any particular meaning.</p>
<p>Turns out Roslyn itself had three!! unrelated custom tuple-like types. There were <code class="highlighter-rouge">ValueTuple<T1, T2></code>, <code class="highlighter-rouge">ValueTuple<T1, T2, T3></code> and so on. There was <code class="highlighter-rouge">Pair</code>. And at some point, I believe, there was <code class="highlighter-rouge">StructTuple</code>, which was later merged with <code class="highlighter-rouge">ValueTuple</code>. In addition to that, there was some use of <code class="highlighter-rouge">KeyValuePair</code> for purposes that have nothing to do with either keys or values - just to combine two unrelated pieces of data and pass them around. Other projects, including .Net FX itself, had similar finds (for example, <a href="https://github.com/dotnet/corefx/blob/fb0a1d90d874eba4d1b00227d31009496f67002d/src/System.Linq.Parallel/src/System/Linq/Parallel/Utils/Pair.cs"><code class="highlighter-rouge">Pair</code></a>).</p>
<p>The common theme for all these was trafficking multiple pieces of data as a single unit and in particular returning multiple values from methods. One existing solution for returning multiple values is through <code class="highlighter-rouge">out</code> parameters, but it is often inconvenient and does not work for <code class="highlighter-rouge">async</code> methods at all, so an aggregate type is needed.<br />
On the other hand programmers are not compelled to create specialized types just to move 2+ items around, so they create generalized helper types like <code class="highlighter-rouge">Pair</code> that otherwise have no special meaning - just to hold two things. These are scenarios where tuples would step in.</p>
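<p>The <code class="highlighter-rouge">async</code> case is easy to illustrate. In the sketch below the method <code class="highlighter-rouge">DivideAsync</code> and its element names are hypothetical. Since <code class="highlighter-rouge">out</code> parameters are not allowed on <code class="highlighter-rouge">async</code> methods, both results travel in a single tuple.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;
using System.Threading.Tasks;

public class Program
{
    // an async method cannot declare 'out' parameters,
    // so both results are returned as one tuple
    static async Task&lt;(int Quotient, int Remainder)&gt; DivideAsync(int a, int b)
    {
        await Task.Yield();
        return (a / b, a % b);
    }

    public static void Main()
    {
        // blocking wait, only to keep the example short
        var result = DivideAsync(17, 5).GetAwaiter().GetResult();
        Console.WriteLine(result.Quotient);  // prints 3
        Console.WriteLine(result.Remainder); // prints 2
    }
}
</code></pre></div></div>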
<p>It was also observed that these types are typically structs. Considering that a tuple is just a combination of values with no identity on its own, the value semantics of structs appeared to be convenient.<br />
Ideally it would be possible to just push more than one value to the stack upon returning from a method, but since that is not possible, pushing a single struct containing those values is the next best thing. Using classes here would only add the costs of allocation and indirection.</p>
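<p>A small sketch of the difference, comparing the older <code class="highlighter-rouge">System.Tuple</code> class with a C# tuple. Variable names are illustrative.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

public class Program
{
    public static void Main()
    {
        // System.Tuple is a class, a C# tuple is a struct
        Console.WriteLine(typeof(Tuple&lt;int, int&gt;).IsValueType); // prints False
        Console.WriteLine(typeof((int, int)).IsValueType);      // prints True

        // value semantics: assignment copies the elements
        var a = (1, 2);
        var b = a;
        b.Item1 = 42;
        Console.WriteLine(a.Item1); // prints 1
    }
}
</code></pre></div></div>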
<p>In the end it appeared that structs are a sufficient and also cheap solution for combining multiple pieces of data into a single unit.<br />
The only consideration in favor of classes was the cost of copying when tuples are large. It turns out that large tuples, while possible, are exceedingly rare.</p>
<h2 id="why-mutable">Why Mutable</h2>
<p>Since tuples are structs, there is not much point in making them immutable. The readonliness of a struct is a property of the whole variable. If a tuple variable is assignable, it can be changed to contain any value regardless of the readonliness of individual fields.</p>
<p>Example:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">ImmutableTuple</span><span class="p"><</span><span class="n">T1</span><span class="p">,</span> <span class="n">T2</span><span class="p">></span>
<span class="p">{</span>
<span class="c1">// immutable, right?</span>
<span class="k">public</span> <span class="n">T1</span> <span class="n">Item1</span><span class="p">{</span><span class="k">get</span><span class="p">;}</span>
<span class="k">public</span> <span class="n">T2</span> <span class="n">Item2</span><span class="p">{</span><span class="k">get</span><span class="p">;}</span>
<span class="k">public</span> <span class="nf">ImmutableTuple</span><span class="p">(</span><span class="n">T1</span> <span class="n">item1</span><span class="p">,</span> <span class="n">T2</span> <span class="n">item2</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">Item1</span> <span class="p">=</span> <span class="n">item1</span><span class="p">;</span>
<span class="n">Item2</span> <span class="p">=</span> <span class="n">item2</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">public</span> <span class="k">override</span> <span class="kt">string</span> <span class="nf">ToString</span><span class="p">()</span> <span class="p">=></span> <span class="s">$"(</span><span class="p">{</span><span class="n">Item1</span><span class="p">}</span><span class="s">, </span><span class="p">{</span><span class="n">Item2</span><span class="p">}</span><span class="s">)"</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Test</span><span class="p">(</span><span class="n">ImmutableTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">></span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// arbitrarily change arg to (42, 42)</span>
<span class="n">arg</span> <span class="p">=</span> <span class="k">new</span> <span class="n">ImmutableTuple</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">>(</span><span class="m">42</span><span class="p">,</span> <span class="m">42</span><span class="p">);</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">arg</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Interestingly, in the example above the compiler does not even try to create a new instance and assign it. It directly initializes <code class="highlighter-rouge">arg</code> with new values since it knows there is no difference:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">void</span> <span class="nf">Test</span> <span class="p">(</span>
<span class="n">valuetype</span> <span class="n">Program</span><span class="p">/</span><span class="n">ImmutableTuple</span><span class="err">`</span><span class="m">2</span><span class="p"><</span><span class="n">int32</span><span class="p">,</span> <span class="n">int32</span><span class="p">></span> <span class="n">arg</span>
<span class="p">)</span> <span class="n">cil</span> <span class="n">managed</span>
<span class="p">{</span>
<span class="c1">// Method begins at RVA 0x206b</span>
<span class="c1">// Code size 23 (0x17)</span>
<span class="p">.</span><span class="n">maxstack</span> <span class="m">8</span>
<span class="c1">// load the address of "arg"</span>
<span class="n">IL_0000</span><span class="p">:</span> <span class="n">ldarga</span><span class="p">.</span><span class="n">s</span> <span class="n">arg</span>
<span class="c1">// load values</span>
<span class="n">IL_0002</span><span class="p">:</span> <span class="n">ldc</span><span class="p">.</span><span class="n">i4</span><span class="p">.</span><span class="n">s</span> <span class="m">42</span>
<span class="n">IL_0004</span><span class="p">:</span> <span class="n">ldc</span><span class="p">.</span><span class="n">i4</span><span class="p">.</span><span class="n">s</span> <span class="m">42</span>
<span class="c1">// call the constructor "in-place" on the arg</span>
<span class="n">IL_0006</span><span class="p">:</span> <span class="n">call</span> <span class="n">instance</span> <span class="k">void</span> <span class="n">valuetype</span> <span class="n">Program</span><span class="p">/</span><span class="n">ImmutableTuple</span><span class="err">`</span><span class="m">2</span><span class="p"><</span><span class="n">int32</span><span class="p">,</span> <span class="n">int32</span><span class="p">>::.</span><span class="nf">ctor</span><span class="p">(!</span><span class="m">0</span><span class="p">,</span> <span class="p">!</span><span class="m">1</span><span class="p">)</span>
<span class="n">IL_000b</span><span class="p">:</span> <span class="n">ldarg</span><span class="p">.</span><span class="m">0</span>
<span class="n">IL_000c</span><span class="p">:</span> <span class="n">box</span> <span class="n">valuetype</span> <span class="n">Program</span><span class="p">/</span><span class="n">ImmutableTuple</span><span class="err">`</span><span class="m">2</span><span class="p"><</span><span class="n">int32</span><span class="p">,</span> <span class="n">int32</span><span class="p">></span>
<span class="n">IL_0011</span><span class="p">:</span> <span class="n">call</span> <span class="k">void</span> <span class="p">[</span><span class="n">mscorlib</span><span class="p">]</span><span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">::</span><span class="nf">WriteLine</span><span class="p">(</span><span class="kt">object</span><span class="p">)</span>
<span class="n">IL_0016</span><span class="p">:</span> <span class="n">ret</span>
<span class="p">}</span> <span class="c1">// end of method Program::Test</span>
</code></pre></div></div>
<p>Surely, the cheapest instance is the one that was not created at all.</p>
<p>Anyhow, immutability of tuples would not prevent elements from being changeable. It would only serve to annoy users when they want to change an element and have to do it in a roundabout way instead of directly assigning.</p>
<p>Also, once tuples are mutable, there is no point in using properties - that would just prevent passing individual elements by reference, resulting in unnecessarily diminished functionality compared to using two variables directly.
Indeed - if you had two mutable variables, you could pass either by reference, so why prevent that once you have them in a mutable tuple?</p>
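<p>To make the two points above concrete, here is a sketch that puts a hypothetical <code class="highlighter-rouge">ReadOnlyPair</code> struct with get-only properties next to a regular tuple. The struct is made up for illustration and is not a real library type.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

public class Program
{
    // hypothetical "immutable" pair that exposes properties instead of fields
    struct ReadOnlyPair
    {
        public int First { get; }
        public int Second { get; }

        public ReadOnlyPair(int first, int second)
        {
            First = first;
            Second = second;
        }
    }

    static void Inc(ref int x)
    {
        x++;
    }

    public static void Main()
    {
        var p = new ReadOnlyPair(1, 2);
        // roundabout way: rebuild the whole value to change one element
        p = new ReadOnlyPair(p.First, p.Second + 1);
        // Inc(ref p.Second); // would not compile, a property cannot be passed by reference

        var t = (First: 1, Second: 2);
        // tuple elements are fields, so they can be passed by reference
        Inc(ref t.Second);

        Console.WriteLine(p.Second); // prints 3
        Console.WriteLine(t.Second); // prints 3
    }
}
</code></pre></div></div>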
<p>The end result - <strong><em>C# tuples are extremely lightweight constructs - they are structs and their elements are mutable and directly addressable.</em></strong></p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="nn">System</span><span class="p">;</span>
<span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// tuple1 is a struct</span>
<span class="kt">var</span> <span class="n">tuple1</span> <span class="p">=</span> <span class="p">(</span><span class="n">Alice</span><span class="p">:</span> <span class="m">1</span><span class="p">,</span> <span class="n">Bob</span><span class="p">:</span> <span class="m">2</span><span class="p">);</span>
<span class="c1">// tuple2 is a copy</span>
<span class="kt">var</span> <span class="n">tuple2</span> <span class="p">=</span> <span class="n">tuple1</span><span class="p">;</span>
<span class="c1">// elements can be assigned</span>
<span class="n">tuple1</span><span class="p">.</span><span class="n">Alice</span> <span class="p">=</span> <span class="m">42</span><span class="p">;</span>
<span class="c1">// elements can be passed by reference</span>
<span class="nf">Inc</span><span class="p">(</span><span class="k">ref</span> <span class="n">tuple1</span><span class="p">.</span><span class="n">Bob</span><span class="p">);</span>
<span class="c1">// tuple2 is indeed a copy. (prints "False")</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">tuple1</span><span class="p">.</span><span class="nf">Equals</span><span class="p">(</span><span class="n">tuple2</span><span class="p">));</span>
<span class="p">}</span>
<span class="k">public</span> <span class="k">static</span> <span class="k">void</span> <span class="nf">Inc</span><span class="p">(</span><span class="k">ref</span> <span class="kt">int</span> <span class="n">x</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">x</span><span class="p">++;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<h2 id="note-aside-so-is-it-all-that-useless-to-make-the-content-of-a-struct-readonly">Note aside: So, is it all that useless to make the content of a struct readonly?</h2>
<p>Not at all !!!</p>
<p>While the design of a struct, on its own, cannot prevent its value from being assigned as a whole, it can prevent piece-wise assignment. That is very important if a given struct has internal invariants guaranteed by construction.</p>
<p>Consider the following struct:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">using</span> <span class="nn">System</span><span class="p">;</span>
<span class="c1">// text span starting at "Start" and ending at "End"</span>
<span class="k">struct</span> <span class="nc">TextSpan</span>
<span class="p">{</span>
<span class="k">private</span> <span class="k">readonly</span> <span class="kt">int</span> <span class="n">start</span><span class="p">;</span>
<span class="k">private</span> <span class="k">readonly</span> <span class="kt">int</span> <span class="n">end</span><span class="p">;</span>
<span class="k">public</span> <span class="nf">TextSpan</span><span class="p">(</span><span class="kt">int</span> <span class="n">start</span><span class="p">,</span> <span class="kt">int</span> <span class="n">end</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">start</span> <span class="p"><</span> <span class="m">0</span><span class="p">)</span> <span class="k">throw</span> <span class="k">new</span> <span class="nf">ArgumentException</span><span class="p">();</span>
<span class="k">if</span> <span class="p">(</span><span class="n">end</span> <span class="p"><</span> <span class="m">0</span><span class="p">)</span> <span class="k">throw</span> <span class="k">new</span> <span class="nf">ArgumentException</span><span class="p">();</span>
<span class="k">if</span> <span class="p">(</span><span class="n">end</span> <span class="p"><</span> <span class="n">start</span><span class="p">)</span> <span class="k">throw</span> <span class="k">new</span> <span class="nf">ArgumentException</span><span class="p">();</span>
<span class="k">this</span><span class="p">.</span><span class="n">start</span> <span class="p">=</span> <span class="n">start</span><span class="p">;</span>
<span class="k">this</span><span class="p">.</span><span class="n">end</span> <span class="p">=</span> <span class="n">end</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// none of the following can be negative!!</span>
<span class="k">public</span> <span class="kt">int</span> <span class="n">Start</span> <span class="p">=></span> <span class="n">start</span><span class="p">;</span>
<span class="k">public</span> <span class="kt">int</span> <span class="n">End</span> <span class="p">=></span> <span class="n">end</span><span class="p">;</span>
<span class="k">public</span> <span class="kt">int</span> <span class="n">Length</span> <span class="p">=></span> <span class="n">end</span> <span class="p">-</span> <span class="n">start</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Note that <code class="highlighter-rouge">TextSpan</code> values have special meaning. They are guaranteed to have nonnegative <code class="highlighter-rouge">Start</code>, <code class="highlighter-rouge">End</code> and <code class="highlighter-rouge">Length</code>. Even if a struct is overwritten as a whole by another value, the new value still has the same guarantees.<br />
In a normal program (i.e. no racing assignments and no use of reflection or unsafe code) the invariants of <code class="highlighter-rouge">TextSpan</code> hold regardless of how it is used, thanks to it being immutable!</p>
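<p>The difference between whole-value and piece-wise assignment can be seen in a small sketch. It reuses a shortened copy of <code class="highlighter-rouge">TextSpan</code> with the constructor checks folded into one condition; the <code class="highlighter-rouge">Program</code> class and the values used are made up for illustration.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

struct TextSpan
{
    private readonly int start;
    private readonly int end;

    public TextSpan(int start, int end)
    {
        // same invariants as above, condensed: 0 is at or before start, start is at or before end
        if (start < 0 || end < start) throw new ArgumentException();
        this.start = start;
        this.end = end;
    }

    public int Start => start;
    public int Length => end - start;
}

class Program
{
    static void Main()
    {
        var span = new TextSpan(2, 5);

        // span.start = -1; // does not compile: the field is private and readonly

        // Whole-value assignment is allowed, but the new value
        // has passed the same constructor checks.
        span = new TextSpan(10, 12);
        Console.WriteLine(span.Length); // prints 2

        try
        {
            // The constructor throws before anything is assigned to "span".
            span = new TextSpan(5, 2);
        }
        catch (ArgumentException)
        {
            Console.WriteLine("rejected");
        }

        Console.WriteLine(span.Start); // prints 10, the old value is intact
    }
}
</code></pre></div></div>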
<p>What makes tuples different here is that tuples do not guarantee any invariants. They are just containers in which any values are allowed, so nothing extra would be achieved by making them immutable.</p>
<p>Vladimir Sadov, http://mustoverride.com</p>
<h1>Why ref locals allow only a single binding?</h1>
<p>2016-11-30, http://mustoverride.com/ref-locals_single-assignment</p>
<p>The current restriction on ref locals to be single-assignable is a straightforward and simple way to guard against several potential problems. There are ways to relax the restriction in the future, if that is found to be beneficial enough.</p>
<p>First of all - ref locals are a new kind of ref variables. I have touched some general details common to all ref variables in the earlier posts. <a href="/refs-not-ptrs/">They are not pointers</a>. They are implemented on top of <a href="/managed-refs-CLR/">managed references</a>. In many ways ref locals are similar to ref parameters. Both belong to the kind of variables that do not get their own storage and instead are bound to existing storage.</p>
<p>Ref locals are lexically scoped, just like other locals, but the lifetime of the storage that they are bound to may not match the scopes of the references. That is where things get “interesting”.</p>
<p>It could be observed that unrestricted <em>“any ref local can be bound or re-bound to any variable at any time”</em> may lead to the following problems:</p>
<p><strong>1. If you want to return a ref local, the compiler must be able to validate that all possible bindings at that point are “safe to return by ref”.</strong></p>
<p>Here is an example of a ref local that is not safe to return due to ref assignments and nontrivial control flow.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">ref</span> <span class="kt">int</span> <span class="nf">RotateRefs</span><span class="p">()</span>
<span class="p">{</span>
<span class="k">ref</span> <span class="kt">var</span> <span class="n">r0</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">arr</span><span class="p">[</span><span class="m">0</span><span class="p">];</span> <span class="c1">// safe to return</span>
<span class="k">ref</span> <span class="kt">var</span> <span class="n">r1</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">arr</span><span class="p">[</span><span class="m">1</span><span class="p">];</span> <span class="c1">// safe to return</span>
<span class="k">ref</span> <span class="kt">var</span> <span class="n">r2</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">arr</span><span class="p">[</span><span class="m">2</span><span class="p">];</span> <span class="c1">// safe to return</span>
<span class="k">ref</span> <span class="kt">var</span> <span class="n">r3</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">arr</span><span class="p">[</span><span class="m">3</span><span class="p">];</span> <span class="c1">// safe to return</span>
<span class="kt">var</span> <span class="n">local</span> <span class="p">=</span> <span class="m">42</span><span class="p">;</span>
<span class="k">ref</span> <span class="kt">var</span> <span class="n">r4</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">local</span><span class="p">;</span> <span class="c1">// NOT safe to return !!!</span>
<span class="k">while</span><span class="p">(</span><span class="nf">Condition</span><span class="p">())</span>
<span class="p">{</span>
<span class="c1">// shift-rotate the refs</span>
<span class="k">ref</span> <span class="kt">var</span> <span class="n">temp</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">r0</span><span class="p">;</span>
<span class="c1">// (hypothetical syntax for ref re-assignment)</span>
<span class="n">r0</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">r1</span><span class="p">;</span>
<span class="n">r1</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">r2</span><span class="p">;</span>
<span class="n">r2</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">r3</span><span class="p">;</span>
<span class="n">r3</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">r4</span><span class="p">;</span>
<span class="n">r4</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">temp</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// "r0" is not safe to return here!!!</span>
<span class="c1">// imagine an error message that tries to explain why.</span>
<span class="k">return</span> <span class="k">ref</span> <span class="n">r0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>In the most general case, enforcing the safe-to-return rule for ref locals would require an analysis similar to the definite assignment analysis. The compiler would need to traverse the control flow graph while propagating what is known about the variables and repeating the analysis until no more knowledge could be gained.</p>
<p>That seems rather complicated, but there is more.</p>
<p><strong>2. A ref local should not be allowed to be used outside of the lifetimes of its possible referents.</strong></p>
<p>Violating this would lead to variables that are bound to something that, from the point of view of the language, “does not exist”.</p>
<p>Consider the following example:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Program</span>
<span class="p">{</span>
<span class="k">static</span> <span class="k">readonly</span> <span class="kt">int</span><span class="p">[]</span> <span class="n">arr</span> <span class="p">=</span> <span class="p">{</span> <span class="m">0</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="m">0</span> <span class="p">};</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// bind r0, r1 to something initially</span>
<span class="k">ref</span> <span class="kt">var</span> <span class="n">r0</span> <span class="p">=</span> <span class="k">ref</span> <span class="p">(</span><span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="m">1</span><span class="p">])[</span><span class="m">0</span><span class="p">];</span>
<span class="k">ref</span> <span class="kt">var</span> <span class="n">r1</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">r0</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p"><</span> <span class="m">2</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
<span class="p">{</span>
<span class="c1">// NOTE: possibly initializing "variable" using a value</span>
<span class="c1">// of its own binding from 2 iterations behind.</span>
<span class="kt">var</span> <span class="n">variable</span> <span class="p">=</span> <span class="n">r0</span> <span class="p">+</span> <span class="m">1</span><span class="p">;</span>
<span class="c1">// keeping the previous binding of "r0" in "r1"</span>
<span class="n">r1</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">r0</span><span class="p">;</span>
<span class="c1">// binding "r0" to "variable". (hypothetical syntax for ref re-assignment)</span>
<span class="c1">// NOTE: "variable" is about to go out of scope,</span>
<span class="c1">// but "r0" and "r1" would still be around, bound to what???</span>
<span class="n">r0</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">variable</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// NOTE: both "r0" and "r1" are bound to different</span>
<span class="c1">// bindings of "variable", which do not exist at this point.</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">$"Different bindings of 'variable' are equal?: </span><span class="p">{</span><span class="n">r0</span> <span class="p">==</span> <span class="n">r1</span><span class="p">}</span><span class="s">"</span><span class="p">);</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">$"r0 : </span><span class="p">{</span><span class="n">r0</span><span class="p">}</span><span class="s">"</span><span class="p">);</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">$"r1 : </span><span class="p">{</span><span class="n">r1</span><span class="p">}</span><span class="s">"</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>As with locals being returned by reference, exposing a local from an inner scope to the code in the outer scopes by means of a reference is a problem. It could be handled in one of three ways:</p>
<p>1) allowed with undefined behavior.<br />
2) allowed with exposed locals captured into closures.<br />
3) disallowed.</p>
<p>For the same reasons as <a href="/ref-returns-and-locals/">with ref returns</a>, just disallowing seems more appropriate for C#.</p>
<p>Sadly, <em>“locals should not be exposed to the outer scopes”</em> is a simple principle that is not so simple to enforce. Precise enforcement would require transitive tracking of all possible bindings of ref locals and validating, at every use point of the references, that scoping rules are not violated.</p>
<p>For example the following code could be allowed:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">ref</span> <span class="kt">int</span> <span class="n">outer</span><span class="p">;</span>
<span class="k">try</span>
<span class="p">{</span>
<span class="k">for</span><span class="p">(;;)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">inner</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
<span class="c1">// binding outer to the inner!!!</span>
<span class="n">outer</span> <span class="p">=</span> <span class="k">ref</span> <span class="n">inner</span><span class="p">;</span>
<span class="c1">// assigning inner through outer.</span>
<span class="c1">// that is ok since we could do it directly here too.</span>
<span class="n">outer</span> <span class="p">=</span> <span class="m">42</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">catch</span>
<span class="p">{</span>
<span class="n">outer</span> <span class="p">=</span> <span class="k">ref</span> <span class="p">(</span><span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="m">1</span><span class="p">])[</span><span class="m">0</span><span class="p">];</span>
<span class="p">}</span>
<span class="c1">// this is ok also, as long as it is proven that outer</span>
<span class="c1">// is not referencing something that is not in scope</span>
<span class="n">outer</span> <span class="p">=</span> <span class="m">333</span><span class="p">;</span>
</code></pre></div></div>
<p>The analysis that is required to validate the example above would be similar to the safe-to-return validation for ref locals, but the state that is incrementally updated and propagated through control flow paths would be more complex. It would need to reflect the entire collection of all possible scopes from which ref locals could reference variables. The state would be affected by ref assignments and, in the case of method calls, would have to be propagated from ref parameters to ref returns.</p>
<p>Note that this analysis would also be sufficient to prove if/when ref locals are safe-to-return. If, at a particular point, the set of possible scopes for a ref local contains only the “out-of-method” scope, then ref-returning is safe.</p>
<p>Also note that the analysis is very complex, can be computationally expensive and would often yield diagnostics that are hard to act upon:<br />
<em>“ERROR: can’t use a ref local here because it is possibly referencing a variable that is out of scope”</em>. Scratch your head and try to figure out where and how the inconvenient binding was picked up.</p>
<p>Language designers are naturally concerned when a feature requires such analysis. In such situations it is desirable to find a way to constrain the feature in order to reduce the number of scenarios supported, preferably at the cost of uncommon cases. After all, maximizing the number of programs that compile is not a goal in itself.</p>
<p>In the case of ref locals, it was found that requiring ref locals to be initialized at declaration and not allowing re-assignment solves the problems described above very nicely:</p>
<ul>
<li>the issue with exposing locals by reference to outer scopes is trivially prevented, since it is not possible to initialize something with a variable from an inner scope.</li>
<li>safe-to-return property can be simply copied from the initial referent, since we require that there is one and there won’t be another.</li>
</ul>
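<p>As a rough illustration of both points, here is a minimal sketch that assumes a compiler implementing the single-assignment rules described in this post (C# 7.0). The names <code class="highlighter-rouge">First</code> and <code class="highlighter-rouge">Rejected</code> are made up for the example.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

class Program
{
    static readonly int[] arr = { 1, 2, 3, 4 };

    static ref int First()
    {
        // Bound exactly once, at declaration, to an array element (heap storage).
        ref int r0 = ref arr[0];

        // "r0" inherits the safe-to-return property from its only possible referent.
        return ref r0;
    }

    static ref int Rejected()
    {
        int local = 42;

        // Binding to a local is fine as long as the reference stays inside the method.
        ref int r = ref local;
        r = 43;

        // return ref r; // does not compile: "r" was initialized with a local

        return ref arr[1];
    }

    static void Main()
    {
        ref int first = ref First();
        first = 42; // writes through the reference into arr[0]
        Console.WriteLine(arr[0]); // prints 42

        Rejected() = 7; // writes through the returned reference into arr[1]
        Console.WriteLine(arr[1]); // prints 7
    }
}
</code></pre></div></div>
<p>Since a ref local must be initialized where it is declared, anything it is bound to is necessarily declared in the same or an enclosing scope, so the inner-scope problem cannot arise.</p>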
<p>It is possible that the single-assignment requirement will be found too constraining and the language may need to be relaxed a bit in the future. Generally it is ok to start accepting code that used to be an error in previous versions, but the opposite changes are extremely rare.</p>
<p><em>Some possible future directions for the feature are:</em></p>
<p><em>Addition1:</em> allow re-assigning of ref locals, but only with something that is safe-to-return, while still inferring the safe-to-return property from the initial assignment.<br />
<em>Addition2:</em> relax the requirement that ref locals must be initialized at declaration. Treat such locals as safe-to-return, as long as they are definitely assigned.</p>
<p>Note that the additions above would allow more code to be legal, primarily when dealing with unscoped/heap variables, while not yet requiring complicated flow analysis.</p>
<p><strong>Pedantic notes:</strong></p>
<p>Before ref locals were introduced, it was, in some situations, possible to use <code class="highlighter-rouge">System.TypedReference</code> in combination with the <code class="highlighter-rouge">__makeref</code> and <code class="highlighter-rouge">__refvalue</code> keywords as a crude substitute. <code class="highlighter-rouge">TypedReference</code> clearly contains a managed reference and acts as a proxy, so why all the fuss with C# scoping, and why does that not apply to TypedReference?</p>
<p>Well, <code class="highlighter-rouge">__makeref</code> and <code class="highlighter-rouge">__refvalue</code> are basically just special keywords that directly map to the <code class="highlighter-rouge">mkrefany</code> and <code class="highlighter-rouge">refanyval</code> IL opcodes in order to provide basic support for the <code class="highlighter-rouge">__arglist</code> feature.</p>
<p>The functionality is optional in both C# and CLR and is not recommended for general purpose use. In a way, it is a platform/CLR feature (like reflection) and is not very well integrated into the language.</p>
<p>The compiler knows about the CLR restrictions on TypedReference: it cannot be boxed, cannot be a field or an array element, cannot be returned from a method, etc. The compiler will try to prevent unverifiable code and fatal GC holes, but not much beyond that.</p>
<p>Indeed, the following example shows that <code class="highlighter-rouge">__makeref</code> completely disrespects the scoping rules of C# and does not care if references outlive the lifetimes of the referenced locals, which, while not fatal, may result in undocumented/unpredictable behavior.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Program</span>
<span class="p">{</span>
<span class="k">static</span> <span class="k">readonly</span> <span class="kt">int</span><span class="p">[]</span> <span class="n">arr</span> <span class="p">=</span> <span class="p">{</span> <span class="m">0</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="m">0</span><span class="p">,</span> <span class="m">0</span> <span class="p">};</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// bind r0, r1 to something initially</span>
<span class="n">TypedReference</span> <span class="n">r0</span> <span class="p">=</span> <span class="nf">__makeref</span><span class="p">((</span><span class="k">new</span> <span class="kt">int</span><span class="p">[</span><span class="m">1</span><span class="p">])[</span><span class="m">0</span><span class="p">]);</span>
<span class="n">TypedReference</span> <span class="n">r1</span> <span class="p">=</span> <span class="n">r0</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="p"><</span> <span class="m">2</span><span class="p">;</span> <span class="n">i</span><span class="p">++)</span>
<span class="p">{</span>
<span class="c1">// NOTE: possibly initializing "variable" using a value</span>
<span class="c1">// of its own binding from 2 iterations behind.</span>
<span class="kt">var</span> <span class="n">variable</span> <span class="p">=</span> <span class="nf">__refvalue</span><span class="p">(</span><span class="n">r0</span><span class="p">,</span> <span class="kt">int</span><span class="p">)</span> <span class="p">+</span> <span class="m">1</span><span class="p">;</span>
<span class="c1">// keeping the previous binding of "r0" in "r1"</span>
<span class="n">r1</span> <span class="p">=</span> <span class="n">r0</span><span class="p">;</span>
<span class="c1">// binding "r0" to "variable".</span>
<span class="c1">// NOTE: "variable" is about to go out of scope,</span>
<span class="c1">// but "r0" and "r1" would still be around, bound to what???</span>
<span class="n">r0</span> <span class="p">=</span> <span class="nf">__makeref</span><span class="p">(</span><span class="n">variable</span><span class="p">);</span>
<span class="c1">// DUMMY NOOP CODE,</span>
<span class="c1">// UNCOMMENT TO CHANGE THE PROGRAM'S OUTPUT!!!!</span>
<span class="c1">//Func<int> dummy = () => variable; // cause "variable" be captured</span>
<span class="p">}</span>
<span class="c1">// NOTE: both "r0" and "r1" are bound to different</span>
<span class="c1">// bindings of "variable", which do not exist at this point.</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">$"Different bindings of 'variable' are equal?: </span><span class="p">{</span><span class="nf">__refvalue</span><span class="p">(</span><span class="n">r0</span><span class="p">,</span> <span class="kt">int</span><span class="p">)</span> <span class="p">==</span> <span class="nf">__refvalue</span><span class="p">(</span><span class="n">r1</span><span class="p">,</span> <span class="kt">int</span><span class="p">)}</span><span class="s">"</span><span class="p">);</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">$"r0 : </span><span class="p">{</span><span class="nf">__refvalue</span><span class="p">(</span><span class="n">r0</span><span class="p">,</span> <span class="kt">int</span><span class="p">)}</span><span class="s">"</span><span class="p">);</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">$"r1 : </span><span class="p">{</span><span class="nf">__refvalue</span><span class="p">(</span><span class="n">r1</span><span class="p">,</span> <span class="kt">int</span><span class="p">)}</span><span class="s">"</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The output is:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Different bindings of 'variable' are equal?: True
r0 : 2
r1 : 2
</code></pre></div></div>
<p>And when the noop <code class="highlighter-rouge">Func</code> is uncommented, the output is:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Different bindings of 'variable' are equal?: False
r0 : 2
r1 : 1
</code></pre></div></div>
<p>Vladimir Sadov, http://mustoverride.com</p>
<h1>Definite Assignment Analysis of locals. The real purpose.</h1>
<p>2016-11-24, http://mustoverride.com/definite-assignment</p>
<p>Definite Assignment Analysis prevents bugs, but there are deeper reasons to have it as a required feature in the language.</p>
<p>Indeed, why have such strict rules instead of just assuming that unassigned locals contain default/zero values? After all, that seems to work ok for fields. It is also known that the C# compiler decorates all methods with the IL directive <code class="highlighter-rouge">localsinit</code>, which guarantees that all locals are zeroed out when the method is entered. So what is the problem?</p>
<p><code class="highlighter-rouge">localsinit</code> is, unfortunately, not enough to implement the semantics of C# locals. It would work if the lifetime of all local bindings (i.e. their <a href="https://en.wikipedia.org/wiki/Variable_(computer_science)#Scope_and_extent">extents</a>) was the whole method, but that is not the case in C#. In C#, locals can have scopes smaller than the entirety of a method, and extents match the lexical scopes. Every time the control flow enters a scope, a new set of bindings for the locals contained by that scope is supposed to be created, and the bindings exist as long as they can be referenced. In the most general sense, “new bindings” would imply newly allocated storage, completely unrelated to the bindings possibly created when the same scope was entered previously.</p>
<p>A brute-force solution would be to map local variables of the same scope to fields in a synthesized class and create a new instance of that class when entering a scope. In fact, this is what happens to locals that are accessible from lambda expressions. Such locals can be used beyond the lifetime of the containing method, and multiple bindings to the same variable could coexist at the same time, so the compiler needs to allocate their storage on the heap and rely on the GC to keep them alive as long as they can be referenced.</p>
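<p>A hand-written approximation of that brute-force mapping might look like the sketch below. The class name <code class="highlighter-rouge">ScopeLocals</code> is made up for illustration; the code that the compiler actually generates has different names and more moving parts.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

// Hypothetical stand-in for a class that the compiler would synthesize
// to hold the locals of one lexical scope.
class ScopeLocals
{
    public int sameVariable;
}

class Program
{
    static void Main()
    {
        var bindings = new ScopeLocals[2];

        for (int i = 0; i < 2; i++)
        {
            // "Entering the scope": a fresh instance means fresh storage,
            // unrelated to the storage used by the previous entry.
            bindings[i] = new ScopeLocals();
        }

        // Both bindings are alive at the same time and are independent.
        bindings[0].sameVariable = 33;
        bindings[1].sameVariable = 42;

        Console.WriteLine(bindings[0].sameVariable); // prints 33
        Console.WriteLine(bindings[1].sameVariable); // prints 42
    }
}
</code></pre></div></div>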
<p>Example of multiple bindings to the same local:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">class</span> <span class="nc">Program</span>
<span class="p">{</span>
<span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">iteration</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span>
<span class="kt">var</span> <span class="n">setters</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Action</span><span class="p"><</span><span class="kt">int</span><span class="p">>[</span><span class="m">2</span><span class="p">];</span>
<span class="kt">var</span> <span class="n">getters</span> <span class="p">=</span> <span class="k">new</span> <span class="n">Func</span><span class="p"><</span><span class="kt">int</span><span class="p">>[</span><span class="m">2</span><span class="p">];</span>
<span class="n">reenterScope</span><span class="p">:</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">sameVariable</span> <span class="p">=</span> <span class="m">0</span><span class="p">;</span> <span class="c1">// <-- THE VARIABLE</span>
<span class="n">setters</span><span class="p">[</span><span class="n">iteration</span><span class="p">]</span> <span class="p">=</span> <span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="p">=></span> <span class="n">sameVariable</span> <span class="p">=</span> <span class="n">i</span><span class="p">;</span>
<span class="n">getters</span><span class="p">[</span><span class="n">iteration</span><span class="p">]</span> <span class="p">=</span> <span class="p">()</span> <span class="p">=></span> <span class="n">sameVariable</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">iteration</span><span class="p">++</span> <span class="p"><</span> <span class="m">1</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">goto</span> <span class="n">reenterScope</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"Original values of different bindings of the sameVariable: "</span><span class="p">);</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">getters</span><span class="p">[</span><span class="m">0</span><span class="p">]());</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">getters</span><span class="p">[</span><span class="m">1</span><span class="p">]());</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">();</span>
<span class="n">setters</span><span class="p">[</span><span class="m">0</span><span class="p">](</span><span class="m">33</span><span class="p">);</span>
<span class="n">setters</span><span class="p">[</span><span class="m">1</span><span class="p">](</span><span class="m">42</span><span class="p">);</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"Assigned values of different bindings of the sameVariable: "</span><span class="p">);</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">getters</span><span class="p">[</span><span class="m">0</span><span class="p">]());</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">getters</span><span class="p">[</span><span class="m">1</span><span class="p">]());</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Original values of different bindings of the sameVariable:
0
0
Assigned values of different bindings of the sameVariable:
33
42
</code></pre></div></div>
<p>The most common case is, however, when locals are just that: locals. They are not accessed from lambdas or anything like that, and at any time at most one binding to such a local may exist. In such cases locals can simply be mapped to IL local slots and reused every time the control flow enters the scope.
The only problem is that the slot values would need to be “reset” to the default value every time the scope is entered, and there is no help from <code class="highlighter-rouge">localsinit</code> here since that works only once, when the whole method is invoked.</p>
<p>In theory, the compiler could inject code that would do the “resetting” of all relevant slots when a scope is entered, but that would be wasteful. Only some of the locals in a given scope would be read from. Besides, most of them would be written to before reading anyway, so why not just require that a local is written to before being read? That would make the code less buggy, but most of all it would make the “resetting” entirely unnecessary.</p>
<p><strong>Essentially, a rule that requires that locals are definitely assigned before being read serves the same purpose as <code class="highlighter-rouge">localsinit</code>, but does a much better job.</strong></p>
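<p>A minimal sketch of the rule at work in a scope that is entered more than once; the loop and the values are made up for illustration. Whatever the previous iteration left in the slot can never be observed, because a read before an assignment does not compile.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

class Program
{
    static void Main()
    {
        for (int i = 0; i < 2; i++)
        {
            // A new binding of "x" on every iteration. The compiler is free
            // to reuse the same IL slot without resetting it.
            int x;

            // Console.WriteLine(x); // does not compile: use of unassigned local variable 'x'

            x = i + 10;
            Console.WriteLine(x); // prints 10, then 11
        }
    }
}
</code></pre></div></div>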
<ol>
<li>It works at every nested lexical scope recursively (not just on the method level)</li>
<li>It gives stronger guarantees. You can see only what you have already assigned to the variable. It is impossible to read uninitialized/stale state by accident.</li>
<li>It is minimally redundant. If you do not read a local on some code path you do not need to ensure that it is assigned on that code path</li>
</ol>
<p>Here is a simple example of variables that are assigned on one code path and not on the other. As long as we do not read a variable, it is ok to leave it unassigned.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">void</span> <span class="nf">Main</span><span class="p">(</span><span class="kt">string</span><span class="p">[]</span> <span class="n">args</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">a</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">b</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">args</span><span class="p">.</span><span class="n">Length</span> <span class="p">></span> <span class="m">0</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">goto</span> <span class="n">path1</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">else</span>
<span class="p">{</span>
<span class="k">goto</span> <span class="n">path2</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">path1</span><span class="p">:</span>
<span class="c1">// assign only "a"</span>
<span class="n">a</span> <span class="p">=</span> <span class="m">123</span><span class="p">;</span>
<span class="k">goto</span> <span class="n">path1continues</span><span class="p">;</span>
<span class="n">path2</span><span class="p">:</span>
<span class="c1">// assign only "b"</span>
<span class="n">b</span> <span class="p">=</span> <span class="m">345</span><span class="p">;</span>
<span class="k">goto</span> <span class="n">path2continues</span><span class="p">;</span>
<span class="n">path1continues</span><span class="p">:</span>
<span class="c1">// on this codepath "b" is not assigned,</span>
<span class="c1">// but that is ok since we do not read it.</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">a</span><span class="p">);</span>
<span class="k">goto</span> <span class="n">exit</span><span class="p">;</span>
<span class="n">path2continues</span><span class="p">:</span>
<span class="c1">// on this codepath "a" is not assigned,</span>
<span class="c1">// but that is ok since we do not read it.</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">b</span><span class="p">);</span>
<span class="n">exit</span><span class="p">:</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
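<p>The same rule applies inside nested scopes. Below is a minimal sketch (the variable names are purely illustrative) of a local declared in a loop body. Every entry into the scope starts with an unassigned <code class="highlighter-rouge">x</code>, so a value written on a previous iteration can never be observed by accident.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

static class Program
{
    static void Main()
    {
        int i = 0;
        while (i != 3)
        {
            // "x" is unassigned every time the loop body is entered
            int x;
            if (i == 0)
            {
                x = 100;
            }
            // Console.WriteLine(x); // would not compile: "x" is not definitely assigned here

            x = i * 10;              // after this write, reading "x" is fine
            Console.WriteLine(x);    // prints 0, 10 and 20 on successive iterations
            i++;
        }
    }
}
</code></pre></div></div>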
<p>Interestingly, in VB, for historical reasons, locals not referenced from lambdas do have extents that match the entirety of the method, and thus definite assignment analysis is much less strict - it basically exists just to give warnings in some cases that are likely to be coding mistakes.</p>
<p>An example of a VB local binding maintained through the entire lifetime of the method:</p>
<div class="language-vb highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Module</span> <span class="nn">Module1</span>
<span class="k">Sub</span> <span class="nf">Main</span><span class="p">()</span>
<span class="k">Dim</span> <span class="nv">iterations</span> <span class="ow">As</span> <span class="kt">Integer</span> <span class="o">=</span> <span class="mi">3</span>
<span class="k">While</span> <span class="n">iterations</span> <span class="o">></span> <span class="mi">0</span>
<span class="nb">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">"variable declared and initialized"</span><span class="p">)</span>
<span class="k">Dim</span> <span class="nv">variable</span> <span class="ow">As</span> <span class="kt">Integer</span> <span class="o">=</span> <span class="mi">42</span>
<span class="n">reentryTheScope</span><span class="p">:</span>
<span class="nb">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="n">variable</span><span class="p">)</span>
<span class="n">variable</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">iterations</span> <span class="o">-=</span> <span class="mi">1</span>
<span class="k">End</span> <span class="k">While</span>
<span class="k">If</span> <span class="n">iterations</span> <span class="o">></span> <span class="o">-</span><span class="mi">3</span> <span class="k">Then</span>
<span class="c1">' "variable" is out of scope here, but it exists and has a value</span>
<span class="c1">' let's reenter the While loop and check upon the value of the "variable"</span>
<span class="nb">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">"reentering scope"</span><span class="p">)</span>
<span class="k">GoTo</span> <span class="n">reentryTheScope</span>
<span class="k">End</span> <span class="k">If</span>
<span class="k">End</span> <span class="k">Sub</span>
<span class="k">End</span> <span class="k">Module</span>
</code></pre></div></div>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>variable declared and initialized
42
variable declared and initialized
42
variable declared and initialized
42
reentering scope
43
reentering scope
44
reentering scope
45
</code></pre></div></div>
<p>The locals captured into lambda closures, however, have scoped extents in VB - surely the lifetimes cannot be bound to the lifetime of the containing method anymore when lambdas are involved. Similarly to C#, fresh bindings for captured locals are created when a scope is entered, and their lifetimes are bound to the lifetime of the referencing lambdas. So if locals are captured, the example above would start behaving differently. To make the unfortunate inconsistency less observable, VB refuses to compile code like the above when locals are captured.</p>
<div class="language-vb highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">Module</span> <span class="nn">Module1</span>
<span class="k">Sub</span> <span class="nf">Main</span><span class="p">()</span>
<span class="k">Dim</span> <span class="nv">iterations</span> <span class="ow">As</span> <span class="kt">Integer</span> <span class="o">=</span> <span class="mi">3</span>
<span class="k">While</span> <span class="n">iterations</span> <span class="o">></span> <span class="mi">0</span>
<span class="nb">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">"variable declared and initialized"</span><span class="p">)</span>
<span class="k">Dim</span> <span class="nv">variable</span> <span class="ow">As</span> <span class="kt">Integer</span> <span class="o">=</span> <span class="mi">42</span>
<span class="n">reentryTheScope</span><span class="p">:</span>
<span class="nb">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="n">variable</span><span class="p">)</span>
<span class="n">variable</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="n">iterations</span> <span class="o">-=</span> <span class="mi">1</span>
<span class="c1">' cause "variable" to be captured into a closure</span>
<span class="k">Dim</span> <span class="nv">lambda</span> <span class="ow">As</span> <span class="n">Func</span><span class="p">(</span><span class="k">Of</span> <span class="kt">Integer</span><span class="p">)</span> <span class="o">=</span> <span class="k">Function</span><span class="err">()</span> <span class="nf">variable</span>
<span class="k">End</span> <span class="k">While</span>
<span class="k">If</span> <span class="n">iterations</span> <span class="o">></span> <span class="o">-</span><span class="mi">3</span> <span class="k">Then</span>
<span class="c1">' "variable" is out of scope here, but it exists and has a value</span>
<span class="c1">' let's reenter the While loop and check upon the value of the "variable"</span>
<span class="nb">Console</span><span class="p">.</span><span class="n">WriteLine</span><span class="p">(</span><span class="s">"reentering scope"</span><span class="p">)</span>
<span class="k">GoTo</span> <span class="n">reentryTheScope</span>
<span class="k">End</span> <span class="k">If</span>
<span class="k">End</span> <span class="k">Sub</span>
<span class="k">End</span> <span class="k">Module</span>
</code></pre></div></div>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Module1.vb(27, 13) : Error BC36597 : 'Goto reentryTheScope' is not valid because 'reentryTheScope' is inside a scope that defines a variable that is used in a lambda or query expression.
</code></pre></div></div>
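<p>For comparison, here is a small C# sketch of the “fresh binding per scope entry” behavior for captured locals. The names are made up for illustration. Each lambda keeps its own copy of <code class="highlighter-rouge">variable</code> alive, so the three delegates print three different values.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

static class Program
{
    static void Main()
    {
        Action[] printers = new Action[3];
        int i = 0;
        while (i != 3)
        {
            // declared inside the loop scope, so each iteration gets a fresh binding
            int variable = 42 + i;

            // capturing "variable" ties its lifetime to the lambda
            printers[i] = () => Console.WriteLine(variable);

            // the write goes to the binding of this iteration only
            variable += 100;
            i++;
        }

        // prints 142, 143 and 144
        foreach (Action print in printers)
        {
            print();
        }
    }
}
</code></pre></div></div>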
<p><strong>Pedantic observations:</strong></p>
<p>Since C# enforces a stronger invariant than the one provided by <code class="highlighter-rouge">localsinit</code>, one would wonder why the compiler still puts <code class="highlighter-rouge">localsinit</code> on methods. A simple answer is that IL verification rules require that. The underlying reason for the requirement is that the user’s code is not the only entity that might read the locals. The other one is the Garbage Collector.</p>
<p>The issue with GC is that it scans the IL locals of currently active methods in order to record the roots of reachable object graphs, and GC happens at fairly random times. Definite assignment analysis does not guarantee that locals will be assigned something deterministic before GC happens, and things will go terribly wrong if locals contain random junk. Therefore there is a rule that requires that verifiable methods have <code class="highlighter-rouge">localsinit</code> as an instruction directing the JIT to add a method preamble that wipes the whole stack frame clean before the method body is formally entered and before the GC has any chance to scan the locals.</p>
<p>In theory the rule could be required only on methods with locals of reference types (or structs containing references), but that would make a difference only to a fraction of methods while complicating the rule. Instead CLI standard allows JIT implementations to disregard the directive if, through some analysis, it could be inferred that not wiping the frame is a safe thing to do.</p>
<p>I am not sure if JITs use this kink in the rules very often though. With the exception of the most trivial cases, the analysis could be too involved to be feasible at JIT time, and wiping the stack frame is not overly expensive. Still, since there are some costs associated with locals (wiping the frame is just one of them), the C# compiler generally tries to be frugal with usage of local slots, especially when compiling with /o+.</p>Vladimir Sadovhttp://mustoverride.comDefinite Assignment Analysis prevents bugs, but there are deeper reasons to have it as a required feature in the language.Conditional member access operator (idiomatic uses).2016-11-15T00:00:00+00:002016-11-15T00:00:00+00:00http://mustoverride.com/conditional-idiomatic<p>Here are some of the lesser-known uses of the null-conditional operator that could be handy to know.</p>
<p><strong>1. use null-conditional access with void methods</strong></p>
<p>A common misconception about the null-conditional operator is that it can be used only with members that return something. That is probably because of the special treatment where the return type is promoted to nullable (if it cannot represent null already).<br />
It is, actually, perfectly fine to use null-conditional access with a void-returning method. The overall expression type is still <code class="highlighter-rouge">void</code>; the method is just called conditionally.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// get a dictionary if we have one</span>
<span class="n">Dictionary</span><span class="p"><</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">></span> <span class="n">dict</span> <span class="p">=</span> <span class="nf">GetDictionaryOrNull</span><span class="p">();</span>
<span class="c1">// add something, if we actually have a dictionary</span>
<span class="c1">// NOTE: Add has void return type</span>
<span class="n">dict</span><span class="p">?.</span><span class="nf">Add</span><span class="p">(</span><span class="m">42</span><span class="p">,</span> <span class="nf">GetValue</span><span class="p">());</span>
</code></pre></div></div>
<p>It is particularly nice that execution of the whole <code class="highlighter-rouge">Add(42, GetValue())</code> is conditional and thus <code class="highlighter-rouge">GetValue()</code> is only evaluated if <code class="highlighter-rouge">dict</code> is not <code class="highlighter-rouge">null</code>.<br />
Without <code class="highlighter-rouge">?.</code> the same code would not look as nice.</p>
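<p>A runnable sketch of that behavior follows. <code class="highlighter-rouge">Sink</code> is a hypothetical stand-in for the dictionary (any type with a void-returning method would do), and <code class="highlighter-rouge">GetValue</code> logs when it is called so that the skipped evaluation is visible. The last part shows roughly what the same call looks like without <code class="highlighter-rouge">?.</code>.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

// hypothetical type with a void-returning method, standing in for the dictionary
class Sink
{
    public void Add(int key, int value)
    {
        Console.WriteLine("added " + key + " -> " + value);
    }
}

static class Program
{
    static int GetValue()
    {
        Console.WriteLine("GetValue called");
        return 7;
    }

    static void Main()
    {
        Sink sink = null;
        // receiver is null: neither GetValue nor Add runs, nothing is printed
        sink?.Add(42, GetValue());

        sink = new Sink();
        // receiver is not null: GetValue runs, then Add
        sink?.Add(42, GetValue());

        // roughly the same call written without "?."
        Sink temp = sink;
        if (temp != null)
        {
            temp.Add(42, GetValue());
        }
    }
}
</code></pre></div></div>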
<p><strong>2. raising events</strong></p>
<p>Generally C# events need to be null-checked before being invoked. In addition to that, to be safe from races, an event needs to be captured into a local. That is quite a bit of code just to raise an event:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">event</span> <span class="n">Action</span> <span class="n">OnSomething</span> <span class="p">=</span> <span class="k">null</span><span class="p">;</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">DoSomething</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// raise OnSomething, if not null</span>
<span class="kt">var</span> <span class="n">onSomething</span> <span class="p">=</span> <span class="n">OnSomething</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span> <span class="n">onSomething</span> <span class="p">!=</span> <span class="k">null</span><span class="p">)</span>
<span class="p">{</span>
<span class="nf">onSomething</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Raising an event, however, is nothing more than calling the <code class="highlighter-rouge">Invoke</code> method on the event’s delegate. Performing that conditionally on the delegate not being null is much clearer with <code class="highlighter-rouge">?.</code>, and since the receiver is evaluated only once, no explicit local copy is needed:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">event</span> <span class="n">Action</span> <span class="n">OnSomething</span> <span class="p">=</span> <span class="k">null</span><span class="p">;</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">Something</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// raise OnSomething, if not null</span>
<span class="n">OnSomething</span><span class="p">?.</span><span class="nf">Invoke</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
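<p>A small usage sketch, assuming the class from the post with the raising method named <code class="highlighter-rouge">DoSomething</code>. Raising with no subscribers is a no-op rather than a <code class="highlighter-rouge">NullReferenceException</code>.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

public class C
{
    public event Action OnSomething;

    public void DoSomething()
    {
        // the delegate is read once, then invoked only if it is not null
        OnSomething?.Invoke();
    }
}

static class Program
{
    static void Main()
    {
        var c = new C();
        c.DoSomething();              // no subscribers: nothing happens, no exception

        Action handler = () => Console.WriteLine("something happened");
        c.OnSomething += handler;
        c.DoSomething();              // the handler runs once

        c.OnSomething -= handler;
        c.DoSomething();              // back to no subscribers: nothing happens
    }
}
</code></pre></div></div>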
<p><strong>3. null-conditional and null-coalescing operator together</strong></p>
<p>When the receiver of a null-conditional operator is <code class="highlighter-rouge">null</code>, the result is also <code class="highlighter-rouge">null</code> of the appropriate type. That might be inconvenient in cases where a “default” result other than <code class="highlighter-rouge">null</code> is supposed to be returned. That can be easily and elegantly fixed by combining <code class="highlighter-rouge">?.</code> and <code class="highlighter-rouge">??</code>.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// if obj is not null, give me its hashcode or 42 otherwise</span>
<span class="kt">int</span> <span class="n">hashcode</span> <span class="p">=</span> <span class="n">obj</span><span class="p">?.</span><span class="nf">GetHashCode</span><span class="p">()</span> <span class="p">??</span> <span class="m">42</span><span class="p">;</span>
</code></pre></div></div>
<p>It may seem that the code has two null checks here, which would be redundant, considering that only one input variable could be null:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// null check "obj", wrap result in "int?"</span>
<span class="kt">int</span><span class="p">?</span> <span class="n">temp</span> <span class="p">=</span> <span class="n">obj</span><span class="p">?.</span><span class="nf">GetHashCode</span><span class="p">();</span>
<span class="c1">// null check the "temp", unwrap "int?"</span>
<span class="kt">int</span> <span class="n">hashcode</span> <span class="p">=</span> <span class="n">temp</span> <span class="p">??</span> <span class="m">42</span><span class="p">;</span>
</code></pre></div></div>
<p>The compiler is actually smart enough to understand the meaning of the <code class="highlighter-rouge">?.</code> + <code class="highlighter-rouge">??</code> combination, and emits more optimal code. It knows that the only way for the <code class="highlighter-rouge">obj?.GetHashCode()</code> to be <code class="highlighter-rouge">null</code> is when <code class="highlighter-rouge">obj</code> is <code class="highlighter-rouge">null</code>, and in such a case the whole expression returns <code class="highlighter-rouge">42</code>. When <code class="highlighter-rouge">obj</code> is not <code class="highlighter-rouge">null</code>, the result of <code class="highlighter-rouge">GetHashCode()</code> is returned. In fact, there is no need to involve intermediate wrapping/unwrapping of <code class="highlighter-rouge">int?</code> at all.</p>
<p>The actual code, that is emitted, is an equivalent of:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">object</span> <span class="n">stackTemp</span> <span class="p">=</span> <span class="n">obj</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">hashcode</span> <span class="p">=</span> <span class="n">stackTemp</span> <span class="p">!=</span> <span class="k">null</span> <span class="p">?</span> <span class="n">stackTemp</span><span class="p">.</span><span class="nf">GetHashCode</span><span class="p">()</span> <span class="p">:</span> <span class="m">42</span><span class="p">;</span>
</code></pre></div></div>
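<p>The observable behavior is easy to check with a tiny program. <code class="highlighter-rouge">HashOrDefault</code> is just an illustrative helper name. A boxed <code class="highlighter-rouge">int</code> is used as the non-null input because <code class="highlighter-rouge">Int32.GetHashCode()</code> returns the value itself, which keeps the result predictable.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

static class Program
{
    // illustrative helper: hash code of "obj", or 42 when "obj" is null
    static int HashOrDefault(object obj)
    {
        return obj?.GetHashCode() ?? 42;
    }

    static void Main()
    {
        Console.WriteLine(HashOrDefault(null)); // prints 42
        Console.WriteLine(HashOrDefault(7));    // prints 7 (hash code of a boxed int)
    }
}
</code></pre></div></div>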
<p><strong>4. use null-conditional in conditions</strong></p>
<p>The null-conditional operator has the type <code class="highlighter-rouge">bool?</code> when the underlying expression is of type <code class="highlighter-rouge">bool</code>. Such an expression cannot be used directly in conditions. However, there are easy ways to “normalize” the three-state result into true/false.</p>
<p>When <code class="highlighter-rouge">null</code> should be treated the same as <code class="highlighter-rouge">false</code>, use <code class="highlighter-rouge">== true</code> or <code class="highlighter-rouge">?? false</code>.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="c1">// assigned externally</span>
<span class="k">public</span> <span class="k">static</span> <span class="n">HashSet</span><span class="p"><</span><span class="kt">int</span><span class="p">></span> <span class="n">hs</span><span class="p">;</span>
<span class="k">public</span> <span class="k">void</span> <span class="nf">Check</span><span class="p">()</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">hs</span><span class="p">?.</span><span class="nf">Contains</span><span class="p">(</span><span class="m">42</span><span class="p">)</span> <span class="p">==</span> <span class="k">true</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"contains"</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">hs</span><span class="p">?.</span><span class="nf">Contains</span><span class="p">(</span><span class="m">42</span><span class="p">)</span> <span class="p">??</span> <span class="k">false</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"contains"</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Both <code class="highlighter-rouge">== true</code> and <code class="highlighter-rouge">?? false</code> result in the same code emitted as the resulting conditions indeed have the same semantics. In either case compiler can infer that the only situation in which the condition will be satisfied is when <code class="highlighter-rouge">hs</code> is not <code class="highlighter-rouge">null</code> and when <code class="highlighter-rouge">hs.Contains(42)</code> returns <code class="highlighter-rouge">true</code>.</p>
<p>I personally like <code class="highlighter-rouge">== true</code> form more, but I have seen <code class="highlighter-rouge">?? false</code> used and I find it just as readable.</p>
<p>Again, the intermediate <code class="highlighter-rouge">bool?</code>, that would be produced by <code class="highlighter-rouge">hs?.Contains(42)</code> alone, and all expenses related to dealing with it, can be bypassed.</p>
<p>The actual codegen for either of the <code class="highlighter-rouge">if</code>s above looks like:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">IL_0000</span><span class="p">:</span> <span class="n">ldsfld</span> <span class="k">class</span> <span class="err">[</span><span class="nc">System</span><span class="p">.</span><span class="n">Core</span><span class="p">]</span><span class="n">System</span><span class="p">.</span><span class="n">Collections</span><span class="p">.</span><span class="n">Generic</span><span class="p">.</span><span class="n">HashSet</span><span class="err">`</span><span class="m">1</span><span class="p"><</span><span class="n">int32</span><span class="p">></span> <span class="n">C</span><span class="p">::</span><span class="n">hs</span>
<span class="n">IL_0005</span><span class="p">:</span> <span class="n">dup</span>
<span class="n">IL_0006</span><span class="p">:</span> <span class="n">brtrue</span><span class="p">.</span><span class="n">s</span> <span class="n">IL_000c</span>
<span class="n">IL_0008</span><span class="p">:</span> <span class="n">pop</span>
<span class="n">IL_0009</span><span class="p">:</span> <span class="n">ldc</span><span class="p">.</span><span class="n">i4</span><span class="p">.</span><span class="m">0</span>
<span class="n">IL_000a</span><span class="p">:</span> <span class="n">br</span><span class="p">.</span><span class="n">s</span> <span class="n">IL_0013</span>
<span class="n">IL_000c</span><span class="p">:</span> <span class="n">ldc</span><span class="p">.</span><span class="n">i4</span><span class="p">.</span><span class="n">s</span> <span class="m">42</span>
<span class="n">IL_000e</span><span class="p">:</span> <span class="n">call</span> <span class="n">instance</span> <span class="kt">bool</span> <span class="k">class</span> <span class="err">[</span><span class="nc">System</span><span class="p">.</span><span class="n">Core</span><span class="p">]</span><span class="n">System</span><span class="p">.</span><span class="n">Collections</span><span class="p">.</span><span class="n">Generic</span><span class="p">.</span><span class="n">HashSet</span><span class="err">`</span><span class="m">1</span><span class="p"><</span><span class="n">int32</span><span class="p">>::</span><span class="nf">Contains</span><span class="p">(!</span><span class="m">0</span><span class="p">)</span>
<span class="n">IL_0013</span><span class="p">:</span> <span class="n">brfalse</span><span class="p">.</span><span class="n">s</span> <span class="n">IL_001f</span>
<span class="n">IL_0015</span><span class="p">:</span> <span class="n">ldstr</span> <span class="s">"contains"</span>
<span class="n">IL_001a</span><span class="p">:</span> <span class="n">call</span> <span class="k">void</span> <span class="p">[</span><span class="n">mscorlib</span><span class="p">]</span><span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">::</span><span class="nf">WriteLine</span><span class="p">(</span><span class="kt">string</span><span class="p">)</span>
</code></pre></div></div>
<p>This is the equivalent of the optimized:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">HashSet</span><span class="p"><</span><span class="kt">int</span><span class="p">></span> <span class="n">stackTemp</span> <span class="p">=</span> <span class="n">C</span><span class="p">.</span><span class="n">hs</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">stackTemp</span> <span class="p">!=</span> <span class="k">null</span> <span class="p">&&</span> <span class="n">stackTemp</span><span class="p">.</span><span class="nf">Contains</span><span class="p">(</span><span class="m">42</span><span class="p">))</span>
<span class="p">{</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"contains"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Conversely, <code class="highlighter-rouge">!= false</code> and <code class="highlighter-rouge">?? true</code> could be used when <code class="highlighter-rouge">null</code> is to be treated the same as <code class="highlighter-rouge">true</code>.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="n">hs</span><span class="p">?.</span><span class="nf">Contains</span><span class="p">(</span><span class="m">42</span><span class="p">)</span> <span class="p">!=</span> <span class="k">false</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"contains or null"</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">hs</span><span class="p">?.</span><span class="nf">Contains</span><span class="p">(</span><span class="m">42</span><span class="p">)</span> <span class="p">??</span> <span class="k">true</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">System</span><span class="p">.</span><span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"contains or null"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>In either case, the condition is emitted as an optimized equivalent of:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">HashSet</span><span class="p"><</span><span class="kt">int</span><span class="p">></span> <span class="n">stackTemp</span> <span class="p">=</span> <span class="n">C</span><span class="p">.</span><span class="n">hs</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">stackTemp</span> <span class="p">==</span> <span class="k">null</span> <span class="p">||</span> <span class="n">stackTemp</span><span class="p">.</span><span class="nf">Contains</span><span class="p">(</span><span class="m">42</span><span class="p">))</span>
<span class="p">{</span>
<span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">"contains or null"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
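<p>To see all four normalizations side by side, here is a small sketch that uses a <code class="highlighter-rouge">string</code> receiver instead of the <code class="highlighter-rouge">HashSet</code> (an arbitrary substitution to keep the sample short). For a <code class="highlighter-rouge">null</code> receiver the first two forms give <code class="highlighter-rouge">False</code> and the last two give <code class="highlighter-rouge">True</code>. For a non-null receiver all four agree with the result of <code class="highlighter-rouge">Contains</code>.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

static class Program
{
    static void Show(string s)
    {
        // three-state result: null, true or false
        bool? raw = s?.Contains("a");

        Console.WriteLine(
            (s ?? "null") + ": " +
            "== true: " + (raw == true) + ", " +    // null is treated as false
            "?? false: " + (raw ?? false) + ", " +  // null is treated as false
            "!= false: " + (raw != false) + ", " +  // null is treated as true
            "?? true: " + (raw ?? true));           // null is treated as true
    }

    static void Main()
    {
        Show(null);   // receiver is null
        Show("abc");  // contains "a"
        Show("xyz");  // does not contain "a"
    }
}
</code></pre></div></div>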
<p><strong>5. composing null-conditional and lifted operators</strong></p>
<p>Null-conditional operator mixes well with lifted operators.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="c1">// assigned externally</span>
<span class="k">public</span> <span class="k">static</span> <span class="kt">string</span> <span class="n">s</span><span class="p">;</span>
<span class="k">public</span> <span class="kt">bool</span> <span class="nf">IsLongEnough</span><span class="p">()</span>
<span class="p">{</span>
<span class="k">return</span> <span class="n">s</span><span class="p">?.</span><span class="n">Length</span> <span class="p">*</span> <span class="m">2</span> <span class="p">+</span> <span class="m">1</span> <span class="p">></span> <span class="m">10</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
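<p>As a quick check of what such an expression evaluates to, here is the same calculation as a static helper taking the string as a parameter (a small deviation from the class above, just to make it easy to call):</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

static class Program
{
    static bool IsLongEnough(string s)
    {
        // s?.Length is int?, so "*", "+" and ">" here are the lifted forms
        return s?.Length * 2 + 1 > 10;
    }

    static void Main()
    {
        Console.WriteLine(IsLongEnough(null));    // False: a lifted ">" with a null operand is false
        Console.WriteLine(IsLongEnough("abc"));   // False: 3 * 2 + 1 = 7
        Console.WriteLine(IsLongEnough("hello")); // True:  5 * 2 + 1 = 11
    }
}
</code></pre></div></div>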
<p>Again, the compiler knows about the short-circuiting nature of <code class="highlighter-rouge">?.</code>: the only way a <code class="highlighter-rouge">null</code> can get into the calculation is through <code class="highlighter-rouge">s</code> being <code class="highlighter-rouge">null</code>. Once that happens, the result is known immediately, without the need to propagate that <code class="highlighter-rouge">null</code> through the whole chain of lifted calculations.</p>
<p>The actual code emitted here is the equivalent of:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">public</span> <span class="k">class</span> <span class="nc">C</span>
<span class="p">{</span>
<span class="k">public</span> <span class="k">static</span> <span class="kt">string</span> <span class="n">s</span><span class="p">;</span>
<span class="k">public</span> <span class="kt">bool</span> <span class="nf">IsLongEnough</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">string</span> <span class="n">stackTemp</span> <span class="p">=</span> <span class="n">C</span><span class="p">.</span><span class="n">s</span><span class="p">;</span>
<span class="k">return</span> <span class="n">stackTemp</span> <span class="p">!=</span> <span class="k">null</span> <span class="p">&&</span> <span class="c1">// <- check for null once</span>
<span class="n">stackTemp</span><span class="p">.</span><span class="n">Length</span> <span class="p">*</span> <span class="m">2</span> <span class="p">+</span> <span class="m">1</span> <span class="p">></span> <span class="m">10</span><span class="p">;</span> <span class="c1">// <- not null-propagating </span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>As you may notice the intermediate nullables are again completely elided by the compiler and the math was simplified to regular (not the null-propagating) form.</p>Vladimir Sadovhttp://mustoverride.comHere are some of the less known uses of Null-conditional operator that could be handy to know.Safe to return rules for ref returns.2016-11-04T00:00:00+00:002016-11-04T00:00:00+00:00http://mustoverride.com/safe-to-return<p>For the reasons explained in the <a href="/ref-returns-and-locals/">earlier post</a>, C# disallows returning local variables by reference. While the principle of “<em>Cannot return local variables by reference</em>” seems very simple, there are many ways for a user to violate the principle directly or indirectly and enforcing it is an interesting challenge.</p>
<p>So, what is exactly safe to return and what is not?</p>
<p>Clearly, attempting to return a local by reference should trigger an error. But what about a field of a local? If that local happens to be a struct we would be in trouble, since we would still be returning a reference to the local data. On the other hand, it would be ok if the local is a class. Therefore the rule needs to be generalized to include the fields of struct locals as well, recursively: a field of a field of a field of … of a local is unsafe to return as long as all types in the chain are structs.</p>
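<p>A minimal sketch of the struct vs. class distinction. The types <code class="highlighter-rouge">Box</code> and <code class="highlighter-rouge">Pair</code> are made up for illustration, and the rejected method is left commented out so that the sample compiles.</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

// hypothetical types, for illustration only
class Box { public int Value; }
struct Pair { public int First; }

static class Program
{
    // OK: "b" refers to a class instance, so b.Value lives on the heap
    // and outlives the method
    static ref int FromClassLocal()
    {
        Box b = new Box();
        return ref b.Value;
    }

    // Would not compile: "p" is a struct local, so p.First is a part of the local itself
    // static ref int FromStructLocal()
    // {
    //     Pair p = new Pair();
    //     return ref p.First;
    // }

    static void Main()
    {
        ref int r = ref FromClassLocal();
        r = 5;
        Console.WriteLine(r); // prints 5
    }
}
</code></pre></div></div>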
<p>Another interesting question is whether a ref return or a ref parameter themselves are safe to return.</p>
<p>Consider the following example:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">ref</span> <span class="kt">int</span> <span class="nf">Callee</span><span class="p">(</span><span class="k">ref</span> <span class="kt">int</span> <span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="k">ref</span> <span class="n">arg</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">ref</span> <span class="kt">int</span> <span class="nf">Caller</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">s</span> <span class="p">=</span> <span class="m">42</span><span class="p">;</span>
<span class="c1">// DANGER!! returning a reference to the local data</span>
<span class="k">return</span> <span class="k">ref</span> <span class="nf">Callee</span><span class="p">(</span><span class="k">ref</span> <span class="n">s</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Here Caller passes a local variable <code class="highlighter-rouge">s</code> by reference to Callee. Then Callee returns it back by reference. What comes back from Callee is essentially <code class="highlighter-rouge">ref s</code>. While it would be ok for Caller to use that, returning that result by reference would be equivalent to returning <code class="highlighter-rouge">s</code> by reference and thus should be prevented.<br />
In the general case, the compiler has no knowledge of what is going on inside Callee. Conservatively, the compiler must assume that any byref parameter or its field may be returned back by reference, so if any of the ref arguments is not safe to return, the result of the call is not safe to return either.<br />
Note that in some cases, knowing the types of the ref parameters and the return type, it could be proven that the return cannot possibly be referencing data from one of the parameters. However, it was decided to be conservative here for the sake of simplicity and consistency (considering structs, interfaces and generics, the additional rules could get really complicated).</p>
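<p>To make the conservative rule concrete, here is a small sketch. All names (<code class="highlighter-rouge">Identity</code>, <code class="highlighter-rouge">Unrelated</code>, <code class="highlighter-rouge">storage</code>) are hypothetical. The variants that I would expect the compiler to reject are commented out:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

public static class CallResults
{
    // Hypothetical heap storage for the example.
    static readonly int[] storage = new int[1];

    static ref int Identity(ref int arg) { return ref arg; }

    // Cannot possibly return its argument: the types do not match.
    static ref int Unrelated(ref string ignored) { return ref storage[0]; }

    // OK: the only ref argument is an array element.
    static ref int FromHeap() { return ref Identity(ref storage[0]); }

    // OK: the only ref argument is a ref parameter.
    static ref int FromParameter(ref int p) { return ref Identity(ref p); }

    // Both rejected: a local is passed by reference, so the result is
    // not safe to return, even when the types rule out aliasing.
    // static ref int FromLocal() { int s = 42; return ref Identity(ref s); }
    // static ref int FromLocalString() { string s = ""; return ref Unrelated(ref s); }

    public static int Demo()
    {
        FromHeap() = 42;
        int local = 1;
        // Using the result locally is fine. Only returning it is restricted.
        ref int alias = ref Identity(ref local);
        alias++;
        return storage[0] + local;   // 42 + 2
    }

    public static void Main()
    {
        Console.WriteLine(Demo());
    }
}
</code></pre></div></div>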
<p>Here are the actual “safe to return” rules as enforced by the language:</p>
<hr />
<ol>
<li>
<p><strong>refs to variables on the heap are safe to return</strong></p>
</li>
<li>
<p><strong>ref parameters are safe to return</strong></p>
</li>
<li>
<p><strong>out parameters are safe to return (but must be definitely assigned, as is already the case today)</strong></p>
</li>
<li>
<p><strong>instance struct fields are safe to return as long as the receiver is safe to return</strong></p>
</li>
<li>
<p><strong>“this” is not safe to return from struct members</strong></p>
</li>
<li>
<p><strong>a ref returned from another method is safe to return if all refs/outs passed to that method as arguments were safe to return.</strong><br />
Specifically, it is irrelevant whether the receiver is safe to return, regardless of whether the receiver is a struct, a class or typed as a generic type parameter.</p>
</li>
</ol>
<hr />
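<p>A rough illustration of rules 1, 2 and 4, using made-up types <code class="highlighter-rouge">Pair</code> and <code class="highlighter-rouge">Node</code>. This is only a sketch of how I read the rules, not an exhaustive test of them:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

// Hypothetical types, used only for this illustration.
struct Pair { public int A; public int B; }
class Node { public Pair Data; }

public static class Rules
{
    // Rule 1: array elements and fields of class instances live on the heap.
    static ref int Element(int[] a) { return ref a[0]; }
    static ref Pair DataOf(Node n) { return ref n.Data; }

    // Rule 2: a ref parameter refers to storage owned by the caller.
    static ref Pair Same(ref Pair p) { return ref p; }

    // Rule 4: a struct field is as safe to return as its receiver.
    static ref int SecondOfParameter(ref Pair p) { return ref p.B; }  // receiver is a ref parameter
    static ref int SecondOfNode(Node n) { return ref n.Data.B; }      // receiver is a heap field

    // Rejected: the receiver is a local.
    // static ref int SecondOfLocal() { Pair p = new Pair(); return ref p.B; }

    public static int Demo()
    {
        Node n = new Node();
        SecondOfNode(n) = 7;
        Pair local = new Pair();
        SecondOfParameter(ref local) = 3;
        return n.Data.B + local.B;   // 7 + 3
    }

    public static void Main()
    {
        Console.WriteLine(Demo());
    }
}
</code></pre></div></div>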
<p>The last two rules might look a bit curious. - What’s up with “this”?<br />
The special treatment for “this” was added to handle the following scenario:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">interface</span> <span class="nc">IIndexable</span><span class="p"><</span><span class="n">T</span><span class="p">></span>
<span class="p">{</span>
<span class="k">ref</span> <span class="n">T</span> <span class="k">this</span><span class="p">[</span><span class="kt">int</span> <span class="n">i</span><span class="p">]</span> <span class="p">{</span> <span class="k">get</span><span class="p">;</span> <span class="p">}</span>
<span class="p">}</span>
<span class="k">ref</span> <span class="kt">int</span> <span class="n">First</span><span class="p"><</span><span class="n">T</span><span class="p">>(</span><span class="n">T</span> <span class="n">arg</span><span class="p">)</span> <span class="k">where</span> <span class="n">T</span><span class="p">:</span> <span class="n">IIndexable</span><span class="p"><</span><span class="kt">int</span><span class="p">></span>
<span class="p">{</span>
<span class="c1">// is this safe to return by reference?</span>
<span class="k">return</span> <span class="k">ref</span> <span class="n">arg</span><span class="p">[</span><span class="m">0</span><span class="p">];</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The problem is that “this” is passed by reference to struct members and by value to class members. If we consider “this” in struct members the same as other parameters for the purpose of rule #6, we would have a problem here since we do not know whether T is a struct or a class. Treating type T conservatively as “can be a struct” would diminish the usefulness of ref returns when used with generics, so another approach was chosen: “this” is completely ignored at the call site for the purpose of the “safe to return” rules, as we may not even know whether we are dealing with a struct or a class. To make that safe in cases when we do get a struct, rule #5 was added. Inside a member it is always known whether the containing type is a struct or a class, so the safety can be enforced there.</p>
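<p>Here is how the two halves of that arrangement might look in a non-generic setting. <code class="highlighter-rouge">HeapCounter</code>, <code class="highlighter-rouge">InlineCounter</code> and <code class="highlighter-rouge">SlotOf</code> are hypothetical names. The struct member that would violate rule #5 is commented out:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

class HeapCounter
{
    int value;

    // OK: in a class, 'this' is a reference to a heap object.
    public ref int Value() { return ref value; }
}

struct InlineCounter
{
    int value;
    int[] slots;

    public InlineCounter(int[] slots) { this.value = 0; this.slots = slots; }

    // Rejected (rule #5): 'this' may refer to a local of the caller.
    // public ref int Value() { return ref value; }

    // OK: an array element is on the heap no matter where 'this' lives.
    public ref int Slot() { return ref slots[0]; }
}

public static class ThisRule
{
    // The receiver 'c' is a by-value parameter, in effect a local of this
    // method. It is ignored at the call site, which is safe only because
    // struct members cannot return refs to their own fields.
    static ref int SlotOf(InlineCounter c) { return ref c.Slot(); }

    public static int Demo()
    {
        int[] slots = { 0 };
        SlotOf(new InlineCounter(slots)) = 9;
        new HeapCounter().Value() = 1;
        return slots[0];
    }

    public static void Main()
    {
        Console.WriteLine(Demo());
    }
}
</code></pre></div></div>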
<p><strong>Verifiability.</strong><br />
There is a little issue with ref returns concerning verifiability. Generally, ECMA 335 specifies ref returns as not verifiable. Some JITs are less strict and allow ref returning of heap variables (to accommodate some patterns used by managed C++). That relaxed behavior is still stricter than the “safe to return” rules, and some examples involving ref returns would only work in scenarios that do not involve formal verification.</p>
<p>It is, however, believed that a system in agreement with the “safe to return” rules is actually typesafe, and there are plans to add a corresponding relaxation to the verification rules in the current JITs and tools like PEVerify.</p>
<p>It is conceivable that ECMA 335 will be modified or get an implementation-specific amendment for such relaxation at some point as well.</p>Vladimir Sadovhttp://mustoverride.comFor the reasons explained in the earlier post, C# disallows returning local variables by reference. While the principle of “Cannot return local variables by reference” seems very simple, there are many ways for a user to violate the principle directly or indirectly and enforcing it is an interesting challenge.Local variables cannot be returned by reference.2016-10-29T00:00:00+00:002016-10-29T00:00:00+00:00http://mustoverride.com/ref-returns-and-locals<p>Ability to return by reference introduces an interesting scenario.- What happens when a local variable is returned by reference? Is the variable still alive when its containing method has completed? What happens with the returned reference when callee is invoked again?</p>
<p>These are the questions that every language that allows byref returns needs to answer one way or another. The C# design team had to deal with these questions too.</p>
<p>Several options were considered:</p>
<p>– <strong>Allow returning locals by reference and leave the behavior unspecified.</strong><br />
That is how C++ handles locals returned by reference, although most C++ compilers would give a warning.</p>
<p>It is not a viable option for C#. The underlying mechanism for ref returns is <a href="/managed-refs-CLR/">managed pointers</a> and those are subject to GC tracking. Regular locals are typically implemented as slots on the stack, and subsequent calls will reuse those slots for their own local variables, which may have different types.<br />
It is extremely dangerous to have a managed reference of type T pointing to unspecified data that has nothing to do with T. If the GC attempts to trace through such a reference and follows what it believes to be the fields of a T instance, but which are in reality bits and pieces of some other type, it could easily result in heap corruption.</p>
<p>Here is an example of the type safety problem that would arise if actual stack slots could be returned by ref:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c1">// returns a ref to an Exception local</span>
<span class="k">ref</span> <span class="n">Exception</span> <span class="nf">RefEx</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">Exception</span> <span class="n">local</span> <span class="p">=</span> <span class="k">new</span> <span class="nf">Exception</span><span class="p">(</span><span class="s">"hi"</span><span class="p">);</span>
<span class="k">return</span> <span class="k">ref</span> <span class="n">local</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// returns a ref to an int local</span>
<span class="k">ref</span> <span class="kt">int</span> <span class="nf">RefInt</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">local</span> <span class="p">=</span> <span class="m">42</span><span class="p">;</span>
<span class="k">return</span> <span class="k">ref</span> <span class="n">local</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">void</span> <span class="nf">TakesTwoRefs</span><span class="p">(</span><span class="k">ref</span> <span class="n">Exception</span> <span class="n">s</span><span class="p">,</span> <span class="k">ref</span> <span class="kt">int</span> <span class="n">i</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">GC</span><span class="p">.</span><span class="nf">Collect</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">void</span> <span class="nf">WritesIntIntoEx</span><span class="p">()</span>
<span class="p">{</span>
<span class="c1">// RefEx will run in the same stack space as RefInt</span>
<span class="c1">// so it is likely that results of RefInt and RefEx</span>
<span class="c1">// would point to the same or overlapping memory location</span>
<span class="c1">// That is already bad by itself.</span>
<span class="c1">// What is worse is that RefInt writes "42" into that location</span>
<span class="c1">// If GC happens during the call, it may see something typed</span>
<span class="c1">// as "Exception" at completely bogus location.</span>
<span class="nf">TakesTwoRefs</span><span class="p">(</span><span class="k">ref</span> <span class="nf">RefEx</span><span class="p">(),</span> <span class="k">ref</span> <span class="nf">RefInt</span><span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>
<p>– <strong>Extend the life time of the local by allocating it on the heap.</strong> <br />
This is how Go handles this situation.</p>
<p>It would not be something entirely new for C#. The approach would be somewhat similar to capturing locals into closures. However, it was decided that in the context of ref returns this is not a good solution.</p>
<p><em>Firstly</em>, the extent (lifetime) of a local variable in C# matches its scope. Since the caller is running outside of the scope of the callee, from the caller’s point of view the locals of the callee <em>do not exist</em>. Note that lambdas that cause locals to be captured into closures do not leave the scope, while returning from the method certainly does leave the scope. It would be strange if the caller could get an alias to a local that does not exist, and perhaps even multiple aliases to multiple incarnations of such a local if the caller makes a ref returning call more than once.<br />
That was not the major point, though. I am sure with some effort such behavior could be rationalized and accepted, if necessary.</p>
<p><em>Secondly</em>, and more importantly, the whole idea of introducing ref returns was motivated by performance-sensitive scenarios where it allows avoiding redundant copying. Enabling the feature via automatic capturing of locals into heap-allocated display classes would defeat the purpose.</p>
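<p>For comparison, this is roughly what capturing a local into a closure looks like today. The delegate type <code class="highlighter-rouge">Next</code> and the method <code class="highlighter-rouge">MakeCounter</code> are made up for the example. How exactly the compiler lowers the capture is an implementation detail, but each call ends up with its own heap-allocated incarnation of the captured local:</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code>using System;

public static class Closures
{
    // Hypothetical delegate type for the example.
    delegate int Next();

    static Next MakeCounter()
    {
        int count = 0;            // captured below, so it is hoisted into a heap object
        return () => ++count;     // the delegate keeps that object alive
    }

    public static int Demo()
    {
        Next a = MakeCounter();
        Next b = MakeCounter();   // a separate incarnation of 'count'
        a(); a();
        b();
        return a() * 10 + b();    // a is at 3, b is at 2
    }

    public static void Main()
    {
        Console.WriteLine(Demo());
    }
}
</code></pre></div></div>
<p>Every call to <code class="highlighter-rouge">MakeCounter</code> pays for an allocation, which is exactly the kind of cost ref returns were meant to avoid.</p>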
<p>– <strong>Disallow returning local variables by reference.</strong><br />
This is the solution that was chosen for C#. To guarantee that a reference does not outlive the referenced variable, C# does not allow returning local variables by reference.<br />
Interestingly, this is the same approach used by Rust, although for slightly different reasons. (Rust is a RAII language and actively destroys locals when exiting scopes)</p>Vladimir Sadovhttp://mustoverride.comAbility to return by reference introduces an interesting scenario.- What happens when a local variable is returned by reference? Is the variable still alive when its containing method has completed? What happens with the returned reference when callee is invoked again?