HashSet Initialisation Speed in C#

Which one is faster?

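// Option 1: pass the data to the constructor
var hs = new HashSet<int>(_data);

// Option 2: empty set, add elements one by one
var hs = new HashSet<int>();
foreach (int i in _data) {
    hs.Add(i);
}

// Option 3: set with initial capacity, add elements one by one
var hs = new HashSet<int>(_data.Length);
foreach (int i in _data) {
    hs.Add(i);
}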

My first thought was that option 1 would definitely be faster - I'm passing the entire dataset into the HashSet constructor, so .NET should be able to size and fill the set efficiently. Testing it:

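| Method            | Length  | Mean     | Error     | StdDev   | Min      | Max      | Median   | Gen0     | Gen1     | Gen2     | Allocated |
|-------------------|---------|----------|-----------|----------|----------|----------|----------|----------|----------|----------|-----------|
| WithConstructor   | 1000000 | 52.07 ms | 44.886 ms | 2.460 ms | 49.47 ms | 54.37 ms | 52.36 ms | 454.5455 | 454.5455 | 454.5455 | 17.74 MB  |
| WithAddNoLength   | 1000000 | 53.45 ms | 5.679 ms  | 0.311 ms | 53.10 ms | 53.70 ms | 53.54 ms | 900.0000 | 900.0000 | 900.0000 | 41.12 MB  |
| WithAddWithLength | 1000000 | 37.92 ms | 15.129 ms | 0.829 ms | 37.01 ms | 38.64 ms | 38.10 ms | 500.0000 | 500.0000 | 500.0000 | 17.74 MB  |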

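which means option #3 is the fastest: about 38 ms, against 52-53 ms for the other two! Passing the data length into the constructor makes sure the set never has to reallocate as it grows, which shows in the Allocated column (17.74 MB, against 41.12 MB for option 2). Option 1 allocates the same 17.74 MB, so the difference there is not memory: adding elements one by one simply fills the set quicker than the constructor does. This was a short run and the error reported for option 1 (44.886 ms) is large, so the gap between options 1 and 3 is indicative rather than exact.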
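The reallocation effect can be observed directly. The following is a minimal sketch, not part of the benchmark: it assumes .NET Core 2.1 or later, where `HashSet<T>.EnsureCapacity` returns the current capacity, and it uses a hypothetical helper, `CountResizes`, to count how often that capacity changes while elements are added.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Hypothetical helper (not from the benchmark): adds 0..count-1 to the set
    // and counts how many times the reported capacity changes along the way.
    public static int CountResizes(HashSet<int> set, int count)
    {
        int resizes = 0;
        // EnsureCapacity(0) never grows the set; it only reports the current capacity.
        int capacity = set.EnsureCapacity(0);
        for (int i = 0; i < count; i++)
        {
            set.Add(i);
            int current = set.EnsureCapacity(0);
            if (current != capacity)
            {
                resizes++;
                capacity = current;
            }
        }
        return resizes;
    }

    public static void Main()
    {
        const int n = 1_000_000;
        Console.WriteLine($"No initial capacity: {CountResizes(new HashSet<int>(), n)} resizes");
        Console.WriteLine($"Initial capacity {n}: {CountResizes(new HashSet<int>(n), n)} resizes");
    }
}
```

With an initial capacity at least as large as the number of elements, the count should stay at zero; without it, the set has to grow several times, and each growth allocates new internal arrays.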

Benchmark Code

#LINQPad optimize+

void Main()
{
    Util.AutoScrollResults = true;
    BenchmarkRunner.Run<Enumeration>();
}

[ShortRunJob]
[MinColumn, MaxColumn, MeanColumn, MedianColumn]
[MemoryDiagnoser]
[MarkdownExporter]
public class Enumeration
{
    [Params(1000000)]
    public int Length;

    private int[] _data;
    private static Random random = new Random();

    // Fill the input array with random integers once, before the benchmarks run.
    [GlobalSetup]
    public void Setup()
    {
        _data = Enumerable.Range(0, Length).Select(i => random.Next()).ToArray();
    }

    // Option 1: pass the data to the constructor.
    [Benchmark]
    public void WithConstructor()
    {
        var hs = new HashSet<int>(_data);
    }

    // Option 2: empty set, add elements one by one.
    [Benchmark]
    public void WithAddNoLength()
    {
        var hs = new HashSet<int>();
        foreach (int i in _data)
        {
            hs.Add(i);
        }
    }

    // Option 3: set with initial capacity, add elements one by one.
    [Benchmark]
    public void WithAddWithLength()
    {
        var hs = new HashSet<int>(_data.Length);
        foreach (int i in _data)
        {
            hs.Add(i);
        }
    }
}
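Before comparing timings, it may be worth confirming that the three approaches actually build the same set. This is a minimal sketch rather than part of the benchmark: the class and method names are illustrative, and the fixed seed (42) and the smaller data size are assumptions made for repeatability.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EquivalenceCheck
{
    // Builds the set in the three ways compared in the benchmark
    // and reports whether all three hold the same elements.
    public static bool AllEqual(int[] data)
    {
        var viaConstructor = new HashSet<int>(data);

        var viaAdd = new HashSet<int>();
        foreach (int i in data) viaAdd.Add(i);

        var viaAddWithCapacity = new HashSet<int>(data.Length);
        foreach (int i in data) viaAddWithCapacity.Add(i);

        // SetEquals ignores order and duplicates, which is what matters for a set.
        return viaConstructor.SetEquals(viaAdd)
            && viaConstructor.SetEquals(viaAddWithCapacity);
    }

    public static void Main()
    {
        var random = new Random(42); // fixed seed is an assumption, for repeatability
        int[] data = Enumerable.Range(0, 100_000).Select(_ => random.Next()).ToArray();
        Console.WriteLine(AllEqual(data));
    }
}
```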

