HashSet in .NET 9

What is a HashSet<T>?

In .NET, HashSet<T> is a collection that implements an unordered set of unique elements. It was introduced in .NET Framework 3.5, lives in the System.Collections.Generic namespace, and has kept receiving performance work in later .NET releases. Conceptually, it uses a hash-based structure to store elements efficiently and guarantee uniqueness, much as a dictionary uses hashes for its keys.

In everyday programming, it’s commonly used when we need a collection of unique elements. Its key benefits are:

  • Performance: Because it is backed by a hash table, common operations such as Contains, Add, and Remove run in O(1) time on average (unlike searching a List<T>, which is O(n) because the list has to scan its elements one by one).
  • Uniqueness: Perfect for data collections where duplicates must be avoided without manual checks. Duplicate entries are automatically discarded.
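Both points can be seen in a minimal sketch. The variable names are only illustrative; the calls used (Add, Contains, Count) are the standard HashSet<T> and List<T> members. Add reports through its return value whether the element was actually inserted.

```csharp
using System;
using System.Collections.Generic;

var names = new HashSet<string> { "Alice", "Bob" };

// Add returns true when the element was inserted, false when it was already present.
bool addedNew = names.Add("Giorgio");      // true
bool addedDuplicate = names.Add("Alice");  // false: the set is unchanged

// Contains is a hash lookup on the set, but a linear scan on the list.
var nameList = new List<string> { "Alice", "Bob", "Giorgio" };
bool inSet = names.Contains("Bob");
bool inList = nameList.Contains("Bob");

Console.WriteLine($"{addedNew} {addedDuplicate} {names.Count} {inSet} {inList}");
// Prints: True False 3 True True
```

Both lookups return the same answer; the difference only shows up in the running time as the collection grows.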

Practical Examples

Creating a HashSet<T>

Here are different ways to instantiate a HashSet<T>: passing an existing list to the constructor or adding elements incrementally.

C#
var stringList = new List<string> { "Alice", "Bob", "Giorgio" };
var hashSet1 = new HashSet<string>(stringList);

var intList = new List<int> { 1, 2, 3, 4, 5 };
var hashSet2 = new HashSet<int>(intList);

var hashSet3 = new HashSet<int>();
hashSet3.Add(1);
hashSet3.Add(2);
hashSet3.Add(3);
hashSet3.Add(4);

Merging two HashSet<T> instances (UnionWith)

C#
var hashSet1 = new HashSet<int>(new List<int> { 1, 2, 3, 4, 5 });
var hashSet2 = new HashSet<int>(new List<int> { 5, 6, 7, 8, 9 });

hashSet1.UnionWith(hashSet2);

foreach (var item in hashSet1)
{
    Console.WriteLine(item);
}

/*
Output:

1
2
3
4
5
6
7
8
9

*/
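The output listed above appears in ascending order, but HashSet<T> does not guarantee any enumeration order, so it is safer not to rely on it. The sketch below assumes the order matters for display and sorts explicitly with LINQ's OrderBy. It also shows that UnionWith accepts any IEnumerable<T>, not only another set. The variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var merged = new HashSet<int> { 1, 2, 3, 4, 5 };

// UnionWith takes any IEnumerable<int>, so an array works without wrapping it in a set.
merged.UnionWith(new[] { 9, 8, 7, 6, 5 });

// Sort explicitly when the output order matters.
var sorted = merged.OrderBy(n => n).ToList();
Console.WriteLine(string.Join(", ", sorted));
// Prints: 1, 2, 3, 4, 5, 6, 7, 8, 9
```

If the elements must always be kept in order, SortedSet<T> may be a better fit, at the cost of O(log n) operations.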

Intersection of two HashSets

Finding common elements

C#
var hashSet1 = new HashSet<int>(new List<int> { 1, 2, 3, 4, 5 });
var hashSet2 = new HashSet<int>(new List<int> { 3, 4, 5, 6, 7 });

hashSet1.IntersectWith(hashSet2);

foreach (var item in hashSet1)
{
    Console.WriteLine(item);
}

/*
Output:

3
4
5

*/

Difference (ExceptWith)

Removing from a HashSet<T> every element that is also present in another HashSet<T>. ExceptWith modifies the set it is called on and does not return a new one:

C#
var hashSet1 = new HashSet<int>(new List<int> { 1, 2, 3, 4, 5 });
var hashSet2 = new HashSet<int>(new List<int> { 3, 4, 5, 6, 7 });

hashSet1.ExceptWith(hashSet2);

foreach (var item in hashSet1)
{
    Console.WriteLine(item);
}

/*
Output:

1
2

*/
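Because ExceptWith changes the set it is called on, one possible way to keep the original intact is to copy it first and subtract from the copy. The sketch below does that and also shows IsSubsetOf, which is the actual subset test and leaves both sets unchanged. The variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var first = new HashSet<int> { 1, 2, 3, 4, 5 };
var second = new HashSet<int> { 3, 4, 5, 6, 7 };

// Copy first, then subtract, so that 'first' is left untouched.
var difference = new HashSet<int>(first);
difference.ExceptWith(second);

// A subset test is a separate operation and does not modify either set.
bool isSubset = difference.IsSubsetOf(first);    // true: {1, 2} is contained in 'first'
bool secondIsSubset = second.IsSubsetOf(first);  // false: 6 and 7 are missing from 'first'

Console.WriteLine(string.Join(", ", difference.OrderBy(n => n)));
Console.WriteLine($"{first.Count} {isSubset} {secondIsSubset}");
// Prints:
// 1, 2
// 5 True False
```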

Removing duplicates

When a HashSet<T> is initialized with a list containing duplicate strings, the duplicates are removed automatically. Note that “Bob” and “bob” are considered distinct, because the default string comparison is case-sensitive.

C#
var list = new List<string> { "Alice", "Bob", "Alice", "Alice", "Giorgio", "bob" };
var hashSet = new HashSet<string>(list);

foreach (var item in hashSet)
{
    Console.WriteLine(item);
}

/*
Output:

Alice
Bob
Giorgio
bob

*/
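If “Bob” and “bob” should count as the same entry, a comparer can be passed to the constructor. This is a minimal sketch assuming ordinal, case-insensitive matching is what you want; StringComparer.OrdinalIgnoreCase is the standard comparer for that, and the variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;

var list = new List<string> { "Alice", "Bob", "Alice", "Alice", "Giorgio", "bob" };

// The comparer decides what counts as a duplicate; here casing is ignored.
var caseInsensitive = new HashSet<string>(list, StringComparer.OrdinalIgnoreCase);

Console.WriteLine(caseInsensitive.Count);            // 3
Console.WriteLine(caseInsensitive.Contains("BOB"));  // True
```

The same comparer is then used for every later Add and Contains call on that set.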

Conclusion

HashSet<T> is a powerful, high-performance, and versatile tool for managing collections of unique data in .NET. Its features and support for set operations simplify complex scenarios while improving performance.
