What is a HashSet<T>?
In .NET, a HashSet<T> is a collection that implements an unordered set of unique elements. It was introduced in .NET Framework 3.5 and lives in the System.Collections.Generic namespace alongside the other generic collections. Internally, it uses a hash-based structure to store elements efficiently and enforce uniqueness, much as a Dictionary<TKey, TValue> hashes its keys.
In everyday programming, it’s commonly used when we need a collection of unique elements. Its key benefits are:
- Performance: Thanks to the underlying hash table, common operations like Contains have an average time complexity of O(1) (unlike searching through a List<T>, which has a complexity of O(n) because the list has to scan its elements one by one).
- Uniqueness: Perfect for data collections where duplicates must be avoided without manual checks. Duplicate entries are automatically discarded.
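Both benefits can be seen in a few lines. The following is a minimal sketch (the variable names are only illustrative) that relies on two documented behaviors: Add returns false when the element is already present, and Contains performs a hash lookup rather than a scan.

```csharp
using System;
using System.Collections.Generic;

var names = new HashSet<string>();

// Add returns true when the element was inserted, false when it was already present.
bool addedFirst = names.Add("Alice");
bool addedAgain = names.Add("Alice"); // the duplicate is silently discarded

// Contains is a hash lookup, so it does not walk the whole collection.
bool hasAlice = names.Contains("Alice");
bool hasBob = names.Contains("Bob");

Console.WriteLine($"{addedFirst} {addedAgain} {hasAlice} {hasBob} {names.Count}");
// Prints: True False True False 1
```

Checking the return value of Add is a handy way to detect duplicates without a separate Contains call.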
Practical Examples
Creating a HashSet<T>
Here are different ways to instantiate a HashSet<T>: passing an existing list to the constructor or adding elements incrementally.
var stringList = new List<string> { "Alice", "Bob", "Giorgio" };
var hashSet1 = new HashSet<string>(stringList);
var intList = new List<int> { 1, 2, 3, 4, 5 };
var hashSet2 = new HashSet<int>(intList);
var hashSet3 = new HashSet<int>();
hashSet3.Add(1);
hashSet3.Add(2);
hashSet3.Add(3);
hashSet3.Add(4);
Merging two HashSet<T> collections (UnionWith)
var hashSet1 = new HashSet<int>(new List<int> { 1, 2, 3, 4, 5 });
var hashSet2 = new HashSet<int>(new List<int> { 5, 6, 7, 8, 9 });
hashSet1.UnionWith(hashSet2);
foreach (var item in hashSet1)
{
Console.WriteLine(item);
}
/*
Output:
1
2
3
4
5
6
7
8
9
*/
Intersection of two HashSets
Finding common elements
var hashSet1 = new HashSet<int>(new List<int> { 1, 2, 3, 4, 5 });
var hashSet2 = new HashSet<int>(new List<int> { 3, 4, 5, 6, 7 });
hashSet1.IntersectWith(hashSet2);
foreach (var item in hashSet1)
{
Console.WriteLine(item);
}
/*
Output:
3
4
5
*/
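Set difference (ExceptWith)
Removing from a HashSet<T> the elements that are also present in another HashSet<T>. Note that ExceptWith modifies the set it is called on and does not return a new one: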
var hashSet1 = new HashSet<int>(new List<int> { 1, 2, 3, 4, 5 });
var hashSet2 = new HashSet<int>(new List<int> { 3, 4, 5, 6, 7 });
hashSet1.ExceptWith(hashSet2);
foreach (var item in hashSet1)
{
Console.WriteLine(item);
}
/*
Output:
1
2
*/
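UnionWith, IntersectWith, and ExceptWith all modify the set they are called on. If the original set must stay intact, one option is to copy it first and run the operation on the copy. This is a small sketch of that approach; the names original, other, and difference are just for illustration.

```csharp
using System;
using System.Collections.Generic;

var original = new HashSet<int> { 1, 2, 3, 4, 5 };
var other = new HashSet<int> { 3, 4, 5, 6, 7 };

// The copy constructor creates an independent set with the same elements.
var difference = new HashSet<int>(original);
difference.ExceptWith(other); // only the copy is modified

Console.WriteLine(original.Count);                       // 5
Console.WriteLine(difference.Count);                     // 2
Console.WriteLine(difference.SetEquals(new[] { 1, 2 })); // True
```

The copy costs an extra O(n) pass, so it is only worth it when the original set is still needed afterwards.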
Removing duplicates
When a HashSet<T> is initialized with a list containing duplicate strings, duplicates are automatically removed. Note that “Bob” and “bob” are considered distinct, because the default string comparison is case-sensitive.
var list = new List<string> { "Alice", "Bob", "Alice", "Alice", "Giorgio", "bob" };
var hashSet = new HashSet<string>(list);
foreach (var item in hashSet)
{
Console.WriteLine(item);
}
/*
Output:
Alice
Bob
Giorgio
bob
*/
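If “Bob” and “bob” should count as the same entry, the constructor also accepts an equality comparer. The sketch below assumes that ordinal, case-insensitive matching is what we want and passes StringComparer.OrdinalIgnoreCase; a culture-aware comparer may be a better fit in other scenarios.

```csharp
using System;
using System.Collections.Generic;

var list = new List<string> { "Alice", "Bob", "Alice", "Alice", "Giorgio", "bob" };

// The comparer decides what "equal" means for this set.
var ignoreCase = new HashSet<string>(list, StringComparer.OrdinalIgnoreCase);

Console.WriteLine(ignoreCase.Count);             // 3
Console.WriteLine(ignoreCase.Contains("ALICE")); // True
```

The same comparer is used for later Add and Contains calls, so lookups are case-insensitive as well.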
Conclusion
HashSet<T> is a powerful, high-performance, and versatile tool for managing collections of unique data in .NET. Its features and support for set operations simplify complex scenarios while improving performance.