Description
The expected behavior is that when forceUnique is set to false, the resulting collection contains exactly count strings, duplicates included.
Currently, when the target framework is newer than .NET Framework 2.0, the strings are generated and added to a HashSet. A HashSet silently discards duplicate entries, so the result always contains only unique items, whether or not the caller asked for that.
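It is easy to reproduce with the line of code below. Assuming Types.NUMBERS draws from the ten digits 0 to 9, only 1000 distinct three-character strings exist, so the call cannot return the 2000 strings requested: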
List<string> randomStrings = RandomString.GetStrings(Types.NUMBERS, count: 2000, maxLength: 3, randomLength: false, forceUnique: false);
See:
RandomString4Net/RandomString4Net/RandomString.cs
Lines 188 to 205 in f01e288
    HashSet<string> randomStrings = new HashSet<string>();
    #endif
    int inputStringLength = inputString.Length;
    int outputStringLength;
    for (int i = 0; i < count; i++)
    {
        outputStringLength = randomLength ? randomInstance.Next(1, maxLength) : maxLength;
        StringBuilder currentRandomString = new StringBuilder();
        for (int j = 0; j < outputStringLength; j++)
            currentRandomString.Append(inputString[randomInstance.Next(inputStringLength)]);
        if (forceUnique && randomStrings.Contains(currentRandomString.ToString()))
            continue;
        randomStrings.Add(currentRandomString.ToString());
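To make the effect concrete, here is a minimal, self-contained sketch that does not use the library at all. The class and method names (HashSetDuplicateDemo, Fill) are hypothetical and exist only for illustration. It fills a List and a HashSet with the same sequence of random digit strings, so the only difference between the two results is how each collection treats duplicates.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of RandomString4Net.
public static class HashSetDuplicateDemo
{
    // Adds `count` random digit strings of the given length to `target`
    // and returns how many items the collection holds afterwards.
    public static int Fill(ICollection<string> target, int count, int length, int seed)
    {
        Random random = new Random(seed);
        for (int i = 0; i < count; i++)
        {
            char[] buffer = new char[length];
            for (int j = 0; j < length; j++)
                buffer[j] = (char)('0' + random.Next(10));

            // For a List this always appends; for a HashSet a duplicate
            // value is silently ignored and the count does not grow.
            target.Add(new string(buffer));
        }
        return target.Count;
    }
}
```

Calling Fill(new List<string>(), 2000, 3, 42) returns 2000, while Fill(new HashSet<string>(), 2000, 3, 42) can never return more than 1000, because only 1000 distinct three-digit strings exist.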
Proposed solutions:
Instead of using a HashSet, just stick with List:
- If the target is .NET Framework 2.0, use a combination of BinarySearch and Insert to keep the list in sorted order and ensure that only unique items are inserted.
- For other framework targets, use Enumerable.Distinct() when returning the collection if forceUnique is true.
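A rough sketch of both List-based options follows. The class and method names (ListBasedSketch, InsertIfAbsent, Finish) are hypothetical and are not taken from the library; only List<T>.BinarySearch, List<T>.Insert and Enumerable.Distinct are real framework calls. Treat it as an illustration of the idea rather than a drop-in patch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not part of RandomString4Net.
public static class ListBasedSketch
{
    // Option for .NET Framework 2.0: keep the list sorted so that
    // BinarySearch can detect duplicates without a HashSet.
    // Returns true if the value was inserted, false if it already existed.
    public static bool InsertIfAbsent(List<string> sorted, string value)
    {
        int index = sorted.BinarySearch(value, StringComparer.Ordinal);
        if (index >= 0)
            return false; // already present

        // A negative result is the bitwise complement of the index
        // at which the value must be inserted to keep the list sorted.
        sorted.Insert(~index, value);
        return true;
    }

    // Option for newer targets: generate into a plain List and
    // deduplicate only when the caller asked for unique strings.
    public static List<string> Finish(List<string> generated, bool forceUnique)
    {
        return forceUnique ? generated.Distinct().ToList() : generated;
    }
}
```

Two consequences are worth noting. The BinarySearch variant returns the strings in sorted order rather than in generation order, and the Distinct variant can return fewer than count items when duplicates were generated.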
A microbenchmark would help to judge the performance of the latter option. It might be better to keep both the List and the HashSet, checking the HashSet before inserting into the List, although that costs extra memory because every string is referenced from both collections.
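The List plus HashSet combination could look roughly like the sketch below. The names (HybridSketch, Generate) and the signature are assumptions made for illustration and do not match the library's API. The loop deliberately mirrors the existing one: it makes count attempts and skips duplicates when forceUnique is true, so a unique result may still be shorter than count.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not part of RandomString4Net.
public static class HybridSketch
{
    public static List<string> Generate(string alphabet, int count, int length,
                                        bool forceUnique, Random random)
    {
        List<string> results = new List<string>(count);

        // The HashSet is allocated only when uniqueness is requested,
        // so the non-unique path pays no extra memory.
        HashSet<string> seen = forceUnique ? new HashSet<string>() : null;

        for (int i = 0; i < count; i++)
        {
            char[] buffer = new char[length];
            for (int j = 0; j < length; j++)
                buffer[j] = alphabet[random.Next(alphabet.Length)];
            string candidate = new string(buffer);

            // HashSet.Add returns false when the value is already present.
            if (seen != null && !seen.Add(candidate))
                continue;

            results.Add(candidate);
        }
        return results;
    }
}
```

With forceUnique set to false, Generate("0123456789", 2000, 3, false, new Random()) returns exactly 2000 strings. A Stopwatch-based comparison of this variant against the Distinct() variant would be one way to do the benchmark mentioned above.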