@ginkelsoft-development (Owner)

This feature adds a configurable minimum length for prefix-based searches, so that very short prefixes no longer produce overly broad matches.

Changes:

  • Added 'min_prefix_length' configuration option (default: 3)
  • Updated Tokens::prefixes() to accept minLength parameter
  • Modified HasEncryptedSearchIndex to enforce minimum length during:
    • Token generation (indexing)
    • Query execution (searching)
  • Added comprehensive test coverage (10 new feature tests, 6 unit tests)
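As a rough illustration of the indexing side, prefix generation with a lower bound might look like the sketch below. `sketchPrefixes()` is a hypothetical stand-in, not the actual `Tokens::prefixes()` signature; the `$maxLength` parameter and the clamping of `$minLength` to at least 1 are assumptions made for the example.

```php
<?php
/**
 * Hypothetical stand-in for Tokens::prefixes(); the real signature may differ.
 * Returns every prefix of $value whose length lies between $minLength and
 * $maxLength (both inclusive), counted in UTF-8 characters.
 */
function sketchPrefixes(string $value, int $maxLength, int $minLength = 1): array
{
    // Split into UTF-8 characters so multibyte input is not cut mid-character.
    $chars = preg_split('//u', $value, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $minLength = max(1, $minLength);             // assumption: never below 1
    $upper = min(count($chars), $maxLength);

    $prefixes = [];
    for ($i = $minLength; $i <= $upper; $i++) {
        $prefixes[] = implode('', array_slice($chars, 0, $i));
    }

    return $prefixes;
}
```

With a minimum of 3, a value shorter than three characters yields no prefix tokens at all, which is consistent with short terms finding nothing at query time.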

Behavior:

  • With min_prefix_length=3 (default):
    • Searching for "Wi" (2 chars) returns no results
    • Searching for "Wil" (3 chars) or any longer term works normally
  • Prevents performance issues from one- and two-character prefix searches
  • Reduces false positives from very short search terms
  • Exact search is unaffected by this setting
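The query-side behavior described above can be sketched as a length guard in front of the prefix lookup. This is a minimal illustration only: `sketchPrefixSearch()` is a hypothetical helper that matches against plaintext names, whereas the package matches against its search index, and it requires PHP 8 for `str_starts_with()`.

```php
<?php
/**
 * Hypothetical illustration of the query-side guard; not the package's API.
 * Terms shorter than $minLength return no results instead of a broad match.
 */
function sketchPrefixSearch(array $names, string $term, int $minLength = 3): array
{
    // Count UTF-8 characters rather than bytes.
    $termLength = count(preg_split('//u', $term, -1, PREG_SPLIT_NO_EMPTY) ?: []);

    if ($termLength < $minLength) {
        return []; // term too short: skip the lookup entirely
    }

    // Plaintext stand-in for the real index lookup.
    return array_values(array_filter(
        $names,
        fn (string $name): bool => str_starts_with($name, $term)
    ));
}

$names = ['William', 'Wendy', 'Walter', 'Wilma'];
$tooShort = sketchPrefixSearch($names, 'Wi');     // empty array
$matches  = sketchPrefixSearch($names, 'Wil');    // William and Wilma
$legacy   = sketchPrefixSearch($names, 'W', 1);   // all four names
```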

Benefits:

  • Eliminates unwanted matches (e.g., "W" matching "William", "Wendy", "Walter")
  • Improves search precision
  • Backwards compatibility is available on request: set min_prefix_length to 1 to restore the old behavior
  • Configurable per environment via ENCRYPTED_SEARCH_MIN_PREFIX
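For the per-environment setting, the resolution of the value might look roughly like the sketch below. Only the variable name `ENCRYPTED_SEARCH_MIN_PREFIX` and the default of 3 come from this description; `sketchMinPrefixLength()` is a hypothetical helper that reads the process environment directly, and clamping to a minimum of 1 is an assumption. The package itself reads the value through its `min_prefix_length` configuration option.

```php
<?php
/**
 * Hypothetical helper: resolve the minimum prefix length from the environment.
 * Falls back to 3 (the documented default) when the variable is unset or empty.
 */
function sketchMinPrefixLength(): int
{
    $raw = getenv('ENCRYPTED_SEARCH_MIN_PREFIX');

    $value = ($raw === false || $raw === '') ? 3 : (int) $raw;

    return max(1, $value); // assumption: values below 1 are treated as 1
}
```

Setting `ENCRYPTED_SEARCH_MIN_PREFIX=1` in one environment would then restore the old behavior there while other environments keep the default.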

Test updates:

  • Updated existing tests to use min_prefix_length=1 for compatibility
  • Added MinimumPrefixLengthTest with 10 comprehensive scenarios
  • Added 6 unit tests for Tokens class minimum length behavior
  • All 76 tests passing (136 assertions)

@ginkelsoft-development merged commit 09e8e92 into develop on Oct 13, 2025
4 of 8 checks passed
ginkelsoft-development added a commit that referenced this pull request on Oct 14, 2025
…efix-length

add configurable minimum prefix length for search queries