
Commit 1b95951

feat: add beta support for Mongo Atlas Search (Resolves #96) (#104)
1 parent c125189 commit 1b95951

5 files changed: +1212 -6 lines changed

README.md

Lines changed: 168 additions & 0 deletions
@@ -582,6 +582,174 @@ The Elasticsearch Query Builder has a couple of family of methods that can be ov
In Mongo and Postgres there is a near 1-to-1 translation between an AST node and a query. In Elasticsearch the mapping is not 1-to-1, because visiting a nested field requires [Nested Queries](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html).
If you need to override behaviour pertaining to a nested field, the `Get____QueryBuilder()` functions are probably where the override should happen; otherwise `Visit____()` might be simpler.

#### MongoDB Atlas Search (Beta)

The following example shows how to generate a MongoDB Atlas Search query with this library.

**Note**: MongoDB Atlas Search support is currently in beta. Some operators & types are not yet implemented.

```go
package example

import (
	"context"

	// Explicit aliases make the package names used below unambiguous.
	epsearchast "github.com/elasticpath/epcc-search-ast-helper"
	astmongo "github.com/elasticpath/epcc-search-ast-helper/mongo"
	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
)

func Example(ast *epsearchast.AstNode, collection *mongo.Collection, tenantBoundaryId string) (*mongo.Cursor, error) {
	// Not Shown: Validation

	// Create Atlas Search query builder
	// Configure multi-analyzers for fields that support LIKE/ILIKE
	var qb epsearchast.SemanticReducer[bson.D] = astmongo.DefaultAtlasSearchQueryBuilder{
		FieldToMultiAnalyzers: map[string]*astmongo.StringMultiAnalyzers{
			"name": {
				WildcardCaseInsensitive: "caseInsensitiveAnalyzer",
				WildcardCaseSensitive:   "caseSensitiveAnalyzer",
			},
			"email": {
				WildcardCaseInsensitive: "caseInsensitiveAnalyzer",
				WildcardCaseSensitive:   "caseSensitiveAnalyzer",
			},
		},
	}

	// Create AST Query Object
	astQuery, err := epsearchast.SemanticReduceAst(ast, qb)

	if err != nil {
		return nil, err
	}

	// Build the Atlas Search query with compound must clause
	// - astQuery contains the user's search filter (from AST)
	// - equals clauses ensure results are scoped to the tenant boundary
	searchQuery := bson.D{
		{"compound",
			bson.D{
				{"must", bson.A{
					astQuery,
					bson.D{
						{"equals", bson.D{
							{"path", "tenant_boundary_id"},
							{"value", tenantBoundaryId},
						}},
					},
				}},
			},
		},
	}

	// Execute the search using aggregation pipeline
	pipeline := mongo.Pipeline{
		{{Key: "$search", Value: searchQuery}},
	}

	return collection.Aggregate(context.TODO(), pipeline)
}
```

##### Supported Operators

The following operators are currently supported:
- `text` - Full-text search with analyzers
- `eq` - Exact case-sensitive equality matching (string fields only)
- `in` - Multiple value exact matching (string fields only)
- `like` - Case-sensitive wildcard matching
- `ilike` - Case-insensitive wildcard matching
- `gt` - Greater than (lexicographic comparison for strings)
- `ge` - Greater than or equal (lexicographic comparison for strings)
- `lt` - Less than (lexicographic comparison for strings)
- `le` - Less than or equal (lexicographic comparison for strings)
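
Because the range operators compare strings lexicographically, numeric-looking values can sort in a surprising order. The following is a small stand-alone sketch of that effect in plain Go; it does not call this library or Atlas Search, and the helper names `lexGE` and `numGE` are made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// lexGE keeps the values that are >= bound under string (lexicographic)
// comparison, which is how a string-typed range comparison behaves.
func lexGE(values []string, bound string) []string {
	var out []string
	for _, v := range values {
		if v >= bound {
			out = append(out, v)
		}
	}
	return out
}

// numGE keeps the values that are >= bound when parsed as integers.
// Values that do not parse as integers are skipped.
func numGE(values []string, bound int) []string {
	var out []string
	for _, v := range values {
		if n, err := strconv.Atoi(v); err == nil && n >= bound {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	values := []string{"9", "10", "100", "25"}

	// "9" and "25" pass the string comparison because '9' and '2' sort after '1'.
	fmt.Println(lexGE(values, "100")) // [9 100 25]

	// The numeric comparison keeps only 100.
	fmt.Println(numGE(values, 100)) // [100]
}
```

A `ge` filter on a string field holding numbers would behave like `lexGE` here, not like `numGE`.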

##### Field Configuration

###### Multi-Analyzer Configuration for LIKE/ILIKE

To support `like` and `ilike` operators with proper case sensitivity handling, you need to:

1. **Define custom analyzers in your search index** with appropriate tokenization and case handling
2. **Configure multi-analyzers on your string fields** to index the same field with different analyzers
3. **Map fields to analyzer names** in the query builder using `FieldToMultiAnalyzers`

**Example Search Index Definition:**

```json
{
  "analyzers": [
    {
      "name": "caseInsensitiveAnalyzer",
      "tokenizer": {
        "type": "keyword"
      },
      "tokenFilters": [
        {
          "type": "lowercase"
        }
      ]
    }
  ],
  "mappings": {
    "dynamic": false,
    "fields": {
      "name": [
        {
          "type": "string",
          "analyzer": "lucene.standard",
          "multi": {
            "caseInsensitiveAnalyzer": {
              "type": "string",
              "analyzer": "caseInsensitiveAnalyzer"
            },
            "caseSensitiveAnalyzer": {
              "type": "string",
              "analyzer": "lucene.keyword"
            }
          }
        },
        {
          "type": "token"
        }
      ]
    }
  }
}
```
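
The names under `multi` in the index definition have to match the names given to the query builder. As a rough sketch of how that could be checked before deploying, the snippet below parses an index definition and reports whether a field declares a given multi analyzer. It uses only the standard library; the types `indexDef` and `fieldDef` and the functions `parseIndex` and `hasMulti` are hypothetical helpers, not part of this library, and the sketch assumes each field is defined as an array of definitions as in the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fieldDef models only the parts of a field definition this check needs.
type fieldDef struct {
	Type  string                     `json:"type"`
	Multi map[string]json.RawMessage `json:"multi"`
}

// indexDef models only the "mappings.fields" part of an index definition.
// Assumption: every field is an array of definitions.
type indexDef struct {
	Mappings struct {
		Fields map[string][]fieldDef `json:"fields"`
	} `json:"mappings"`
}

// parseIndex decodes an index definition from JSON.
func parseIndex(s string) (indexDef, error) {
	var def indexDef
	err := json.Unmarshal([]byte(s), &def)
	return def, err
}

// hasMulti reports whether the index declares a multi analyzer called
// name on the given field.
func hasMulti(def indexDef, field, name string) bool {
	for _, fd := range def.Mappings.Fields[field] {
		if _, ok := fd.Multi[name]; ok {
			return true
		}
	}
	return false
}

// sampleIndex is a trimmed version of the index definition shown above.
const sampleIndex = `{"mappings":{"dynamic":false,"fields":{"name":[
  {"type":"string","multi":{
    "caseInsensitiveAnalyzer":{"type":"string"},
    "caseSensitiveAnalyzer":{"type":"string"}}},
  {"type":"token"}]}}}`

func main() {
	def, err := parseIndex(sampleIndex)
	if err != nil {
		panic(err)
	}
	// Field/analyzer pairs as they would appear in FieldToMultiAnalyzers.
	checks := []struct{ field, analyzer string }{
		{"name", "caseInsensitiveAnalyzer"},
		{"name", "caseSensitiveAnalyzer"},
		{"email", "caseInsensitiveAnalyzer"}, // not defined in sampleIndex
	}
	for _, c := range checks {
		fmt.Printf("%s / %s declared: %v\n", c.field, c.analyzer, hasMulti(def, c.field, c.analyzer))
	}
}
```

In this sample only `name` is mapped, so the `email` check reports `false`, which would flag a mismatch between the index and the query builder configuration.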

**Query Builder Configuration:**

The `FieldToMultiAnalyzers` map specifies which multi-analyzer to use for each field:

```go
FieldToMultiAnalyzers: map[string]*astmongo.StringMultiAnalyzers{
	"name": {
		WildcardCaseInsensitive: "caseInsensitiveAnalyzer", // Used for ILIKE
		WildcardCaseSensitive:   "caseSensitiveAnalyzer",   // Used for LIKE
	},
}
```

**Behavior:**
If a field has an entry in `FieldToMultiAnalyzers` with a non-empty analyzer name, a `multi` attribute with that name is generated in the path (e.g., `{"path": {"value": "fieldName", "multi": "analyzerName"}}`). If a field is **not** in `FieldToMultiAnalyzers`, no `multi` attribute is generated.

This allows you to mix fields with and without multi-analyzer support in the same index.
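
The path selection described above can be sketched as a small function. This is an illustration of the documented behavior, not the library's implementation: `multiAnalyzers` is a local stand-in for the library's configuration type, `searchPath` is a hypothetical helper, and treating an empty analyzer name the same as a missing entry is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// multiAnalyzers is a local stand-in for the per-field analyzer names
// configured in FieldToMultiAnalyzers. It is not the library's type.
type multiAnalyzers struct {
	WildcardCaseInsensitive string
	WildcardCaseSensitive   string
}

// searchPath returns the value to use for an Atlas Search "path":
// either the plain field name, or {"value": field, "multi": analyzer}
// when a non-empty analyzer name is configured for the field.
func searchPath(field string, caseSensitive bool, cfg map[string]*multiAnalyzers) any {
	m, ok := cfg[field]
	if !ok || m == nil {
		return field
	}
	name := m.WildcardCaseInsensitive
	if caseSensitive {
		name = m.WildcardCaseSensitive
	}
	if name == "" {
		// Assumption: an empty name means "no multi analyzer".
		return field
	}
	return map[string]string{"value": field, "multi": name}
}

// pathJSON renders a path as JSON for display.
func pathJSON(p any) string {
	b, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	cfg := map[string]*multiAnalyzers{
		"name": {
			WildcardCaseInsensitive: "caseInsensitiveAnalyzer",
			WildcardCaseSensitive:   "caseSensitiveAnalyzer",
		},
	}
	fmt.Println(pathJSON(searchPath("name", false, cfg)))   // ilike on a configured field
	fmt.Println(pathJSON(searchPath("name", true, cfg)))    // like on a configured field
	fmt.Println(pathJSON(searchPath("status", false, cfg))) // field without configuration
}
```

For a configured field the path is an object carrying the `multi` name; for an unconfigured field it is just the field name as a string.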

##### Limitations

1. The following operators are not yet implemented: `contains`, `contains_any`, `contains_all`, `is_null`
2. The following field types are not currently supported: UUID fields, Date fields, Numeric fields (numbers are compared as strings)
3. Range operators (`gt`, `ge`, `lt`, `le`) perform lexicographic comparison on string fields only
4. Atlas Search requires proper [search index configuration](https://www.mongodb.com/docs/atlas/atlas-search/create-index/) with appropriate field types:
   - String fields used with `like`/`ilike` should be indexed with multi-analyzers as shown above
   - String fields used with `eq`/`in` should be indexed with `token` type
   - String fields used with range operators (`gt`/`ge`/`lt`/`le`) work with `token` type for lexicographic comparison
   - Text fields should be indexed with `string` type and an appropriate analyzer
5. Unlike regular MongoDB queries, Atlas Search queries use the aggregation pipeline with the `$search` stage
6. Additional filters (like tenant boundaries) should be included within the `$search` stage using compound `must` clauses for optimal performance (as shown in the example above). Alternatively, they can be added as separate `$match` stages after the `$search` stage, though this is less efficient because it filters results after the search rather than during it
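
To make the difference in point 6 concrete, the sketch below builds both pipeline shapes as plain maps and prints them as JSON. It uses only the standard library rather than the MongoDB driver; the `doc` alias, both function names, and the hand-written `text` clause standing in for the query generated from the AST are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// doc is a shorthand for a JSON-like document.
type doc = map[string]any

// filterInsideSearch puts the tenant filter into the compound "must"
// clause, so it is applied as part of the $search stage.
func filterInsideSearch(userQuery doc, tenant string) []doc {
	return []doc{
		{"$search": doc{"compound": doc{"must": []doc{
			userQuery,
			{"equals": doc{"path": "tenant_boundary_id", "value": tenant}},
		}}}},
	}
}

// filterAfterSearch runs $search with only the user query and applies
// the tenant filter in a later $match stage.
func filterAfterSearch(userQuery doc, tenant string) []doc {
	return []doc{
		{"$search": userQuery},
		{"$match": doc{"tenant_boundary_id": tenant}},
	}
}

// show prints a pipeline as indented JSON.
func show(label string, pipeline []doc) {
	b, err := json.MarshalIndent(pipeline, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s:\n%s\n", label, b)
}

func main() {
	// Stand-in for the query produced from the AST.
	userQuery := doc{"text": doc{"query": "shoes", "path": "name"}}

	show("filter inside $search", filterInsideSearch(userQuery, "tenant-1"))
	show("filter after $search", filterAfterSearch(userQuery, "tenant-1"))
}
```

The first pipeline has a single stage; the second has two, and its `$match` stage only sees documents that `$search` has already returned.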

### FAQ

#### Design

docker-compose.yml

Lines changed: 158 additions & 0 deletions
@@ -1,3 +1,7 @@
volumes:
  mongo-community:
  mongo-community-search:

services:
  postgres:
    image: postgres:15.0-bullseye
@@ -17,6 +21,8 @@ services:
    ports:
      - '127.0.0.1:20002:27017'
    environment:
      MONGODB_INITDB_ROOT_USERNAME: admin
      MONGODB_INITDB_ROOT_PASSWORD: admin
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: admin
    healthcheck:
@@ -38,3 +44,155 @@ services:
      timeout: 5s
      retries: 5


  mongo-community:
    container_name: search-ast-helper-mongo-community
    hostname: mongo-community
    extra_hosts:
      # We override the hostname for mongo to point to localhost.
      # Because in the init mode mongo is only available to localhost.
      # But we need the replicaset to be configured with a name other containers can see.
      # https://github.com/docker-library/mongo/issues/339#issuecomment-2253159258
      - "mongo-community:127.0.0.1"
    #image: mongodb/mongodb-community-server:8.2.1-ubi8
    image: mongo:8.2.1
    command:
      - "--bind_ip_all"
      - "--replSet"
      - "rs0"
      - "--setParameter"
      - "mongotHost=mongo-community-search:27027"
      - "--setParameter"
      - "searchIndexManagementHostAndPort=mongo-community-search:27027"
      - "--setParameter"
      - "skipAuthenticationToSearchIndexManagementServer=false"
      - "--setParameter"
      - "useGrpcForSearch=true"
      - "--keyFile"
      - "/keyfile"
      - "--auth"
    ports:
      - '127.0.0.1:20004:27017'
    volumes:
      - mongo-community:/data/db:delegated
    configs:
      - source: mongo-rs-config.js
        target: /docker-entrypoint-initdb.d/mongo-rs-config.js
      - source: mongo-user-config.js
        target: /docker-entrypoint-initdb.d/mongo-user-config.js
      - source: keyfile
        target: /keyfile
        mode: 0400
        uid: "999"
        gid: "999"
    healthcheck:
      test: >
        mongosh --quiet "localhost/test" --eval 'quit(db.runCommand({ ping: 1 }).ok ? 0 : 2)'
      interval: 10s
      timeout: 5s
      retries: 10
      start_period: 40s

  mongo-community-search:
    container_name: search-ast-helper-mongo-community-search
    hostname: mongo-community-search
    image: mongodb/mongodb-community-search:0.55.0
    ports:
      - '127.0.0.1:20005:27027'
    volumes:
      - mongo-community-search:/data/mongot:delegated
    configs:
      - source: config.default.yml
        target: /mongot-community/config.default.yml
      - source: passwordFile
        target: /etc/mongot/secrets/passwordFile
        mode: 0400
        uid: "999"
        gid: "999"
    stop_grace_period: 1s
    depends_on:
      mongo-community:
        condition: service_healthy



configs:
  mongo-rs-config.js:
    # language=javascript
    content: |
      print("\n\n\nStarting Replica Set Configuration\n\n\n");
      rs.initiate({
        _id: "rs0",
        members: [
          { _id: 0, host: `mongo-community:27017` }
        ]
      });

      while (!rs.status().members.some(m => m.stateStr === "PRIMARY")) {
        print("Waiting for primary...");
        sleep(1000); // sleep 1 second
      }

      print("\n\n\nReplica Set Configuration Completed\n\n\n");

  mongo-user-config.js:
    # language=javascript
    content: |
      print("\n\n\nCreating Users\n\n\n");
      var db = db.getSiblingDB("admin");
      db.createUser(
        {
          user: "admin",
          pwd: "admin",
          mechanisms: ["SCRAM-SHA-256"],
          roles: [
            {
              role: "root",
              db: "admin"
            }
          ]
        },
        {
          w: "majority",
          wtimeout: 5000
        }
      );

      db.createUser(
        {
          user: "search",
          pwd: "search",
          mechanisms: ["SCRAM-SHA-256"],
          roles: [ "searchCoordinator" ]
        }
      );

      print("\n\n\nUsers created\n\n\n");
  keyfile:
    content: |
      helloworld
  passwordFile:
    content: "search"
  config.default.yml:
    # language=yaml
    content: |
      syncSource:
        replicaSet:
          hostAndPort: "mongo-community:27017"
          username: "search"
          passwordFile: "/etc/mongot/secrets/passwordFile"
          tls: false
      storage:
        dataPath: "/data/mongot"
      server:
        grpc:
          address: "0.0.0.0:27027"
          tls:
            mode: "disabled"
      metrics:
        enabled: true
        address: "0.0.0.0:9946"
      healthCheck:
        address: "0.0.0.0:8080"
      logging:
        verbosity: INFO
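
For local testing against the `mongo-community` service above, a client on the host needs a connection string for the published port `127.0.0.1:20004` and the `admin` user created in the `admin` database. The sketch below builds one with the standard library only. The function name `localMongoURI` is made up, and the `directConnection=true` option is an assumption: the replica set advertises its member as `mongo-community:27017`, a name the host machine may not be able to resolve, so connecting directly to the published port is one plausible way around that.

```go
package main

import (
	"fmt"
	"net/url"
)

// localMongoURI builds a MongoDB connection string for a single host.
// authSource=admin matches the database where the compose file creates
// its users; directConnection=true is an assumption (see the note above).
func localMongoURI(user, pass, hostPort string) string {
	q := url.Values{}
	q.Set("authSource", "admin")
	q.Set("directConnection", "true")

	u := url.URL{
		Scheme:   "mongodb",
		User:     url.UserPassword(user, pass),
		Host:     hostPort,
		Path:     "/",
		RawQuery: q.Encode(),
	}
	return u.String()
}

func main() {
	// Credentials and port taken from the compose file above.
	fmt.Println(localMongoURI("admin", "admin", "127.0.0.1:20004"))
}
```

The resulting string can be passed to whichever MongoDB client is used in tests.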
