Commit bee3c06

Enable lustre driver and add the relevant documentation
1 parent 1a1cd15 commit bee3c06

File tree: 9 files changed (+339 −9 lines)

Makefile

Lines changed: 1 addition & 1 deletion

```diff
@@ -38,7 +38,7 @@ else
 VERSION ?= ${VERSION}
 endif

-RELEASE = v1.32.0
+RELEASE = v1.32.1

 GOOS ?= linux
 ARCH ?= amd64
```

README.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -40,6 +40,7 @@ cloud-provider specific code out of the Kubernetes codebase.
 | v1.31.0 | v1.31 | - |
 | v1.31.1 | v1.31 | - |
 | v1.32.0 | v1.32 | - |
+| v1.32.1 | v1.32 | - |


 Note:
```

container-storage-interface.md

Lines changed: 4 additions & 0 deletions

````diff
@@ -128,6 +128,10 @@ Check if PVC is now in bound state:
 $ kubectl describe pvc/oci-bv-claim
 ```

+## PVCs with Lustre File System
+
+Provisioning PVCs on the file storage with lustre service can be found [here](docs/pvcs-with-lustre.md)
+
 # Troubleshoot

 ## FsGroup policy not propagated from pod security context
````

docs/pvcs-with-lustre-using-csi.md (new file)

Lines changed: 250 additions & 0 deletions
# Provisioning PVCs on the File Storage with Lustre Service

The Oracle Cloud Infrastructure File Storage with Lustre service is a fully managed storage service designed to meet the demands of AI/ML training and inference, and of high performance computing. You use the Lustre CSI plugin to connect clusters to file systems in the File Storage with Lustre service.

You can use the File Storage with Lustre service to provision persistent volume claims (PVCs): manually create a file system in the service, define and create a persistent volume (PV) backed by the new file system, and then define a new PVC. When you create the PVC, Kubernetes binds it to the PV backed by the File Storage with Lustre service.

The Lustre CSI driver is the overall software that enables Lustre file systems to be used with Kubernetes via the Container Storage Interface (CSI). The Lustre CSI plugin is a specific component within the driver, responsible for interacting with the Kubernetes API server and managing the lifecycle of Lustre volumes.
Note the following:

- The Lustre CSI driver is supported on Kubernetes version 1.29 or later.
- The Lustre CSI driver is supported on Oracle Linux 8 x86 and on Ubuntu 22.04 x86.
- To use a Lustre file system with a Kubernetes cluster, the Lustre client package must be installed on every worker node that mounts the file system (see the sketch after this list). For more information about Lustre clients, see [Mounting and Accessing a Lustre File System](/iaas/Content/lustre/file-system-connect.htm).
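A minimal check along these lines, run on a worker node, confirms that the Lustre client is in place before you create any PVs. It is only a sketch: it assumes shell access to the node and that the client was installed from your distribution's Lustre client packages (package names vary by OS).

```bash
# Run on each worker node that will mount the Lustre file system.

# Is the Lustre client kernel module available and loaded?
lsmod | grep -i lustre

# Load it if necessary (requires the Lustre client packages to be installed).
sudo modprobe lustre

# Report the installed Lustre client version.
lctl --version
```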
## Provisioning a PVC on an Existing File System

To create a PVC on an existing file system in the File Storage with Lustre service (using Oracle-managed encryption keys to encrypt data at rest):

1. Create a file system in the File Storage with Lustre service, selecting the Encrypt using Oracle-managed keys encryption option. See [Creating a Lustre File System](/iaas/Content/lustre/file-system-create.htm).

2. Create security rules in either a network security group (recommended) or a security list, for both the Lustre file system and the cluster's worker nodes subnet. The security rules to create depend on the relative network locations of the Lustre file system and the worker nodes, which act as the Lustre clients. The possible scenarios, the security rules to create, and where to create them are fully described in the File Storage with Lustre service documentation (see [Required VCN Security Rules](/iaas/Content/lustre/security-rules.htm)).
3. Create a PV backed by the file system in the File Storage with Lustre service, as follows:

   a. Create a manifest file to define a PV and, in the `csi:` section, set:

   - `driver` to `lustre.csi.oraclecloud.com`
   - `volumeHandle` to `<MGSAddress>@<LNetName>:/<MountName>`, where:
     - `<MGSAddress>` is the Management service address for the file system in the File Storage with Lustre service
     - `<LNetName>` is the LNet network name for the file system in the File Storage with Lustre service
     - `<MountName>` is the mount name used when creating the file system in the File Storage with Lustre service

     For example: `10.0.2.6@tcp:/testlustrefs`

   - `fsType` to `lustre`
   - (optional, but recommended) `volumeAttributes.setupLnet` to `"true"` if you want the Lustre CSI driver to perform LNet (Lustre Networking) setup before mounting the file system
   - (optional) `volumeAttributes.lustreSubnetCidr` to the CIDR block of the subnet where the worker node's secondary VNIC is located, to ensure the worker node has network connectivity to the Lustre file system. For example, `10.0.2.0/24`.

     **Note:** Do not specify `volumeAttributes.lustreSubnetCidr` if you are using the worker node's default interface (the primary VNIC) to connect to the Lustre file system.

   - (optional) `volumeAttributes.lustrePostMountParameters` to set Lustre parameters. For example:

     ```yaml
     volumeAttributes:
       lustrePostMountParameters: '[{"*.*.*MDT*.lru_size": 11200},{"at_history" : 600}]'
     ```
   For example, the following manifest file (named `lustre-pv-example.yaml`) defines a PV called `lustre-pv-example` backed by a Lustre file system:

   ```yaml
   apiVersion: v1
   kind: PersistentVolume
   metadata:
     name: lustre-pv-example
   spec:
     capacity:
       storage: 31Ti
     volumeMode: Filesystem
     accessModes:
       - ReadWriteMany
     persistentVolumeReclaimPolicy: Retain
     csi:
       driver: lustre.csi.oraclecloud.com
       volumeHandle: "10.0.2.6@tcp:/testlustrefs"
       fsType: lustre
       volumeAttributes:
         setupLnet: "true"
   ```
   b. Create the PV from the manifest file by entering:

   ```bash
   kubectl apply -f <filename>
   ```

   For example:

   ```bash
   kubectl apply -f lustre-pv-example.yaml
   ```

   c. Verify that the PV has been created successfully by entering:

   ```bash
   kubectl get pv <pv-name>
   ```

   For example:

   ```bash
   kubectl get pv lustre-pv-example
   ```

   Example output:

   ```
   NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM   STORAGECLASS   REASON   AGE
   lustre-pv-example   31Ti       RWX            Retain           Bound                                    56m
   ```
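   If you want to double-check the `csi` attributes before creating the PVC, describing the PV prints the `driver`, `volumeHandle`, and `volumeAttributes` exactly as Kubernetes recorded them. A minimal sketch, assuming the PV name from the example manifest above:

   ```bash
   # Inspect the CSI source recorded on the PV (driver, volumeHandle,
   # volumeAttributes such as setupLnet) before binding a PVC to it.
   kubectl describe pv lustre-pv-example
   ```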
4. Create a PVC that is provisioned by the PV you have created, as follows:

   a. Create a manifest file to define the PVC and set:

   - `storageClassName` to `""`

     **Note:** You must specify an empty value for `storageClassName`, even though storage class is not applicable in the case of static provisioning of persistent storage. If you do not specify an empty value for `storageClassName`, the default storage class (`oci-bv`) is used, which causes an error.

   - `volumeName` to the name of the PV you created (for example, `lustre-pv-example`)

   For example, the following manifest file (named `lustre-pvc-example.yaml`) defines a PVC named `lustre-pvc-example` that will bind to a PV named `lustre-pv-example`:

   ```yaml
   apiVersion: v1
   kind: PersistentVolumeClaim
   metadata:
     name: lustre-pvc-example
   spec:
     accessModes:
       - ReadWriteMany
     storageClassName: ""
     volumeName: lustre-pv-example
     resources:
       requests:
         storage: 31Ti
   ```

   **Note:** The `requests: storage:` element must be present in the PVC's manifest file, and its value must match the value specified for the `capacity: storage:` element in the PV's manifest file. Apart from that, the value of the `requests: storage:` element is ignored.
   b. Create the PVC from the manifest file by entering:

   ```bash
   kubectl apply -f <filename>
   ```

   For example:

   ```bash
   kubectl apply -f lustre-pvc-example.yaml
   ```

   c. Verify that the PVC has been created and bound to the PV successfully by entering:

   ```bash
   kubectl get pvc <pvc-name>
   ```

   For example:

   ```bash
   kubectl get pvc lustre-pvc-example
   ```

   Example output:

   ```
   NAME                 STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS   AGE
   lustre-pvc-example   Bound    lustre-pv-example   31Ti       RWX                           57m
   ```

   The PVC is bound to the PV backed by the File Storage with Lustre service file system. Data is encrypted at rest, using encryption keys managed by Oracle.
5. Use the new PVC when creating other objects, such as deployments. For example:

   a. Create a manifest named `lustre-app-example-deployment.yaml` to define a deployment named `lustre-app-example-deployment` that uses the `lustre-pvc-example` PVC, as follows:

   ```yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: lustre-app-example-deployment
   spec:
     selector:
       matchLabels:
         app: lustre-app-example
     replicas: 2
     template:
       metadata:
         labels:
           app: lustre-app-example
       spec:
         containers:
           - args:
               - -c
               - while true; do echo $(date -u) >> /lustre/data/out.txt; sleep 60; done
             command:
               - /bin/sh
             image: busybox:latest
             imagePullPolicy: Always
             name: lustre-app-example
             volumeMounts:
               - mountPath: /lustre/data
                 name: lustre-volume
         restartPolicy: Always
         volumes:
           - name: lustre-volume
             persistentVolumeClaim:
               claimName: lustre-pvc-example
   ```

   b. Create the deployment from the manifest file by entering:

   ```bash
   kubectl apply -f lustre-app-example-deployment.yaml
   ```

   c. Verify that the deployment pods have been created successfully and are running by entering:

   ```bash
   kubectl get pods
   ```

   Example output:

   ```
   NAME                                             READY   STATUS    RESTARTS   AGE
   lustre-app-example-deployment-7767fdff86-nd75n   1/1     Running   0          8h
   lustre-app-example-deployment-7767fdff86-wmxlh   1/1     Running   0          8h
   ```
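   To confirm that the pods really are writing to the Lustre-backed volume, a quick check such as the following reads back the file that the example container appends to; it assumes the deployment name and mount path from the example above.

   ```bash
   # Read the last few timestamps written by the example container; kubectl
   # exec against deploy/<name> picks one of the deployment's pods.
   kubectl exec deploy/lustre-app-example-deployment -- tail -n 3 /lustre/data/out.txt
   ```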
## Provisioning a PVC on an Existing File System with Mount Options

You can optimize performance and control access to an existing Lustre file system by specifying mount options for the PV, fine-tuning how pods interact with the file system.

To include mount options:

1. Start by following the instructions in [Provisioning a PVC on an Existing File System](#provisioning-a-pvc-on-an-existing-file-system).

2. In the PV manifest described in [Provisioning a PVC on an Existing File System](#provisioning-a-pvc-on-an-existing-file-system), add the `spec.mountOptions` field, which enables you to specify how the PV should be mounted by pods.

   For example, in the `lustre-pv-example.yaml` manifest file shown in [Provisioning a PVC on an Existing File System](#provisioning-a-pvc-on-an-existing-file-system), you can include the `mountOptions` field as follows:

   ```yaml
   apiVersion: v1
   kind: PersistentVolume
   metadata:
     name: lustre-pv-example
   spec:
     capacity:
       storage: 31Ti
     volumeMode: Filesystem
     accessModes:
       - ReadWriteMany
     persistentVolumeReclaimPolicy: Retain
     mountOptions:
       - ro
     csi:
       driver: lustre.csi.oraclecloud.com
       volumeHandle: "10.0.2.6@tcp:/testlustrefs"
       fsType: lustre
       volumeAttributes:
         setupLnet: "true"
   ```

In this example, the `mountOptions` field is set to `ro`, indicating that pods are to have read-only access to the file system. For more information about PV mount options, see [Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) in the Kubernetes documentation.
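One way to see the `ro` option take effect, assuming a pod mounts the PVC as in the deployment example earlier, is to attempt a write from inside the pod and expect it to be rejected:

```bash
# With mountOptions: ["ro"] on the PV, writes from the pod should fail.
kubectl exec deploy/lustre-app-example-deployment -- sh -c \
  'touch /lustre/data/write-test 2>/dev/null && echo "unexpectedly writable" || echo "read-only as expected"'
```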
## Encrypting Data At Rest on an Existing File System

The File Storage with Lustre service always encrypts data at rest, using Oracle-managed encryption keys by default. However, you have the option to specify at-rest encryption using your own master encryption keys that you manage yourself in the Vault service.

For more information about creating File Storage with Lustre file systems that use Oracle-managed encryption keys or your own master encryption keys, see [Updating File System Encryption](/iaas/Content/lustre/file-system-encryption.htm).

manifests/container-storage-interface/csi/templates/oci-csi-controller-driver.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -96,7 +96,7 @@ spec:
 - --fss-csi-endpoint=unix://var/run/shared-tmpfs/csi-fss.sock
 command:
 - /usr/local/bin/oci-csi-controller-driver
-image: ghcr.io/oracle/cloud-provider-oci:v1.32.0
+image: ghcr.io/oracle/cloud-provider-oci:v1.32.1
 imagePullPolicy: IfNotPresent
 env:
 - name: BLOCK_VOLUME_DRIVER_NAME
```

manifests/container-storage-interface/csi/templates/oci-csi-node-driver.yaml

Lines changed: 40 additions & 1 deletion

```diff
@@ -13,6 +13,15 @@ metadata:
 spec:
 fsGroupPolicy: File
 ---
+apiVersion: storage.k8s.io/v1
+kind: CSIDriver
+metadata:
+name: {{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com
+spec:
+attachRequired: false
+podInfoOnMount: false
+fsGroupPolicy: File
+---
 kind: ConfigMap
 apiVersion: v1
 metadata:
@@ -107,6 +116,7 @@ spec:
 - --nodeid=$(KUBE_NODE_NAME)
 - --loglevel=debug
 - --fss-endpoint=unix:///fss/csi.sock
+- --lustre-endpoint=unix:///lustre/csi.sock
 command:
 - /usr/local/bin/oci-csi-node-driver
 env:
@@ -117,11 +127,15 @@ spec:
 fieldPath: spec.nodeName
 - name: PATH
 value: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/host/usr/bin:/host/sbin
+- name: LUSTRE_DRIVER_ENABLED
+value: "true"
 - name: BLOCK_VOLUME_DRIVER_NAME
 value: "{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}blockvolume.csi.oraclecloud.com"
 - name: FSS_VOLUME_DRIVER_NAME
 value: "{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}fss.csi.oraclecloud.com"
-image: ghcr.io/oracle/cloud-provider-oci:v1.32.0
+- name: LUSTRE_VOLUME_DRIVER_NAME
+value: "{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com"
+image: ghcr.io/uneet7/cloud-provider-oci:v1.32.1
 securityContext:
 privileged: true
 volumeMounts:
@@ -152,6 +166,8 @@ spec:
 - mountPath: /sbin/mount
 name: fss-driver-mounts
 subPath: mount
+- mountPath: /lustre
+name: lustre-plugin-dir
 - name: csi-node-registrar
 args:
 - --csi-address=/csi/csi.sock
@@ -190,6 +206,25 @@ spec:
 name: fss-plugin-dir
 - mountPath: /registration
 name: registration-dir
+- name: csi-node-registrar-lustre
+args:
+- --csi-address=/lustre/csi.sock
+- --kubelet-registration-path=/var/lib/kubelet/plugins/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com/csi.sock
+image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.12.0
+securityContext:
+privileged: true
+lifecycle:
+preStop:
+exec:
+command:
+- /bin/sh
+- -c
+- rm -rf /registration/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com /registration/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com-reg.sock
+volumeMounts:
+- mountPath: /lustre
+name: lustre-plugin-dir
+- mountPath: /registration
+name: registration-dir
 dnsPolicy: ClusterFirst
 hostNetwork: true
 restartPolicy: Always
@@ -212,6 +247,10 @@ spec:
 path: /var/lib/kubelet/plugins/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}fss.csi.oraclecloud.com
 type: DirectoryOrCreate
 name: fss-plugin-dir
+- hostPath:
+path: /var/lib/kubelet/plugins/{{ if .Values.customHandle }}{{ .Values.customHandle }}.{{ end }}lustre.csi.oraclecloud.com
+type: DirectoryOrCreate
+name: lustre-plugin-dir
 - hostPath:
 path: /var/lib/kubelet
 type: Directory
```
manifests/container-storage-interface/oci-csi-controller-driver.yaml

Lines changed: 1 addition & 1 deletion

```diff
@@ -96,7 +96,7 @@ spec:
 - --fss-csi-endpoint=unix://var/run/shared-tmpfs/csi-fss.sock
 command:
 - /usr/local/bin/oci-csi-controller-driver
-image: ghcr.io/oracle/cloud-provider-oci:v1.32.0
+image: ghcr.io/oracle/cloud-provider-oci:v1.32.1
 imagePullPolicy: IfNotPresent
 volumeMounts:
 - name: config
```

0 commit comments