Commit af99f51

Merge pull request #3892 from raspberrypi/lurch-patch-1
Small typos
2 parents a395576 + 87f36f1 commit af99f51

1 file changed (+7, -7 lines)

documentation/asciidoc/accessories/ai-camera/details.adoc

Lines changed: 7 additions & 7 deletions
@@ -9,15 +9,15 @@ image::images/imx500-comparison.svg[Traditional versus IMX500 AI camera systems]
 
 The left side demonstrates the architecture of a traditional AI camera system. In such a system, the camera delivers images to the Raspberry Pi. The Raspberry Pi processes the images and then performs AI inference. Traditional systems may use external AI accelerators (as shown) or rely exclusively on the CPU.
 
-The right side demonstrates the architecture of a system that uses IMX500. The camera module contains a small Image Signal Processor (ISP) which turns the raw camera image data into an **input tensor**. The camera module sends this tensor directly into the AI accelerator within the camera, which produces an **output tensor** that contains the inferencing results. The AI accelerator sends this tensor to the Raspberry Pi. There is no need for an external accelerator, nor for the Raspberry Pi to run neural network software on the CPU.
+The right side demonstrates the architecture of a system that uses IMX500. The camera module contains a small Image Signal Processor (ISP) which turns the raw camera image data into an **input tensor**. The camera module sends this tensor directly into the AI accelerator within the camera, which produces **output tensors** that contain the inferencing results. The AI accelerator sends these tensors to the Raspberry Pi. There is no need for an external accelerator, nor for the Raspberry Pi to run neural network software on the CPU.
 
 To fully understand this system, familiarise yourself with the following concepts:
 
 Input Tensor:: The part of the sensor image passed to the AI engine for inferencing. Produced by a small on-board ISP which also crops and scales the camera image to the dimensions expected by the neural network that has been loaded. The input tensor is not normally made available to applications, though it is possible to access it for debugging purposes.
 
 Region of Interest (ROI):: Specifies exactly which part of the sensor image is cropped out before being rescaled to the size demanded by the neural network. Can be queried and set by an application. The units used are always pixels in the full resolution sensor output. The default ROI setting uses the full image received from the sensor, cropping no data.
 
-Output Tensor:: The results of inferencing performed by the neural network. The precise number and shape of the outputs depend on the neural network. Application code must understand how to handle the tensor.
+Output Tensors:: The results of inferencing performed by the neural network. The precise number and shape of the outputs depend on the neural network. Application code must understand how to handle the tensors.
 
 === System architecture
 
@@ -43,13 +43,13 @@ Once `libcamera` dequeues the image and inference data buffers from the kernel,
 | Description
 
 | `CnnOutputTensor`
-| Floating point array storing the output tensor.
+| Floating point array storing the output tensors.
 
 | `CnnInputTensor`
 | Floating point array storing the input tensor.
 
 | `CnnOutputTensorInfo`
-| Network specific parameters describing the output tensors structure:
+| Network specific parameters describing the output tensors' structure:
 
 [source,c]
 ----
@@ -67,7 +67,7 @@ struct CnnOutputTensorInfo {
 ----
 
 | `CnnInputTensorInfo`
-| Network specific parameters describing the input tensors structure:
+| Network specific parameters describing the input tensor's structure:
 
 [source,c]
 ----
@@ -204,7 +204,7 @@ def draw_detections(request, detections, stream="main"):
             cv2.rectangle(m.array, (b.x, b.y), (b.x + b.width, b.y + b.height), (255, 0, 0, 0))
 
 def parse_detections(request, stream='main'):
-    """Parse the output tensor into a number of detected objects, scaled to the ISP out."""
+    """Parse the output tensor into a number of detected objects, scaled to the ISP output."""
     outputs = imx500.get_outputs(request.get_metadata())
     boxes, scores, classes = outputs[0][0], outputs[1][0], outputs[2][0]
     detections = [ Detection(box, category, score, metadata)
@@ -245,7 +245,7 @@ There are a number of scaling/cropping/translation operations occurring from the
 | Returns the input tensor size based on the neural network model used.
 
 | `IMX500.get_outputs(metadata)`
-| Returns the output tensors from the Picamera2 image metadata metadata.
+| Returns the output tensors from the Picamera2 image metadata.
 
 | `IMX500.get_output_shapes(metadata)`
 | Returns the shape of the output tensors from the Picamera2 image metadata for the neural network model used.
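
The switch from "output tensor" to "output tensors" matters because a network can return several arrays that application code has to unpack itself. The following is a minimal sketch of that unpacking, not the documented implementation. It assumes the three-tensor layout (boxes, scores, class indices) implied by the `parse_detections` context lines in this diff. The `Detection` tuple, `filter_detections`, the `threshold` default, and the sample values are all hypothetical, and plain lists stand in for whatever `imx500.get_outputs()` actually returns.

```python
from typing import List, NamedTuple, Sequence


class Detection(NamedTuple):
    """Hypothetical container; not the Detection class used in the documentation."""
    box: Sequence[float]
    category: int
    score: float


def filter_detections(outputs, threshold: float = 0.5) -> List[Detection]:
    """Keep only detections whose score exceeds the threshold.

    Assumes outputs[0][0] holds boxes, outputs[1][0] scores and
    outputs[2][0] class indices. Other networks may use other layouts.
    """
    boxes, scores, classes = outputs[0][0], outputs[1][0], outputs[2][0]
    return [
        Detection(box=list(box), category=int(category), score=float(score))
        for box, score, category in zip(boxes, scores, classes)
        if score > threshold
    ]


# Made-up tensors standing in for the result of imx500.get_outputs(metadata).
fake_outputs = [
    [[[0.1, 0.1, 0.5, 0.5], [0.2, 0.3, 0.4, 0.6]]],  # boxes
    [[0.9, 0.2]],                                     # scores
    [[1.0, 3.0]],                                     # class indices
]
detections = filter_detections(fake_outputs, threshold=0.5)
```

With the sample data only the first candidate passes the threshold. A different network may produce a different number of tensors with different shapes, so the indexing above would need to change accordingly.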
