Commit adf8bd0

[ai-form-recognizer] 5.0.0 changes (Azure#26711)
This PR makes several changes related to the impending 5.0.0 release of Document Intelligence (f.k.a. Form Recognizer).

- The major version revision signals a breaking change in which key-value pairs and extracted languages will not be returned unless a special `feature` is provided when making the analysis request.
- Support for explicit API version was removed. This functionality never worked very well in TypeScript, and the major version revision gives us an opportunity to remove it and avoid its inherent complexity until a better solution is available.

This also addresses some issues with the way that training data content sources were represented and named. The new representation is stricter about providing only one of the two possible source type fields: `azureBlobSource` or `azureBlobFileListSource`.

---------

Co-authored-by: Will Temple <will@wtemple.net>
1 parent 1177c61 commit adf8bd0

21 files changed: +201 -311 lines changed

sdk/formrecognizer/ai-form-recognizer/CHANGELOG.md

Lines changed: 18 additions & 6 deletions
@@ -1,27 +1,39 @@
 # Release History

-## 4.1.0 (Unreleased)
+## 5.0.0 (2023-08-08)

 ### Features Added

-- `AnalyzeDocumentOptions.features` allows three new features compared to the last beta version:
+- Updated the SDK to use the latest Generally Available (GA) version of the Form Recognizer REST API: `2023-07-31`.
+- `AnalyzeDocumentOptions.features` accepts three new features compared to the last beta version:
   - `barcodes`: enables the detection of barcodes in the document.
   - `keyValuePairs`: enable the detection of general key value pairs (form fields) in the document.
   - `languages`: enables the detection of the text content language.
-- `beginBuildDocumentModel` has a new overload that accepts a `DocumentModelContentSource` in place of a raw `containerUrl`. This allows training document models using the new Azure Blob file list source (that is already supported by document classifiers). The `DocumentModelContentSource` is an object that contains a `containerUrl` property, and if a `fileList` property is also provided it is interpreted as an Azure Blob file list source. Otherwise it is interpreted as an Azure Blob content source with an optional `prefix` property.
+- `beginBuildDocumentModel` has a new overload that accepts a `DocumentModelSource` in place of a raw `containerUrl`. This allows training document models using the new Blob File List source (that is already supported by document classifiers). Like with classifiers, the source inputs are specified as an object containing an `azureFileListSource` property or an `azureBlobSource` property containing the respective details of each source type.

 ### Breaking Changes

-- `DocumentAnalysisClient` and `DocumentModelAdministrationClient` now target service API version `2023-07-31` by default. Version `2023-02-28-preview` is not supported.
+From the last stable release (4.0.0):
+
+- Support for passing alternative API versions has been removed from the client. In practice, the client only supported
+  using a single API version, but types and options for specifying an API version were provided. In version 5.0.0, these options and their associated types were removed:
+  - The `apiVersion` option that was previously accepted by the `DocumentAnalysisClient` and `DocumentModelAdministrationClient` constructors was removed. This option previously only had one valid value in version 4.0.0, and supporting multiple API versions in a single package weakens the type constraints, so we have chosen to only support the latest Generally Available version of the service in this SDK package. Support for multiple API versions may be reintroduced in a future version.
+  - The `FormRecognizerApiVersion` type and enum were removed as they no longer serve any purpose.
+  - The type of `apiVersion` properties of result objects was changed from `FormRecognizerApiVersion` to `string`. This type is more accurate, as these fields reflect the API version used to create the model or start the analysis operation, and not necessarily an API version that the client instance is aware of.
+  - The `FormRecognizerCommonClientOptions` interface, which both `DocumentAnalysisClientOptions` and `DocumentModelAdministrationClientOptions` inherited from was removed, as it only carried the `apiVersion` option that no longer exists.
+- The `languages` and `keyValuePairs` properties of `AnalyzeResult` that were previously returned when using the `prebuilt-document` model are no longer returned unless the corresponding `features` are specified when making the analysis request.
+
+From the last beta release (4.1.0-beta.1):
+
 - `AnalyzeDocumentOptions.features` changed the following feature names:
   - `ocr.highResolution` renamed to `ocrHighResolution`.
   - `ocr.formula` renamed to `formulas`.
   - `ocr.font` renamed to `styleFont`.
-- The following fields have been removed
+- The following fields have been removed:
   - `AnalyzeDocumentOptions.queryFields`
   - `DocumentPage.kind` and `DocumentPage.images` (`DocumentPageKind` and `DocumentImage` types have been removed too.)
   - `DocumentKeyValuePair.commonName`
-- Changed how content sources are provided when creating document classifiers. The type of content source (`azureBlobContentSource` or `azureBlobFileListSource`) is no longer required in the content source input, and the type is now inferred automatically. If a `fileList` property is provided, it is interpreted as a file list source, and otherwise it is interpreted as a blob content source with optional `prefix`.
+- The type of the `docTypes` parameter of `beginBuildDocumentClassifier` was refined slightly. The type will no longer accept _both_ `azureBlobSource` and `azureFileListSource`

 ## 4.1.0-beta.1 (2023-04-11)
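
The feature changes in this changelog can be illustrated with a small, self-contained sketch. It does not import `@azure/ai-form-recognizer`; the helper names `migrateFeatureName` and `withLegacyDocumentFeatures` are hypothetical and introduced only for illustration. Only the feature strings themselves come from the changelog.

```typescript
// Hypothetical helper: maps a 4.1.0-beta.1 feature name to its 5.0.0 spelling,
// following the renames listed under "Breaking Changes". Other names pass through.
function migrateFeatureName(name: string): string {
  switch (name) {
    case "ocr.highResolution":
      return "ocrHighResolution";
    case "ocr.formula":
      return "formulas";
    case "ocr.font":
      return "styleFont";
    default:
      return name;
  }
}

// Hypothetical helper: per the changelog, 5.0.0 only returns key-value pairs and
// languages when the matching features are requested, so this adds both to a
// (migrated, de-duplicated) feature list.
function withLegacyDocumentFeatures(features: string[] = []): string[] {
  const result: string[] = [];
  const wanted = features.map(migrateFeatureName).concat(["keyValuePairs", "languages"]);
  for (const feature of wanted) {
    if (result.indexOf(feature) === -1) {
      result.push(feature);
    }
  }
  return result;
}

console.log(withLegacyDocumentFeatures(["ocr.font"]));
```

The resulting array is the kind of value that would be supplied as `features` in `AnalyzeDocumentOptions` when analyzing with `prebuilt-document`.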

sdk/formrecognizer/ai-form-recognizer/package.json

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
   "sdk-type": "client",
   "author": "Microsoft Corporation",
   "description": "An isomorphic client library for the Azure Form Recognizer service.",
-  "version": "4.1.0",
+  "version": "5.0.0",
   "keywords": [
     "node",
     "azure",
"azure",

sdk/formrecognizer/ai-form-recognizer/review/ai-form-recognizer.api.md

Lines changed: 38 additions & 38 deletions
@@ -49,7 +49,7 @@ export interface AnalyzedDocument {

 // @public
 export interface AnalyzeDocumentOptions<Result = AnalyzeResult<AnalyzedDocument>> extends OperationOptions, PollerOptions<DocumentAnalysisPollOperationState<Result>> {
-    features?: string[];
+    features?: FormRecognizerFeature[];
     locale?: string;
     pages?: string;
 }
@@ -67,7 +67,7 @@ export interface AnalyzeResult<Document = AnalyzedDocument> extends AnalyzeResul

 // @public
 export interface AnalyzeResultCommon {
-    apiVersion: FormRecognizerApiVersion;
+    apiVersion: string;
     content: string;
     modelId: string;
 }
@@ -76,17 +76,29 @@ export interface AnalyzeResultCommon {
 export type AnalyzeResultOperationStatus = "notStarted" | "running" | "failed" | "succeeded";

 // @public
-export interface AzureBlobContentSource {
-    containerUrl: string;
-    prefix?: string;
+export interface AzureBlobFileListSource {
+    azureBlobFileListSource: AzureBlobFileListSourceDetails;
+    azureBlobSource?: undefined;
 }

 // @public
-export interface AzureBlobFileListContentSource {
+export interface AzureBlobFileListSourceDetails {
     containerUrl: string;
     fileList: string;
 }

+// @public
+export interface AzureBlobSource {
+    azureBlobFileListSource?: undefined;
+    azureBlobSource: AzureBlobSourceDetails;
+}
+
+// @public
+export interface AzureBlobSourceDetails {
+    containerUrl: string;
+    prefix?: string;
+}
+
 export { AzureKeyCredential }

 // @public
@@ -113,8 +125,8 @@ export interface BoundingRegion extends HasBoundingPolygon {

 // @public
 export interface ClassifierDocumentTypeDetails {
-    azureBlobFileListSource?: AzureBlobFileListContentSource;
-    azureBlobSource?: AzureBlobContentSource;
+    azureBlobFileListSource?: AzureBlobFileListSourceDetails;
+    azureBlobSource?: AzureBlobSourceDetails;
 }

 // @public
@@ -181,7 +193,7 @@ export class DocumentAnalysisClient {
 }

 // @public
-export interface DocumentAnalysisClientOptions extends FormRecognizerCommonClientOptions {
+export interface DocumentAnalysisClientOptions extends CommonClientOptions {
     stringIndexType?: StringIndexType;
 }

@@ -237,9 +249,6 @@ export interface DocumentClassifierBuildOperationDetails extends OperationDetail
     result?: DocumentClassifierDetails;
 }

-// @public
-export type DocumentClassifierContentSource = AzureBlobContentSource | AzureBlobFileListContentSource;
-
 // @public
 export interface DocumentClassifierDetails {
     apiVersion: string;
@@ -252,13 +261,21 @@ export interface DocumentClassifierDetails {
     expiresOn?: Date;
 }

+// @public
+export interface DocumentClassifierDocumentTypeSources {
+    [docType: string]: DocumentClassifierSource;
+}
+
 // @public
 export interface DocumentClassifierOperationState extends PollOperationState<DocumentClassifierDetails>, ModelAdministrationOperationStateCommon {
 }

 // @public
 export type DocumentClassifierPoller = PollerLike<DocumentClassifierOperationState, DocumentClassifierDetails>;

+// @public
+export type DocumentClassifierSource = AzureBlobSource | AzureBlobFileListSource;
+
 // @public
 export interface DocumentCountryRegionField extends DocumentFieldCommon {
     kind: "countryRegion";
@@ -354,7 +371,7 @@ export interface DocumentLine extends HasBoundingPolygon {

 // @public
 export interface DocumentModel<Result> {
-    apiVersion?: FormRecognizerApiVersion;
+    apiVersion?: string;
     modelId: string;
     transformResult: (input: AnalyzeResult) => Result;
 }
@@ -364,11 +381,9 @@ export class DocumentModelAdministrationClient {
     constructor(endpoint: string, credential: TokenCredential, options?: DocumentModelAdministrationClientOptions);
     constructor(endpoint: string, credential: KeyCredential, options?: DocumentModelAdministrationClientOptions);
     constructor(endpoint: string, credential: KeyCredential | TokenCredential, options?: DocumentModelAdministrationClientOptions);
-    beginBuildDocumentClassifier(classifierId: string, docTypes: {
-        [docType: string]: DocumentClassifierContentSource;
-    }, options?: BeginBuildDocumentClassifierOptions): Promise<DocumentClassifierPoller>;
+    beginBuildDocumentClassifier(classifierId: string, docTypeSources: DocumentClassifierDocumentTypeSources, options?: BeginBuildDocumentClassifierOptions): Promise<DocumentClassifierPoller>;
     beginBuildDocumentModel(modelId: string, containerUrl: string, buildMode: DocumentModelBuildMode, options?: BeginBuildDocumentModelOptions): Promise<DocumentModelPoller>;
-    beginBuildDocumentModel(modelId: string, contentSource: DocumentModelContentSource, buildMode: DocumentModelBuildMode, options?: BeginBuildDocumentModelOptions): Promise<DocumentModelPoller>;
+    beginBuildDocumentModel(modelId: string, contentSource: DocumentModelSource, buildMode: DocumentModelBuildMode, options?: BeginBuildDocumentModelOptions): Promise<DocumentModelPoller>;
     beginComposeDocumentModel(modelId: string, componentModelIds: Iterable<string>, options?: BeginComposeDocumentModelOptions): Promise<DocumentModelPoller>;
     beginCopyModelTo(sourceModelId: string, authorization: CopyAuthorization, options?: BeginCopyModelOptions): Promise<DocumentModelPoller>;
     deleteDocumentClassifier(classifierId: string, options?: OperationOptions): Promise<void>;
@@ -384,7 +399,7 @@ export class DocumentModelAdministrationClient {
 }

 // @public
-export interface DocumentModelAdministrationClientOptions extends FormRecognizerCommonClientOptions {
+export interface DocumentModelAdministrationClientOptions extends CommonClientOptions {
 }

 // @public
@@ -408,9 +423,6 @@ export interface DocumentModelComposeOperationDetails extends OperationDetails {
     result?: DocumentModelDetails;
 }

-// @public
-export type DocumentModelContentSource = AzureBlobContentSource | AzureBlobFileListContentSource;
-
 // @public
 export interface DocumentModelCopyToOperationDetails extends OperationDetails {
     kind: "documentModelCopyTo";
@@ -439,6 +451,9 @@ export interface DocumentModelOperationState extends PollOperationState<Document
 // @public
 export type DocumentModelPoller = PollerLike<DocumentModelOperationState, DocumentModelDetails>;

+// @public
+export type DocumentModelSource = AzureBlobSource | AzureBlobFileListSource;
+
 // @public
 export interface DocumentModelSummary {
     apiVersion?: string;
@@ -607,25 +622,10 @@ export type FontStyle = string;
 // @public
 export type FontWeight = string;

-// @public
-export type FormRecognizerApiVersion = (typeof FormRecognizerApiVersion)[keyof typeof FormRecognizerApiVersion];
-
-// @public
-export const FormRecognizerApiVersion: {
-    readonly Latest: "2023-07-31";
-    readonly Stable: "2023-07-31";
-    readonly "2022-08-31": "2022-08-31";
-};
-
-// @public
-export interface FormRecognizerCommonClientOptions extends CommonClientOptions {
-    apiVersion?: FormRecognizerApiVersion;
-}
-
 // @public
 export type FormRecognizerFeature = (typeof FormRecognizerFeature)[keyof typeof FormRecognizerFeature] | (string & {});

-// @public (undocumented)
+// @public
 export const FormRecognizerFeature: {
     readonly Fonts: "styleFont";
     readonly OcrHighResolution: "ocrHighResolution";
@@ -813,7 +813,7 @@ export interface OperationDetails {
     };
 }

-// @public (undocumented)
+// @public
 export type OperationDetailsUnion = OperationDetails | DocumentModelBuildOperationDetails | DocumentModelComposeOperationDetails | DocumentModelCopyToOperationDetails | DocumentClassifierBuildOperationDetails;

 // @public
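
The `?: undefined` members in `AzureBlobSource` and `AzureBlobFileListSource` are what make the two source kinds mutually exclusive. The sketch below copies those four interfaces locally (a real program would import them from the package) and adds a hypothetical `describeSource` helper to show how TypeScript can narrow the union by checking which property is set. An object literal that sets both properties should not be assignable to `DocumentModelSource` under this design.

```typescript
// Local copies of the shapes from the API report; not imported from the SDK.
interface AzureBlobSourceDetails {
  containerUrl: string;
  prefix?: string;
}
interface AzureBlobFileListSourceDetails {
  containerUrl: string;
  fileList: string;
}
interface AzureBlobSource {
  azureBlobSource: AzureBlobSourceDetails;
  azureBlobFileListSource?: undefined;
}
interface AzureBlobFileListSource {
  azureBlobFileListSource: AzureBlobFileListSourceDetails;
  azureBlobSource?: undefined;
}
type DocumentModelSource = AzureBlobSource | AzureBlobFileListSource;

// Hypothetical helper: the `undefined`-typed property acts as a discriminant,
// so each branch sees exactly one of the two source kinds.
function describeSource(source: DocumentModelSource): string {
  if (source.azureBlobFileListSource !== undefined) {
    const details = source.azureBlobFileListSource;
    return `file list "${details.fileList}" in ${details.containerUrl}`;
  }
  const details = source.azureBlobSource;
  return `blobs under "${details.prefix ?? ""}" in ${details.containerUrl}`;
}

// The URL is a placeholder, not a real storage account.
console.log(
  describeSource({
    azureBlobSource: { containerUrl: "https://example.invalid/training", prefix: "invoices/" },
  })
);
```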

sdk/formrecognizer/ai-form-recognizer/samples-dev/buildClassifier.ts

Lines changed: 6 additions & 2 deletions
@@ -42,10 +42,14 @@ async function main() {
       type1: {
         // `azureBlobSource` isn't the only way to provide training data to the service. For more information, see
         // the documentation of the `ClassifierDocumentTypeDetails` type.
-        containerUrl: trainingDataSasUrl1,
+        azureBlobSource: {
+          containerUrl: trainingDataSasUrl1,
+        },
       },
       type2: {
-        containerUrl: trainingDataSasUrl2,
+        azureBlobSource: {
+          containerUrl: trainingDataSasUrl2,
+        },
       },
     },
     {
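
Code written against 4.1.0-beta.1 passed a flat object per document type, and the service-side kind was inferred from the presence of `fileList`. A minimal migration sketch, assuming that flat beta shape, follows. `LegacyContentSource`, `toClassifierSource`, and `migrateDocTypes` are hypothetical names, and the nested property names follow the API report (`azureBlobSource`, `azureBlobFileListSource`).

```typescript
// Assumed shape of a 4.1.0-beta.1 content source (flat, kind inferred from `fileList`).
interface LegacyContentSource {
  containerUrl: string;
  prefix?: string;
  fileList?: string;
}

// Local stand-in for the 5.0.0 `DocumentClassifierSource` union.
type ClassifierSource =
  | { azureBlobSource: { containerUrl: string; prefix?: string }; azureBlobFileListSource?: undefined }
  | { azureBlobFileListSource: { containerUrl: string; fileList: string }; azureBlobSource?: undefined };

// Hypothetical helper: wraps one flat source in the explicit 5.0.0 property.
function toClassifierSource(legacy: LegacyContentSource): ClassifierSource {
  if (legacy.fileList !== undefined) {
    return {
      azureBlobFileListSource: { containerUrl: legacy.containerUrl, fileList: legacy.fileList },
    };
  }
  const details: { containerUrl: string; prefix?: string } = { containerUrl: legacy.containerUrl };
  if (legacy.prefix !== undefined) {
    details.prefix = legacy.prefix;
  }
  return { azureBlobSource: details };
}

// Hypothetical helper: converts a whole docType-to-source map.
function migrateDocTypes(docTypes: { [docType: string]: LegacyContentSource }): {
  [docType: string]: ClassifierSource;
} {
  const result: { [docType: string]: ClassifierSource } = {};
  for (const docType of Object.keys(docTypes)) {
    result[docType] = toClassifierSource(docTypes[docType]);
  }
  return result;
}

// Placeholder URLs for illustration only.
console.log(
  JSON.stringify(
    migrateDocTypes({
      type1: { containerUrl: "https://example.invalid/type1" },
      type2: { containerUrl: "https://example.invalid/type2", fileList: "type2.jsonl" },
    })
  )
);
```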

sdk/formrecognizer/ai-form-recognizer/samples-dev/buildModel.ts

Lines changed: 5 additions & 1 deletion
@@ -36,7 +36,11 @@ async function main() {

   const poller = await client.beginBuildDocumentModel(
     modelId,
-    trainingDataSasUrl,
+    {
+      azureBlobSource: {
+        containerUrl: trainingDataSasUrl,
+      },
+    },
     DocumentModelBuildMode.Template
   );
   const model = await poller.pollUntilDone();
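
`beginBuildDocumentModel` keeps both overloads: a raw `containerUrl` string and a `DocumentModelSource` object. The sketch below shows one way such a pair of overloads could be reduced to a single internal form. It assumes that the string overload is equivalent to an `azureBlobSource` with no `prefix`, which the sample change suggests but this diff does not show. `normalizeSource` is a hypothetical name, not part of the SDK.

```typescript
// Local stand-ins for the SDK types of the same names.
interface AzureBlobSource {
  azureBlobSource: { containerUrl: string; prefix?: string };
  azureBlobFileListSource?: undefined;
}
interface AzureBlobFileListSource {
  azureBlobFileListSource: { containerUrl: string; fileList: string };
  azureBlobSource?: undefined;
}
type DocumentModelSource = AzureBlobSource | AzureBlobFileListSource;

// Overload signatures mirroring the two accepted input forms.
function normalizeSource(containerUrl: string): DocumentModelSource;
function normalizeSource(contentSource: DocumentModelSource): DocumentModelSource;
function normalizeSource(input: string | DocumentModelSource): DocumentModelSource {
  // Assumption: a bare URL means "all blobs in this container".
  return typeof input === "string" ? { azureBlobSource: { containerUrl: input } } : input;
}

// Placeholder URL for illustration only.
console.log(JSON.stringify(normalizeSource("https://example.invalid/training")));
```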

sdk/formrecognizer/ai-form-recognizer/src/constants.ts

Lines changed: 3 additions & 1 deletion
@@ -10,4 +10,6 @@ export const DEFAULT_COGNITIVE_SCOPE = "https://cognitiveservices.azure.com/.def
 /**
  * @internal
  */
-export const SDK_VERSION = "4.1.0";
+export const SDK_VERSION = "5.0.0";
+
+export const FORM_RECOGNIZER_API_VERSION = "2023-07-31";

sdk/formrecognizer/ai-form-recognizer/src/documentAnalysisClient.ts

Lines changed: 6 additions & 12 deletions
@@ -4,7 +4,7 @@
 import { KeyCredential, TokenCredential } from "@azure/core-auth";
 import { createTracingClient } from "@azure/core-tracing";
 import { TracingClient } from "@azure/core-tracing";
-import { SDK_VERSION } from "./constants";
+import { FORM_RECOGNIZER_API_VERSION, SDK_VERSION } from "./constants";
 import {
   AnalyzeDocumentRequest,
   AnalyzeResultOperation,
@@ -23,11 +23,7 @@ import {
 } from "./lro/analysis";
 import { OperationContext, lro } from "./lro/util/poller";
 import { AnalyzeDocumentOptions } from "./options/AnalyzeDocumentOptions";
-import {
-  DEFAULT_GENERATED_CLIENT_OPTIONS,
-  DocumentAnalysisClientOptions,
-  FormRecognizerApiVersion,
-} from "./options/FormRecognizerClientOptions";
+import { DocumentAnalysisClientOptions } from "./options/FormRecognizerClientOptions";
 import { DocumentModel } from "./documentModel";
 import { makeServiceClient, Mappers, SERIALIZER } from "./util";
 import { AbortSignalLike } from "@azure/abort-controller";
@@ -66,7 +62,6 @@ import { ClassifyDocumentOptions } from "./options/ClassifyDocumentOptions";
 export class DocumentAnalysisClient {
   private _restClient: GeneratedClient;
   private _tracing: TracingClient;
-  private _apiVersion: FormRecognizerApiVersion;

   /**
    * Create a `DocumentAnalysisClient` instance from a resource endpoint and a an Azure Identity `TokenCredential`.
@@ -137,8 +132,6 @@ export class DocumentAnalysisClient {
       packageVersion: SDK_VERSION,
       namespace: "Microsoft.CognitiveServices",
     });
-
-    this._apiVersion = options.apiVersion ?? DEFAULT_GENERATED_CLIENT_OPTIONS.apiVersion;
   }

   // #region Analysis
@@ -418,12 +411,13 @@ export class DocumentAnalysisClient {
       ? { modelId: model, apiVersion: undefined, transformResult: (v: AnalyzeResult) => v }
       : model;

-    if (requestApiVersion && requestApiVersion !== this._apiVersion) {
+    if (requestApiVersion && requestApiVersion !== FORM_RECOGNIZER_API_VERSION) {
       throw new Error(
         [
-          `API Version mismatch: the provided model wants version: ${requestApiVersion}, but the client is using ${this._apiVersion}.`,
+          `API Version mismatch: the provided model wants version: ${requestApiVersion},`,
+          `but the client is using ${FORM_RECOGNIZER_API_VERSION}.`,
           "The API version of the model must match the client's API version.",
-        ].join("\n")
+        ].join(" ")
       );
     }
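
The version check in the last hunk can be read in isolation as follows. This is a standalone sketch rather than the client's actual method: `assertApiVersionMatches` is a hypothetical name, and the constant is redeclared locally with the value from `constants.ts`. It shows the effect of comparing against a fixed constant and of joining the message fragments with a space instead of a newline.

```typescript
// Redeclared locally; in the SDK this constant lives in src/constants.ts.
const FORM_RECOGNIZER_API_VERSION = "2023-07-31";

// Hypothetical standalone version of the check: a model that declares no API
// version is accepted, and a declared version must equal the client's fixed one.
function assertApiVersionMatches(requestApiVersion?: string): void {
  if (requestApiVersion && requestApiVersion !== FORM_RECOGNIZER_API_VERSION) {
    throw new Error(
      [
        `API Version mismatch: the provided model wants version: ${requestApiVersion},`,
        `but the client is using ${FORM_RECOGNIZER_API_VERSION}.`,
        "The API version of the model must match the client's API version.",
      ].join(" ")
    );
  }
}

// Neither call throws: the first declares no version, the second matches.
assertApiVersionMatches(undefined);
assertApiVersionMatches("2023-07-31");
```

Passing a model built for `2022-08-31` would throw a single-line error message under this sketch.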

sdk/formrecognizer/ai-form-recognizer/src/documentModel.ts

Lines changed: 2 additions & 3 deletions
@@ -4,7 +4,6 @@
 import { DocumentFieldSchema, DocumentModelDetails } from "./generated";
 import { AnalyzedDocument, AnalyzeResult } from "./lro/analysis";
 import { DocumentField } from "./models/fields";
-import { FormRecognizerApiVersion } from "./options";
 import { isAcronymic, uncapitalize } from "./util";

 /**
@@ -21,7 +20,7 @@ export interface DocumentModel<Result> {
   /**
    * The API version of the model.
    */
-  apiVersion?: FormRecognizerApiVersion;
+  apiVersion?: string;
   /**
    * An associated transformation that is used to conver the base (weak) Result type to the strong version.
    */
@@ -94,7 +93,7 @@ export function createModelFromSchema(
 ): DocumentModel<AnalyzeResult<unknown>> {
   return {
     modelId: schema.modelId,
-    apiVersion: schema.apiVersion as FormRecognizerApiVersion,
+    apiVersion: schema.apiVersion,
     transformResult(baseResult: AnalyzeResult): AnalyzeResult<unknown> {
       const hasDocuments = Object.entries(schema.docTypes ?? {}).length > 0;
