18 changes: 15 additions & 3 deletions sdk/textanalytics/azure-ai-textanalytics/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,21 @@
- `offset` is the offset of the text from the start of the document

**New features**
- Added the support for Personally Identifiable Information (PII) entity recognition feature.
To use this feature, you need to make sure you are using the
service's v3.1-preview.1 API.
- Added support for Personally Identifiable Information (PII) entity recognition feature.
To use this feature, you need to make sure you are using the service's v3.1-preview.1 API.

New synchronous API introduced:
- PiiEntityCollection recognizePiiEntities(String document)
- PiiEntityCollection recognizePiiEntities(String document, String language)
- RecognizePiiEntitiesResultCollection recognizePiiEntitiesBatch(Iterable<String> documents, String language, TextAnalyticsRequestOptions options)
- Response<RecognizePiiEntitiesResultCollection> recognizePiiEntitiesBatchWithResponse(Iterable<TextDocumentInput> documents, TextAnalyticsRequestOptions options, Context context)

New asynchronous API introduced:
- Mono<PiiEntityCollection> recognizePiiEntities(String document)
- Mono<PiiEntityCollection> recognizePiiEntities(String document, String language)
- Mono<RecognizePiiEntitiesResultCollection> recognizePiiEntitiesBatch(Iterable<String> documents, String language, TextAnalyticsRequestOptions options)
- Mono<Response<RecognizePiiEntitiesResultCollection>> recognizePiiEntitiesBatchWithResponse(Iterable<TextDocumentInput> documents, TextAnalyticsRequestOptions options)

## 5.0.0 (2020-07-27)
- Re-release of version `1.0.1` with updated version `5.0.0`.

Expand Down
16 changes: 8 additions & 8 deletions sdk/textanalytics/azure-ai-textanalytics/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -211,7 +211,7 @@ TextAnalyticsAsyncClient textAnalyticsClient = new TextAnalyticsClientBuilder()

### Analyze sentiment
Run a Text Analytics predictive model to identify the positive, negative, neutral or mixed sentiment contained in the
passed-in document or batch of documents.
provided document or batch of documents.

<!-- embedme ./src/samples/java/com/azure/ai/textanalytics/ReadmeSamples.java#L104-L108 -->
```java
Expand All @@ -238,7 +238,7 @@ For samples on using the production recommended option `DetectLanguageBatch` see
Please refer to the service documentation for a conceptual discussion of [language detection][language_detection].

### Extract key phrases
Run a model to identify a collection of significant phrases found in the passed-in document or batch of documents.
Run a model to identify a collection of significant phrases found in the provided document or batch of documents.

<!-- embedme ./src/samples/java/com/azure/ai/textanalytics/ReadmeSamples.java#L149-L151 -->
```java
Expand All @@ -250,7 +250,7 @@ For samples on using the production recommended option `ExtractKeyPhrasesBatch`
Please refer to the service documentation for a conceptual discussion of [key phrase extraction][key_phrase_extraction].

### Recognize entities
Run a predictive model to identify a collection of named entities in the passed-in document or batch of documents and
Run a predictive model to identify a collection of named entities in the provided document or batch of documents and
categorize those entities into categories such as person, location, or organization. For more information on available
categories, see [Text Analytics Named Entity Categories][named_entities_categories].

Expand All @@ -265,14 +265,14 @@ For samples on using the production recommended option `RecognizeEntitiesBatch`
Please refer to the service documentation for a conceptual discussion of [named entity recognition][named_entity_recognition].

### Recognize Personally Identifiable Information entities
Run a predictive model to identify a collection of Personally Identifiable Information (PII) entities in the passed-in
Run a predictive model to identify a collection of Personally Identifiable Information (PII) entities in the provided
document. It recognizes and categorizes PII entities in its input text, such as
Social Security Numbers, bank account information, credit card numbers, and more. This endpoint is only available for
v3.1-preview.1 and up.
Social Security Numbers, bank account information, credit card numbers, and more. This endpoint is only supported for
API versions v3.1-preview.1 and above.

<!-- embedme ./src/samples/java/com/azure/ai/textanalytics/ReadmeSamples.java#L158-L161 -->
```java
String document = "My SSN is 555-55-5555";
String document = "My SSN is 859-98-0987";
textAnalyticsClient.recognizePiiEntities(document).forEach(entity -> System.out.printf(
"Recognized Personally Identifiable Information entity: %s, entity category: %s, entity subcategory: %s, offset: %s, length: %s, confidence score: %f.%n",
entity.getText(), entity.getCategory(), entity.getSubcategory(), entity.getOffset(), entity.getLength(), entity.getConfidenceScore()));
```
Expand All @@ -281,7 +281,7 @@ For samples on using the production recommended option `RecognizePiiEntitiesBatc
Please refer to the service documentation for [supported PII entity types][pii_entity_recognition].

### Recognize linked entities
Run a predictive model to identify a collection of entities found in the passed-in document or batch of documents,
Run a predictive model to identify a collection of entities found in the provided document or batch of documents,
and include information linking the entities to their corresponding entries in a well-known knowledge base.

<!-- embedme ./src/samples/java/com/azure/ai/textanalytics/ReadmeSamples.java#L135-L142 -->
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -70,8 +70,8 @@ class RecognizePiiEntityAsyncClient {
Mono<PiiEntityCollection> recognizePiiEntities(String document, String language) {
try {
Objects.requireNonNull(document, "'document' cannot be null.");
final TextDocumentInput textDocumentInput = new TextDocumentInput("0", document).setLanguage(language);
return recognizePiiEntitiesBatch(Collections.singletonList(textDocumentInput), null)
return recognizePiiEntitiesBatch(
Collections.singletonList(new TextDocumentInput("0", document).setLanguage(language)), null)
.map(resultCollectionResponse -> {
PiiEntityCollection entityCollection = null;
// for each loop will have only one entry inside
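The single-document overload in this hunk delegates to the batch path: it wraps the document in a one-element list with id `"0"` and then unwraps the only result. A minimal synchronous sketch of that pattern follows. `Doc` and `recognizeBatch` are hypothetical stand-ins rather than the SDK's reactive types, and the "result" is just a string so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Illustration only: Doc and recognizeBatch are hypothetical stand-ins, not SDK types.
final class SingleToBatchSketch {

    /** Stand-in for a document input with an id, text, and optional language hint. */
    static final class Doc {
        final String id;
        final String text;
        final String language;

        Doc(String id, String text, String language) {
            this.id = id;
            this.text = text;
            this.language = language;
        }
    }

    /** Stand-in batch call that returns one result string per input, in order. */
    static List<String> recognizeBatch(List<Doc> documents) {
        List<String> results = new ArrayList<>();
        for (Doc doc : documents) {
            results.add(doc.id + ":" + doc.text.length());
        }
        return results;
    }

    /** Single-document overload: wrap in a one-element batch, then unwrap the only result. */
    static String recognize(String document, String language) {
        Objects.requireNonNull(document, "'document' cannot be null.");
        List<String> results = recognizeBatch(
            Collections.singletonList(new Doc("0", document, language)));
        return results.get(0);
    }

    public static void main(String[] args) {
        System.out.println(recognize("My SSN is 859-98-0987", "en"));
    }
}
```

Keeping one code path for single and batch inputs means validation and result mapping only need to be written once.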
Expand Down Expand Up @@ -143,8 +143,7 @@ private Response<RecognizePiiEntitiesResultCollection> toRecognizePiiEntitiesRes
// Pii entities list
final List<PiiEntity> piiEntities = documentEntities.getEntities().stream().map(entity ->
new PiiEntity(entity.getText(), EntityCategory.fromString(entity.getCategory()),
entity.getSubcategory(), entity.getOffset(), entity.getLength(),
entity.getConfidenceScore()))
entity.getSubcategory(), entity.getConfidenceScore(), entity.getOffset(), entity.getLength()))
.collect(Collectors.toList());
// Warnings
final List<TextAnalyticsWarning> warnings = documentEntities.getWarnings().stream()
Expand Down Expand Up @@ -193,10 +192,10 @@ private Mono<Response<RecognizePiiEntitiesResultCollection>> getRecognizePiiEnti
options == null ? null : options.isIncludeStatistics(),
null,
context.addData(AZ_TRACING_NAMESPACE_KEY, COGNITIVE_TRACING_NAMESPACE_VALUE))
.doOnSubscribe(ignoredValue -> logger.info("A batch of documents - {}", documents.toString()))
.doOnSuccess(response ->
logger.info("Recognized Personally Identifiable Information entities for a batch of documents- {}",
response.getValue()))
.doOnSubscribe(ignoredValue -> logger.info(
"Start recognizing Personally Identifiable Information entities for a batch of documents."))
.doOnSuccess(response -> logger.info(
"Successfully recognized Personally Identifiable Information entities for a batch of documents."))
.doOnError(error ->
logger.warning("Failed to recognize Personally Identifiable Information entities - {}", error))
.map(this::toRecognizePiiEntitiesResultCollectionResponse)
Expand Down
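The logging change in this file replaces messages that interpolated `documents.toString()` and `response.getValue()` with fixed strings, which presumably keeps document text (and any PII in it) out of the logs. Below is a small sketch of the same idea using `java.util.logging`. The class, `CapturingHandler`, and `processBatch` are hypothetical and only stand in for the SDK's logger and request pipeline.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Illustration only: the names here are hypothetical and not part of the SDK.
final class PayloadFreeLoggingSketch {

    /** Handler that keeps every log message in memory so it can be inspected. */
    static final class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            messages.add(record.getMessage());
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    /** Logs fixed start/success messages and never interpolates the documents. */
    static int processBatch(List<String> documents, Logger logger) {
        logger.info("Start recognizing entities for a batch of documents.");
        int processed = documents.size(); // placeholder for the real service call
        logger.info("Successfully recognized entities for a batch of documents.");
        return processed;
    }

    /** Runs processBatch with a private logger and returns the messages it emitted. */
    static List<String> capturedMessages(List<String> documents) {
        Logger logger = Logger.getAnonymousLogger();
        logger.setUseParentHandlers(false); // keep the sketch quiet on the console
        logger.setLevel(Level.INFO);
        CapturingHandler handler = new CapturingHandler();
        logger.addHandler(handler);
        processBatch(documents, logger);
        return handler.messages;
    }

    public static void main(String[] args) {
        capturedMessages(Collections.singletonList("My SSN is 859-98-0987"))
            .forEach(System.out::println);
    }
}
```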
Original file line number Diff line number Diff line change
Expand Up @@ -254,7 +254,7 @@ public Mono<Response<DetectLanguageResultCollection>> detectLanguageBatchWithRes
*
* For a list of supported entity types, check: <a href="https://aka.ms/taner">this</a>.
* For a list of enabled languages, check: <a href="https://aka.ms/talangs">this</a>.
* This method will use the default language that sets up in
* This method will use the default language that is set using
* {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is specified, service will use 'en' as
* the language.
*
Expand Down Expand Up @@ -376,7 +376,7 @@ public Mono<Response<RecognizeEntitiesResultCollection>> recognizeEntitiesBatchW
*
* For a list of supported entity types, check: <a href="https://aka.ms/tanerpii">this</a>.
* For a list of enabled languages, check: <a href="https://aka.ms/talangs">this</a>. This method will use the
* default language that sets up in {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is
* default language that is set using {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is
* specified, service will use 'en' as the language.
*
* <p><strong>Code sample</strong></p>
Expand Down Expand Up @@ -500,7 +500,7 @@ public Mono<Response<RecognizePiiEntitiesResultCollection>> recognizePiiEntities
* Returns a list of recognized entities with links to a well-known knowledge base for the provided document. See
* <a href="https://aka.ms/talangs">this</a> for supported languages in Text Analytics API.
*
* This method will use the default language that sets up in
* This method will use the default language that is set using
* {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is specified, service will use 'en' as
* the language.
*
Expand Down Expand Up @@ -620,7 +620,7 @@ public Mono<RecognizeLinkedEntitiesResultCollection> recognizeLinkedEntitiesBatc
/**
* Returns a list of strings denoting the key phrases in the document.
*
* This method will use the default language that sets up in
* This method will use the default language that is set using
* {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is specified, service will use 'en' as
* the language.
*
Expand Down Expand Up @@ -739,7 +739,7 @@ public Mono<Response<ExtractKeyPhrasesResultCollection>> extractKeyPhrasesBatchW
* Returns a sentiment prediction, as well as confidence scores for each sentiment label (Positive, Negative, and
* Neutral) for the document and each sentence within it.
*
* This method will use the default language that sets up in
* This method will use the default language that is set using
* {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is specified, service will use 'en' as
* the language.
*
Expand Down
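The Javadoc edits in this file all describe the same fallback: overloads without a language parameter use the default language configured on the builder, and if none was configured the service uses 'en'. A minimal sketch of that resolution order is shown below. The helper name is hypothetical, and modelling the service-side fallback locally is an assumption made purely for illustration.

```java
// Illustration only: resolveLanguage is a hypothetical helper, not an SDK method.
final class DefaultLanguageSketch {

    /** Language the service is documented to assume when no default was configured. */
    static final String SERVICE_FALLBACK = "en";

    /**
     * Picks the language for a request: the builder-level default when one was set,
     * otherwise the service-side fallback.
     */
    static String resolveLanguage(String builderDefault) {
        return builderDefault != null ? builderDefault : SERVICE_FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(resolveLanguage("es"));
        System.out.println(resolveLanguage(null));
    }
}
```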
Original file line number Diff line number Diff line change
Expand Up @@ -186,7 +186,7 @@ public Response<DetectLanguageResultCollection> detectLanguageBatchWithResponse(
*
* For a list of supported entity types, check: <a href="https://aka.ms/taner">this</a>
*
* This method will use the default language that sets up in
* This method will use the default language that is set using
* {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is specified, service will use 'en' as
* the language.
*
Expand Down Expand Up @@ -297,7 +297,7 @@ public Response<RecognizeEntitiesResultCollection> recognizeEntitiesBatchWithRes
*
* For a list of supported entity types, check: <a href="https://aka.ms/tanerpii">this</a>
* For a list of enabled languages, check: <a href="https://aka.ms/talangs">this</a>. This method will use the
* default language that sets up in {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is
* default language that is set using {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is
* specified, service will use 'en' as the language.
*
* <p><strong>Code Sample</strong></p>
Expand Down Expand Up @@ -409,7 +409,7 @@ public Response<RecognizePiiEntitiesResultCollection> recognizePiiEntitiesBatchW
* Returns a list of recognized entities with links to a well-known knowledge base for the provided document.
* See <a href="https://aka.ms/talangs">this</a> for supported languages in Text Analytics API.
*
* This method will use the default language that sets up in
* This method will use the default language that is set using
* {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is specified, service will use 'en' as
* the language.
*
Expand Down Expand Up @@ -525,7 +525,7 @@ public RecognizeLinkedEntitiesResultCollection recognizeLinkedEntitiesBatch(
/**
* Returns a list of strings denoting the key phrases in the document.
*
* This method will use the default language that sets up in
* This method will use the default language that is set using
* {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is specified, service will use 'en' as
* the language.
*
Expand Down Expand Up @@ -637,7 +637,7 @@ public Response<ExtractKeyPhrasesResultCollection> extractKeyPhrasesBatchWithRes
* Returns a sentiment prediction, as well as confidence scores for each sentiment label
* (Positive, Negative, and Neutral) for the document and each sentence within it.
*
* This method will use the default language that sets up in
* This method will use the default language that is set using
* {@link TextAnalyticsClientBuilder#defaultLanguage(String)}. If none is specified, service will use 'en' as
* the language.
*
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,9 @@ public AnalyzeSentimentResult(String id, TextDocumentStatistics textDocumentStat
* Get the document sentiment.
*
* @return The document sentiment.
*
* @throws TextAnalyticsException if the result has {@code isError} equal to {@code true} and a non-error property
* is accessed.
*/
public DocumentSentiment getDocumentSentiment() {
throwExceptionIfError();
Expand Down
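The `@throws` note added in this hunk documents a guard pattern: the getter for a non-error property first calls `throwExceptionIfError()`, so reading a value from a failed result raises an exception instead of quietly returning nothing useful. A minimal sketch of that pattern follows. `DocumentResultSketch` and `ResultException` are hypothetical stand-ins, and the SDK's actual base class, exception type, and message text are not reproduced here.

```java
// Illustration only: hypothetical stand-ins for a per-document result and its exception.
final class DocumentResultSketch {

    /** Stand-in for the exception raised when a failed result is read. */
    static final class ResultException extends RuntimeException {
        ResultException(String message) {
            super(message);
        }
    }

    private final String id;
    private final String errorMessage; // null when the document was processed successfully
    private final String value;

    DocumentResultSketch(String id, String errorMessage, String value) {
        this.id = id;
        this.errorMessage = errorMessage;
        this.value = value;
    }

    /** Always safe to call, even on a failed result. */
    String getId() {
        return id;
    }

    boolean isError() {
        return errorMessage != null;
    }

    /** Non-error property: guarded so callers cannot silently read a failed result. */
    String getValue() {
        throwExceptionIfError();
        return value;
    }

    private void throwExceptionIfError() {
        if (isError()) {
            throw new ResultException("Document " + id + " failed: " + errorMessage);
        }
    }

    public static void main(String[] args) {
        DocumentResultSketch ok = new DocumentResultSketch("0", null, "positive");
        System.out.println(ok.getValue());

        DocumentResultSketch failed = new DocumentResultSketch("1", "Invalid document", null);
        try {
            failed.getValue();
        } catch (ResultException e) {
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

Callers working with a batch would typically check `isError()` on each result before reading its value.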
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,9 @@ public DetectLanguageResult(String id, TextDocumentStatistics textDocumentStatis
* Get the detected primary language.
*
* @return The detected language.
*
* @throws TextAnalyticsException if the result has {@code isError} equal to {@code true} and a non-error property
* is accessed.
*/
public DetectedLanguage getPrimaryLanguage() {
throwExceptionIfError();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,9 @@ public ExtractKeyPhraseResult(String id, TextDocumentStatistics textDocumentStat
* Get a {@link KeyPhrasesCollection} that contains a list of key phrases and warnings.
*
* @return A {@link KeyPhrasesCollection} that contains a list of key phrases and warnings.
*
* @throws TextAnalyticsException if the result has {@code isError} equal to {@code true} and a non-error property
* is accessed.
*/
public KeyPhrasesCollection getKeyPhrases() {
throwExceptionIfError();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,11 @@ public final class PiiEntity {
*/
private final String subcategory;

/*
* Confidence score between 0 and 1 of the extracted entity.
*/
private final double confidenceScore;

/*
* Start position for the entity text.
*/
Expand All @@ -35,23 +40,18 @@ public final class PiiEntity {
*/
private final int length;

/*
* Confidence score between 0 and 1 of the extracted entity.
*/
private final double confidenceScore;

/**
* Creates a {@link PiiEntity} model that describes entity.
*
* @param text The entity text as appears in the request.
* @param category The entity category, such as Person/Location/Org/SSN etc.
* @param subcategory The entity subcategory, such as Medical/Stock exchange/Sports etc.
* @param confidenceScore A confidence score between 0 and 1 of the recognized entity.
* @param offset The start position for the entity text
* @param length The length for the entity text
* @param confidenceScore A confidence score between 0 and 1 of the extracted entity.
*/
public PiiEntity(String text, EntityCategory category, String subcategory, int offset, int length,
double confidenceScore) {
public PiiEntity(String text, EntityCategory category, String subcategory, double confidenceScore, int offset,
int length) {
this.text = text;
this.category = category;
this.subcategory = subcategory;
Expand All @@ -63,7 +63,7 @@ public PiiEntity(String text, EntityCategory category, String subcategory, int o
/**
* Get the text property: PII entity text as appears in the request.
*
* @return The text value.
* @return The {@code text} value.
*/
public String getText() {
return this.text;
Expand All @@ -72,25 +72,34 @@ public String getText() {
/**
* Get the category property: Categorized entity category, such as Person/Location/Org/SSN etc.
*
* @return The category value.
* @return The {@code category} value.
*/
public EntityCategory getCategory() {
return this.category;
}

/**
* Get the subcategory property: Categorized entity sub category, such as Medical/Stock exchange/Sports etc.
* Get the subcategory property: Categorized entity subcategory, such as Medical/Stock exchange/Sports etc.
*
* @return The subcategory value.
* @return The {@code subcategory} value.
*/
public String getSubcategory() {
return this.subcategory;
}

/**
* Get the score property: Confidence score between 0 and 1 of the recognized entity.
*
* @return The {@code confidenceScore} value.
*/
public double getConfidenceScore() {
return this.confidenceScore;
}

/**
* Get the offset property: the start position for the entity text.
*
* @return The offset value.
* @return The {@code offset} value.
*/
public int getOffset() {
return this.offset;
Expand All @@ -99,18 +108,9 @@ public int getOffset() {
/**
* Get the length property: the length for the entity text.
*
* @return The length value.
* @return The {@code length} value.
*/
public int getLength() {
return this.length;
}

/**
* Get the score property: Confidence score between 0 and 1 of the extracted entity.
*
* @return The score value.
*/
public double getConfidenceScore() {
return this.confidenceScore;
}
}
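After this change the `PiiEntity` constructor takes `confidenceScore` before `offset` and `length`. A minimal stand-in that mirrors the new parameter order is sketched below. `PiiEntitySketch` is hypothetical, uses a plain `String` for the category, and the sample score is made up for illustration.

```java
// Illustration only: a hypothetical stand-in that mirrors the reordered constructor.
final class PiiEntitySketch {
    private final String text;
    private final String category;
    private final String subcategory;
    private final double confidenceScore;
    private final int offset;
    private final int length;

    PiiEntitySketch(String text, String category, String subcategory, double confidenceScore,
        int offset, int length) {
        this.text = text;
        this.category = category;
        this.subcategory = subcategory;
        this.confidenceScore = confidenceScore;
        this.offset = offset;
        this.length = length;
    }

    String getText() {
        return text;
    }

    String getCategory() {
        return category;
    }

    String getSubcategory() {
        return subcategory;
    }

    double getConfidenceScore() {
        return confidenceScore;
    }

    int getOffset() {
        return offset;
    }

    int getLength() {
        return length;
    }

    public static void main(String[] args) {
        String document = "My SSN is 859-98-0987";
        // The score 0.65 is an invented sample value, not a real service response.
        PiiEntitySketch entity = new PiiEntitySketch("859-98-0987", "SSN", null, 0.65, 10, 11);
        // For this ASCII sample, offset and length index directly into the Java string.
        String slice = document.substring(entity.getOffset(), entity.getOffset() + entity.getLength());
        System.out.printf("text: %s, slice: %s, score: %f%n",
            entity.getText(), slice, entity.getConfidenceScore());
    }
}
```

In this sketch, a call site that still passes `(offset, length, confidenceScore)` would not compile, because a `double` cannot be implicitly narrowed to the final `int` parameter, so stale call sites surface at build time.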
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,9 @@ public RecognizeEntitiesResult(String id, TextDocumentStatistics textDocumentSta
* Get an {@link IterableStream} of {@link CategorizedEntity}.
*
* @return An {@link IterableStream} of {@link CategorizedEntity}.
*
* @throws TextAnalyticsException if the result has {@code isError} equal to {@code true} and a non-error property
* is accessed.
*/
public CategorizedEntityCollection getEntities() {
throwExceptionIfError();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,9 @@ public RecognizeLinkedEntitiesResult(String id, TextDocumentStatistics textDocum
* Get an {@link IterableStream} of {@link LinkedEntity}.
*
* @return An {@link IterableStream} of {@link LinkedEntity}.
*
* @throws TextAnalyticsException if the result has {@code isError} equal to {@code true} and a non-error property
* is accessed.
*/
public LinkedEntityCollection getEntities() {
throwExceptionIfError();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,9 @@ public RecognizePiiEntitiesResult(String id, TextDocumentStatistics textDocument
* Get an {@link IterableStream} of {@link PiiEntity}.
*
* @return An {@link IterableStream} of {@link PiiEntity}.
*
* @throws TextAnalyticsException if the result has {@code isError} equal to {@code true} and a non-error property
* is accessed.
*/
public PiiEntityCollection getEntities() {
throwExceptionIfError();
Expand Down