Testing ASR/NLU Evaluation from the Command Line

How is your Alexa development going? I’m Hugtech, a freelancer living in the Netherlands. I often show up in the AWS and Alexa communities.

The theme of this post is “Testing ASR/NLU evaluation from the command line”. I will fill in a few points that are a bit hard to understand in the official documentation.

Let’s take a look!

ASR/NLU Evaluation Tool

API Reference

https://developer.amazon.com/en-US/docs/alexa/smapi/nlu-evaluation-tool-api.html#smapi

Steps of ASR Evaluation

The official documentation has step-by-step instructions, and this post follows them.

1. Create a catalog and associate it with a skill

A catalog is a data source that can be shared across multiple skills. It is associated with your developer.amazon.com account. Catalogs are basically hosted on AWS S3, and you can also manage a catalog in your own S3 bucket. The Alexa Developer Console encapsulates these catalog operations, so developers generally do not need to care about them.

As a first test, below is the result of calling the Get the list of catalogs, v0 API after running an ASR evaluation in the developer console. You will see a catalog titled “ALEXA_ASR_EVALUATION” with type “AMAZON.AudioRecording”. The “associatedSkillIds” attribute lists the skill ID associated with the catalog.

-> % curl --location --request GET 'https://api.amazonalexa.com/v0/catalogs?vendorId=xxxxxx' \
--header 'Authorization: Bearer ....' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1040  100  1040    0     0   4279      0 --:--:-- --:--:-- --:--:--  4279
{
  "_links": {
    "self": {
      "href": "/v0/catalogs"
    }
  },
  "catalogs": [
    {
      "associatedSkillIds": [
        "amzn1.ask.skill.xxxxxxxx"
      ],
      "createdDate": "2021-11-12T13:58:34.189Z",
      "id": "amzn1.ask-catalog.cat.28b2d85a-f34b-419e-8616-17cb30bd6d65",
      "lastUpdatedDate": "2021-11-12T13:58:34.346Z",
      "title": "ALEXA_ASR_EVALUATION",
      "type": "AMAZON.AudioRecording",
      "usage": "AlexaTest.Catalog.AudioRecording"
    }
  ],
  "isTruncated": false
}

If you’d like to create your own catalog, use the Create a catalog API. Let’s call it. The result is below.

curl --location --request POST 'https://api.amazonalexa.com/v0/catalogs' \
--header 'Authorization: Bearer xxxxx' \
--header 'Content-Type: text/plain' \
--data-raw '{
    "title": "hugtech-test-catalog-001",
    "vendorId": "xxxxxx",
    "usage": "AlexaTest.Catalog.AudioRecording",
    "type": "AMAZON.AudioRecording"
}' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   449  100   286  100   163    758    432 --:--:-- --:--:-- --:--:--  1190
{
  "associatedSkillIds": [],
  "createdDate": "2021-11-28T21:32:33.649Z",
  "id": "amzn1.ask-catalog.cat.xxxxxx",
  "lastUpdatedDate": "2021-11-28T21:32:33.649Z",
  "title": "hugtech-test-catalog-001",
  "type": "AMAZON.AudioRecording",
  "usage": "AlexaTest.Catalog.AudioRecording"
}

The catalog has now been created. Next, associate its ID with your skill using the Associate a catalog with a skill API. To confirm the result, call the Get the list of catalogs, v0 API introduced earlier.

In the output below, the catalog “hugtech-test-catalog-001” now lists the skill in its associatedSkillIds attribute.

-> % curl --location --request PUT 'https://api.amazonalexa.com/v0/skills/amzn1.ask.skill.842aa6a7-xxxx/catalogs/amzn1.ask-catalog.cat.b8690b93-xxxx' \
--header 'Authorization: Bearer xxxxxK' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

-> % curl --location --request GET 'https://api.amazonalexa.com/v0/catalogs?vendorId=xxxxxxxx' \
--header 'Authorization: Bearer xxxxK' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1668  100  1668    0     0   6270      0 --:--:-- --:--:-- --:--:--  6247
{
  "_links": {
    "self": {
      "href": "/v0/catalogs"
    }
  },
  "catalogs": [
    {
      "associatedSkillIds": [
        "amzn1.ask.skill.842aa6a7-xxxx"
      ],
      "createdDate": "2021-11-28T21:32:33.649Z",
      "id": "amzn1.ask-catalog.cat.b8690b93-xxxx",
      "lastUpdatedDate": "2021-11-28T21:40:21.100Z",
      "title": "hugtech-test-catalog-001",
      "type": "AMAZON.AudioRecording",
      "usage": "AlexaTest.Catalog.AudioRecording"
    },
    {
      "associatedSkillIds": [
        "amzn1.ask.skill.xxxxx9"
      ],
      "createdDate": "2021-11-12T13:58:34.189Z",
      "id": "amzn1.ask-catalog.cat.xxxxxxx",
      "lastUpdatedDate": "2021-11-12T13:58:34.346Z",
      "title": "ALEXA_ASR_EVALUATION",
      "type": "AMAZON.AudioRecording",
      "usage": "AlexaTest.Catalog.AudioRecording"
    }
  ],
  "isTruncated": false
}
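Typing these URLs by hand invites small mistakes, so here is a minimal sketch of helper functions for the catalog calls above. The function names and the SMAPI_TOKEN environment variable are my own assumptions, not part of SMAPI; only the URL shapes come from the requests shown in this post.

```shell
# Base URL of SMAPI, as used in the curl calls above.
SMAPI_BASE="https://api.amazonalexa.com"

# URL for "Get the list of catalogs, v0" ($1 = vendor ID).
catalog_list_url() {
  printf '%s/v0/catalogs?vendorId=%s\n' "$SMAPI_BASE" "$1"
}

# URL for "Associate a catalog with a skill" ($1 = skill ID, $2 = catalog ID).
catalog_associate_url() {
  printf '%s/v0/skills/%s/catalogs/%s\n' "$SMAPI_BASE" "$1" "$2"
}

# Hypothetical wrapper: call SMAPI with the token taken from $SMAPI_TOKEN.
# Usage: smapi METHOD URL [extra curl options...]
smapi() {
  method="$1"; url="$2"; shift 2
  curl --silent --show-error --location --request "$method" \
    --header "Authorization: Bearer ${SMAPI_TOKEN:?set SMAPI_TOKEN first}" \
    "$url" "$@"
}
```

With these in place, associating and then listing would look like `smapi PUT "$(catalog_associate_url "$SKILL_ID" "$CATALOG_ID")"` followed by `smapi GET "$(catalog_list_url "$VENDOR_ID")" | jq .`, where SKILL_ID, CATALOG_ID and VENDOR_ID are placeholders for your own values.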

2. Upload audio data to the catalog

Now that the catalog has been created, we need to upload the audio data to be used for ASR Evaluation. The procedure is the same as a multipart upload in AWS S3, but the way it ties in with SMAPI makes it a little confusing, so I will explain it in more detail.

2.1 Open an upload request

First, send a request to Alexa saying that you want to upload audio data. Once the request is accepted, the response contains the ID of the upload and a presigned URL where the data should be placed. (A presigned URL is a URL that expires after a set time.)

To do this, use the Create an upload, v0 API. Let’s call it.

-> % curl --location --request POST 'https://api.amazonalexa.com/v0/catalogs/amzn1.ask-catalog.cat.72757baa-xxxx/uploads' \
--header 'Authorization: Bearer xxxxx83t' \
--header 'Content-Type: text/plain' \
--data-raw '{
    "numberOfUploadParts": 1
}' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1055  100  1023  100    32   3398    106 --:--:-- --:--:-- --:--:--  3504
{
  "catalogId": "amzn1.ask-catalog.cat.72757baa-xxxxxx",
  "createdDate": "2021-11-29T21:29:59.253Z",
  "id": "amzn1.ask-catalog.upl.44e82eb0-xxxxxxxx",
  "ingestionSteps": [
    {
      "errors": [],
      "logUrl": "",
      "name": "INGESTION",
      "status": "PENDING"
    },
    {
      "errors": [],
      "logUrl": "",
      "name": "UPLOAD",
      "status": "PENDING"
    },
    {
      "errors": [],
      "logUrl": "",
      "name": "SCHEMA_VALIDATION",
      "status": "PENDING"
    }
  ],
  "lastUpdatedDate": "2021-11-29T21:29:59.253Z",
  "presignedUploadParts": [
    {
      "partNumber": 1,
      "url": "https://ask-catalog-prod-na-tmp-upload.s3.amazonaws.com/contentAuthorities/AlexaTest/catalogScopes/AlexaTest.Catalog.AudioRecording/catalogs/amzn1.ask-catalog.cat.72757baa-2306-463a-95b9-af1f6447354e/uploads/b0ab46b5-2dea-4094-98b8-62dbd3733f90?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20211129T212959Z&X-Amz-SignedHeaders=host&X-Amz-Expires=3600&X-Amz-Credential=AKIAYKW4E4252YH7ICGJ%2F20211129%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=c15ee06b7e92922e78f44bf2a6fbe73487fbc2ce58ba8ec38aa5ee635e0ce189"
    }
  ],
  "status": "PENDING"
}

The numberOfUploadParts property in the request body can be set to 1 as long as the data is smaller than 5 GB. In the response, presignedUploadParts contains the URL to which we should upload the data.
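If you script this step, the part count can be derived from the archive size. The following is only a sketch: the 5 GiB per-part limit is my assumption based on the rule of thumb above, and number_of_upload_parts is a hypothetical helper name.

```shell
# Assumption: each part may hold up to 5 GiB, following the rule of thumb above.
MAX_PART_BYTES=$((5 * 1024 * 1024 * 1024))

# Print how many upload parts are needed for a file of $1 bytes.
number_of_upload_parts() {
  size_bytes="$1"
  if [ "$size_bytes" -le 0 ]; then
    echo 1
    return
  fi
  # Ceiling division: round up to the next whole part.
  echo $(( (size_bytes + MAX_PART_BYTES - 1) / MAX_PART_BYTES ))
}
```

For example, `number_of_upload_parts "$(wc -c < audio.zip | tr -d ' ')"` would print 1 for the small test archive used in this post (audio.zip is a placeholder file name).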

2.2 Upload the file

PUT the file to the URL acquired in step 2.1.

The upload must be a single Zip archive containing the audio files. Do not include the extension in each file name; “.mp3” is appended automatically.

Remember the ETag value returned in the response to the upload, because it is needed in the next step.
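As a rough sketch of this step, the functions below PUT the archive to the presigned URL and pull the ETag out of the response headers. The function names are hypothetical, and I am assuming the presigned URL answers with an ETag header in the same way a plain S3 PUT does.

```shell
# Read HTTP response headers on stdin and print the ETag without its quotes.
extract_etag() {
  tr -d '\r' | awk 'tolower($1) == "etag:" { gsub(/"/, "", $2); print $2 }'
}

# Hypothetical wrapper: upload the Zip archive and print the resulting ETag.
# $1 = presigned URL from step 2.1, $2 = path to the Zip archive.
upload_part() {
  curl --silent --show-error --upload-file "$2" \
    --dump-header - --output /dev/null "$1" | extract_etag
}
```

Usage would be something like `ETAG="$(upload_part "$PRESIGNED_URL" audio.zip)"`. Keep the URL in quotes, since it contains `&` characters that the shell would otherwise interpret.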

Check the status of the upload with the Get the list of uploads, v0 API.

-> % curl --location --request GET 'https://api.amazonalexa.com/v0/catalogs/amzn1.ask-catalog.cat.72757baa-2306-463a-95b9-af1f6447354e/uploads' \
--header 'Authorization: Bearer xxxxx3t' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  8070  100  8070    0     0  20906      0 --:--:-- --:--:-- --:--:-- 20906
{
  "_links": {
    "self": {
      "href": "/v0/catalogs/amzn1.ask-catalog.cat.72757baa-xxxxxx/uploads"
    }
  },
  "isTruncated": false,
  "uploads": [
  :
    {
      "catalogId": "amzn1.ask-catalog.cat.72757baa-xxxx",
      "createdDate": "2021-11-29T21:29:59.253Z",
      "id": "amzn1.ask-catalog.upl.44e82eb0-xxxxx",
      "lastUpdatedDate": "2021-11-29T21:29:59.253Z",
      "status": "PENDING"
    },
    :
  ]
}

The status still shows “PENDING”.

2.3 Close the upload request

After uploading, we must tell Alexa that the upload is complete.

Use the Complete an upload, v0 API. Let’s call it, then check the result with the Get the list of uploads, v0 API as before.

-> % curl --location --request POST 'https://api.amazonalexa.com/v0/catalogs/amzn1.ask-catalog.cat.72757baa-xxxxxx/uploads/amzn1.ask-catalog.upl.44e82eb0-xxxxxxx' \
--header 'Authorization: Bearer xxxxxx' \
--header 'Content-Type: text/plain' \
--data-raw '{
    "partETags": [
        {
            "eTag": "47xxxxxxx",
            "partNumber": 1
        }
    ]
}' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   132    0     0  100   132      0    287 --:--:-- --:--:-- --:--:--   287


-> % curl --location --request GET 'https://api.amazonalexa.com/v0/catalogs/amzn1.ask-catalog.cat.72757baa-xxxxx/uploads' \
--header 'Authorization: Bearer xxxxxxx' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  8072  100  8072    0     0  22360      0 --:--:-- --:--:-- --:--:-- 22360
{
  "_links": {
    "self": {
      "href": "/v0/catalogs/amzn1.ask-catalog.cat.72757baa-xxxx/uploads"
    }
  },
  "isTruncated": false,
  "uploads": [
    :
    {
      "catalogId": "amzn1.ask-catalog.cat.72757baa-xxxx",
      "createdDate": "2021-11-29T21:29:59.253Z",
      "id": "amzn1.ask-catalog.upl.44e82eb0-xxxx",
      "lastUpdatedDate": "2021-11-29T22:03:24.883Z",
      "status": "SUCCEEDED"
    },
     :
  ]
}

The status has changed to “SUCCEEDED”.
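The body sent to Complete an upload pairs each part number with the ETag from step 2.2. Here is a small sketch for generating it in the single-part case; part_etags_body is a hypothetical helper, and the JSON shape is simply copied from the request above.

```shell
# Print the JSON body for "Complete an upload" with a single part.
# $1 = ETag returned by the PUT in step 2.2, $2 = part number (defaults to 1).
part_etags_body() {
  printf '{"partETags":[{"eTag":"%s","partNumber":%d}]}\n' "$1" "${2:-1}"
}
```

The output can be passed straight to curl, for example with `--data-raw "$(part_etags_body "$ETAG")"`, where ETAG is a placeholder for the value you noted earlier.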

Was the file uploaded correctly? Let’s check with the Get information about a specified upload, v0 API.

-> % curl --location --request GET 'https://api.amazonalexa.com/v0/catalogs/amzn1.ask-catalog.cat.72757baa-xxxxxx/uploads/amzn1.ask-catalog.upl.44e82eb0-xxxxxx' \
--header 'Authorization: Bearer xxxxx' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1615  100  1615    0     0   2515      0 --:--:-- --:--:-- --:--:--  2515
{
  "catalogId": "amzn1.ask-catalog.cat.72757baa-xxxxxx",
  "createdDate": "2021-11-29T21:29:59.253Z",
  "file": {
    "presignedDownloadUrl": "https://ask-catalog-prod-na-content-upload.s3.amazonaws.com/xxxxxx",
    "status": "AVAILABLE"
  },
  "id": "amzn1.ask-catalog.upl.44e82eb0-xxxxxxxx",
  "ingestionSteps": [
    {
      "errors": [],
      "logUrl": "https://ask-catalog-prod-na-error-logs.s3.amazonaws.com/xxxxxx",
      "name": "INGESTION",
      "status": "SUCCEEDED"
    },
    {
      "errors": [],
      "logUrl": "",
      "name": "UPLOAD",
      "status": "SUCCEEDED"
    },
    {
      "errors": [],
      "logUrl": "",
      "name": "SCHEMA_VALIDATION",
      "status": "SUCCEEDED"
    }
  ],
  "lastUpdatedDate": "2021-11-29T22:03:24.883Z",
  "status": "SUCCEEDED"
}

Apparently the upload consists of three steps: INGESTION, UPLOAD, and SCHEMA_VALIDATION. In any case, every step shows SUCCEEDED, so the upload seems to have worked.

The uploaded file is available at file.presignedDownloadUrl in the response. Let’s download it.

-> % curl https://ask-catalog-prod-na-content-upload.s3.amazonaws.com/contentAuthorities/AlexaTest/catalogScopes/AlexaTest.Catalog.AudioRecording/catalogs/amzn1.ask-catalog.cat.72757baa-xxxxxxxxxx/uploads/amzn1.ask-catalog.upl.44exxxxxxx --output audio_dl.zip
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 14102  100 14102    0     0  22708      0 --:--:-- --:--:-- --:--:-- 22672

-> % unzip audio_dl.zip                                  
Archive:  audio_dl.zip
  inflating: konbanha                
  inflating: ohayougozaimasu         
  inflating: konnichiha              

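This confirms that the files were uploaded correctly.

Create an AnnotationSet

Now that the files are uploaded, let’s use them to create an AnnotationSet. Think of it as a test specification for ASR and NLU.

First, create the ASR AnnotationSet.

Use the Create an Annotation Set API, which creates an empty template for the test specification. Let’s call it. Skills are built per language, but ASR AnnotationSets currently do not seem to have any concept of separation by locale. It is a good idea to include the locale in the annotation set name so that the language of the annotations is easy to identify.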

-> % curl --location --request POST 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-5591-476f-8f58-ae94f43d41b9/asrAnnotationSets' \
--header 'Authorization: Bearer Atza|xxxxxx' \
--header 'Content-Type: text/plain' \
--data-raw '{
    "name": "sample-annotation-set-jp"
}' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   113  100    74  100    39    141     74 --:--:-- --:--:-- --:--:--   215
{
  "id": "amzn1.ask.asr-annotation-set.a84e6c2e-xxxxxxxx"
}

Now that the test template exists, add tests to the AnnotationSet. Use the Update Annotation Set Annotations API. Let’s call it.

-> % curl --location --request PUT 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-xxxxx/asrAnnotationSets/amzn1.ask.asr-annotation-set.a84e6c2e-xxxxx/annotations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer xxxxxxx9vQ' \
--data-raw '{
    "annotations": [{
        "uploadId": "amzn1.ask-catalog.upl.59dbd3db-xxxxxx",
        "filePathInUpload": "konnichiha",
        "evaluationWeight": 1,
        "expectedTranscription": "こんにちは"
    }]
}' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   238    0     0  100   238      0    217  0:00:01  0:00:01 --:--:--   217

Check the registered AnnotationSet with the Get Annotation Set Contents API. The Accept header only accepts application/json or text/csv, so remember to specify one of them.

-> % curl --location --request GET 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-xxxxxx/asrAnnotationSets/amzn1.ask.asr-annotation-set.a84e6c2e-xxxxxxx/annotations' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer xxxxxxGp9vQ' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   724  100   724    0     0   2445      0 --:--:-- --:--:-- --:--:--  2445
{
  "annotations": [
    {
      "evaluationWeight": 1,
      "expectedTranscription": "こんにちは",
      "filePathInUpload": "konnichiha",
      "uploadId": "amzn1.ask-catalog.upl.xxxxxx",
      "audioAsset": {
        "downloadUrl": "https://audio-transcoded-prod.s3.amazonaws.com/M2AUSLW6GQRMGE/amzn1.ask-catalog.cat.xxxxxxxx/amzn1.ask-catalog.upl.xxxxxx/MP3/konnichiha.mp3?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20211202T213445Z&X-Amz-SignedHeaders=host&X-Amz-Expires=10800&X-Amz-Credential=AKIAYWH6WUJBVHBXC6M2%2F20211202%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=a9a6e7bb428793c26b097fdc770b6f34f5226155e487d976a0f5dae977388bb1",
        "expiryTime": "2021-12-03T00:34:45.484Z"
      }
    }
  ]
}

You can see that an S3 URL is returned in downloadUrl. Look at the extension: it ends in .mp3. The files we uploaded in the previous section had no extension, but one has been added here.

"downloadUrl": "https://audio-transcoded-prod.s3.amazonaws.com/M2AUSLW6GQRMGE/amzn1.ask-catalog.cat.xxxxxxxx/amzn1.ask-catalog.upl.xxxxxx/MP3/konnichiha.mp3?X-Amz-Algorithmxxxxxxxx1"

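Note: Recording only the audio files in the Alexa developer console, …

The ASR AnnotationSet is done. Next, create the NLU AnnotationSet.

Use the Create a new annotation set API. The procedure is almost the same as for ASR. Let’s call it. Unlike ASR, a locale must be specified.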

-> % curl --location --request POST 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-xxxxx/nluAnnotationSets' \
--header 'Authorization: Bearer Atza|xxxxxxxxM' \
--header 'Content-Type: text/plain' \
--data-raw '{
    "name": "sample-annotation-set",
    "locale": "ja-JP"
}' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   107  100    45  100    62     44     61  0:00:01  0:00:01 --:--:--   106
{
  "id": "d52feae5-xxxxx"
}

Now that the NLU test template exists, add annotations to it. Use the Upload or update an annotation set API. Let’s call it.

-> % curl --location --request POST 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-xxxx/nluAnnotationSets/d52feae5-xxxxx/annotations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer xxxxxxx9M' \
--data-raw '{
  "data": [
    {
      "inputs": {
        "utterance": "こんにちは"
      },
      "expected": [
        {
          "intent": {
            "name": "AMAZON.CancelIntent",
            "slots": {}
          }
        }
      ]
    }
  ]
}' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   248    0     0  100   248      0    450 --:--:-- --:--:-- --:--:--   449

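Now that the annotations are added, let’s check them. The official documentation does not describe how to retrieve annotations, but it works if you change the HTTP method of the request we just ran to GET. Just be careful with the Accept header.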

-> % curl --location --request GET 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-xxxxxx/nluAnnotationSets/d52feae5-xxxxxxx/annotations' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer Atza|xxxxxxxxxB9M' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   119  100   119    0     0    231      0 --:--:-- --:--:-- --:--:--   231
{
  "data": [
    {
      "inputs": {
        "utterance": "こんにちは"
      },
      "expected": [
        {
          "intent": {
            "name": "AMAZON.CancelIntent",
            "slots": {}
          }
        }
      ]
    }
  ]
}

They have been added.

Run the tests

The tests are ready, so let’s run them. Call the Post ASR Evaluation API. It issues an ID that is used to fetch the results.

-> % curl --location --request POST 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-xxxxxx/asrEvaluations' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer Atzaxxxxxx' \
--data-raw '{
    "skill": {
        "stage": "development",
        "locale": "ja-JP"
    },
    "annotationSetId": "amzn1.ask.asr-annotation-set.a84e6c2e-xxxxxxx"
}' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   244  100    70  100   174     71    177 --:--:-- --:--:-- --:--:--   249
{
  "id": "amzn1.ask.asr-evaluation.a8a4a0c0-xxxxxxxx"
}

Next, run the NLU test as well, using the Start an evaluation API. The call is almost the same as for ASR.

-> % curl --location --request POST 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-xxxxxx/nluEvaluations' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer Atza|xxxxxx' \
--data-raw '{
  "stage": "development",
  "locale": "ja-JP",
  "source": {
    "annotationId": "d52feae5-a410-4f95-badf-5675fed8ae34"
  }
}' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   172  100    45  100   127     43    123  0:00:01  0:00:01 --:--:--   166
{
  "id": "f204cb5f-xxxxxx"
}
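Both calls return only an ID and the evaluations run in the background, so a script has to wait before it asks for results. Below is a generic polling sketch. poll_until is a hypothetical helper, and the "IN_PROGRESS" value is my assumption about what a not-yet-finished evaluation reports; the command that fetches the status is left to the caller.

```shell
# poll_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it prints something other than IN_PROGRESS,
# then prints that final value. Returns 1 if it never finishes.
poll_until() {
  max_tries="$1"; delay="$2"; shift 2
  try=1
  while [ "$try" -le "$max_tries" ]; do
    current="$("$@")"
    if [ "$current" != "IN_PROGRESS" ]; then
      printf '%s\n' "$current"
      return 0
    fi
    try=$((try + 1))
    sleep "$delay"
  done
  return 1
}
```

You would wrap your own status call in a function that prints just the status string, then run something like `poll_until 30 10 my_status_function` before fetching the results.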

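Get the test results

AnnotationSet evaluations run asynchronously. Use the ID obtained when starting the evaluation as the key to fetch the results. First, fetch the evaluation results for the ASR AnnotationSet with the Get ASR Evaluation Results API. We can see that the audio file was transcribed by STT (Speech To Text) as “こんにちは”, as expected, and that the test passed.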

-> % curl --location --request GET 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-xxxxxxxx/asrEvaluations/amzn1.ask.asr-evaluation.a8a4a0c0-4570-xxxxxxx/results' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer Atza|xxxxxx' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   798  100   798    0     0   1842      0 --:--:-- --:--:-- --:--:--  1842
{
  "results": [
    {
      "annotation": {
        "audioAsset": {
          "downloadUrl": "https://audio-xxxxx-prod.s3.amazonaws.com/M2Axxxxxxx",
          "expiryTime": "2021-12-03T19:21:12.072Z"
        },
        "evaluationWeight": 1,
        "expectedTranscription": "こんにちは",
        "filePathInUpload": "konnichiha",
        "uploadId": "amzn1.ask-catalog.upl.59dbd3db-xxxxxxx"
      },
      "output": {
        "transcription": "こんにちは"
      },
      "status": "PASSED"
    }
  ]
}

Next, check the NLU evaluation results with the Get the results of an evaluation API. The results show that the utterance did not resolve to the expected intent.

-> % curl --location --request GET 'https://api.amazonalexa.com/v1/skills/amzn1.ask.skill.842aa6a7-xxxxx/nluEvaluations/f204cb5f-xxxxxxx/results' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer Atza|xxxxxxxx' | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   440  100   440    0     0   1349      0 --:--:-- --:--:-- --:--:--  1345
{
  "_links": {
    "self": {
      "href": "/v1/skills/amzn1.ask.skill.842aa6a7-xxxxx/results"
    }
  },
  "paginationContext": {
    "totalCount": 1
  },
  "testCases": [
    {
      "actual": {
        "intent": {
          "confirmationStatus": "NONE",
          "name": "AMAZON.FallbackIntent",
          "slots": {}
        }
      },
      "expected": [
        {
          "intent": {
            "name": "AMAZON.CancelIntent",
            "slots": {}
          }
        }
      ],
      "inputs": {
        "utterance": "こんにちは"
      },
      "status": "FAILED"
    }
  ],
  "totalFailed": 1
}

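Summary

Honestly, this is fairly tedious work, but to keep improving an Alexa skill’s recognition rate, there is no doubt that a mechanism for evaluating ASR and NLU together is valuable. Right now ASR and NLU evaluation are completely separate, so it would be nice to combine these APIs into a seamless evaluation cycle that goes from a user’s audio file to the expected transcription, and from that transcription to the expected intent.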