How to Build a Voiceprint System in Just Three Steps

Background

Voiceprint Recognition Technologies

1. Voiceprint Retrieval Demo

Figure 1. Voiceprint demo system
Figure 2. Voice query
Figure 3. Voice registration
Figure 4. Voice recording and retrieval

2. Overall Design of Application Structure

Figure 5. Voiceprint retrieval database

3. System Accuracy

Table 1. Accuracy of the results that rank first

Three Steps to Building a Voiceprint System

Step 1: Initialization

import requests
import json
import numpy as np

# sound: binary content of the sound file.
# model_id: ID of the model.
def get_vector(sound, model_id='i-vector'):
    url = 'http://47.111.21.183:18089/demo/vdb/v1/retrieve'
    d = {'resource': sound,
         'model_id': model_id}
    r = requests.post(url, data=d)
    js = json.loads(r.text)
    return np.array(js['emb'])

# Read the user file in binary mode; the with block closes it automatically.
file = 'xxx.wav'
with open(file, 'rb') as f:
    data = f.read()
print(get_vector(data))

-- Create a user voiceprint table
CREATE TABLE person_voiceprint_detection_table(
    id serial primary key,
    name varchar,
    voiceprint_feature float4[]
);

-- Create a vector index
CREATE INDEX person_voiceprint_detection_table_idx
    ON person_voiceprint_detection_table
    USING ann(voiceprint_feature)
    WITH(distancemeasure=L2,dim=400,pq_segments=40);
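
The index above is declared with dim=400, so every vector stored in the table should have exactly 400 values. The following is a minimal sketch of a check you could run on the HTTP response before writing to the database. The function name parse_embedding and the constant EXPECTED_DIM are hypothetical; only the 'emb' field and the dimension of 400 come from the code above.

```python
import json

# Assumption: same value as dim=400 in the CREATE INDEX statement.
EXPECTED_DIM = 400


def parse_embedding(response_text, expected_dim=EXPECTED_DIM):
    """Parse the service's JSON reply and check the vector length.

    Hypothetical helper: it only assumes that the reply is a JSON object
    with an 'emb' field holding a list of numbers.
    """
    js = json.loads(response_text)
    if not isinstance(js, dict) or 'emb' not in js:
        raise ValueError("response has no 'emb' field")
    vec = [float(x) for x in js['emb']]
    if len(vec) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} values, got {len(vec)}")
    return vec
```

Calling parse_embedding(r.text) in place of the bare json.loads call would surface a wrong model_id or a truncated response early, instead of at insert or query time.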

Step 2: Registering a User’s Voice

-- Register the user "John" in the current system.
-- Use the HTTP service to convert the voiceprint into a corresponding vector.
INSERT INTO person_voiceprint_detection_table(name, voiceprint_feature)
SELECT 'John', array[-0.017,-0.032,...]::float4[];
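
In practice the vector comes from the HTTP service rather than being typed into the SQL by hand. Here is a minimal sketch of how the registration statement could be built in Python with placeholders instead of string concatenation. The function name build_register_statement is hypothetical, and the %s placeholder style assumes a DB-API driver such as psycopg2, which adapts a Python list to a PostgreSQL array; the sketch itself does not connect to a database.

```python
# Table name taken from the CREATE TABLE statement in Step 1.
TABLE = 'person_voiceprint_detection_table'


def build_register_statement(name, embedding):
    """Return (sql, params) for registering one user.

    Hypothetical helper: the values are passed as parameters so the
    driver handles quoting of the name and the array.
    """
    if not name:
        raise ValueError('name must not be empty')
    values = [float(x) for x in embedding]
    sql = (f'INSERT INTO {TABLE}(name, voiceprint_feature) '
           'VALUES (%s, %s::float4[])')
    return sql, (name, values)


# Example with a shortened, made-up vector (a real one has 400 values).
sql, params = build_register_statement('John', [-0.017, -0.032])
print(sql)
print(params)
```

With a psycopg2 cursor the result would be executed as cursor.execute(sql, params), followed by a commit on the connection.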

Step 3: Retrieving and Authenticating a User’s Voice

-- Voiceprint authentication for door locks (1:1 verification)
-- A column alias cannot be referenced in WHERE, so the distance is
-- computed in a subquery and filtered in the outer query.
SELECT id, name, distance
FROM (
    SELECT id,    -- User ID
           name,  -- User name
           l2_distance(voiceprint_feature, ARRAY[-0.017,-0.032,...]::float4[]) AS distance -- Distance between the vectors
    FROM person_voiceprint_detection_table -- User voice table
    WHERE id = user_id -- The user ID to authenticate (placeholder)
) t
WHERE distance < threshold; -- Generally, the threshold is 550

-- Voiceprint recognition of a conference speaker (1:N identification)
SELECT id, name, distance
FROM (
    SELECT id,    -- User ID
           name,  -- User name
           l2_distance(voiceprint_feature, ARRAY[-0.017,-0.032,...]::float4[]) AS distance -- Distance between the vectors
    FROM person_voiceprint_detection_table -- User voice table
    ORDER BY voiceprint_feature <-> ARRAY[-0.017,-0.032,...]::float4[] -- Use the vectors to sort
    LIMIT 1 -- Keep only the most similar result
) t
WHERE distance < threshold; -- Generally, the threshold is 550
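
To make the difference between the two queries concrete, here is a small in-memory sketch of the same decision logic in plain Python. It is not how the database executes the queries (the vector index is there precisely to avoid scanning every row); it only illustrates 1:1 verification versus 1:N identification. The function names, the dictionary layout, and the user "Ann" are made up for illustration, and whether a threshold of 550 is appropriate depends on the model and on the distance function the database actually uses.

```python
import math

# Assumption: the value quoted in the SQL comments above.
THRESHOLD = 550


def verify(users, user_id, probe, threshold=THRESHOLD):
    """1:1 check: does the probe match the stored vector of user_id?"""
    rec = users.get(user_id)
    if rec is None:
        return False
    return math.dist(rec['vec'], probe) < threshold


def identify(users, probe, threshold=THRESHOLD):
    """1:N search: return (id, name, distance) of the nearest user,
    or None if even the nearest one is not within the threshold."""
    best = None
    for uid, rec in users.items():
        d = math.dist(rec['vec'], probe)
        if best is None or d < best[2]:
            best = (uid, rec['name'], d)
    if best is None or best[2] >= threshold:
        return None
    return best


# Toy 2-dimensional vectors; real voiceprints here have 400 dimensions.
users = {
    1: {'name': 'John', 'vec': [0.0, 0.0]},
    2: {'name': 'Ann', 'vec': [1000.0, 0.0]},
}
print(verify(users, 1, [3.0, 4.0]))
print(identify(users, [3.0, 4.0]))
```

Verification only ever compares against one claimed identity, while identification finds the nearest registered user first and then applies the threshold, which mirrors the ORDER BY ... LIMIT 1 followed by the distance filter in the second query.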


Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com
