Volume (14) - Issue (1) 2026 pp 1-8 DOI: 10.62346/ijcn_q1_v14_no1_26_02

Architectural Design and Performance Evaluation of Machine Learning-Based Speaker Recognition Systems

Abstract

This paper describes an implemented speaker identification system built on a 1D Convolutional Neural Network (CNN). The classifier processes simulated Mel-Frequency Cepstral Coefficient (MFCC) features to distinguish between four speakers. Instead of acquiring real audio data, the system generates 80 fixed-length feature vectors (length 100), in which each speaker's distinct acoustic signature is simulated by assigning a unique mean offset to that speaker's feature distribution. After the features are reshaped for the Conv1D input and the data are split, the defined CNN architecture, which includes two Conv1D layers and MaxPooling1D blocks, is trained. The model demonstrates the capacity of 1D CNNs for sequence classification in biometric tasks, yielding near-perfect accuracy owing to the highly separable nature of the generated voice features.
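
The data-simulation steps summarized in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the speaker count (4), sample count (80), and vector length (100) come from the abstract, while the offset spacing, noise level, equal number of samples per speaker, test fraction, and all function names are assumptions. The CNN itself is not reproduced here. A simple nearest-centroid check stands in for it, only to illustrate why features separated by a mean offset can be classified almost perfectly.

```python
import random
from statistics import fmean

NUM_SPEAKERS = 4      # from the abstract
NUM_SAMPLES = 80      # from the abstract
FEATURE_LEN = 100     # from the abstract
MEAN_STEP = 2.0       # assumption: spacing between per-speaker mean offsets
NOISE_STD = 1.0       # assumption: spread of the simulated features
TEST_FRACTION = 0.2   # assumption: share of samples held out for testing


def simulate_features(seed=0):
    """Generate simulated MFCC-like vectors, one mean offset per speaker."""
    rng = random.Random(seed)
    features, labels = [], []
    for i in range(NUM_SAMPLES):
        speaker = i % NUM_SPEAKERS  # assumption: equal samples per speaker
        offset = speaker * MEAN_STEP
        features.append([rng.gauss(offset, NOISE_STD) for _ in range(FEATURE_LEN)])
        labels.append(speaker)
    return features, labels


def to_conv1d_shape(features):
    """Reshape (samples, steps) to (samples, steps, 1), the layout Conv1D expects."""
    return [[[value] for value in vector] for vector in features]


def train_test_split(samples, labels, test_fraction=TEST_FRACTION, seed=0):
    """Shuffle indices and hold out a fraction of the samples for testing."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [samples[i] for i in train_idx],
        [labels[i] for i in train_idx],
        [samples[i] for i in test_idx],
        [labels[i] for i in test_idx],
    )


def sample_mean(sample):
    """Average feature value of one (steps, 1) sample."""
    return fmean(step[0] for step in sample)


def fit_centroids(x_train, y_train):
    """Stand-in for the CNN: store the average sample mean of each speaker."""
    return {
        speaker: fmean(sample_mean(s) for s, y in zip(x_train, y_train) if y == speaker)
        for speaker in set(y_train)
    }


def predict(sample, centroids):
    """Assign the speaker whose centroid is closest to the sample mean."""
    m = sample_mean(sample)
    return min(centroids, key=lambda speaker: abs(centroids[speaker] - m))


features, labels = simulate_features()
x = to_conv1d_shape(features)
x_train, y_train, x_test, y_test = train_test_split(x, labels)

centroids = fit_centroids(x_train, y_train)
correct = sum(predict(s, centroids) == y for s, y in zip(x_test, y_test))
accuracy = correct / len(x_test)
print(f"train={len(x_train)} test={len(x_test)} accuracy={accuracy:.2f}")
```

With offsets this far apart relative to the noise, even this trivial baseline separates the speakers, which is consistent with the abstract's remark that the near-perfect accuracy reflects the separability of the generated features rather than the difficulty of the task.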

Keywords

Speech Signals, Feature Extraction, Classification, Convolutional Neural Network (CNN), Emotion Recognition System (ERS), Facial Emotion Recognition (FER).

Copyright © 2013-2026 ERES Publications