We present SQ-AST, a transformer-based model for non-intrusive speech quality prediction. The model predicts overall speech quality and four perceptual dimensions (noisiness, discontinuity, coloration, and loudness) using only the degraded signal. SQ-AST leverages the Audio Spectrogram Transformer (AST), pretrained on large-scale audio datasets and fine-tuned on diverse speech quality corpora. It operates on short speech clips (4–12 seconds) without requiring a reference signal. Training was conducted on 106 databases comprising 165,791 samples. Independent evaluations confirm strong generalization to real-world conditions. The model is currently under consideration for ITU-T standardization, highlighting its potential for benchmarking, quality assessment, and industry adoption.

Index Terms: speech quality assessment, deep learning, audio spectrogram transformer, perceptual evaluation, machine learning.