ARTI-6: Towards Six-dimensional Articulatory Speech Encoding

Speech Samples

Speaker ID 237 (Female)

Intermediate Feature Type	Dimension Size	The shaggy coat of the prairie, which they lifted to make him a bed, has vanished forever.	The wind was flapping her big hat and teasing a curl of her chestnut colored hair.	His yellow canvas leggings and khaki trousers were splashed to the knees.
Ground-truth	-

Mel-spectrogram [1]	80

EMA+pitch+loudness [2]	14

*ARTI-6 (Ours)*	6

Speaker ID 1580 (Female)

Intermediate Feature Type	Dimension Size	"I am afraid there are no signs here," said he.	I'll take the armchair in the middle.	Well, then, I must make some suggestions to you.
Ground-truth	-

Mel-spectrogram [1]	80

EMA+pitch+loudness [2]	14

*ARTI-6 (Ours)*	6

Speaker ID 260 (Male)

Intermediate Feature Type	Dimension Size	The danger is approaching.	During his watch I slept.	Hans had spoken truly.
Ground-truth	-

Mel-spectrogram [1]	80

EMA+pitch+loudness [2]	14

*ARTI-6 (Ours)*	6

Speaker ID 2830 (Male)

Intermediate Feature Type	Dimension Size	The most they could claim is that they were sent by others.	It must be watched.	There may be something to that.
Ground-truth	-

Mel-spectrogram [1]	80

EMA+pitch+loudness [2]	14

*ARTI-6 (Ours)*	6

References

[1] Jungil Kong, Jaehyeon Kim, and Jaekyoung Bae, “Hifi-gan: Generative adversarial networks for efficient and high fidelity speech synthesis,” Advances in neural information processing systems, 2020

[2] Cheol J Cho, Peter Wu, Tejas S Prabhune, Dhruv Agarwal, and Gopala K Anumanchipalli, “Coding speech through vocal tract kinematics,” IEEE Journal of Selected Topics in Signal Processing, 2024

ARTI-6: Towards Six-dimensional Articulatory Speech Encoding

Submitted to ICASSP 2026

Paper: Click here

Code: Click here

Authors

Jihwan Lee¹, Sean Foley^1,2, Thanathai Lertpetchpun¹, Kevin Huang¹, Yoonjeong Lee¹, Tiantian Feng¹, Louis Goldstein², Dani Byrd²Shrikanth Narayanan^1,2

¹Signal Analysis and Interpretation Laboratory, University of Southern California, USA
²Department of Linguistics, University of Southern California, USA

Abstract

Speech Samples

Speaker ID 237 (Female)

Speaker ID 1580 (Female)

Speaker ID 260 (Male)

Speaker ID 2830 (Male)

References

ARTI-6: Towards Six-dimensional Articulatory Speech Encoding

Submitted to ICASSP 2026

Paper: Click here

Code: Click here

Authors

Jihwan Lee1, Sean Foley1,2, Thanathai Lertpetchpun1, Kevin Huang1, Yoonjeong Lee1, Tiantian Feng1, Louis Goldstein2, Dani Byrd2Shrikanth Narayanan1,2

1Signal Analysis and Interpretation Laboratory, University of Southern California, USA 2Department of Linguistics, University of Southern California, USA

Abstract

Speech Samples

Speaker ID 237 (Female)

Speaker ID 1580 (Female)

Speaker ID 260 (Male)

Speaker ID 2830 (Male)

References

Jihwan Lee¹, Sean Foley^1,2, Thanathai Lertpetchpun¹, Kevin Huang¹, Yoonjeong Lee¹, Tiantian Feng¹, Louis Goldstein², Dani Byrd²Shrikanth Narayanan^1,2

¹Signal Analysis and Interpretation Laboratory, University of Southern California, USA
²Department of Linguistics, University of Southern California, USA