Introduction: Cardiac auscultation accuracy is poor, ranging from 20% to 40%. Audio-only training with 500 repetitions of heart sound cycles over a short time period has significantly improved auscultation scores. Hypothesis: adding visual information to an audio-only format significantly improves short- and long-term accuracy. Methods: Twenty-two first- and second-year medical students took an audio-only pre-test. Seven students, forming the audio-only training cohort, listened to 500 heart sound repetitions. Fifteen students, forming the paired visual-with-audio cohort, listened to the heart sounds while simultaneously watching video spectrograms of them. Immediately after training, both cohorts took an audio-only post-test; the visual-with-audio cohort also took a visual-with-audio post-test, which provided audio with simultaneous video spectrograms. All tests were repeated six months later. Results: All tests given immediately after training showed significant improvement over the pre-test, with no significant difference between the cohorts. Six months later, neither cohort maintained significant improvement on the audio-only post-test; the visual-with-audio cohort did, however, maintain significant improvement on the visual-with-audio post-test. Conclusions: Heart sound recognition from audio alone is not retained at six months, whether students are trained with audio only or with visual with audio. Providing visual information with audio in both training and testing allows retention of auscultation accuracy. Devices that provide visual information during auscultation could prove beneficial.