Hybrid Anomaly Detection for Bullion Coin Authentication Leveraging Acoustic Signature Analysis

Source: arXiv:2604.27803 · Published 2026-04-30 · By Krzysztof Siwek, Tran Hoai Linh, Tomasz Gryczka, Maciej Stodolski

TL;DR

This paper addresses the critical problem of authenticating bullion coins, which is increasingly challenging due to the sophistication of counterfeit coins that are visually nearly indistinguishable from genuine specimens. The authors propose a novel, non-destructive verification framework that leverages the unique acoustic fingerprint of coins. By mechanically striking coins and analyzing their resonance frequency spectra, their method captures physical properties linked to metal composition and geometry. The approach combines an autoencoder-based anomaly detection model trained solely on authentic coins with a classifier built on the autoencoder’s latent space for coin type identification.

The key innovation is a dual-model system that mitigates the data scarcity and imbalance challenge inherent in counterfeit coin detection by exploiting unsupervised reconstruction error for anomaly detection, avoiding the need for counterfeit training samples. The authors employ dynamic thresholding and data augmentation to increase robustness to environmental noise and recording variability. Experimental results demonstrate accurate differentiation between authentic and high-quality tungsten-core counterfeit Australian Kangaroo silver coins and stable classifier performance across varying acoustic conditions. The work also suggests scalability of this acoustic approach for non-destructive testing of safety-critical components beyond coin authentication, such as in aerospace and automotive industries.

Key findings

Four primary resonance frequencies were identified in the genuine Australian Kangaroo silver coin acoustic spectrum: approximately 3505.6 Hz, 3770.0 Hz, 8648.75 Hz, and 15258.12 Hz (Table I).
Inter-specimen variability is low at higher frequency modes (mode 3 frequency SD of 5.62 Hz and amplitude SD of 3.58 dB), but somewhat higher at lower modes (frequency SD ~47–55 Hz with amplitude SD ~15–18 dB), attributed primarily to striking and recording differences (Table II).
The counterfeit coin with a tungsten core exhibited substantial deviations in resonance frequencies and amplitudes compared to the authentic coin, despite identical visual features (Fig. 4).
Autoencoder training used mean squared error loss for 30 epochs on a small dataset, with the reconstruction error stabilizing around epoch 25 (Fig. 7).
The dynamic anomaly detection threshold was set to µ+3σ of the peak-matching distance metric, giving a numeric threshold of 131 and a stated 99.75% confidence range (Equation 6).
The autoencoder accurately reconstructed the frequency spectra of genuine coins, with a final peak distance of 0.13 from the original spectra, well below the counterfeit detection threshold (Fig. 8).
Reconstruction errors for counterfeit coins significantly exceeded the anomaly detection threshold, enabling robust counterfeit identification without training on negative examples (Fig. 9).
A classifier built on the 128-dimensional latent representation showed that the latent space is linearly separable enough to distinguish coin types (Australian Kangaroo vs. Athenian Owl) using a simple 2-layer network with 64 ReLU neurons and 2 linear output neurons (Fig. 6).
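The µ+3σ rule in the findings above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the calibration distances are hypothetical, and the use of the population standard deviation is an assumption (the summary does not say which estimator the paper uses).

```python
import statistics

def dynamic_threshold(genuine_distances, k=3.0):
    """Return mean + k * standard deviation of distances measured on genuine coins."""
    mu = statistics.fmean(genuine_distances)
    # Population SD is an assumption; the sample SD is an equally plausible choice.
    sigma = statistics.pstdev(genuine_distances)
    return mu + k * sigma

def is_counterfeit(distance, threshold):
    """Flag a sample whose distance exceeds the calibrated threshold."""
    return distance > threshold

# Hypothetical calibration distances, not values from the paper.
calibration = [10.0, 12.0, 14.0, 16.0, 18.0]
threshold = dynamic_threshold(calibration)
```

Because the threshold is recomputed from whatever genuine recordings are available, it moves with the recording conditions rather than being fixed in advance.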

Threat model

The adversary is a counterfeiter producing visually indistinguishable high-quality fake bullion coins (e.g., tungsten cores plated with silver). They do not have access to acoustic recording equipment or the ability to directly alter the acoustic verification system but attempt to evade detection by closely mimicking the genuine coin's physical and acoustic properties. The adversary's capabilities do not include tampering with the recorded audio signals or the trained model.

Methodology — deep read

The core methodology captures the acoustic signature a coin produces when it is struck with a rigid object and the resulting sound is recorded. Training assumes that only authentic coin recordings are available, with no counterfeit sound data. Consistent with the threat model, the adversary cannot manipulate the recording process but may present high-quality counterfeits that are visually indistinguishable from originals.
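As a rough illustration of how resonance peaks can be read off a recorded strike, the sketch below builds a synthetic "ring" from two decaying sinusoids and recovers their frequencies from the magnitude spectrum. Everything here is an assumption made for the example: the 44.1 kHz sample rate, the 0.4 s duration, the decay rate, the Hann window, and the greedy peak picker are not taken from the paper; only the two tone frequencies echo values quoted in the key findings.

```python
import numpy as np

def magnitude_spectrum(signal, sample_rate):
    """Return (frequencies in Hz, magnitudes) of a real-valued signal."""
    window = np.hanning(len(signal))
    magnitudes = np.abs(np.fft.rfft(signal * window))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs, magnitudes

def top_peaks(freqs, magnitudes, n_peaks=2, min_separation_hz=100.0):
    """Greedy peak picking: take the strongest bins, skipping bins near a chosen peak."""
    chosen = []
    for idx in np.argsort(magnitudes)[::-1]:
        if all(abs(freqs[idx] - freqs[j]) >= min_separation_hz for j in chosen):
            chosen.append(idx)
        if len(chosen) == n_peaks:
            break
    return sorted(float(freqs[i]) for i in chosen)

# Synthetic ring: two decaying tones (assumed parameters, not measured data).
sample_rate = 44_100
t = np.arange(17_640) / sample_rate  # 0.4 s at 44.1 kHz
ring = (np.exp(-8 * t) * np.sin(2 * np.pi * 3505.6 * t)
        + 0.6 * np.exp(-8 * t) * np.sin(2 * np.pi * 8648.75 * t))

freqs, mags = magnitude_spectrum(ring, sample_rate)
peaks = top_peaks(freqs, mags, n_peaks=2)
```

With these settings the frequency resolution is 2.5 Hz per bin, so recovered peaks can only be expected to land within a few hertz of the true tones.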

The data are the authors’ own recordings of several specimens per coin type (e.g., Australian Kangaroo and Athenian Owl silver bullion coins), with a limited sample size (~dozens). No public dataset exists, and no counterfeit coins are used for training; only a few counterfeit specimens are available for evaluation. Preprocessing extracts a fixed-length segment after the impact, applies RMS normalization and zero-padding, and augments the data with amplitude scaling and Gaussian noise to mimic diverse acoustic environments.
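The preprocessing and augmentation steps described above could look roughly like the following. This is a sketch under stated assumptions: the function names, the `onset_index` parameter (how the impact is located is not described in this summary), the gain range, and the noise level are all hypothetical.

```python
import numpy as np

def preprocess(recording, onset_index, segment_len):
    """Cut a fixed-length segment after the impact, RMS-normalize it, then zero-pad."""
    segment = np.asarray(recording[onset_index:onset_index + segment_len], dtype=np.float64)
    rms = np.sqrt(np.mean(segment ** 2)) if len(segment) else 0.0
    if rms > 0:
        segment = segment / rms
    # Pad with trailing zeros when the recording ends before the segment does.
    return np.pad(segment, (0, segment_len - len(segment)))

def augment(segment, rng, gain_range=(0.8, 1.2), noise_std=0.01):
    """Random amplitude scaling plus additive Gaussian noise (assumed ranges)."""
    gain = rng.uniform(*gain_range)
    noise = rng.normal(0.0, noise_std, size=segment.shape)
    return gain * segment + noise

rng = np.random.default_rng(0)
recording = np.sin(np.linspace(0.0, 20.0, 50))   # stand-in for a recorded strike
clean = preprocess(recording, onset_index=30, segment_len=32)
noisy = augment(clean, rng)
```

Normalizing before padding keeps the RMS of the recorded portion at 1 regardless of how much padding is needed; whether the paper normalizes before or after padding is not stated here.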

The system implements a dual neural architecture: an autoencoder and a classifier. The autoencoder's encoder compresses the 8820-length frequency spectrum through three linear layers with ReLU activations and dropout (0.1) into a 128-dimensional latent representation, and a symmetric decoder then reconstructs the input spectrum. Training minimizes mean squared error using the Adam optimizer with a learning rate of 0.0005, for 30 epochs with a batch size of 4.
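To make the layer shapes concrete, the sketch below traces one forward pass and the MSE loss in plain NumPy. It is not a training implementation: the hidden widths (512 and 256), the weight initialization, and the choice to leave the latent and output layers without an activation are assumptions, since the summary gives only the input size, the latent size, and the dropout rate. Actual training with Adam would normally be done in a deep learning framework.

```python
import numpy as np

def init_layers(sizes, rng):
    """Random weights and zero biases for consecutive layer sizes (illustrative init)."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)).astype(np.float32),
             np.zeros(n, dtype=np.float32))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers, rng=None, dropout=0.1):
    """Apply linear layers; ReLU (and dropout when rng is given) on all but the last."""
    for i, (w, b) in enumerate(layers):
        x = x @ w + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
            if rng is not None:  # training mode: inverted dropout
                keep = rng.random(x.shape) >= dropout
                x = x * keep / (1.0 - dropout)
    return x

rng = np.random.default_rng(0)
encoder_sizes = [8820, 512, 256, 128]      # 512 and 256 are assumed hidden widths
encoder = init_layers(encoder_sizes, rng)
decoder = init_layers(encoder_sizes[::-1], rng)

batch = rng.random((4, 8820), dtype=np.float32)   # batch size 4, as in the summary
latent = forward(batch, encoder)                  # shape (4, 128)
reconstruction = forward(latent, decoder)         # shape (4, 8820)
mse = float(np.mean((reconstruction - batch) ** 2))
```

The 128-dimensional `latent` array is the representation that the coin-type classifier consumes.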

For anomaly detection, the model is trained only on authentic coin spectra, so it learns to reconstruct these with low error. At inference, a distance metric compares the peak frequencies and amplitudes of the original and reconstructed spectra, weighting frequency (0.8) more heavily than amplitude (0.2). A statistically derived threshold (mean plus 3 standard deviations) is applied to this distance to classify samples as authentic or counterfeit.
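One plausible form of the weighted peak distance is sketched below. The 0.8 and 0.2 weights and the threshold of 131 come from the summary; the nearest-frequency matching rule, the use of absolute differences averaged over peaks, and the example peak values are assumptions, since the exact formula is not reproduced here. Distances from this assumed formula are therefore not guaranteed to be on the same scale as the paper's.

```python
def peak_distance(original_peaks, reconstructed_peaks, w_freq=0.8, w_amp=0.2):
    """Weighted mean gap between each original peak and its nearest reconstructed peak.

    Peaks are (frequency_hz, amplitude_db) pairs.
    """
    total = 0.0
    for freq, amp in original_peaks:
        # Match by closest frequency (an assumed matching rule).
        match_freq, match_amp = min(reconstructed_peaks, key=lambda p: abs(p[0] - freq))
        total += w_freq * abs(freq - match_freq) + w_amp * abs(amp - match_amp)
    return total / len(original_peaks)

THRESHOLD = 131.0  # numeric threshold quoted in the summary

# Hypothetical peak lists, not measurements from the paper.
original = [(3505.6, -20.0), (8648.75, -30.0)]
well_reconstructed = [(3506.6, -21.0), (8649.75, -29.0)]
poorly_reconstructed = [(3205.6, -20.0), (8148.75, -30.0)]

d_good = peak_distance(original, well_reconstructed)
d_bad = peak_distance(original, poorly_reconstructed)
```

A spectrum the autoencoder reconstructs well yields a small distance, while one it has never learned to reproduce drifts past the threshold.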

The classifier takes the autoencoder’s 128-dimensional latent outputs as features and feeds them to a small two-layer fully connected network (64 ReLU neurons followed by 2 linear outputs), trained with cross-entropy loss to distinguish coin types. Principal component analysis was used to visualize the separability of the feature space.
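The classifier head is small enough to write out directly. The sketch below shows the 128 → 64 → 2 forward pass and a numerically stable cross-entropy in NumPy; the random weights, the stand-in latent vectors, and the label encoding (0 and 1 for the two coin types) are placeholders, and no training step is shown.

```python
import numpy as np

def classifier_logits(latent, w1, b1, w2, b2):
    """64 ReLU hidden units followed by 2 linear outputs."""
    hidden = np.maximum(latent @ w1 + b1, 0.0)
    return hidden @ w2 + b2

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy, computed via a stable log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

rng = np.random.default_rng(0)
w1, b1 = rng.normal(0.0, 0.1, (128, 64)), np.zeros(64)
w2, b2 = rng.normal(0.0, 0.1, (64, 2)), np.zeros(2)

latent = rng.normal(size=(4, 128))     # stand-in for autoencoder latent vectors
labels = np.array([0, 1, 0, 1])        # hypothetical encoding of the two coin types
logits = classifier_logits(latent, w1, b1, w2, b2)
loss = cross_entropy(logits, labels)
predicted = logits.argmax(axis=1)
```

Reusing the latent vectors as features means the classifier adds only a few thousand parameters on top of the autoencoder.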

Evaluation focuses on reconstruction error convergence, distance metric deviations for genuine versus counterfeit coins, and classification accuracy on identifying coin types. The study lacks large-scale cross-validation or adversarial testing but demonstrates generalization across noise-augmented samples and recording devices. Code and data are not publicly released, limiting reproducibility.

Technical innovations

A hybrid dual-model architecture combining an unsupervised autoencoder for anomaly detection trained exclusively on authentic coin acoustic signatures with a classifier on the autoencoder’s latent features for coin type recognition.
Dynamic anomaly threshold selection based on statistical analysis of reconstruction distances to adapt to variability in acoustic recording conditions.
Data augmentation tailored for acoustic signals combining amplitude scaling and Gaussian noise addition to compensate for limited and homogeneous acoustic training data.
A weighted peak-based distance metric emphasizing frequency differences over amplitude for robust anomaly scoring between original and reconstructed frequency spectra.

Datasets

Proprietary bullion coin acoustic recordings — approx. 10-12 samples per coin type — created by authors
Various counterfeit coin samples (very limited, not publicly released)

Baselines vs proposed

No direct baselines are reported because the dataset is unique. Performance is instead compared internally using reconstruction peak distances: genuine coins achieve a mean reconstruction peak distance of 0.13 (below the threshold of 131), while counterfeit coins fall well above the threshold.

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2604.27803.

Fig 1: One-ounce Australian Kangaroo Silver Coin

Fig 2: Mechanical impact spectrogram

Fig 3: Spectrum of frequency spectra for three different specimens of the

Fig 4: Frequency spectrum for genuine and counterfeit coins

Fig 5: Example of a sound spectrogram before impact (left) and after it has

Fig 6: presents a visualization of the sample distribution

Fig 7: shows the average loss over successive epochs

Fig 8: shows a comparison of the original recording of

Limitations

Small and imbalanced dataset with very limited counterfeit samples, limiting statistical robustness.
No adversarial evaluation against adaptive counterfeit strategies or environmental acoustic manipulations.
Lack of large-scale cross-validation or external public datasets restricts generalizability assessment.
No public release of code or datasets inhibits external reproducibility and independent verification.
Model robustness validated primarily on one coin family (Australian Kangaroo silver) with limited coin types considered.

Open questions / follow-ons

How well does the acoustic anomaly detection generalize to a broader range of counterfeit techniques and materials beyond tungsten cores?
Can the methodology be extended to authenticate coins from multiple mints and precious metals with diverse acoustic profiles?
What are the limits of robustness under adversarial acoustic perturbations or recording environment variability?
How can synthetic data generation or physics-based simulation be leveraged to augment counterfeit data diversity for improved model training?

Why it matters for bot defense

For bot-defense practitioners, this paper is a useful example of combining physical signals, in this case acoustic fingerprints, with unsupervised machine learning for anomaly detection when negative-class data is scarce or unavailable. The approach shows how autoencoder reconstruction errors combined with dynamic thresholding can identify subtle deviations indicative of forgery or attack, a principle that can carry over to detecting automated bot interactions or fraud in complex input spaces.

The paper also demonstrates the importance of carefully engineered preprocessing, augmentation, and robust feature extraction from noisy real-world signals, all of which bot defense needs in order to generalize across diverse user environments. Understanding the tradeoffs between supervised classification and one-class anomaly detection informs CAPTCHA design choices where collecting labeled attack data is infeasible. The methodology's focus on verifying signal integrity through hardware and material properties could inspire multi-factor bot-detection mechanisms that go beyond traditional behavioral cues.

Cite

bibtex

@article{arxiv2604_27803,
  title={Hybrid Anomaly Detection for Bullion Coin Authentication Leveraging Acoustic Signature Analysis},
  author={Krzysztof Siwek and Tran Hoai Linh and Tomasz Gryczka and Maciej Stodolski},
  journal={arXiv preprint arXiv:2604.27803},
  year={2026},
  url={https://arxiv.org/abs/2604.27803}
}
