AUTHOREA

1857 signal processing and analysis Preprints

Please note: These are preprints and have not been peer reviewed. Data may be preliminary.
Unsupervised Multitemporal Multiclass Change Detection

Rogério G Negri

and 4 more

January 26, 2024
We address the crucial task of identifying changes in land cover using remotely sensed imagery. While most change detection methods focus on two images, we introduce an unsupervised approach that considers long image series (more than two), supporting a more nuanced differentiation between changed and unchanged areas. The proposed technique transforms input data to a new representation, capturing the target's spectral response changes over time. Areas with minimal response variation are identified as non-changing and distinguished from regions that have undergone modifications. The method further categorizes, utilizing statistical procedures, regions undergoing spatiotemporal modifications into seasonal or permanent changes. Experimental validation using simulated and real-world remote sensing image series demonstrates the effectiveness of the proposed approach.
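The first step described above, separating low-variation pixels from the rest of the series, can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not specify the new representation or the statistical procedures, so the per-pixel temporal standard deviation, the `threshold` value, and the toy reflectance values are all assumptions.

```python
from statistics import pstdev


def unchanged_mask(series, threshold):
    """Mark pixels whose response barely varies over time.

    series: list of T images, each a list of rows of floats (same shape).
    Returns a 2-D list of booleans, True where the temporal standard
    deviation of the pixel is at or below `threshold` (assumed criterion).
    """
    rows, cols = len(series[0]), len(series[0][0])
    mask = []
    for r in range(rows):
        mask.append([
            pstdev([img[r][c] for img in series]) <= threshold
            for c in range(cols)
        ])
    return mask


# Toy 1x3 image observed on four dates (hypothetical values).
series = [
    [[0.30, 0.30, 0.10]],
    [[0.31, 0.50, 0.10]],
    [[0.29, 0.30, 0.80]],
    [[0.30, 0.50, 0.80]],
]
mask = unchanged_mask(series, threshold=0.05)
```

Only the first pixel stays within the tolerance here. Telling seasonal from permanent change among the remaining pixels would need the statistical tests that the abstract mentions but does not detail.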
A Low-Complexity Detector for OTFS-based Sensing

Tommaso Bacchielli

and 3 more

January 26, 2024
Orthogonal time frequency space (OTFS) modulation holds great promise for enabling integrated sensing and communication (ISAC) systems in future mobile networks. However, computing the channel matrix in OTFS presents significant challenges due to its extremely high dimension, especially in sensing applications. This study introduces a novel approach to drastically reduce the computational complexity of the OTFS-based ISAC system. Firstly, we define four low-dimensional matrices, which are utilized in computing the channel matrix through simple algebraic manipulations. Secondly, we establish an analytic criterion independent of system parameters to identify the most informative elements in each of these matrices by leveraging the properties of the Dirichlet kernel. To gauge the impact of our approach, we derive the computational complexity of the distilled channel matrix in terms of the number of elementary operations required. Numerical results demonstrate that our proposed technique significantly reduces receiver complexity by up to three orders of magnitude without compromising sensing performance.
AMuCS: Affective Multimodal Counter-Strike video game dataset
Marios Fanourakis

and 1 more

January 26, 2024
Video games are a versatile and multi-faceted stimulus which can elicit complex player experiences. As a consequence, several datasets have been curated or created for studying human cognition, behaviours, and physiological responses where video games are the primary stimulus. Many of these datasets have a low number of participants or do not have a rich set of modalities, and they are always recorded in a laboratory setting. To address these issues, we have recorded 256 participants at LAN events while they played the first-person shooter Counter-Strike: Global Offensive. Our dataset consists of several complementary modalities: physiological signals (ECG, EDA, respiration), behavioural signals (facial expressions, eye tracking, depth images, seat pressure), computer interaction (keyboard and mouse events, game actions), and stimulus information (gameplay video, game logs). We show that the number of participants in our dataset and the variety of modalities recorded are advantageous for training machine learning models.
Framework for Mobile-Based Monitoring System of Smart Hydroponics Lettuce Farming
Ruben Balon

and 3 more

January 26, 2024
Hydroponic farming is popular worldwide because it can help feed people who suffer from hunger and who lack space or land suitable for planting. This study provides the materials and design for a smart hydroponic farming system that grows lettuce using IoT devices, sensors, and Node-RED. The different hardware components must be identified first, along with the features of the mobile devices connected to the IoT devices. The aim is to design an IoT-based system that constantly monitors the water level, temperature, and humidity of the hydroponic lettuce crop. To this end, the researchers describe the materials and design of the system, the methodology for its hardware, and a design thinking process for addressing complex problems and arriving at innovative solutions. As a result, the system can collect data from the different sensors. The sensor readings can be accessed through the Node-RED Dashboard, which is viewable on mobile devices. Additionally, the researchers suggest looking into solar energy as an alternative power source for the IoT devices to save energy.
Phase Difference During Propagation of a Wave at a Given Instance on its Path
Chandru Iyer

and 1 more

January 26, 2024
PTH-Net: Dynamic Facial Expression Recognition without Face Detection and Alignment
Min Li

and 4 more

January 26, 2024
Pyramid Temporal Hierarchy Network (PTH-Net) is a new paradigm for dynamic facial expression recognition, applied directly to raw videos without face detection and alignment (FDA). The traditional paradigm initially employs FDA to extract facial regions from raw videos before recognition. The advantage of this paradigm lies in minimizing the impact of complex backgrounds. However, it inadvertently neglects valuable information, such as body movements. Additionally, being bound to FDA sacrifices flexibility. In contrast, PTH-Net distinguishes background and target at the feature level, preserves more critical information, and is an end-to-end network that is more flexible. Specifically, PTH-Net utilizes a pre-trained backbone to extract multiple generic features of video understanding at various temporal frequencies, forming pyramid features. Subsequently, through temporal hierarchy refinement—achieved via differential sharing and downsampling—PTH-Net refines key information under the supervision of multiple receptive fields with the temporal-frequency invariance of expressions. In addition, to solve the problem of containing numerous irrelevant frames in videos, PTH-Net incorporates a Temporal Hierarchy Refinement layer to aggregate information at different temporal granularities, enhancing its ability to distinguish target and non-target expressions. Notably, PTH-Net achieves more comprehensive and in-depth understanding by merging knowledge from both forward and reverse video sequences. PTH-Net excels across six challenging benchmarks with lower computational costs in comparison to preceding methods. 
A Novel OTFS-Chirp Waveform for Low-Complexity Multi-User Joint Sensing and Communica...
Salah Eddine Zegrar

and 2 more

January 26, 2024
Joint sensing and communication (JSAC) has become increasingly popular in recent years due to spectrum scarcity, hardware limitations, power constraints, and the emergence of applications that require both communication and sensing capabilities. Nevertheless, the coexistence of both functionalities remains a challenge for current systems. Therefore, this paper proposes a novel orthogonal time frequency space (OTFS)-chirp waveform. The proposed waveform exploits the sparsity of linear chirps in the delay-Doppler domain to multiplex OTFS and chirp in an orthogonal manner. This enables high data rate communication and low-complexity accurate sensing simultaneously. Furthermore, we propose a multiuser JSAC scheme where multiple orthogonal chirps (OCs) are assigned to different users within the same OTFS-chirp waveform. In addition, a novel chirp-based channel estimation technique is proposed for OTFS systems. The proposed method leverages the sparsity of the chirp in both the delay-Doppler and Fresnel domains. This approach alleviates the pilot guard overhead and reduces the peak-to-average power ratio (PAPR) of the transmitted signal. The effectiveness of the proposed JSAC waveform is verified by numerical results that agree with the developed analysis and validate that the proposed waveform design can achieve accurate low-complexity radar parameter estimation while preserving high data rates.
A Survey on Detection, Classification, and Tracking of Aerial Threats using Radar and...
Wahab Ali Gulzar Khawaja

and 5 more

January 26, 2024
The use of unmanned aerial vehicles (UAVs) for a variety of commercial, civilian, and defense applications has increased manyfold in recent years. While UAVs are expected to transform future air operations, there are instances where they can be used for malicious purposes. In this context, the detection, classification, and tracking (DCT) of UAVs (DCT-U) for safety and surveillance of national air space is a challenging task when compared to DCT of manned aerial vehicles. In this survey, we discuss the threats and challenges from malicious UAVs and we subsequently study three radio frequency (RF)-based systems for DCT-U. These RF-based systems include radars, communication systems, and RF analyzers. Radar systems are further divided into conventional and modern radar systems, while communication systems can be used for joint communications and sensing (JC&S) in active mode and act as a source of illumination to passive radars for DCT-U. The limitations of the three RF-based systems are also provided. The survey briefly discusses non-RF systems for DCT-U and their limitations. Future directions based on the lessons learned are provided at the end of the survey.
Virtualized Radio Access Networks: Energy Models, Challenges and Opportunities
Sofia Martins

and 2 more

January 26, 2024
Radio Access Networks (RANs) are responsible for the majority of the energy consumption of cellular networks, and their energy consumption has been growing at an unsustainable rate. Virtualized RANs (vRANs) are an alternative to monolithic RANs, which can adapt to ever-evolving traffic patterns in an agile way. Additionally, virtualization allows multiple radio sites to share the same computing infrastructure, which can lower energy consumption. However, the transition to vRANs is a slow process and there are many open questions about how to best manage vRANs to meet the requirements of cellular operators. In particular, it is unclear whether vRANs can reduce energy consumption compared to monolithic RANs and under which conditions. Energy models are crucial to address this question and identify opportunities for reducing energy consumption in vRANs. In this paper, we present an overview of the energy-modelling approaches that can be applied to RANs and we review models of monolithic and virtualized RANs. Additionally, we characterize the energy consumption of RANs at the network, node, component, and functional levels. Lastly, we examine the techniques that can be used to improve the performance of vRANs and highlight the challenges, unanswered questions, and opportunities in this field, concerning energy consumption.
Fourier Series Characteristic Function Method of Analysis of Sinusoids Plus Noise in...
Cameron Pike

January 26, 2024
Rice’s characteristic function method of analysis for combinations of signals and noise in a memoryless nonlinear system is extended in this paper to greatly facilitate numerical methods of solution. This improvement is made by describing the nonlinearity as a Fourier series instead of a Fourier transform, as in the original method, thereby producing expressions for calculating the desired output correlation functions as summations rather than double or triple improper integrals that must be estimated. Even with modern computing tools, it has been found that estimating the values of these improper integrals can be problematic for SNR conditions of particular interest that motivate the use of the characteristic function method, producing unreliable results and prompting researchers to look elsewhere for a solution. The modified method presented in this paper makes this powerful analytical tool more accessible to researchers and engineers, particularly as new semiconductor devices requiring noise analysis are under development. The extended method retains the useful properties of the original, including applicability to one or more signals (sine waves) and to any noise distribution since its characteristic function always exists, particularly Gaussian or random-telegraph shot noise. An alternative fundamental formula for the characteristic function method is presented, expressed in terms of the Fourier coefficients and a discrete parameterization of the general characteristic function. New equations are presented, along with some numerical results for an example nonlinearity with a sinusoid plus Gaussian noise. The new procedure is concisely summarized for application to other problems.
Robust Multi-signal Estimation Framework with Applications to Inertial Sensor Stochas...

Chrysostomos Minaretzis

and 5 more

January 26, 2024
The stochastic calibration of low-cost and consumer-grade inertial sensors has recently become very important due to their widespread use in a multitude of mass-market applications such as smartphone and drone navigation. This is because, if an accurate stochastic model of the inertial sensor noise is obtained, the estimation quality of the navigation solution may improve significantly. Generally, the mainstream methods for stochastic calibration consider only a single signal, collected under static conditions, to infer that knowledge. However, it has been observed that even though the stochastic model structure that characterizes each (static) calibration signal remains the same, its parameter values vary from one replicate to another. Even though techniques have recently been proposed to address this in a statistically efficient way, a very important factor has been neglected, namely the influence of outliers on the estimation process. In this paper, a robust multi-signal framework for the stochastic modeling of inertial sensor errors is proposed, which contains two layers of robustness: one that reduces the influence of outliers in each observed signal (data corruption) and one that safeguards the estimation process from the collection of calibration signal replicates with notably different stochastic behaviour compared to the majority (sample contamination). Furthermore, two estimators are defined from this framework, with each encompassing either one or both layers of robustness, and their efficiency in different data contamination scenarios is assessed in a simulation setting. Finally, real data collected from a consumer-grade MEMS-based device are used within a navigation simulator to evaluate the relationship between the quality of the stochastic models obtained by the two robust estimators in different data collection scenarios and the navigation solution stability.
Bio-inspired Flow-Sensing Capacitive MEMS Microphone
Johar Pourghader

and 6 more

January 22, 2024
Inspired by the auditory systems of small animals such as spiders, the tachinid fly Ormia ochracea, and mosquitoes, a novel low-noise, flow-sensing capacitive MEMS microphone capable of sensing acoustic particle velocity is introduced. Unlike virtually all conventional microphones, which have a diaphragm for sensing sound pressure, this design consists of a thin, porous, movable structure that is intended to be driven by viscous forces due to the sound-induced flow. This viscous force rotates the movable structure around a central hinge. The resulting relative motion between neighboring beams on the movable and fixed electrodes changes the capacitance, which a charge amplifier circuit converts into an electronic signal. The whole structure, consisting of moving and fixed electrodes with a thickness of 5 µm, is made of one layer of silicon on a Silicon-on-Insulator (SOI) wafer using photolithography. The movable part has dimensions of 0.7 mm × 1.2 mm and is placed above a cavity inside the bulk silicon that facilitates the flow of sound particles. Because this microphone responds to flow (a vector) rather than pressure (a scalar), its output depends on the sound propagation direction and it can cancel out sound from unwanted directions. Experimental results show that it has very promising capabilities for achieving the next generation of MEMS microphones that respond to acoustic flow rather than pressure and can be integrated with other sound sensors for consumer electronics and hearing aids.
Multimodal Video Intelligence Framework
Mayur Akewar

March 12, 2024
Analyzing videos presents a unique challenge due to their rich content compared to images. Furthermore, processing lengthy videos efficiently necessitates segmenting them into scenes. Focusing on individual scene analysis offers an efficient alternative to analyzing entire videos. To facilitate this approach, images are extracted from these scenes, and each image is subjected to processing and analysis. In this study, we introduce a novel framework for video analysis employing state-of-the-art open-source models, including Contrastive Language–Image Pre-training (CLIP) for image classification, the Bidirectional and Auto-Regressive Transformers (BART) language model for text classification, Whisper for speech recognition, and the YAMNet audio model for audio classification. The proposed methodology starts with a video, from which audio features are used to extract scenes. A multimodal approach is then applied to these scenes, capturing audio features, text features obtained through Automatic Speech Recognition (ASR), and image classifications from CLIP. This fusion of multimodal features enhances the video intelligence process. The approach extends to a variety of video intelligence tasks, from surveillance applications to comprehensive video analytics. By capitalizing on open-source foundation models and leveraging audio and text features, our framework offers a versatile solution to the intricate task of video analysis, catering to a multitude of real-world applications. The code is published at https://github.com/akewarmayur/VideoIntelligence for future consideration and study.
Practical Online Condition Monitoring of DC-Link Capacitors in Modular Multilevel Con...

Mohsen Asoodar

and 3 more

January 22, 2024
This article presents a novel scheme for condition monitoring of dc-link capacitors in modular multilevel converters (MMCs). The proposed solution uses estimated capacitance values of the dc-link capacitors to indicate their state-of-health (SoH). Moreover, a comparative approach is proposed in which the estimated capacitances of all submodule capacitors are used to separate parameter drifts caused by aging from parameter drifts caused by other factors such as temperature change. Simulation and experimental results show that a short-term equal drift in all capacitance estimates can be a result of factors other than aging. However, a drift in the capacitance of one capacitor compared to the average capacitance of all submodules may be attributed to aging of that specific unit. With the proposed comparative technique, no additional temperature sensors are needed to account for the effect of temperature variations on the online estimates. Simulation and experimental results show an overall estimation error of less than 1% when applying the proposed comparative technique.
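The comparative idea in this abstract, judging each capacitor against the average of all submodules rather than against a fixed nominal value, can be sketched as below. This is an illustrative sketch only: the function names, the 3% threshold, the nominal value, and the assumption that aging shows up as a capacitance decrease are hypothetical and not taken from the article.

```python
def drift_vs_average(estimates, nominal):
    """Relative drift of each capacitor minus the mean drift of all submodules.

    A drift shared by every submodule (for example from temperature)
    cancels out, so only unit-specific drift remains.
    """
    rel = [(c - nominal) / nominal for c in estimates]
    mean_rel = sum(rel) / len(rel)
    return [r - mean_rel for r in rel]


def flag_aging(estimates, nominal, threshold=0.03):
    """Indices of capacitors that lost more than `threshold` (assumed value)
    relative to the submodule average."""
    deviations = drift_vs_average(estimates, nominal)
    return [i for i, d in enumerate(deviations) if d < -threshold]


# Hypothetical estimates in mF with a 4.0 mF nominal value.
uniform_drift = [3.92, 3.92, 3.92, 3.92]   # all units drift together
one_outlier = [3.92, 3.92, 3.92, 3.72]     # last unit drifts further

flagged_uniform = flag_aging(uniform_drift, nominal=4.0)
flagged_outlier = flag_aging(one_outlier, nominal=4.0)
```

In the uniform case nothing is flagged, which mirrors the abstract's point that an equal drift in all estimates need not indicate aging. In the second case only the last unit stands out.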
Pixel Tampering Detection in Encrypted Surveillance Videos on Resource-Constrained De...
Ifeoluwapo Aribilola

and 2 more

January 22, 2024
Encryption (naïve/selective) is recommended to secure recorded visual content; however, intruders can still manipulate encrypted data. Visually, tampering attacks on encrypted pixels in selectively encrypted videos are difficult to identify. Thus, this paper presents a tampering detection system that performs vulnerability analysis for Regions-of-Interest (ROI) in encrypted videos. To detect tampering attacks, we explored the pixels' intensities and proposed a new TampDetect algorithm. TampDetect applies the Hue, Saturation, and Value (HSV) colour model to detect the encrypted areas in the video. These encrypted pixels are then segmented and selected from the non-encrypted pixels using minimum and maximum HSV values and a global threshold. The mean intensity of the encrypted pixels is then calculated and stored. The integrity of the video frame is validated by comparing the stored mean intensity with the newly calculated mean intensity to detect tampering. The experiments were conducted on an Intel NUC, and the low computational cost demonstrates the lightweight nature of the proposed TampDetect algorithm for detecting tampering in ROI-encrypted videos. The dataset developed for the experiments, i.e., original, ROI-encrypted, and tampered videos (encrypted/decrypted), is made available in a Kaggle repository for future researchers.
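The store-and-compare check described in this abstract can be illustrated with a small standard-library sketch. It is not the TampDetect implementation: the HSV bounds, the use of the V channel as "intensity", the comparison tolerance `tol`, and the function names are assumptions, and the global-threshold segmentation step is omitted because the abstract does not detail it.

```python
import colorsys


def mean_intensity_in_range(frame, hsv_min, hsv_max):
    """Mean V value of the pixels whose HSV falls inside the given bounds.

    frame: iterable of (r, g, b) tuples with components in 0..255.
    hsv_min, hsv_max: (h, s, v) bounds in 0..1 (hypothetical values).
    Returns None when no pixel is selected.
    """
    selected = []
    for r, g, b in frame:
        hsv = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if all(lo <= x <= hi for x, lo, hi in zip(hsv, hsv_min, hsv_max)):
            selected.append(hsv[2])
    return sum(selected) / len(selected) if selected else None


def is_tampered(stored_mean, frame, hsv_min, hsv_max, tol=1e-3):
    """Compare the stored mean intensity with a freshly computed one."""
    current = mean_intensity_in_range(frame, hsv_min, hsv_max)
    if current is None or stored_mean is None:
        return current != stored_mean
    return abs(current - stored_mean) > tol


# Toy frames: highly saturated pixels stand in for the encrypted region.
bounds = ((0.0, 0.5, 0.0), (1.0, 1.0, 1.0))
original = [(255, 0, 0), (0, 255, 0), (128, 128, 128)]
modified = [(100, 0, 0), (0, 255, 0), (128, 128, 128)]

stored = mean_intensity_in_range(original, *bounds)
unchanged_result = is_tampered(stored, original, *bounds)
tampered_result = is_tampered(stored, modified, *bounds)
```

A single scalar per frame keeps the check cheap, which fits the resource-constrained setting in the title, although a mean alone cannot detect edits that leave the average intensity unchanged.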
Extrapolating ICESat-2 Derived Sea Ice Development to Sentinel-1 SAR
Karl Kortum

and 2 more

January 10, 2024
Sea ice property retrieval from synthetic aperture radar (SAR) at high resolution has one central problem: there are no comprehensive ground truth data available. One candidate instrument to offer such ground truth is the ICESat-2 altimeter, which has excellent vertical resolution. Due to ice motion, however, both instruments have to acquire data within a short time of each other for the measurements to still be correlated. Because the ICESat-2 altimeter uses a laser at optical wavelengths, cloud-free conditions are also needed. As a result, valuable coincident measurements are sparse. Deep learning methods, which have shown promising results for ice classification in the past, particularly suffer from sparse training data. In this paper we propose an architecture-agnostic transfer learning and autoencoding approach to overcome the problem of data sparsity. A major component of this approach is using local incidence angle dependencies as a proxy for ice classes. Thus, we can do the majority of the deep learning model training in a self-supervised manner on data without labels. Using this technique, altimetry-derived ice development can be meaningfully extrapolated to Sentinel-1 SAR acquisitions.
Feedback control in microwave photonic transversal filters based on Kerr microcombs
David J. Moss

January 10, 2024
Feedback control plays a crucial role in improving system accuracy and stability for a variety of scientific and engineering applications. Here, we theoretically and experimentally investigate the implementation of feedback control in microwave photonic (MWP) transversal filter systems based on optical microcomb sources, which offer advantages in achieving highly reconfigurable processing functions without requiring changes to hardware. We propose four different feedback control methods: (1) one-stage spectral power reshaping, (2) one-stage impulse response reshaping, (3) two-stage spectral power reshaping, and (4) two-stage synergic spectral power reshaping and impulse response reshaping. We experimentally implement these feedback control methods and compare their performance. The results show that feedback control can significantly improve not only the accuracy of comb line shaping, temporal signal processing, and spectral filtering, but also the system's long-term stability. Finally, we discuss the current limitations and future prospects for optimizing feedback control in microcomb-based MWP transversal filter systems implemented with both discrete components and integrated chips. Our results provide a comprehensive guide for the implementation of feedback control in microcomb-based MWP filter systems in order to improve their performance for practical applications.
Towards an Unsupervised Federated Learning Approach for Speaker Diarization on IoT-st...
Amit Kumar Bhuyan

and 2 more

January 10, 2024
This paper presents a computationally efficient and distributed speaker diarization framework suitable for a network of IoT-style audio devices. This work proposes a Federated Learning based speaker identification model which can identify the participants in a conversation without requiring a large audio database for training. An unsupervised online update mechanism is proposed for the Federated Learning model which depends on the cosine similarity of speaker embeddings. Moreover, the proposed diarization system solves the problem of speaker change detection via unsupervised segmentation techniques using Hotelling's t-squared statistic and the Bayesian Information Criterion. In this new approach, speaker change detection is biased around detected quasi-silences, which reduces the severity of the tradeoff between the missed detection and false detection rates. Additionally, the computational overhead due to frame-by-frame identification of speakers is reduced via unsupervised clustering of speech segments. The results show the effectiveness of the proposed training method in the presence of non-IID speech data. They also show a considerable reduction in false and missed detections at the segmentation stage while reducing the computational overhead. The improved accuracy and reduced computational cost of the proposed mechanism make it appropriate for real-time speaker diarization, especially for a distributed IoT network.
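A decision driven by the cosine similarity of speaker embeddings, as mentioned in this abstract, might look like the sketch below. The abstract does not give the actual update rule, so the enrolment logic, the 0.8 threshold, and the two-dimensional toy embeddings are assumptions made purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_speaker(embedding, centroids, threshold=0.8):
    """Return the index of the best-matching known speaker.

    If no stored centroid reaches `threshold` (assumed value), the
    embedding is enrolled as a new speaker and its new index is returned.
    """
    best, best_sim = None, threshold
    for i, centroid in enumerate(centroids):
        sim = cosine_similarity(embedding, centroid)
        if sim >= best_sim:
            best, best_sim = i, sim
    if best is None:
        centroids.append(list(embedding))
        return len(centroids) - 1
    return best


# Toy embeddings (hypothetical); real speaker embeddings are much longer.
centroids = []
first = assign_speaker([1.0, 0.0], centroids)    # enrols speaker 0
second = assign_speaker([0.9, 0.1], centroids)   # close to speaker 0
third = assign_speaker([0.0, 1.0], centroids)    # enrols speaker 1
```

Because the rule needs no labels, it can run online on each device, which is the property the abstract emphasises for IoT-style deployments.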
Computation Overhead Optimization Strategy and Implementation for Dual-Domain Sparse-...
Zihan Deng

and 4 more

January 10, 2024
Sparse-view computed tomography (CT) significantly reduces radiation doses to the human body, but its analytical reconstruction exhibits severe streak artifacts. Recently, deep learning methods have shown promising results in CT reconstruction. The Dual-Domain (DuDo) deep learning method is one of the representative methods, and it can process information in both the sinogram and image domains. However, existing DuDo methods do not pay enough attention to the allocation of training costs and strategies between the two domains. In this paper, we propose a Computation-Overhead Optimization (COO) DuDo training strategy for sparse-view CT reconstruction, i.e., COO-DuDo. The training ratio of the different domains is controlled by calculating their computation overhead, loss, and gradient variation of the loss. To let the COO-DuDo strategy better support sparse-view CT reconstruction, we adopt a DuDo-Network (COO-DDNet) structure based on two coding-decoding-type subnetworks. As specific contributions, we design a Multilevel Cross-domain Connection (MCC) method to connect the decoding layers of the same scale in the two subnetworks and adopt the two-channel method for upsampling, which enables more fine-grained control of model updates and suppresses checkerboard artifacts. The evaluation results validate the effectiveness of our training strategy and methods. The peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) of the reconstruction results increased by 38.8% and 0.37%, respectively, and the model convergence time decreased by 11.8%. Our research provides a broader perspective on dual-domain image restoration tasks from the standpoint of computational overhead.
Federated Learning Under Attack: Exposing Vulnerabilities through Data Poisoning Atta...
Ehsan Nowroozi

and 3 more

January 10, 2024
Federated Learning (FL) is a machine learning (ML) approach that enables multiple decentralized devices or edge servers to collaboratively train a shared model without exchanging raw data. During the training and sharing of model updates between clients and servers, data and models are susceptible to different data-poisoning attacks. In this study, our motivation is to explore the severity of data-poisoning attacks in the computer network domain because they are easy to implement but difficult to detect. We considered two types of data-poisoning attacks, label flipping (LF) and feature poisoning (FP), and applied them with a novel approach. In LF, we randomly flipped the labels of benign data and trained the model on the manipulated data. For FP, we randomly manipulated the highly contributing features determined using the Random Forest algorithm. The datasets used in this experiment, CIC [1] and UNSW [2], relate to computer networks. We generated adversarial samples using the two attacks mentioned above, which were applied to a small percentage of the datasets. Subsequently, we trained and tested the accuracy of the model on the adversarial datasets. We recorded the results for both benign and manipulated datasets and observed significant differences between the accuracy of the models on different datasets. From the experimental results, it is evident that the LF attack failed, whereas the FP attack showed effective results, which proved its significance in fooling a server. With a 1% LF attack on the CIC dataset, the accuracy was approximately 0.0428 and the ASR was 0.9564, so the attack is easily detectable; with a 1% FP attack, the accuracy and ASR were both approximately 0.9600, so FP attacks are difficult to detect. We repeated the experiment with different poisoning percentages.
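The label-flipping step described above, randomly flipping the labels of a small percentage of samples, can be sketched in a few lines. This is a generic illustration rather than the authors' procedure: binary 0/1 labels, the fixed random seed, and the function name are assumptions, and the feature-poisoning attack is not shown because it depends on a trained Random Forest.

```python
import random


def flip_labels(labels, fraction, seed=0):
    """Return a poisoned copy of binary labels plus the flipped indices.

    labels: list of 0/1 labels (binary labels are an assumption here).
    fraction: share of samples to flip, e.g. 0.01 for a 1% attack.
    """
    rng = random.Random(seed)
    n_flip = round(len(labels) * fraction)
    flipped = rng.sample(range(len(labels)), n_flip)
    poisoned = list(labels)
    for i in flipped:
        poisoned[i] = 1 - poisoned[i]
    return poisoned, sorted(flipped)


# 100 benign samples (label 0) with a 1% label-flipping attack.
clean = [0] * 100
poisoned, flipped = flip_labels(clean, fraction=0.01)
```

Returning the flipped indices makes it easy to measure afterwards how the model behaves on exactly the manipulated samples.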
Localization of Wideband and Non-Stationary Sources Using Non-Uniform Planar Arrays i...
Shekhar Kumar Yadav

and 1 more

January 10, 2024
In this work, we tackle the estimation of the elevation and azimuth directions of arrival (DOA) of uncorrelated wideband (WB) and statistically non-stationary (NS) sources located in the far-field of a sensor array in both free-field and multipath propagation environments. For the case of free-field propagation, we derive the closed-form expression of the Cramér-Rao Bound (CRB) that is applicable to both the overdetermined and underdetermined cases of DOA estimation. We also establish the condition for the existence of the CRB, along with a discussion of the dependence of the CRB on the signal-to-noise ratio and the upper bound on the total number of WB and NS sources that can be resolved theoretically. Further, we develop a coarray-based DOA estimation algorithm that uses the enhanced degrees of freedom (DOF) of the difference coarray of a non-uniform planar array to localize more uncorrelated WB and NS sources than the number of physical sensors. In the multipath scenario, the correlation between the direct signal and the reflected signal is tackled by proposing a new focusing method based on the decomposition of the array steering function using 2D Fourier basis functions. The proposed focusing approach can be applied to any real-world arbitrary planar array and does not require an initial estimate of the DOAs of the WB and NS sources. After focusing, frequency smoothing is performed to decorrelate the direct and reflected signals. After decorrelation, we propose two coarray-based techniques to identify the time-frequency bins that are dominated by the direct signals, which are then used to estimate the DOA of the sources emitting the direct signals. Numerical examples verify the effectiveness of the proposed methods.
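To illustrate why a difference coarray can offer more degrees of freedom than there are physical sensors, the sketch below enumerates the pairwise position differences (lags) of a small planar array. The sensor layout and the name `difference_coarray` are hypothetical and are not taken from the paper. The number of sources an estimator can actually resolve depends on the structure of the coarray and on the algorithm, so the lag count is only a rough indicator.

```python
def difference_coarray(sensors):
    """Set of all pairwise position differences (virtual sensor lags)."""
    return {(xa - xb, ya - yb) for (xa, ya) in sensors for (xb, yb) in sensors}

# Hypothetical 4-sensor non-uniform planar array, positions in grid units.
sensors = [(0, 0), (1, 0), (0, 1), (3, 2)]
lags = difference_coarray(sensors)

# The coarray contains more distinct lags than there are physical sensors.
print(len(sensors), "physical sensors,", len(lags), "distinct coarray lags")
```

The lag set is symmetric about the origin, because every difference appears together with its negative.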
Protecting Copyright Ownership via Identification of Remastered Music in Radio Broadc...
Pasindu Marasinghe

and 3 more

January 08, 2024
In radio broadcasting, monitoring is crucial for protecting musical work copyrights and ensuring the fair distribution of royalties. Manual monitoring is time-consuming and unreliable, which necessitates an automated approach. The challenges in automated monitoring arise mainly from the practice of broadcast stations remastering songs before airing. These alterations complicate the identification process for existing music identification techniques. This paper tackles this challenge by exploring the feasibility of applying computer vision techniques to STFT spectrograms from ongoing audio streams. The objective is to identify similar spectrogram representations by comparing them with previously registered key features extracted from STFT spectrograms generated for the original song tracks. This aims to reveal the identity of the content being broadcast on radio and, consequently, to safeguard the rightful ownership of copyrighted songs. The proposed approach achieved an accuracy of over 97% for tempo alterations up to 20% and over 95% for pitch alterations up to 20%.
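As background for the STFT spectrograms mentioned in the abstract above, here is a minimal sketch of a magnitude spectrogram computed with a Hann window and a direct DFT. It is not the authors' pipeline: the function name `stft_magnitude` and the values of `frame_len` and `hop` are assumptions, and a practical system would use an FFT library rather than this deliberately simple O(N²) loop.

```python
import cmath
import math

def stft_magnitude(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one list of frame_len // 2 + 1 bins per frame."""
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

# Hypothetical test tone: a sinusoid centred exactly on DFT bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = stft_magnitude(tone)
```

Each row of `spec` is one time frame. Stacking the rows gives the time-frequency image on which image-matching techniques can operate.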
Band-limited functions, and Time-Frequency Analysis of Non-Stationary Signals: Applic...
Moisés Soto-Bajo
Javier

and 2 more

January 08, 2024
Many methods for signal analysis are based on the framework of time-frequency analysis, and some of them are supposed to be able to deal with non-stationary signals, such as the Hilbert-Huang transform, which is based on Hilbert Spectrum Analysis (HSA). HSA is a classical methodology that assigns an amplitude-phase representation, and hence an instantaneous energy and frequency, to a signal; these quantities are supposed to be meaningful when applied to some specific signals. However, the resulting methods do not always produce consistent or satisfactory frequency analyses. In this work, we propose a new method based on HSA for time-frequency analysis, which is easy to implement, computationally cheap, and highly customizable to adapt the results to specific requirements on the frequency behaviour of the signal. To do that, some preliminary theoretical analyses are performed, which are of independent interest. We investigate the time-frequency localization problem of the Fourier transform, defining the support of measurable functions and characterizing the spaces of functions with support contained in a specific set. This leads us to consider the Hardy-type spaces, composed of square-summable functions whose Fourier transform has support contained in a specific set. On the other hand, we approximate some time-frequency components of a signal, linked to specific time windows and frequency ranges, with generalized trigonometric polynomials whose frequencies lie in these specific ranges. This leads us to introduce the Tremolo-Vibrato spaces, intimately linked to the Hardy-type spaces, and to study the Hilbert spectra of generalized trigonometric polynomials, which can be computed directly by using explicit general expressions. Finally, we propose a simple method for quantifying the time-varying relative importance of each component in the frequency decomposition. This method is applied to the analysis of medical signals, such as EEG and ECG.
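For context, the amplitude-phase representation that HSA assigns to a real signal is usually written via the analytic signal. The notation below ($x$, $\mathcal{H}$, $a$, $\theta$, $\omega$) is the standard textbook form and is an assumption here, since the abstract does not fix its own symbols.

```latex
\begin{aligned}
z(t) &= x(t) + i\,\mathcal{H}[x](t) = a(t)\,e^{i\theta(t)}, \\
a(t) &= |z(t)|, \qquad \theta(t) = \arg z(t), \\
\omega(t) &= \frac{d\theta}{dt}(t),
\end{aligned}
```

Here $\mathcal{H}$ denotes the Hilbert transform, $a(t)^2$ plays the role of the instantaneous energy, and $\omega(t)$ is the instantaneous (angular) frequency.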
Near-Field Source Localization and Beamforming in the Spherical Sector Harmonics Doma...
Shekhar Kumar Yadav

and 2 more

January 08, 2024
Three-dimensional arrays can localize sources anywhere in the spatial domain without ambiguity. Among these arrays, the spherical microphone array (SMA) has gained widespread usage in acoustic source localization and beamforming. However, SMAs are bulky, and in many applications with space and power constraints it is undesirable to use an SMA. To deal with this issue, arrays with microphones placed only in a sector of a sphere have been developed, along with various techniques for localizing far-field sources in the spherical sector harmonics (S2H) domain. This work addresses near-field acoustic localization and beamforming using a spherical sector microphone array. We present the representation of spherical waves from a point source in the S2H domain using the orthonormal S2H basis functions. Then, using this representation, we develop an S2H-domain array model for a spherical sector array placed in a wavefield created by multiple near-field sources. Using the developed array model, two algorithms are proposed for the joint estimation of the range, elevation and azimuth of near-field sources, namely NF-S2H-MUSIC and NF-S2H-MVDR. Further, a near-field beamforming algorithm capable of radial and angular filtering in the S2H domain is also presented. Finally, we present the Cramér-Rao Bound (CRB) in the S2H domain for near-field sources. The performance of the proposed algorithms is assessed using extensive localization and beamforming simulations.