AUTHOREA

3184 computing and processing Preprints

Related keywords
computing and processing cross attention Low-Resource Languages imbalanced recurrent network architectures concept drift dermoscopy Personal Companion Electrical Impedance Spectroscopy text generation bert bioengineering Crowd authoring neural networks deep learning applications education Jointly typical decoding Facial feature extraction next generation reservoir computing Question Answering Systems quantization process Cancer Detection apache hadoop wavelet transform llm
Up/Down semantic mutual information computational intelligence transformers embedding vectors parallel processing climate generative artificial intelligence visual memory network security quran Synonymous typical set Synonymous mapping crowd simulation Cross Domain Face Recognition mixed reality streaming algorithms language models image restoration Semantically jointly typical set Fair Scheduler machine learning Electroneurogram signal (ENG) contrastive learning data mining linearity motor commands decoding Retrieval Augmented Generation non-rigid interaction unstructured grid data center Semantic relative entropy Turbocharging force and tactile sensing internet server human-robot interaction optimization Multi-Job Workloads generative ai for educational metaverse bidirectional neural prosthesis tai for educational metaverse visuotactile sensors text classification large language model Semantic rate distortion function Semantic entropy domain decomposition Thermal to Visual Face Recognition speech synthesis dependability text-to-speech Semantic distortion anomaly detection Semantic source channel coding Semantically typical set regression openacc, energy demand forecasting User control intention inference explainable artificial intelligence (xai) neuromorphic computing appearance uncertainties Turkic languages Cole Relaxation Frequency components, circuits, devices and systems geoscience signal processing and analysis Job Dependency Synonymous length floating-point mental health Maximum likelihood group decoding human-machine interaction location shallow water flows Gaze estimation mpi face recognition chatbots blind super resolution Interactive simulation ai for education machine learning (ml) deep learning depression Non-Small Cell Lung Carcinoma communication, networking and broadcast technologies Controllable diffusion language models neural network image classification Semantic channel capacity soft errors generative ai general topics for engineers Jointly typical encoding Interactive tool facial expression recognition trustworthy ai streaming data reservoir computing lstm power, energy and industry applications explainable human-computer interaction computer vision regularization network intrusion detection system (nids) data-driven methods phq-9 Dynamic Job Grouping
Please note: These are preprints and have not been peer reviewed. Data may be preliminary.
Wavelet Transform Analysis of Bio-impedance Spectroscopy for Accurate Cancer Detectio...

Onur Fidaner

and 1 more

February 20, 2024
Objective: In this paper, we propose a novel use of the wavelet transform in analyzing spectral impedance to discriminate cancerous and non-cancerous lung tissue. Methods: Cancerous and non-cancerous ex-vivo tissue samples were obtained during a human clinical trial of patients undergoing a lobectomy. Spectral impedance measurements of the tissue samples were performed using a custom impedance bridge and measurement system. Three methods are described that use the wavelet transform of the spectral impedances and compare the results to pathological assessments of the lesion. Results: In a cohort of n=46 patients, for which both tumor and away-from-tumor samples were collected, we report a sensitivity of 87% and a specificity of 84%. Conclusion: Three techniques have been proposed as an algorithm basis to improve the sensitivity and specificity of discriminating between cancerous and non-cancerous tissue. Significance: Lung cancer causes 20% of all cancer-related deaths globally and is the most diagnosed cancer worldwide. Utilizing the proposed methods to analyze tissue samples and report on cancer state in the surgical suite offers clinicians an additional tool to better diagnose and subsequently treat this deadly disease.
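As a rough illustration of the kind of pipeline the abstract describes, the sketch below computes a discrete wavelet decomposition of a measured impedance-magnitude spectrum and reduces it to band-energy features; the wavelet family, decomposition level, synthetic spectrum, and decision rule are assumptions for illustration, not the authors' algorithm.

```python
# Illustrative sketch (not the authors' method): wavelet-domain features from an
# impedance magnitude spectrum, followed by a hypothetical threshold-style score.
import numpy as np
import pywt

def wavelet_features(impedance_magnitude, wavelet="db4", level=4):
    """Discrete wavelet decomposition of |Z(f)| sampled over frequency."""
    coeffs = pywt.wavedec(impedance_magnitude, wavelet, level=level)
    # Energy of each detail band serves as a compact spectral descriptor.
    return np.array([np.sum(c ** 2) for c in coeffs[1:]])

# Toy spectrum: 256 log-spaced frequency points of a single-dispersion response.
rng = np.random.default_rng(0)
freq = np.logspace(2, 6, 256)
spectrum = np.abs(1.0 / (1.0 + 1j * freq * 1e-5)) + 0.01 * rng.standard_normal(256)

features = wavelet_features(spectrum)
# Hypothetical decision quantity: relative energy of the coarsest detail band.
score = features[0] / (features.sum() + 1e-12)
print("band-energy ratio:", score)
```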
Turbocharging Hadoop Fair Scheduler using Dynamic Job Grouping in Multi-Job Workloads
Hikmat Dhamee

and 1 more

February 14, 2024
This study addresses the efficiency challenges in the Hadoop fair scheduler when handling interdependent multi-job workloads in a single-tenant Hadoop environment. Hadoop has become the leading choice for large-scale data processing by leveraging MapReduce parallelism. However, conventional fair scheduling struggles with interdependent multi-job workloads, where the dependency takes the form of one job consuming the output of another job as its input, leading to underutilized cluster resources and extended job waiting times. To remedy this, the study introduces a dynamic job grouping approach that optimizes fair scheduler performance and reduces the overall workload completion time. In a simulated healthcare workload with 35 interdependent jobs, the approach demonstrates a 35% to 40% reduction in average job completion time, showcasing adaptability in diverse scenarios. Recommendations include integrating dynamic job grouping into the fair scheduler, considering an optimal parallelism factor, and exploring dynamic job submission mechanisms in the Hadoop ecosystem with diverse datasets. As organizations increasingly adopt Hadoop for multi-job workloads, the proposed approach provides a valuable enhancement for dynamic and adaptive job submission, improving overall efficiency.
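The core idea of grouping interdependent jobs before submission can be sketched as follows: jobs are bucketed by their dependency depth, so each group contains only mutually independent jobs and groups are submitted in order. The job names and the depth-based grouping rule are illustrative, not the paper's exact mechanism.

```python
# Sketch of dependency-aware job grouping (illustrative, not the paper's algorithm).
from collections import defaultdict

def group_by_dependency_level(jobs, deps):
    """jobs: iterable of job ids; deps: dict mapping a job to the jobs it depends on."""
    level = {}
    def depth(j):
        if j not in level:
            level[j] = 0 if not deps.get(j) else 1 + max(depth(d) for d in deps[j])
        return level[j]
    groups = defaultdict(list)
    for j in jobs:
        groups[depth(j)].append(j)
    # Jobs at the same depth cannot depend on each other, so each group can be
    # submitted together to the fair scheduler, in increasing depth order.
    return [groups[k] for k in sorted(groups)]

deps = {"clean": set(), "aggregate": {"clean"}, "report": {"aggregate"}, "audit": {"clean"}}
print(group_by_dependency_level(deps.keys(), deps))
# [['clean'], ['aggregate', 'audit'], ['report']]
```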
Visual Memory Transfer for Imbalanced Medical Image Classification

Ke Yao

and 1 more

February 12, 2024
In deep learning on medical image data, skin lesion classification remains a challenging problem due to the imbalanced distribution of training data. We propose a visual memory transfer (VMT) method that transfers visual knowledge from majority classes to minority classes. As a result, our method enriches the features of minority classes with pre-calculated memory features. In addition, VMT defines a refined feature map to perform fine-grained classification. Our classification results outperform SOTA methods on the largest publicly available dermoscopic image dataset in terms of averaged F-score and Top-1 classification accuracy.
Hybrid-Segmentor: A Hybrid Approach to Automated Damage Detection on Civil Infrastruc...
June Moh Goo

and 3 more

February 12, 2024
Detecting and segmenting cracks in infrastructure, such as roads and buildings, is crucial for safety and cost-effective maintenance. In spite of the potential of deep learning, there are challenges in achieving precise results and handling diverse crack types. With the proposed dataset and model, we aim to enhance crack detection and infrastructure maintenance. This study proposes a novel approach termed Hybrid-Segmentor, which uses a convolutional neural network path that is well suited for extracting fine-grained local features and a transformer path that extracts global features and benefits from understanding the overall structure. This hybrid method makes the model more generalizable to various shapes, surfaces, and sizes of cracks. To achieve a balanced computational cost, the study incorporates efficient self-attention in the transformer path and introduces a decoder that is considerably simpler than the two encoder paths. This combination strategically optimizes the extraction of global and local features while maintaining computational efficiency. The model was trained using a combined binary cross-entropy and Dice loss function on a large refined dataset of 12,000 crack images generated from 13 publicly available datasets. Our studies demonstrate that the model efficiently utilizes convolutional layers and transformers to extract local and global features. Hybrid-Segmentor outperforms existing benchmark models across 5 quantitative metrics (accuracy 0.971, precision 0.804, recall 0.744, F1-score 0.770, and IoU score 0.630), achieving state-of-the-art status. Finally, through careful qualitative analysis, we show that the model is capable of addressing discontinuities, detecting small non-crack regions, handling low-quality images, and detecting crack contours more accurately than existing models.
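The combined binary cross-entropy and Dice objective mentioned in the abstract is a standard construction; a minimal PyTorch sketch is given below, with the 0.5/0.5 weighting and the smoothing constant chosen as assumptions rather than the paper's settings.

```python
# Minimal sketch of a combined BCE + Dice segmentation loss.
import torch
import torch.nn.functional as F

def bce_dice_loss(logits, targets, bce_weight=0.5, eps=1e-6):
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    intersection = (probs * targets).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + targets.sum(dim=(1, 2, 3))
    dice = 1.0 - (2.0 * intersection + eps) / (union + eps)
    return bce_weight * bce + (1.0 - bce_weight) * dice.mean()

logits = torch.randn(2, 1, 64, 64)                   # raw model outputs
targets = (torch.rand(2, 1, 64, 64) > 0.9).float()   # binary crack masks
print(bce_dice_loss(logits, targets))
```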
A Review of the Factors Impacting the Optimal Placement of Data Centers
Viraj Nain

February 12, 2024
Our current digital landscape relies heavily on existing data centers to store and process information for online applications accessed by millions of users. Any failure of these operations significantly affects an organization's productivity and critical operations. Additionally, data centers are high energy consumers. Climate change and the scarcity of energy resources due to political or local constraints emphasize the need to deeply analyze the strategic placement of data centers and to understand the factors that contribute to risks and opportunities in data center placement.
IllusionX: An LLM-powered mixed reality personal companion
Ramez Yousri

and 5 more

February 12, 2024
Mixed Reality (MR) and Artificial Intelligence (AI) are increasingly becoming integral parts of our daily lives. Their applications range over fields from healthcare to education to entertainment. MR has opened a new frontier for such fields as well as new methods of enhancing user engagement. In this paper, we propose a new system, one that combines the power of Large Language Models (LLMs) and mixed reality (MR) to provide a personalized companion for educational purposes. We present an overview of its structure and components, as well as tests to measure its performance. We found that our system is better at generating coherent information; however, it is rather limited by the documents provided to it. This interdisciplinary approach aims to provide a better user experience and enhance user engagement. The user can interact with the system through a custom-designed smartwatch, smart glasses, and a mobile app.
OneTip Is Enough: A Non-Rigid Input Device for Single-Fingertip Human-Computer Intera...
Mingxuan Li
Yen Hang Zhou

and 4 more

February 27, 2024
With the development of novel display technologies such as 3-D graphical applications, virtual and augmented reality, there are new demands for more efficient and natural multi-degree-of-freedom (DOF) interaction devices. However, traditional rigid touchscreens provide only the 2-D position of the touch point as input information, which cannot meet the needs of high-DOF interactions, and many of the existing interaction technologies suffer from the problems of low resolution and lack of information. At the same time, non-rigid interaction interfaces provide deformability for input and have been shown to extend the richness of the input vocabulary. This article proposes a new non-rigid input device, OneTip, for single-fingertip human-computer interaction with 6-DOF. In terms of design and manufacture, OneTip employs the bio-inspired design of Skin-On interfaces to mimic the sensitivity of human skin and provide 6-DOF interaction capabilities. In terms of sensing, OneTip uses a visuotactile sensing technique based on the marker displacement method to achieve high-resolution and multi-modal measurements. We propose a novel fingertip pose estimation method based on incipient slip detection, a non-learning algorithm that does not require registration or a priori information. Experiments show that OneTip has good 6-D pose estimation accuracy, with RMSEs of translation and rotation not exceeding 0.1 mm and 2.6°, respectively, within the linear interval. Extensive experiments were also conducted to explore the application of OneTip in typical virtual manipulation tasks and the possibility of combining it with other interaction devices. This work is intended to serve as a reference for other researchers exploring innovative interaction techniques.
A Survey on Thermal to Visual Cross Domain Face Recognition
Yaswanth Gavini

and 3 more

February 12, 2024
Thermal to visual cross-domain face recognition is the process of recognising faces captured in the thermal domain and matching them with visual domain images. Its significance lies in enabling accurate face recognition across different domains, allowing for enhanced security and surveillance capabilities in various environments, particularly during night-time or adverse weather conditions where thermal imaging excels. By bridging the gap between the thermal and visual domains, cross-domain face recognition improves the overall performance and applicability of face recognition systems in real-world scenarios. This survey explores the existing literature on thermal-to-visual cross-domain face recognition, categorising it into two distinct approaches: the first based on the learning model and the second on generalisation approaches. This survey aims to provide a nuanced understanding of the current landscape, methodologies, and challenges in this evolving field by systematically reviewing and categorising the extensive body of research in thermal-to-visual cross-domain face recognition.
Don't Just Explain, Enhance! Using Explainable Artificial Intelligence (XAI) to Autom...
Pieter Barnard

and 2 more

February 12, 2024
Explainable artificial intelligence (XAI) methodologies can demystify the behavior of machine learning (ML) "black-box" models based on the individual impact each feature has on the model's output. In the cybersecurity domain, explanations of this type have been studied from the perspective of a human-in-the-loop, where they serve an essential role in building trust from stakeholders, as well as aiding practitioners in tasks such as feature selection and model debugging. However, another important but largely overlooked use case of explanations emerges when they are passed as inputs into other ML models. In this sense, the rich information encompassed by the explanations can be harnessed at a fine-grained level and used to subsequently enhance the performance of the system. In this work, we outline a general methodology whereby explanations of a front-end network intrusion detection system (NIDS) are leveraged alongside additional ML models to automatically improve the system's overall performance. We demonstrate the robustness of our methodology by evaluating its performance across multiple intrusion datasets and perform an in-depth analysis to assess its generalizability under various conditions, such as unseen environments and varying types of front-end NIDS and XAI methodologies. The overall results indicate the efficacy of our methodology in producing significant improvements (up to +36% gains across the considered metrics and datasets) that exceed those achieved by other state-of-the-art intrusion models from the literature.
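A generic sketch of the "explanations as extra inputs" idea follows: per-sample feature attributions from a front-end classifier are concatenated with the raw features and passed to a second model. The attribution method (coefficient times feature value) and both models are simple stand-ins for a real NIDS and XAI method, not the paper's setup.

```python
# Explanations-as-features sketch on synthetic data (illustrative stand-ins only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

front_end = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# Per-sample attributions: contribution of each feature to the linear score.
attr_tr = X_tr * front_end.coef_
attr_te = X_te * front_end.coef_

booster = GradientBoostingClassifier().fit(np.hstack([X_tr, attr_tr]), y_tr)
print("front-end acc:", front_end.score(X_te, y_te))
print("explanation-augmented acc:", booster.score(np.hstack([X_te, attr_te]), y_te))
```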
SLYKLatent, a Learning Framework for efficient Facial Features Estimation
Samuel Adebayo

and 2 more

February 12, 2024
In this research, we present SLYKLatent, a novel approach for enhancing gaze estimation by addressing appearance instability challenges in datasets due to aleatoric uncertainties, covariant shifts, and test domain generalization. SLYKLatent utilizes self-supervised learning for initial training with facial expression datasets, followed by refinement with a patch-based tri-branch network and an inverse explained variance weighted training loss function. Our evaluation on benchmark datasets achieves an 8.7% improvement on Gaze360, rivals top MPIIFaceGaze results, and leads on a subset of ETH-XGaze by 13%, surpassing existing methods by significant margins. Additionally, adaptability tests on RAF-DB and AffectNet show 86.4% and 60.9% accuracies, respectively. Ablation studies confirm the effectiveness of SLYKLatent's novel components. This approach has strong potential in human-robot interaction.
Trustworthy AI for Educational Metaverses

Faseela Abdullakutty

and 2 more

February 12, 2024
The metaverse, a 3D virtual universe, is expected to significantly impact the education sector by making learning more accessible, personalized, and fun. The advancements in AI, blockchain, extended reality, big data, and cloud computing are the key enablers for the development of educational metaverses. The recent disruptions in AI, particularly generative AI (GenAI), have transformed educational practices by generating human-like text, automating conversations, providing personalized learning experiences, and supporting students with disabilities. AI advancements together with immersive technologies hold immense potential to transform conventional education and learning by providing an interactive and immersive platform for seamless learning experiences. As GenAI advances, it is expected to generate more accurate and higher-quality content, with future applications in the educational metaverse enhancing trustworthiness. This article contributes background research on AI in education, a detailed study of the educational metaverse, a critical discussion of proactive measures to achieve trustworthy AI (TAI), and open research issues for TAI in the educational metaverse context.
Revisiting Streaming Anomaly Detection: Benchmark and Evaluation
Yang Cao

and 3 more

February 11, 2024
Anomaly detection in streaming data is a crucial task for many real-world applications, such as network security, fraud detection, and system monitoring. However, streaming data often exhibit concept drift, which means that the data distribution changes over time. This poses a significant challenge for anomaly detection algorithms, as they need to adapt to the evolving data to maintain high detection accuracy. Existing streaming anomaly detection algorithms lack a unified evaluation framework that can assess their performance and robustness under different types of concept drift and anomalies. In this paper, we conduct a systematic technical review of the state-of-the-art methods for anomaly detection in streaming data. We propose a new data generator, called SCAR (Streaming data generator with Customizable Anomalies and concept dRifts), that can synthesize streaming data based on synthetic and real-world datasets from different domains. Furthermore, we adapt four static anomaly detection models to the streaming setting using a generic reconstruction strategy as baselines, and then compare them systematically with 9 existing streaming anomaly detection algorithms on 76 synthesized datasets that exhibit various types of anomalies and concept drifts. The challenges and future research directions for anomaly detection in streaming data are also presented. All the code and datasets are publicly available at https://github.com/yixiaoma666/SCAR.
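For intuition, the sketch below implements a generic reconstruction-style streaming baseline of the kind the abstract alludes to: a PCA model is periodically refit on a sliding window, and each arriving point is scored by its reconstruction error. The window size, refit period, and synthetic stream are assumptions; this is not SCAR or any of the paper's baselines.

```python
# Sliding-window reconstruction baseline for streaming anomaly scoring (sketch).
import numpy as np
from sklearn.decomposition import PCA

def stream_scores(stream, window=256, refit_every=64, n_components=3):
    buffer, scores, model = [], [], None
    for t, x in enumerate(stream):
        buffer.append(x)
        buffer = buffer[-window:]                      # keep the most recent points
        if model is None or t % refit_every == 0:
            if len(buffer) > n_components:
                model = PCA(n_components=n_components).fit(np.asarray(buffer))
        if model is None:
            scores.append(0.0)
        else:
            rec = model.inverse_transform(model.transform(x.reshape(1, -1)))
            scores.append(float(np.linalg.norm(x - rec)))  # reconstruction error
    return np.asarray(scores)

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 8))
data[500] += 8.0  # injected point anomaly
print(np.argmax(stream_scores(data)))  # the injected anomaly (index 500) should score highest
```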
On BiasWrappers: New Regularization Techniques for Machine Learning Regression
Karthik Singaravadivelan

February 12, 2024
Regressive models in machine learning require regularization to balance the bias-variance tradeoff and attain realistic predictions on real-world data. Two new regularization techniques, referred to as BiasWrappers, are discussed in this paper: BiasWrapperC1 and BiasWrapperC2. BiasWrapperC1 uses a form of penalization to prevent models from consistently overshooting or undershooting. BiasWrapperC2 uses a modified layer of regression stacking to identify correlations in a regression model's error. The logic of each technique is discussed through pseudocode in the context of machine learning regression. The regularization techniques are applied to machine learning models and compared with other regularization techniques on a series of carefully chosen datasets, and the resulting metrics are used to hypothesize about the implications of the new techniques. All implementations are referenced with pseudocode in the paper, with external testing wrappers programmed in Python. An experimental study was conducted with standard regression datasets and showed the regularizations' value propositions for multi-output data and outlier-heavy data.
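Reading the abstract, BiasWrapperC1 counteracts a regressor that consistently over- or under-shoots; one minimal interpretation is sketched below, where the mean signed training residual is subtracted at prediction time. This is an illustrative reading, not the paper's exact implementation.

```python
# Minimal "bias wrapper" sketch: correct a base regressor by its mean signed residual.
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.tree import DecisionTreeRegressor

class MeanBiasWrapper(BaseEstimator, RegressorMixin):
    def __init__(self, base_estimator):
        self.base_estimator = base_estimator

    def fit(self, X, y):
        self.model_ = clone(self.base_estimator).fit(X, y)
        self.bias_ = float(np.mean(self.model_.predict(X) - y))  # signed training bias
        return self

    def predict(self, X):
        return self.model_.predict(X) - self.bias_

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 3.0
wrapped = MeanBiasWrapper(DecisionTreeRegressor(max_depth=3)).fit(X, y)
print(wrapped.predict(X[:3]))
```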
A Mathematical Theory of Semantic Communication
Kai Niu

and 1 more

February 12, 2024
The year 1948 witnessed the historic moment of the birth of classic information theory (CIT). Guided by CIT, modern communication techniques have approached the theoretical limits, such as the entropy function H(U), the channel capacity C = max_{p(x)} I(X;Y), and the rate-distortion function R(D) = min_{p(x̂|x): E d(X,X̂) ≤ D} I(X;X̂). Semantic communication paves a new direction for future communication techniques, whereas its guiding theory is still missing. In this paper, we try to establish a systematic framework of semantic information theory (SIT). We investigate the behavior of semantic communication and find that synonymy is its basic feature, so we define the synonymous mapping between semantic information and syntactic information. Stemming from this core concept, synonymous mapping, we introduce the measures of semantic information, such as the semantic entropy H_s(Ũ), the up/down semantic mutual information I^s(X;Ỹ) and I_s(X;Ỹ), the semantic capacity C_s = max_{p(x)} I^s(X;Ỹ), and the semantic rate-distortion function R_s(D) = min_{p(x̂|x): E d_s(X,X̂) ≤ D} I_s(X;X̂). Furthermore, we prove three coding theorems of SIT by using random coding and (jointly) typical decoding/encoding, that is, the semantic source coding theorem, the semantic channel coding theorem, and the semantic rate-distortion coding theorem. We find that the limits of SIT are extended by synonymous mapping, that is, H_s(Ũ) ≤ H(U), C_s ≥ C, and R_s(D) ≤ R(D). All these results constitute the basis of semantic information theory. In addition, we discuss the semantic information measures in the continuous case. In particular, for the band-limited Gaussian channel, we obtain a new channel capacity formula, C_s = B log[S^4(1 + P/(N_0 B))], where S is the synonymous length. In summary, the theoretical framework of SIT proposed in this paper is a natural extension of CIT and may reveal great performance potential for future communications.
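For readability, the classical and semantic limits named in the abstract can be restated in display form; the super/subscript placement and the grouping in the Gaussian formula are reconstructed from the garbled text and should be checked against the paper.

```latex
% Classical limits (CIT) and their semantic counterparts (SIT), as named in the abstract.
\begin{align*}
  C      &= \max_{p(x)} I(X;Y), &
  R(D)   &= \min_{p(\hat{x}\mid x):\, \mathbb{E}\, d(X,\hat{X}) \le D} I(X;\hat{X}), \\
  C_s    &= \max_{p(x)} I^{s}(X;\tilde{Y}), &
  R_s(D) &= \min_{p(\hat{x}\mid x):\, \mathbb{E}\, d_s(X,\hat{X}) \le D} I_{s}(X;\hat{X}),
\end{align*}
with $H_s(\tilde{U}) \le H(U)$, $C_s \ge C$, and $R_s(D) \le R(D)$, and, for the
band-limited Gaussian channel with synonymous length $S$,
\begin{equation*}
  C_s = B \log\!\left[ S^{4} \left( 1 + \frac{P}{N_0 B} \right) \right].
\end{equation*}
```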
Quantized Embedding Vectors for Controllable Diffusion Language Models
Cheng Kang

and 3 more

February 06, 2024
Improving the controllability, portability, and inference speed of diffusion language models (DLMs) is a key challenge in natural language generation. While recent research has shown significant success in complex text generation with language models, their memory and computational requirements are still very demanding and fall short of expectations, which naturally results in low portability and instability of the models. To mitigate these issues, numerous well-established methods have been proposed for neural network quantization. To further enhance portability for independent deployment as well as improve stability, evaluated by language perplexity, we propose a novel approach called the Quantized Embedding Controllable Diffusion Language Model (QE-CDLM). QE-CDLM builds upon recent successful controllable DLMs by remodeling the task-specific embedding space via quantization. This leads to a gradient-based controller for the generation tasks, and more stable intermediate latent variables are obtained, which naturally brings in accelerated convergence as well as better controllability. Additionally, an adaptation fine-tuning method is employed to reduce the number of tunable weights. Experimental results on five challenging fine-grained control tasks demonstrate that QE-CDLM compares favorably to existing methods in terms of quality and feasibility, achieving better perplexity and lightweight fine-tuning.
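As a small illustration of quantizing an embedding space, the sketch below replaces each row of an embedding table by its nearest entry in a k-means codebook; the codebook size and the use of k-means are assumptions and do not reflect QE-CDLM's actual quantization scheme.

```python
# Codebook quantization of an embedding matrix (illustrative, not QE-CDLM).
import numpy as np
from sklearn.cluster import KMeans

def quantize_embeddings(E, codebook_size=256, seed=0):
    """Replace each row of E (vocab x dim) with its nearest codebook vector."""
    km = KMeans(n_clusters=codebook_size, n_init=4, random_state=seed).fit(E)
    codes = km.predict(E)                      # one integer code per token
    return km.cluster_centers_[codes], codes   # quantized matrix + codes

rng = np.random.default_rng(0)
E = rng.normal(size=(5000, 64)).astype(np.float32)   # toy embedding table
E_q, codes = quantize_embeddings(E)
print("reconstruction MSE:", float(np.mean((E - E_q) ** 2)))
```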
Two Heads Better than One: Dual Degradation Representation for Blind Super-Resolution

Hsuan Yuan

and 7 more

February 06, 2024
Previous methods have demonstrated remarkable performance in single image super-resolution (SISR) tasks with known and fixed degradation (e.g., bicubic downsampling). However, when the actual degradation deviates from these assumptions, these methods may experience significant declines in performance. In this paper, we propose a Dual Branch Degradation Extractor Network to address the blind SR problem. While some blind SR methods assume noise-free degradation and others do not explicitly consider the presence of noise in the degradation model, our approach predicts two unsupervised degradation embeddings that represent blurry and noisy information, respectively. The SR network can then be adapted to the blur embedding and the noise embedding in distinct ways. Furthermore, we treat the degradation extractor as a regularizer to capitalize on the differences between SR and HR images. Extensive experiments on several benchmarks demonstrate that our method achieves SOTA performance on the blind SR problem.
A RAG-based Question Answering System Proposal for Understanding Islam: MufassirQAS L...
Ahmet yusuf alan
Enis Karaarslan

and 2 more

February 13, 2024
Challenges exist in learning and understanding religions, such as the complexity and depth of religious doctrines and teachings. Chatbots acting as question-answering systems can help address these challenges. LLM chatbots use NLP techniques to establish connections between topics and accurately respond to complex questions. These capabilities make them well suited to serve as question-answering chatbots for enlightenment on religion. However, LLMs also tend to generate false information, known as hallucination. In addition, chatbot responses can include content that insults personal religious beliefs or touches on interfaith conflicts and controversial or sensitive topics. The system must avoid such cases without promoting hate speech or offending certain groups of people or their beliefs. This study uses a vector database-based Retrieval Augmented Generation (RAG) approach to enhance the accuracy and transparency of LLMs. Our question-answering system is called MufassirQAS. We created a database consisting of several open-access books that include Turkish context. These books contain Turkish translations and interpretations of Islam. This database is utilized to answer religion-related questions and ensure our answers are trustworthy. The relevant part of the dataset, which the LLM also uses, is presented along with the answer. We have put careful effort into creating system prompts that instruct the model to avoid harmful, offensive, or disrespectful responses, to respect people's values, and to provide reliable results. The system answers and shares additional information, such as the page number from the respective book and the articles referenced for obtaining the information. MufassirQAS and ChatGPT were also tested with sensitive questions, and our system achieved better performance. The study and enhancements are still in progress; results and future work are presented.
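A minimal retrieval-augmented prompting loop, of the general kind the abstract describes, might look like the following; TF-IDF retrieval stands in for the vector database, and the passages, question, and prompt template are hypothetical placeholders rather than MufassirQAS components.

```python
# Minimal RAG-style retrieval + prompt assembly (illustrative placeholders only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

passages = [
    "Hypothetical passage about charity, from Book A, page 12.",
    "Hypothetical passage about fasting, from Book B, page 98.",
    "Hypothetical passage about prayer times, from Book A, page 45.",
]
question = "What do the sources say about fasting?"

vec = TfidfVectorizer().fit(passages + [question])
scores = cosine_similarity(vec.transform([question]), vec.transform(passages))[0]
top_k = scores.argsort()[::-1][:2]                       # retrieve the best 2 passages

context = "\n".join(passages[i] for i in top_k)
prompt = (
    "Answer only from the context below and cite the book and page.\n"
    f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
# The prompt would then be sent to the LLM; a system prompt would additionally
# instruct the model to refuse harmful or disrespectful answers.
print(prompt)
```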
A Cross Attention Approach to Diagnostic Explainability Using Clinical Practice Guide...
Sumit Dalal

and 5 more

February 06, 2024
The lack of explainability using relevant clinical knowledge hinders the adoption of Artificial Intelligence-powered analysis of unstructured clinical dialogue. A wealth of relevant, untapped Mental Health (MH) data is available in online communities, providing the opportunity to address the explainability problem with substantial potential impact as a screening tool for both online and offline applications. We develop a method to enhance attention in popular transformer models and generate clinician-understandable explanations for classification by incorporating external clinical knowledge. Inspired by how clinicians rely on their expertise when interacting with patients, we leverage relevant clinical knowledge to model patient inputs, providing meaningful explanations for classification. This will save manual review time and engender trust. We develop such a system in the context of MH using clinical practice guidelines (CPG) for diagnosing depression, a mental health disorder of global concern. We propose an application-specific language model called ProcesS knowledge-infused cross ATtention (PSAT), which incorporates CPGs when computing attention. Through rigorous evaluation on three expert-curated datasets related to depression, we demonstrate application-relevant explainability of PSAT. PSAT also surpasses the performance of nine baseline models and can provide explanations where other baselines fall short. We transform a CPG resource focused on depression, such as the Patient Health Questionnaire (e.g. PHQ-9) and related questions, into a machine-readable ontology using SNOMED-CT. With this resource, PSAT enhances the ability of models like GPT-3.5 to generate application-relevant explanations.
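The cross-attention mechanism at the heart of such a model can be illustrated in a few lines of PyTorch, where queries come from the patient utterance and keys/values from per-item guideline embeddings; the random embeddings below are stand-ins, and this is not PSAT's architecture or its knowledge-infusion scheme.

```python
# Cross attention between utterance tokens and guideline-item embeddings (sketch).
import torch
import torch.nn as nn

d_model = 64
attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=4, batch_first=True)

utterance_tokens = torch.randn(1, 12, d_model)   # patient text (12 tokens, random stand-in)
phq9_items = torch.randn(1, 9, d_model)          # one embedding per PHQ-9 question

# Queries come from the utterance, keys/values from the guideline items, so the
# attention weights indicate which PHQ-9 item each token aligns with.
out, weights = attn(utterance_tokens, phq9_items, phq9_items, need_weights=True)
print(weights.shape)   # (1, 12, 9): token-to-question attention map
```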
TatarTTS: An Open-Source Text-to-Speech Synthesis Dataset for the Tatar Language
Daniil Orel

and 4 more

February 06, 2024
This paper introduces an open-source dataset for speech synthesis in the Tatar language. The dataset comprises approximately 70 hours of transcribed audio recordings, featuring two professional speakers (one male and one female). Notably, it is the first large-scale dataset of its kind that is publicly available, aimed at promoting Tatar text-to-speech (TTS) applications in both academic and industrial contexts. The paper describes the procedures for developing the dataset, discusses the challenges faced, and outlines important future directions. To demonstrate the reliability of the dataset, baseline end-to-end TTS models were built and evaluated using the subjective mean opinion score (MOS) measure. The dataset, training recipe, and pre-trained TTS models are publicly available.
Parallel Computation of Shallow Water Flows Using Hybrid MPI/OpenACC
Syngman Rhee

February 06, 2024
A parallel shallow water flow model is introduced in this paper. The explicit-time finite volume approach is adopted to solve the 2D shallow water equations on an unstructured triangular mesh. The proposed scheme is second-order accurate in time and space using the two-stage Runge-Kutta method and the monotone upwind scheme for conservation laws (MUSCL), respectively. Based on the Message Passing Interface (MPI) and OpenACC, a multi-GPU model is presented, with the METIS library used to produce the domain decomposition. A CUDA-aware MPI library with GPUDirect for peer-to-peer (P2P) transfers between GPUs, together with overlapping of computation and MPI communication, is used to speed up MPI memory exchange and the performance of the code. A 2D circular dam-break test with wet and dry downstream beds and a grid resolution of about 2 million cells is considered to verify the accuracy of the code, and good agreement was achieved with the numerical simulations of published studies. Compared with the multi-CPU version on a 6-core CPU, maximum speedups of 56.18 and 331.51 were obtained using the single-GPU and multi-GPU versions, respectively. Results indicate that acceleration performance improves as the mesh resolution increases.
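The MPI side of such a scheme can be sketched with mpi4py as a 1D domain decomposition with ghost-cell (halo) exchange between neighboring ranks; the OpenACC/GPU kernels and METIS partitioning are omitted, and the array sizes are illustrative.

```python
# Halo exchange for a 1D domain decomposition with mpi4py (sketch; run with mpiexec).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 1000                            # interior cells owned by this rank
h = np.full(n_local + 2, float(rank))     # water depth with one ghost cell per side

left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Exchange boundary cells with neighbors; overlapping this with interior work is
# what hides the communication cost in the paper's setup.
comm.Sendrecv(h[1:2], dest=left, recvbuf=h[-1:], source=right)
comm.Sendrecv(h[-2:-1], dest=right, recvbuf=h[:1], source=left)

print(f"rank {rank}: ghost cells = {h[0]}, {h[-1]}")
```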
Neuromorphic Event-based Processing of Transradial Intraneural Recording for Online G...
Farah Baracat

and 4 more

February 06, 2024
Benchmarking Reservoir Computing for Residential Energy Demand Forecasting
Karoline Brucke

and 6 more

February 12, 2024
In the energy sector, accurate demand forecasts are vital but often limited by the available computational power. Reservoir computing (RC), or echo-state networks, excels in chaotic time series prediction with lower computational requirements than other recurrent-network-based methods such as LSTMs. Next-generation reservoir computing (NG-RC) is a newer, more efficient variant of classical RC that originates from nonlinear vector autoregression and therefore lacks the randomness of classical RC. In our study, we evaluate RC and NG-RC for day-ahead energy demand prediction on four data sets and compare them to LSTMs and a naive persistence approach. We find that NG-RC outperforms all other methods in terms of root mean squared error on all data sets but struggles with very small or zero demands. Additionally, it offers very computationally efficient hyperparameter optimization and excels at replicating the inherent volatility and erratic behavior of energy demands.
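A minimal next-generation reservoir computing (NVAR-style) forecaster can be sketched as a time-delay embedding followed by quadratic polynomial features and a ridge readout; the lag count, ridge strength, and synthetic demand series below are assumptions, not the paper's configuration.

```python
# NG-RC-style one-step-ahead forecasting sketch: delay embedding + quadratic
# features + ridge readout on a synthetic daily demand profile.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures

def delay_embed(series, lags):
    # Row i holds series[i .. i+lags-1]; the target is series[i+lags].
    return np.column_stack(
        [series[lags - k - 1 : len(series) - k - 1] for k in range(lags)]
    )

rng = np.random.default_rng(0)
t = np.arange(2000)
demand = 1.0 + 0.5 * np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(t.size)

lags = 6
X = PolynomialFeatures(degree=2, include_bias=True).fit_transform(delay_embed(demand, lags))
y = demand[lags:]

split = 1500
model = Ridge(alpha=1e-3).fit(X[:split], y[:split])
rmse = np.sqrt(np.mean((model.predict(X[split:]) - y[split:]) ** 2))
print("test RMSE:", rmse)
```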
On the Dependability of Bidirectional Encoder Representations from Transformers (BERT...

Zhen Gao

and 7 more

January 29, 2024
Transformers are widely used in natural language processing and computer vision, and Bidirectional Encoder Representations from Transformers (BERT) is one of the most popular pre-trained transformer models for many applications. This paper studies the dependability of BERT and the impact of soft errors on it when implemented with different floating-point formats, using two case studies: sentence emotion classification and question answering. Simulation by error injection is conducted to assess the impact of errors on different parts of the BERT model and different bits of the parameters. The analysis of the results leads to the following findings: 1) in both single and half precision, there is a Critical Bit (CB) on which errors significantly affect the performance of the model; 2) in single precision, errors on the CB may cause overflow in many cases, which leads to a fixed result regardless of the input; 3) in half precision, the errors do not cause overflow, but they may still introduce a large accuracy loss. In general, the impact of errors is significantly larger on single-precision than on half-precision parameters. Error propagation analysis is also conducted to further study the effects of errors on different types of parameters and to reveal the mitigation effects of the activation function and the intrinsic redundancy of BERT.
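The bit-level error injection described here can be sketched by viewing float32 parameters as unsigned integers and XOR-ing a single-bit mask; which bit acts as the Critical Bit is the paper's finding, and bit 30 is used below only as an example of an exponent bit.

```python
# Single-bit-flip injection into float32 values (sketch of the fault model).
import numpy as np

def flip_bit(values, bit):
    """Flip one bit of each float32 value by XOR-ing its uint32 representation."""
    ints = values.astype(np.float32).view(np.uint32) ^ np.uint32(1 << bit)
    return ints.view(np.float32)

w = np.array([0.0123, -1.5, 3.2e-4], dtype=np.float32)
print("original:", w)
print("bit 30  :", flip_bit(w, 30))   # exponent bit -> huge magnitude change
print("bit 10  :", flip_bit(w, 10))   # mantissa bit -> small perturbation
```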
P2C: A Paths-to-Crowds Framework to Parameterize Behaviors
Marilena Lemonari

and 6 more

January 29, 2024
Simulating believable crowds heavily relies upon the perceived realism and diversity of the agents' behaviors, whilst generating novel crowd simulations highly depends on being able to easily manipulate intuitive behavior parameters. We present P2C (Paths-to-Crowds), a method that parameterizes reference crowd data and enables the simulation of similar behaviors in different environments. This approach enables explainable and fine-grained control of simulations, since artists can modify a small set of intuitive parameters in local regions at run-time. We integrate four fundamental behaviors (goal-seeking, grouping, interaction with areas of interest, and connectivity) into an existing Reinforcement Learning (RL)-based crowd simulation system, which facilitates the creation of customizable agent behaviors and interactions. To learn a parameter model, we synthesize numerous simulations by sampling the parameter space; we then use these data to learn a model that outputs the underlying parameters. To achieve this, we encode each simulation as a set of 2D maps that capture different measurements such as velocities, occupancy, interpersonal distances, and path deviations. The trained model can then be used to infer parameters in localized regions of given crowd data (both real and simulated). This approach enables replication of behavior, transfer to new environments, real-time local control, editing of parameters, and explainability of behaviors (with respect to the fundamental behaviors). We evaluate our model's predictive power on real data and compare it against existing baselines. This, along with the accompanying user study, reveals P2C's potential in terms of achieving behavior realism and diversity.
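Encoding a crowd snapshot as 2D maps, as described above, can be sketched by binning agent positions into a grid and accumulating per-cell statistics; the grid size and synthetic trajectories below are illustrative, not the paper's exact representation.

```python
# Encode a crowd snapshot as per-cell occupancy and mean-speed maps (sketch).
import numpy as np

rng = np.random.default_rng(0)
pos = rng.uniform(0, 10, size=(200, 2))      # agent positions in a 10 m x 10 m area
vel = rng.normal(0, 1, size=(200, 2))        # agent velocities
speed = np.linalg.norm(vel, axis=1)

bins = [np.linspace(0, 10, 33)] * 2          # 32 x 32 grid
occupancy, _, _ = np.histogram2d(pos[:, 0], pos[:, 1], bins=bins)
speed_sum, _, _ = np.histogram2d(pos[:, 0], pos[:, 1], bins=bins, weights=speed)
mean_speed = np.divide(speed_sum, occupancy,
                       out=np.zeros_like(speed_sum), where=occupancy > 0)

maps = np.stack([occupancy, mean_speed])     # channels fed to the parameter model
print(maps.shape)                            # (2, 32, 32)
```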