Publications
2023
Automatic subtitle generation for Bengali multimedia using deep learning
Ehsanur Rahman Rhythm, Shafakat Sowroar Arnob, Rajvir Ahmed Shuvo, Annajiat Alim Rasel and Sifat E Jahan
BSc. Thesis 2023
Abstract:

For audio or video material to be more inclusive and accessible, automatic subtitle generation is essential. Nevertheless, implementing this technology for Bengali presents significant challenges due to scarce resources and linguistic complexity. In this study, a new deep learning based system for automatically creating subtitles for Bengali multimedia is introduced. The suggested approach makes use of the Wav2vec2 model and the Common Voice Bengali Dataset, a large collection of Bengali audio recordings. This study uses the Common Voice Bengali Dataset to train and tune the Wav2vec2 model in order to accurately convert Bengali audio into text. Current automatic speech recognition approaches are combined with Bengali language-specific factors in the created system to give accurate and reliable transcriptions. The transcribed text is synced with the matching audio segments throughout the subtitle production process. The produced subtitles are enhanced using post-processing approaches, such as capitalization and punctuation restoration, to ensure readability and consistency. The findings of this study might greatly improve Bengali language media’s usability and availability across a range of sectors. The created subtitles may enhance the viewing experience for Bengali multimedia by easing understanding and expanding availability. The study demonstrates the potential of using deep learning and ASR methods to overcome the difficulties of automated subtitle production in the Bengali language, advancing multimedia availability and inclusion.

2023
ReSkipNet: Skip Connected Convolutional Autoencoder for Original Document Denoising
Mohammad Muhibur Rahman*, Anushua Ahmed*, Mohammad Rakibul Hasan Mahin, Fahmid Bin Kibria, Waheed Moonwar, Ehsanur Rahman Rhythm and Annajiat Alim Rasel
ICCIT 2023
Abstract:

Data pre-processing, data analysis, and Optical Character Recognition need a huge amount of clean data, and document images are usually a good source for this. However, document images frequently exhibit blurring and various other forms of noise, which can pose challenges in their manipulation and analysis. To denoise and deblur such document images, autoencoders have been used for a long time. For this task, we propose a novel Convolutional Autoencoder Network which is composed of multiple skip-connected residual blocks and other layers for supporting the encoder and decoder parts. This model not only uses less computational power to denoise existing document image datasets but also performs well. While prior research primarily concentrates on optimizing evaluation metrics, our approach additionally prioritizes larger resolution input sizes. This characteristic of using larger image sizes enhances its practicality and usability as real-world documents are typically characterized by a higher word density. Moreover, in order to further advance the development of our model, we produced an original dataset and proceeded to train our model on this dataset, resulting in satisfactory outcomes.

2023
Siamese-Transformer Network for Offline Handwritten Signature Verification using Few-shot
Prattoy Majumder, AFM Mohimenul Joaa, Ehsanur Rahman Rhythm, Md Humaion Kabir Mehedi and Annajiat Alim Rasel
ICCIT 2023
Abstract:

Handwritten signature verification is a crucial task with applications spanning authentication, financial transactions, and legal documents. In scenarios where only a single reference signature is available, the challenge of accurate verification becomes pronounced due to variations in writing styles, distortions, and limited labeled data. In this paper, we propose a novel Siamese-Transformer network tailored for handwritten signature verification using few-shot learning. By synergizing Siamese neural networks and Transformer architectures, our model excels in capturing contextual relationships and discerning genuine from forged signatures. A triplet loss function facilitates discriminative feature learning. Convolution layers extract local features from an image, while the transformer component utilizes these local features to capture global dependencies within signatures. Experimental results on benchmark datasets showcase the model’s superior performance in few-shot verification scenarios, marking it as a promising advancement in signature verification and few-shot learning techniques.
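
The triplet loss mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state the distance function or the margin, so the Euclidean distance, the default `margin=1.0`, and the function names below are assumptions for illustration only, not the authors' implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor/positive would be embeddings of two genuine signatures from the
    same writer; negative would be the embedding of a forgery (assumed setup).
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# The loss is zero once the negative is farther from the anchor than the
# positive by at least the margin, and positive otherwise.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

In this toy case the first triplet is already well separated and contributes no loss, while the second one is penalized because the forgery embedding sits closer to the anchor than the genuine one.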

2023
Sentiment Analysis of Amazon Reviews Using Machine Learning Classifier
Razin Sumyta Monsoor, Tania Sultana Tamanna, Salequzzaman Khan, Shehrin Hoque, Mahdi Islam, Ehsanur Rahman Rhythm, Md Humaion Kabir Mehedi and Annajiat Alim Rasel
ICCIT 2023
Abstract:

Reviews can significantly impact a company’s reputation in the market, potentially influencing its overall business outcomes, either positively or negatively. This is especially crucial for companies that operate primarily through e-commerce platforms. Hence, it is vital for companies to pay close attention to customer reviews. Sentiment Analysis, often referred to as "opinion mining," is a significant procedure in Natural Language Processing (NLP) which serves the purpose of ascertaining the emotional tone of a provided text and categorizing it into positive, negative, or neutral perspectives. In this paper, we present a sentiment analysis methodology for classifying Amazon reviews, which utilizes a large dataset of reviews and employs Multinomial Naïve Bayesian (MNB), Support Vector Machine (SVM), Maximum Entropy (ME), and Logistic Regression as the primary classifiers. With the aid of machine learning, we applied a supervised learning approach to an extensive Amazon dataset in order to categorize it based on sentiment polarity, achieving a high level of accuracy. Here, we utilized a Kaggle dataset that includes a substantial volume of customer reviews and ratings on Amazon products, along with associated metadata.

2023
Contrail Analysis through Advanced Neural Network Architectures: Image Segmentation and Classification
Ashraful Alam Nirob, Shahriar Ahmed, Tahmidul Karim Takee, Adnan Rahman Eshan, Shadik Ul Haque, Ehsanur Rahman Rhythm and Annajiat Alim Rasel
ICCIT 2023
Abstract:

The aviation industry’s immense expansion is having an impact on global warming and has resulted in some significant environmental issues. When an airplane passes directly over them, tiny crisscross patterns, often known as contrails, may be visible. They are to blame for this effect. Contrails are really just airborne particles that have been compressed with water. They are uncommon since ice can only form under particular climatic circumstances, such as extremely cold, hot, humid, and saturating air. Even worse, because of the cooler climate at night, it is more dangerous because it has more time to live. They gather heat from the sun and store it, then release it into the atmosphere. Some experts have also warned the public that the radiation these contrails produce may be more damaging to the atmosphere than previously predicted. For this reason, scientists are looking for methods to reduce these contrails by comprehending their behaviors and patterns. Now, the proposed study segments and classifies images of contrails acquired from satellite data. In this study, complex neural network architectures, including U-Net, DeepLab, Attention Mechanism, and ResNet50 with CNN, are used to segment and binary classify those photos. These architectural frameworks will aid this research in effectively classifying and segmenting those contrails from the satellite images so that further research can comprehend and observe their patterns and behaviors.

2023
Skin Lesion Detection and Classification Using Machine Learning: A Comprehensive Approach for Accurate Diagnosis and Treatment
H M Layes Delower, Tasin Mohammad, Shifath Jahan Prity, Maliha Binta Islam, Mohammod Tahseen Mansoor, Ehsanur Rahman Rhythm and Annajiat Alim Rasel
ICCIT 2023
Abstract:

Cutaneous abnormalities, commonly known as skin lesions, form a broad spectrum of skin irregularities that necessitates proper identification and immediate treatment. A significant development in the utilization of machine learning approaches for analyzing medical imagery has been observed recently, particularly in the automatic detection and categorization of skin lesions. This academic study discusses an extensive technique for recognizing and categorizing skin lesions using machine learning protocols. The key cornerstone is the HAM10000 dataset, which comprises 10,000 images portraying discolored skin conditions varying in types and patient demographics. Our analysis examines the efficiency of Decision Trees, Support Vector Machines (SVMs), Random Forests, and K-Nearest Neighbors (KNN) in detecting and classifying such lesions. Stringent evaluation procedures involving Accuracy, Precision, Recall, and F1-score have been employed to measure these algorithms’ efficacy alongside their possible influence within clinical settings. Overall model performance was strong, with Support Vector Machines (SVMs) achieving the highest accuracy of 94.8%, Decision Tree giving an accuracy of 94.3%, Random Forest coming a close third with an accuracy of 94.1%, and K-Nearest Neighbors (KNN) giving an accuracy of 93.7%. The presented methodology contributes meaningfully to progress in dermatology by generating precise diagnostic instruments that are beneficial for both healthcare professionals and patients suffering from these anomalies. This inquiry underscores how machine learning could elevate health outcomes by improving early recognition processes and enabling personalized therapeutics directed at treating skin lesions effectively.

2023
Vision Meets Language: Multimodal Transformers Elevating Predictive Power in Visual Question Answering
Sajidul Islam Khandaker, Tahmina Talukdar, Prima Sarker, Md Humaion Kabir Mehedi, Ehsanur Rahman Rhythm and Annajiat Alim Rasel
ICCIT 2023
Abstract:

Visual Question Answering (VQA) is a field where computer vision and natural language processing intersect to develop systems capable of comprehending visual information and answering natural language questions. In visual question answering, algorithms interpret real-world images in response to questions expressed in human language. Our paper presents an extensive experimental study on Visual Question Answering (VQA) using a diverse set of multimodal transformers. The VQA task requires systems to comprehend both visual content and natural language questions. To address this challenge, we explore the performance of various pre-trained transformer architectures for encoding questions, including BERT, RoBERTa, and ALBERT, as well as image transformers, such as ViT, DeiT, and BEiT, for encoding images. Multimodal transformers’ smooth fusion of visual and text data promotes cross-modal understanding and strengthens reasoning skills. On benchmark datasets like the Visual Question Answering (VQA) v2.0 dataset, we rigorously test and fine-tune these models to assess their effectiveness and compare their performance to more conventional VQA methods. The results show that multimodal transformers significantly outperform traditional techniques in terms of performance. Additionally, the models’ attention maps give users insights into how they make decisions, improving interpretability and comprehension. Because of their adaptability, the tested transformer topologies have the potential to be used in a wide range of VQA applications, such as robotics, healthcare, and assistive technology. This study demonstrates the effectiveness and promise of multimodal transformers as a method for improving the effectiveness of visual question-answering systems.

2023
Genre Classification: A Machine Learning Based Comparative Study of Classical Bengali Literature
Asadullah Al Galib, Maisha Mostofa Prima, Satabdi Rani Debi, MD Muntasir Mahadi, Nayema Ahmed, Ehsanur Rahman Rhythm, Adib Muhammad Amit and Annajiat Alim Rasel
ICCIT 2023
Abstract:

Bengali literature, specifically classical Bengali literature has been a source of inspiration, a spark for paradigm-shifting revolutions, and the sole sustaining source of cultural thirst for hundreds of millions of people over many generations. Unfortunately, very few attempts have been made to analyze this never-ending collection of literary works from the luminary figures of Bengali literature. The availability of high-quality research-ready datasets comprising all the authenticated literary works has been a key obstacle in conducting NLP research, utilizing the most recent advancements in deep learning and large language models. Identifying the genre of a given text snippet is a key step in analyzing a vast collection of works comprising different styles, themes, and motivations from classical authors. From classifying previously unexplored archival documents to identifying and suggesting similar literary works for modern recommender engines, genre classification opens the door for many downstream and specialized use cases. In this project, we initiate an ambitious goal of compiling a comprehensive dataset of literary works from classical authors and eventually extending the collection to contemporary writers as well. We explore both classical methods such as Naive Bayes as well as LSTM and recent transformer-based models to classify genre from short text snippets. We concluded that fine-tuning pre-trained BERT models produced much higher accuracy than both classical and LSTM models.

2023
Analyzing Public Sentiment on Social Media during FIFA World Cup 2022 using Deep Learning and Explainable AI
Shafakat Sowroar Arnob, M. A. Ahad Shikder, Tashfiq Alam Ovey, Ehsanur Rahman Rhythm and Annajiat Alim Rasel
ICCIT 2023
Abstract:

Analysis of public sentiment is extremely useful for comprehending the responses of the general public during important events, and the FIFA World Cup 2022 was no exception. Within the scope of this study, we used deep learning models such as RoBERTa, DistilBERT, and XLNet to analyze the views expressed on Twitter during the first day of the tournament. These models were fine-tuned using a comprehensive, preprocessed dataset of 30,000 tweets. The performance of these models was assessed using measures such as accuracy, F1-score, precision, and recall. In addition, we used an explainable AI technique known as Local Interpretable Model-Agnostic Explanations (LIME) so that we could better understand how model decisions were made in sentiment classification. Our research has shown that RoBERTa is an excellent model for classifying sentiment, and it has also shown the significance of interpretability achieved using LIME. Our research enhances the understanding of sentiment analysis during major sports events and suggests future directions for research in this domain.

2023
Text-based Q&A: Automated Question Generation and Answering for Enhanced Data Processing
Ehsanur Rahman Rhythm, Abdul Halim Hosain, Nusrat Zaman Raya, Kazi Al Refat Pranta, Tonusree Talukder Trina, Md Sabbir Hossain, Md Humaion Kabir Mehedi and Annajiat Alim Rasel
CATS 2023
Abstract:

In the age of information overload, where the world is getting increasingly digital, traditional methods of learning are becoming tedious and outdated. Imagine a system that, with just a few clicks, can quickly generate challenging questions and enlightening solutions from a given text. This paper presents a groundbreaking system that uses state-of-the-art natural language processing techniques to analyze subject-specific chapters and create questions and corresponding solutions of varying lengths. The system's capability to handle a wide range of domains makes it a versatile tool for a wide range of users, including students, researchers, and educators. By offering insightful questions along with appropriate answers, the system demonstrated its extraordinary accuracy and competency in our studies on an array of informational datasets. Altering the learning process and promoting knowledge discovery, this program is flexible in delivering brief or comprehensive answers to inquiries, and has the potential to completely change how individuals interact with written material, whether they are reading for quick reference or conducting in-depth research. Our suggested framework establishes a dynamic platform for immediate information that enables people to learn substantially more and comprehend any topic in depth. With this method, bid goodbye to exhausting manual question generation and embrace a new era of seamless and fruitful learning.

2023
A Comparative Analysis of Customer Service Chatbots: Efficiency, Usability and Application
Kefaiat Lamia Ehsani, Ehsanur Rahman Rhythm, Md Humaion Kabir Mehedi and Annajiat Alim Rasel
CATS 2023
Abstract:

A chatbot is a computer programme that imitates and processes human interaction through voice or text communication. Its purpose is to assist in finding a solution to a problem. The transformation brought on by advances in technology has had an effect on every industry. Chatbots provide assistance with a wide variety of tasks, including reservations, customer service, and many other services. The fast development of technologies relating to artificial intelligence and natural language processing has resulted in an increase in the use of chatbots in a variety of fields, most notably in customer service. Customers can receive advice that is prompt, accurate, and personalised through the use of chatbots, which have the potential to completely transform customer service. Because they can automate customer service and reduce the amount of work that needs to be done by humans, chatbots have gained a lot of popularity in the business world and can help businesses improve the experience they provide for their customers. The purpose of this research is to undertake a comparative review of customer service chatbots, with a particular emphasis on their efficiency, usability, and application across a variety of business sectors. The research will uncover best practices, difficulties, and potential for improvement by analysing a variety of chatbot solutions.

2023
Unveiling Twitter Sentiments: Analyzing Emotions and Opinions through Sentiment Analysis on Twitter Dataset
Jannatul Ferdoshi, Samirah Dilshad Salsabil, Ehsanur Rahman Rhythm, Md Humaion Kabir Mehedi and Annajiat Alim Rasel
CATS 2023
Abstract:

Social media plays a vital role in our daily lives. To understand and interpret emotions and opinions expressed on social media platforms, analyzing sentiment is very important. Our study is based on Twitter sentiment analysis. Our aim is to classify tweets automatically as positive, negative, or neutral based on their content using natural language processing and machine learning algorithms. The dataset we used for our analysis was extracted from Mendeley Data, and we also manually added some tweets covering various topics. We pre-processed the dataset to remove noise, including URLs, hashtags, punctuation, and user mentions, while retaining essential textual content and emojis. Additionally, for our research, we used VADER (Valence Aware Dictionary and sEntiment Reasoner) and a Transformer-based RoBERTa model to analyze the sentiment of various tweets. We evaluate the performance of these two models on the testing set using evaluation metrics such as accuracy, precision, recall, and F1-score, as well as confusion matrices. We also discuss the study's limitations and conclude that machine learning-based sentiment analysis models are a reliable tool for the sentiment analysis of the Twitter dataset.
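
The pre-processing step described in this abstract (dropping URLs, hashtags, punctuation, and user mentions while keeping text and emojis) might look roughly like the sketch below. The regular expressions, the choice to drop whole hashtags rather than only the `#` symbol, and the name `clean_tweet` are assumptions; the abstract does not specify these details.

```python
import re
import string

# Hypothetical patterns; the paper's exact rules are not given in the abstract.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# string.punctuation covers ASCII punctuation only, so emojis pass through.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Remove URLs, mentions, hashtags, and ASCII punctuation from a tweet."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split())


cleaned = clean_tweet("Loving it!!! 😀 @bob check https://t.co/abc #wow")
```

URLs are removed before punctuation so that their slashes and dots do not leave fragments behind.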

2023
Comparative Analysis of Traditional and Contextual Embedding for Bangla Sarcasm Detection in Natural Language Processing
Kaji Mehedi Hasan Fahim, Mithila Moontaha, Mashrur Rahman, Ehsanur Rahman Rhythm and Annajiat Alim Rasel
COMNETSAT 2023
Abstract:

Sarcasm, a sort of sentiment characterized by a disparity between the apparent and intended meanings of the text, is a key component of sentiment analysis, opinion extraction, and social media analytics. However, sarcasm detection in Bangla has not received sufficient research attention yet. Moreover, there hasn’t been a significant amount of study done comparing traditional and contextual word embeddings for the Bengali language. This study aims to address this gap by comparing traditional embeddings, used with a Bidirectional Gated Recurrent Unit (BiGRU) model, and contextual embeddings from Bidirectional Encoder Representations from Transformers (BERT) for sarcasm detection in Bangla. The dataset of Bangla text was collected from social media platforms and contains instances labelled as sarcastic or non-sarcastic. Pre-trained word embeddings, i.e. GloVe and FastText, are used as the traditional embeddings for this study. The performance of both models has been measured using metrics like precision, recall and F1-score. When the two traditional word embedding approaches are compared, GloVe embedding with BiGRU has outperformed FastText embedding with a macro-averaged F1 score of 0.9395. On the other hand, contextual word embedding using BERT has outperformed both the traditional approaches, with a better macro-averaged F1 score of 0.9572 and greater class-wise performance as compared with traditional embedding for both non-sarcastic (96%) and sarcastic (96%) text detection. In our findings, contextual word embedding, i.e. BERT, has performed better than the two traditional word embeddings for this specific Bangla sarcasm detection binary classification task.
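
For readers unfamiliar with the macro-averaged F1 score reported in this abstract, the following is a minimal sketch of how it is computed: F1 is calculated per class and the per-class scores are averaged with equal weight. The function name and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("y_true and y_pred must not be empty")
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy binary example: 1 = sarcastic, 0 = non-sarcastic (made-up labels).
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

Because each class counts equally, macro-averaging does not let a large majority class hide poor performance on a smaller one, which is one reason it is commonly reported for binary tasks like this.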

2023
Bengali Misogyny Identification with Deep Learning and LIME
Shafakat Sowroar Arnob, M. A. Ahad Shikder, Tashfiq Alam Ovey, Ehsanur Rahman Rhythm and Annajiat Alim Rasel
COMNETSAT 2023
Abstract:

The increase of misogyny across social media platforms highlights the urgent need to create efficient tools for recognizing and responding to gender-based online abuse. This study explores the complex problem of identifying instances of sexism in the Bengali language, a field that has a limited amount of research conducted due to a lack of financial resources and academic interest. We study the performance of BERT-based architectures in recognizing misogynistic language by using the capabilities of deep learning models. Our research hypothesis is that enhancing the mBERT model with linguistic and cultural variety by employing multilingual training, such as merging Bengali, Hindi, and English data for training, will improve the ability to detect misogyny in Bengali, potentially transcending language barriers. We offer two extensive experiments that assess the performance of the models and give insight into the strengths and limits of those models. In addition, we employ LIME to uncover the decision-making processes of the models, enhancing their interpretability. Our results contribute to the development of improved methods for identifying online sexism, offering insights for the creation of safer digital environments. This study lays the groundwork for future research on language-specific nuances and cross-lingual trends in the field of gender-based abuse detection.

2023
Detecting Derogatory Comments on Women using Transformer-Based Models
Sara Jerin Prithila, Fariha Hasan Tonima, Tahsina Tajrim Oishi, Md. Nazrul Islam, Ehsanur Rahman Rhythm and Annajiat Alim Rasel
COMNETSAT 2023
Abstract:

Natural Language Processing (NLP) is a field of growing interest nowadays, as it helps AI to understand and interpret human languages. To facilitate advancement in this field, in this paper we propose research on the detection of derogatory comments against women with the help of transformer-based models. Here, our main focus is to detect misogynistic comments, as the women of our country mainly get harassed by such texts. This paper aims to make a comparative study of how efficient transformer models are in detecting gender-biased slander in languages such as English and Bengali. To carry out this research, we used datasets in English and Bengali, on which the following transformer models were trained: BanglaBERT, XLM-RoBERTa, m-BERT, and DistilBERT. To enrich the study, the Bengali and English datasets were created by combining multiple different datasets in these languages. The datasets were extracted from various papers related to this or a similar field of research to help reduce biases and improve language understanding capability. Upon training the mentioned models on our datasets, Bangla-BERT-Base performed the best on the Bengali dataset with an F1 score of 94%, and m-BERT performed the best on the English dataset with an F1 score of 86.1%. Furthermore, since the paper mostly focuses on the Bengali language, it will encourage others to increase research on low-resource languages.

2023
Large Scale Web Crawling and Distributed Search Engines: Techniques, Challenges, Current Trends, and Future Prospects
Asadullah Al Galib, Md Humaion Kabir Mehedi, Ehsanur Rahman Rhythm and Annajiat Alim Rasel
ICOCI 2023
Abstract:

The heart of any substantial search engine is a crawler. A crawler is a program that collects web pages by following links from one web page to the next. Due to our complete dependence on search engines for finding information and insights into every aspect of human endeavors, from finding cat videos to the deep mysteries of the universe, we tend to overlook the enormous complexities of today’s search engines powered by the web crawlers to index and aggregate everything found on the internet. The sheer scale and technological innovation that enabled the vast body of knowledge on the internet to be indexed and easily accessible upon queries is constantly evolving. In this paper, we look at the current state of the massive apparatus of crawling the internet, specifically focusing on deep web crawling, given the explosion of information behind an interface that cannot be extracted from raw text. We also explore distributed search engines and the way forward for finding information in the age of large language models like ChatGPT or Bard. Our primary goal is to explore the junction of large-scale web crawling and search engines in an integrative approach to identify the emerging challenges and scopes in massive data where recent advancements in AI upend traditional means of information retrieval. Finally, we present the design of a new asynchronous crawler that can extract information from any domain into a structured format.

2023
Advancements in Optical Character Recognition for Bangla Scripts
MD Tanjim Mostafa, Ehsanur Rahman Rhythm, Md Humaion Kabir Mehedi and Annajiat Alim Rasel
ASYU 2023
Abstract:

Optical Character Recognition (OCR) systems are very powerful tools that are used to convert handwritten texts or digital data on an image to machine-readable text. The importance of Optical Character Recognition for handwritten documents cannot be overstated due to its widespread use in human transactions. OCR technology allows for the conversion of various types of documents or images into machine-understandable data that can be analyzed, edited, and searched. In earlier years, manually crafted feature extraction techniques were used on comparatively small datasets, which were not good enough for practical use. With the advent of deep learning, it became possible to perform OCR tasks more efficiently and accurately than ever before. In this paper, several OCR techniques have been reviewed. We mostly reviewed works on Bangla scripts and also gave an overview of contemporary works and recent progress in OCR technology (e.g. TrOCR, transformers with CNNs). It was found that for Bangla handwritten texts, CNN models like DenseNet121, ResNet50, MobileNet, etc. are the commonly adopted techniques because of their state-of-the-art performance in object recognition tasks. Using an RNN layer like LSTM or GRU alongside the base CNN-based architecture, the accuracy can be further improved. TrOCR is a fairly new technique in this field that shows promise. Experimental results show that on the synthetic IAM handwriting dataset it achieved a Character Error Rate (CER) of 2.89. The goal of this paper is to provide a summary of the research conducted on character recognition of handwritten documents in Bangla scripts and suggest future research directions.
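
The Character Error Rate quoted in this abstract is conventionally defined as the character-level edit distance between the recognized text and the reference, divided by the reference length. The sketch below shows that standard definition; the function names are illustrative, and the abstract does not state whether its figure of 2.89 is a percentage, so no unit is assumed here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate as a fraction of the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character out of four reference characters.
rate = cer("abcd", "abxd")
```

Note that Python strings are compared per Unicode code point here. For Bangla text, where a single visible character may span several code points, a real evaluation might instead segment into grapheme clusters first; that choice is left open in this sketch.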

2023
Sentiment Analysis of Restaurant Reviews from Bangladeshi Food Delivery Apps
Ehsanur Rahman Rhythm, Rajvir Ahmed Shuvo, Md Sabbir Hossain, Md. Farhadul Islam and Annajiat Alim Rasel
ESCI 2023
Abstract:

In this study, we conducted sentiment analysis on restaurant reviews from Bangladeshi food delivery apps using natural language processing techniques. Food delivery apps have become increasingly popular in Bangladesh, and understanding the sentiment of customer reviews can provide valuable insights for restaurant owners and food delivery app companies. In this research, we have created a dataset named "Bangladeshi Restaurant Reviews" by gathering customer reviews of restaurants available on Foodpanda and Hungrynaki, which are two popular food delivery apps in Bangladesh. We used Robustly Optimized BERT Pretraining Approach (RoBERTa), AFINN, and DistilBERT, a distilled version of Bidirectional Encoder Representations from Transformers (BERT), to perform the sentiment analysis. Overall, this research paper highlights the importance of sentiment analysis in the food delivery industry and demonstrates the effectiveness of different models in performing this task. It also provides insights for businesses looking to use sentiment analysis to improve their services and products. The accuracies of the evaluated models, RoBERTa, AFINN, and DistilBERT, were 74%, 73%, and 77% respectively.

2023
Distributed Computing for Big Data Analytics: Challenges and Opportunities
Ehsanur Rahman Rhythm, Rajvir Ahmed Shuvo, Md Humaion Kabir Mehedi, Md Sabbir Hossain and Annajiat Alim Rasel
ResearchGate 2023
Abstract:

This paper explores the application of distributed computing systems for the processing and analysis of large data sets, also referred to as big data. The paper outlines the various challenges that can arise when working with big data, including issues with data storage, data processing, and data management. The report also explores the opportunities distributed computing systems present for overcoming these challenges and enabling efficient and effective big data analytics. Overall, the paper provides a detailed overview of distributed computing for big data analytics and offers insights into the potential benefits and drawbacks of using these systems for big data analysis.

2022
Bengali Speech Recognition: An Overview
Mashuk Arefin Pranjol, Farhin Rahman, Ehsanur Rahman Rhythm, Rajvir Ahmed Shuvo, Tanjib Ahmed, Bushra Yesmeen Anika, Md. Abdullah Al Masum Anas, Jahidul Hasan, Saiadul Arfain, Shadab Iqbal, Md Humaion Kabir Mehedi, Annajiat Alim Rasel
IICAIET 2022
Abstract:

This study outlines the notable efforts to create automatic speech recognition (ASR) systems for Bengali. It describes data from the Bengali language's existing voice corpora and the major reports that have contributed to the recent research scenario. It provides an overview of the datasets or corpora that have been created for Bengali ASR, the challenges faced in creating Bengali ASR, and the techniques used to build Bengali ASR systems. ASR techniques for the Bengali language have made significant progress in recent years. Our article covers studies from 2016 through 2020. We examined the results of these investigations, as well as the strategies used to accomplish this goal, for automated voice recognition. We have examined these publications to obtain a feel for the present state of Bengali ASR. We have observed a dearth of sufficient datasets among these studies, which are important for any automated system. Due to the language's abundance of consonant clusters, the Machine Learning (ML) system has difficulty interpreting Bengali words. As a result of these modifications, the system now confronts a new set of difficulties in terms of effectiveness and efficiency. Additionally, numerous words have nearly identical pronunciations. These are only some of the issues that the papers we examined face. This research makes use of a variety of techniques, including linear prediction coding, Mel Frequency Cepstral Coefficient, Hidden Markov Model, Neural Network, and Fuzzy logic. Bengali ASR will require further investigation shortly. While recent research is encouraging, ASR of other languages, such as English, is far from perfect and efficient.



* denotes that both authors contributed equally to the research.