Parallel Simulation of Coarse-Grained Reconfigurable Array using Structural Simulation Toolkit

Jan 2024 - April 2024

EECS 570 Semester Project

Coarse-grained Reconfigurable Arrays (CGRAs) have developed over the years to achieve greater compute efficiency while remaining flexible enough to handle multiple workloads. Simulators are an effective tool for both hardware architects and compiler designers, who can use them to develop these domain-specific accelerators without running actual register-transfer level (RTL) simulations. However, current simulators are often tied to a fixed hardware architecture and face scalability issues as hardware complexity increases, which makes them a poor fit for highly reconfigurable CGRAs. In this work, we implement our proposed digital CGRA architecture using the Structural Simulation Toolkit (SST). On a single core, our SST simulation of a PE-array achieves a 5x speedup over RTL simulation across different simulation times. Parallel execution of the SST simulation on multiple cores reaches a speedup of up to 42x. The cycle-accurate SST implementation is highly scalable: it handles different configurations of the PE-array without compromising the speedup. This work is therefore a step towards building simulators, and even generators, for larger CGRA architectures to support design space exploration and compiler design.

Memory Access Pattern Recognition Using Search Algorithms

Sept 2023 - Dec 2023

EECS 592 Semester Project

Memory-intensive applications suffer performance loss mainly because of memory access latency. One way to mitigate this is to prefetch data ahead of time, reducing the time the processor spends waiting. Unfortunately, prefetching is not possible if one cannot foresee what data will be required in the near future. We address this problem by using search algorithms to find the memory access patterns that programs follow, which can then inform memory prefetching. In particular, we implement hill climbing, simulated annealing, and a genetic algorithm to search for the access pattern in a program’s memory address trace. Finally, we evaluate these search algorithms on real-world applications to assess their efficiency and accuracy.
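To illustrate the idea, here is a minimal hill-climbing sketch. It is not the project's actual implementation. It assumes the access pattern is a single constant stride and that a candidate is scored by the total absolute error of predicting each next address as the current address plus the stride. The function names, the neighbor step sizes, and the synthetic trace are all hypothetical.

```python
def prediction_error(trace, stride):
    """Total absolute error when predicting next = current + stride."""
    return sum(abs((b - a) - stride) for a, b in zip(trace, trace[1:]))


def hill_climb_stride(trace, start=0, steps=(1, 8, 64), max_iters=10_000):
    """Greedy hill climbing over integer strides (assumed search space).

    At each iteration the best neighbor (current stride +/- each step size)
    is chosen; the search stops when no neighbor lowers the error.
    """
    current = start
    current_err = prediction_error(trace, current)
    for _ in range(max_iters):
        neighbors = [current + sign * s for s in steps for sign in (1, -1)]
        best = min(neighbors, key=lambda s: prediction_error(trace, s))
        best_err = prediction_error(trace, best)
        if best_err >= current_err:
            break  # no neighbor improves on the current stride
        current, current_err = best, best_err
    return current


# Synthetic trace (assumption): mostly 64-byte strides with two irregular jumps.
deltas = [64] * 20 + [4096] + [64] * 20 + [-512] + [64] * 20
trace = [0x1000]
for d in deltas:
    trace.append(trace[-1] + d)

found_stride = hill_climb_stride(trace)
```

Because this error function is convex in the stride, plain hill climbing is enough here. Traces with several interleaved patterns create local optima, which is where simulated annealing or a genetic algorithm becomes relevant.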

Github Repo Link

ACRE: Explainable Random Forest Process in Memory Accelerator

Sept 2022 - Dec 2022

EECS 583 Semester Project

Explainable Artificial Intelligence (XAI) has been developed to provide explanations for results obtained from AI models, and there is currently a need to improve the performance of XAI models. Our proposed solution is to parallelize parts of the XAI algorithm, specifically explainable Random Forest, and accelerate it using Process-in-Memory. Placing a node comparator in the memory to handle simple comparisons reduces the large bus traffic.

BeXAI: Benchmark Suite for Explainable AI

June 2021 - Nov 2021

Gathered and curated explainable AI models spanning both input data domains (images, tabular data, and text) and model types (Decision Tree, Random Forest, NN). Developed metrics to quantitatively measure the performance of explainable AI models. Tested the SHAP and LIME explanation methods for how faithful their interpretations are to the predictions made by different target models across image, text, and tabular datasets. Compiled and organized interpretable ML algorithms together with the metrics used to evaluate them.

Presentation Link

Github Repo Link

Graph Application Acceleration through Optimal Vertex Placement

June 2021 - Sept 2021

Investigated preprocessing techniques that improve memory locality. Identified the major bottleneck of graph applications that operate on graphs with a power-law degree distribution. Tested different graph reordering techniques to improve execution speed. Measured the overall speedup of selected graph applications when using preprocessed graphs on both the baseline platform and the OMEGA graph application accelerator.
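As a rough illustration of vertex reordering, the sketch below relabels vertices in descending order of degree, so that the few highly connected vertices of a power-law graph receive adjacent IDs. Degree sorting is used here only as a well-known example; the source does not say which reordering techniques were tested. The function name and the sample edge list are hypothetical.

```python
from collections import defaultdict


def degree_sort_reorder(edges):
    """Relabel vertices so that higher-degree vertices get smaller IDs.

    Returns the old-to-new ID mapping and the relabeled edge list.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Highest degree first; ties are broken by original ID for determinism.
    order = sorted(degree, key=lambda v: (-degree[v], v))
    new_id = {old: new for new, old in enumerate(order)}
    relabeled = [(new_id[u], new_id[v]) for u, v in edges]
    return new_id, relabeled


# Small example graph (assumption): vertex 3 is the hub.
edges = [(0, 3), (1, 3), (2, 3), (3, 4), (4, 1)]
new_id, relabeled = degree_sort_reorder(edges)
```

After relabeling, per-vertex data for the most frequently accessed vertices sits close together in memory, which is the locality effect that preprocessing aims for.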

Presentation Link

Github Repo Link

Ethiopian Sign Language to Speech Conversion

Sept 2019 - Dec 2020

Aimed to bridge the communication barrier between hearing- and speech-impaired people and the rest of the community, specifically for users of Ethiopian sign language in my country, Ethiopia. Designed and built a pair of gloves equipped with flex sensors, gyroscopes, accelerometers, Bluetooth modules, and an Arduino microcontroller to capture gestures. Gathered and processed gesture datasets in the form of electrical signals and trained Random Forest and SVM classifiers for sign classification. Finally, exported the gesture classification models and used them in a mobile application built with Kotlin for real-time translation to speech.

Github Repo Link