Parallel Simulation of Coarse-Grained Reconfigurable Array using Structural Simulation Toolkit
Jan 2024 - April 2024
EECS 570 Semester Project
Coarse-grained Reconfigurable Arrays (CGRAs) have developed over the years to achieve greater compute
efficiency while ensuring flexibility to handle multiple workloads. Simulators are an effective tool for
both hardware architects and compiler designers in developing these domain-specific accelerators without
running actual register-transfer level (RTL) simulations. However, current simulators are often tied to a
fixed hardware architecture and face scalability issues as hardware complexity increases. For CGRAs with high
reconfigurability, these simulators are often unsatisfactory. In this work, we implement our proposed digital
CGRA architecture using the Structural Simulation Toolkit (SST). On a single core, our SST simulation of a PE-array
achieves a 5x speedup over RTL simulation across different simulation times.
Parallel execution of our SST simulation on multiple cores reaches a speedup of up to 42x.
The cycle-accurate SST simulation is highly scalable: it handles different
configurations of the PE-array without compromising the speedup. Thus, this work serves as a step towards
building simulators and even generators for larger CGRA architectures, enabling better design space exploration and compiler design.
Memory Access Pattern Recognition Using Search Algorithms
Sept 2023 - Dec 2023
EECS 592 Semester Project
Memory-intensive applications experience system
performance issues mainly due to memory access latency. One of
the ways to mitigate this issue is to prefetch data ahead of time
to reduce the processor’s wait time. Unfortunately, this is not
possible if one cannot foresee what data is required in the near
future. We solve this problem by implementing search algorithms
to find the memory access pattern being followed by programs
to help inform memory prefetching. In particular, we implement
hill climbing, simulated annealing, and a genetic
algorithm to search for the access pattern in a program’s memory
address trace. We finally evaluate these search algorithms on
real-world applications to assess their efficiency and accuracy.
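
As a rough illustration of the idea, the sketch below uses hill climbing to recover a constant stride from an address trace. The project's actual pattern representation and objective are not described here, so everything in this example is an assumption made for illustration: the objective (total absolute error of predicting the next address as current + stride), the neighbor move (plus or minus a step that halves when no neighbor improves), and the names `prediction_error` and `hill_climb_stride`.

```python
def prediction_error(trace, stride):
    """Total absolute error when each next address is predicted as current + stride."""
    return sum(abs(cur + stride - nxt) for cur, nxt in zip(trace, trace[1:]))


def hill_climb_stride(trace, start=0, step=64, max_iters=1000):
    """Greedy local search over integer strides (hypothetical helper).

    Moves to the better of the two neighbors (best - step, best + step)
    whenever it lowers the error; otherwise halves the step. Stops when
    the step reaches zero or the iteration budget runs out.
    """
    best = start
    best_err = prediction_error(trace, best)
    for _ in range(max_iters):
        if step == 0:
            break
        err, cand = min(
            (prediction_error(trace, c), c) for c in (best - step, best + step)
        )
        if err < best_err:
            best, best_err = cand, err
        else:
            step //= 2
    return best, best_err


# Synthetic trace: a 64-byte stride with one stray access in the middle.
trace = [4096 + 64 * i for i in range(20)]
trace[10] = 9000
stride, err = hill_climb_stride(trace)
print(stride, err)
```

Because this objective is a sum of absolute values, it has a single basin, so plain hill climbing is enough for the toy case. Traces that mix several patterns would produce local minima, which is where simulated annealing or a genetic algorithm becomes relevant.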
Github Repo Link
ACRE: Explainable Random Forest Process in Memory Accelerator
Sept 2022 - Dec 2022
EECS 583 Semester Project
Explainable Artificial Intelligence (XAI) has been developed to provide explanations
for results obtained from AI models. There is currently a need to improve the performance of XAI models.
A proposed solution is to parallelize parts of the XAI algorithm, specifically explainable Random Forest,
and accelerate it using Process-in-Memory. Bus traffic can be reduced by placing a node comparator
in memory, so that simple comparisons are performed where the data resides.
BeXAI: Benchmark Suite for Explainable AI
June 2021 - Nov 2021
Gathered and curated different explainable AI models that range across both input data
domains (images, tabular data, and text) and types of models (Decision Tree, Random Forest, NN).
Developed metrics to quantitatively measure the performance of explainable AI models.
Tested the performance of SHAP and LIME explanation methods in terms of how faithful the
interpretations are to the predictions made by different target models across image,
text, and tabular datasets. Compiled and organized interpretable ML algorithms with metrics to
evaluate them.
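
To make "faithfulness" concrete, the sketch below shows one common deletion-style check: mask the features an explanation ranks highest and measure how much the model's output changes. This is not necessarily one of the metrics defined in BeXAI. The function `deletion_drop`, the toy linear model, and the zero baseline are all assumptions chosen for illustration.

```python
def deletion_drop(model, x, attributions, k, baseline=0.0):
    """Absolute change in model output after masking the k most-attributed features.

    Features are ranked by the magnitude of their attribution score and the
    top k are replaced with `baseline` (hypothetical choice: 0.0).
    """
    ranked = sorted(range(len(x)), key=lambda i: abs(attributions[i]), reverse=True)
    masked = list(x)
    for i in ranked[:k]:
        masked[i] = baseline
    return abs(model(x) - model(masked))


def toy_model(x):
    # Hypothetical linear model, used only to exercise the metric.
    weights = [3.0, 0.1, 2.0, 0.0]
    return sum(w * v for w, v in zip(weights, x))


x = [1.0, 1.0, 1.0, 1.0]
faithful_attr = [3.0, 0.1, 2.0, 0.0]    # agrees with the model's weights
unfaithful_attr = [0.0, 0.1, 0.0, 3.0]  # ranks irrelevant features first

print(deletion_drop(toy_model, x, faithful_attr, k=2))
print(deletion_drop(toy_model, x, unfaithful_attr, k=2))
```

A faithful explanation should yield a larger drop than an unfaithful one for the same k, which is what the two calls above compare.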
Presentation Link
Github Repo Link
Graph Application Acceleration through Optimal Vertex Placement
June 2021 - Sept 2021
Investigated preprocessing techniques that improve data locality. Identified the major bottleneck of
graph applications that operate on power-law graphs. Tested different graph
reordering techniques to improve execution speed. Measured the
overall speedup obtained when running selected graph applications on preprocessed graphs, on both the
baseline platform and the OMEGA graph application accelerator.
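
The description above does not name the reordering techniques that were tested, so the sketch below shows just one well-known lightweight option as an assumption: relabeling vertices in descending order of degree. In a power-law graph this clusters the few highly connected vertices at the smallest IDs. The function name `degree_sort_reorder` and the edge-list representation are hypothetical.

```python
from collections import defaultdict


def degree_sort_reorder(edges):
    """Relabel vertices so that higher-degree vertices receive smaller IDs.

    `edges` is a list of (u, v) pairs of an undirected graph. Returns the
    relabeled edge list and the old-ID -> new-ID mapping.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Descending degree; ties broken by original ID to keep the result deterministic.
    order = sorted(degree, key=lambda v: (-degree[v], v))
    new_id = {old: new for new, old in enumerate(order)}
    return [(new_id[u], new_id[v]) for u, v in edges], new_id


# Small star-like graph: vertex 5 is the hub.
edges = [(0, 5), (1, 5), (2, 5), (3, 5), (3, 4)]
reordered, mapping = degree_sort_reorder(edges)
print(mapping)
print(reordered)
```

After relabeling, per-vertex data for the hub vertices sits in adjacent array slots, which is the locality effect this kind of preprocessing targets.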
Presentation Link
Github Repo Link
Ethiopian Sign Language to Speech Conversion
Sept 2019 - Dec 2020
Aimed to bridge the communication barrier between hearing- and speech-impaired
people and the rest of the community, specifically for Ethiopian sign language
users in my country, Ethiopia. Designed and built a pair of gloves equipped with
flex sensors, gyroscopes, accelerometers, Bluetooth modules, and an Arduino microcontroller
to capture gestures. Gathered and processed gesture datasets in the form of
electrical signals and trained two ML classifiers for sign classification:
Random Forest and SVM. Finally, exported the gesture classification models and
used them in a mobile application built using Kotlin for real-time translation to speech.
Github Repo Link