osre24

Reproducing and addressing Data Leakage issue: Duplicates in dataset

Hello! In this blog post, I will explore a common issue in machine learning called data leakage, using an example from the paper: Benedetti, P., Perri, D., Simonetti, M., Gervasi, O.

Last updated on Aug 24, 2024 SummerofReproducibility24

Final blog: Automatic reproducibility of COMPSs experiments through the integration of RO-Crate in Chameleon

The project aims to develop a service that facilitates the automated replication of COMPSs experiments within the Chameleon infrastructure

Archit Dabral, Raül Sirvent

Last updated on Aug 24, 2024 SoR

Final Blogpost: HDEval's LLM Benchmarking for HDL Design

Introduction Hello everyone! I’m Ashwin Bardhwaj, an undergraduate student studying at UC Berkeley. As part of Micro Architecture Santa Cruz (MASC), my proposal, under the mentorship of Jose Renau and Sakshi Garg, looks to create a suite of benchmark programs for HDEval.

Ashwin Bardhwaj

Last updated on Aug 24, 2024

Deriving Realistic Performance Benchmarks for Python Interpreters

Hi, I am Mrigank. I am one of the Summer of Reproducibility fellows for 2024, and I will be working on deriving realistic performance benchmarks for Python interpreters with Ben Greenman from the University of Utah.

Last updated on Aug 19, 2024

Final Blog: FEP-Bench: Benchmarking for Enhanced Feature Engineering and Preprocessing in Machine Learning

Background Hello, I’m Lihaowen (Jayce) Zhu, a 2024 SoR contributor for the FEP-bench project, under the mentorship of Yuyang (Roy) Huang. Before we start, let’s recap the goal of our project and our progress up to the midterm.

Lihaowen (Jayce) Zhu

Last updated on Aug 19, 2024

Final Blog: FSA - Benchmarking Fail-Slow Algorithms

Introduction Hello! I hope you’re enjoying the summer as much as I am. I’m excited to join the SOR community as a 2024 contributor. My name is Xikang Song, and I’m thrilled to collaborate with mentors Ruidan Li and Kexin Pei on the FSA-Benchmark project.

Kexin Pei, Ruidan Li, Xikang Song

Last updated on Aug 18, 2024

Data Leakage in Applied ML

Hello everyone! I have been working on reproducing the results from Characterization of Term and Preterm Deliveries using Electrohysterograms Signatures. This paper aims to predict preterm birth using a Support Vector Machine with an RBF kernel.

Last updated on Aug 19, 2024 SoR

Midterm Report: Halfway through medicinal data visualization using PolyPhy/Polyglot

Introduction Hello! My name is Ayush Sharma, a machine learning engineer and researcher based out of Chandigarh, a beautiful city in Northern India known for its modern architecture and green spaces.

Last updated on Aug 14, 2024 GSoC'24, natural language processing

Midterm Check-In: Progress on the AutoAppendix Project

Hi all, I’m happy to share a quick update on the AutoAppendix project as we’re about halfway through. We’ve made some steady progress on evaluating artifacts from SC24 papers, and we’re starting to think about how we can use what we’ve learned to improve the artifact evaluation process in the future.

Klaus Kraßnitzer

Last updated on Sep 9, 2024 SoR

[MidTerm] ScaleRep: Reproducing and benchmarking scalability bugs hiding in cloud systems

Hey there, scalability enthusiasts and fellow researchers! I’m excited to share my progress on the ScaleRep project for SoR 2024 under the mentorship of Bogdan "Bo" Stoica and Yang Wang. Here’s a glimpse into how we’re tackling scalability bugs in large-scale distributed systems.

Zahra Nabila Maharani

Last updated on Aug 6, 2024 SummerofReproducibility24