Commit 792dfe6

added more sections
1 parent 86af695 commit 792dfe6

2 files changed, 192 additions and 41 deletions

index.html

Lines changed: 68 additions & 5 deletions
@@ -106,13 +106,16 @@ <h2 id="contributions">Key Contributions</h2>
106106

107107
<section>
108108
<h2 id="project-goal">Problem Statement</h2>
109-
<p>We're tackling the challenge of refactoring code to make it more organized and efficient. Imagine you have several pieces of code that do similar things. Our goal is to create a unified library that captures the common patterns, significantly reducing the total code size while still supporting all original functionalities. This not only makes the code base smaller but also helps uncover new ways to use the code.</p>
109+
<p>We study the problem of refactoring code for better organization and efficiency. Given multiple codebases with similar functionalities, our goal is to create a unified library that captures common patterns.
110+
This process should significantly reduce the total amount of code while ensuring all original functionality remains intact.
111+
112+
</p>
113+
114+
<p>We evaluate refactorings based on two key principles:
110115

111-
<h3>How We Do It</h3>
112-
<p>Our approach focuses on finding refactorings that are both correct and simple.</p>
113116
<ul>
114-
<li><strong>Correctness</strong> is straightforward: Does the refactored code pass all the original tests?</li>
115-
<li><strong>Simplicity</strong> is more nuanced. We don't just count characters; we define simplicity using <strong>Minimum Description Length (MDL)</strong>. This means we're looking for code that is not only short but also natural, elegant, and extensible—like finding the most concise yet understandable way to express an idea, rather than just the shortest, potentially unreadable, version (think "Perl Golf" where the shortest code is often incomprehensible!).</li>
117+
<li><strong><span class="highlight highlight-green">Correctness</span> is straightforward</strong>: Does the refactored code pass all the original tests?</li>
118+
<li><strong><span class="highlight highlight-blue">Simplicity</span> is more nuanced</strong>: We don't just count characters; we define simplicity using <strong class="highlight highlight-orange">Minimum Description Length (MDL)</strong>. This means we're looking for code that is not only short but also natural, elegant, and extensible—like finding the most concise yet understandable way to express an idea, rather than just the shortest, potentially unreadable, version (think "Perl Golf" where the shortest code is often incomprehensible!).</li>
116119
</ul>
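
Review note: the two criteria in the list above can be read as a simple acceptance rule. The sketch below is an illustration only, written in Python because the page describes Python repositories. The names `token_count` and `prefer_refactoring` are hypothetical, and counting tokens is an assumed placeholder for description length, not the MDL measure the page refers to.

```python
import io
import tokenize

# Token types that carry no content for this rough size proxy.
_SKIP = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def token_count(source: str) -> int:
    """Count lexical tokens in Python source (assumed stand-in for description length)."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(1 for tok in tokens if tok.type not in _SKIP)


def prefer_refactoring(original: list[str], refactored: list[str], tests_pass: bool) -> bool:
    """Accept a refactoring only if it is correct and smaller in total."""
    if not tests_pass:  # correctness: all original tests must still pass
        return False
    # simplicity (placeholder): total size of all files, library included
    return sum(map(token_count, refactored)) < sum(map(token_count, original))


# Two programs that duplicate a helper, and a refactoring that shares it.
original = [
    "def sq(x):\n    return x * x\nprint(sq(2))\n",
    "def sq(x):\n    return x * x\nprint(sq(3))\n",
]
refactored = [
    "def sq(x):\n    return x * x\n",              # shared library
    "from library import sq\nprint(sq(2))\n",
    "from library import sq\nprint(sq(3))\n",
]
accepted = prefer_refactoring(original, refactored, tests_pass=True)
```

The size of the shared library is counted together with the programs that import it, so moving code around without removing duplication does not look like an improvement. Raw token counting is exactly the kind of size-only measure the text says is not enough on its own, so a real scorer would swap in a different description-length function.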
117120

118121

@@ -157,6 +160,66 @@ <h3>How It Works:</h3>
157160
</ul>
158161
</section>
159162

163+
<section>
164+
<h2>The MINICODE Benchmark</h2>
165+
<p>
166+
MINICODE evaluates a <strong class="highlight highlight-blue">code agent's</strong> capability to identify abstractions across multiple implementations and design reusable <strong class="highlight highlight-orange">libraries</strong>. Agents are presented with a collection of code sources and are tasked with refactoring them into a unified library. Key desiderata for these collections are that they must be <strong class="highlight highlight-blue">compressible</strong>, containing a latent shared abstraction, and <strong class="highlight highlight-blue">verifiable</strong>, allowing functional correctness to be measured. Agents interact with the benchmark via the terminal, managing multi-package Python repositories.
167+
</p>
168+
169+
<h3>CodeContests Domain</h3>
170+
<p>
171+
Sourced from the CodeContests dataset, this domain uses competitive programming problems which naturally contain shared concepts and test cases. Each collection provides multiple solutions, and the agent's task is to create a central <code>library.py</code> file that is imported by each refactored solution.
172+
</p>
173+
174+
<h3>Repositories Domain</h3>
175+
<p>
176+
This domain features synthesized projects with controlled complexity and overlap. Using a generative process, we create collections of repositories tailored to specific use cases. Agents must extract reusable functions from across these repositories and rewrite the original source code to use a new, shared <code>common</code> subpackage.
177+
</p>
178+
179+
<figure class="table-figure">
180+
<table class="table-styled">
181+
<thead>
182+
<tr>
183+
<th><strong>Domain</strong></th>
184+
<th><strong>Sources</strong></th>
185+
<th><strong>Collections</strong></th>
186+
<th><strong>Avg LoC</strong></th>
187+
<th><strong>Avg Tests</strong></th>
188+
<th><strong>Gen by</strong></th>
189+
</tr>
190+
</thead>
191+
<tbody>
192+
<tr>
193+
<td>CodeContests</td>
194+
<td>300</td>
195+
<td>30</td>
196+
<td>87</td>
197+
<td>10</td>
198+
<td>Humans</td>
199+
</tr>
200+
<tr>
201+
<td>Small Repositories</td>
202+
<td>262</td>
203+
<td>22</td>
204+
<td>209</td>
205+
<td>12</td>
206+
<td>o4-mini</td>
207+
</tr>
208+
<tr>
209+
<td>Large Repositories</td>
210+
<td>20</td>
211+
<td>10</td>
212+
<td>6,433</td>
213+
<td>101</td>
214+
<td>Claude-Sonnet 3.7</td>
215+
</tr>
216+
</tbody>
217+
</table>
218+
<figcaption style="text-align: center;">Table 1: MINICODE Statistics</figcaption>
219+
</figure>
220+
221+
</section>
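
Review note: the CodeContests layout described in this section (one `library.py` imported by every refactored solution, checked against test cases) could look roughly like the sketch below. Everything in it is assumed for illustration: the helper `read_ints`, the two toy solutions, their test cases, and the runner `run_collection` are hypothetical and do not come from the benchmark.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical shared library extracted from the solutions.
LIBRARY = '''
def read_ints():
    """Read one line of whitespace-separated integers from stdin."""
    return list(map(int, input().split()))
'''

# Hypothetical refactored solutions; each imports the shared library.
SOLUTIONS = {
    "sum_problem.py": "from library import read_ints\nprint(sum(read_ints()))\n",
    "max_problem.py": "from library import read_ints\nprint(max(read_ints()))\n",
}

# Hypothetical (stdin, expected stdout) pairs per solution.
TESTS = {
    "sum_problem.py": [("1 2 3\n", "6")],
    "max_problem.py": [("1 2 3\n", "3")],
}


def run_collection() -> dict[str, bool]:
    """Write the collection to a temp directory and run every test case."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "library.py").write_text(LIBRARY)
        for name, code in SOLUTIONS.items():
            (root / name).write_text(code)
        for name, cases in TESTS.items():
            ok = True
            for stdin_text, expected in cases:
                proc = subprocess.run(
                    [sys.executable, name], cwd=root, input=stdin_text,
                    capture_output=True, text=True, timeout=10,
                )
                ok = ok and proc.returncode == 0 and proc.stdout.strip() == expected
            results[name] = ok
    return results
```

Calling `run_collection()` returns a pass/fail flag per solution, which is the "verifiable" half of the desiderata. The "compressible" half corresponds to `read_ints` living once in `library.py` instead of once per solution.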
222+
160223
<!-- <section>
161224
<h2 id="video">Video</h2>
162225
<iframe height="528" src="#" title="Supplemental video" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

style.css

Lines changed: 124 additions & 36 deletions
@@ -42,11 +42,15 @@ h1 {
4242
}
4343

4444
h2 {
45-
font-size: 1.5em;
45+
font-size: 2em;
4646
text-align: center;
47-
margin: 2em 0 0.75em 0;
47+
margin: 2.5em 0 1em 0;
48+
padding-bottom: 0.5em;
49+
border-bottom: 1px solid var(--nord4);
4850
}
4951

52+
/* Problem Statement uses normal h2 styling */
53+
5054
/* Links */
5155
a {
5256
color: var(--nord10);
@@ -201,8 +205,10 @@ nav > ul > li {
201205
position: relative;
202206
background-color: var(--nord5);
203207
border: 1px solid var(--nord4);
204-
border-radius: 4px;
205-
padding: 0.5rem 0.75rem;
208+
border-radius: 8px;
209+
padding: 1rem 1.25rem;
210+
border-left: 5px solid var(--nord10);
211+
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.05);
206212
}
207213

208214
#citation pre {
@@ -217,8 +223,8 @@ nav > ul > li {
217223

218224
#citation code {
219225
font-family: 'Noto Mono', monospace;
220-
font-size: 0.7em;
221-
line-height: 1.0;
226+
font-size: 0.85em;
227+
line-height: 1.4;
222228
color: var(--nord1);
223229
background: none;
224230
padding: 0;
@@ -278,7 +284,7 @@ footer {
278284
}
279285

280286
#citation code {
281-
font-size: 0.8em;
287+
font-size: 0.85em;
282288
}
283289

284290
.author-block {
@@ -355,6 +361,27 @@ footer {
355361
color: var(--nord1);
356362
}
357363

364+
.highlight {
365+
font-weight: 600;
366+
padding: 0.1em 0.3em;
367+
border-radius: 4px;
368+
}
369+
370+
.highlight-blue {
371+
color: var(--nord9);
372+
background-color: rgba(129, 161, 200, 0.15);
373+
}
374+
375+
.highlight-green {
376+
color: #5A7C5A;
377+
background-color: rgba(163, 190, 140, 0.25);
378+
}
379+
380+
.highlight-orange {
381+
color: var(--nord12);
382+
background-color: rgba(208, 135, 112, 0.15);
383+
}
384+
358385
.contribution-box strong {
359386
font-weight: 700;
360387
padding: 0.1em 0.3em;
@@ -371,50 +398,101 @@ footer {
371398
background-color: rgba(208, 135, 112, 0.15);
372399
}
373400

374-
/* Project Goal section */
375-
#project-goal {
376-
margin-top: 2em;
401+
/* Math equation styling */
402+
.math-display {
403+
text-align: center;
404+
margin: 1.5em 0;
405+
padding: 1.2em;
406+
border-radius: 8px;
407+
overflow-x: auto;
408+
border: 3px solid transparent;
409+
background: linear-gradient(var(--nord5), var(--nord5)) padding-box,
410+
linear-gradient(135deg, var(--nord15), var(--nord9)) border-box;
377411
}
378412

379-
#project-goal h3 {
380-
font-size: 1.3em;
381-
margin: 1.5em 0 0.8em 0;
382-
color: var(--nord0);
383-
font-weight: 600;
413+
.math-display .MathJax {
414+
font-size: 1.2em;
384415
}
385416

386-
#project-goal ul {
387-
margin: 1em 0;
388-
padding-left: 2em;
417+
/* Enhanced table styling with Nord theme */
418+
figure.table-figure {
419+
display: flex;
420+
flex-direction: column;
421+
margin: 2em 0;
389422
}
390423

391-
#project-goal li {
392-
margin: 0.5em 0;
393-
line-height: 1.6;
424+
figure.table-figure .table-styled {
425+
order: 2;
426+
margin: 0;
394427
}
395428

396-
#project-goal strong {
397-
color: var(--nord10);
398-
font-weight: 600;
429+
figure.table-figure figcaption {
430+
order: 1;
431+
margin-bottom: 0.8em;
399432
}
400433

401-
/* Math equation styling */
402-
.math-display {
403-
text-align: center;
404-
margin: 1.5em 0;
405-
padding: 1em;
434+
.table-styled {
435+
width: 100%;
436+
border-collapse: collapse;
437+
font-family: 'Noto Serif', Georgia, serif;
438+
border-top: 2px solid var(--nord1);
439+
border-bottom: 2px solid var(--nord1);
440+
}
441+
442+
.table-styled th,
443+
.table-styled td {
444+
padding: 0.8em 1em;
445+
text-align: left;
446+
border-bottom: 1px solid var(--nord4);
447+
}
448+
449+
.table-styled th {
450+
background-color: transparent;
451+
color: var(--nord0);
452+
font-weight: 700;
453+
font-size: 1em;
454+
border-bottom: 2px solid var(--nord3);
455+
}
456+
457+
.table-styled tbody tr {
458+
background: transparent;
459+
}
460+
461+
.table-styled tbody tr:nth-child(even) {
462+
background-color: transparent;
463+
}
464+
465+
.table-styled tbody tr:hover {
406466
background-color: var(--nord5);
407-
border-radius: 6px;
408-
border: 1px solid var(--nord4);
409-
overflow-x: auto;
410467
}
411468

412-
.math-display .MathJax {
413-
font-size: 1.1em;
469+
.table-styled td {
470+
color: var(--nord1);
471+
font-size: 0.95em;
472+
line-height: 1.5;
473+
}
474+
475+
.table-styled td strong {
476+
color: var(--nord0);
477+
font-weight: 700;
478+
}
479+
480+
/* Enhanced table caption */
481+
figcaption {
482+
text-align: center;
483+
margin: 0.5em 0 0 0;
484+
font-size: 0.9em;
485+
color: var(--nord3);
486+
font-style: italic;
414487
}
415488

416489
/* Responsive adjustments for project goal section */
417490
@media screen and (max-width: 768px) {
491+
#project-goal {
492+
padding: 1.2em 1em;
493+
margin-top: 1.5em;
494+
}
495+
418496
#project-goal h3 {
419497
font-size: 1.2em;
420498
}
@@ -424,7 +502,17 @@ footer {
424502
}
425503

426504
.math-display {
427-
padding: 0.75em;
428-
margin: 1em 0;
505+
padding: 0.5em;
506+
margin: 1.2em 0;
507+
}
508+
509+
.table-styled th,
510+
.table-styled td {
511+
padding: 0.6em 0.8em;
429512
}
513+
}
514+
515+
#project-goal ul li {
516+
margin-bottom: 2em !important;
517+
padding-bottom: 0.5em;
430518
}
