Implementation and extension of the approximate K‑NN graph construction challenge from SIGMOD 2023’s Programming Contest, developed across three phases with full test coverage:
- Core Vamana algorithm implementation (Subramanya et al., 2019)
- Filtered extensions for label-constrained search (Gollapudi et al., 2023)
- Performance optimizations via parallelization and memory-efficient indexing
IMPORTANT: All commands should be executed from the directory containing the Makefile.
-
make: Compiles the executables in the./bin/directory. -
make test: Combines all tests into a single executable located at./bin/test. -
make groundtruth: Builds thecalculate_groundtruthexecutable in./bin/and generates groundtruth fork = 100. -
make graphs: Buildsfiltered_main_graphandstitched_main_graphin./bin/and generates graphs and maps with parameters:L = 100R = 96a = 1.2t = 55(filtered)R_stitched = 98(stitched)
-
make valgrind: Runs./bin/testusingvalgrindfor memory analysis. -
make clean: Deletes the following directories:./bin/(executables)./build/(object files)
Note: This does not delete
graphs,maps, andgroundtruth (gt)files located in./datasets/smallscale/.
Calculates the groundtruth for a given k and saves the output as gt_k=x.bin (e.g., gt_k=100.bin) in ./datasets/smallscale/.
Usage:
./bin/calculate_groundtruth <k>Example:
./bin/calculate_groundtruth 100Generates the graph and map using the FilteredVamana algorithm without query processing.
Usage:
./bin/filtered_main_graph <L> <R> <a> <t> <base_file_path>Example:
./bin/filtered_main_graph 110 96 1.2 55 ./datasets/smallscale/dummy-data.binGenerates the graph and map using the StitchedVamana algorithm without query processing.
Usage:
./bin/stitched_main_graph <L> <R> <a> <t> <base_file_path>Example:
./bin/stitched_main_graph 110 96 1.2 98 ./datasets/smallscale/dummy-data.binExecutes the FilteredVamana algorithm. It can be run in three ways:
-
Command Line Inputs:
./bin/filtered_main <k> <L> <R> <a> <t> <base_file_path> <queries_file_path> <groundtruth_file_path>
Example:
./bin/filtered_main 100 110 96 1.2 55 ./datasets/smallscale/dummy-data.bin ./datasets/smallscale/dummy-queries.bin ./datasets/smallscale/gt_k=100.bin
-
Using Pre-generated Graph:
./bin/filtered_main <k> <graph_filename> <map_filename> <queries_file_path> <groundtruth_file_path>
Example:
./bin/filtered_main 100 filtered_graph_L=110_a=1.200000_R=96_taph=55 filtered_map_L=110_a=1.200000_R=96_taph=55 ./datasets/smallscale/dummy-queries.bin ./datasets/smallscale/gt_k=100.bin
-
Using Configuration File:
./bin/filtered_main <input_file>
Example config file:
k = 100 L = 110 R = 96 a = 1.2 t = 55 base_file = ./datasets/smallscale/dummy-data.bin queries_file = ./datasets/smallscale/dummy-queries.bin groundtruth_file = ./datasets/smallscale/gt_k=100.bin
Example execution:
./bin/filtered_main ./filtered_config.txt
Executes the StitchedVamana algorithm with similar execution modes as filtered_main.
A bash script testing.sh is provided to automate testing and print results to results.txt.
Usage:
chmod +x ./testing.sh
./testing.shThis project is shared for educational purposes. No license is granted for commercial or non-educational use without explicit permission.