merge: [feat/#51-add-license] -> [dev] #52
Merged
sekulas added a commit that referenced this pull request on Jan 28, 2025
* merge: [feat/#7-wal-ops] -> [dev] (#7)
* feat: Files names as consts
* feat: wal-ops introduced
* test: new tests + file guard
* style: naming for update_header changed
* refactor: header deserialization + flushing header
* refactor: flushing header improvements
* refactor: changed error handling
* merge: [feat/#10-db-ops] -> [dev]
* feat: db load method
* style: error naming changed
* feat: wal create
* refactor: file_ops moved to database
* feat: Create Collection command implementation
* feat: wal_to_txt util
* feat: CREATE collection wal logging
* feat: commands don't change wal
* feat: consistency_fix template to wal added
* refactor: PathBuf -> Path changes
* refactor: WALEntry struct and methods
* refactor: separated commands and command builder
* feat: WAL::load returns WALType
* feat: redo uncommited command
* refactor: file structure changed
* feat: validating target path of the db
* chore: fixing warnings
* style: lsn -> current_max_lsn
* refactor: single responsibility for 'wal' and 'builder'
* fix: wal_to_json multiple commands deserialization
* style: typo
* fix: database creation
* feat: create & drop collection additional checks
* feat: LISTCOLLECTION added
* feat: Command Query responsibilities separation
* chore: removed unused 'Database' struct code
* refactor: faster wal entry commit
* feat: TRUNCATEWAL
* merge: [feat/#14-unnecessary-ops-not-in-wal] -> [dev]
* merge: [feat/#15-missing-collections-checking] -> [dev]
* refactor: database module restructurized
* refactor: CreateCol & DropCol Commands moved to separate file
* feat: DbConfig used for specyfing if collection exists before writing to WAL
* feat: security checks
* refactor: removing warnings
* test: refactored tests to work
* test: utilizing tempfile for parallel wal test running
* merge: [feat/#13-crud-for-collections] -> [dev]
* feat: Collection struct for collection setup
* feat: Collection struct schema
* feat: vector inserting for storage
* feat: search for vector storage
* feat: delete vector for storage
* feat: update vector for storage
* feat: additional error types
* refactor: moved insert command to separate file
* feat: INSERT command building
* test: improved one test
* feat: insert modes
* refactor: removed vec id in storage
* refactor: vecs being handled as slices
* style: type alias for dimensions and offset
* test: added test schemas for collection
* feat: checking if vec has incorrect amount of dimensions
* fix: not serializable correctly Option in header
* refactor: changed hardcoded '0' to named vars
* refactor: restricting access to private for storage structs
* feat: StorageHeader definition on header corruption
* refactor: changed modules composition
* refactor: impl Hash for checksum calculation
* style: comments
* test: changed tests in not delivered module
* style: comments changed
* merge: [feat/#19-b+tree-indexing] -> [dev] (#21)
* refactor: collection creation uses components' functions
* fix: missing checksum update on header flushing
* feat: index template
* feat: root creation
* feat: node reading
* test: tree creation test provided
* refactor: removed unnecessary parent offsets
* feat: pager
* style: renamed pager to BTreeFile
* feat: insert for node
* fix: header hashing
* fix: file loading path incorrect
* feat: insertion to tree
* refactor: branching factor moved to tree
* fix: insertion to nodes
* test: test for full root insertion
* fix: highest_subtree_index getting
* test: 3rd lvl root creation
* fix: 3rd lvl root to new child connection
* refactor: recursive insert to tree
* style: private function moved to the bottom
* test: search tests
* feat: search
* refactor: to-root insertion generalized
* style: code moved
* refactor: insert Result->Option
* style: code moved
* test: update
* feat: update
* fix: update
* test: additional tests for insert
* fix: clearing map after each command
* fix: updating child offsets in update
* test: bulk insertion tests added
* feat: bulk insert
* feat: Index trait as interface
* test: refactor for existing tests to use index trait
* feat: index enums
* test: tests for search_all
* feat: search_all
* test: tests for updating
* fix: updating next_leaf_pointer correctly
* test: load tree tests
* test: name of create tree test changed
* feat: reading nodes with specified capacity
* feat: syncing file after node writing
* feat: writing nodes with specified capacity
* merge: [feat/#22-storage-index-integration] -> [dev] (#23)
* test: create & load tests
* fix: index loading
* feat: load collection
* test: fixed index load/create tests
* refactor: removed payload offset for storage
* feat: validate_checksum for record
* test: collection crud tests
* feat: checksum validation in storage search
* feat: crud for collections
* feat: storage in separate module
* test: batch insert for storage tests
* feat: batch insert for storage
* fix: batch_insert records have same lsn now
* test: batch insert for collection
* feat: batch insert for collection
* merge: [feat/#20-rollback-cud-operations] -> [dev] (#24)
* feat: lsn passed to the collection and index
* fix: index midification_lsn was incorrectly set up
* test: storage tests for command/query interface and lsn passing
* feat: storage uses command/query interface and lsn
* test: storage tests are now testing public methods
* refactor: storage crud methods are now private
* feat: collection uses storage command/queries
* test: improved tests readability by using PartialEq trait on Record
* test: fix in bulk_insert test for storage
* refactor: solved warning related to UnexpectedError and LSN
* test: rollback for storage tests
* feat: storage rollback
* fix: BPTreeHeader size measured using bincode
* test: index rollback tests
* feat: index rollback
* test: rollback for collection tests
* feat: rollbacks for collection
* merge: [feat/#25-cli-collection-integration] -> [dev] (#26)
* refactor: impl Command execute and rollback methods are takes now mutable self
* test: integration tests for db/col creation and insertion
* refactor: queries as mutable, and fields are now private for CQ structs
* feat: insertion command to_string implementation
* test: integration tests for search
* feat: search integrated with CLI
* test: is_wal_consistent checking in integration testing
* test: search tests for non-existent db and col
* test: improved tests by comparing to error struct directly
* test: drop collection tests
* feat: status 1 returned on error
* test: improved tests by asserting status 1
* feat: parsing_ops improvement and additional parsing
* refactor: insert command now gets parsed data
* test: UPDATE command integration tests
* feat: UPDATE command
* test: delete tests introduced
* feat: DELETE command
* test: different dimension vector during update/insert
* feat: handling different vector dimension during update/insert
* feat: parsing_ops module allows to parse multiple vectors and payloads
* test: arrange act assert added for parsing_ops tests
* test: BULKINSERT integration tests
* feat: BULKINSERT integrated with CLI
* refactor: main.rs redo_last_command simplified
* chore: todo question added
* merge: [feat/#27-reindexation] -> [dev] (#29)
* feat: storage custom creation
* feat: bptree custom creation
* feat: index and storage now stores their filenames
* test: get_creation_settings for storage and index provided
* test: reindex tests provided
* feat: reindex for collection
* feat: SEARCHALL functionality provided
* feat: integrated REINDEX with CLI
* chore: removed warning reasons
* merge: [feat/#28-load-tests] -> [dev] (#31)
* refactor: embedding generation based on specified 384 dim model
* test: load tests for bulk_insert, search and reindex
* chore: todo added
* merge: [feat/#30-log-truncating] -> [dev] (#32)
* feat: get_file_name_from_path util
* test: truncate test provided and old ones refactored
* feat: TRUNCATEWAL command provided
* test: integration tests for TRUNCATEWAL provided
* fix: TRUNCATEWAL command execute
* merge: [feat/#33-readonly-db-state] -> [dev] (#34)
* refactor: removed redundant db/col existance checks from builder
* feat: LISTCOLLECTIONS query
* chore: got rid of some warnings
* feat: setting col as readonly
* style: dto display changes
* feat: deserialize header for storage error throwing
* chore: removed unused define header functionality
* feat: deserialize header for index error throwing
* deserialize header for wal error throwing
* merge: [feat/#35-wal-dbconfig-checks] -> [dev] (#36)
* feat: checksum validation for wal entries
* refactor: popped up const
* feat: checksum validation for dbconfig
* merge: [feat/#37-rollback-tests] -> [dev] (#38)
* feat: wal uncommiting
* style: command_query_builder mod name changed to cq
* refactor: validator & executor introduced
* refactor: commands are now handling commands and validation
* chore: removed not used directory
* feat: rollback for reindex
* refactor: cqexecutor
* tests: rollback tests provided
* merge: [feat/#39-hnsw] -> [dev] (#41)
* chore: TODO comment
* fix: selecting keys for search in bptree
* chore: .gitignore updated
* feat: hnsw building in cq_builder
* feat: file selecting during embeddings processing
* feat: first hnsw implementation
* refactor: removed unnecessary result postprocessing
* chore: removed unnecessary code
* refactor: graph is being always saved
* chore: removed unnecessary code
* feat: hnsw results are being now postprocessed
* feat: parsing distance and vectors with regex
* chore: removed unused code
* refactor: search_simillar multi vector querying
* feat: graphs saved per distance type
* feat: distance is now selectable
* refactor: removed unnecessary sorting
* tests: provided for hnsw search and graph converter
* chore: removed unused fields from file data struct
* tests: fixed hnsw_search test
* chore: removed unused module of graph_links
* feat: hnsw graph versioning
* refactor: entry_points -> entry_point
* feat: information if collection does not exist
* feat: informing user about current step in search simillar
* feat: hnsw index build progress showed--amend
* feat: time measuring during similarity search
* feat: time measuring during bulk insert
* feat: time measuring during vector parsing
* fix: rollbacking in tree
* refactor: parsing_ops now utilizes parallelism
* tests: storage rollback test fix
* tests: bulk_insert updated
* refactor: much faster parsing data from file
* refactor: better display for embeddings
* tests: updated tests for display changes
* style: conventional error messages
* tests: create vector index test added
* refactor: moved hnsw index consts to the types file
* refactor: hnsw open splitted to load and create
* feat: parse distance func
* feat: extracted create vector index command from search simillar
* refactor: dto formating changes
* chore: split hnsw open to load and create missing change added
* refactor: hnsw clean coded and parallelised
* refactor: graph_layers_builder clean coded
* refactor: graph_links, graph_layers clean coded
* chore: removed unused max threads var
* feat: default hnsw settings changed
* refactor: simplified scorer
* refactor: separated internal ids from external ids
* feat: (#40) handling failed rollback introduced
* refactor: removed unnecessary passing as mut
* feat: cli described
* refactor: removed memchr crate
* refactor: ef_search is now a param
* style: changed file extensions
* feat: b+tree branching factor is now 330
* fix: typo in search_all command name
* feat: search_all now shows time
* refactor: branching_factor now set to 338
* feat: create & delete commands have to has collection name provided
* fix: index tree after branching factor change
* tests: more tests for links storage
* refactor: added vector preprocessing
* refactor: hnsw params set to 32, 64, 128
* refactor: commands are not now passed as mut
* tests: removed unnecessary test
* style: results formact changed
* feat: order for search_all can be now specified
* perf: removed printing layer diagnostics
* feat: embeddings generation separator choosable
* chore: todo comments removed
* feat: better handling for commands without rollback implementation
* ci: action update
* ci: updated fastembed
* style: error messages changed
* chore: warnings resolved
* style: unnecessary comment removed
* merge: [feat/#42-readonly-improvements] -> [dev] (#43)
* refactor: changed the way target is being set as readonly
* style: changed information display for handling failed rollback
* merge: [feat/#44-first-release-prep] -> [dev] (#45)
* feat: first draft of README.md
* feat: bibliography introduced
* feat: images described
* style: eng title changed
* feat: examples added
* feat: references to bib added
* feat: table of contents
* feat: separation of results
* merge: [chore/#46-removal-of-unused-files] -> [dev] #47
* merge: [feat/#51-add-license] -> [dev] (#52)
* LICENSE
* style: apache2 mentioning changed
* feat: notice
* chore: changed copyright date
No description provided.