Feature: Improve robustness of MCAP Parser #6
base: main
Initially only change float64 and uint64. Other data types may work, but we need to verify that this doesn't break any object/string parsing
This catches an edge case where an MCAP file may have both a scalar and an array and we need to concatenate both into an array. This basically converts scalars into arrays prior to concatenation
This catches and handles edge cases in array expansion and makes every effort to output valid data
Constants can be seen by using `mcap list schemas <files>`
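The scalar-to-array conversion mentioned in the commits above can be sketched roughly as follows. This is a minimal illustration, not the toolkit's actual code: the helper name `concat_mixed` is hypothetical, and it assumes the array values are one-dimensional.

```python
import numpy as np


def concat_mixed(values):
    """Concatenate a mix of scalars and 1-D arrays into a single 1-D array.

    Hypothetical helper for illustration; not part of modaq_toolkit.
    """
    # np.atleast_1d promotes a scalar to a length-1 array and leaves
    # existing 1-D arrays unchanged, so every part can be concatenated.
    parts = [np.atleast_1d(np.asarray(v)) for v in values]
    return np.concatenate(parts)


# One message carried a scalar, the next an array, the next a scalar again.
merged = concat_mixed([1.5, np.array([2.5, 3.5]), 4.5])
print(merged)
```

Promoting every value before concatenation avoids special-casing the order in which scalars and arrays arrive.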
@cnicho35, this is ready for review. This adds support for the M2 output from the comparison test.
Hi Andrew, I did a quick test on the HPC and it doesn't appear to be working for the m1-m2-comparison mcap files.

import sys
from pathlib import Path

# --- Add your local toolkit folder to the Python path ---
# For example, if modaq_toolkit is in /projects/m2_ws/src/m2_core/
toolkit_path = Path(r"/projects/wpfacilities/m1_m2_comparison/MODAQ_toolkit/src/")  # <-- change as needed
sys.path.append(str(toolkit_path))

from modaq_toolkit import MCAPParser

# Skip specific topics during parsing
parser = MCAPParser(
    mcap_path=Path("/projects/wpfacilities/m1_m2_comparison/MODAQ_toolkit/tests/Bag_2025_07_23_17_10_38_184.mcap"),
    topics_to_skip=["/diagnostics", "/rosout", "/parameter_events"]
)

# Process the file - skipped topics won't appear in output
dataframes = parser.get_dataframes()
print("Parsed DataFrames with skipped topics:")
for df in dataframes:
    print(df)

This returns an empty df with no error message. Also, can you please add the `topics_to_skip` input to the `process_mcap_files` function?
Also, can you please remove this print function:
@cnicho35 I think I fixed all of the above bugs. On Kestrel with the latest version of this code this yields:

I'm going to run this on the SURF-WEC dataset to verify it works there as well. This "should" work for all the m2_comparison files, but you may hit one-off edge cases. I'm going to bump this to 0.4 and update the PR notes, as this includes larger changes than just a patch. LMK if you see any other areas for improvement.
This PR improves the modaq_toolkit MCAP parser's robustness when handling edge cases in ROS2 message processing, adds flexible topic filtering, and enhances type conversion from ROS to numpy data types.
New Features
1. Topic Filtering with `topics_to_skip`
The `MCAPParser` now accepts an optional `topics_to_skip` parameter that allows you to exclude specific topics from processing.
2. Automatic ROS Constants Filtering
The parser now automatically skips ROS message constant definitions (which are not actual message fields). These include common logging level constants like:
- `DEBUG=10`
- `INFO=20`
- `WARN=30`
- `ERROR=40`
- `FATAL=50`

These constants appear in ROS message schemas but aren't data fields. Previously, the parser would try to extract these as fields and fail. Now they're automatically detected and skipped during schema parsing.
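One way such filtering could work is sketched below. This is an assumption about the approach, not the parser's actual implementation: the `field_lines` helper and the regular expression are hypothetical, and the sketch handles a single message definition in the `TYPE NAME=VALUE` constant syntax, without the separator lines that appear between nested definitions.

```python
import re

# Matches ROS constant definitions such as "byte DEBUG=10" (TYPE NAME=VALUE).
# Assumption: constant names are upper-case, as in the logging levels above.
CONSTANT_RE = re.compile(r"^\s*[\w/]+\s+[A-Z][A-Z0-9_]*\s*=")


def field_lines(schema_text):
    """Return the lines that declare fields, dropping constants, comments and blanks."""
    fields = []
    for line in schema_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or CONSTANT_RE.match(line):
            continue  # blank line or constant definition: not a data field
        fields.append(line)
    return fields


schema = """\
byte DEBUG=10
byte INFO=20
builtin_interfaces/Time stamp
uint8 level
string msg
"""
print(field_lines(schema))
```

Only `stamp`, `level`, and `msg` survive, which are the fields that actually carry data in each message.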
3. ROS to Numpy Type Conversion
The `MessageProcessor` now includes a type mapping system (`ros_type_to_numpy_type_map`) that converts ROS primitive types to their corresponding numpy dtypes.

When processing messages, fields with recognized ROS types are automatically converted to numpy arrays with the correct dtype.
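A minimal sketch of what such a mapping might look like follows. The name `ros_type_to_numpy_type_map` comes from the PR, but its entries here are assumed from the standard ROS 2 primitive types, and the `to_numpy` helper is hypothetical.

```python
import numpy as np

# Assumed contents: the PR names this mapping but does not list its entries here.
ros_type_to_numpy_type_map = {
    "bool": np.bool_,
    "byte": np.uint8,
    "char": np.uint8,
    "int8": np.int8,
    "uint8": np.uint8,
    "int16": np.int16,
    "uint16": np.uint16,
    "int32": np.int32,
    "uint32": np.uint32,
    "int64": np.int64,
    "uint64": np.uint64,
    "float32": np.float32,
    "float64": np.float64,
}


def to_numpy(value, ros_type):
    """Convert a field value to a numpy array if its ROS type is recognized.

    Hypothetical helper for illustration; not part of modaq_toolkit.
    """
    # Strip array suffixes such as "[]" or "[3]" to get the element type.
    base_type = ros_type.split("[", 1)[0]
    dtype = ros_type_to_numpy_type_map.get(base_type)
    if dtype is None:
        return value  # unrecognized types (e.g. string) pass through unchanged
    return np.asarray(value, dtype=dtype)


samples = to_numpy([1, 2, 3], "float64[]")
print(samples.dtype)
```

Leaving unmapped types untouched keeps string and object fields on their existing parsing path, which matches the caution in the first commit about not breaking object/string parsing.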
`mcap list schemas <path_to_mcap>` prints the schemas in the CLI for reference.