Skip to content

Conversation

@Samia35-2973
Copy link

This PR adds a basic integration between the Gemini model and the DeepForest object detection tool, aiming to test whether the Gemini model can effectively call and utilize the DeepForest functionalities. I have added the workflow diagram in the README.md

Setup Instructions

1. Create and activate a Conda environment

conda create -n deepforest_agent python=3.12.11
conda activate deepforest_agent

2. Install dependencies

pip install -r requirements.txt
pip install -e .

3. Configure your Google API Key

Create a .env file in the root directory and add:

GOOGLE_API_KEY="your_api_key_here"

You can get a key from Google AI Studio.

🚀 Usage

Run the following on the terminal to start the Gradio interface:

python -m deepforest_agent.main

Then open the displayed link (e.g., http://127.0.0.1:7860) in your browser to interact with the DeepForest Agent.

@Samia35-2973
Copy link
Author

I made some changes in this Pull Request:

  • Removed return_plot parameter as we always want to see the plotted image whenever DeepForest is run
  • Removed predict_image method as DeepForest mainly uses the predict_tile method
  • Removed the logic to choose between predict_image and predict_tile as predict_tile method is used most of the time
  • Removed the extension parameter and its occurrences as it was mainly used to choose between predict_image and predict_tile methods
  • Earlier the original image was not appending to the messages list to let the Gemini model perform visual analysis on the image. This was because Gradio wasn't receiving the original image when I was passing the image_path separately. Instead only the annotated image after running DeepForest tool was appended. Now, I adjusted the logic to append the original image to the user message in the messages list. And after the tool is called by the Gemini model, only detection data and summary are added for analysis.
  • I also modified the system prompts into a new one to get better response which now seems to be working on my side.
  • I changed the _add_detection_context_to_messages to return complete detection data if previous detection data is available
  • Updated the workflow diagram in README file

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant