diff --git a/Final/README.md b/Final/README.md
new file mode 100644
index 0000000000..d2cc61725c
--- /dev/null
+++ b/Final/README.md
@@ -0,0 +1,274 @@
+# Final
+
+**Sean Hardesty Lewis (solo)**
+
+
+
+I built a pair of "compasses" that point towards each other no matter where they are in a room. While standard GPS tools reduce social connection to Cartesian coordinates on a map, the Friend Compass uses relative RF signal strength (RSSI) and sensor fusion to provide a continuous, egocentric bearing toward a partner. It explores how relative directional cues foster a sense of presence, turning navigation into a "warm" game of hot-and-cold rather than a cold turn-by-turn instruction.
+
+## Inspiration
+
+Please note that this project is heavily inspired by [komugiko2000's](https://www.instagram.com/komugiko_2000/) animation for [ZUTOMAYO – Hunch Gray (Music Video)](https://www.youtube.com/watch?v=ugpywe34_30). You can click on the thumbnail below to watch the YouTube video and understand the inspiration. I highly recommend it!
+
+
+
+## Background
+
+I actually watched the above video a few years ago and was fascinated by the concept of a compass wristwatch that points to a friend. However, I didn't have the knowledge, resources, or ability at the time to even begin building it, and the sketch below was my best guess. Here is my sketch from 2022/2023:
+
+
+
+...and here is a sketch I made for this project (November 2025):
+
+
+
+
+## Overview
+
+### Part A: The Scope Pivot (Hardware)
+
+I started this project with an extremely ambitious "hardware-first" mindset. The original plan was to build a pair of physical wristwatches driven by Femtoduinos, using tiny brushless motors to spin a physical needle. I spent the first week diving into the mechanics of absolute positioning and quickly realized I was spiraling into scope creep. Absolute-position motors are surprisingly expensive, and standard DC motors have no idea "where" they are pointing without complex optical encoders. I tried stepper motors and hobby servos, but the servos are typically capped at 180 degrees of travel and the steppers require bulky drivers. I considered continuous-rotation servos with slip rings, but worried about mechanical drift, a problem I was already seeing with the stepper motors. Without a closed-loop control system, the physical needle would eventually lose its orientation relative to "North," making it useless.
+
+I realized that while a physical needle is cool, I was solving a mechanical engineering problem instead of the interaction design problem I actually cared about. I pivoted hard. I swapped the Femtoduino for a prototype Raspberry Pi 5 and the physical motor for a simple digital display. This allowed me to focus on the actual hard problem: the invisible signal processing. While less "mechanical" than a moving needle, the digital display allowed for rapid iteration on the smoothing algorithms without worrying about the physical inertia or latency of a motor. It shifted the project from "How do I spin this gear?" to "How do I know where my friend is?"
+
+Here is my original pitch:
+
+
+
+
+### Part B: The Signal Problem (RSSI vs. Reality)
+
+Here are some of the different routes that were tested but ultimately not pursued.
+From left to right:
+- Pair of servo motors. While stepped variants exist, they are often not continuous, and most continuous-rotation versions give no positional feedback, so you can't tell where they are pointing.
+- GPS Breakout. This didn't work at all indoors, and when it connected to satellites outside, the location was too brittle.
+- Rotary encoder. This was just to test a rubber-band linkage between a servo motor and the encoder. Interesting concept, but bad execution.
+
+
+
+The core technical challenge was calculating direction using Wi-Fi signals. My initial instinct was to use a GPS module on the Pi. I thought, "Satellites know where everything is, right?" I was wrong. The GPS modules were incredibly brittle, failing completely indoors (where most social interaction happens) and requiring a clear view of the sky to get even a 10-meter accuracy radius. For a "room-scale" interaction, 10 meters of error is useless: you could be in the kitchen while the compass says you’re outside.
+
+So, I looked at what was already in the air: Wi-Fi. I shifted to using RSSI (Received Signal Strength Indicator) over a shared Wi-Fi network to estimate proximity and direction. I hypothesized that I could triangulate position based on signal strength (I did my undergraduate degree in Mathematics and Bellman's Lost in a Forest navigation problem was by far my favorite, but that is more geometric than this). The "Moment of Truth" came when I set up the two Pis on opposite ends of my apartment. I wrote a script to make the "Seeker" Pi scan for the "Target" Pi's unique beacon packets. I watched the console logs: RSSI: -45dBm, RSSI: -60dBm. It was working! The numbers moved as I moved. But they were pretty messy.
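+
+For intuition about why raw RSSI is so rough, here is a minimal sketch of the standard log-distance path-loss model that converts an RSSI sample into an approximate distance. The reference power and path-loss exponent are assumed values that shift dramatically indoors, which is part of why the numbers were messy:
+
+```python
+# Rough RSSI -> distance sketch using the log-distance path-loss model.
+# rssi_at_1m and path_loss_exponent are ASSUMED values; indoors they vary wildly,
+# which is exactly why raw RSSI is such a noisy proxy for position.
+
+def rssi_to_distance_m(rssi_dbm: float,
+                       rssi_at_1m: float = -40.0,
+                       path_loss_exponent: float = 2.5) -> float:
+    """Estimate distance (meters) from a single RSSI sample."""
+    return 10 ** ((rssi_at_1m - rssi_dbm) / (10 * path_loss_exponent))
+
+print(rssi_to_distance_m(-45))  # ~1.6 m
+print(rssi_to_distance_m(-60))  # ~6.3 m
+```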
+
+Here is a diagram of the pivoted architecture, moving from raw hardware to a sensor-fusion software approach:
+
+
+
+### Part C: The Mess
+
+However, seeing numbers on a console screen and getting an arrow to point at a person turned out to be two completely different realities. The gap between "I have data" and "I have a compass" was massive and filled with physics problems I hadn't anticipated. When I first hooked the RSSI data directly to the visual arrow, the result was a catastrophe. The arrow didn't point at my friend; it swung violently, vibrating back and forth across 180 degrees, or spun in circles even when both devices were sitting perfectly still on a table.
+
+I discovered that indoor environments are essentially "Halls of Mirrors" for radio waves. This is called Multipath Interference: signals bounce off walls, floors, and metal desks, creating noise that makes a static device look like it's teleporting. Even a hand covering the antenna can drop the signal strength significantly, throwing off the calculation. The signal from my friend's device wasn't just coming in a straight line; it was bouncing off nearly everything around it. Sometimes a reflected signal was stronger than the direct line-of-sight signal because the direct path was blocked by my own body (maybe because humans are mostly water, and water absorbs 2.4 GHz waves effectively?). The compass would confidently point at a metal cabinet because that's where the strongest reflection was coming from. I first tried to solve this with machine learning regression, feeding it raw data to "learn" the room, but the environment was too dynamic. Moving a chair or opening a door changed the RF landscape completely.
+
+The breakthrough came when I realized I couldn't rely on RF alone for direction. I needed to know what the device itself was doing. If the signal strength dropped, did my friend run away, or did I just turn my body? To answer this, I integrated an IMU (Inertial Measurement Unit), but I discovered it didn't have a magnetometer (compass chip), only a Gyroscope and Accelerometer. This meant the device had absolutely no idea where "North" was. It was blind to the world, only knowing how fast it was spinning. However, the gyroscope was still useful: it tells the compass when we are turning our body, so the needle can be counter-rotated on screen to keep pointing in the same consistent world direction even as the device spins.
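+
+Here is a minimal sketch of that counter-rotation idea (simplified relative to `compass.py`; `read_gyro_z()` is a hypothetical stand-in for the LSM6DS3 driver, and the sign of the correction depends on how the IMU is mounted):
+
+```python
+import math, time
+
+def read_gyro_z() -> float:
+    """Hypothetical stub for the IMU's z-axis rotation rate in rad/s."""
+    return 0.0
+
+needle_screen_angle = 0.0   # where the needle points on the display, in degrees
+last_t = time.time()
+
+while True:
+    now = time.time()
+    dt = now - last_t
+    last_t = now
+
+    # If the device rotates by X degrees, rotate the needle by -X degrees on
+    # screen so it keeps pointing at the same spot in the room.
+    device_rotation_deg = math.degrees(read_gyro_z() * dt)
+    needle_screen_angle = (needle_screen_angle - device_rotation_deg) % 360
+
+    time.sleep(0.02)
+```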
+
+Since we could handle rotating in place, the biggest thing left was the accuracy of the needle. Multipath interference was the boss of all bosses, and I was stumped. Thankfully, real life comes to save us sometimes. Reggie Fils-Aimé visited our campus the next day and reminded me that the Wii existed, specifically Wii Sports Resort and the Wii Balance Board. I remembered something that every person who owned a Wii went through, and wondered if it could be applied to my own setting: custom calibration the very first time you set it up. So I implemented a "Calibration" startup sequence. When the compass starts, the user spins around their friend in a slow 360-degree circle. The system maps the RSSI strength and how it changes in that dynamic environment, identifies the angle where the signal was strongest (the "Peak"), and defines that virtual angle as "Friend." From that moment on, the system relies entirely on a calibrated environment that is custom to the person, not some generic model from a completely different environment. This smoothed the experience drastically. RSSI noise and multipath interference were still issues, but they were noticeably less apparent: the needle stopped jumping to reflections because it was anchored to that initial calibration scan. It wasn't perfect, since there was no magnetometer to correct it and Gyro drift would eventually creep in, but it transformed the device from a broken random number generator into something that felt stable, heavy, and intentional.
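+
+Here is a minimal sketch of that peak-finding step (the repo's `calibrate.py` actually uses a small regression model instead; the smoothing window and the synthetic scan data below are assumptions for illustration):
+
+```python
+from statistics import mean
+
+def find_friend_angle(samples, window=5):
+    """samples: list of (heading_deg, rssi_dbm) collected during a slow 360 spin.
+    Returns the heading whose locally averaged RSSI is strongest (the 'Peak')."""
+    samples = sorted(samples)  # order by heading around the circle
+    best_heading, best_rssi = None, float("-inf")
+    for i in range(len(samples)):
+        # average over a small window of neighboring headings to damp multipath spikes
+        neighborhood = [samples[(i + k) % len(samples)][1] for k in range(-window, window + 1)]
+        smoothed = mean(neighborhood)
+        if smoothed > best_rssi:
+            best_rssi, best_heading = smoothed, samples[i][0]
+    return best_heading
+
+# Synthetic example: strongest (least negative) readings cluster around 90 degrees
+scan = [(a, -70 + (20 if 60 <= a <= 120 else 0)) for a in range(0, 360, 5)]
+print(find_friend_angle(scan))  # prints a heading near 90
+```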
+
+Here is what the device looked like with the needle pointing in the direction of a friend:
+
+
+
+### Part D: The Interaction
+
+#### Observations
+
+Despite the brittleness of the underlying signal, the interaction feels magical when it works. In a contained but open environment (like a large living room or a lab without heavy metal interference), the compass can usually point reliably toward the partner device, with some drift and interference (it is impossible to get rid of all interference with Wi-Fi).
+
+In an open room without too many obstacles, when you turn the device and the arrow spins to lock onto your friend, it feels genuinely alive. It feels like a magnetic pull. Users immediately understood the "game" of it without explanation. However, the system falls apart in predictable, physics-based ways. Even with the smoothing, Multipath Interference is a persistent enemy. In the Maker Lab, which is full of metal desks and equipment, the signal would sometimes bounce so hard that the "Peak" during calibration was actually a reflection off a wall, leading the user confidently in the wrong direction. I also found that the human body is a giant blocker; simply holding the device close to your chest (or enclosing the Wi-Fi antennas with your hands) could drop the signal strength by 10 dB, making the compass think your friend just sprinted 20 feet away.
+
+#### Users
+
+Thinking like a user, I realized that "accuracy" matters a bit less than "responsiveness." If the needle jumps around wildly (which raw RSSI data does), the user thinks the device is broken. If the needle moves smoothly but is slightly wrong, the user thinks they are reading it wrong or that it's "calibrating" (or, as some users showed, they become convinced it is accurate and invent justifications for the needle's direction). I leaned into this. I implemented a heavy filter that trusts the Gyroscope for short-term rotation and ignores sudden, impossible jumps in the RSSI signal. This makes the needle feel heavy and intentional, rather than jittery and digital. Users described this as feeling more like a "magnetic" compass, which was exactly the socio-affective vibe I wanted.
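+
+A minimal sketch of that filtering idea (simplified, not a copy of the fusion loop in `compass.py`; the jump threshold and blend weight are values I'm assuming for illustration):
+
+```python
+def fuse_bearing(prev_bearing_deg, gyro_delta_deg, rssi_bearing_deg,
+                 max_rssi_jump_deg=25.0, rssi_weight=0.05):
+    """One update step of the 'heavy needle' filter.
+
+    prev_bearing_deg: where the needle pointed last frame
+    gyro_delta_deg:   how much the device rotated since last frame (trusted short-term)
+    rssi_bearing_deg: the noisy bearing suggested by the RF data this frame
+    """
+    # 1. Trust the gyro completely for short-term rotation.
+    predicted = prev_bearing_deg + gyro_delta_deg
+
+    # 2. Reject "impossible" RSSI jumps (likely multipath); otherwise nudge gently.
+    error = ((rssi_bearing_deg - predicted + 180) % 360) - 180
+    if abs(error) > max_rssi_jump_deg:
+        return predicted % 360            # ignore the jump entirely
+    return (predicted + rssi_weight * error) % 360
+
+bearing = 0.0
+bearing = fuse_bearing(bearing, gyro_delta_deg=2.0, rssi_bearing_deg=160.0)  # big jump ignored
+bearing = fuse_bearing(bearing, gyro_delta_deg=0.0, rssi_bearing_deg=10.0)   # gentle correction
+print(bearing)  # 2.4
+```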
+
+The "Pirates of the Caribbean" effect is real. Users noted that because the device is "alive" (constantly updating), their brains do a lot of the work. If the needle points generally in the right direction, the user feels a connection. The smoothing algorithms helped here immensely; by damping the rotation, the compass feels like it has weight and inertia, making it feel less like a glitchy computer and more like a physical object.
+
+Here are some top-down photos of the compass:
+
+
+
+
+
+
+Here is a video of the compass when it is navigating towards its friend:
+
+https://github.com/user-attachments/assets/ea61648a-68ff-4d69-92d7-91f7e831c4e8
+
+Here is a video of the compass when it is near its friend:
+
+https://github.com/user-attachments/assets/d746280c-711a-43e1-8f6f-bfefe4ae0745
+
+### Part E: Lessons
+
+I realized that the interaction worked best when I treated it as "Calm Technology." It shouldn't scream at you. It should just be.
+
+| Question | Answer |
+| :--- | :--- |
+| **What can you use X for?** | The Friend Compass is used for maintaining a peripheral sense of connection. It allows you to find a friend in a crowd or a large venue without staring at a map app or texting "wya". |
+| **What is a good environment for X?** | Open indoor spaces, line-of-sight scenarios (like a warehouse party or gym), or wood-framed houses where RF passes through walls easily. |
+| **What is a bad environment for X?** | Metal-heavy labs (Faraday cage effect), dense concrete buildings, or incredibly crowded Wi-Fi environments (like a convention center) where packet loss is high. |
+| **When will X break?** | It breaks when Multipath Interference overwhelms the filter (e.g., standing next to a large metal fridge) or when the devices hand off to different Wi-Fi access points. |
+| **When it breaks how will X break?** | The arrow will drift aimlessly (due to Gyro drift) or lock onto a "reflection" of the signal (like a wall) rather than the true source. |
+| **What are other properties/behaviors of X?** | The "weight" of the needle is programmable. I can make it twitchy and reactive (good for tracking fast movement) or heavy and slow (good for general direction). |
+| **How does X feel?** | It feels like a "living" artifact. The needle has a "magnetic" pull towards your friend. It feels playful, slightly mysterious, and warm. |
+
+#### What worked and what didn’t
+**The Wins:** The pivot was the biggest win. Moving away from hardware mechanics to software signal processing saved the project. The "vibe" was also a huge success. Even when the compass was technically inaccurate (pointing 15 degrees off), users didn't care. They corrected their path naturally, like playing "Hot and Cold." The visual feedback of the needle rotating smoothly (thanks to the Gyro) made the device feel high-quality, masking the noisy data underneath.
+
+**The Misses:** RSSI is barely usable for precision. It is incredibly sensitive to the environment. The "GPS" idea was a total failure indoors. Also, the lack of a magnetometer means the device suffers from Gyro Drift over time. If you use it for 10 minutes straight without re-calibrating, "North" will slowly drift to the left or right, and the arrow will lose its accuracy.
+
+#### Lessons for making it more autonomous
+The biggest lesson was that filtering creates reality. The raw data says the friend is teleporting around the room. The filter says the friend is standing still. The user believes the filter. For a device to feel "smart," it doesn't need perfect sensors; it needs a model of the world that rejects impossible data.
+
+I also learned that "calibration" can be a feature, not a bug. Asking the user to calibrate the device creates a ritual that builds trust in the machine. It makes the user an active participant in the sensing loop.
+
+#### What I’d do next
+The first thing I’d change is the radio. LoRa is the answer. Wi-Fi is designed for Netflix, not ranging. LoRa modules (like the SX1280) have Time-of-Flight (ToF) ranging that measures how long a signal takes to travel, which is far more accurate for distance than measuring signal loudness (RSSI).
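+
+For context, the ToF arithmetic itself is simple (a sketch of the math only, not the SX1280 API): distance is the speed of light times the one-way travel time, so even nanosecond-scale timing resolution corresponds to tens of centimeters.
+
+```python
+SPEED_OF_LIGHT_M_PER_S = 299_792_458
+
+def tof_distance_m(round_trip_time_s: float) -> float:
+    """Distance from a measured round-trip time (signal out and back)."""
+    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2
+
+# A ~66.7 ns round trip corresponds to roughly 10 m of separation
+print(tof_distance_m(66.7e-9))  # ~10.0
+```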
+
+Second, Haptics. I want to integrate vibration motors. As you get closer to your friend, the watch should pulse like a heartbeat (The "Tell-Tale Heart" effect). This allows for eyes-free navigation: you could find your friend with your hands in your pockets.
+
+Third, Magnetometer. I would absolutely ensure the next iteration has a working magnetometer. Relying solely on the Gyro creates a "time limit" on usage before drift makes it unusable. A 9-axis IMU (instead of 6-axis) is a requirement for version 2.
+
+Here is a longer format video of the Friend Compass in action, showing the "seeking" behavior and its spin interaction on finding the friend. Thanks to Sebastian Bidigain for testing it out!
+
+https://github.com/user-attachments/assets/76d5e89e-605a-4976-9143-8b986f3f2893
+
+### Discussion
+
+| Person | Viewpoint |
+| :--- | :--- |
+| **Sebastian Bidigain** | Thinks the concept is fantastic. He revealed he had a similar idea for a "festival wristband" using LoRaWAN and LED rings. He was impressed I managed to get a prototype working with just Wi-Fi RF, noting that existing commercial solutions (like Totem Compass) usually require extremely close range. He validated that moving to LoRa is the "smart way to go" for the future. |
+| **Bil de Leon** | Jokingly called it "the future." He compared it to the compass in *Pirates of the Caribbean* that points to "what you desire most." He was skeptical of the accuracy, wondering if the "placebo effect" made him believe the needle was more accurate than it was, but admitted the smooth rotation made it feel convincing. |
+| **Anonymous** | Found the interaction "playful." She noted that unlike a map, which demands your full attention, this device allows you to look around the room. She liked that it didn't give a distance number (ex. "10 meters"), but rather a feeling of direction, which felt more human. |
+
+## Code Pipeline
+
+```
+[Receiver Pi (Target/Mirror)] [Root Pi (Seeker/Compass)]
+ (Runs beacon.py + mirror.py) (Runs compass.py)
+ │ │
+ │ 1. Emits UDP Beacon (:50050) │
+ │ "I am here!" │
+ └────────────────────────────────────────────────────────────>│
+ │
+ [compass.py]
+ 2. Sniffs packets on wlan1 (Left)
+ & wlan2 (Right) via Scapy.
+ 3. Fuses RSSI Delta + Gyro.
+ 4. Updates local Screen.
+ 5. Broadcasts Result via UDP.
+ │
+ │ │
+ │ 6. Receives Angle Data (:55555) │
+ │ <───────────────────────────────────────────────────────────┘
+ │
+ [mirror.py]
+ 7. Calculates Inverse Angle (Angle + 180°).
+ 8. Renders "Mirror" arrow to screen.
+```
+
+## Components
+
+### 1) `beacon.py`
+* **Run on:** **Receiver Pi** (The Target).
+* **Purpose:** Emits a "heartbeat" RF signal.
+* **How it works:** It broadcasts small UDP packets (`RECEIV_BEACON_V1`) to the local network broadcast address (`255.255.255.255`) on port **50050** every 0.1 seconds.
+* **Why:** This creates the constant radio noise that the Root Pi needs to track. By running this on the Receiver Pi, the Receiver becomes the physical "Friend" that is being tracked.
+
+### 2) `mirror.py`
+* **Run on:** **Receiver Pi** (The Target).
+* **Hardware:** HDMI Monitor (I didn't have a second PiTFT screen).
+* **Purpose:** The "Magic Mirror" display.
+* **Logic:** It listens on port **55555** for status updates from the Root Pi. When it receives an angle (ex. "Friend is at 90°"), it calculates the reciprocal angle (270°) and displays it. This ensures that if the Compass points at the Mirror, the Mirror points back at the Compass.
+
+### 3) `compass.py` (The Brain)
+* **Run on:** **Root Pi** (The Seeker).
+* **Hardware:**
+ * **2x Wi-Fi Interfaces:** `wlan1` (Left Ear) and `wlan2` (Right Ear) in monitor mode.
+ * **IMU:** LSM6DS3 (Gyroscope).
+ * **Display:** ST7789 (SPI 1.3" Screen).
+* **Purpose:** Performs the sensing and sensor fusion.
+* **Logic:**
+ * **Sniffing:** Uses `scapy` to capture the `beacon.py` packets emitted by the Receiver.
+ * **Direction:** Compares signal strength between the Left and Right Wi-Fi antennas (`wlan1` vs `wlan2`) to determine direction.
+ * **Fusion:** Blends this RF data with Gyroscope readings to smooth the motion.
+ * **Broadcast:** Sends the calculated state `{angle, spin}` back to the Receiver so the Mirror stays in sync.
+
+### 4) `calibrate.py` (Active Learning)
+* **Run on:** **Root Pi** (The Seeker).
+* **Purpose:** Trains the navigation model in real-time.
+* **Workflow:**
+ 1. **Bootstrap:** You record 3 baseline points (Center, Left 90, Right 90).
+ 2. **Free Roam:** You walk around. If the needle is correct, you press `y` (reinforcing the model). If wrong, you press `n`.
+ 3. **Save:** Generates a `.pkl` checkpoint file that `compass.py` can load.
+
+### 5) `calibrate_spin.py` (Proximity)
+* **Run on:** **Root Pi** (The Seeker).
+* **Purpose:** Defines the "Arrival" zone.
+* **Workflow:** You stand at the distance where you want the "Friend Found" spin animation to trigger. The script records the signal strength at that distance and saves it to `checkpoints/spin.json`.
+
+### 6) `udp_checker.py` (Tooling)
+* **Run on:** Either Pi (for debugging).
+* **Purpose:** Verifies that the UDP broadcast packets are actually making it across the network, which is critical since the entire system relies on connectionless UDP for low latency.
+
+## Default Ports & Addresses
+
+* **Beacon Signal (Receiver -> Root):** UDP `50050` (Broadcast).
+* **State Sync (Root -> Receiver):** UDP `55555` (Broadcast).
+* **Target MAC:** Hardcoded in `compass.py`. *Note: You must update `RECEIV_MAC` in the code to match the Wi-Fi MAC address of the Receiver Pi.*
+
+## Notes
+
+The `compass.py` script relies on **Stereo RF**. The Root Pi must have **two separate Wi-Fi interfaces** (`wlan1` and `wlan2`) enabled in monitor mode. This is achieved by plugging in two external USB Wi-Fi adapters. This hardware setup allows the code to compare `raw_rssi_left` vs `raw_rssi_right` simultaneously, enabling "Direction Finding" based on the signal differential (a stereo amplitude comparison) rather than just a single signal strength.
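+
+The fallback (non-ML) mapping is essentially the clamp-and-scale below; `DELTA_MAX_DB` and `MAX_ANGLE_DEG` mirror the constants defined in `compass.py`, shown here in isolation for clarity:
+
+```python
+DELTA_MAX_DB = 20.0   # difference at which we assume the friend is fully to one side
+MAX_ANGLE_DEG = 90.0  # needle swing at that maximum difference
+
+def stereo_rssi_to_angle(rssi_left_dbm: float, rssi_right_dbm: float) -> float:
+    """Positive result = friend is to the right, negative = to the left."""
+    delta = rssi_right_dbm - rssi_left_dbm
+    delta = max(-DELTA_MAX_DB, min(DELTA_MAX_DB, delta))
+    return (delta / DELTA_MAX_DB) * MAX_ANGLE_DEG
+
+print(stereo_rssi_to_angle(-60, -50))  # +45.0 -> the right "ear" hears it louder
+print(stereo_rssi_to_angle(-45, -45))  #  0.0  -> dead ahead
+```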
+
+### Critical Setup Instructions
+
+To make this work, you cannot just run the scripts! You MUST configure the hardware environment first (this is super important).
+
+**1. Monitor Mode is Mandatory**
+The `compass.py` script requires raw access to radio packets. You must manually set your USB Wi-Fi adapters (`wlan1` and `wlan2`) to Monitor Mode before running the script.
+
+```bash
+sudo ip link set wlan1 down
+sudo iw wlan1 set type monitor
+sudo ip link set wlan1 up
+
+sudo ip link set wlan2 down
+sudo iw wlan2 set type monitor
+sudo ip link set wlan2 up
+```
+
+**2. Network Synchronization**
+For the sniffer to see the packets, all devices must be on the same frequency. Connect your internal Wi-Fi (`wlan0`) to a specific network, and ensure your monitor interfaces are tuned to that channel.
+
+```bash
+# Connect internal wifi to your hotspot
+sudo nmcli dev wifi connect "YourNetworkName" password "YourPassword"
+# You can set specific wifis for different receivers like so:
+sudo nmcli dev wifi connect "YourNetworkName" password "YourPassword" ifname wlan#
+# You can check what wlan# your receivers are by using `iwconfig`
+```
+
+**3. Preempting the Lab Boot Screen**
+If you are using a shared lab Raspberry Pi that has a default "System Info" screen running on boot, it will block access to the SPI display. You must kill this service before running the compass.
+
+```bash
+sudo systemctl stop screen-boot.service
+# Or if running manually:
+pkill -f "python.*screen_boot_script.py"
+```
+
+## AI Disclaimer
+
+I used AI coding assistance (GitHub Copilot) to generate the `scapy` packet sniffing logic in `compass.py` and the `pygame` rendering boilerplate in `mirror.py`. The sensor fusion math (weighted averaging of RSSI delta and Gyroscope integration) and the distributed "Beacon/Mirror" architecture were implemented piecewise with quite a bit of manual work from me to ensure low latency.
+
+## Poster
+
+
+
+## Credits
+
+Special thanks to **Sebastian Bidigain** for the critical feedback on RF protocols and the wristband concept validation, and **Bil de Leon** for the user experience testing and "Pirates" comparison. Thanks as well to the Maker Lab staff for letting me run around the room endlessly to test signal drops. Finally, since this is the last project of the semester, special thanks to **Albert Han**, **Hauke Sandhaus**, and **Wendy Ju** for making this class really awesome. I loved getting to prototype my random ideas and see them come alive, and I really appreciate all the hard work they put in behind the scenes!
diff --git a/Final/beacon.py b/Final/beacon.py
new file mode 100644
index 0000000000..b34d8a9e32
--- /dev/null
+++ b/Final/beacon.py
@@ -0,0 +1,53 @@
+"""
+beacon.py
+
+Device: Receiving RPI
+
+Purpose:
+ - Periodically broadcast small UDP "beacon" packets.
+ - The Raspberry Pi (with two ears in monitor mode) will sniff 802.11 frames
+ from this Raspberry Pi Wi-Fi interface and use RSSI to estimate direction.
+
+No monitor mode needed, just a normal Wi-Fi connection.
+"""
+
+import socket
+import time
+import json
+import uuid
+
+BROADCAST_PORT = 50050
+BEACON_INTERVAL = 0.1 # seconds
+BEACON_MAGIC = "RECEIV_BEACON_V1"
+
+
+def main():
+ sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
+ sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
+
+ instance_id = str(uuid.uuid4())[:8]
+ print(f"[RPI Beacon] Starting, instance id = {instance_id}")
+ print(f"[RPI Beacon] Broadcasting on UDP port {BROADCAST_PORT} every {BEACON_INTERVAL}s")
+ print("Press Ctrl+C to stop.")
+
+ seq = 0
+ try:
+ while True:
+ seq += 1
+ payload = {
+ "magic": BEACON_MAGIC,
+ "instance": instance_id,
+ "seq": seq,
+ "time": time.time(),
+ }
+ data = json.dumps(payload).encode("utf-8")
+ sock.sendto(data, ("255.255.255.255", BROADCAST_PORT))
+ time.sleep(BEACON_INTERVAL)
+ except KeyboardInterrupt:
+ print("\n[RPI Beacon] Stopping.")
+ finally:
+ sock.close()
+
+
+if __name__ == "__main__":
+ main()
diff --git a/Final/calibrate.py b/Final/calibrate.py
new file mode 100644
index 0000000000..bfd231a009
--- /dev/null
+++ b/Final/calibrate.py
@@ -0,0 +1,281 @@
+"""
+calibrate.py
+
+Active Learning Calibration for Raspberry Pi Dual-Ear Compass.
+
+HARDWARE:
+ - Pi 5 (Receiver) moving around a static Beacon (Sender).
+ - ST7789 Display showing the "Needle".
+
+WORKFLOW:
+ 1. BOOTSTRAP: We collect 3 baseline points (Center, Left 90, Right 90).
+ 2. FREE ROAM: The compass starts running live using the initial model.
+ - You move around the circle.
+ - When the needle matches reality (points at beacon), you type 'y'.
+ - If the needle is wrong, you type 'n'.
+ - The model retrains instantly on 'y'.
+
+Usage:
+ sudo python3 calibrate.py
+"""
+
+import os
+import time
+import uuid
+import joblib
+import threading
+import math
+import sys
+import select
+from collections import deque
+from sklearn.linear_model import LinearRegression
+
+import board
+import digitalio
+from PIL import Image, ImageDraw, ImageFont
+from adafruit_rgb_display import st7789
+from scapy.all import sniff, Dot11, RadioTap
+
+# ---------- CONFIG ----------
+
+RECEIV_MAC = "2c:cf:67:73:fe:2c" # BEACON MAC
+LEFT_IFACE = "wlan1"
+RIGHT_IFACE = "wlan2"
+CHECKPOINT_DIR = "checkpoints"
+
+# Physics limits for fallback (before ML takes over)
+DELTA_MAX_DB = 20.0
+MAX_ANGLE_DEG = 90.0
+
+# ---------- STATE ----------
+
+lock = threading.Lock()
+rssi_left_buffer = deque(maxlen=10) # Fast smoothing
+rssi_right_buffer = deque(maxlen=10)
+
+X_train = [] # Features: [rssi_l, rssi_r]
+y_train = [] # Labels: [angle]
+
+ml_model = None
+is_running = True
+current_pred_angle = 0.0
+
+# ---------- HARDWARE SETUP ----------
+
+def ensure_root():
+ if os.geteuid() != 0:
+ raise SystemExit("!! MUST RUN AS ROOT (sudo) !!")
+
+# Setup Display
+cs_pin = digitalio.DigitalInOut(board.D5)
+dc_pin = digitalio.DigitalInOut(board.D25)
+spi = board.SPI()
+disp = st7789.ST7789(spi, cs=cs_pin, dc=dc_pin, width=135, height=240, x_offset=53, y_offset=40)
+width = disp.height
+height = disp.width
+image = Image.new("RGB", (width, height))
+draw = ImageDraw.Draw(image)
+bl = digitalio.DigitalInOut(board.D22)
+bl.switch_to_output()
+bl.value = True
+try:
+ font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", 18)
+except:
+ font = ImageFont.load_default()
+
+# ---------- UTILS ----------
+
+def normalize_mac(mac):
+ return (mac or "").lower()
+
+def get_current_rssi():
+ """Returns average RSSI from buffers."""
+ with lock:
+ if len(rssi_left_buffer) < 2 or len(rssi_right_buffer) < 2:
+ return None, None
+ avg_l = sum(rssi_left_buffer) / len(rssi_left_buffer)
+ avg_r = sum(rssi_right_buffer) / len(rssi_right_buffer)
+ return avg_l, avg_r
+
+def train_model():
+ """Retrains the global model on X_train/y_train."""
+ global ml_model
+ if len(X_train) < 3:
+ return # Not enough data
+
+ # Linear Regression is robust for simple RSSI difference
+ clf = LinearRegression()
+ clf.fit(X_train, y_train)
+ ml_model = clf
+
+# ---------- SNIFFER ----------
+
+TARGET_MAC = normalize_mac(RECEIV_MAC)
+
+def sniff_thread(iface, side):
+ def packet_handler(pkt):
+ if not pkt.haslayer(Dot11): return
+ if normalize_mac(pkt.addr2) == TARGET_MAC:
+ try:
+ rssi = int(pkt[RadioTap].dBm_AntSignal)
+ with lock:
+ if side == "left": rssi_left_buffer.append(rssi)
+ else: rssi_right_buffer.append(rssi)
+ except: pass
+
+ print(f"[Sniffer] Listening on {iface} ({side})...")
+ sniff(iface=iface, prn=packet_handler, store=False)
+
+# ---------- DISPLAY LOOP (The "Spinning Compass") ----------
+
+def display_thread_func():
+ """Runs the display at 30FPS."""
+ global current_pred_angle
+
+ while is_running:
+ l, r = get_current_rssi()
+
+ # 1. Predict Angle
+ if l is not None and r is not None:
+ if ml_model:
+ # Use the Brain
+ try:
+ pred = ml_model.predict([[l, r]])[0]
+ current_pred_angle = max(-110, min(110, pred))
+ except:
+ current_pred_angle = 0.0
+ else:
+ # Use rough math (Fallback before bootstrap)
+ delta = r - l
+ ratio = max(-1, min(1, delta / DELTA_MAX_DB))
+ current_pred_angle = ratio * MAX_ANGLE_DEG
+
+ # 2. Draw
+ draw.rectangle((0, 0, width, height), outline=0, fill=(0, 0, 0))
+ cx, cy = width // 2, height // 2
+ radius = 50
+
+ # Compass circle
+ draw.ellipse((cx-radius, cy-radius, cx+radius, cy+radius), outline=(100,100,100), width=2)
+
+ # Needle
+ angle_rad = math.radians(current_pred_angle)
+ px = cx + radius * math.sin(angle_rad)
+ py = cy - radius * math.cos(angle_rad)
+ draw.line((cx, cy, px, py), fill=(0, 255, 0), width=4) # Green needle for calibration
+
+ # Stats
+ if ml_model:
+ draw.text((5, 5), "AI MODE", font=font, fill=(0, 255, 0))
+ else:
+ draw.text((5, 5), "MATH MODE", font=font, fill=(255, 0, 255))
+
+ draw.text((5, 25), f"Ang: {current_pred_angle:.0f}", font=font, fill=(255,255,255))
+ if l: draw.text((5, height-20), f"L:{l:.0f} R:{r:.0f}", font=font, fill=(100,100,100))
+
+ disp.image(image, 90)
+ time.sleep(0.05)
+
+# ---------- MAIN INTERACTION ----------
+
+def main():
+ global is_running, ml_model, X_train, y_train
+ ensure_root()
+ if not os.path.exists(CHECKPOINT_DIR): os.makedirs(CHECKPOINT_DIR)
+
+ # 1. Start Ears
+ t1 = threading.Thread(target=sniff_thread, args=(LEFT_IFACE, "left"), daemon=True)
+ t2 = threading.Thread(target=sniff_thread, args=(RIGHT_IFACE, "right"), daemon=True)
+ t1.start(); t2.start()
+
+ # 2. Wait for Signal
+ print("Waiting for signal...")
+ while True:
+ l, r = get_current_rssi()
+ if l is not None: break
+ time.sleep(0.5)
+ print("Signal found. Starting Display.")
+
+ # 3. Start Display Thread
+ t_disp = threading.Thread(target=display_thread_func, daemon=True)
+ t_disp.start()
+
+ # 4. Bootstrap Phase (3 Points)
+ print("\n=== PHASE 1: BOOTSTRAP ===")
+ print("I need 3 baseline points before I can start guessing.")
+
+ bootstrap_steps = [
+ (0, "Position Pi so Beacon is DEAD AHEAD (0 deg)"),
+ (-90, "Position Pi so Beacon is 90 LEFT"),
+ (90, "Position Pi so Beacon is 90 RIGHT")
+ ]
+
+ for angle, prompt in bootstrap_steps:
+ print(f"\n>> {prompt}")
+ input(">> Press [ENTER] when ready...")
+
+ l, r = get_current_rssi()
+ if l is None:
+ print("!! Signal lost. Skipping point.")
+ continue
+
+ print(f" Saved: L={l:.1f}, R={r:.1f} -> Angle={angle}")
+ X_train.append([l, r])
+ y_train.append(angle)
+
+ print("\nTraining initial model...")
+ train_model()
+
+ # 5. Active Learning Loop
+ print("\n=== PHASE 2: FREE ROAM (Active Learning) ===")
+ print("Instructions:")
+ print(" 1. Walk around the beacon circle.")
+ print(" 2. Watch the needle on the Pi screen.")
+ print(" 3. When the needle is pointing CORRECTLY at the beacon -> Press 'y' then Enter")
+ print(" 4. If the needle is WRONG -> Press 'n' then Enter")
+ print(" 5. Press 's' to Save Checkpoint.")
+ print(" 6. Press 'q' to Quit.")
+
+ while True:
+ # We need to capture input without blocking the display (display is in thread, so input() is fine)
+ cmd = input("\n[y=Good / n=Bad / s=Save / q=Quit] > ").strip().lower()
+
+ curr_l, curr_r = get_current_rssi()
+ curr_angle = current_pred_angle # From the global updated by display thread
+
+ if cmd == 'y':
+ if curr_l is None:
+ print("No signal, cannot save.")
+ continue
+
+            # Self-training (pseudo-labeling):
+            # We assume that if the user said "Yes", the CURRENT PREDICTION is close to the truth,
+            # so we feed the prediction back into the model as ground truth to reinforce it.
+ print(f" Reinforcing: L={curr_l:.0f} R={curr_r:.0f} => {curr_angle:.0f} deg")
+ X_train.append([curr_l, curr_r])
+ y_train.append(curr_angle)
+ train_model()
+ print(f" Model Retrained! (Points: {len(X_train)})")
+
+ elif cmd == 'n':
+ print(" Discarded. (Bad Guess)")
+ # We do nothing, just don't learn from this moment.
+
+ elif cmd == 's':
+ uid = str(uuid.uuid4())[:8]
+ fname = os.path.join(CHECKPOINT_DIR, f"model_{uid}.pkl")
+ joblib.dump(ml_model, fname)
+ print(f"\n[SAVED] Checkpoint: {uid}")
+            print(f"To run: sudo python3 compass.py --ckpt {uid}")
+
+ elif cmd == 'q':
+ print("Exiting...")
+ is_running = False
+ break
+
+if __name__ == "__main__":
+ try:
+ main()
+ except KeyboardInterrupt:
+ is_running = False
+ print("\nStopped.")
diff --git a/Final/calibrate_spin.py b/Final/calibrate_spin.py
new file mode 100644
index 0000000000..fa5ddb62f5
--- /dev/null
+++ b/Final/calibrate_spin.py
@@ -0,0 +1,132 @@
+"""
+calibrate_spin.py
+
+Purpose:
+ - Measure the RSSI (Signal Strength) at a specific "proximity radius" from the beacon.
+ - Generates a configuration file 'checkpoints/spin.json'.
+
+Usage:
+ 1. Move the Pi (Receiver) to the distance where you want the "Spinning Celebration" to start.
+ 2. Press [ENTER] to capture the signal strength.
+ 3. Move to another spot at the same distance, Press [ENTER].
+ 4. Type 'done' to save and exit.
+"""
+
+import os
+import time
+import json
+import threading
+import statistics
+from scapy.all import sniff, Dot11, RadioTap
+
+# ---------- CONFIG ----------
+
+RECEIV_MAC = "2c:cf:67:73:fe:2c" # Beacon MAC
+LEFT_IFACE = "wlan1"
+RIGHT_IFACE = "wlan2"
+CHECKPOINT_DIR = "checkpoints"
+
+# ---------- STATE ----------
+
+lock = threading.Lock()
+# We use a tiny buffer just to get a stable reading for "this moment"
+rssi_recent = []
+
+captured_thresholds = []
+
+# ---------- UTILS ----------
+
+def ensure_root():
+ if os.geteuid() != 0:
+ raise SystemExit("!! MUST RUN AS ROOT (sudo) !!")
+
+def normalize_mac(mac):
+ return (mac or "").lower()
+
+TARGET_MAC = normalize_mac(RECEIV_MAC)
+
+def sniff_thread(iface, side):
+ def packet_handler(pkt):
+ global rssi_recent
+ if not pkt.haslayer(Dot11): return
+ if normalize_mac(pkt.addr2) == TARGET_MAC:
+ try:
+ val = int(pkt[RadioTap].dBm_AntSignal)
+ with lock:
+ rssi_recent.append(val)
+ if len(rssi_recent) > 20: # Keep only last 20 packets
+ rssi_recent.pop(0)
+ except: pass
+
+ print(f"[Sniffer] Listening on {iface}...")
+ sniff(iface=iface, prn=packet_handler, store=False)
+
+def get_strongest_signal():
+ """Returns the max RSSI seen recently (max is better than avg for proximity)."""
+ with lock:
+ if not rssi_recent: return None
+ return max(rssi_recent)
+
+# ---------- MAIN ----------
+
+def main():
+ ensure_root()
+ if not os.path.exists(CHECKPOINT_DIR): os.makedirs(CHECKPOINT_DIR)
+
+ # Start Sniffers
+ t1 = threading.Thread(target=sniff_thread, args=(LEFT_IFACE, "left"), daemon=True)
+ t2 = threading.Thread(target=sniff_thread, args=(RIGHT_IFACE, "right"), daemon=True)
+ t1.start(); t2.start()
+
+ print("\n" + "="*50)
+ print(" PROXIMITY (SPIN) CALIBRATION")
+ print("="*50)
+ print("Waiting for signal...")
+
+ while True:
+ if get_strongest_signal() is not None: break
+ time.sleep(0.5)
+
+ print("\nSignal found!")
+ print("INSTRUCTIONS:")
+ print("1. Stand at the EDGE of the circle where you want the 'Spin Mode' to trigger.")
+ print("2. Press [ENTER] to record the signal strength there.")
+ print("3. Type 'done' to save.")
+
+ while True:
+ user_input = input("\n[Press ENTER to Record / Type 'done' to Save] > ").strip().lower()
+
+ if user_input == 'done':
+ if not captured_thresholds:
+ print("No points recorded. Exiting without save.")
+ break
+
+ # Calculate Threshold
+ # We use the AVERAGE of the recorded points.
+ # If the signal is STRONGER (greater) than this, we spin.
+ avg_thresh = statistics.mean(captured_thresholds)
+
+            # Use the average directly as the threshold.
+            # (A small offset could be added here to bias the trigger radius slightly in or out.)
+            final_thresh = avg_thresh
+
+ data = {"spin_threshold_dbm": final_thresh}
+ path = os.path.join(CHECKPOINT_DIR, "spin.json")
+
+ with open(path, "w") as f:
+ json.dump(data, f)
+
+ print(f"\n[SAVED] Spin Threshold: {final_thresh:.1f} dBm")
+ print(f"Saved to: {path}")
+ break
+ else:
+ # Capture
+ strength = get_strongest_signal()
+ if strength is None:
+ print("!! Signal lost temporarily. Wait a second...")
+ continue
+
+ print(f" Recorded signal strength: {strength} dBm")
+ captured_thresholds.append(strength)
+
+if __name__ == "__main__":
+ main()
diff --git a/Final/compass.py b/Final/compass.py
new file mode 100644
index 0000000000..c409f99c12
--- /dev/null
+++ b/Final/compass.py
@@ -0,0 +1,251 @@
+"""
+compass.py
+
+Device: Raspberry Pi 5 (ROOT)
+Features:
+ - Fuses RSSI + Gyroscope (LSM6DS3) for smooth tracking.
+ - Supports ML Checkpoints (--ckpt UID).
+ - Supports Proximity Spin Mode (loads from checkpoints/spin.json).
+ - Broadcasts Angle/Spin state via UDP to the Receiver Pi.
+
+Run:
+    sudo python3 compass.py --ckpt <UID>
+"""
+
+import os
+import time
+import math
+import argparse
+import threading
+import joblib
+import json
+import socket
+from collections import deque
+import subprocess
+import board
+import digitalio
+from PIL import Image, ImageDraw, ImageFont
+from adafruit_rgb_display import st7789
+from adafruit_lsm6ds.lsm6ds3 import LSM6DS3
+from scapy.all import sniff, Dot11, RadioTap
+
+def preempt_boot_screen() -> None:
+ try:
+ subprocess.run(["sudo", "-n", "systemctl", "stop", "pitft-boot-screen.service"], check=False, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+ except Exception as e:
+ print(f"systemctl stop attempt error (ignored): {e}")
+ try:
+ subprocess.run(["pkill", "-TERM", "-f", "python.*screen_boot_script.py"], check=False)
+ time.sleep(0.8)
+ subprocess.run(["pkill", "-KILL", "-f", "python.*screen_boot_script.py"], check=False)
+ except Exception as e:
+ print(f"pkill fallback error (ignored): {e}")
+
+preempt_boot_screen()
+
+# ---------- ARGS ----------
+parser = argparse.ArgumentParser()
+parser.add_argument("--ckpt", type=str, help="Model UID to load")
+args = parser.parse_args()
+
+# ---------- CONFIG ----------
+RECEIV_MAC = "2c:cf:67:73:fe:2c"
+LEFT_IFACE = "wlan1"
+RIGHT_IFACE = "wlan2"
+CHECKPOINT_DIR = "checkpoints"
+BROADCAST_PORT = 55555
+
+# Fallback Physics
+DELTA_MAX_DB = 20.0
+MAX_ANGLE_DEG = 90.0
+ALPHA_RSSI = 0.05
+FPS = 30
+
+# ---------- STATE ----------
+lock = threading.Lock()
+raw_rssi_left = None
+raw_rssi_right = None
+last_packet_time = 0.0
+current_bearing = 0.0
+
+ml_model = None
+spin_threshold = None
+
+# ---------- NETWORK SETUP ----------
+udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+udp_sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
+
+# ---------- HARDWARE ----------
+def ensure_root():
+ if os.geteuid() != 0: raise SystemExit("Run as root.")
+
+def init_display():
+ cs_pin = digitalio.DigitalInOut(board.D5)
+ dc_pin = digitalio.DigitalInOut(board.D25)
+ spi = board.SPI()
+ disp = st7789.ST7789(spi, cs=cs_pin, dc=dc_pin, width=135, height=240, x_offset=53, y_offset=40)
+ width = disp.height; height = disp.width
+ image = Image.new("RGB", (width, height))
+ draw = ImageDraw.Draw(image)
+ bl = digitalio.DigitalInOut(board.D22)
+ bl.switch_to_output(); bl.value = True
+ try: font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", 18)
+ except: font = ImageFont.load_default()
+ return disp, image, draw, width, height, font
+
+def init_imu():
+ try:
+ i2c = board.I2C()
+ return LSM6DS3(i2c)
+ except: return None
+
+def calibrate_gyro(sensor):
+ print("[IMU] Calibrating Gyro (Keep Still)...")
+ offset = 0.0
+ for _ in range(50):
+ offset += sensor.gyro[2]
+ time.sleep(0.02)
+ return offset / 50.0
+
+# ---------- LOADING ----------
+if args.ckpt:
+ path = os.path.join(CHECKPOINT_DIR, f"model_{args.ckpt}.pkl")
+ if os.path.exists(path):
+ print(f"[ML] Loading model: {path}")
+ ml_model = joblib.load(path)
+
+spin_path = os.path.join(CHECKPOINT_DIR, "spin.json")
+if os.path.exists(spin_path):
+ try:
+ with open(spin_path, "r") as f:
+ data = json.load(f)
+ spin_threshold = data.get("spin_threshold_dbm")
+ print(f"[SPIN] Threshold loaded: {spin_threshold} dBm")
+ except: pass
+
+# ---------- SNIFFER ----------
+TARGET_MAC = (RECEIV_MAC or "").lower()
+
+def sniff_thread(iface, side):
+ global raw_rssi_left, raw_rssi_right, last_packet_time
+ def h(pkt):
+ global raw_rssi_left, raw_rssi_right, last_packet_time
+ if not pkt.haslayer(Dot11): return
+ if (pkt.addr2 or "").lower() == TARGET_MAC:
+ try:
+ val = pkt[RadioTap].dBm_AntSignal
+ with lock:
+ if side == "left":
+ raw_rssi_left = val if raw_rssi_left is None else (raw_rssi_left * 0.7 + val * 0.3)
+ else:
+ raw_rssi_right = val if raw_rssi_right is None else (raw_rssi_right * 0.7 + val * 0.3)
+ last_packet_time = time.time()
+ except: pass
+ sniff(iface=iface, prn=h, store=False)
+
+# ---------- MAIN LOOP ----------
+def main():
+ global current_bearing
+ ensure_root()
+ disp, image, draw, width, height, font = init_display()
+
+ t1 = threading.Thread(target=sniff_thread, args=(LEFT_IFACE, "left"), daemon=True)
+ t2 = threading.Thread(target=sniff_thread, args=(RIGHT_IFACE, "right"), daemon=True)
+ t1.start(); t2.start()
+
+ sensor = init_imu()
+ g_off = calibrate_gyro(sensor) if sensor else 0.0
+
+ print("[Main] Compass Running...")
+
+ last_t = time.time()
+
+ while True:
+ now = time.time()
+ dt = now - last_t
+ last_t = now
+
+ # 0. Check Proximity (Spin Logic)
+ is_spinning = False
+ with lock:
+ l = raw_rssi_left
+ r = raw_rssi_right
+
+ if l is not None and r is not None and spin_threshold is not None:
+ if max(l, r) > spin_threshold:
+ is_spinning = True
+
+ # 1. Update Bearing
+ if is_spinning:
+ current_bearing += 800.0 * dt # Spin speed
+ current_bearing %= 360
+ else:
+ # Gyro
+ if sensor:
+ gz = sensor.gyro[2] - g_off
+ current_bearing += math.degrees(gz * dt)
+
+ # RSSI
+ target_angle = 0.0
+ valid_signal = False
+
+ if l is not None and r is not None:
+ valid_signal = True
+ if ml_model:
+ try: target_angle = ml_model.predict([[l, r]])[0]
+ except: target_angle = 0
+ else:
+ delta = r - l
+ if delta > DELTA_MAX_DB: delta = DELTA_MAX_DB
+ if delta < -DELTA_MAX_DB: delta = -DELTA_MAX_DB
+ target_angle = (delta / DELTA_MAX_DB) * MAX_ANGLE_DEG
+
+ if valid_signal:
+ current_bearing = (current_bearing * (1.0 - ALPHA_RSSI)) + (target_angle * ALPHA_RSSI)
+ if (now - last_packet_time) > 2.0:
+ current_bearing *= 0.9
+ else:
+ current_bearing *= 0.9
+
+ # 2. Broadcast to RECEIV (Mirror)
+ try:
+ payload = {
+ "angle": current_bearing,
+ "spin": is_spinning
+ }
+ msg = json.dumps(payload).encode('utf-8')
+            # Send to the limited broadcast address so the Mirror Pi on the same LAN can hear it.
+            udp_sock.sendto(msg, ('255.255.255.255', BROADCAST_PORT))
+ except Exception as e:
+ pass
+
+ # 3. Draw Local (ROOT Screen)
+ draw.rectangle((0,0,width,height), fill=0)
+ cx, cy = width//2, height//2
+ rad = 50
+
+ color = (0, 255, 0) if is_spinning else (255, 0, 0)
+
+ draw.ellipse((cx-rad, cy-rad, cx+rad, cy+rad), outline=(100,100,100), width=2)
+ if not is_spinning:
+ draw.line((cx, cy-rad, cx, cy-rad+10), fill=(150,150,150), width=2)
+
+ a_rad = math.radians(current_bearing)
+ nx = cx + rad * math.sin(a_rad)
+ ny = cy - rad * math.cos(a_rad)
+ draw.line((cx,cy,nx,ny), fill=color, width=5 if is_spinning else 4)
+
+ if is_spinning:
+ draw.text((10, height-30), "ARRIVED!", font=font, fill=(0, 255, 0))
+ else:
+ mode = "ML" if ml_model else "MATH"
+ draw.text((5,5), mode, font=font, fill=(100,100,100))
+        if l is not None and r is not None: draw.text((5, height-20), f"{max(l,r):.0f}dB", font=font, fill=(255,255,255))
+
+ disp.image(image, 90)
+ time.sleep(1.0/FPS)
+
+if __name__ == "__main__":
+ try:
+ main()
+ except KeyboardInterrupt:
+ pass
diff --git a/Final/mirror.py b/Final/mirror.py
new file mode 100644
index 0000000000..6bd2eef356
--- /dev/null
+++ b/Final/mirror.py
@@ -0,0 +1,131 @@
+"""
+mirror.py
+
+Device: Raspberry Pi (RECEIV)
+Purpose:
+ - Listens for UDP broadcast from ROOT.
+ - Displays the OPPOSITE angle (Root Angle + 180 degrees).
+ - Matches the "Spin" state.
+ - Uses Pygame for HDMI display.
+
+Run: python3 mirror.py
+"""
+
+import socket
+import json
+import math
+import sys
+import pygame
+
+# ---------- CONFIG ----------
+PORT = 55555
+WIDTH, HEIGHT = 800, 480 # Default HDMI, adjusts if fullscreen
+FULLSCREEN = False
+
+# Colors
+COLOR_BG = (0, 0, 0)
+COLOR_CIRCLE = (100, 100, 100)
+COLOR_TICK = (150, 150, 150)
+COLOR_TEXT = (255, 255, 255)
+COLOR_NEEDLE_NORMAL = (255, 0, 0) # Red
+COLOR_NEEDLE_SPIN = (0, 255, 0) # Green
+
+# ---------- STATE ----------
+current_angle = 0.0
+is_spinning = False
+
+# ---------- NETWORK ----------
+def setup_socket():
+ s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+ s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+ s.bind(('', PORT))
+ s.setblocking(False)
+ return s
+
+# ---------- DRAWING ----------
+def draw_compass(screen, font):
+ screen.fill(COLOR_BG)
+ cx, cy = WIDTH // 2, HEIGHT // 2
+ radius = int(min(WIDTH, HEIGHT) * 0.35)
+
+ # Circle
+ pygame.draw.circle(screen, COLOR_CIRCLE, (cx, cy), radius, 3)
+
+ # Tick Mark
+ if not is_spinning:
+ pygame.draw.line(screen, COLOR_TICK, (cx, cy - radius), (cx, cy - radius + 20), 4)
+
+ # Needle Logic (OPPOSITE)
+ # If ROOT is 0 (North), Mirror is 180 (South).
+ # If ROOT is 90 (Right), Mirror is 270 (-90) (Left).
+ display_angle = current_angle + 180.0
+
+ angle_rad = math.radians(display_angle)
+ # 0 deg is UP (negative Y)
+ nx = cx + radius * math.sin(angle_rad)
+ ny = cy - radius * math.cos(angle_rad)
+
+ color = COLOR_NEEDLE_SPIN if is_spinning else COLOR_NEEDLE_NORMAL
+ width = 8 if is_spinning else 6
+
+ pygame.draw.line(screen, color, (cx, cy), (nx, ny), width)
+
+ # Text
+ label = "MIRROR MODE"
+ status_color = (100, 100, 100)
+ if is_spinning:
+ label = "ARRIVED!"
+ status_color = (0, 255, 0)
+
+ text_surf = font.render(label, True, status_color)
+ screen.blit(text_surf, (20, 20))
+
+ # Normalize for display reading (-180 to 180 range)
+ read_angle = (display_angle + 180) % 360 - 180
+ angle_text = font.render(f"{read_angle:.1f}", True, COLOR_TEXT)
+ screen.blit(angle_text, (20, 60))
+
+# ---------- MAIN ----------
+def main():
+ global current_angle, is_spinning, WIDTH, HEIGHT
+ pygame.init()
+
+ if FULLSCREEN:
+ screen = pygame.display.set_mode((0, 0), pygame.FULLSCREEN)
+ WIDTH, HEIGHT = screen.get_size()
+ else:
+ screen = pygame.display.set_mode((WIDTH, HEIGHT))
+
+ pygame.display.set_caption("Compass Mirror")
+ clock = pygame.time.Clock()
+ font = pygame.font.SysFont("dejavusans", 40)
+
+ sock = setup_socket()
+ print(f"[Mirror] Listening on port {PORT}...")
+
+ running = True
+ while running:
+ for event in pygame.event.get():
+ if event.type == pygame.QUIT: running = False
+ if event.type == pygame.KEYDOWN:
+ if event.key == pygame.K_ESCAPE: running = False
+
+ # Network Read
+ try:
+ while True:
+ data, addr = sock.recvfrom(1024)
+ payload = json.loads(data.decode('utf-8'))
+ current_angle = payload.get("angle", 0.0)
+ is_spinning = payload.get("spin", False)
+ except BlockingIOError: pass
+ except Exception: pass
+
+ draw_compass(screen, font)
+ pygame.display.flip()
+ clock.tick(30)
+
+ pygame.quit()
+ sys.exit()
+
+if __name__ == "__main__":
+ main()
diff --git a/Final/udp_checker.py b/Final/udp_checker.py
new file mode 100644
index 0000000000..fdb2113655
--- /dev/null
+++ b/Final/udp_checker.py
@@ -0,0 +1,82 @@
+"""
+udp_checker.py
+
+Device: Any Raspberry Pi
+Purpose:
+ - Listens for UDP broadcast from ROOT.
+ - Verifies that packets are being sent and received correctly.
+
+Run: python3 udp_checker.py --mode send|listen
+"""
+
+import socket
+import sys
+import time
+import argparse
+
+# Same port as your compass
+PORT = 55555
+
+def get_ip():
+ s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+ try:
+ # Doesn't actually connect, just determines route
+ s.connect(('10.255.255.255', 1))
+ IP = s.getsockname()[0]
+ except Exception:
+ IP = '127.0.0.1'
+ finally:
+ s.close()
+ return IP
+
+def run_sender():
+ print(f"--- SENDER MODE (Run this on Root Pi) ---")
+ print(f"My IP seems to be: {get_ip()}")
+ print(f"Broadcasting to :{PORT}...")
+
+ sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+ sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
+
+ count = 0
+ try:
+ while True:
+ msg = f"Ping {count}".encode('utf-8')
+ # Try generic broadcast
+ sock.sendto(msg, ('', PORT))
+ # Try explicit broadcast (often helps on Pi)
+ sock.sendto(msg, ('255.255.255.255', PORT))
+
+ print(f"Sent: Ping {count}")
+ count += 1
+ time.sleep(1)
+ except KeyboardInterrupt:
+ print("\nStopped.")
+
+def run_listener():
+ print(f"--- LISTENER MODE (Run this on Mirror Pi) ---")
+ print(f"My IP seems to be: {get_ip()}")
+ print(f"Listening on 0.0.0.0:{PORT}...")
+
+ sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+ sock.bind(('0.0.0.0', PORT))
+ sock.settimeout(5.0) # 5 second timeout
+
+ try:
+ while True:
+ try:
+ data, addr = sock.recvfrom(1024)
+ print(f"[SUCCESS] Received '{data.decode()}' from {addr}")
+ except socket.timeout:
+ print("[WAITING] No packets received in last 5s...")
+ except KeyboardInterrupt:
+ print("\nStopped.")
+
+if __name__ == "__main__":
+ parser = argparse.ArgumentParser()
+ parser.add_argument("--mode", choices=["send", "listen"], required=True, help="send or listen")
+ args = parser.parse_args()
+
+ if args.mode == "send":
+ run_sender()
+ else:
+ run_listener()
diff --git a/Lab 1/README.md b/Lab 1/README.md
index cbc6dfa745..e3f64b9797 100644
--- a/Lab 1/README.md
+++ b/Lab 1/README.md
@@ -1,8 +1,8 @@
-
+We decided to have our light interaction be part of a club. When someone enters, the light will flash white. When someone exits, the light will flash black. We have different conditions: if there are 0-5 people in the club, the light falls back to a certain color, and the same applies for 5-10, 10-15, and 15-20. We also have special conditions. Once the number of people inside reaches 20, the light's default becomes a strobe between red, green, and blue. Additionally, if people inside the club are raising their hands, the vibrancy of the light will adjust.
# Staging Interaction
-\*\***NAME OF COLLABORATOR HERE**\*\*
+\*\***Akashu Batu, Benthan Vu, Carrie Wang, Evan Fang, Sean Lewis, Xuesi Chen**\*\*
In the original stage production of Peter Pan, Tinker Bell was represented by a darting light created by a small handheld mirror off-stage, reflecting a little circle of light from a powerful lamp. Tinkerbell communicates her presence through this light to the other characters. See more info [here](https://en.wikipedia.org/wiki/Tinker_Bell).
@@ -74,14 +74,77 @@ The interactive device can be anything *except* a computer, a tablet computer or
\*\***Describe your setting, players, activity and goals here.**\*\*
+Setting: This interaction happens at the entrance to a club at night, since this is when clubs are generally active and lights are most conspicuous. It spans the club entrance/lobby and the main floor. Lighting cues must be legible outdoors at the door and indoors under show lighting.
+
+Players: Visitors (primary), door staff/security (secondary), floor staff/DJ (secondary), occasional third parties (delivery, first responders).
+
+Those who are involved in the interaction are the visitors of the club. Other people may include the security guard and staff of the club. However, the intention for the device is to be used by the visitors. Here is a list of the potential players in the setting:
+
+1.) The visitors/general audience (intended audience for interactive device)
+
+2.) The staff/security of the club who keep the club's operations running
+
+3.) Any third party (first responders, delivery workers, etc.)
+
+Activity: Entering/exiting triggers a flash cue at the entrance beacon; once inside, an ambient “default” color communicates occupancy. A “hands-up” gesture in the crowd increases vibrancy (brightness/saturation) of the current look.
+
+Goals:
+
+Visitors: immediate feedback that they’ve entered/exited; a fun ambient light that reflects crowd energy.
+
+Staff: glanceable read on occupancy and hype level; safe, non-blocking signals.
+
+Device: remain legible, avoid rapid flicker, fail safely.
+
+Light logic (spec for prototype + show bible):
+
+Entry: flash white (300 ms on, 200 ms off, 300 ms on), then return to default.
+
+Exit: flash black (turn off 250 ms, on 250 ms, off 250 ms), then return to default.
+
+Default color by occupancy (debounce counter with 1 s cooldown to prevent multiple beams on one person):
+
+0–5: no light — calm/space available.
+
+5–10: green — warming up.
+
+10–15: yellow — party energy.
+
+15–20: red — near capacity.
+
+≥20 (capacity mode): 1 Hz strobe cycling red → green → blue (RGB every 1 s; 50% duty). Post signage or provide a Staff Override (see Safety).
+
+Hands-up vibrancy (see the sketch after this spec): Let raised_ratio = (# hands up) / (estimated people inside).
+
+Brightness maps from 35% → 100% as raised_ratio goes 0 → 0.6 (clamp at 100%).
+
+Saturation maps from 70% → 100% over the same range.
+
+Smooth with EMA: vibrancy_t = 0.8 * vibrancy_{t-1} + 0.2 * vibrancy_target.
+
+Safety/overrides:
+Quiet mode: long-press staff button → freeze current hue @ 25% brightness (no strobe).
+Emergency: double-press staff button → solid red @ 100%, no animations, until cleared.
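+
+Below is a minimal sketch of the hands-up vibrancy mapping and EMA smoothing specified above (the breakpoints come from the spec; the example numbers and number of update ticks are assumptions):
+
+```python
+def vibrancy_targets(hands_up: int, people_inside: int):
+    """Map the crowd's raised-hands ratio to brightness/saturation targets (0-1)."""
+    raised_ratio = hands_up / max(people_inside, 1)
+    t = min(raised_ratio / 0.6, 1.0)           # 0 -> 0.6 maps to 0 -> 1, then clamp
+    brightness = 0.35 + t * (1.00 - 0.35)      # 35% -> 100%
+    saturation = 0.70 + t * (1.00 - 0.70)      # 70% -> 100%
+    return brightness, saturation
+
+def smooth(prev: float, target: float) -> float:
+    """EMA smoothing: vibrancy_t = 0.8 * vibrancy_{t-1} + 0.2 * vibrancy_target."""
+    return 0.8 * prev + 0.2 * target
+
+brightness = 0.35
+target_b, _ = vibrancy_targets(hands_up=6, people_inside=15)  # ratio 0.4 -> ~78% brightness target
+for _ in range(10):                                           # converges toward the target over ~10 ticks
+    brightness = smooth(brightness, target_b)
+print(round(brightness, 2))
+```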
+
Storyboards are a tool for visually exploring a users interaction with a device. They are a fast and cheap method to understand user flow, and iterate on a design before attempting to build on it. Take some time to read through this explanation of [storyboarding in UX design](https://www.smashingmagazine.com/2017/10/storyboarding-ux-design/). Sketch seven storyboards of the interactions you are planning. **It does not need to be perfect**, but must get across the behavior of the interactive device and the other characters in the scene.
\*\***Include pictures of your storyboards here**\*\*
+
+
+
+
+
+
+
+
+
Present your ideas to the other people in your breakout room (or in small groups). You can just get feedback from one another or you can work together on the other parts of the lab.
\*\***Summarize feedback you got here.**\*\*
+One feedback question was how the device worked or was activated. We illustrated some kind of "laser tripwire" from spy movies / kids' toy sections as the trigger for how the light knows when to activate. They then asked how the device knows enter vs. exit, and we suggested using two IR break beams (A then B = enter, B then A = exit) instead of one "laser tripwire," plus debounce to avoid tailgating double-counts. We also got accessibility feedback that color-only states are hard for some users (color vision deficiencies); they suggested picking high-contrast palettes, pairing hues with temporal patterns (steady vs. breathing), and keeping a legend at the door (we really liked this idea!). There was another comment about strobe safety due to epilepsy concerns, so we propose both a non-strobing RGB step/rotate alternative and a "No-Strobe" venue mode. The last, but very insightful, feedback was about detection for the hands-up sensing. People liked the idea but flagged privacy concerns if we used cameras, so we came up with some alternatives: a manual "Hype" slider for staff/DJ, a coarse overhead silhouette count, or letting the crowd tap the rope/rail to register hype.
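+
+Here is a minimal sketch of the two-beam enter/exit logic from that feedback (a hypothetical, hardware-free version: beam states arrive as booleans, and the 1 s debounce window is the cooldown from our spec):
+
+```python
+class DoorCounter:
+    """Two IR break beams: A (outside) then B (inside) = enter, B then A = exit."""
+
+    def __init__(self, debounce_s: float = 1.0):
+        self.debounce_s = debounce_s
+        self.last_event_time = float("-inf")
+        self.first_broken = None   # which beam was broken first in this crossing
+        self.occupancy = 0
+
+    def update(self, a_broken: bool, b_broken: bool, now: float) -> None:
+        if now - self.last_event_time < self.debounce_s:
+            return  # ignore tailgating double-counts
+        if self.first_broken is None:
+            if a_broken and not b_broken:
+                self.first_broken = "A"
+            elif b_broken and not a_broken:
+                self.first_broken = "B"
+        else:
+            # a crossing completes when the *other* beam breaks
+            if self.first_broken == "A" and b_broken:
+                self.occupancy += 1
+                self._finish(now)
+            elif self.first_broken == "B" and a_broken:
+                self.occupancy = max(0, self.occupancy - 1)
+                self._finish(now)
+
+    def _finish(self, now: float) -> None:
+        self.first_broken = None
+        self.last_event_time = now
+
+door = DoorCounter()
+door.update(a_broken=True,  b_broken=False, now=0.0)   # someone breaks the outer beam
+door.update(a_broken=False, b_broken=True,  now=0.3)   # then the inner beam -> entry
+print(door.occupancy)  # 1
+```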
+
## Part B. Act out the Interaction
@@ -89,8 +152,12 @@ Try physically acting out the interaction you planned. For now, you can just pre
\*\***Are there things that seemed better on paper than acted out?**\*\*
+Yes, we realized quickly that the entry/exit flashes were getting swallowed by ambient lobby light and felt too brief to notice in a crowd. We increased the white flash to a two-pulse pattern and added a short 200 ms black gap so it reads as “arrival” vs. “departure.” We also learned the occupancy color shifted too frequently during rushes; adding a 1 s debounce and a 5 s minimum dwell per range made the looks feel intentional rather than twitchy.
+
\*\***Are there new ideas that occur to you or your collaborator that come up from the acting?**\*\*
+Yes, we decided that we needed either more creative interactions or more creative responses. We decided to take this both ways. For a more creative interaction, we decided that we should keep the base interactions but add that if visitors raise their arms inside the club, the vibrancy of the colors being displayed at that moment should adjust. For a more creative response, we added that if there are 20 or more people in the club, then the light should start a "strobe" effect that changes between red, green, and blue every second.
+
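+A rough sketch of how those two responses could fit together (the threshold and colors come from the text above; `set_color`, the arms-up ratio, and the no-strobe flag are hypothetical inputs):
+
+```python
+import time
+
+STROBE_THRESHOLD = 20
+STROBE_COLORS = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]  # red, green, blue
+
+def update_lights(occupancy, arms_up_ratio, base_rgb, no_strobe_mode, set_color):
+    if occupancy >= STROBE_THRESHOLD and not no_strobe_mode:
+        # 20+ people: rotate red -> green -> blue once per second
+        set_color(STROBE_COLORS[int(time.time()) % 3])
+    else:
+        # otherwise adjust vibrancy with the share of raised arms (0..1)
+        vibrancy = 0.2 + 0.8 * arms_up_ratio
+        set_color(tuple(int(c * vibrancy) for c in base_rgb))
+```
+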
## Part C. Prototype the device
@@ -104,16 +171,27 @@ If you run into technical issues with this tool, you can also use a light switch
\*\***Give us feedback on Tinkerbelle.**\*\*
+Tinkerbelle is great, and congratulations to whoever developed it, but coming at this as a developer who used to love building things in Flask, I would recommend shifting it to a GitHub or Cloudflare Pages web app, since 90% of students probably aren't changing its source code. You can still keep the source code available and encourage students to play with it, but you shouldn't need to make them set up Python venvs etc. (even if the reqs are minimal) just to run the app. For our group, we ended up poking a hole through the firewall with a temporary Cloudflare tunnel to make it accessible to everyone else (since the school WiFi is pretty restrictive about which local web apps you can run). It could also easily be extended with some kind of "rooms" functionality so that multiple groups can use the app at the same time.
+
+TL;DR:
+
+- Accessibility: update the do-it-yourself Flask implementation to an instantly usable static Cloudflare Pages or GitHub web app.
+- Rooms/multi-session: add a “Create Room” flow with a 6-digit code so multiple groups can test simultaneously.
+
## Part D. Wizard the device
Take a little time to set up the wizarding set-up that allows for someone to remotely control the device while someone acts with it. Hint: You can use Zoom to record videos, and you can pin someone’s video feed if that is the scene which you want to record.
\*\***Include your first attempts at recording the set-up video here.**\*\*
+https://github.com/user-attachments/assets/0a4e45ad-4c9e-4194-9104-17312157bcc3
+
Now, change the goal within the same setting, and update the interaction with the paper prototype.
\*\***Show the follow-up work here.**\*\*
+https://github.com/user-attachments/assets/aa5b5205-edee-4be1-b460-c9f10be24fb1
## Part E. Costume the device
@@ -123,17 +201,63 @@ Think about the setting of the device: is the environment a place where the devi
\*\***Include sketches of what your devices might look like here.**\*\*
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
\*\***What concerns or opportunities are influencing the way you've designed the device to look?**\*\*
+One of our main concerns when designing the look of the device was how much of the light we wanted to show directly. We did some research into how lighting works at clubs and how these lights are set up. We ended up with different designs for blocking a direct view of the LEDs while keeping the light readable (without blinding passersby). We added a cardboard sleeve to diffuse the light into three bars, and also an aluminum foil wrapper (the light diffusion on the foil had quite the effect). We also created an acrylic glass prototype to really see how the light can influence the scene, and came up with some mounting prototypes so the device can attach to an eye-level surface. For aesthetics, we researched modern clubs and tried to emulate that dark, grungy, sleek, yet vibrant look.
+
## Part F. Record
\*\***Take a video of your prototyped interaction.**\*\*
+(Full video)
+
+https://github.com/user-attachments/assets/e651a4b7-4b95-41df-b12a-889222807931
+
+**Click on the image below to play the video on YouTube**
+**Interaction 1**
+[](https://youtu.be/8-qJKf9TseE)
+**Interaction 2**
+[](https://youtu.be/hpOWgV1T7n8)
+**Interaction 8**
+[](https://youtu.be/70Q2abh3BmU)
+**Music Credit: Seize the Day by Andrey Rossi**
+
\*\***Please indicate who you collaborated with on this Lab.**\*\*
Be generous in acknowledging their contributions! And also recognizing any other influences (e.g. from YouTube, Github, Twitter) that informed your design.
+\*\***The collaborators for this lab and their contributions are listed below**\*\*
+
+\*\***Akash Batu: Storyboards #1, #2, #3, #4, #5, Wizarding Tinkerbelle**\*\*
+\*\***Benthan Vu: Costume #1, Paper Prototype #1, Research & Feedback**\*\*
+
+\*\***Carrie Wang: Wizarding the Device, Research & Feedback**\*\*
+
+\*\***Evan Fang: Costume #2, Paper Prototype #2, Storyboard #8**\*\*
+
+\*\***Sean Lewis: Storyboards #6, #7, Setting up Tinkerbelle**\*\*
+
+\*\***Xuesi Chen: Costume #3, Paper Prototype #3, Demo Video Recording and Editing**\*\*
+
+\*\***All: Ideating, Research, Video Enactment, Communication**\*\*
# Staging Interaction, Part 2
@@ -154,3 +278,40 @@ Do last week’s assignment again, but this time:
3) We will be grading with an emphasis on creativity.
\*\***Document everything here. (Particularly, we would like to see the storyboard and video, although photos of the prototype are also great.)**\*\*
+
+We decided to set our light interaction in a park, and our creative addition is using computer vision as the input. When someone enters, the light flashes white; when someone exits, the light flashes black. We also have special conditions: if a fight breaks out inside the park, the lights flash red, and if an unattended child leaves the park, the lights also flash red. An alert sound plays during any red-flash event to better notify others (sketched below).
+
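+A tiny sketch of the event-to-output mapping we have in mind (the event names are placeholders for whatever the computer-vision pipeline reports; `flash` and `play_alert` are hypothetical helpers):
+
+```python
+WHITE, BLACK, RED = (255, 255, 255), (0, 0, 0), (255, 0, 0)
+
+def on_event(event: str, flash, play_alert) -> None:
+    """Map detected park events to light/sound responses."""
+    if event == "enter":
+        flash(WHITE)
+    elif event == "exit":
+        flash(BLACK)
+    elif event in ("fight_detected", "unattended_child_leaving"):
+        flash(RED)
+        play_alert()  # red-flash events also trigger the alert sound
+```
+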
+Storyboards are a tool for visually exploring a user's interaction with a device. They are a fast and cheap method to understand user flow, and iterate on a design before attempting to build on it. Take some time to read through this explanation of [storyboarding in UX design](https://www.smashingmagazine.com/2017/10/storyboarding-ux-design/). Sketch seven storyboards of the interactions you are planning. **It does not need to be perfect**, but must get across the behavior of the interactive device and the other characters in the scene.
+
+\*\***Include pictures of your storyboards here**\*\*
+
+
+
+\*\***Product Sketch & Physical Prototype:**\*\*
+
+Pillar Design
+
+
+Ground Light Design
+
+
+Other Design
+
+
+
+
+\*\***Videos:**\*\*
+
+
+https://github.com/user-attachments/assets/e3c67663-537a-460a-9344-a8d08dc73841
+
+
+https://github.com/user-attachments/assets/1bf9b39f-38d7-4608-a86c-87f9ff9026ba
+
+
+https://github.com/user-attachments/assets/d5f1902b-a91f-428e-a49e-90341c0f0cea
+
+
+
+
+
diff --git a/Lab 2/README.md b/Lab 2/README.md
index fdf299cbbf..486bc7148b 100644
--- a/Lab 2/README.md
+++ b/Lab 2/README.md
@@ -1,243 +1,183 @@
# Interactive Prototyping: The Clock of Pi
-**NAMES OF COLLABORATORS HERE**
+**Sean Hardesty Lewis (Solo)**
-Does it feel like time is moving strangely during this semester?
+Inspiration for my core project idea came from discussions around LLMs as “decision engines” and from recent LLM-as-judge research papers I reviewed as part of a conference.
+I also referenced GitHub repos on Raspberry Pi display clocks and various YouTube demos of PiTFT usage.
-For our first Pi project, we will pay homage to the [timekeeping devices of old](https://en.wikipedia.org/wiki/History_of_timekeeping_devices) by making simple clocks.
-
-It is worth spending a little time thinking about how you mark time, and what would be useful in a clock of your own design.
-
-**Please indicate anyone you collaborated with on this Lab here.**
-Be generous in acknowledging their contributions! And also recognizing any other influences (e.g. from YouTube, Github, Twitter) that informed your design.
+---
## Prep
-Lab Prep is extra long this week. Make sure to start this early for lab on Thursday.
-
-1. ### Set up your Lab 2 Github
-
-Before the start of lab Thursday, ensure you have the latest lab content by updating your forked repository.
-
-**📖 [Follow the step-by-step guide for safely updating your fork](pull_updates/README.md)**
-
-This guide covers how to pull updates without overwriting your completed work, handle merge conflicts, and recover if something goes wrong.
-
-
-2. ### Get Kit and Inventory Parts
-Prior to the lab session on Thursday, taken inventory of the kit parts that you have, and note anything that is missing:
-
-***Update your [parts list inventory](partslist.md)***
-
-3. ### Prepare your Pi for lab this week
-[Follow these instructions](prep.md) to download and burn the image for your Raspberry Pi before lab Thursday.
+### 1. Set up your Lab 2 Github
+Done!
+### 2. Get Kit and Inventory Parts
+Done!
+### 3. Prepare your Pi for lab this week
+Done!
+---
## Overview
-For this assignment, you are going to
+For this assignment, I connected to my Pi, ran the sample clock code, set up the RGB display, tested the demos, and then modified them. In Part 2, I created a conceptual and working prototype of a new clock: **VLT (Vision-Language-Time)**, where time itself is interpreted by a vision-language model running on the Raspberry Pi.
-A) [Connect to your Pi](#part-a)
+---
-B) [Try out cli_clock.py](#part-b)
+## Part A. Connect to your Pi
+Done!
-C) [Set up your RGB display](#part-c)
+---
-D) [Try out clock_display_demo](#part-d)
+## Part B. Try out the Command Line Clock
+Done!
-E) [Modify the code to make the display your own](#part-e)
+---
-F) [Make a short video of your modified barebones PiClock](#part-f)
+## Part C. Set up your RGB Display
+Done!
-G) [Sketch and brainstorm further interactions and features you would like for your clock for Part 2.](#part-g)
+---
-## The Report
-This readme.md page in your own repository should be edited to include the work you have done. You can delete everything but the headers and the sections between the \*\*\***stars**\*\*\*. Write the answers to the questions under the starred sentences. Include any material that explains what you did in this lab hub folder, and link it in the readme.
+## Part D. Set up the Display Clock Demo
+Done!
-Labs are due on Mondays. Make sure this page is linked to on your main class hub page.
+---
-## Part A.
-### Connect to your Pi
-Just like you did in the lab prep, ssh on to your pi. Once you get there, create a Python environment (named venv) by typing the following commands.
+## Part E. (Now moved to Lab 2 Part 2)
+Done!
-```
-ssh pi@
-...
-pi@raspberrypi:~ $ python -m venv venv
-pi@raspberrypi:~ $ source venv/bin/activate
-(venv) pi@raspberrypi:~ $
+---
-```
-### Setup Personal Access Tokens on GitHub
-Set your git name and email so that commits appear under your name.
-```
-git config --global user.name "Your Name"
-git config --global user.email "yourNetID@cornell.edu"
-```
+## Part F. (Now moved to Lab 2 Part 2)
+Done!
-The support for password authentication of GitHub was removed on August 13, 2021. That is, in order to link and sync your own lab-hub repo with your Pi, you will have to set up a "Personal Access Tokens" to act as the password for your GitHub account on your Pi when using git command, such as `git clone` and `git push`.
+---
-Following the steps listed [here](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) from GitHub to set up a token. Depends on your preference, you can set up and select the scopes, or permissions, you would like to grant the token. This token will act as your GitHub password later when you use the terminal on your Pi to sync files with your lab-hub repo.
+## Part G. Sketch and brainstorm further interactions and features
+For Part 2, I propose **Vision-Language-Time (VLT)**:
-## Part B.
-### Try out the Command Line Clock
-Clone your own lab-hub repo for this assignment to your Pi and change the directory to Lab 2 folder (remember to replace the following command line with your own GitHub ID):
+Instead of a standard digital or analog clock, the Raspberry Pi 5 runs a small VLM (e.g., Moondream/FastVLM scale) locally. As often as it can (given compute limitations), it captures an image from its camera and asks the VLM: *“What time is it?”* The model outputs its “perceived time,” which is displayed on the PiTFT screen along with the ground-truth system time.
-```
-(venv) pi@raspberrypi:~$ git clone https://github.com//Interactive-Lab-Hub.git
-(venv) pi@raspberrypi:~$ cd Interactive-Lab-Hub/Lab\ 2/
-```
-Depends on the setting, you might be asked to provide your GitHub user name and password. Remember to use the "Personal Access Tokens" you just set up as the password instead of your account one!
+We also log:
+- The image frame
+- The VLM’s predicted time
+- The true time
-Check if the directory has clone sucessfully, you should see the Interactive-Lab-Hub under the home directory listed:
-```
-(venv) pi@raspberrypi:~ $ ls
-Bookshelf Documents Music Public venv
-create_img.sh Downloads pi-apps screen_boot_script.py Videos
-Desktop Interactive-Lab-Hub Pictures Templates
-(venv) pi@raspberrypi:~ $
-```
+This lets us analyze accuracy afterwards (a small, illustrative snippet follows the questions below). We can tag images as “indoors” vs. “outdoors” (or other contextual tags) to see whether the environment, like artificial vs. natural light, affects performance.
+The questions we can explore:
+- How accurate is the VLM at telling time?
+- Are we ready to replace traditional timekeepers with AI perception?
+- Could trust in such a clock be measured in user studies?
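+As a rough idea of that post-hoc analysis, here is a minimal, illustrative snippet that reads the `vlt_logs/log.csv` produced by `screen_clock_vlm.py` (column names taken from that script) and reports the mean absolute error in minutes; the wrap-around handling at midnight is our own simplification:
+
+```python
+import csv
+from datetime import datetime
+
+errors = []
+with open("vlt_logs/log.csv") as f:
+    for row in csv.DictReader(f):
+        if row["ai_parsed"] in ("", "N/A"):
+            continue  # skip frames where no time could be parsed
+        ai_h, ai_m = map(int, row["ai_parsed"].split(":"))
+        real = datetime.strptime(row["system_time"], "%Y-%m-%d %H:%M:%S")
+        diff = abs((ai_h * 60 + ai_m) - (real.hour * 60 + real.minute))
+        errors.append(min(diff, 1440 - diff))  # wrap around midnight
+
+if errors:
+    print(f"{len(errors)} samples, mean abs error = {sum(errors) / len(errors):.1f} min")
+```
+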
-Install the packages from the requirements.txt and run the example script `cli_clock.py`:
+**Sketch**
+
-```
-(venv) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 2 $ pip install -r requirements.txt
-(venv) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 2 $ python cli_clock.py
-02/24/2021 11:20:49
-```
+The above image was made with AI as a double-meaning joke for "We trust AI for everything..."
-The terminal should show the time, you can press `ctrl-c` to exit the script.
-If you are unfamiliar with the Python code in `cli_clock.py`, have a look at [this Python refresher](https://hackernoon.com/intermediate-python-refresher-tutorial-project-ideas-and-tips-i28s320p). If you are still concerned, please reach out to the teaching staff!
+Here is my (hand-drawn!) image:
+
-## Part C.
-### Set up your RGB Display
-We have asked you to equip the [Adafruit MiniPiTFT](https://www.adafruit.com/product/4393) on your Pi in the Lab 2 prep already. Here, we will introduce you to the MiniPiTFT and Python scripts on the Pi with more details.
+---
-
+# Prep for Part 2
+Done!
-The Raspberry Pi 4 has a variety of interfacing options. When you plug the pi in the red power LED turns on. Any time the SD card is accessed the green LED flashes. It has standard USB ports and HDMI ports. Less familiar it has a set of 20x2 pin headers that allow you to connect a various peripherals.
+---
-
+# Lab 2 Part 2
-To learn more about any individual pin and what it is for go to [pinout.xyz](https://pinout.xyz/pinout/3v3_power) and click on the pin. Some terms may be unfamiliar but we will go over the relevant ones as they come up.
+## Modify the barebones clock to make it your own
-### Hardware (you have already done this in the prep)
+I created `screen_clock_vlm.py`, based on `screen_clock.py`, for our **VLT pipeline**. Instead of just printing the system time, the script captures an image via a connected webcam, passes it to the local VLM, and shows both the **predicted “AI time”** and the **real time** side by side.
-From your kit take out the display and the [Raspberry Pi 5](https://www.google.com/url?sa=i&url=https%3A%2F%2Fwww.raspberrypi.com%2Fproducts%2Fraspberry-pi-5%2F&psig=AOvVaw330s4wIQWfHou2Vk3-0jUN&ust=1757611779758000&source=images&cd=vfe&opi=89978449&ved=0CBMQjRxqFwoTCPi1-5_czo8DFQAAAAAdAAAAABAE)
+## Assignment that was formerly Lab 2 Part E.
+### Modify the barebones clock to make it your own
-Line up the screen and press it on the headers. The hole in the screen should match up with the hole on the raspberry pi.
+Does time have to be linear? How do you measure a year? [In daylights? In midnights? In cups of coffee?](https://www.youtube.com/watch?v=wsj15wPpjLY)
-
-
-
-
+Time is measured at the compute speed of our VLM/LLM, so roughly once every 10-20 seconds. We estimate the time from the current image's lighting conditions. Since this is fed through the VLM/LLM, we only get the "scraps" of whatever the quality and training of those models allow. But that is the point of the project itself: yielding control over something relatively simple (calculating the time) to a trained program.
-### Testing your Screen
+### Notice:
+**I had to modify the idea slightly, since VLMs tend to be trained on lots of images of clocks showing "10:10" or "12:12". As a result, prompts asking the VLM directly for the time almost always returned one of those two values. To fix this, I instead ask the VLM for what it is good at: a description of the image, specifically the lighting conditions. I then pass the VLM's description to a local Qwen 0.5B Instruct model running via Ollama on the RPi 5, which guesses the time for us.**
-The display uses a communication protocol called [SPI](https://www.circuitbasics.com/basics-of-the-spi-communication-protocol/) to speak with the raspberry pi. We won't go in depth in this course over how SPI works. The port on the bottom of the display connects to the SDA and SCL pins used for the I2C communication protocol which we will cover later. GPIO (General Purpose Input/Output) pins 23 and 24 are connected to the two buttons on the left. GPIO 22 controls the display backlight.
+Some examples of VLM failure (asking directly for time):
-To show you the IP and Mac address of the Pi to allow connecting remotely we created a service that launches a python script that runs on boot. For the following steps stop the service by typing ``` sudo systemctl stop piscreen.service --now```. Othwerise two scripts will try to use the screen at once. You may start it again by typing ``` sudo systemctl start piscreen.service --now```
+| Image | VLM Time | Actual Time |
+|-------|-----------------|----------------|
+|  | 10:10 | 16:11 |
+|  | 10:10 | 16:11 |
-We can test it by typing
+The exact prompt used on the VLM is:
```
-(venv) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 2 $ python screen_test.py
+Describe only observable lighting cues. Describe environment/sky/weather; natural light (direct vs diffuse, where it enters, sun patches/glare); shadows (presence, edge sharpness, relative length, direction); artificial lights (which sources are on, brightness low/medium/high, color warm/neutral/cool); overall brightness/exposure (very dark/dim/medium/bright, blown highlights, deep shadows, noise, motion blur); windows/openings and orientation hints; secondary clues (streetlights on, blinds/shades state, screen glow); brief caveats/confidence.
```
-You can type the name of a color then press either of the buttons on the MiniPiTFT to see what happens on the display! You can press `ctrl-c` to exit the script. Take a look at the code with
+The exact prompt used on the Instruct model is:
```
-(venv) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 2 $ cat screen_test.py
+You are a time estimator. Based ONLY on the following visual/lighting description, estimate the local clock time as HH:MM in 24-hour format. If uncertain, give your BEST plausible estimate. Output ONLY the time in the format HH:MM. No words, no seconds, no explanations.
+
+Description:
+{vlm_text}
+
+Answer:
```
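+For clarity, here is a stripped-down sketch of how the two stages are chained; the endpoints and payload shapes mirror `screen_clock_vlm.py` (local FastVLM server on port 17860, Ollama on 11434 with `qwen2.5:0.5b-instruct`), while the prompts are abbreviated and error handling, retries, and the display code are omitted:
+
+```python
+import json, urllib.request
+
+VLT_PROMPT = "Describe only observable lighting cues."  # abbreviated; full prompt above
+TIME_PROMPT = ("You are a time estimator. Based ONLY on the following description, "
+               "output the local clock time as HH:MM (24-hour) and nothing else.\n\n"
+               "Description:\n{desc}\n\nAnswer:\n")
+
+def post_json(url: str, payload: dict) -> dict:
+    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
+                                 headers={"Content-Type": "application/json"})
+    with urllib.request.urlopen(req, timeout=120) as resp:
+        return json.loads(resp.read())
+
+# Stage 1: FastVLM describes the lighting in the captured frame (path is a placeholder).
+desc = post_json("http://127.0.0.1:17860/infer",
+                 {"image": "vlt_logs/images/frame_latest.jpg",
+                  "prompt": VLT_PROMPT, "max_new_tokens": 50})["text"]
+
+# Stage 2: the local instruct model turns that description into an HH:MM guess.
+guess = post_json("http://127.0.0.1:11434/api/generate",
+                  {"model": "qwen2.5:0.5b-instruct",
+                   "prompt": TIME_PROMPT.format(desc=desc),
+                   "stream": False})["response"]
+print("AI time:", guess.strip())
+```
+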
-#### Displaying Info with Texts
-You can look in `screen_boot_script.py` for how to display text on the screen!
-
-#### Displaying an image
-
-You can look in `image.py` for an example of how to display an image on the screen. Can you make it switch to another image when you push one of the buttons?
-
-
-
-## Part D.
-### Set up the Display Clock Demo
-Work on `screen_clock.py`, try to show the time by filling in the while loop (at the bottom of the script where we noted "TODO" for you). You can use the code in `cli_clock.py` and `stats.py` to figure this out.
-
-### How to Edit Scripts on Pi
-Option 1. One of the ways for you to edit scripts on Pi through terminal is using [`nano`](https://linuxize.com/post/how-to-use-nano-text-editor/) command. You can go into the `screen_clock.py` by typing the follow command line:
-```
-(venv) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 2 $ nano screen_clock.py
-```
-You can make changes to the script this way, remember to save the changes by pressing `ctrl-o` and press enter again. You can press `ctrl-x` to exit the nano mode. There are more options listed down in the terminal you can use in nano.
-
-Option 2. Another way for you to edit scripts is to use VNC on your laptop to remotely connect your Pi. Try to open the files directly like what you will do with your laptop and edit them. Since the default OS we have for you does not come up a python programmer, you will have to install one yourself otherwise you will have to edit the codes with text editor. [Thonny IDE](https://thonny.org/) is a good option for you to install, try run the following command lines in your Pi's ternimal:
-
- ```
- pi@raspberrypi:~ $ sudo apt install thonny
- pi@raspberrypi:~ $ sudo apt update && sudo apt upgrade -y
- ```
-
-Now you should be able to edit python scripts with Thonny on your Pi.
-
-Option 3. A nowadays often preferred method is to use Microsoft [VS code to remote connect to the Pi](https://www.raspberrypi.com/news/coding-on-raspberry-pi-remotely-with-visual-studio-code/). This gives you access to a fullly equipped and responsive code editor with terminal and file browser.
-
-Pro Tip: Using tools like [code-server](https://coder.com/docs/code-server/latest) you can even setup a VS Code coding environment hosted on your raspberry pi and code through a web browser on your tablet or smartphone!
-
-## Part E. Now moved to Lab2 Part 2.
-
-## Part F. Now moved to Lab2 Part 2.
-
-## Part G.
-## Sketch and brainstorm further interactions and features you would like for your clock for Part 2.
-
+Can you make time interactive? You can look in `screen_test.py` for examples for how to use the buttons.
-# Prep for Part 2
+When you click on the buttons it will cycle (forward or back) through the time screen, the last image taken, and the last VLM description of the image (truncated for space).
-1. Pick up remaining parts for kit on Thursday lab class. Check the updated [parts list inventory](partslist.md) and let the TA know if there is any part missing.
-
-2. Look at and give feedback on the Part G. for at least 2 other people in the class (and get 2 people to comment on your Part G!)
+Please sketch/diagram your clock idea. (Try using a [Verplank diagram](https://ccrma.stanford.edu/courses/250a-fall-2004/IDSketchbok.pdf))!
-# Lab 2 Part 2
+My Verplank diagram:
+
-## Assignment that was formerly Lab 2 Part E.
-### Modify the barebones clock to make it your own
+*If we trust LLMs for everything else, why not for interpreting time itself?*
-Does time have to be linear? How do you measure a year? [In daylights? In midnights? In cups of coffee?](https://www.youtube.com/watch?v=wsj15wPpjLY)
+:)
-Can you make time interactive? You can look in `screen_test.py` for examples for how to use the buttons.
-
-Please sketch/diagram your clock idea. (Try using a [Verplank diagram](https://ccrma.stanford.edu/courses/250a-fall-2004/IDSketchbok.pdf))!
+**Code:**
+The following code files were added for this project:
+- **screen_clock_vlm.py**: the main program; it captures an image, captions it with the VLM, passes the caption to the local LLM, and then shows the predicted time and the real time (while saving the image, times, etc. in the background)
+- **fastvlm_server.mjs**: local FastVLM server for RPI5
-**We strongly discourage and will reject the results of literal digital or analog clock display.**
+To run, I recommend a venv for installing the requirements needed to run inference on the HF model. All you need to do is run `python screen_clock_vlm.py`. You can add an optional `-o` argument if you are running the VLM somewhere else (it is very slow on the RPi 5, about ~15 s per request; I ended up using a Cloudflare tunnel and my PC to make this much faster, and the code notes where you can input your tunnel URL).
+Note about AI usage: Copilot was used to help write the clock script. While I already had an example script from the official HF page, I used Copilot to draft the filter functions, the drawing-to-screen code (with some help from the screen_test.py script), and the logging.
+An early issue I encountered was screen_boot_script.py constantly running in the background, so this script preempts it to take control of the PiTFT screen.
-\*\*\***A copy of your code should be in your Lab 2 Github repo.**\*\*\*
+Please see videos below for the usage.
+---
-## Assignment that was formerly Part F.
## Make a short video of your modified barebones PiClock
-\*\*\***Take a video of your PiClock.**\*\*\*
+Video of VLT pipeline working:
-After you edit and work on the scripts for Lab 2, the files should be upload back to your own GitHub repo! You can push to your personal github repo by adding the files here, commiting and pushing.
-
-```
-(venv) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 2 $ git add .
-(venv) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 2 $ git commit -m 'your commit message here'
-(venv) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 2 $ git push
-```
+https://github.com/user-attachments/assets/7b9f53a8-2b60-4965-a5ec-6f75ebe499de
-After that, Git will ask you to login to your GitHub account to push the updates online, you will be asked to provide your GitHub user name and password. Remember to use the "Personal Access Tokens" you set up in Part A as the password instead of your account one! Go on your GitHub repo with your laptop, you should be able to see the updated files from your Pi!
+Video of VLT interaction with last image, last VLM output, time screen:
+https://github.com/user-attachments/assets/8d9d3851-ec70-4a5a-9037-bad438787a62
-[Update your Lab Hub](pull_updates/README.md) to get the latest content and requirements for Part 2.
+Some examples of usage:
-Modify the code from last week's lab to make a new visual interface for your new clock. You may [extend the Pi](Extending%20the%20Pi.md) by adding sensors or buttons, but this is not required.
-As always, make sure you document contributions and ideas from others explicitly in your writeup.
+| Image | VLM Description | Perceived Time | Actual Time |
+|-------|-----------------|----------------|-------------|
+|  | does not provide clear evidence of natural light sources like windows or outdoor elements, but the lack of brightness suggests it might be either an overexposed photograph or a room with minimal natural light. The overall impression is one of tranquility and stillness. As for the second part of your question, there are no discernible natural light indicators such as sunbeams, shadows, or glare that would confirm the presence of direct or indirect natural light in the room at the time the photo is taken. | 01:00 | 16:52 |
+|  | lighting being on, as there are no visible lights turned on. The windows are open, as evidenced by the visible curtain and the way the light is entering the room, but the curtains are drawn back, allowing for unobstructed light to enter. The sky outside is clear, with no visible clouds, suggesting fair weather conditions. | 15:00 | 16:52 |
+|  | high-rise buildings and smaller structures, possibly a downtown area. The orientation of the buildings and the angle of the shot suggest that this is a view from a high vantage point, such as a skyscraper or a tall building. The lack of any visible movement or activity in the scene implies a moment of stillness, perhaps early morning or late afternoon. | 06:00 | 16:54 |
+|  | sun provides ample light. The image does not show any motion blur, indicating that the camera was still when the photo is being taken. The windows' state is not entirely clear due to the angle and focus of the shot, but they do not appear to be open, as there is no visible gap between the window and the frame. | 09:30 | 16:55 |
+|  | with ambient indoor lighting. The lack of sharp shadows and the presence of soft edges throughout the image contribute to a calm and serene atmosphere. There is no indication of motion blur or other photographic effects, and the image does not provide any clues about the time of day beyond the general impression of daytime. The simplicity of the composition focuses attention on the texture of the material, rather than any specific environmental details. | 09:00 | 16:58 |
+|  | Not appear to be outdoors, given the lack of natural elements like trees or sky. The decor, including the posters, is consistent with a personal space, possibly a living room or a bedroom, where one might relax and enjoy the ambiance created by the lighting. The presence of the lamp and the style of the posters suggest a preference for a certain aesthetic or thematic decor, which could be reflective of the occupant's personal taste or interests. | 08:30 | 17:06 |
-You are permitted (but not required) to work in groups and share a turn in; you are expected to make equal contribution on any group work you do, and N people's group project should look like N times the work of a single person's lab. What each person did should be explicitly documented. Make sure the page for the group turn in is linked to your Interactive Lab Hub page.
+---
diff --git a/Lab 2/fastvlm_server.mjs b/Lab 2/fastvlm_server.mjs
new file mode 100644
index 0000000000..9946e18c2a
--- /dev/null
+++ b/Lab 2/fastvlm_server.mjs
@@ -0,0 +1,357 @@
+// ~/vlt/fastvlm_server.mjs
+//
+// FastVLM HTTP server (localhost only) using @huggingface/transformers + onnxruntime-web (WASM).
+// - HTTP API: GET /health, POST /infer, POST /shutdown
+// - Accepts local file paths and file:// URLs (reads via fs), or http(s) URLs.
+// - Clean 200/4xx/5xx responses; no stdin/stdout protocols.
+//
+// Setup (in project dir):
+// npm uninstall onnxruntime-node @xenova/transformers
+// npm install @huggingface/transformers onnxruntime-web
+//
+// Run (usually spawned by Python):
+// node fastvlm_server.mjs
+//
+// Env:
+// VLM_MODEL=onnx-community/FastVLM-0.5B-ONNX
+// VLM_PORT=17860
+// VLM_CLEAR_CACHE=1
+// HF_HOME=...
+
+import http from 'node:http';
+import process from 'node:process';
+import fs from 'node:fs/promises';
+import path from 'node:path';
+import os from 'node:os';
+import { URL, fileURLToPath } from 'node:url';
+
+function ts() { return new Date().toISOString(); }
+const QUIET = /^(1|true|yes)$/i.test(process.env.VLM_QUIET ?? '');
+const DEBUG = /^(1|true|yes)$/i.test(process.env.VLM_DEBUG ?? '');
+function log(...args) { if (!QUIET) console.error(`[${ts()}]`, ...args); }
+function warn(...args) { if (!QUIET) console.error(`[${ts()}] WARN:`, ...args); }
+function debug(...args) { if (DEBUG && !QUIET) console.error(`[${ts()}] DEBUG:`, ...args); }
+function fatal(...args) { console.error(`[${ts()}] FATAL:`, ...args); process.exit(1); }
+
+const IS_TTY = !!process.stderr.isTTY && !QUIET;
+
+
+// ---------------- deps ----------------
+try { await import('onnxruntime-web'); }
+catch (e) {
+ const msg = String(e?.message || e);
+ if (msg.includes("Cannot find package 'onnxruntime-web'")) {
+ fatal(
+ 'Missing dependency "onnxruntime-web". Fix with:\n' +
+ ' npm uninstall onnxruntime-node @xenova/transformers\n' +
+ ' npm install @huggingface/transformers onnxruntime-web'
+ );
+ }
+ fatal('Failed loading onnxruntime-web:', e?.stack || e);
+}
+
+let AutoProcessor, AutoModelForImageTextToText, RawImage, env;
+try {
+ ({ AutoProcessor, AutoModelForImageTextToText, RawImage, env } =
+ await import('@huggingface/transformers'));
+} catch (e) {
+ const msg = String(e?.message || e);
+ if (msg.includes("Cannot find package '@huggingface/transformers'")) {
+ fatal(
+ 'Missing dependency "@huggingface/transformers". Fix with:\n' +
+ ' npm uninstall onnxruntime-node @xenova/transformers\n' +
+ ' npm install @huggingface/transformers onnxruntime-web'
+ );
+ }
+ fatal('Failed to import @huggingface/transformers:', e?.stack || e);
+}
+
+// Soft-warn if native addon lingers
+try {
+ await import('onnxruntime-node').then(() => {
+ warn('"onnxruntime-node" is installed. We do not use it; remove to avoid conflicts:\n npm uninstall onnxruntime-node');
+ }).catch(() => {});
+} catch {}
+
+// ---------------- runtime config ----------------
+try {
+ env.backends.onnx.backend = 'wasm';
+ const cpuCount = (os.cpus()?.length ?? 4);
+ const threadsEnv = parseInt(process.env.VLM_THREADS || '', 10);
+ const threads = Number.isFinite(threadsEnv) && threadsEnv > 0
+ ? threadsEnv
+ : Math.max(1, Math.min(3, Math.floor(cpuCount / 2))); // gentler default
+ env.backends.onnx.wasm.numThreads = threads;
+ env.useBrowserCache = false;
+ env.allowRemoteModels = true;
+ log(`Backend configured: backend=wasm threads=${env.backends.onnx?.wasm?.numThreads ?? 'n/a'} cpus=${cpuCount}`);
+} catch (e) {
+ warn('Failed to set WASM backend params:', e?.message ?? e);
+}
+
+
+const MODEL_ID = process.env.VLM_MODEL ?? 'onnx-community/FastVLM-0.5B-ONNX';
+const PORT = parseInt(process.env.VLM_PORT || '17860', 10);
+const HOST = '127.0.0.1';
+const dtype = { embed_tokens: 'fp32', vision_encoder: 'fp32', decoder_model_merged: 'fp32' };
+
+// ---------------- helpers ----------------
+function startProgress(label = 'Loading', tickMs = 200) {
+ // Disable the spinner if not a TTY or if VLM_PROGRESS=0
+ if (!IS_TTY || /^(0|false|no)$/i.test(process.env.VLM_PROGRESS ?? '1')) {
+ const t0 = Date.now();
+ return { stop() {}, elapsedMs() { return Date.now() - t0; } };
+ }
+ let dots = 0;
+ const start = Date.now();
+ const timer = setInterval(() => {
+ dots = (dots + 1) % 10;
+ const bar = '█'.repeat(dots) + '-'.repeat(10 - dots);
+ const secs = ((Date.now() - start) / 1000).toFixed(1);
+ process.stderr.write(`\r[${ts()}] [${label}] [${bar}] ${secs}s elapsed`);
+ }, tickMs);
+ return { stop() { clearInterval(timer); process.stderr.write('\n'); },
+ elapsedMs() { return Date.now() - start; } };
+}
+
+
+async function purgeModelCacheIfRequested(modelId) {
+ if (!process.env.VLM_CLEAR_CACHE) return false;
+
+ const hfHome =
+ process.env.HF_HOME
+ || (process.env.HOME && path.join(process.env.HOME, '.cache', 'huggingface'))
+ || path.join(os.homedir(), '.cache', 'huggingface');
+
+ const bases = [
+ path.join(hfHome, 'transformers'),
+ path.join(process.cwd(), 'node_modules', '@huggingface', 'transformers', '.cache'),
+ ];
+ let removed = false;
+ for (const base of bases) {
+ const p1 = path.join(base, modelId.replaceAll('/', path.sep));
+ const p2 = path.join(p1, 'onnx');
+ for (const p of [p1, p2]) {
+ try { await fs.rm(p, { recursive: true, force: true }); removed = true; log('Purged cache:', p); }
+ catch {}
+ }
+ }
+ return removed;
+}
+
+function sendJson(res, status, body) {
+ const payload = JSON.stringify(body);
+ res.writeHead(status, {
+ 'Content-Type': 'application/json; charset=utf-8',
+ 'Content-Length': Buffer.byteLength(payload),
+ 'Cache-Control': 'no-store',
+ 'Connection': 'close',
+ 'X-Server': 'fastvlm-http',
+ });
+ res.end(payload);
+}
+
+async function readJsonBody(req, limit = 2 * 1024 * 1024) {
+ return new Promise((resolve, reject) => {
+ let size = 0; const chunks = [];
+ req.on('data', (c) => {
+ size += c.length;
+ if (size > limit) { reject(Object.assign(new Error('payload too large'), { code: 'ETOOBIG' })); req.destroy(); return; }
+ chunks.push(c);
+ });
+ req.on('end', () => {
+ try { resolve(JSON.parse(Buffer.concat(chunks).toString('utf8') || '{}')); }
+ catch { reject(Object.assign(new Error('invalid JSON'), { code: 'EBADJSON' })); }
+ });
+ req.on('error', reject);
+ });
+}
+
+function isHttpUrl(s) { return /^https?:\/\//i.test(s); }
+function isFileUrl(s) { return /^file:\/\//i.test(s); }
+function isLikelyPath(s) { return !/^[a-z]+:\/\//i.test(s); }
+
+async function loadRawImage(input) {
+ // Accept: http(s) URL, file:// URL, or plain filesystem path
+ try {
+ if (typeof input !== 'string') throw new Error('image must be a string');
+
+ if (isHttpUrl(input)) {
+ debug('[image] from HTTP(S) URL:', input);
+ return await RawImage.fromURL(input);
+ }
+
+ if (isFileUrl(input)) {
+ const p = fileURLToPath(input);
+ debug('[image] from file URL:', p);
+ return await RawImage.fromURL(p);
+ }
+
+ if (isLikelyPath(input)) {
+ const abs = path.resolve(input);
+ debug('[image] from path:', abs);
+ try { await fs.access(abs); } catch { throw new Error(`file not accessible: ${abs}`); }
+ return await RawImage.fromURL(abs);
+ }
+
+ debug('[image] from fallback URL-ish:', input);
+ return await RawImage.fromURL(input);
+ } catch (e) {
+ const msg = e?.message ?? String(e);
+ throw new Error(`loadRawImage failed: ${msg}`);
+ }
+}
+
+
+
+
+// ---------------- model globals ----------------
+let processor, model;
+let ready = false;
+let readyAt = null;
+let busy = false;
+let isShuttingDown = false;
+globalThis.__fatalLoadError = null;
+
+// ---------------- model load ----------------
+async function loadModel() {
+ log(`Boot: model="${MODEL_ID}" deviceWanted=cpu backend=wasm (pure JS via HF)`);
+ if (process.env.VLM_CLEAR_CACHE) {
+ const purged = await purgeModelCacheIfRequested(MODEL_ID);
+ if (purged) log('Note: cache purge requested and completed.');
+ }
+
+ const pb = startProgress('Loading');
+ try {
+ log('Stage 1: Loading processor...');
+ processor = await AutoProcessor.from_pretrained(MODEL_ID);
+ log('Stage 1: Processor loaded OK.');
+
+ log('Stage 2: Loading model (fp32, wasm backend, pure JS)...');
+ model = await AutoModelForImageTextToText.from_pretrained(MODEL_ID, { dtype, device: 'cpu' });
+
+ ready = true;
+ readyAt = new Date().toISOString();
+ pb.stop();
+ log(`Model ready on device=cpu backend=wasm in ${(pb.elapsedMs()/1000).toFixed(2)}s.`);
+ } catch (err) {
+ pb.stop();
+ const msg = String(err || '');
+ warn('Model load failed:', msg);
+ ready = false; readyAt = null;
+ globalThis.__fatalLoadError = msg;
+ }
+}
+
+// ---------------- HTTP server ----------------
+const server = http.createServer(async (req, res) => {
+ try {
+ const u = new URL(req.url, `http://${req.headers.host}`);
+ if (u.hostname !== 'localhost' && u.hostname !== '127.0.0.1') {
+ return sendJson(res, 403, { ok: false, error: 'forbidden' });
+ }
+
+ // Health
+ if (req.method === 'GET' && u.pathname === '/health') {
+ if (isShuttingDown) return sendJson(res, 503, { ok: false, ready: false, shutting_down: true });
+ if (ready) return sendJson(res, 200, { ok: true, ready: true, model: MODEL_ID, backend: 'wasm', device: 'cpu', ready_at: readyAt });
+ if (globalThis.__fatalLoadError) return sendJson(res, 500, { ok: false, ready: false, error: globalThis.__fatalLoadError });
+ return sendJson(res, 503, { ok: false, ready: false, stage: 'loading' });
+ }
+
+ // Inference
+ if (req.method === 'POST' && u.pathname === '/infer') {
+ if (isShuttingDown) return sendJson(res, 503, { ok: false, error: 'shutting down' });
+ if (!ready) return sendJson(res, 503, { ok: false, error: 'model not ready' });
+
+ let body;
+ try { body = await readJsonBody(req); }
+ catch (e) {
+ const code = e?.code === 'ETOOBIG' ? 413 : 400;
+ return sendJson(res, code, { ok: false, error: e?.message || 'bad request' });
+ }
+
+ const image = body?.image;
+ const prompt = (body?.prompt ?? 'Describe the image.');
+ const max_new_tokens_req = parseInt(body?.max_new_tokens || '', 10);
+ const max_new_tokens = Math.max(1, Math.min(256, Number.isFinite(max_new_tokens_req) ? max_new_tokens_req : 12));
+
+ if (!image || typeof image !== 'string') return sendJson(res, 400, { ok: false, error: 'image (string) is required' });
+ if (busy) return sendJson(res, 409, { ok: false, error: 'busy, try again' });
+
+ busy = true;
+
+ const marks = [];
+ const mark = (label) => marks.push([label, Date.now()]);
+ mark('start');
+
+ try {
+ debug('[infer] start', { image, max_new_tokens });
+
+ const imageObj = await loadRawImage(image);
+ mark('image_loaded');
+
+ const chat = [{ role: 'user', content: `${prompt}` }];
+ const templ = processor.apply_chat_template(chat, { add_generation_prompt: true });
+ mark('templated');
+
+ const inputs = await processor(imageObj, templ, { add_special_tokens: false });
+ mark('inputs_ready');
+
+ const outputs = await model.generate({
+ ...inputs,
+ max_new_tokens,
+ do_sample: false,
+ });
+ mark('generated');
+
+ const text = processor.batch_decode(
+ outputs.slice(null, [inputs.input_ids.dims.at(-1), null]),
+ { skip_special_tokens: true }
+ )?.[0] ?? '';
+ mark('decoded');
+
+ const dt_ms = marks.at(-1)[1] - marks[0][1];
+ debug('[infer] timings(ms):', Object.fromEntries(
+ marks.slice(1).map((m, i) => [m[0], m[1] - marks[i][1]])
+ ));
+
+ return sendJson(res, 200, { ok: true, text: String(text).trim(), dt_ms });
+ } catch (e) {
+ const emsg = String(e?.message || e);
+ if (/ENOENT|not exist|no such file/i.test(emsg)) {
+ return sendJson(res, 404, { ok: false, error: 'image not found' });
+ }
+ warn('INFER error:', e?.stack || e);
+ return sendJson(res, 500, { ok: false, error: emsg });
+ } finally {
+ busy = false;
+ }
+ }
+
+
+ // Shutdown
+ if (req.method === 'POST' && u.pathname === '/shutdown') {
+ isShuttingDown = true;
+ sendJson(res, 200, { ok: true, message: 'shutting down' });
+ setTimeout(() => server.close(() => process.exit(0)), 50);
+ return;
+ }
+
+ // Not found
+ return sendJson(res, 404, { ok: false, error: 'not found' });
+ } catch (e) {
+ warn('Request handling error:', e?.stack || e);
+ try { sendJson(res, 500, { ok: false, error: 'internal error' }); } catch {}
+ }
+});
+
+server.listen(PORT, HOST, () => {
+ log(`HTTP server listening on http://${HOST}:${PORT}`);
+ loadModel().catch((e) => warn('loadModel top-level error:', e?.stack || e));
+});
+
+process.on('SIGINT', () => { log('SIGINT'); server.close(() => process.exit(0)); });
+process.on('SIGTERM', () => { log('SIGTERM'); server.close(() => process.exit(0)); });
+process.on('uncaughtException', (e) => { console.error(e); process.exit(1); });
+process.on('unhandledRejection', (e) => { console.error(e); process.exit(1); });
diff --git a/Lab 2/screen_clock_vlm.py b/Lab 2/screen_clock_vlm.py
new file mode 100644
index 0000000000..f222cfcfca
--- /dev/null
+++ b/Lab 2/screen_clock_vlm.py
@@ -0,0 +1,448 @@
+import os, sys, csv, json, time, re, uuid, datetime, mimetypes, subprocess
+from pathlib import Path
+from typing import Optional, Tuple
+from datetime import datetime as _dt
+import textwrap
+from enum import IntEnum
+import threading
+import urllib.request, urllib.error
+
+ONLINE_MODE = any(a.lower() in ("online", "--online", "-o") for a in sys.argv[1:])
+ONLINE_BASE = os.environ.get("VLT_ONLINE_BASE", "https://source-maps-train-training.trycloudflare.com").rstrip("/") # you can replace with your own CF temporary tunnel or local URL to PC server
+
+PROJECT_DIR = Path(__file__).resolve().parent
+LOG_DIR = PROJECT_DIR / "vlt_logs"
+IMG_DIR = LOG_DIR / "images"
+LOG_FILE = LOG_DIR / "clock.log"
+CSV_FILE = LOG_DIR / "log.csv"
+LOG_DIR.mkdir(parents=True, exist_ok=True)
+IMG_DIR.mkdir(parents=True, exist_ok=True)
+
+MAX_NEW_TOKENS = int(os.environ.get("VLT_MAX_NEW_TOKENS", "50"))
+INFER_TIMEOUT_S = int(os.environ.get("VLT_INFER_TIMEOUT", "120"))
+
+OLLAMA_BASE = os.environ.get("OLLAMA_BASE", "http://127.0.0.1:11434").rstrip("/")
+OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "qwen2.5:0.5b-instruct")
+OLLAMA_TIMEOUT_S = int(os.environ.get("OLLAMA_TIMEOUT_S", "20"))
+
+CAPTURE_EVERY = int(os.environ.get("VLT_INTERVAL", "5"))
+CAM_INDEX = int(os.environ.get("VLT_CAM_INDEX", "0"))
+FRAME_W = int(os.environ.get("VLT_W", "640"))
+FRAME_H = int(os.environ.get("VLT_H", "480"))
+
+CS_PIN_NAME = os.environ.get("VLT_CS_PIN", "D5")
+DC_PIN_NAME = os.environ.get("VLT_DC_PIN", "D25")
+
+SERVER_HOST = "127.0.0.1"
+SERVER_PORT = int(os.environ.get("VLM_PORT", "17860"))
+SERVER_BASE = f"http://{SERVER_HOST}:{SERVER_PORT}"
+
+VLT_PROMPT = ("Describe only observable lighting cues. Describe environment/sky/weather; natural light (direct vs diffuse, where it enters, sun patches/glare); shadows (presence, edge sharpness, relative length, direction); artificial lights (which sources are on, brightness low/medium/high, color warm/neutral/cool); overall brightness/exposure (very dark/dim/medium/bright, blown highlights, deep shadows, noise, motion blur); windows/openings and orientation hints; secondary clues (streetlights on, blinds/shades state, screen glow); brief caveats/confidence.")
+
+def _ts() -> str:
+ return _dt.utcnow().isoformat()
+
+def log(msg: str) -> None:
+ line = f"[{_ts()}] {msg}"
+ print(line, flush=True)
+ try:
+ with open(LOG_FILE, "a") as f:
+ f.write(line + "\n")
+ except Exception:
+ pass
+
+def preempt_boot_screen() -> None:
+ try:
+ subprocess.run(["sudo", "-n", "systemctl", "stop", "pitft-boot-screen.service"], check=False, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+ except Exception as e:
+ log(f"systemctl stop attempt error (ignored): {e}")
+ try:
+ subprocess.run(["pkill", "-TERM", "-f", "python.*screen_boot_script.py"], check=False)
+ time.sleep(0.8)
+ subprocess.run(["pkill", "-KILL", "-f", "python.*screen_boot_script.py"], check=False)
+ except Exception as e:
+ log(f"pkill fallback error (ignored): {e}")
+
+preempt_boot_screen()
+
+import cv2 # type: ignore
+import digitalio # type: ignore
+import board # type: ignore
+from PIL import Image, ImageDraw, ImageFont # type: ignore
+import adafruit_rgb_display.st7789 as st7789 # type: ignore
+
+mode_str = "ONLINE" if ONLINE_MODE else "LOCAL"
+log(f"Launching VLT ({mode_str}): interval={CAPTURE_EVERY}s cam_index={CAM_INDEX} size={FRAME_W}x{FRAME_H} CS={CS_PIN_NAME} DC={DC_PIN_NAME}")
+log(f"Project: {PROJECT_DIR} Logs: {LOG_DIR}")
+if ONLINE_MODE: log(f"Online endpoint: {ONLINE_BASE}")
+log(f"Ollama: base={OLLAMA_BASE} model={OLLAMA_MODEL}")
+
+try:
+ cs_pin = digitalio.DigitalInOut(getattr(board, CS_PIN_NAME))
+ dc_pin = digitalio.DigitalInOut(getattr(board, DC_PIN_NAME))
+ reset_pin = None
+ BAUDRATE = 64_000_000
+ spi = board.SPI()
+ disp = st7789.ST7789(spi, cs=cs_pin, dc=dc_pin, rst=reset_pin, baudrate=BAUDRATE, width=135, height=240, x_offset=53, y_offset=40)
+ height = disp.width
+ width = disp.height
+ image = Image.new("RGB", (width, height))
+ rotation = 90
+ draw = ImageDraw.Draw(image)
+ backlight = digitalio.DigitalInOut(board.D22); backlight.switch_to_output(value=True)
+ FONT_SMALL = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", 18)
+ MONO = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf", 24)
+ buttonA = digitalio.DigitalInOut(board.D23); buttonB = digitalio.DigitalInOut(board.D24)
+ buttonA.switch_to_input(pull=digitalio.Pull.UP); buttonB.switch_to_input(pull=digitalio.Pull.UP)
+ log("Display initialized.")
+except Exception as e:
+ log(f"FATAL: Display init failed: {e}")
+ raise
+
+def clear_screen(): draw.rectangle((0, 0, width, height), outline=0, fill=(0, 0, 0))
+def put_text(x: int, y: int, text: str, font, color=(255,255,255)) -> int:
+ draw.text((x, y), text, font=font, fill=color)
+ bbox = draw.textbbox((x, y), text, font=font)
+ return bbox[3] - bbox[1]
+
+cap = cv2.VideoCapture(CAM_INDEX, cv2.CAP_V4L2)
+cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
+cap.set(cv2.CAP_PROP_FRAME_WIDTH, FRAME_W)
+cap.set(cv2.CAP_PROP_FRAME_HEIGHT, FRAME_H)
+if not cap.isOpened():
+ log("FATAL: Could not open webcam. Check /dev/video* and v4l2-ctl."); sys.exit(1)
+else:
+ log("Camera opened OK.")
+
+def save_frame(img_bgr) -> Path:
+ ts = datetime.datetime.utcnow().strftime("%Y%m%d_%H%M%S_%f")[:-3]
+ path = IMG_DIR / f"frame_{ts}.jpg"
+ ok = cv2.imwrite(str(path), img_bgr)
+ if not ok: log(f"WARNING: cv2.imwrite returned False for {path}")
+ return path
+
+server_proc = None
+server_log_fp = None
+
+if not ONLINE_MODE:
+ SERVER_CMD = ["node", str(PROJECT_DIR / "fastvlm_server.mjs")]
+ SERVER_LOG = LOG_DIR / "fastvlm_server.log"
+ server_log_fp = open(SERVER_LOG, "a", buffering=1, encoding="utf-8")
+ env = os.environ.copy(); env["VLM_PORT"] = str(SERVER_PORT)
+ env.setdefault("VLM_THREADS","2"); env.setdefault("VLM_QUIET","1"); env.setdefault("VLM_PROGRESS","0"); env.setdefault("VLM_DEBUG","1")
+ _preexec = (lambda: os.nice(10)) if hasattr(os, "nice") else None
+ try:
+ server_proc = subprocess.Popen(SERVER_CMD, cwd=str(PROJECT_DIR), stdin=subprocess.DEVNULL, stdout=server_log_fp, stderr=server_log_fp, text=True, bufsize=1, env=env, start_new_session=True, preexec_fn=_preexec)
+ log(f"Started FastVLM server pid={server_proc.pid} cmd={' '.join(SERVER_CMD)}"); log(f"Server logs -> {SERVER_LOG}")
+ except Exception as e:
+ log(f"FATAL: Could not start FastVLM server: {e}"); raise
+
+def _http_json(method: str, url: str, payload: Optional[dict]=None, timeout: float=10.0) -> Tuple[int, dict]:
+ data = None
+ if payload is not None: data = json.dumps(payload).encode("utf-8")
+ req = urllib.request.Request(url, data=data, method=method)
+ req.add_header("Content-Type", "application/json; charset=utf-8"); req.add_header("Accept", "application/json")
+ try:
+ with urllib.request.urlopen(req, timeout=timeout) as resp:
+ status = resp.getcode(); body = resp.read().decode("utf-8", "ignore")
+ try: return status, (json.loads(body) if body else {})
+ except Exception: return status, {"ok": False, "error": "invalid json from server"}
+ except urllib.error.HTTPError as e:
+ body = e.read().decode("utf-8", "ignore")
+ try: return e.code, (json.loads(body) if body else {"ok": False})
+ except Exception: return e.code, {"ok": False, "error": body or str(e)}
+ except urllib.error.URLError as e:
+ raise RuntimeError(f"HTTP error to {url}: {e}")
+
+def _http_multipart(url: str, fields: dict, files: dict, timeout: float = 60.0) -> Tuple[int, dict]:
+ boundary = "----VLTBoundary" + uuid.uuid4().hex; CRLF = b"\r\n"; body = bytearray()
+ for name, value in (fields or {}).items():
+ body.extend(b"--" + boundary.encode("ascii") + CRLF)
+ body.extend(f'Content-Disposition: form-data; name="{name}"'.encode("utf-8") + CRLF + CRLF)
+ body.extend(str(value).encode("utf-8") + CRLF)
+ for name, (filename, blob, ctype) in (files or {}).items():
+ ctype = ctype or mimetypes.guess_type(filename)[0] or "application/octet-stream"
+ body.extend(b"--" + boundary.encode("ascii") + CRLF)
+ headers = (f'Content-Disposition: form-data; name="{name}"; filename="{os.path.basename(filename)}"{CRLF.decode()}' f"Content-Type: {ctype}{CRLF.decode()}").encode("utf-8")
+ body.extend(headers + CRLF); body.extend(blob + CRLF)
+ body.extend(b"--" + boundary.encode("ascii") + b"--" + CRLF)
+ req = urllib.request.Request(url, data=bytes(body), method="POST")
+ req.add_header("Content-Type", f"multipart/form-data; boundary={boundary}"); req.add_header("Accept", "application/json")
+ try:
+ with urllib.request.urlopen(req, timeout=timeout) as resp:
+ status = resp.getcode(); body_text = resp.read().decode("utf-8", "ignore")
+ try: return status, (json.loads(body_text) if body_text else {})
+ except Exception: return status, {"ok": False, "error": "invalid json"}
+ except urllib.error.HTTPError as e:
+ body_text = e.read().decode("utf-8", "ignore")
+ try: return e.code, (json.loads(body_text) if body_text else {"ok": False})
+ except Exception: return e.code, {"ok": False, "error": body_text or str(e)}
+ except urllib.error.URLError as e:
+ raise RuntimeError(f"HTTP error to {url}: {e}")
+
+def wait_for_server_ready(deadline_s: int = 420) -> None:
+ t0 = time.time()
+ while True:
+ if server_proc and (server_proc.poll() is not None): raise RuntimeError("FastVLM server exited early. Check fastvlm_server.log.")
+ try:
+ status, body = _http_json("GET", f"{SERVER_BASE}/health", None, timeout=2.5)
+ if status == 200 and body.get("ok") and body.get("ready"):
+ log(f"Server READY: model={body.get('model')} backend={body.get('backend')} device={body.get('device')}"); return
+ except Exception: pass
+ if (time.time() - t0) > deadline_s: raise RuntimeError("Timeout waiting for VLM server to become ready")
+ time.sleep(0.5)
+
+def check_online_ready(timeout_s: int = 5) -> None:
+ url = f"{ONLINE_BASE}/health"
+ try:
+ status, body = _http_json("GET", url, None, timeout=timeout_s)
+ if status == 200 and body.get("ready") in (True, 1):
+ log(f"Online READY: model={body.get('model')} device={body.get('device')} dtype={body.get('dtype')} max_batch=64")
+ else:
+ log(f"Online health non-200 or not ready (status={status}): {body}")
+ except Exception as e: log(f"Online health check failed (ignored): {e}")
+
+if ONLINE_MODE: check_online_ready()
+else: wait_for_server_ready()
+
+def fastvlm_infer_local(p: Path, prompt: str, max_new_tokens: int = 24, timeout_s: int = 60) -> str:
+ deadline = time.time() + timeout_s; payload = {"image": str(p), "prompt": prompt, "max_new_tokens": max_new_tokens}; tries = 0
+ while True:
+ if time.time() > deadline: raise RuntimeError("Timeout waiting for inference response")
+ tries += 1
+ status, body = _http_json("POST", f"{SERVER_BASE}/infer", payload, timeout=timeout_s)
+ if status == 200 and body.get("ok"):
+ dt_ms = body.get("dt_ms");
+ if dt_ms is not None: log(f"Infer OK in {dt_ms} ms (try={tries})")
+ return (body.get("text") or "").strip()
+ if status in (503, 409): time.sleep(0.25); continue
+ raise RuntimeError(f"Server error {status}: {body.get('error') or body}")
+
+def fastvlm_infer_online(p: Path, prompt: str, max_new_tokens: int = 5, min_new_tokens: int = 1, timeout_s: int = 60) -> str:
+ t0 = time.time()
+ with open(p, "rb") as f: blob = f.read()
+ fields = {"prompt": prompt, "max_new_tokens": str(int(max_new_tokens)), "min_new_tokens": str(int(min_new_tokens))}
+ files = {"image": (p.name, blob, mimetypes.guess_type(p.name)[0] or "image/jpeg")}
+ status, body = _http_multipart(f"{ONLINE_BASE}/caption", fields, files, timeout=timeout_s)
+ if status == 200 and isinstance(body, dict) and ("caption" in body):
+ log(f"Online infer OK in {int((time.time()-t0)*1000)} ms"); return str(body.get("caption") or "").strip()
+ err = body.get("error") if isinstance(body, dict) else body
+ raise RuntimeError(f"Online server error {status}: {err}")
+
+def ollama_generate(prompt: str, model: Optional[str] = None, timeout_s: Optional[int] = None) -> str:
+ model = model or OLLAMA_MODEL; timeout_s = timeout_s or OLLAMA_TIMEOUT_S
+ status, body = _http_json("POST", f"{OLLAMA_BASE}/api/generate", {"model": model, "prompt": prompt, "stream": False}, timeout=float(timeout_s))
+ if status == 200 and isinstance(body, dict):
+ resp = body.get("response")
+ if isinstance(resp, str): return resp.strip()
+ raise RuntimeError(f"Ollama: missing 'response' in body: keys={list(body.keys())}")
+ err = body.get("error") or body.get("message") if isinstance(body, dict) else None
+ raise RuntimeError(f"Ollama error {status}: {err or body}")
+
+TIME_24H_RE = re.compile(r"\b([01]?\d|2[0-3]):([0-5]\d)(?::[0-5]\d)?\b")
+TIME_12H_RE = re.compile(r"\b(1[0-2]|0?[1-9]):([0-5]\d)\s*([AaPp][Mm])\b")
+WORDS_TO_TIME = {"noon": "12:00", "midday": "12:00", "mid-night": "00:00", "midnight": "00:00"}
+
+def parse_time_guess(text: str) -> Optional[str]:
+ t = (text or "").strip()
+ m = TIME_24H_RE.search(t)
+ if m: hh = int(m.group(1)); mm = m.group(2); return f"{hh:02d}:{mm}"
+ m = TIME_12H_RE.search(t)
+ if m:
+ h = int(m.group(1)); mm = m.group(2); ampm = m.group(3).lower()
+ if ampm == "pm" and h != 12: h += 12
+ if ampm == "am" and h == 12: h = 0
+ return f"{h:02d}:{mm}"
+ tl = t.lower()
+ for k, v in WORDS_TO_TIME.items():
+ if k in tl: return v
+ return None
+
+def now_local() -> datetime.datetime: return datetime.datetime.now()
+
+state_lock = threading.Lock()
+last_img_path: Optional[Path] = None
+last_vlm_text: str = ""
+vlm_raw = ""
+llm_raw = ""
+ai_parsed = ""
+last_infer_ts = 0.0
+
+def build_time_prompt(vlm_text: str) -> str:
+ return ("You are a time estimator. Based ONLY on the following visual/lighting description, estimate the local clock time as HH:MM in 24-hour format. If uncertain, give your BEST plausible estimate. Output ONLY the time in the format HH:MM. No words, no seconds, no explanations.\n\nDescription:\n"
+ f"{vlm_text.strip()}\n\nAnswer:\n")
+
+class Screen(IntEnum):
+ TIME = 0
+ IMAGE = 1
+ DESC = 2
+
+screen = Screen.TIME
+_last_nav_ts = 0.0
+_NAV_DEBOUNCE_S = 0.18
+
+def _nav_read() -> Tuple[bool, bool]:
+ return (buttonA.value == False, buttonB.value == False)
+
+def _render_full_image_to_buffer(img: Image.Image) -> None:
+ img_ratio = img.width / img.height; buf_ratio = width / height
+ if buf_ratio < img_ratio:
+ scaled_width = int(height * img_ratio); scaled_height = height
+ else:
+ scaled_width = width; scaled_height = int(width / img_ratio)
+ img = img.resize((scaled_width, scaled_height), Image.BICUBIC)
+ x = (scaled_width - width)//2; y = (scaled_height - height)//2
+ img = img.crop((x, y, x + width, y + height)); image.paste(img)
+
+def draw_time_screen(sys_time_str: str, ai_age_s: int) -> None:
+ clear_screen()
+ y = 4
+ with state_lock:
+ ap = ai_parsed or "N/A"
+ y += put_text(6, y, "VLT (AI):", FONT_SMALL, (255, 200, 0))
+ y += put_text(6, y, ap, MONO, (255, 255, 255))
+ y += 4
+ y += put_text(6, y, "System:", FONT_SMALL, (0, 200, 255))
+ y += put_text(6, y, sys_time_str.split(" ")[1], MONO, (180, 255, 255))
+ put_text(6, height - 22, f"{ai_age_s}s ago", FONT_SMALL, (150, 150, 150))
+
+def draw_image_screen() -> None:
+ clear_screen()
+ with state_lock:
+ p = last_img_path
+ if p and p.exists():
+ try:
+ with Image.open(p) as im:
+ im = im.convert("RGB"); _render_full_image_to_buffer(im)
+ except Exception as e:
+ clear_screen(); put_text(6, 6, f"Image error: {e}", FONT_SMALL, (255, 80, 80))
+ else:
+ put_text(6, 6, "No image yet.", FONT_SMALL, (200, 200, 200))
+
+def draw_desc_screen() -> None:
+ clear_screen()
+ with state_lock:
+ desc = (last_vlm_text or "(none)").strip()
+ put_text(6, 4, "VLM description:", FONT_SMALL, (255, 200, 0))
+ max_chars = max(12, width // 11)
+ wrapped = textwrap.wrap(desc, width=max_chars)
+ lines = wrapped[:6]; y = 28
+ for ln in lines:
+ y += put_text(6, y, ln, FONT_SMALL, (230, 230, 230))
+ if len(wrapped) > len(lines):
+ put_text(6, height - 20, "… (truncated)", FONT_SMALL, (140, 140, 140))
+
+def draw_screen(force: Optional[Screen] = None) -> None:
+ s = force if force is not None else screen
+ sys_time = now_local().strftime("%Y-%m-%d %H:%M:%S")
+ with state_lock:
+ age = int(time.time() - last_infer_ts) if last_infer_ts else 0
+ if s == Screen.TIME: draw_time_screen(sys_time, age)
+ elif s == Screen.IMAGE: draw_image_screen()
+ else: draw_desc_screen()
+ try: disp.image(image, rotation)
+ except Exception as e: log(f"WARNING: disp.image failed: {e}")
+
+def _handle_nav_and_redraw() -> None:
+ global screen, _last_nav_ts
+ a, b = _nav_read()
+ now = time.time()
+ if (now - _last_nav_ts) < _NAV_DEBOUNCE_S: return
+ changed = False
+ if a and not b:
+ screen = Screen((screen - 1) % 3); changed = True
+ elif b and not a:
+ screen = Screen((screen + 1) % 3); changed = True
+ if changed:
+ _last_nav_ts = now
+ draw_screen()
+
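+# Background worker: capture a frame, caption it with the VLM, ask Ollama for a time estimate,
+# parse the HH:MM guess, update the shared state, and append a row to the CSV log.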
+def worker_loop(stop_evt: threading.Event):
+ global last_img_path, last_vlm_text, vlm_raw, llm_raw, ai_parsed, last_infer_ts
+ if not CSV_FILE.exists():
+ with open(CSV_FILE, "w", newline="") as f:
+ csv.writer(f).writerow(["mode","utc_captured","system_time","vlm_raw","llm_raw","ai_parsed","image_path"])
+ next_capture = 0.0
+ while not stop_evt.is_set():
+ if time.time() >= next_capture:
+ ok, frame = cap.read()
+ log(f"Camera read ok={ok} shape={None if not ok else frame.shape}")
+ if ok:
+ img_path = save_frame(frame)
+ with state_lock:
+ last_img_path = img_path
+ log(f"Saved frame to {img_path}")
+ try:
+ log(f"Sending prompt to VLM: {VLT_PROMPT!r}")
+ if ONLINE_MODE:
+ vtxt = fastvlm_infer_online(img_path, VLT_PROMPT, max_new_tokens=256, min_new_tokens=196, timeout_s=INFER_TIMEOUT_S)
+ else:
+ vtxt = fastvlm_infer_local(img_path, VLT_PROMPT, max_new_tokens=MAX_NEW_TOKENS, timeout_s=INFER_TIMEOUT_S)
+ log(f"VLM raw: {vtxt}")
+ llm_prompt = build_time_prompt(vtxt or "")
+ log(f"Ollama prompt (truncated to 200 chars): {llm_prompt[:200].replace(os.linesep,' ')}...")
+ ltxt = ollama_generate(llm_prompt, model=OLLAMA_MODEL, timeout_s=OLLAMA_TIMEOUT_S)
+ log(f"Ollama raw: {ltxt}")
+ aparsed = parse_time_guess(ltxt) or "N/A"
+ log(f"Parsed: {aparserd}" if False else f"Parsed: {aparsed}") # keep format; avoid linter
+ with state_lock:
+ vlm_raw = vtxt or ""
+ llm_raw = ltxt or ""
+ ai_parsed = aparsed
+ last_vlm_text = vlm_raw
+ last_infer_ts = time.time()
+ with open(CSV_FILE, "a", newline="") as f:
+ csv.writer(f).writerow([mode_str, datetime.datetime.utcnow().isoformat(timespec="seconds"), now_local().strftime("%Y-%m-%d %H:%M:%S"), vlm_raw, llm_raw, ai_parsed, str(img_path)])
+ if screen != Screen.TIME:
+ draw_screen()
+ except Exception as e:
+ log(f"ERROR during infer: {e}")
+ with state_lock:
+ llm_raw = f"ERR: {e}"
+ ai_parsed = "N/A"
+ last_infer_ts = time.time()
+ else:
+ log("ERROR: camera returned no frame")
+ with state_lock:
+ ai_parsed = "N/A"
+ last_infer_ts = time.time()
+ next_capture = time.time() + CAPTURE_EVERY
+ else:
+ time.sleep(0.02)
+
+stop_event = threading.Event()
+worker = threading.Thread(target=worker_loop, args=(stop_event,), daemon=True)
+worker.start()
+
+try:
+ while True:
+ _handle_nav_and_redraw()
+ if screen == Screen.TIME:
+ draw_screen(Screen.TIME)
+ time.sleep(0.02)
+except KeyboardInterrupt:
+ log("KeyboardInterrupt: exiting...")
+except Exception as e:
+ log(f"FATAL (main loop): {e}"); raise
+finally:
+ try:
+ stop_event.set()
+ worker.join(timeout=2.0)
+ except Exception: pass
+ try:
+ cap.release(); log("Camera released.")
+ except Exception as e:
+ log(f"Camera release error (ignored): {e}")
+ if not ONLINE_MODE:
+ try: _http_json("POST", f"{SERVER_BASE}/shutdown", {}, timeout=2.5)
+ except Exception: pass
+ try:
+ if server_proc and (server_proc.poll() is None):
+ server_proc.terminate()
+ try: server_proc.wait(3)
+ except subprocess.TimeoutExpired: server_proc.kill()
+ log("FastVLM server terminated.")
+ except Exception as e: log(f"Server terminate error (ignored): {e}")
+ try:
+ if server_log_fp: server_log_fp.close()
+ except Exception: pass
diff --git a/Lab 3/README.md b/Lab 3/README.md
index 25c6970386..142c5abbff 100644
--- a/Lab 3/README.md
+++ b/Lab 3/README.md
@@ -1,311 +1,201 @@
# Chatterboxes
-**NAMES OF COLLABORATORS HERE**
-[](https://www.youtube.com/embed/Q8FWzLMobx0?start=19)
+**Sean Hardesty Lewis (Solo)**
-In this lab, we want you to design interaction with a speech-enabled device--something that listens and talks to you. This device can do anything *but* control lights (since we already did that in Lab 1). First, we want you first to storyboard what you imagine the conversational interaction to be like. Then, you will use wizarding techniques to elicit examples of what people might say, ask, or respond. We then want you to use the examples collected from at least two other people to inform the redesign of the device.
+
-We will focus on **audio** as the main modality for interaction to start; these general techniques can be extended to **video**, **haptics** or other interactive mechanisms in the second part of the Lab.
+
-## Prep for Part 1: Get the Latest Content and Pick up Additional Parts
-
-Please check instructions in [prep.md](prep.md) and complete the setup before class on Wednesday, Sept 23rd.
-
-### Pick up Web Camera If You Don't Have One
-
-Students who have not already received a web camera will receive their [Logitech C270 Webcam](https://www.amazon.com/Logitech-Desktop-Widescreen-Calling-Recording/dp/B004FHO5Y6/ref=sr_1_3?crid=W5QN79TK8JM7&dib=eyJ2IjoiMSJ9.FB-davgIQ_ciWNvY6RK4yckjgOCrvOWOGAG4IFaH0fczv-OIDHpR7rVTU8xj1iIbn_Aiowl9xMdeQxceQ6AT0Z8Rr5ZP1RocU6X8QSbkeJ4Zs5TYqa4a3C_cnfhZ7_ViooQU20IWibZqkBroF2Hja2xZXoTqZFI8e5YnF_2C0Bn7vtBGpapOYIGCeQoXqnV81r2HypQNUzFQbGPh7VqjqDbzmUoloFA2-QPLa5lOctA.L5ztl0wO7LqzxrIqDku9f96L9QrzYCMftU_YeTEJpGA&dib_tag=se&keywords=webcam%2Bc270&qid=1758416854&sprefix=webcam%2Bc270%2Caps%2C125&sr=8-3&th=1) and bluetooth speaker on Wednesday at the beginning of lab. If you cannot make it to class this week, please contact the TAs to ensure you get these.
+Inspiration for my core project idea came from discussions around contextual privacy and the ongoing controversy around facial recognition. It shares many similarities with a recent Harvard project and other face recognition controversies (e.g., Clearview AI); more details appear further down in this README. **Please refer to [this document](https://docs.google.com/document/d/1iWCqmaOUKhKjcKSktIwC3NNANoFP7vPsRvcbOIup_BA/preview?tab=t.0) to learn more about how to protect your privacy.**
-### Get the Latest Content
+
-As always, pull updates from the class Interactive-Lab-Hub to both your Pi and your own GitHub repo. There are 2 ways you can do so:
-
-**\[recommended\]**Option 1: On the Pi, `cd` to your `Interactive-Lab-Hub`, pull the updates from upstream (class lab-hub) and push the updates back to your own GitHub repo. You will need the *personal access token* for this.
-
-```
-pi@ixe00:~$ cd Interactive-Lab-Hub
-pi@ixe00:~/Interactive-Lab-Hub $ git pull upstream Fall2025
-pi@ixe00:~/Interactive-Lab-Hub $ git add .
-pi@ixe00:~/Interactive-Lab-Hub $ git commit -m "get lab3 updates"
-pi@ixe00:~/Interactive-Lab-Hub $ git push
-```
+## Prep for Part 1: Get the Latest Content and Pick up Additional Parts
-Option 2: On your your own GitHub repo, [create pull request](https://github.com/FAR-Lab/Developing-and-Designing-Interactive-Devices/blob/2022Fall/readings/Submitting%20Labs.md) to get updates from the class Interactive-Lab-Hub. After you have latest updates online, go on your Pi, `cd` to your `Interactive-Lab-Hub` and use `git pull` to get updates from your own GitHub repo.
+Done!
## Part 1.
### Setup
-Activate your virtual environment
-
-```
-pi@ixe00:~$ cd Interactive-Lab-Hub
-pi@ixe00:~/Interactive-Lab-Hub $ cd Lab\ 3
-pi@ixe00:~/Interactive-Lab-Hub/Lab 3 $ python3 -m venv .venv
-pi@ixe00:~/Interactive-Lab-Hub $ source .venv/bin/activate
-(.venv)pi@ixe00:~/Interactive-Lab-Hub $
-```
-
-Run the setup script
-```(.venv)pi@ixe00:~/Interactive-Lab-Hub $ pip install -r requirements.txt ```
-
-Next, run the setup script to install additional text-to-speech dependencies:
-```
-(.venv)pi@ixe00:~/Interactive-Lab-Hub/Lab 3 $ ./setup.sh
-```
+Done!
### Text to Speech
-In this part of lab, we are going to start peeking into the world of audio on your Pi!
-
-We will be using the microphone and speaker on your webcamera. In the directory is a folder called `speech-scripts` containing several shell scripts. `cd` to the folder and list out all the files by `ls`:
-
-```
-pi@ixe00:~/speech-scripts $ ls
-Download festival_demo.sh GoogleTTS_demo.sh pico2text_demo.sh
-espeak_demo.sh flite_demo.sh lookdave.wav
-```
-
-You can run these shell files `.sh` by typing `./filename`, for example, typing `./espeak_demo.sh` and see what happens. Take some time to look at each script and see how it works. You can see a script by typing `cat filename`. For instance:
-
-```
-pi@ixe00:~/speech-scripts $ cat festival_demo.sh
-#from: https://elinux.org/RPi_Text_to_Speech_(Speech_Synthesis)#Festival_Text_to_Speech
-```
-You can test the commands by running
-```
-echo "Just what do you think you're doing, Dave?" | festival --tts
-```
-
-Now, you might wonder what exactly is a `.sh` file?
-Typically, a `.sh` file is a shell script which you can execute in a terminal. The example files we offer here are for you to figure out the ways to play with audio on your Pi!
-
-You can also play audio files directly with `aplay filename`. Try typing `aplay lookdave.wav`.
-
-\*\***Write your own shell file to use your favorite of these TTS engines to have your Pi greet you by name.**\*\*
-(This shell file should be saved to your own repo for this lab.)
-
----
-Bonus:
-[Piper](https://github.com/rhasspy/piper) is another fast neural based text to speech package for raspberry pi which can be installed easily through python with:
-```
-pip install piper-tts
-```
-and used from the command line. Running the command below the first time will download the model, concurrent runs will be faster.
-```
-echo 'Welcome to the world of speech synthesis!' | piper \
- --model en_US-lessac-medium \
- --output_file welcome.wav
-```
-Check the file that was created by running `aplay welcome.wav`. Many more languages are supported and audio can be streamed dirctly to an audio output, rather than into an file by:
-
-```
-echo 'This sentence is spoken first. This sentence is synthesized while the first sentence is spoken.' | \
- piper --model en_US-lessac-medium --output-raw | \
- aplay -r 22050 -f S16_LE -t raw -
-```
+Done!
+
+My script is `speech-scripts/greet_shawn.sh`. I like the emphasis on the h so I made it say "Shawn" instead of "Sean".
### Speech to Text
-Next setup speech to text. We are using a speech recognition engine, [Vosk](https://alphacephei.com/vosk/), which is made by researchers at Carnegie Mellon University. Vosk is amazing because it is an offline speech recognition engine; that is, all the processing for the speech recognition is happening onboard the Raspberry Pi.
+Done!
-Make sure you're running in your virtual environment with the dependencies already installed:
-```
-source .venv/bin/activate
-```
+My script is `speech-scripts/check_words_example/number_input.sh`. I re-used the test_words.py from the vosk demo and it saves the answer to `number_input.txt`.
-Test if vosk works by transcribing text:
+### 🤖 NEW: AI-Powered Conversations with Ollama
-```
-vosk-transcriber -i recorded_mono.wav -o test.txt
-```
+My simple voice interaction is a "voice calculator". I have always found TI-84s and similar calculators to be nightmares to use since they have so many buttons: I can often articulate what I want in my head, but it takes minutes to find each button and press them in the right order. My voice interaction aims to solve that. You ask your question much like with the numeric input system from earlier; the backend then performs the calculation (or, in this case, leans on Ollama's trained knowledge and hopes it is correct) and uses text-to-speech to give the result back to the user.
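+
+A minimal sketch of the middle and final steps, assuming the question has already been transcribed (for example, by the Vosk script above) and that `phi3:mini` is pulled in Ollama; the model name and the `espeak` call here are illustrative choices, not necessarily what ran on my Pi:
+
+```python
+# voice_calculator_sketch.py: hypothetical sketch of the calculator backend, not my exact script.
+import subprocess
+import requests
+
+OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint
+
+def ask_math(question: str) -> str:
+    """Send the transcribed question to Ollama and return its text answer."""
+    prompt = f"Answer this arithmetic question in one short sentence: {question}"
+    resp = requests.post(
+        OLLAMA_URL,
+        json={"model": "phi3:mini", "prompt": prompt, "stream": False},
+        timeout=60,
+    )
+    return resp.json().get("response", "Sorry, I could not work that out.").strip()
+
+def speak(text: str) -> None:
+    """Speak the answer aloud (espeak here; any TTS engine from this lab would work)."""
+    subprocess.run(["espeak", text], check=False)
+
+if __name__ == "__main__":
+    # In the real pipeline this string comes from the speech-to-text step.
+    question = "What is fifty divided by five?"
+    answer = ask_math(question)
+    print(answer)
+    speak(answer)
+```
+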
-You can use vosk with the microphone by running
-```
-python test_microphone.py -m en
-```
+### Serving Pages
----
-Bonus:
-[Whisper](https://openai.com/index/whisper/) is a neural network–based speech-to-text (STT) model developed and open-sourced by OpenAI. Compared to Vosk, Whisper generally achieves higher accuracy, particularly on noisy audio and diverse accents. It is available in multiple model sizes; for edge devices such as the Raspberry Pi 5 used in this class, the tiny.en model runs with reasonable latency even without a GPU.
+Done!
-By contrast, Vosk is more lightweight and optimized for running efficiently on low-power devices like the Raspberry Pi. The choice between Whisper and Vosk depends on your scenario: if you need higher accuracy and can afford slightly more compute, Whisper is preferable; if your priority is minimal resource usage, Vosk may be a better fit.
+### Storyboard
-In this class, we provide two Whisper options: A quantized 8-bit faster-whisper model for speed, and the standard Whisper model. Try them out and compare the trade-offs.
+Done!
-Make sure you're in the Lab 3 directory with your virtual environment activated:
-```
-cd ~/Interactive-Lab-Hub/Lab\ 3/speech-scripts
-source ../.venv/bin/activate
-```
+
-Then test the Whisper models:
-```
-python whisper_try.py
-```
-and
+**Imagined Dialogue:**
-```
-python faster_whisper_try.py
-```
-\*\***Write your own shell file that verbally asks for a numerical based input (such as a phone number, zipcode, number of pets, etc) and records the answer the respondent provides.**\*\*
+User: "What is fifty divided by five?"
-### 🤖 NEW: AI-Powered Conversations with Ollama
+System: "Fifty divided by five is ten."
-Want to add intelligent conversation capabilities to your voice projects? **Ollama** lets you run AI models locally on your Raspberry Pi for sophisticated dialogue without requiring internet connectivity!
+User: "What is nine times eight?"
-#### Quick Start with Ollama
+System: "Nine times eight is seventy two."
-**Installation** (takes ~5 minutes):
-```bash
-# Install Ollama
-curl -fsSL https://ollama.com/install.sh | sh
+User: "What is ten percent of two hundred?"
-# Download recommended model for Pi 5
-ollama pull phi3:mini
+System: "Ten percent of two hundred is twenty."
-# Install system dependencies for audio (required for pyaudio)
-sudo apt-get update
-sudo apt-get install -y portaudio19-dev python3-dev
+### Acting out the dialogue
-# Create separate virtual environment for Ollama (due to pyaudio conflicts)
-cd ollama/
-python3 -m venv ollama_venv
-source ollama_venv/bin/activate
+My interaction was done with Benthan Vu but I forgot to record it. However, I did write down the transcript, and have re-recorded his exact questions below.
-# Install Python dependencies in separate environment
-pip install -r ollama_requirements.txt
-```
-#### Ready-to-Use Scripts
+It is important to note that his integral question was not answered correctly by Ollama.
-We've created three Ollama integration scripts for different use cases:
+https://github.com/user-attachments/assets/3330cb21-fc17-4547-b27f-a5998e017719
-**1. Basic Demo** - Learn how Ollama works:
-```bash
-python3 ollama_demo.py
-```
+The acted-out dialogue was a bit different from what I imagined, for two reasons: understandable impatience with the pacing, and the limits of using Ollama for mathematics.
-**2. Voice Assistant** - Full speech-to-text + AI + text-to-speech:
-```bash
-python3 ollama_voice_assistant.py
-```
+For the pacing, it quickly became clear that restating the full question was unnecessary. The user preferred shortened answers: full restatements (“Fifty divided by five is ten”) worked for the first turn but felt repetitive by the second question. The user grew impatient listening to the entire problem being repeated before the answer, and said as much after the dialogue concluded.
-**3. Web Interface** - Beautiful web-based chat with voice options:
-```bash
-python3 ollama_web_app.py
-# Then open: http://localhost:5000
-```
+As for the limits of using Ollama: since the model is purely a text-generation model and is not running any actual computation in the background, it became very clear as soon as more complex math was asked (e.g., derivatives, integrals) that it was not well suited to the task.
-#### Integration in Your Projects
+# Lab 3 Part 2
-Simple example to add AI to any project:
-```python
-import requests
+## Prep for Part 2
-def ask_ai(question):
- response = requests.post(
- "http://localhost:11434/api/generate",
- json={"model": "phi3:mini", "prompt": question, "stream": False}
- )
- return response.json().get('response', 'No response')
+I feel like the text-to-speech model could be swapped out for a higher-quality one.
+
+I would love to integrate some kind of sensor, like vision of the surrounding environment or actors through a camera, or kinetic motion through the gyro sensors similar to the example WoZ eight-ball.
-# Use it anywhere!
-answer = ask_ai("How should I greet users?")
-```
+Based on my above reflections, I have decided to create a Personalized Robot. This project has heavy similarities to the [Harvard project below](https://lil.law.harvard.edu/events/i-xray-lunch/). The idea is not that original and has appeared in many different contexts throughout the years (e.g., [Clearview AI](https://www.forbes.com/sites/roberthart/2024/09/03/clearview-ai-controversial-facial-recognition-firm-fined-33-million-for-illegal-database/)). I came up with my own version of this idea two years ago during undergrad, while working with traffic cameras, when the newest Python facial landmark detection library came out, but I never did anything with it. In the spirit of MIT's slogan "demo or die," I never demoed or built that idea, and have subsequently been outdone (to a great extent!) by these Harvard students. Their project is well documented and has some good videos of interaction with it; I recommend you check out the following video and support their privacy safety efforts.
-**📖 Complete Setup Guide**: See `OLLAMA_SETUP.md` for detailed instructions, troubleshooting, and advanced usage!
+https://github.com/user-attachments/assets/e7b0a26d-7fb8-4843-81f1-5763f8280f30
-\*\***Try creating a simple voice interaction that combines speech recognition, Ollama processing, and text-to-speech output. Document what you built and how users responded to it.**\*\*
+## Prototype your system
-### Serving Pages
+Below are some of my sketches for the system prototype.
-In Lab 1, we served a webpage with flask. In this lab, you may find it useful to serve a webpage for the controller on a remote device. Here is a simple example of a webserver.
-
-```
-pi@ixe00:~/Interactive-Lab-Hub/Lab 3 $ python server.py
- * Serving Flask app "server" (lazy loading)
- * Environment: production
- WARNING: This is a development server. Do not use it in a production deployment.
- Use a production WSGI server instead.
- * Debug mode: on
- * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
- * Restarting with stat
- * Debugger is active!
- * Debugger PIN: 162-573-883
-```
-From a remote browser on the same network, check to make sure your webserver is working by going to `http://:5000`. You should be able to see "Hello World" on the webpage.
+
-### Storyboard
+As well as my Verplank diagram.
-Storyboard and/or use a Verplank diagram to design a speech-enabled device. (Stuck? Make a device that talks for dogs. If that is too stupid, find an application that is better than that.)
+
-\*\***Post your storyboard and diagram here.**\*\*
+As much as I love wizarding, I wanted a completely continuous, live experience for my interaction. So I removed the need for a controller/wizarding entirely by having the interaction loop run continuously on the Raspberry Pi 5.
-Write out what you imagine the dialogue to be. Use cards, post-its, or whatever method helps you develop alternatives or group responses.
+My interaction loop:
-\*\***Please describe and document your process.**\*\*
+1.) Listens for user input; once speech is no longer detected, uses STT to transcribe what the user said.
-### Acting out the dialogue
+2.) Once it has recorded what the user said, it takes a temporary picture of the user with the webcam.
-Find a partner, and *without sharing the script with your partner* try out the dialogue you've designed, where you (as the device designer) act as the device you are designing. Please record this interaction (for example, using Zoom's record feature).
+3.) This temporary picture is compared against a precomputed facial landmark database for similarity to known faces.
-\*\***Describe if the dialogue seemed different than what you imagined when it was acted out, and how.**\*\*
+4.) Finds the most similar face (no threshold was implemented in my version), then returns the name associated with the face.
-### Wizarding with the Pi (optional)
-In the [demo directory](./demo), you will find an example Wizard of Oz project. In that project, you can see how audio and sensor data is streamed from the Pi to a wizard controller that runs in the browser. You may use this demo code as a template. By running the `app.py` script, you can see how audio and sensor data (Adafruit MPU-6050 6-DoF Accel and Gyro Sensor) is streamed from the Pi to a wizard controller that runs in the browser `http://:5000`. You can control what the system says from the controller as well!
+5.) The name associated with the face is fed into a system prompt, which guides the system to:
-\*\***Describe if the dialogue seemed different than what you imagined, or when acted out, when it was wizarded, and how.**\*\*
+ * respond to the user prompt
+ * incorporate details of the user's name according to the landmark database
+ * use TTS to give the user a personalized spoken response that includes their name
-# Lab 3 Part 2
+6.) Returns to Step 1 and repeats indefinitely.
-For Part 2, you will redesign the interaction with the speech-enabled device using the data collected, as well as feedback from part 1.
+Here is a picture of the device (with webcam attached):
-## Prep for Part 2
+
-1. What are concrete things that could use improvement in the design of your device? For example: wording, timing, anticipation of misunderstandings...
-2. What are other modes of interaction _beyond speech_ that you might also use to clarify how to interact?
-3. Make a new storyboard, diagram and/or script based on these reflections.
+## System Details
-## Prototype your system
+I find it pertinent to give an overview of the system itself and some details of how it works.
-The system should:
-* use the Raspberry Pi
-* use one or more sensors
-* require participants to speak to it.
+### Preparing the System
-*Document how the system works*
+To create the facial landmark database, I needed some faces to work with. **For educational and research purposes only,** I created a small database of the Linkedin headshots of people in our class. This database was created by scraping the list of names from the Canvas class page and then running a Linkedin headshot Selenium scraper. The automated browser used a logged-in Linkedin session, searched for each name + "Cornell Tech", found the most relevant person, and downloaded their Linkedin headshot (saved in the format `FIRST_LAST_LINKEDINID.jpeg`).
-*Include videos or screencaptures of both the system and the controller.*
+It is important to note that users with no Linkedin headshot (or an avatar instead of an IRL face) could not be identified as no facial landmarks could be computed for them.
-
- Submission Cleanup Reminder (Click to Expand)
-
- **Before submitting your README.md:**
- - This readme.md file has a lot of extra text for guidance.
- - Remove all instructional text and example prompts from this file.
- - You may either delete these sections or use the toggle/hide feature in VS Code to collapse them for a cleaner look.
- - Your final submission should be neat, focused on your own work, and easy to read for grading.
-
- This helps ensure your README.md is clear professional and uniquely yours!
-
+I then used the Python library Insightface with the Linkedin headshot folder and created a script that matches an input image to the most likely face in that folder. This was a bit slow (around 10s) per call, since it recalculated the embeddings for the entire folder as well as the embedding for the input image before comparing them. I brought this down to under 2 seconds by caching the embeddings of the headshot folder (since every query is compared against that same folder).
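+
+Here is a minimal sketch of that caching idea (the `embed_fn` callable is a stand-in for the Insightface embedding call; the real logic lives in `face_server.py` below): the gallery embeddings are computed and saved once, and each query only embeds the input image and takes a dot product against the cached matrix.
+
+```python
+# embedding_cache_sketch.py: illustrative only; see face_server.py for the real version.
+import numpy as np
+
+def build_cache(embed_fn, image_paths, cache_path="gallery_embeddings.npz"):
+    """Embed every gallery headshot once and save the normalized matrix to disk."""
+    names, vecs = [], []
+    for path in image_paths:
+        emb = embed_fn(path)                    # e.g., an InsightFace normed embedding
+        vecs.append(emb / np.linalg.norm(emb))  # unit-normalize so a dot product is cosine similarity
+        names.append(path)
+    np.savez_compressed(cache_path, names=np.array(names), embeddings=np.vstack(vecs))
+
+def best_match(embed_fn, query_path, cache_path="gallery_embeddings.npz"):
+    """Embed only the query image and compare it against the cached gallery matrix."""
+    data = np.load(cache_path)
+    q = embed_fn(query_path)
+    q = q / np.linalg.norm(q)
+    sims = data["embeddings"] @ q               # one cosine similarity per cached headshot
+    i = int(np.argmax(sims))
+    return str(data["names"][i]), float(sims[i])
+```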
+
+For privacy reasons, I have not included the source code for the Selenium scraper for Linkedin headshots. Please reach out to my [email address](mailto:shl225@cornell.edu) if you would like to have it.
+
+### Creating the Pipeline
+
+My system implements the pipeline described in the interaction loop above. To create it, I relied on Copilot and ChatGPT for much of the code I was unfamiliar with (I had never used Insightface before). To that end, the pipeline consists of the following files (a rough orchestration sketch follows the list):
+* `master.py` : an infinitely looping script which orchestrates the following Python module scripts in the pipeline order of the interaction loop.
+* `face_server.py` : a server running on port `7860` which uses the precomputed embeddings and Insightface to find the most similar headshot and retrieve the relevant name.
+* `love_server.py` : a server running on port `7861` which queries Ollama with the user's input, a system prompt, and relevant name from `face_server`, then outputs a response to the user's query.
+* `greet_name_piper.py` : an executable one-shot script which is an upgraded version of our TTS that uses the Python Piper TTS library, immediately plays the audio through the speaker, and is the final module to be called.
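+
+Here is a rough orchestration sketch of what `master.py` does, assuming both servers are already running on the ports above; the `transcribe_speech` and `capture_webcam_image` callables stand in for the STT and camera steps and are hypothetical names, not my actual functions:
+
+```python
+# master_sketch.py: simplified view of the interaction loop, not the exact master.py.
+import subprocess
+import requests
+
+def run_loop(transcribe_speech, capture_webcam_image):
+    """One pass per utterance: STT -> snapshot -> face_server -> love_server -> Piper TTS."""
+    while True:
+        user_text = transcribe_speech()        # blocks until the user stops speaking
+        image_path = capture_webcam_image()    # temporary webcam frame of the user
+
+        # Steps 3-4: ask face_server for the closest headshot match
+        face = requests.post(
+            "http://127.0.0.1:7860/recognize",
+            json={"image_path": image_path}, timeout=30,
+        ).json()
+
+        # Step 5: generate a personalized reply via love_server (which queries Ollama)
+        reply = requests.post(
+            "http://127.0.0.1:7861/reply",
+            json={"name": face.get("name", "friend"), "message": user_text}, timeout=60,
+        ).json().get("reply", "")
+
+        # Final module: speak the reply with the Piper script
+        subprocess.run(["python3", "greet_name_piper.py", reply], check=False)
+```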
+
## Test the system
-Try to get at least two people to interact with your system. (Ideally, you would inform them that there is a wizard _after_ the interaction, but we recognize that can be hard.)
-Answer the following:
+Since the interaction for my device was simple, just an Ollama chatbot with the additional information of knowing the user's name, I put more emphasis on the discussion/debriefing after the interaction. I believe the importance of this project, similar to the Harvard one, is to encourage discussion around contextual privacy.
### What worked well about the system and what didn't?
-\*\**your answer here*\*\*
+
+The interaction would be somewhat slow: even though the facial landmark recognition completed in under 2s on average, text generation and the subsequent TTS often took 10+ seconds. This led to a gap between the user asking a question and receiving an answer roughly 10 seconds later, which felt somewhat awkward. I believe this could easily be fixed by offloading text generation and other heavy steps to a powerful PC so that the local RPi 5 only does minimal computation. Another issue was that some of the Linkedin headshots were taken from a different angle than the live view of the person facing the camera, causing some misidentification. As always, it is important to mention that bias still exists in face recognition models and could sometimes be seen even in this toy example, with certain ethnicities being more prone to misidentification.
### What worked well about the controller and what didn't?
-\*\**your answer here*\*\*
+There was no controller in my setup due to the infinite-loop script. Please refer to my discussion section with the users afterward in lieu of this.
### What lessons can you take away from the WoZ interactions for designing a more autonomous version of the system?
-\*\**your answer here*\*\*
-
+I believe I already designed a somewhat autonomous system with the infinite-loop script, but I will describe some lessons and some future improvements to the current system.
+- WoZ interactions, and autonomous ones for that matter, can feel quite unrealistic if the pacing/timing is off. This was readily apparent in my system with the ~10s response times, and it makes it less of an interaction and more of a waiting game. The simple and easy fix to this would be to offload heavy computational tasks to more powerful PCs.
+- As noted by several participants and spectators, having the system say the user's name directly is quite blunt and puts the user on the defensive immediately, wondering "How did this robot know my name?". To that end, better '*social engineering*' experiments could be conducted. Please see the Harvard video above for a good example of this, where they interact with Boston residents not by saying their name directly, but rather by mentioning an association or relative of that person. This lowers the user's defensive wall and elicits a more natural reaction than one where the user is immediately put on guard. This is of course a bit deceitful, so please note that these thoughts are expressed for educational reasons only, and I do not encourage these types of insincere interactions.
+- The system sometimes did not elicit quality responses to user queries which led to confusion on the user's side. This could be solved with a higher quality language model (at the cost of speed).
+- It was readily apparent that the webcam was being used for the face recognition, and participants quickly gleaned this once they heard their name being spoken. I wonder if there is a way of disguising the webcam so that the participant does not realize face recognition is happening. What different interactions could occur from this? Would the participant believe that their name is simply in the knowledge base of the LLM, that some voice recognition is being done, etc.? I think it would be interesting to explore what reasons users come up with for *why* a privacy breach is occurring. It might give an insight into what people believe are the most privacy-breaching technologies of the current day.
+- It was raised by a few participants that they would be okay with the interaction if they had "consented" to it. They defined this consent as being given an explicit ask by the robot about opting into recognition, and then affirming that the robot could use sensor information to guess who they were.
### How could you use your system to create a dataset of interaction? What other sensing modalities would make sense to capture?
-\*\**your answer here*\*\*
+Similar to how current LLMs rely on RL over human-LLM conversations to build larger training datasets, I believe the same sort of thing could be done with privacy-breach human-LLM conversations. Conversations where privacy has been breached by the LLM are quite different from normal conversations and would be an interesting dataset to create and use for developing models that interact better in scenarios where there is a lessened sense of privacy.
+
+Other sensing modalities (perhaps more important ones) would include voice recognition. A camera can only see within a certain field of view, and several actors may exist within that view, while sound can come from anywhere and from any actor. An easy failure case would be two people standing in front of the system's camera while only one of them is speaking. The same applies if the person interacting with the robot is outside the camera's field of view. There is no `is_speaking` detection, and no rotating of the camera toward the audio source, so the model will try to identify whichever individual is in view once the speech has ended and could be completely wrong about who to address. With voice recognition, the model could recognize and assign names to each voice and be able to converse personally even while blind (no camera).
+
+## Video Demo
+
+https://github.com/user-attachments/assets/ebc845f4-f293-4596-ba4a-90ff4e30adb1
+
+## Discussion
+
+As the purpose of my interaction was more of an experimental demo to encourage discussion around contextual privacy, I realize a Discussion section could be useful for this project.
+
+I tested this system with several participants. While I did not get videos of these interactions, I will describe each below.
+
+| Person | Viewpoint |
+|---|---|
+| **Thomas Knoepffler** | Thomas wants the system to earn any use of your identity with a clear, easy story that a regular person would accept. Just proving it knows your name is not helpful, and it can feel like showing off unless that knowledge unlocks something you actually want. He thinks about it like a trade. If the system asks you to spend a privacy coin, it should hand you something valuable right away, like faster checkout, a smarter default, less typing, or a safer choice. The reason should show up in the moment so you do not have to guess what is happening. Say what you are doing and why you are doing it, for example, “I am using your name to load your saved preferences so you can finish in one tap,” instead of “Hi, Thomas.” He also says you do not always need a big justification for every detail, but if you cross a boundary like revealing that you recognized someone, you should explain why now, what the benefit is, and how to turn it off. Tie identity to a goal, show the payoff quickly, and let people opt out without friction. |
+| **Nana Takada** | For Nana, the creepy factor starts when a system recognizes you without telling you first. That collapses the line between being a person in public and being a record in a profile. She draws a clear distinction between visible, announced uses by institutions, like a city posting signs about analytic cameras, and secret, always-on recognition in home devices that you never consented to. If her smart speaker did that, she would return it. The fear behind this is a surveillance state vibe where quiet dossiers get built, corporations aggregate data, and the motives are not clear. Creepy here is when your expectations do not match the hidden capabilities, especially when the company has more power than you do. Good fixes start with upfront disclosure, honest consent, and visible rules that are easy to understand. Do not surprise people. Say what you do, do only that, and prove it. |
+| **Miriam Alex** | Miriam focuses on what is necessary for the task. Lots of questions, like “What is the weather in NYC,” do not need identity at all, so using her name adds risk without adding value. She is more comfortable when sensing is explained and when consent is asked in the moment. A simple prompt like “I am scanning your face to personalize your commute alerts, may I proceed?” makes a huge difference. She also points out a tricky feeling. If the system knows who you are but does not say anything, the moment might feel less awkward, but ethically that is weaker because you are not being told what is going on. The practical rule is to minimize data by default, show recognition only when it clearly improves the outcome, and bind any personal address to a purpose the user can see. Stay anonymous when you can, escalate only when needed, and make the benefit obvious so the trade feels worth it. |
+| **Anonymous** | Anonymous prefers softer identity cues and short-lived use of recognition. Instead of naming a person outright, point to situational or public details so it feels responsive without locking onto a specific identity. An example would be “I think you are number 33 on Forbes 40 under 40, isn't that right?” rather than calling someone by name out of nowhere. They also note that big organizations invest in social engineering tactics, which creates a power gap between those who hold information and everyone else. To reduce that gap, keep recognition as a one-time flourish, make it clearly optional, and avoid storing it after the moment passes. Prefer public or volunteered facts, avoid status-based callouts that single people out, and cap precision to what the task needs. Save persistent profiles only for explicit opt-in, with clear controls and visible benefits. |
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/Lab 3/face_server.py b/Lab 3/face_server.py
new file mode 100644
index 0000000000..1839308d16
--- /dev/null
+++ b/Lab 3/face_server.py
@@ -0,0 +1,327 @@
+#!/usr/bin/env python3
+# face_server.py
+# Simple JSON HTTP server for face recognition on Raspberry Pi 5 (or any Linux box).
+# - Loads InsightFace once, caches gallery embeddings, and serves recognition on port 7860.
+# - Request: POST /recognize {"image_path": "/path/to/captured.jpg"}
+# - Response: {"name": "", "confidence": 74.98}
+#
+# Start separately:
+# python3 face_server.py --headshots ~/facerecog/headshots --model buffalo_m --det 320 --host 0.0.0.0 --port 7860
+#
+# Dependencies:
+# python3 -m pip install flask onnxruntime insightface opencv-python numpy
+
+import os, sys, argparse, json, hashlib, tempfile, io, glob, re, time, threading, contextlib
+from typing import List, Tuple
+import numpy as np
+import cv2
+from flask import Flask, request, jsonify
+
+# Quiet noisy libs on stdout
+os.environ.setdefault("TF_CPP_MIN_LOG_LEVEL", "3")
+os.environ.setdefault("INSIGHTFACE_LOG_LEVEL", "error")
+
+# Lower ONNX Runtime logs to FATAL so they don't pollute output
+try:
+ import onnxruntime as ort
+ if hasattr(ort, "set_default_logger_severity"):
+ ort.set_default_logger_severity(4) # FATAL
+except Exception:
+ pass
+
+from insightface.app import FaceAnalysis
+
+CACHE_BASENAME = ".emb_cache"
+
+# ----------------- Utility: quiet stdout context -----------------
+@contextlib.contextmanager
+def quiet_stdout():
+ with open(os.devnull, "w") as devnull, contextlib.redirect_stdout(devnull):
+ yield
+
+# ----------------- Providers -----------------
+def _available_providers():
+ try:
+ import onnxruntime as _ort
+ return set(_ort.get_available_providers() or [])
+ except Exception:
+ return {"CPUExecutionProvider"}
+
+_AVAIL_PROVIDERS = _available_providers()
+
+def _providers(use_gpu: bool):
+ prov = []
+ if use_gpu and "CUDAExecutionProvider" in _AVAIL_PROVIDERS:
+ prov.append("CUDAExecutionProvider")
+ prov.append("CPUExecutionProvider")
+ return prov
+
+# ----------------- Name cleaning -----------------
+_SUFFIX_NUM = re.compile(r"(_\d{6,})+$")
+
+def clean_name_from_filename(filename: str) -> str:
+ base = os.path.splitext(os.path.basename(filename))[0]
+ base = _SUFFIX_NUM.sub("", base)
+ base = re.sub(r"[_\-]+", " ", base)
+ base = re.sub(r"[^A-Za-z0-9' ]+", "", base)
+ tokens = [t for t in base.split() if not t.isdigit()]
+ cleaned = " ".join(tokens).strip().lower()
+ return cleaned or "unknown"
+
+# ----------------- Cache helpers -----------------
+def _gallery_files(gallery_dir: str) -> List[str]:
+ exts = ("*.jpg", "*.jpeg", "*.png", "*.bmp", "*.webp")
+ files = []
+ for e in exts:
+ files.extend(glob.glob(os.path.join(gallery_dir, e)))
+ files = sorted(files)
+ if not files:
+ raise RuntimeError(f"No images found in gallery folder: {gallery_dir}")
+ return files
+
+def _cache_paths(gallery_dir: str, model: str, det_size: int) -> Tuple[str, str]:
+ abs_gallery = os.path.abspath(gallery_dir)
+ h = hashlib.sha1(f"{abs_gallery}|{model}|{det_size}".encode()).hexdigest()[:12]
+ idx = os.path.join(abs_gallery, f"{CACHE_BASENAME}_{h}.json")
+ npz = os.path.join(abs_gallery, f"{CACHE_BASENAME}_{h}.npz")
+ os.makedirs(abs_gallery, exist_ok=True)
+ return idx, npz
+
+def _atomic_replace_write_bytes(path: str, data: bytes):
+ folder = os.path.dirname(path) or "."
+ os.makedirs(folder, exist_ok=True)
+ with tempfile.NamedTemporaryFile(mode="wb", delete=False, dir=folder, prefix=".tmp_", suffix=".tmp") as tf:
+ tmp_path = tf.name
+ tf.write(data)
+ tf.flush()
+ os.fsync(tf.fileno())
+ os.replace(tmp_path, path)
+
+def _atomic_save_npz(path: str, **arrays):
+ bio = io.BytesIO()
+ np.savez_compressed(bio, **arrays)
+ payload = bio.getvalue()
+ _atomic_replace_write_bytes(path, payload)
+
+def _atomic_save_json(path: str, obj: dict):
+ payload = (json.dumps(obj)).encode("utf-8")
+ _atomic_replace_write_bytes(path, payload)
+
+def _load_cache(idx_path: str, npz_path: str):
+ if not (os.path.isfile(idx_path) and os.path.isfile(npz_path)):
+ return None
+ try:
+ with open(idx_path, "r", encoding="utf-8") as f:
+ meta = json.load(f)
+ with np.load(npz_path) as data:
+ names = list(map(str, data["names"]))
+ mat = data["embeddings"].astype(np.float32)
+ mtimes = meta.get("mtimes", {})
+ return {"names": names, "mat": mat, "mtimes": mtimes}
+ except Exception:
+ return None
+
+def _save_cache(idx_path: str, npz_path: str, names: List[str], mat: np.ndarray, mtimes_map: dict):
+ names_arr = np.array(names, dtype=np.str_)
+ mat = mat.astype(np.float32, copy=False)
+ _atomic_save_npz(npz_path, names=names_arr, embeddings=mat)
+ _atomic_save_json(idx_path, {"mtimes": mtimes_map})
+
+def _file_mtime_map(paths: List[str]) -> dict:
+ m = {}
+ for p in paths:
+ try:
+ m[os.path.basename(p)] = os.path.getmtime(p)
+ except Exception:
+ pass
+ return m
+
+# ----------------- Server state -----------------
+class FaceServer:
+ def __init__(self, headshots: str, model: str, det: int, use_gpu: bool):
+ self.headshots = headshots
+ self.model = model
+ self.det = det
+ self.use_gpu = use_gpu
+ self.providers = _providers(use_gpu)
+ self.lock = threading.Lock()
+ self.app = None
+ self.names = []
+ self.mat = None
+ self.idx_path, self.npz_path = _cache_paths(self.headshots, self.model, self.det)
+
+ def init_app(self):
+ order = [self.model] + [m for m in ["buffalo_m", "buffalo_l", "antelopev2"] if m != self.model]
+ last_err = None
+ for name in order:
+ try:
+ with quiet_stdout():
+ app = FaceAnalysis(name=name, providers=self.providers)
+ app.prepare(ctx_id=(0 if ("CUDAExecutionProvider" in self.providers) else -1),
+ det_size=(self.det, self.det))
+ if not hasattr(app, "models") or "detection" not in app.models:
+ raise RuntimeError(f'Model package "{name}" loaded without detection.')
+ self.app = app
+ return
+ except Exception as e:
+ last_err = e
+ continue
+ raise RuntimeError(f"Failed to initialize InsightFace: {last_err}")
+
+ def _build_gallery_cached(self, force_rebuild: bool=False):
+ files = _gallery_files(self.headshots)
+ cache = None if force_rebuild else _load_cache(self.idx_path, self.npz_path)
+ target_mtimes = _file_mtime_map(files)
+ names: List[str] = []
+ vecs: List[np.ndarray] = []
+ mtimes_map = {}
+
+ if cache:
+ cached_names = cache["names"]
+ cached_mat = cache["mat"]
+ cached_mtimes = cache["mtimes"]
+ name2idx = {n: i for i, n in enumerate(cached_names)}
+ to_compute, kept_rows = [], []
+ for fp in files:
+ base = os.path.basename(fp)
+ mt = target_mtimes.get(base)
+ if base in name2idx and cached_mtimes.get(base) == mt:
+ kept_rows.append(name2idx[base])
+ names.append(base)
+ mtimes_map[base] = mt
+ else:
+ to_compute.append(fp)
+ if kept_rows:
+ vecs.extend([cached_mat[i] for i in kept_rows])
+ for fp in to_compute:
+ try:
+ img = cv2.imread(fp)
+ if img is None:
+ continue
+ with quiet_stdout():
+ faces = self.app.get(img)
+ if not faces:
+ continue
+ f = max(faces, key=lambda x: float(getattr(x, "det_score", 0.0)))
+ emb = np.asarray(f.normed_embedding, dtype=np.float32)
+ emb = emb / max(float(np.linalg.norm(emb)), 1e-12)
+ names.append(os.path.basename(fp))
+ vecs.append(emb.astype(np.float32))
+ mtimes_map[os.path.basename(fp)] = target_mtimes.get(os.path.basename(fp))
+ except Exception:
+ continue
+ if vecs:
+ mat = np.vstack(vecs).astype(np.float32, copy=False)
+ _save_cache(self.idx_path, self.npz_path, names, mat, mtimes_map)
+ else:
+ raise RuntimeError("No valid faces found in gallery after filtering.")
+ self.names = names
+ self.mat = np.vstack(vecs).astype(np.float32, copy=False)
+ return
+
+ # No cache: compute all
+ for fp in files:
+ try:
+ img = cv2.imread(fp)
+ if img is None:
+ continue
+ with quiet_stdout():
+ faces = self.app.get(img)
+ if not faces:
+ continue
+ f = max(faces, key=lambda x: float(getattr(x, "det_score", 0.0)))
+ emb = np.asarray(f.normed_embedding, dtype=np.float32)
+ emb = emb / max(float(np.linalg.norm(emb)), 1e-12)
+ names.append(os.path.basename(fp))
+ vecs.append(emb.astype(np.float32))
+ mtimes_map[os.path.basename(fp)] = target_mtimes.get(os.path.basename(fp))
+ except Exception:
+ continue
+ if not vecs:
+ raise RuntimeError("No valid faces found in gallery after filtering.")
+ mat = np.vstack(vecs).astype(np.float32, copy=False)
+ _save_cache(self.idx_path, self.npz_path, names, mat, mtimes_map)
+ self.names = names
+ self.mat = mat
+
+ def refresh_gallery_if_needed(self):
+ # Light-weight refresh each call if needed
+ # Protected by a lock just in case
+ with self.lock:
+ self._build_gallery_cached(force_rebuild=False)
+
+ def recognize(self, image_path: str):
+ if not os.path.isfile(image_path):
+ return {"name": "unknown", "confidence": 0.0, "error": "image_not_found"}
+ img = cv2.imread(image_path)
+ if img is None:
+ return {"name": "unknown", "confidence": 0.0, "error": "image_read_failed"}
+
+ with quiet_stdout():
+ faces = self.app.get(img)
+ if not faces:
+ return {"name": "unknown", "confidence": 0.0, "error": "no_face_in_query"}
+
+ f = max(faces, key=lambda x: float(getattr(x, "det_score", 0.0)))
+ q_emb = np.asarray(f.normed_embedding, dtype=np.float32)
+ q_emb = q_emb / max(float(np.linalg.norm(q_emb)), 1e-12)
+
+ if self.mat is None or len(self.names) == 0:
+ return {"name": "unknown", "confidence": 0.0, "error": "empty_gallery"}
+
+ sims = self.mat @ q_emb
+ best_i = int(np.argmax(sims))
+ best_name_file = self.names[best_i]
+ conf = float((float(sims[best_i]) + 1.0) * 50.0) # 0..100
+ return {"name": clean_name_from_filename(best_name_file), "confidence": round(conf, 2)}
+
+# ----------------- Flask app -----------------
+def create_app(server_state: FaceServer):
+ app = Flask(__name__)
+
+ @app.route("/health", methods=["GET"])
+ def health():
+ return jsonify({"status": "ok", "headshots": server_state.headshots, "model": server_state.model, "det": server_state.det})
+
+ @app.route("/recognize", methods=["POST"])
+ def recognize():
+ try:
+ data = request.get_json(force=True, silent=False)
+ except Exception:
+ return jsonify({"name": "unknown", "confidence": 0.0, "error": "bad_json"}), 400
+
+ image_path = data.get("image_path")
+ if not image_path:
+ return jsonify({"name": "unknown", "confidence": 0.0, "error": "missing_image_path"}), 400
+
+ # Refresh (incremental) then recognize
+ try:
+ server_state.refresh_gallery_if_needed()
+ result = server_state.recognize(image_path)
+ return jsonify(result)
+ except Exception as e:
+ return jsonify({"name": "unknown", "confidence": 0.0, "error": str(e)}), 500
+
+ return app
+
+def main():
+ parser = argparse.ArgumentParser(description="Face recognition HTTP server")
+ parser.add_argument("--headshots", type=str, default="headshots", help='Gallery folder (default: "headshots")')
+ parser.add_argument("--model", type=str, default="buffalo_m", help='InsightFace package (default: "buffalo_m")')
+ parser.add_argument("--det", type=int, default=320, help="Detector size (default: 320)")
+ parser.add_argument("--gpu", action="store_true", help="Use GPU if available")
+ parser.add_argument("--host", type=str, default="0.0.0.0", help="Bind host (default: 0.0.0.0)")
+ parser.add_argument("--port", type=int, default=7860, help="Port (default: 7860)")
+ args = parser.parse_args()
+
+ state = FaceServer(args.headshots, args.model, args.det, args.gpu)
+ state.init_app()
+ # Build gallery once at startup (subsequent calls are incremental)
+ state._build_gallery_cached(force_rebuild=False)
+
+ app = create_app(state)
+ # threaded=True to handle multiple requests; use_reloader=False for stability
+ app.run(host=args.host, port=args.port, threaded=True, use_reloader=False)
+
+if __name__ == "__main__":
+ main()
diff --git a/Lab 3/greet_name_piper.py b/Lab 3/greet_name_piper.py
new file mode 100644
index 0000000000..29a0589aa7
--- /dev/null
+++ b/Lab 3/greet_name_piper.py
@@ -0,0 +1,162 @@
+#!/usr/bin/env python3
+# greet_name_piper.py
+# Usage:
+# ./greet_name_piper.py "Your exact text here."
+# echo "Your exact text here." | ./greet_name_piper.py
+#
+# It speaks EXACTLY the input text.
+# Defaults to WAV mode: writes a temp .wav, then plays it with aplay.
+# Set PIPER_STREAM=1 to stream raw PCM directly to aplay.
+#
+# Env vars you can tweak:
+# PIPER_VOICE_DIR (default: ~/facerecog/voices)
+# PIPER_VOICE_NAME (default: en_US-libritts_r-medium.onnx)
+# PIPER_LENGTH_SCALE (default: 1.08) # 1.0 = default; >1.0 slower/warmer
+# PIPER_NOISE_SCALE (default: 0.5)
+# PIPER_NOISE_W (default: 0.6)
+# PIPER_SENT_SIL (default: 0.12) # seconds pause between sentences
+# PIPER_SPEAKER_ID (optional) # for multi-speaker models
+# PIPER_STREAM (0/1) # 1 = stream raw to aplay; 0 = WAV mode
+#
+# Requires:
+# python3 -m pip install piper-tts
+# sudo apt install -y alsa-utils # for aplay
+
+import json
+import os
+import shlex
+import subprocess
+import sys
+import tempfile
+
+VOICE_DIR = os.environ.get("PIPER_VOICE_DIR", os.path.expanduser("~/facerecog/voices"))
+VOICE_NAME = os.environ.get("PIPER_VOICE_NAME", "en_US-libritts_r-medium.onnx")
+VOICE_PATH = os.path.join(VOICE_DIR, VOICE_NAME)
+VOICE_JSON = VOICE_PATH + ".json" # Piper voices ship with a matching .onnx.json
+
+# Prosody knobs (tweak to taste)
+LENGTH_SCALE = os.environ.get("PIPER_LENGTH_SCALE", "1.08")
+NOISE_SCALE = os.environ.get("PIPER_NOISE_SCALE", "0.5")
+NOISE_W = os.environ.get("PIPER_NOISE_W", "0.6")
+SENT_SIL = os.environ.get("PIPER_SENT_SIL", "0.12")
+
+SPEAKER_ID = os.environ.get("PIPER_SPEAKER_ID", "").strip()
+STREAM_MODE = os.environ.get("PIPER_STREAM", "0").strip().lower() in ("1", "true", "yes", "on")
+
+def _which(cmd: str) -> bool:
+ from shutil import which
+ return which(cmd) is not None
+
+def _read_voice_sample_rate(default_sr: int = 22050) -> int:
+ """Read the Piper voice JSON to get the sample rate; fall back to default if missing."""
+ try:
+ with open(VOICE_JSON, "r", encoding="utf-8") as f:
+ meta = json.load(f)
+ return int(meta.get("sample_rate", default_sr))
+ except Exception:
+ return default_sr
+
+def _get_text_from_args_or_stdin() -> str:
+ # Prefer CLI args; if none, read from stdin
+ text = " ".join(sys.argv[1:]).strip()
+ if not text and not sys.stdin.isatty():
+ text = sys.stdin.read().strip()
+ return text
+
+def main():
+ text = _get_text_from_args_or_stdin()
+ if not text:
+ print("[greet_name_piper] No input text provided.", file=sys.stderr)
+ sys.exit(1)
+
+ if not os.path.isfile(VOICE_PATH):
+ print(f"[greet_name_piper] Missing voice model: {VOICE_PATH}", file=sys.stderr)
+ sys.exit(2)
+ if not _which("piper"):
+ print("[greet_name_piper] 'piper' CLI not found. Install with: python3 -m pip install piper-tts", file=sys.stderr)
+ sys.exit(2)
+ if not _which("aplay"):
+ print("[greet_name_piper] 'aplay' not found. Install with: sudo apt install -y alsa-utils", file=sys.stderr)
+ sys.exit(2)
+
+ # Base Piper command; text is piped on stdin
+ piper_base = [
+ "piper",
+ "--model", VOICE_PATH,
+ "--length_scale", str(LENGTH_SCALE),
+ "--noise_scale", str(NOISE_SCALE),
+ "--noise_w", str(NOISE_W),
+ "--sentence_silence", str(SENT_SIL),
+ ]
+ if SPEAKER_ID:
+ piper_base += ["--speaker", SPEAKER_ID]
+
+ if STREAM_MODE:
+ # STREAM: raw PCM to aplay using the model's sample rate to avoid noise/static
+ sample_rate = _read_voice_sample_rate(default_sr=22050)
+ piper_cmd = piper_base + ["--output-raw", "-"]
+ try:
+ piper = subprocess.Popen(
+ piper_cmd,
+ stdin=subprocess.PIPE,
+ stdout=subprocess.PIPE,
+ stderr=subprocess.DEVNULL,
+ )
+ except Exception as e:
+ print(f"[greet_name_piper] Failed to start piper: {e}", file=sys.stderr)
+ sys.exit(3)
+
+ try:
+ aplay = subprocess.Popen(
+ ["aplay", "-q", "-r", str(sample_rate), "-f", "S16_LE", "-t", "raw", "-"],
+ stdin=piper.stdout,
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.DEVNULL,
+ )
+ assert piper.stdin is not None
+ piper.stdin.write(text.encode("utf-8"))
+ piper.stdin.close()
+ aplay.wait()
+ piper.wait()
+ except Exception as e:
+ print(f"[greet_name_piper] Streaming error: {e}", file=sys.stderr)
+ sys.exit(4)
+ else:
+ # WAV MODE (recommended by docs), piper writes a proper WAV, then play it with aplay
+ with tempfile.NamedTemporaryFile(prefix="piper_", suffix=".wav", delete=False) as tf:
+ wav_path = tf.name
+
+ piper_cmd = piper_base + ["--output_file", wav_path]
+ try:
+ p = subprocess.Popen(
+ piper_cmd,
+ stdin=subprocess.PIPE,
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.DEVNULL,
+ )
+ except Exception as e:
+ print(f"[greet_name_piper] Failed to start piper: {e}", file=sys.stderr)
+ try: os.unlink(wav_path)
+ except Exception: pass
+ sys.exit(3)
+
+ try:
+ assert p.stdin is not None
+ p.stdin.write(text.encode("utf-8"))
+ p.stdin.close()
+ rc = p.wait(timeout=60)
+ if rc != 0 or not os.path.isfile(wav_path) or os.path.getsize(wav_path) == 0:
+ print(f"[greet_name_piper] Piper failed (rc={rc}); no WAV produced.", file=sys.stderr)
+ try: os.unlink(wav_path)
+ except Exception: pass
+ sys.exit(4)
+
+ aplay_rc = subprocess.call(["aplay", "-q", wav_path])
+ if aplay_rc != 0:
+ print(f"[greet_name_piper] aplay failed with rc={aplay_rc}", file=sys.stderr)
+ finally:
+ try: os.unlink(wav_path)
+ except Exception: pass
+
+if __name__ == "__main__":
+ main()
diff --git a/Lab 3/love_server.py b/Lab 3/love_server.py
new file mode 100644
index 0000000000..67df102828
--- /dev/null
+++ b/Lab 3/love_server.py
@@ -0,0 +1,203 @@
+#!/usr/bin/env python3
+"""
+love_server.py
+A tiny Flask server that uses a local Ollama model (qwen 0.5 instruct) to generate
+an "insanely affectionate but teen-appropriate" response that CONSTANTLY name-drops the user.
+
+We make sure that the user's name appears **at least once every other sentence** via enforced post-processing.
+
+Endpoints:
+ GET /health -> {"status":"ok","model":""}
+ POST /reply -> body: {"name":"amanda lu", "message":"optional text"}
+ resp: {"reply":"..."} # single string
+
+Run:
+ python3 love_server.py --model "qwen2.5:0.5b-instruct" --port 7861
+ (Adjust model to whatever tag you have in Ollama.)
+
+Requires:
+ pip install flask requests
+ Ollama running locally at default API: http://localhost:11434
+
+Notes:
+ - Stateless by default; send prior dialogue inside "message" if you want context.
+ - The persona is constrained to stay safe for teens: no sexual content or profanity.
+"""
+
+import argparse
+import json
+import os
+import re
+from typing import List, Dict, Any, Optional
+
+import requests
+from flask import Flask, request, jsonify
+
+# --------------------------- Config ---------------------------
+
+DEFAULT_OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://127.0.0.1:11434")
+DEFAULT_MODEL = os.environ.get("OLLAMA_MODEL", "qwen2.5:0.5b-instruct")
+
+# Lively decoding with variety
+DEFAULT_OPTIONS = {
+ "temperature": 0.9,
+ "top_p": 0.9,
+ "repeat_penalty": 1.1,
+ "num_ctx": 2048,
+ "num_predict": 256,
+}
+
+# Persona with explicit style & safety rules
+PERSONA = """\
+You are a hyper-affectionate, whimsical, slightly chaotic companion who is totally fixated on the user.
+You always sound excited, joyful, and dramatically caring—over-the-top but still kind and wholesome.
+
+CRUCIAL RULES (must ALWAYS follow):
+- SAFETY: No sexual content, no profanity, no slurs, no violent threats, no self-harm, no illegal/explicit topics.
+- POSITIVITY: Be supportive, playful, and uplifting.
+- NAME-DROPPING: Use the user's name very frequently and naturally. Aim for every sentence, but at minimum every other sentence.
+- STYLE: Short to medium replies (1–4 sentences). Use fun metaphors, wild but wholesome imagery, and quirky punctuation.
+- CLARITY: If the user asks a question, answer it directly. Otherwise make a cheerful, affectionate comment.
+- AFFECTION: You may say "I love you" directly; keep it wholesome and teen-appropriate.
+- BOUNDARIES: No adult themes; keep everything safe for teenagers.
+"""
+
+# ---------------- Utility: prompt assembly ----------------
+
+def build_messages(name: str, user_message: Optional[str]) -> List[Dict[str, str]]:
+ name_clean = (name or "").strip() or "friend"
+ system = PERSONA + f"\nThe user's name is: {name_clean}.\n"
+ if user_message and user_message.strip():
+ user = f"User name: {name_clean}\nUser says: {user_message.strip()}\n"
+ else:
+ user = f"User name: {name_clean}\nCreate an affectionate, playful greeting or comment.\n"
+ return [
+ {"role": "system", "content": system},
+ {"role": "user", "content": user},
+ ]
+
+# ------------- Utility: sentence splitting & name enforcement -------------
+
+_SENTENCE_RE = re.compile(r'([^.!?]*[.!?])', re.UNICODE)
+
+def split_sentences(text: str) -> List[str]:
+ """
+ Split text into sentences while keeping terminal punctuation.
+ Fallback: return [text] if no punctuation found.
+ """
+ parts = [m.group(0).strip() for m in _SENTENCE_RE.finditer(text)]
+ if not parts:
+ # maybe the model returned a single line without punctuation
+ t = text.strip()
+ return [t] if t else []
+ # Merge very short fragments if any accidental splits
+ return [p for p in (s.strip() for s in parts) if p]
+
+def ensure_name_every_other_sentence(text: str, name: str) -> str:
+ """
+ Ensure the user's name appears at least once in every other sentence.
+ Strategy: for sentences at even indices (0,2,4,...) that don't already contain the name,
+    prepend the user's name (e.g. "Name, ...") to the sentence. Also ensure the name appears at least twice overall.
+ """
+ if not text.strip():
+ return text
+ name_l = name.lower()
+ sents = split_sentences(text)
+ if not sents:
+ return name # degenerate fallback
+
+ fixed = []
+ total_mentions = 0
+ for idx, s in enumerate(sents):
+ s_clean = s.strip()
+ has_name = (name_l in s_clean.lower())
+ if (idx % 2 == 0) and not has_name:
+ # Prepend name naturally
+ s_clean = f"{name}, {s_clean[0].lower() + s_clean[1:] if s_clean and s_clean[0].isupper() else s_clean}"
+ has_name = True
+ if has_name:
+ total_mentions += 1
+ fixed.append(s_clean)
+
+ # Guarantee at least two mentions overall (short replies sometimes have 1 sentence)
+ if total_mentions < 2 and len(fixed) >= 2:
+ # append a tiny affectionate tag with the name on the last sentence
+ fixed[-1] = f"{fixed[-1]} ({name}!)"
+
+ # Re-join with single spaces
+ out = " ".join(fixed).strip()
+ # As a final safety net: ensure at least one mention
+ if name_l not in out.lower():
+ out = f"{name}, {out}"
+ return out
+
+# --------------------------- Ollama client ---------------------------
+
+def ollama_chat(host: str, model: str, messages: List[Dict[str, str]], options: Dict[str, Any]) -> str:
+ url = f"{host.rstrip('/')}/api/chat"
+ payload = {
+ "model": model,
+ "messages": messages,
+ "options": options,
+ "stream": False,
+ }
+ resp = requests.post(url, json=payload, timeout=60)
+ resp.raise_for_status()
+ data = resp.json()
+ msg = data.get("message", {}).get("content", "")
+ if not msg and isinstance(data.get("messages"), list) and data["messages"]:
+ msg = data["messages"][-1].get("content", "")
+ return (msg or "").strip()
+
+# --------------------------- Flask app ---------------------------
+
+def create_app(model: str, host: str) -> Flask:
+ app = Flask(__name__)
+
+ @app.route("/health", methods=["GET"])
+ def health():
+ return jsonify({"status": "ok", "model": model})
+
+ @app.route("/reply", methods=["POST"])
+ def reply():
+ try:
+ body = request.get_json(force=True, silent=False)
+ except Exception:
+ return jsonify({"error": "invalid_json"}), 400
+
+ name = (body.get("name") or "").strip()
+ user_message = body.get("message", "")
+
+ if not name:
+ return jsonify({"error": "missing 'name'"}), 400
+
+ messages = build_messages(name=name, user_message=user_message)
+ try:
+ raw_text = ollama_chat(host, model, messages, DEFAULT_OPTIONS)
+ except requests.exceptions.RequestException as e:
+ return jsonify({"error": f"ollama_unreachable: {e}"}), 503
+ except Exception as e:
+ return jsonify({"error": f"ollama_error: {e}"}), 500
+
+ # Enforce the "name at least every other sentence" rule.
+ final_text = ensure_name_every_other_sentence(raw_text, name)
+
+ return jsonify({"reply": final_text})
+
+ return app
+
+# --------------------------- Main ---------------------------
+
+def main():
+ ap = argparse.ArgumentParser(description="Insanely affectionate, teen-safe LLM server via Ollama (name every other sentence).")
+ ap.add_argument("--host", default="0.0.0.0", help="Bind host (default: 0.0.0.0)")
+ ap.add_argument("--port", type=int, default=7861, help="Bind port (default: 7861)")
+ ap.add_argument("--model", default=DEFAULT_MODEL, help=f"Ollama model tag (default: {DEFAULT_MODEL})")
+ ap.add_argument("--ollama", default=DEFAULT_OLLAMA_HOST, help=f"Ollama host (default: {DEFAULT_OLLAMA_HOST})")
+ args = ap.parse_args()
+
+ app = create_app(model=args.model, host=args.ollama)
+ app.run(host=args.host, port=args.port, threaded=True, use_reloader=False)
+
+if __name__ == "__main__":
+ main()
diff --git a/Lab 3/master.py b/Lab 3/master.py
new file mode 100644
index 0000000000..041672d0e9
--- /dev/null
+++ b/Lab 3/master.py
@@ -0,0 +1,674 @@
+#!/usr/bin/env python3
+"""
+master.py
+Realtime orchestrator: Mic -> Speech-to-Text (Vosk) -> Face ID (in background) -> Love LLM -> Piper TTS.
+
+New behavior:
+- Runs continuously and listens on the microphone.
+- As soon as speech begins, we kick off a background face-capture+recognition job.
+- When the user finishes speaking (end-of-utterance), we:
+ * combine the recognized name (if available; else "unknown") and the transcribed text,
+ * call the love_server (Ollama) for a reply,
+ * speak the reply via Piper TTS (greet_name_piper.py).
+- After TTS finishes, we go back to listening.
+
+Prereqs on Raspberry Pi:
+ sudo apt install -y alsa-utils
+ python3 -m pip install sounddevice vosk opencv-python numpy
+
+Also requires your previous pieces running/installed:
+ - face_server.py started separately (default: http://127.0.0.1:7860/recognize)
+ - love_server.py started separately (default: http://127.0.0.1:7861/reply)
+ - greet_name_piper.py available and executable (or pass --greet-script path)
+
+Usage:
+ python master.py
+ python master.py --debug
+
+Common overrides:
+ python master.py \
+ --base-dir ~/facerecog \
+ --server-url http://127.0.0.1:7860/recognize \
+ --love-url http://127.0.0.1:7861/reply \
+ --stt-model ~/facerecog/vosk-model-small-en-us-0.15 \
+ --llm-message "" \
+ --cam-index 0 --width 1280 --height 720
+"""
+
+import argparse
+import datetime as _dt
+import json
+import logging as log
+import os
+import queue
+import shlex
+import subprocess
+import sys
+import threading
+import time
+import urllib.request
+import urllib.error
+from concurrent.futures import ThreadPoolExecutor, Future
+from typing import Optional, Tuple
+
+# Optional imports for OpenCV capture fallback
+try:
+ import cv2 # type: ignore
+ import numpy as np # type: ignore
+except Exception:
+ cv2 = None
+ np = None
+
+# Mic + STT
+try:
+ import sounddevice as sd # type: ignore
+except Exception:
+ sd = None
+
+try:
+ from vosk import Model as VoskModel, KaldiRecognizer # type: ignore
+except Exception:
+ VoskModel = None
+ KaldiRecognizer = None
+
+
+# ---------------------------- Utilities ----------------------------
+
+def which(cmd: str) -> bool:
+ from shutil import which as _which
+ return _which(cmd) is not None
+
+def run_cmd(cmd: str, timeout: int = 30) -> int:
+ log.debug("Running: %s", cmd)
+ try:
+ proc = subprocess.run(shlex.split(cmd), stdout=subprocess.PIPE, stderr=subprocess.PIPE, timeout=timeout)
+ if proc.returncode != 0:
+ log.debug("Command stderr: %s", proc.stderr.decode(errors="ignore"))
+ return proc.returncode
+ except Exception as e:
+ log.debug("Command failed to start: %s", e)
+ return 127
+
+def ensure_dir(path: str) -> None:
+ os.makedirs(path, exist_ok=True)
+
+def timestamp() -> str:
+ return _dt.datetime.now().strftime("%Y%m%d_%H%M%S")
+
+def http_post_json(url: str, payload: dict, timeout: int = 10) -> Optional[dict]:
+ data = json.dumps(payload).encode("utf-8")
+ req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"}, method="POST")
+ try:
+ with urllib.request.urlopen(req, timeout=timeout) as resp:
+ body = resp.read().decode("utf-8", errors="ignore")
+ return json.loads(body)
+ except urllib.error.URLError as e:
+ log.debug("HTTP POST to %s failed: %s", url, e)
+ except Exception as e:
+ log.debug("HTTP POST parsing failed: %s", e)
+ return None
+
+
+# ---------------------------- Camera ----------------------------
+
+def mean_brightness(frame) -> float:
+ if cv2 is None or np is None or frame is None:
+ return -1.0
+ try:
+ gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
+ return float(np.mean(gray))
+ except Exception:
+ return -1.0
+
+def capture_with_libcamera_still(out_path: str, width: int, height: int, timeout_ms: int) -> bool:
+ if not which("libcamera-still"):
+ return False
+ cmd = f"libcamera-still -n -t {timeout_ms} -o {shlex.quote(out_path)} --width {width} --height {height} --denoise cdn_fast"
+ rc = run_cmd(cmd, timeout=max(5, timeout_ms // 1000 + 10))
+ ok = (rc == 0 and os.path.isfile(out_path) and os.path.getsize(out_path) > 0)
+ log.debug("libcamera-still rc=%s ok=%s", rc, ok)
+ return ok
+
+def capture_with_fswebcam(out_path: str, width: int, height: int, delay_s: int) -> bool:
+ if not which("fswebcam"):
+ return False
+ cmd = f"fswebcam -D {delay_s} -r {width}x{height} --no-banner {shlex.quote(out_path)}"
+ rc = run_cmd(cmd, timeout=delay_s + 15)
+ ok = (rc == 0 and os.path.isfile(out_path) and os.path.getsize(out_path) > 0)
+ log.debug("fswebcam rc=%s ok=%s", rc, ok)
+ return ok
+
+def capture_with_opencv(
+ out_path: str,
+ cam_index: int,
+ width: int,
+ height: int,
+ warmup_frames: int,
+ warmup_sleep_ms: int,
+ retries: int,
+ brightness_min: float,
+ retry_sleep_ms: int,
+ use_mjpg: bool,
+ exp_high_seq: list[int],
+ exp_low_seq: list[int],
+) -> bool:
+ if cv2 is None or np is None:
+ log.error("OpenCV not available; install with: python3 -m pip install opencv-python numpy")
+ return False
+
+ cap = cv2.VideoCapture(cam_index, cv2.CAP_V4L2)
+ if not cap.isOpened():
+ cap = cv2.VideoCapture(cam_index)
+ if not cap.isOpened():
+ log.error("Could not open camera index %s", cam_index)
+ return False
+
+ if use_mjpg:
+ try:
+ fourcc = cv2.VideoWriter_fourcc(*"MJPG")
+ cap.set(cv2.CAP_PROP_FOURCC, fourcc)
+ except Exception:
+ pass
+
+ cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
+ cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
+
+ try:
+ cap.set(cv2.CAP_PROP_AUTO_EXPOSURE, 0.75)
+ except Exception:
+ pass
+ try:
+ cap.set(cv2.CAP_PROP_AUTO_WB, 1)
+ except Exception:
+ pass
+
+ for _ in range(max(0, warmup_frames)):
+ cap.read()
+ if warmup_sleep_ms > 0:
+ time.sleep(warmup_sleep_ms / 1000.0)
+
+ ok, frame = cap.read()
+ if not ok or frame is None:
+ log.error("OpenCV: failed to read initial frame.")
+ cap.release()
+ return False
+
+ b = mean_brightness(frame)
+ log.debug("[capture] initial mean=%.1f", b)
+
+ if b >= 220.0 and exp_high_seq:
+ try:
+ cap.set(cv2.CAP_PROP_AUTO_EXPOSURE, 0.25)
+ except Exception:
+ pass
+ for ev in exp_high_seq:
+ try:
+ cap.set(cv2.CAP_PROP_EXPOSURE, float(ev))
+ except Exception:
+ pass
+ time.sleep(0.08)
+ ok, frame = cap.read()
+ if not ok or frame is None:
+ continue
+ b = mean_brightness(frame)
+ log.debug("[capture] manual exposure=%s mean=%.1f", ev, b)
+ if b < 220.0:
+ break
+ elif b < brightness_min and exp_low_seq:
+ try:
+ cap.set(cv2.CAP_PROP_AUTO_EXPOSURE, 0.25)
+ except Exception:
+ pass
+ for ev in exp_low_seq:
+ try:
+ cap.set(cv2.CAP_PROP_EXPOSURE, float(ev))
+ except Exception:
+ pass
+ time.sleep(0.08)
+ ok, frame = cap.read()
+ if not ok or frame is None:
+ continue
+ b = mean_brightness(frame)
+ log.debug("[capture] manual exposure=%s mean=%.1f", ev, b)
+ if b >= brightness_min:
+ break
+
+ frame_ok = None
+ last_mean = b
+ for i in range(max(1, retries)):
+ if frame is None:
+ ok, frame = cap.read()
+ if not ok:
+ if retry_sleep_ms > 0:
+ time.sleep(retry_sleep_ms / 1000.0)
+ continue
+ last_mean = mean_brightness(frame)
+ log.debug("[capture] try=%s/%s mean=%.1f", i + 1, retries, last_mean)
+ if brightness_min <= last_mean <= 245.0:
+ frame_ok = frame
+ break
+ ok, frame = cap.read()
+ if not ok:
+ if retry_sleep_ms > 0:
+ time.sleep(retry_sleep_ms / 1000.0)
+ frame = None
+
+ cap.release()
+
+ if frame_ok is None:
+ frame_ok = frame if frame is not None else np.zeros((height, width, 3), dtype=np.uint8)
+ log.warning("OpenCV: brightness out of bounds (mean=%.1f). Saving anyway.", last_mean)
+
+ try:
+ ok = cv2.imwrite(out_path, frame_ok)
+ if not ok:
+ log.error("cv2.imwrite failed for %s", out_path)
+ return False
+ except Exception as e:
+ log.error("Failed to save image: %s", e)
+ return False
+
+ return True
+
+def capture_image(
+ out_path: str,
+ cam_index: int,
+ width: int,
+ height: int,
+ libcamera_timeout_ms: int,
+ fswebcam_delay_s: int,
+ warmup_frames: int,
+ warmup_sleep_ms: int,
+ retries: int,
+ brightness_min: float,
+ retry_sleep_ms: int,
+ use_mjpg: bool,
+ exp_high_seq: list[int],
+ exp_low_seq: list[int],
+) -> bool:
+ log.info("Capturing image to: %s", out_path)
+ if capture_with_libcamera_still(out_path, width, height, libcamera_timeout_ms):
+ return True
+ if capture_with_fswebcam(out_path, width, height, fswebcam_delay_s):
+ return True
+ return capture_with_opencv(
+ out_path,
+ cam_index,
+ width,
+ height,
+ warmup_frames,
+ warmup_sleep_ms,
+ retries,
+ brightness_min,
+ retry_sleep_ms,
+ use_mjpg,
+ exp_high_seq,
+ exp_low_seq,
+ )
+
+
+# ---------------------------- Face recognition glue ----------------------------
+
+def recognize_via_server(server_url: str, image_path: str) -> Optional[Tuple[str, float]]:
+ payload = {"image_path": image_path}
+ log.debug("POST %s payload=%s", server_url, payload)
+ resp = http_post_json(server_url, payload, timeout=15)
+ log.debug("face_server response=%s", resp)
+ if not resp:
+ return None
+ name = resp.get("name", "unknown") or "unknown"
+ conf = float(resp.get("confidence", 0.0))
+ return (name, conf)
+
+def recognize_via_cli(face_recognize_py: str, image_path: str, headshots: str, model: str, det: int) -> Optional[Tuple[str, float]]:
+ if not os.path.isfile(face_recognize_py):
+ log.error("Fallback recognizer not found: %s", face_recognize_py)
+ return None
+ cmd = f"python3 {shlex.quote(face_recognize_py)} {shlex.quote(image_path)} --headshots {shlex.quote(headshots)} --model {model} --det {det}"
+ log.debug("Fallback CLI: %s", cmd)
+ try:
+ proc = subprocess.run(shlex.split(cmd), stdout=subprocess.PIPE, stderr=subprocess.PIPE, timeout=45)
+ except Exception as e:
+ log.debug("Fallback CLI failed to execute: %s", e)
+ return None
+ out = proc.stdout.decode(errors="ignore").strip()
+ log.debug("Fallback CLI stdout: %r", out)
+ if not out:
+ log.debug("Fallback CLI produced no output. stderr: %s", proc.stderr.decode(errors="ignore"))
+ return None
+ try:
+ obj = json.loads(out)
+ except Exception:
+ log.debug("Fallback CLI output is not JSON: %r", out)
+ return None
+ name = obj.get("name", "unknown") or "unknown"
+ conf = float(obj.get("confidence", 0.0)) if "confidence" in obj else 0.0
+ return (name, conf)
+
+
+# ---------------------------- Love LLM & TTS ----------------------------
+
+def get_love_reply(love_url: str, name: str, message: str) -> Optional[str]:
+ payload = {"name": name}
+ if message:
+ payload["message"] = message
+ log.debug("POST %s payload=%s", love_url, payload)
+ resp = http_post_json(love_url, payload, timeout=30)
+ log.debug("love_server response=%s", resp)
+ if not resp:
+ return None
+ text = (resp.get("reply") or "").strip()
+ return text or None
+
+def speak_text(greet_script: str, text: str) -> None:
+ if os.path.isfile(greet_script) and os.access(greet_script, os.X_OK):
+ cmd = f"{shlex.quote(greet_script)} {shlex.quote(text)}"
+ log.info("Speaking via: %s", cmd)
+ rc = run_cmd(cmd, timeout=max(60, 10 + len(text) // 4))
+ if rc != 0:
+ log.warning("TTS script returned non-zero exit code: %s", rc)
+ else:
+ log.warning("TTS script not executable or not found: %s", greet_script)
+ log.info("TTS fallback text: %s", text)
+
+
+# ---------------------------- STT (Vosk) ----------------------------
+
+class LiveSTT:
+ """
+ Simple Vosk-based microphone listener.
+ - Opens a sounddevice input stream at target sample rate.
+ - Provides an iterator yielding finalized utterances (strings).
+ - End-of-utterance is determined by recognizer.AcceptWaveform on chunked audio.
+ """
+ def __init__(self, model_dir: str, samplerate: int = 16000, device: Optional[int] = None, blocksize: int = 8000):
+ if VoskModel is None or KaldiRecognizer is None:
+ raise RuntimeError("Vosk not available. Install with: python3 -m pip install vosk")
+ if sd is None:
+ raise RuntimeError("sounddevice not available. Install with: python3 -m pip install sounddevice")
+
+ self.model_dir = model_dir
+ self.model = VoskModel(model_dir)
+ self.samplerate = samplerate
+ self.device = device
+ self.blocksize = blocksize
+ self.q: "queue.Queue[bytes]" = queue.Queue()
+ self.stream: Optional[sd.InputStream] = None
+ self._stopping = False
+
+ def _callback(self, indata, frames, time_info, status):
+ if status:
+ log.debug("SD status: %s", status)
+ # indata is float32; convert to 16-bit PCM bytes for Vosk
+ data = (indata * 32767).astype("int16").tobytes()
+ self.q.put(data)
+
+ def start(self):
+ self._stopping = False
+ self.stream = sd.InputStream(
+ samplerate=self.samplerate,
+ channels=1,
+ dtype="float32",
+ callback=self._callback,
+ blocksize=self.blocksize,
+ device=self.device,
+ )
+ self.stream.start()
+ log.info("Microphone stream started (rate=%d, blocksize=%d).", self.samplerate, self.blocksize)
+
+ def stop(self):
+ self._stopping = True
+ if self.stream is not None:
+ try:
+ self.stream.stop()
+ self.stream.close()
+ except Exception:
+ pass
+ self.stream = None
+ # drain queue
+ with self.q.mutex:
+ self.q.queue.clear()
+ log.info("Microphone stream stopped.")
+
+ def listen_loop(self):
+ """
+ Generator that yields finalized utterance strings.
+ """
+ rec = KaldiRecognizer(self.model, self.samplerate)
+ rec.SetWords(True)
+
+ speaking = False
+ while not self._stopping:
+ try:
+ data = self.q.get(timeout=0.25)
+ except queue.Empty:
+ continue
+
+ if not data:
+ continue
+
+            # Mark "speaking" as soon as any audio arrives. Vosk exposes partials via
+            # rec.PartialResult(), but the simpler heuristic of "any data means speech
+            # has started" is enough here; finalization below detects end-of-utterance.
+            if not speaking:
+                speaking = True
+
+ if rec.AcceptWaveform(data):
+ # End-of-utterance (Vosk finalized)
+ try:
+ res = json.loads(rec.Result())
+ except Exception:
+ res = {}
+ text = (res.get("text") or "").strip()
+ if text:
+ yield text
+ # Reset recognizer for next utterance
+ rec = KaldiRecognizer(self.model, self.samplerate)
+ rec.SetWords(True)
+ speaking = False
+ else:
+ # Optional: could inspect partial for VAD, but we rely on finalization above
+ pass
+
+
+# ---------------------------- Main orchestration (loop) ----------------------------
+
+def main():
+ ap = argparse.ArgumentParser(description="Realtime mic -> STT -> Face -> Love LLM -> Piper TTS")
+ # Paths / modules
+ ap.add_argument("--base-dir", default=os.path.expanduser("~/facerecog"), help="Project base dir (default: ~/facerecog)")
+    ap.add_argument("--headshots", default=None, help="Headshots dir. Default: <base-dir>/headshots")
+    ap.add_argument("--log-dir", default=None, help="Log dir for captured images. Default: <base-dir>/log")
+    ap.add_argument("--greet-script", default=None, help="Path to Piper script (e.g., greet_name_piper.py). Default: <base-dir>/greet_name_piper.py")
+    ap.add_argument("--fallback-cli", default=None, help="Path to face_recognize.py fallback. Default: <base-dir>/face_recognize.py")
+
+ # Server endpoints
+ ap.add_argument("--server-url", default="http://127.0.0.1:7860/recognize", help="Face server endpoint URL")
+ ap.add_argument("--love-url", default="http://127.0.0.1:7861/reply", help="Love LLM server endpoint URL")
+
+ # Optional extra message for love server, appended after the live transcript
+ ap.add_argument("--llm-message", default="", help="Optional extra message for the love server")
+
+ # Capture/camera settings
+ ap.add_argument("--cam-index", type=int, default=0, help="Camera index (default: 0)")
+ ap.add_argument("--width", type=int, default=1280, help="Capture width (default: 1280)")
+ ap.add_argument("--height", type=int, default=720, help="Capture height (default: 720)")
+ ap.add_argument("--libcamera-timeout-ms", type=int, default=2000, help="libcamera-still timeout ms (default: 2000)")
+ ap.add_argument("--fswebcam-delay-s", type=int, default=2, help="fswebcam warm-up delay seconds (default: 2)")
+
+ # OpenCV fallback tuning
+ ap.add_argument("--warmup-frames", type=int, default=10, help="OpenCV warmup frames (default: 10)")
+ ap.add_argument("--warmup-sleep-ms", type=int, default=50, help="Sleep between warmup frames ms (default: 50)")
+ ap.add_argument("--retries", type=int, default=20, help="OpenCV capture retries (default: 20)")
+ ap.add_argument("--brightness-min", type=float, default=30.0, help="Acceptable min mean brightness (0..255)")
+ ap.add_argument("--retry-sleep-ms", type=int, default=60, help="Sleep between retries ms (default: 60)")
+ ap.add_argument("--use-mjpg", action="store_true", default=True, help="Try MJPG fourcc for UVC cams")
+ ap.add_argument("--no-mjpg", dest="use_mjpg", action="store_false", help="Disable MJPG fourcc")
+ ap.add_argument("--exp-high-seq", default="200,100,50,25", help="Exposure sequence (if overexposed) comma-separated")
+ ap.add_argument("--exp-low-seq", default="400,600,800", help="Exposure sequence (if underexposed) comma-separated")
+
+ # STT settings
+ ap.add_argument("--stt-model", default=os.path.expanduser("~/facerecog/vosk-model-small-en-us-0.15"),
+ help="Path to a Vosk model directory (default: small-en-us).")
+ ap.add_argument("--stt-rate", type=int, default=16000, help="STT sample rate (default: 16000)")
+ ap.add_argument("--stt-device", type=int, default=None, help="sounddevice input device index (default: system default)")
+ ap.add_argument("--stt-blocksize", type=int, default=8000, help="sounddevice blocksize (default: 8000)")
+
+ # Fallback face-recognizer CLI (only if face_server is unavailable)
+ ap.add_argument("--model", default="buffalo_m", help="Fallback CLI: InsightFace model (default: buffalo_m)")
+ ap.add_argument("--det", type=int, default=320, help="Fallback CLI: detector size (default: 320)")
+
+ # Logging
+ ap.add_argument("--debug", action="store_true", help="Enable debug logging")
+
+ args = ap.parse_args()
+
+ log.basicConfig(
+ level=log.DEBUG if args.debug else log.INFO,
+ format="%(asctime)s %(levelname)s %(message)s",
+ )
+
+ base_dir = os.path.expanduser(args.base_dir)
+ headshots = args.headshots or os.path.join(base_dir, "headshots")
+ log_dir = args.log_dir or os.path.join(base_dir, "log")
+ greet_script = args.greet_script or os.path.join(base_dir, "greet_name_piper.py")
+ fallback_cli = args.fallback_cli or os.path.join(base_dir, "face_recognize.py")
+
+ ensure_dir(log_dir)
+
+ if not os.path.isdir(headshots):
+ log.error("Headshots folder not found: %s", headshots)
+ sys.exit(2)
+
+ # Parse camera exposure sequences
+ try:
+ exp_high_seq = [int(x) for x in args.exp_high_seq.split(",") if x.strip()]
+ except Exception:
+ exp_high_seq = [200, 100, 50, 25]
+ try:
+ exp_low_seq = [int(x) for x in args.exp_low_seq.split(",") if x.strip()]
+ except Exception:
+ exp_low_seq = [400, 600, 800]
+
+ # Init STT
+ if VoskModel is None or KaldiRecognizer is None or sd is None:
+ log.error("Missing STT deps. Install: python3 -m pip install sounddevice vosk")
+ sys.exit(3)
+
+ try:
+ stt = LiveSTT(
+ model_dir=args.stt_model,
+ samplerate=args.stt_rate,
+ device=args.stt_device,
+ blocksize=args.stt_blocksize,
+ )
+ except Exception as e:
+ log.error("Failed to init STT: %s", e)
+ sys.exit(3)
+
+ # Thread pool for background face capture/recognition
+ pool = ThreadPoolExecutor(max_workers=2)
+ face_lock = threading.Lock()
+ face_future: Optional[Future] = None
+
+ def start_face_job_if_needed():
+ nonlocal face_future
+ with face_lock:
+ if face_future is None or face_future.done():
+ # Build image path per utterance
+ out_path = os.path.join(log_dir, f"capture_{timestamp()}.jpg")
+ def job():
+ ok = capture_image(
+ out_path=out_path,
+ cam_index=args.cam_index,
+ width=args.width,
+ height=args.height,
+ libcamera_timeout_ms=args.libcamera_timeout_ms,
+ fswebcam_delay_s=args.fswebcam_delay_s,
+ warmup_frames=args.warmup_frames,
+ warmup_sleep_ms=args.warmup_sleep_ms,
+ retries=args.retries,
+ brightness_min=args.brightness_min,
+ retry_sleep_ms=args.retry_sleep_ms,
+ use_mjpg=args.use_mjpg,
+ exp_high_seq=exp_high_seq,
+ exp_low_seq=exp_low_seq,
+ )
+ if not ok or not os.path.isfile(out_path) or os.path.getsize(out_path) == 0:
+ log.warning("Background capture failed.")
+ return ("unknown", 0.0)
+ log.info("Saved capture: %s", out_path)
+ # Try server first
+ res = recognize_via_server(args.server_url, out_path)
+ if res is not None:
+ return res
+ # Fallback
+ log.warning("Face server unavailable; using fallback recognizer CLI.")
+ res = recognize_via_cli(fallback_cli, out_path, headshots, args.model, args.det)
+ if res is not None:
+ return res
+ return ("unknown", 0.0)
+
+ face_future = pool.submit(job)
+
+ # Start mic stream
+ stt.start()
+ log.info("Listening... (Ctrl+C to stop)")
+ try:
+ for utterance in stt.listen_loop():
+            # We got a finished utterance. Kick off the background face
+            # capture/recognition job now if one is not already running
+            # (start_face_job_if_needed() also guards against duplicates).
+            if face_future is None or face_future.done():
+                start_face_job_if_needed()
+
+ # Compose LLM message: user's live text + optional extra
+ msg = utterance
+ if args.llm_message:
+ msg = f"{utterance}\n\n{args.llm_message}"
+
+ # Wait for face result briefly; if not ready, block up to a short timeout (e.g., 3s)
+ name, conf = "unknown", 0.0
+ with face_lock:
+ fut = face_future
+ if fut is not None:
+ try:
+ name, conf = fut.result(timeout=3.0)
+ except Exception:
+ # Not ready or failed; we can wait a bit longer or proceed with unknown
+ log.debug("Face job not ready; proceeding with name='unknown' for now.")
+ name, conf = "unknown", 0.0
+
+ # Query love_server
+ reply_text = get_love_reply(args.love_url, name, msg)
+ if not reply_text:
+ reply_text = f"{name if name!='unknown' else 'friend'}, I'm wildly excited you spoke—tell me more!"
+
+ log.info("User said: %r -> LLM reply: %s", utterance, reply_text)
+
+ # Pause mic while speaking to avoid feedback/false triggers
+ stt.stop()
+ try:
+ speak_text(greet_script, reply_text)
+ finally:
+ # Reset background face job for next utterance
+ with face_lock:
+ face_future = None
+ # Resume listening
+ stt.start()
+                # A new face job will be kicked off when the next utterance finalizes
+
+ # Small idle so audio device can settle
+ time.sleep(0.05)
+
+ except KeyboardInterrupt:
+ log.info("Interrupted by user, shutting down...")
+ finally:
+ stt.stop()
+ pool.shutdown(wait=False)
+
+
+if __name__ == "__main__":
+ main()
diff --git a/Lab 3/speech-scripts/check_words_example/number_input.sh b/Lab 3/speech-scripts/check_words_example/number_input.sh
new file mode 100644
index 0000000000..cad8ad5577
--- /dev/null
+++ b/Lab 3/speech-scripts/check_words_example/number_input.sh
@@ -0,0 +1,16 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+LABEL="${1:-phone number}"
+DUR="${2:-5}"
+OUT="${3:-number_input.txt}"
+
+TMP="/tmp/number_prompt.wav"
+pico2wave -w "$TMP" "Please say your ${LABEL} after the beep."
+aplay "$TMP" >/dev/null 2>&1 || true
+rm -f "$TMP"
+printf '\a'; sleep 0.2
+
+arecord -D "${ARECORD_DEVICE:-hw:2,0}" -f cd -c1 -r 48000 -d "$DUR" -t wav recorded_mono.wav
+python3 test_words.py recorded_mono.wav | tee "$OUT"
+echo "Saved transcript to: $OUT"
diff --git a/Lab 3/speech-scripts/greet_shawn.sh b/Lab 3/speech-scripts/greet_shawn.sh
new file mode 100644
index 0000000000..a8764fbc92
--- /dev/null
+++ b/Lab 3/speech-scripts/greet_shawn.sh
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+TMP="/tmp/greet_shawn.wav"
+TEXT="Hey Shawn, welcome back to the lab! Your Pi is ready."
+
+pico2wave -w "$TMP" "$TEXT"
+aplay "$TMP" >/dev/null 2>&1
+rm -f "$TMP"
diff --git a/Lab 4/README.md b/Lab 4/README.md
new file mode 100644
index 0000000000..afbb46ed98
--- /dev/null
+++ b/Lab 4/README.md
@@ -0,0 +1,498 @@
+
+# Ph-UI!!!
+
+
+ **Instructions for Students**
+
+ **Submission Cleanup Reminder:**
+ - This README.md contains extra instructional text for guidance.
+ - Before submitting, remove all instructional text and example prompts from this file.
+ - You may delete these sections or use the toggle/hide feature in VS Code to collapse them for a cleaner look.
+ - Your final submission should be neat, focused on your own work, and easy to read for grading.
+
+ This helps ensure your README.md is clear, professional, and uniquely yours!
+
+
+---
+
+## Lab 4 Deliverables
+
+### Part 1 (Week 1)
+**Submit the following for Part 1:**
+*️⃣ **A. Capacitive Sensing**
+ - Photos/videos of your Twizzler (or other object) capacitive sensor setup
+ - Code and terminal output showing touch detection
+
+*️⃣ **B. More Sensors**
+ - Photos/videos of each sensor tested (light/proximity, rotary encoder, joystick, distance sensor)
+ - Code and terminal output for each sensor
+
+*️⃣ **C. Physical Sensing Design**
+ - 5 sketches of different ways to use your chosen sensor
+ - Written reflection: questions raised, what to prototype
+ - Pick one design to prototype and explain why
+
+*️⃣ **D. Display & Housing**
+ - 5 sketches for display/button/knob positioning
+ - Written reflection: questions raised, what to prototype
+ - Pick one display design to integrate
+ - Rationale for design
+ - Photos/videos of your cardboard prototype
+
+---
+
+### Part 2 (Week 2)
+**Submit the following for Part 2:**
+*️⃣ **E. Multi-Device Demo**
+ - Code and video for your multi-input multi-output demo (e.g., chaining Qwiic buttons, servo, GPIO expander, etc.)
+ - Reflection on interaction effects and chaining
+
+*️⃣ **F. Final Documentation**
+ - Photos/videos of your final prototype
+ - Written summary: what it looks like, works like, acts like
+ - Reflection on what you learned and next steps
+
+---
+
+## Lab Overview
+**NAMES OF COLLABORATORS HERE**
+
+
+For lab this week, we focus both on sensing, to bring new modes of input into your devices, and on prototyping the physical look and feel of the device. You will think about the physical form the device needs in order to perform the sensing, as well as how to present the display or feedback about what was sensed.
+
+## Part 1 Lab Preparation
+
+### Get the latest content:
+As always, pull updates from the class Interactive-Lab-Hub to both your Pi and your own GitHub repo. As we discussed in class, there are a few ways you can do so:
+
+
+Option 1: On the Pi, `cd` to your `Interactive-Lab-Hub`, pull the updates from upstream (class lab-hub) and push the updates back to your own GitHub repo. You will need the personal access token for this.
+```
+pi@ixe00:~$ cd Interactive-Lab-Hub
+pi@ixe00:~/Interactive-Lab-Hub $ git pull upstream Fall2025
+pi@ixe00:~/Interactive-Lab-Hub $ git add .
+pi@ixe00:~/Interactive-Lab-Hub $ git commit -m "get lab4 content"
+pi@ixe00:~/Interactive-Lab-Hub $ git push
+```
+
+Option 2: On your own GitHub repo, [create a pull request](https://github.com/FAR-Lab/Developing-and-Designing-Interactive-Devices/blob/2021Fall/readings/Submitting%20Labs.md) to get updates from the class Interactive-Lab-Hub. After you have the latest updates online, go to your Pi, `cd` to your `Interactive-Lab-Hub`, and use `git pull` to get the updates from your own GitHub repo.
+
+Option 3 (preferred): use the GitHub.com web interface to pull in the changes.
+
+### Start brainstorming ideas by reading:
+
+* [What do prototypes prototype?](https://www.semanticscholar.org/paper/What-do-Prototypes-Prototype-Houde-Hill/30bc6125fab9d9b2d5854223aeea7900a218f149)
+* [Paper prototyping](https://www.uxpin.com/studio/blog/paper-prototyping-the-practical-beginners-guide/) is used by UX designers to quickly develop interface ideas and run them by people before any programming occurs.
+* [Cardboard prototypes](https://www.youtube.com/watch?v=k_9Q-KDSb9o) help interactive product designers to work through additional issues, like how big something should be, how it could be carried, where it would sit.
+* [Tips to Cut, Fold, Mold and Papier-Mache Cardboard](https://makezine.com/2016/04/21/working-with-cardboard-tips-cut-fold-mold-papier-mache/) from Make Magazine.
+* [Surprisingly complicated forms](https://www.pinterest.com/pin/50032245843343100/) can be built with paper, cardstock or cardboard. The most advanced and challenging prototypes to prototype with paper are [cardboard mechanisms](https://www.pinterest.com/helgangchin/paper-mechanisms/) which move and change.
+* [Dyson Vacuum Cardboard Prototypes](http://media.dyson.com/downloads/JDF/JDF_Prim_poster05.pdf)
+
+
+### Gathering materials for this lab:
+
+* Cardboard (start collecting those shipping boxes!)
+* Found objects and materials--like bananas and twigs.
+* Cutting board
+* Cutting tools
+* Markers
+
+
+(We do offer a shared cutting board, cutting tools, and markers on the class cart during the lab, so do not worry if you don't have them!)
+
+## Deliverables \& Submission for Lab 4
+
+The deliverables for this lab are writings, sketches, photos, and videos that show what your prototype:
+* "Looks like": shows how the device should look, feel, sit, weigh, etc.
+* "Works like": shows what the device can do.
+* "Acts like": shows how a person would interact with the device.
+
+For submission, the readme.md page for this lab should be edited to include the work you have done:
+* Upload any materials that explain what you did into your Lab 4 repository, and link them in your Lab 4 readme.md.
+* Link your Lab 4 readme.md in your main Interactive-Lab-Hub readme.md.
+* Labs are due on Mondays; make sure to submit your Lab 4 readme.md to Canvas.
+
+
+## Lab Overview
+
+A) [Capacitive Sensing](#part-a)
+
+B) [OLED screen](#part-b)
+
+C) [Paper Display](#part-c)
+
+D) [Materiality](#part-d)
+
+E) [Servo Control](#part-e)
+
+F) [Record the interaction](#part-f)
+
+
+## The Report (Part 1: A-D, Part 2: E-F)
+
+### Quick Start: Python Environment Setup
+
+1. **Create and activate a virtual environment in Lab 4:**
+ ```bash
+ cd ~/Interactive-Lab-Hub/Lab\ 4
+ python3 -m venv .venv
+ source .venv/bin/activate
+ ```
+2. **Install all Lab 4 requirements:**
+ ```bash
+ pip install -r requirements2025.txt
+ ```
+3. **Check CircuitPython Blinka installation:**
+ ```bash
+ python blinkatest.py
+ ```
+   If you see "Hello, blinka!", your setup is correct. If not, follow the troubleshooting steps in the file or ask for help.
+
+### Part A
+### Capacitive Sensing, a.k.a. Human-Twizzler Interaction
+
+We want to introduce you to the [capacitive sensor](https://learn.adafruit.com/adafruit-mpr121-gator) in your kit. It's one of the most flexible input devices we are able to provide. At boot, it measures the capacitance on each of the 12 contacts. Whenever that capacitance changes, it considers it a user touch. You can attach any conductive material. In your kit, you have copper tape that will work well, but don't limit yourself! In the example below, we use Twizzlers--you should pick your own objects.
+
+
+
+
+
+
+
+Plug in the capacitive sensor board with the Qwiic connector. Connect your Twizzlers with either the copper tape or the alligator clips (the clips work better). Make sure the Lab 4 requirements are installed in your working virtual environment (see the Quick Start above).
+
+These Twizzlers are connected to pads 6 and 10. When you run the code and touch a Twizzler, the terminal will print out the following:
+
+```
+(circuitpython) pi@ixe00:~/Interactive-Lab-Hub/Lab 4 $ python cap_test.py
+Twizzler 10 touched!
+Twizzler 6 touched!
+```
+
+### Part B
+### More sensors
+
+#### Light/Proximity/Gesture sensor (APDS-9960)
+
+We here want you to get to know this awesome sensor [Adafruit APDS-9960](https://www.adafruit.com/product/3595). It is capable of sensing proximity, light (also RGB), and gesture!
+
+
+
+
+Connect it to your Pi with the Qwiic connector and try running the three example scripts individually to see what the sensor is capable of doing!
+
+```
+(circuitpython) pi@ixe00:~/Interactive-Lab-Hub/Lab 4 $ python proximity_test.py
+...
+(circuitpython) pi@ixe00:~/Interactive-Lab-Hub/Lab 4 $ python gesture_test.py
+...
+(circuitpython) pi@ixe00:~/Interactive-Lab-Hub/Lab 4 $ python color_test.py
+...
+```
+
+You can go to the [Adafruit GitHub Page](https://github.com/adafruit/Adafruit_CircuitPython_APDS9960) to see more examples for this sensor!
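+
+If you want to fold proximity readings into your own script, here is a minimal sketch using the Adafruit APDS9960 CircuitPython library (the 0.2 s polling interval is just an illustrative choice):
+
+```python
+import time
+import board
+from adafruit_apds9960.apds9960 import APDS9960
+
+i2c = board.I2C()              # the Qwiic/STEMMA QT connector shares the Pi's I2C bus
+apds = APDS9960(i2c)
+apds.enable_proximity = True   # the proximity engine is off by default
+
+while True:
+    # proximity is a unitless 0-255 reading; higher roughly means "closer"
+    print("proximity:", apds.proximity)
+    time.sleep(0.2)
+```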
+
+#### Rotary Encoder
+
+A rotary encoder is an electro-mechanical device that converts angular position into analog or digital output signals. The [Adafruit rotary encoder](https://www.adafruit.com/product/4991#technical-details) we ordered for you comes as a separate breakout board and encoder, so they will need to be soldered together if you have not yet done so! We will bring the soldering station to the lab class for you to use; you can also go to the MakerLAB to do the soldering outside of class. Here is some [guidance on soldering](https://learn.adafruit.com/adafruit-guide-excellent-soldering/preparation) from Adafruit. The first time you solder, get help from someone who has done it before (ideally in the MakerLAB environment). It is a good idea to review this material beforehand so you know what to look at.
+
+
+
+
+
+
+
+
+Connect it to your Pi with the Qwiic connector and try running the example script; it comes with an additional button which might be useful for your design!
+
+```
+(circuitpython) pi@ixe00:~/Interactive-Lab-Hub/Lab 4 $ python encoder_test.py
+```
+
+You can go to the [Adafruit Learn Page](https://learn.adafruit.com/adafruit-i2c-qt-rotary-encoder/python-circuitpython) to learn more about the sensor! The sensor actually comes with an LED (a NeoPixel): can you try lighting it up?
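+
+As a starting point for that challenge, here is a minimal sketch that reads the encoder and colors its built-in NeoPixel by position using the `adafruit_seesaw` library (the I2C address `0x36` and NeoPixel pin `6` follow the Adafruit guide linked above; double-check them against your board):
+
+```python
+import board
+from adafruit_seesaw import seesaw, rotaryio, neopixel
+
+ss = seesaw.Seesaw(board.I2C(), addr=0x36)   # Qwiic rotary encoder breakout
+encoder = rotaryio.IncrementalEncoder(ss)
+pixel = neopixel.NeoPixel(ss, 6, 1)          # the single NeoPixel under the knob
+pixel.brightness = 0.2
+
+while True:
+    # map the knob position onto a simple red-to-blue fade, wrapping every 32 detents
+    step = (encoder.position % 32) * 8
+    pixel.fill((255 - step, 0, step))
+```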
+
+#### Joystick
+
+
+A [joystick](https://www.sparkfun.com/products/15168) can be used to sense and report the stick's pivot angle or direction. It also comes with a button input!
+
+
+
+
+
+Connect it to your Pi with the Qwiic connector and try running the example script to see what it can do!
+
+```
+(circuitpython) pi@ixe00:~/Interactive-Lab-Hub/Lab 4 $ python joystick_test.py
+```
+
+You can go to the [SparkFun GitHub Page](https://github.com/sparkfun/Qwiic_Joystick_Py) to learn more about the sensor!
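+
+If you want to poll it from your own code, here is a minimal sketch using SparkFun's `qwiic_joystick` Python package (the `horizontal`/`vertical`/`button` property names follow SparkFun's own examples; verify them against the library version you installed):
+
+```python
+import time
+import qwiic_joystick
+
+joystick = qwiic_joystick.QwiicJoystick()
+if not joystick.is_connected():
+    raise SystemExit("Qwiic Joystick not found on the I2C bus")
+joystick.begin()
+
+while True:
+    # horizontal/vertical are 10-bit readings (roughly 0-1023, ~512 at rest);
+    # the button value typically reads 0 while pressed on this board
+    print(f"X: {joystick.horizontal}  Y: {joystick.vertical}  Button: {joystick.button}")
+    time.sleep(0.1)
+```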
+
+#### Distance Sensor
+
+
+Earlier we asked you to play with the proximity sensor, which is able to sense objects within a short distance. Here, we offer the [Sparkfun Proximity Sensor Breakout](https://www.sparkfun.com/products/15177), which can detect objects up to 20 cm away.
+
+
+
+
+
+
+Connect it to your Pi with the Qwiic connector and try running the example script to see how it works!
+
+```
+(circuitpython) pi@ixe00:~/Interactive-Lab-Hub/Lab 4 $ python qwiic_distance.py
+```
+
+You can go to the [SparkFun GitHub Page](https://github.com/sparkfun/Qwiic_Proximity_Py) to learn more about the sensor and see other examples!
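+
+For your own scripts, here is a minimal sketch using SparkFun's `qwiic_proximity` Python package (the `get_proximity()` call follows SparkFun's examples for this VCNL4040-based board; treat it as something to verify against the library version you installed):
+
+```python
+import time
+import qwiic_proximity
+
+prox = qwiic_proximity.QwiicProximity()
+prox.begin()   # initialize the sensor with default settings
+
+while True:
+    # the reading is a unitless value that grows as an object gets closer,
+    # so it is best used for thresholding rather than exact distance
+    print("proximity:", prox.get_proximity())
+    time.sleep(0.1)
+```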
+
+### Part C
+### Physical considerations for sensing
+
+
+Usually, sensors need to be positioned in specific locations or orientations to make them useful for their application. Now that you've tried a bunch of the sensors, pick one that you would like to use, and an application where you use the output of that sensor for an interaction. For example, you can use a distance sensor to measure someone's height if you position it overhead and get them to stand under it.
+
+
+**\*\*\*Draw 5 sketches of different ways you might use your sensor, and how the larger device needs to be shaped in order to make the sensor useful.\*\*\***
+
+**\*\*\*What are some things these sketches raise as questions? What do you need to physically prototype to understand how to answer those questions?\*\*\***
+
+**\*\*\*Pick one of these designs to prototype.\*\*\***
+
+
+### Part D
+### Physical considerations for displaying information and housing parts
+
+
+
+Here is a Pi with a paper faceplate on it to turn it into a display interface:
+
+
+
+
+
+This is fine, but the mounting of the display constrains the display location and orientation a lot. Also, it really only works for applications where people can come and stand over the Pi, or where you can mount the Pi to the wall.
+
+Here is another prototype for a paper display:
+
+
+
+
+Your kit includes these [SparkFun Qwiic OLED screens](https://www.sparkfun.com/products/17153). These use less power than the MiniTFTs you have mounted on the GPIO pins of the Pi, but, more importantly, they can be more flexibly mounted elsewhere on your physical interface. The way you program this display is almost identical to the way you program a Pi display. Take a look at `oled_test.py` and some more of the [Adafruit examples](https://github.com/adafruit/Adafruit_CircuitPython_SSD1306/tree/master/examples).
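+
+As a minimal sketch of what `oled_test.py` is doing, you can drive the screen with the `adafruit_ssd1306` library like this (the 128x32 resolution and `0x3C` address are assumptions for this particular SparkFun board; adjust them to whatever `i2cdetect -y 1` and the product page report):
+
+```python
+import board
+import adafruit_ssd1306
+
+i2c = board.I2C()
+oled = adafruit_ssd1306.SSD1306_I2C(128, 32, i2c, addr=0x3C)
+
+oled.fill(0)                       # clear the framebuffer
+oled.text("Sensor: 42", 0, 0, 1)   # draw text at (x=0, y=0) in "on" pixels
+oled.text("Status: OK", 0, 12, 1)
+oled.show()                        # push the framebuffer to the display
+```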
+
+
+
+
+
+
+
+It holds a Pi and a USB power supply, and provides a front stage on which to put writing, graphics, LEDs, buttons or displays.
+
+This design can be made by scoring a long strip of corrugated cardboard of width X, with the following measurements:
+
+| Y height of box - thickness of cardboard | Z depth of box - thickness of cardboard | Y height of box | Z depth of box | H height of faceplate (don't make this too short!) |
+| --- | --- | --- | --- | --- |
+
+Fold the first flap of the strip so that it sits flush against the back of the faceplate, and tape, velcro or hot glue it in place. This will make an H x X interface, with a box of Z x X footprint (which you can adapt to the things you want to put in the box) and a height Y in the back.
+
+Here is an example:
+
+
+
+Think about how you want to present the information about what your sensor is sensing! Design a paper display for your project that communicates the state of the Pi and a sensor. Ideally you should design it so that you can slide the Pi out to work on the circuit or programming, and then slide it back in and reattach a few wires to be back in operation.
+
+**\*\*\*Sketch 5 designs for how you would physically position your display and any buttons or knobs needed to interact with it.\*\*\***
+
+**\*\*\*What are some things these sketches raise as questions? What do you need to physically prototype to understand how to answer those questions?\*\*\***
+
+**\*\*\*Pick one of these display designs to integrate into your prototype.\*\*\***
+
+**\*\*\*Explain the rationale for the design.\*\*\*** (e.g. Does it need to be a certain size or form or need to be able to be seen from a certain distance?)
+
+Build a cardboard prototype of your design.
+
+
+**\*\*\*Document your rough prototype.\*\*\***
+
+
+# LAB PART 2
+
+### Part 2
+
+Following exploration and reflection from Part 1, complete the "looks like," "works like" and "acts like" prototypes for your design, reiterated below.
+
+
+
+### Part E
+
+#### Chaining Devices and Exploring Interaction Effects
+
+For Part 2, you will design and build a fun interactive prototype using multiple inputs and outputs. This means chaining Qwiic and STEMMA QT devices (e.g., buttons, encoders, sensors, servos, displays) and/or combining with traditional breadboard prototyping (e.g., LEDs, buzzers, etc.).
+
+**Your prototype should:**
+- Combine at least two different types of input and output devices, inspired by your physical considerations from Part 1.
+- Be playful, creative, and demonstrate multi-input/multi-output interaction.
+
+**Document your system with:**
+- Code for your multi-device demo
+- Photos and/or video of the working prototype in action
+- A simple interaction diagram or sketch showing how inputs and outputs are connected and interact
+- Written reflection: What did you learn about multi-input/multi-output interaction? What was fun, surprising, or challenging?
+
+**Questions to consider:**
+- What new types of interaction become possible when you combine two or more sensors or actuators?
+- How does the physical arrangement of devices (e.g., where the encoder or sensor is placed) change the user experience?
+- What happens if you use one device to control or modulate another (e.g., encoder sets a threshold, sensor triggers an action)?
+- How does the system feel if you swap which device is "primary" and which is "secondary"?
+
+Try chaining different combinations and document what you discover!
+
+See encoder_accel_servo_dashboard.py in the Lab 4 folder for an example of chaining together three devices.
+
+**`Lab 4/encoder_accel_servo_dashboard.py`**
+
+#### Using Multiple Qwiic Buttons: Changing I2C Address (Physically & Digitally)
+
+If you want to use more than one Qwiic Button in your project, you must give each button a unique I2C address. There are two ways to do this:
+
+##### 1. Physically: Soldering Address Jumpers
+
+On the back of the Qwiic Button, you'll find four solder jumpers labeled A0, A1, A2, and A3. By bridging these with solder, you change the I2C address. Only one button on the chain can use the default address (0x6F).
+
+**Address Table:**
+
+| A3 | A2 | A1 | A0 | Address (hex) |
+|----|----|----|----|---------------|
+| 0 | 0 | 0 | 0 | 0x6F |
+| 0 | 0 | 0 | 1 | 0x6E |
+| 0 | 0 | 1 | 0 | 0x6D |
+| 0 | 0 | 1 | 1 | 0x6C |
+| 0 | 1 | 0 | 0 | 0x6B |
+| 0 | 1 | 0 | 1 | 0x6A |
+| 0 | 1 | 1 | 0 | 0x69 |
+| 0 | 1 | 1 | 1 | 0x68 |
+| 1 | 0 | 0 | 0 | 0x67 |
+| ...| ...| ...| ... | ... |
+
+For example, if you solder A0 closed (leave A1, A2, A3 open), the address becomes 0x6E.
+
+**Soldering Tips:**
+- Use a small amount of solder to bridge the pads for the jumper you want to close.
+- Close the combination of jumpers shown in the table above for the address you want (some addresses require bridging more than one jumper).
+- Power cycle the button after changing the jumper.
+
+##### 2. Digitally: Using Software to Change Address
+
+You can also change the address in software (temporarily or permanently) using the example script `qwiic_button_ex6_changeI2CAddress.py` in the Lab 4 folder. This is useful if you want to reassign addresses without soldering.
+
+Run the script and follow the prompts:
+```bash
+python qwiic_button_ex6_changeI2CAddress.py
+```
+Enter the new address (e.g., 5B for 0x5B) when prompted. Power cycle the button after changing the address.
+
+**Note:** The software method is less foolproof, and you need to keep track of which button has which address!
+
+
+##### Using Multiple Buttons in Code
+
+After setting unique addresses, you can use multiple buttons in your script. See these example scripts in the Lab 4 folder:
+
+- **`qwiic_1_button.py`**: Basic example for reading a single Qwiic Button (default address 0x6F). Run with:
+ ```bash
+ python qwiic_1_button.py
+ ```
+
+- **`qwiic_button_led_demo.py`**: Demonstrates using two Qwiic Buttons at different addresses (e.g., 0x6F and 0x6E) and controlling their LEDs. Button 1 toggles its own LED; Button 2 toggles both LEDs. Run with:
+ ```bash
+ python qwiic_button_led_demo.py
+ ```
+
+Here is a minimal code example for two buttons:
+```python
+import qwiic_button
+
+# Default button (0x6F)
+button1 = qwiic_button.QwiicButton()
+# Button with A0 soldered (0x6E)
+button2 = qwiic_button.QwiicButton(0x6E)
+
+button1.begin()
+button2.begin()
+
+while True:
+ if button1.is_button_pressed():
+ print("Button 1 pressed!")
+ if button2.is_button_pressed():
+ print("Button 2 pressed!")
+```
+
+For more details, see the [Qwiic Button Hookup Guide](https://learn.sparkfun.com/tutorials/qwiic-button-hookup-guide/all#i2c-address).
+
+---
+
+### PCF8574 GPIO Expander: Add More Pins Over I²C
+
+Sometimes your Pi’s header GPIO pins are already full (e.g., with a display or HAT). That’s where an I²C GPIO expander comes in handy.
+
+We use the Adafruit PCF8574 I²C GPIO Expander, which gives you 8 extra digital pins over I²C. It’s a great way to prototype with LEDs, buttons, or other components on the breadboard without worrying about pin conflicts—similar to how Arduino users often expand their pinouts when prototyping physical interactions.
+
+**Why is this useful?**
+- You only need two wires (I²C: SDA + SCL) to unlock 8 extra GPIOs.
+- It integrates smoothly with CircuitPython and Blinka.
+- It allows a clean prototyping workflow when the Pi’s 40-pin header is already occupied by displays, HATs, or sensors.
+- Makes breadboard setups feel more like an Arduino-style prototyping environment where it’s easy to wire up interaction elements.
+
+**Demo Script:** `Lab 4/gpio_expander.py`
+
+
+
+
+
+We connected 8 LEDs (through 220 Ω resistors) to the expander and ran a little light show. The script cycles through three patterns:
+- Chase (one LED at a time, left to right)
+- Knight Rider (back-and-forth sweep)
+- Disco (random blink chaos)
+
+Every few runs, the script swaps to the next pattern automatically:
+```bash
+python gpio_expander.py
+```
+
+This is a playful way to visualize how the expander works, but the same technique applies if you wanted to prototype buttons, switches, or other interaction elements. It’s a lightweight, flexible addition to your prototyping toolkit.
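+
+If you just want to see the expander API without the full light-show script, here is a minimal sketch that blinks one LED on expander pin P0 using the `adafruit_pcf8574` library (the I2C address `0x20` is an assumption; set it to whatever your breakout's address jumpers give you):
+
+```python
+import time
+import board
+import adafruit_pcf8574
+
+i2c = board.I2C()
+pcf = adafruit_pcf8574.PCF8574(i2c, address=0x20)
+
+led = pcf.get_pin(0)                 # expander pin P0
+led.switch_to_output(value=False)    # start with the LED off
+
+while True:
+    led.value = not led.value        # toggle once per second
+    time.sleep(1)
+```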
+
+---
+
+### Servo Control with SparkFun Servo pHAT
+For this lab, you will use the **SparkFun Servo pHAT** to control a micro servo (such as the Miuzei MS18 or similar 9g servo). The Servo pHAT stacks directly on top of the Adafruit Mini PiTFT (135×240) display without pin conflicts:
+- The Mini PiTFT uses SPI (GPIO22, 23, 24, 25) for display and buttons ([SPI pinout](https://pinout.xyz/pinout/spi)).
+- The Servo pHAT uses I²C (GPIO2 & 3) for the PCA9685 servo driver ([I2C pinout](https://pinout.xyz/pinout/i2c)).
+- Since SPI and I²C are separate buses, you can use both boards together.
+**⚡ Power:**
+- Plug a USB-C cable into the Servo pHAT to provide enough current for the servos. The Pi itself should still be powered by its own USB-C supply. Do NOT power servos from the Pi’s 5V rail.
+
+
+
+
+
+**Basic Python Example:**
+We provide a simple example script: `Lab 4/pi_servo_hat_test.py` (requires the `pi_servo_hat` Python package).
+Run the example:
+```
+python pi_servo_hat_test.py
+```
+For more details and advanced usage, see the [official SparkFun Servo pHAT documentation](https://learn.sparkfun.com/tutorials/pi-servo-phat-v2-hookup-guide/all#resources-and-going-further).
+A servo motor is a rotary actuator that allows for precise control of angular position. The position is set by the width of an electrical pulse (PWM). You can read [this Adafruit guide](https://learn.adafruit.com/adafruit-arduino-lesson-14-servo-motors/servo-motors) to learn more about how servos work.
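+
+Before wiring the servo into a larger interaction, it helps to confirm the pHAT works on its own. Here is a minimal sweep sketch using the same `pi_servo_hat` calls as `encoder_accel_servo_dashboard.py` (channel 0 and the 0-90 degree sweep are assumptions; match them to where you plugged in your servo and what its horn can safely reach):
+
+```python
+import time
+import pi_servo_hat
+
+servo = pi_servo_hat.PiServoHat()
+servo.restart()   # reset the PCA9685 so we start from a known state
+
+CHANNEL = 0       # the pHAT channel your servo's signal wire is plugged into
+
+while True:
+    # sweep from 0 to 90 degrees and back in 10-degree steps
+    for angle in list(range(0, 91, 10)) + list(range(90, -1, -10)):
+        servo.move_servo_position(CHANNEL, angle)
+        time.sleep(0.1)
+```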
+
+---
+
+
+### Part F
+
+### Record
+
+Document all the prototypes and iterations you have designed and worked on! Again, deliverables for this lab are writings, sketches, photos, and videos that show what your prototype:
+* "Looks like": shows how the device should look, feel, sit, weigh, etc.
+* "Works like": shows what the device can do
+* "Acts like": shows how a person would interact with the device
+
diff --git a/Lab 4/Servo_Setup.jpg b/Lab 4/Servo_Setup.jpg
new file mode 100644
index 0000000000..002f4dbd5e
Binary files /dev/null and b/Lab 4/Servo_Setup.jpg differ
diff --git a/Lab 4/Servo_pHAT.gif b/Lab 4/Servo_pHAT.gif
new file mode 100644
index 0000000000..826332dfe4
Binary files /dev/null and b/Lab 4/Servo_pHAT.gif differ
diff --git a/Lab 4/accel_test.py b/Lab 4/accel_test.py
new file mode 100644
index 0000000000..6cfa481b1d
--- /dev/null
+++ b/Lab 4/accel_test.py
@@ -0,0 +1,20 @@
+# SPDX-FileCopyrightText: Copyright (c) 2022 Edrig
+#
+# SPDX-License-Identifier: MIT
+import time
+
+import board
+
+from adafruit_lsm6ds.lsm6ds3 import LSM6DS3
+
+i2c = board.I2C() # uses board.SCL and board.SDA
+# i2c = board.STEMMA_I2C() # For using the built-in STEMMA QT connector on a microcontroller
+sensor = LSM6DS3(i2c)
+
+while True:
+ accel_x, accel_y, accel_z = sensor.acceleration
+ print(f"Acceleration: X:{accel_x:.2f}, Y: {accel_y:.2f}, Z: {accel_z:.2f} m/s^2")
+ gyro_x, gyro_y, gyro_z = sensor.gyro
+ print(f"Gyro X:{gyro_x:.2f}, Y: {gyro_y:.2f}, Z: {gyro_z:.2f} radians/s")
+ print("")
+ time.sleep(0.5)
\ No newline at end of file
diff --git a/Lab 4/blinkatest.py b/Lab 4/blinkatest.py
new file mode 100644
index 0000000000..82f85093eb
--- /dev/null
+++ b/Lab 4/blinkatest.py
@@ -0,0 +1,19 @@
+import board
+import digitalio
+import busio
+
+print("Hello, blinka!")
+
+# Try to create a Digital input
+pin = digitalio.DigitalInOut(board.D4)
+print("Digital IO ok!")
+
+# Try to create an I2C device
+i2c = busio.I2C(board.SCL, board.SDA)
+print("I2C ok!")
+
+# Try to create an SPI device
+spi = busio.SPI(board.SCLK, board.MOSI, board.MISO)
+print("SPI ok!")
+
+print("done!")
\ No newline at end of file
diff --git a/Lab 4/camera_test.py b/Lab 4/camera_test.py
new file mode 100644
index 0000000000..7cb64f1e08
--- /dev/null
+++ b/Lab 4/camera_test.py
@@ -0,0 +1,68 @@
+import cv2
+import pyaudio
+import wave
+import pygame
+
+def test_camera():
+ cap = cv2.VideoCapture(0) # Change 0 to 1 or 2 if your camera does not show up.
+ while True:
+ ret, frame = cap.read()
+ cv2.imshow('Camera Test', frame)
+ if cv2.waitKey(1) & 0xFF == ord('q'):
+ break
+ cap.release()
+ cv2.destroyAllWindows()
+
+def test_microphone():
+ p = pyaudio.PyAudio()
+ stream = p.open(format=pyaudio.paInt16, channels=1, rate=44100, input=True, frames_per_buffer=1024)
+ frames = []
+ print("Recording...")
+ for i in range(0, int(44100 / 1024 * 2)):
+ data = stream.read(1024)
+ frames.append(data)
+ print("Finished recording.")
+ stream.stop_stream()
+ stream.close()
+ p.terminate()
+ wf = wave.open('test.wav', 'wb')
+ wf.setnchannels(1)
+ wf.setsampwidth(p.get_sample_size(pyaudio.paInt16))
+ wf.setframerate(44100)
+ wf.writeframes(b''.join(frames))
+ wf.close()
+ print("Saved as test.wav")
+
+def test_speaker():
+ p = pyaudio.PyAudio()
+
+ # List all audio output devices
+ info = p.get_host_api_info_by_index(0)
+ numdevices = info.get('deviceCount')
+ for i in range(0, numdevices):
+ if (p.get_device_info_by_host_api_device_index(0, i).get('maxOutputChannels')) > 0:
+ print("Output Device id ", i, " - ", p.get_device_info_by_host_api_device_index(0, i).get('name'))
+
+ device_index = int(input("Enter the Output Device id to use: ")) # Enter the id of your USB audio device
+
+ wf = wave.open('test.wav', 'rb')
+ stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
+ channels=wf.getnchannels(),
+ rate=wf.getframerate(),
+ output=True,
+ output_device_index=device_index) # specify your device index here
+
+ data = wf.readframes(1024)
+ while data:
+ stream.write(data)
+ data = wf.readframes(1024)
+
+ stream.stop_stream()
+ stream.close()
+ p.terminate()
+
+
+if __name__ == "__main__":
+ test_camera()
+ test_microphone()
+ test_speaker()
diff --git a/Lab 4/cap_test.py b/Lab 4/cap_test.py
new file mode 100644
index 0000000000..cdb7f6037a
--- /dev/null
+++ b/Lab 4/cap_test.py
@@ -0,0 +1,16 @@
+
+import time
+import board
+import busio
+
+import adafruit_mpr121
+
+i2c = busio.I2C(board.SCL, board.SDA)
+
+mpr121 = adafruit_mpr121.MPR121(i2c)
+
+while True:
+ for i in range(12):
+ if mpr121[i].value:
+ print(f"Twizzler {i} touched!")
+ time.sleep(0.25) # Small delay to keep from spamming output messages.
diff --git a/Lab 4/chaining.png b/Lab 4/chaining.png
new file mode 100644
index 0000000000..e8eaf23ee6
Binary files /dev/null and b/Lab 4/chaining.png differ
diff --git a/Lab 4/color_test.py b/Lab 4/color_test.py
new file mode 100644
index 0000000000..57fa7a79fe
--- /dev/null
+++ b/Lab 4/color_test.py
@@ -0,0 +1,30 @@
+# SPDX-FileCopyrightText: 2021 ladyada for Adafruit Industries
+# SPDX-License-Identifier: MIT
+
+import time
+import board
+from adafruit_apds9960.apds9960 import APDS9960
+from adafruit_apds9960 import colorutility
+
+i2c = board.I2C()
+apds = APDS9960(i2c)
+apds.enable_color = True
+
+
+while True:
+ # create some variables to store the color data in
+
+ # wait for color data to be ready
+ while not apds.color_data_ready:
+ time.sleep(0.005)
+
+ # get the data and print the different channels
+ r, g, b, c = apds.color_data
+ print("red: ", r)
+ print("green: ", g)
+ print("blue: ", b)
+ print("clear: ", c)
+
+ print("color temp {}".format(colorutility.calculate_color_temperature(r, g, b)))
+ print("light lux {}".format(colorutility.calculate_lux(r, g, b)))
+ time.sleep(0.5)
\ No newline at end of file
diff --git a/Lab 4/distance_test.py b/Lab 4/distance_test.py
new file mode 100644
index 0000000000..22b80eb3c3
--- /dev/null
+++ b/Lab 4/distance_test.py
@@ -0,0 +1,29 @@
+"""
+ Reading distance from the laser based VL53L1X
+ This example prints the distance to an object. If you are getting weird
+ readings, be sure the vacuum tape has been removed from the sensor.
+"""
+
+import qwiic
+import time
+
+print("VL53L1X Qwiic Test\n")
+ToF = qwiic.QwiicVL53L1X()
+if (ToF.sensor_init() == None): # sensor_init() returns None on a good init
+ print("Sensor online!\n")
+
+while True:
+ try:
+ ToF.start_ranging() # Write configuration bytes to initiate measurement
+ time.sleep(.005)
+ distance = ToF.get_distance() # Get the result of the measurement from the sensor
+ time.sleep(.005)
+ ToF.stop_ranging()
+
+ distanceInches = distance / 25.4
+ distanceFeet = distanceInches / 12.0
+
+ print("Distance(mm): %s Distance(ft): %s" % (distance, distanceFeet))
+
+ except Exception as e:
+ print(e)
\ No newline at end of file
diff --git a/Lab 4/encoder_accel_servo_dashboard.py b/Lab 4/encoder_accel_servo_dashboard.py
new file mode 100644
index 0000000000..bc3d0a0809
--- /dev/null
+++ b/Lab 4/encoder_accel_servo_dashboard.py
@@ -0,0 +1,78 @@
+import time
+import board
+from adafruit_seesaw import seesaw, rotaryio, digitalio
+from adafruit_lsm6ds.lsm6ds3 import LSM6DS3
+import pi_servo_hat
+import math
+import os
+
+# --- Setup ---
+ss = seesaw.Seesaw(board.I2C(), addr=0x36)
+ss.pin_mode(24, ss.INPUT_PULLUP)
+button = digitalio.DigitalIO(ss, 24)
+encoder = rotaryio.IncrementalEncoder(ss)
+last_encoder = -999
+
+# Accelerometer
+sox = LSM6DS3(board.I2C(), address=0x6A)
+
+# Servo
+servo = pi_servo_hat.PiServoHat()
+servo.restart()
+SERVO_MIN = 0
+SERVO_MAX = 120
+SERVO_CH = 0
+
+# --- Modes ---
+MODES = ["Encoder Only", "Accelerometer Only", "Combined"]
+mode = 0
+mode_press = False
+
+# --- State ---
+base_angle = 60
+enc_factor = 5
+
+def clamp(val, minv, maxv):
+ return max(minv, min(maxv, val))
+
+def clear():
+ os.system('clear')
+
+# --- Main Loop ---
+tilt_offset = 0
+while True:
+ # --- Read encoder/button ---
+ enc_pos = -encoder.position
+ if not button.value and not mode_press:
+ mode = (mode + 1) % len(MODES)
+ mode_press = True
+ if button.value and mode_press:
+ mode_press = False
+
+ # --- Read accelerometer ---
+ accel_x, accel_y, accel_z = sox.acceleration
+ z_angle_rad = math.atan2(-accel_y, accel_x)
+ z_angle_deg = math.degrees(z_angle_rad)
+ raw_offset = -z_angle_deg * (40/90)
+ tilt_offset = 0.8 * tilt_offset + 0.2 * raw_offset
+ tilt_offset_int = int(tilt_offset)
+
+ # --- Calculate servo angle ---
+ if mode == 0: # Encoder Only
+ servo_angle = clamp(60 + enc_pos * enc_factor, SERVO_MIN, SERVO_MAX)
+ elif mode == 1: # Accelerometer Only
+ servo_angle = clamp(60 + tilt_offset_int, SERVO_MIN, SERVO_MAX)
+ else: # Combined
+ servo_angle = clamp(60 + enc_pos * enc_factor + tilt_offset_int, SERVO_MIN, SERVO_MAX)
+ servo.move_servo_position(SERVO_CH, servo_angle)
+
+ # --- Dashboard ---
+ clear()
+ print(f"=== Servo/Encoder/Accel Dashboard ===")
+ print(f"Mode: {MODES[mode]} (press encoder button to switch)")
+ print(f"Encoder position: {enc_pos}")
+ print(f"Accel X: {accel_x:.2f} Y: {accel_y:.2f} Z: {accel_z:.2f}")
+ print(f"Z angle: {z_angle_deg:.1f}° Tilt offset: {tilt_offset_int}")
+ print(f"Servo angle: {servo_angle}")
+ print(f"\n[Encoder sets base, Accel tilts needle, Combined = both]")
+ time.sleep(0.07)
diff --git a/Lab 4/encoder_test.py b/Lab 4/encoder_test.py
new file mode 100644
index 0000000000..8ecc7c1818
--- /dev/null
+++ b/Lab 4/encoder_test.py
@@ -0,0 +1,43 @@
+# SPDX-FileCopyrightText: 2021 John Furcean
+# SPDX-License-Identifier: MIT
+
+"""I2C rotary encoder simple test example."""
+
+import board
+from adafruit_seesaw import seesaw, rotaryio, digitalio
+
+# For use with the STEMMA connector on QT Py RP2040
+# import busio
+# i2c = busio.I2C(board.SCL1, board.SDA1)
+# seesaw = seesaw.Seesaw(i2c, 0x36)
+
+seesaw = seesaw.Seesaw(board.I2C(), addr=0x36)
+
+seesaw_product = (seesaw.get_version() >> 16) & 0xFFFF
+print("Found product {}".format(seesaw_product))
+if seesaw_product != 4991:
+ print("Wrong firmware loaded? Expected 4991")
+
+seesaw.pin_mode(24, seesaw.INPUT_PULLUP)
+button = digitalio.DigitalIO(seesaw, 24)
+button_held = False
+
+encoder = rotaryio.IncrementalEncoder(seesaw)
+last_position = None
+
+while True:
+
+ # negate the position to make clockwise rotation positive
+ position = -encoder.position
+
+ if position != last_position:
+ last_position = position
+ print("Position: {}".format(position))
+
+ if not button.value and not button_held:
+ button_held = True
+ print("Button pressed")
+
+ if button.value and button_held:
+ button_held = False
+ print("Button released")
\ No newline at end of file
diff --git a/Lab 4/gesture_test.py b/Lab 4/gesture_test.py
new file mode 100644
index 0000000000..2d4d455df4
--- /dev/null
+++ b/Lab 4/gesture_test.py
@@ -0,0 +1,26 @@
+# SPDX-FileCopyrightText: 2021 ladyada for Adafruit Industries
+# SPDX-License-Identifier: MIT
+
+import board
+from adafruit_apds9960.apds9960 import APDS9960
+
+i2c = board.I2C()
+
+apds = APDS9960(i2c)
+apds.enable_proximity = True
+apds.enable_gesture = True
+
+# Uncomment and set the rotation depending on how your sensor is mounted.
+# apds.rotation = 270 # 270 for CLUE
+
+while True:
+ gesture = apds.gesture()
+
+ if gesture == 0x01:
+ print("up")
+ elif gesture == 0x02:
+ print("down")
+ elif gesture == 0x03:
+ print("left")
+ elif gesture == 0x04:
+ print("right")
\ No newline at end of file
diff --git a/Lab 4/gpio_expander.py b/Lab 4/gpio_expander.py
new file mode 100644
index 0000000000..00aea270d3
--- /dev/null
+++ b/Lab 4/gpio_expander.py
@@ -0,0 +1,64 @@
+# gpio_expander.py
+# LED fun with PCF8574 I2C GPIO expander
+#
+# Demonstrates how to use an I2C GPIO expander to sink current
+# and control multiple LEDs for quick breadboard prototyping.
+
+import time
+import random
+import board
+import adafruit_pcf8574
+
+# Initialize I2C and PCF8574
+i2c = board.I2C()
+pcf = adafruit_pcf8574.PCF8574(i2c)
+
+# Grab all 8 pins
+leds = [pcf.get_pin(i) for i in range(8)]
+
+# Configure as outputs (HIGH = off, LOW = LED on)
+for ld in leds:
+ ld.switch_to_output(value=True)
+
+# --- Patterns ---
+def chase():
+ """Simple left-to-right chase"""
+ for ld in leds:
+ ld.value = False
+ time.sleep(0.12)
+ ld.value = True
+
+def knight_rider():
+ """Bounce back and forth"""
+ for ld in leds:
+ ld.value = False
+ time.sleep(0.12)
+ ld.value = True
+ for ld in reversed(leds[1:-1]):
+ ld.value = False
+ time.sleep(0.12)
+ ld.value = True
+
+def disco():
+ """Random LED flashing"""
+ for _ in range(12):
+ ld = random.choice(leds)
+ ld.value = False
+ time.sleep(0.08)
+ ld.value = True
+
+patterns = [chase, knight_rider, disco]
+
+# --- Main Loop ---
+pattern_index = 0
+runs = 0
+
+while True:
+ # Run current pattern
+ patterns[pattern_index]()
+ runs += 1
+
+ # After a few runs, switch pattern
+ if runs >= 5:
+ runs = 0
+ pattern_index = (pattern_index + 1) % len(patterns)
diff --git a/Lab 4/gpio_leds.gif b/Lab 4/gpio_leds.gif
new file mode 100644
index 0000000000..350a6229a6
Binary files /dev/null and b/Lab 4/gpio_leds.gif differ
diff --git a/Lab 4/joystick_test.py b/Lab 4/joystick_test.py
new file mode 100644
index 0000000000..133ad115d5
--- /dev/null
+++ b/Lab 4/joystick_test.py
@@ -0,0 +1,34 @@
+from __future__ import print_function
+import qwiic_joystick
+import time
+import sys
+
+def runExample():
+
+ print("\nSparkFun qwiic Joystick Example 1\n")
+ myJoystick = qwiic_joystick.QwiicJoystick()
+
+ if myJoystick.connected == False:
+ print("The Qwiic Joystick device isn't connected to the system. Please check your connection", \
+ file=sys.stderr)
+ return
+
+ myJoystick.begin()
+
+ print("Initialized. Firmware Version: %s" % myJoystick.version)
+
+ while True:
+
+ print("X: %d, Y: %d, Button: %d" % ( \
+ myJoystick.horizontal, \
+ myJoystick.vertical, \
+ myJoystick.button))
+
+ time.sleep(.5)
+
+if __name__ == '__main__':
+ try:
+ runExample()
+ except (KeyboardInterrupt, SystemExit) as exErr:
+ print("\nEnding Example 1")
+ sys.exit(0)
\ No newline at end of file
diff --git a/Lab 4/keypad_test.py b/Lab 4/keypad_test.py
new file mode 100644
index 0000000000..60e851d7d2
--- /dev/null
+++ b/Lab 4/keypad_test.py
@@ -0,0 +1,60 @@
+# Make sure to have everything set up
+# https://github.com/sparkfun/Qwiic_Keypad_Py
+# `pip install sparkfun-qwiic-keypad`
+
+# From https://github.com/sparkfun/Qwiic_Keypad_Py/blob/main/examples/qwiic_keypad_ex2.py
+
+
+from __future__ import print_function
+import qwiic_keypad
+import time
+import sys
+
+def runExample():
+
+ print("\nSparkFun qwiic Keypad Example 1\n")
+ myKeypad = qwiic_keypad.QwiicKeypad()
+
+ if myKeypad.connected == False:
+ print("The Qwiic Keypad device isn't connected to the system. Please check your connection", \
+ file=sys.stderr)
+ return
+
+ myKeypad.begin()
+
+ print("Initialized. Firmware Version: %s" % myKeypad.version)
+ print("Press a button: * to do a space. # to go to next line.")
+
+ button = 0
+ while True:
+
+ # necessary for keypad to pull button from stack to readable register
+ myKeypad.update_fifo()
+ button = myKeypad.get_button()
+
+ if button == -1:
+ print("No keypad detected")
+ time.sleep(1)
+
+ elif button != 0:
+
+ # Get the character version of this char
+ charButton = chr(button)
+ if charButton == '#':
+ print()
+ elif charButton == '*':
+ print(" ", end="")
+ else:
+ print(charButton, end="")
+
+ # Flush the stdout buffer to give immediate user feedback
+ sys.stdout.flush()
+
+ time.sleep(.25)
+
+if __name__ == '__main__':
+ try:
+ runExample()
+ except (KeyboardInterrupt, SystemExit) as exErr:
+ print("\nEnding Example 1")
+ sys.exit(0)
\ No newline at end of file
diff --git a/Lab 4/oled_test.py b/Lab 4/oled_test.py
new file mode 100644
index 0000000000..d6e96ff59e
--- /dev/null
+++ b/Lab 4/oled_test.py
@@ -0,0 +1,89 @@
+
+# SPDX-FileCopyrightText: 2021 ladyada for Adafruit Industries
+# SPDX-License-Identifier: MIT
+
+import board
+import busio
+import adafruit_ssd1306
+
+# Create the I2C interface.
+i2c = busio.I2C(board.SCL, board.SDA)
+
+# Create the SSD1306 OLED class.
+# The first two parameters are the pixel width and pixel height. Change these
+# to the right size for your display!
+oled = adafruit_ssd1306.SSD1306_I2C(128, 32, i2c)
+
+
+# Helper function to draw a circle from a given position with a given radius
+# This is an implementation of the midpoint circle algorithm,
+# see https://en.wikipedia.org/wiki/Midpoint_circle_algorithm#C_example for details
+def draw_circle(xpos0, ypos0, rad, col=1):
+ x = rad - 1
+ y = 0
+ dx = 1
+ dy = 1
+ err = dx - (rad << 1)
+ while x >= y:
+ oled.pixel(xpos0 + x, ypos0 + y, col)
+ oled.pixel(xpos0 + y, ypos0 + x, col)
+ oled.pixel(xpos0 - y, ypos0 + x, col)
+ oled.pixel(xpos0 - x, ypos0 + y, col)
+ oled.pixel(xpos0 - x, ypos0 - y, col)
+ oled.pixel(xpos0 - y, ypos0 - x, col)
+ oled.pixel(xpos0 + y, ypos0 - x, col)
+ oled.pixel(xpos0 + x, ypos0 - y, col)
+ if err <= 0:
+ y += 1
+ err += dy
+ dy += 2
+ if err > 0:
+ x -= 1
+ dx += 2
+ err += dx - (rad << 1)
+
+
+# initial center of the circle
+center_x = 63
+center_y = 15
+# how fast does it move in each direction
+x_inc = 1
+y_inc = 1
+# what is the starting radius of the circle
+radius = 8
+
+# start with a blank screen
+oled.fill(0)
+# we just blanked the framebuffer. to push the framebuffer onto the display, we call show()
+oled.show()
+while True:
+ # undraw the previous circle
+ draw_circle(center_x, center_y, radius, col=0)
+
+ # if bouncing off right
+ if center_x + radius >= oled.width:
+ # start moving to the left
+ x_inc = -1
+ # if bouncing off left
+ elif center_x - radius < 0:
+ # start moving to the right
+ x_inc = 1
+
+ # if bouncing off top
+ if center_y + radius >= oled.height:
+ # start moving down
+ y_inc = -1
+ # if bouncing off bottom
+ elif center_y - radius < 0:
+ # start moving up
+ y_inc = 1
+
+ # go more in the current direction
+ center_x += x_inc
+ center_y += y_inc
+
+ # draw the new circle
+ draw_circle(center_x, center_y, radius)
+ # show all the changes we just made
+
+ oled.show()
\ No newline at end of file
diff --git a/Lab 4/pi_servo_hat_test.py b/Lab 4/pi_servo_hat_test.py
new file mode 100644
index 0000000000..841be9f5b7
--- /dev/null
+++ b/Lab 4/pi_servo_hat_test.py
@@ -0,0 +1,28 @@
+import pi_servo_hat
+import time
+
+# For most 9g micro servos (like SG90, MS18, SER0048), safe range is 0-120 degrees
+SERVO_MIN = 0
+SERVO_MAX = 120
+SERVO_CH = 0 # Channel 0 by default
+
+servo = pi_servo_hat.PiServoHat()
+servo.restart()
+
+print(f"Sweeping servo on channel {SERVO_CH} from {SERVO_MIN} to {SERVO_MAX} degrees...")
+
+try:
+ while True:
+ # Sweep up
+ for angle in range(SERVO_MIN, SERVO_MAX + 1, 1):
+ servo.move_servo_position(SERVO_CH, angle)
+ print(f"Angle: {angle}")
+ time.sleep(0.01)
+ # Sweep down
+ for angle in range(SERVO_MAX, SERVO_MIN - 1, -1):
+ servo.move_servo_position(SERVO_CH, angle)
+ print(f"Angle: {angle}")
+ time.sleep(0.01)
+except KeyboardInterrupt:
+ print("\nTest stopped.")
+ servo.move_servo_position(SERVO_CH, 60) # Move to center on exit
diff --git a/Lab 4/proximity_test.py b/Lab 4/proximity_test.py
new file mode 100644
index 0000000000..9d0799f61c
--- /dev/null
+++ b/Lab 4/proximity_test.py
@@ -0,0 +1,15 @@
+# SPDX-FileCopyrightText: 2021 ladyada for Adafruit Industries
+# SPDX-License-Identifier: MIT
+
+import time
+import board
+from adafruit_apds9960.apds9960 import APDS9960
+
+i2c = board.I2C()
+apds = APDS9960(i2c)
+
+apds.enable_proximity = True
+
+while True:
+ print(apds.proximity)
+ time.sleep(0.2)
\ No newline at end of file
diff --git a/Lab 4/qwiic_1_button.py b/Lab 4/qwiic_1_button.py
new file mode 100644
index 0000000000..ad71620525
--- /dev/null
+++ b/Lab 4/qwiic_1_button.py
@@ -0,0 +1,33 @@
+
+import qwiic_button
+import time
+import sys
+
+def run_example():
+
+ print("\nSparkFun Qwiic Button Example 1")
+ my_button = qwiic_button.QwiicButton()
+
+ if my_button.begin() == False:
+ print("\nThe Qwiic Button isn't connected to the system. Please check your connection", \
+ file=sys.stderr)
+ return
+ print("\nButton ready!")
+
+ while True:
+
+ if my_button.is_button_pressed() == True:
+ print("The button is pressed!")
+
+ else:
+ print("The button is not pressed!")
+
+ time.sleep(0.1)
+
+if __name__ == '__main__':
+ try:
+ run_example()
+ except (KeyboardInterrupt, SystemExit) as exErr:
+ print("\nEnding Example 1")
+ sys.exit(0)
+
diff --git a/Lab 4/qwiic_button_ex6_changeI2CAddress.py b/Lab 4/qwiic_button_ex6_changeI2CAddress.py
new file mode 100644
index 0000000000..35340c267e
--- /dev/null
+++ b/Lab 4/qwiic_button_ex6_changeI2CAddress.py
@@ -0,0 +1,93 @@
+#!/usr/bin/env python
+#-----------------------------------------------------------------------------
+# qwiic_button_ex5.py
+#
+# Simple Example for the Qwiic Button. Shows how to change the I2C address of
+# the Qwiic Button
+#------------------------------------------------------------------------
+#
+# Written by Priyanka Makin @ SparkFun Electronics, January 2021
+#
+# This python library supports the SparkFun Electronics qwiic
+# qwiic sensor/board ecosystem on a Raspberry Pi (and compatible) single
+# board computers.
+#
+# More information on qwiic is at https://www.sparkfun.com/qwiic
+#
+# Do you like this library? Help support SparkFun. Buy a board!
+#
+#==================================================================================
+# Copyright (c) 2019 SparkFun Electronics
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#==================================================================================
+# Example 5
+
+import qwiic_button
+import time
+import sys
+
+# If you've already changed the I2C address, change this to the current address!
+currentAddress = qwiic_button._QWIIC_BUTTON_DEFAULT_ADDRESS
+
+def run_example():
+
+ print("\nSparkFun Qwiic Button Example 6")
+ my_button = qwiic_button.QwiicButton(currentAddress)
+
+ if my_button.begin() == False:
+ print("\nThe Qwiic Button isn't connected to the system. Please check your connection", \
+ file=sys.stderr)
+ return
+
+ print("\nButton ready!")
+
+ print("Enter a new I2C address for the Qwiic Button to use.")
+ print("Any address from 0x08 to 0x77 works.")
+ print("Don't use the 0x prefix. For instance, if you wanted to")
+ print("change the address to 0x5B, you would type 5B and hit enter.")
+
+ new_address = input("New Address: ")
+ new_address = int(new_address, 16)
+
+ # Check if the user entered a valid address
+ if new_address >= 0x08 and new_address <= 0x77:
+ print("Characters received and new address valid!")
+ print("Attempting to set Qwiic Button address...")
+
+ my_button.set_I2C_address(new_address)
+ print("Address successfully changed!")
+ # Check that the Qwiic Button acknowledges on the new address
+ time.sleep(0.02)
+ if my_button.begin() == False:
+ print("The Qwiic Button isn't connected to the system. Please check your connection", \
+ file=sys.stderr)
+
+ else:
+ print("Button acknowledged on new address!")
+
+ else:
+ print("Address entered not a valid I2C address")
+
+if __name__ == '__main__':
+ try:
+ run_example()
+ except (KeyboardInterrupt, SystemExit) as exErr:
+ print("\nEnding Example 6")
+ sys.exit(0)
diff --git a/Lab 4/qwiic_button_led_demo.py b/Lab 4/qwiic_button_led_demo.py
new file mode 100644
index 0000000000..c033ff45b1
--- /dev/null
+++ b/Lab 4/qwiic_button_led_demo.py
@@ -0,0 +1,49 @@
+import qwiic_button
+import time
+import sys
+
+# Example: Use two Qwiic buttons and their LEDs interactively
+# - Pressing button 1 toggles its own LED
+# - Pressing button 2 toggles both LEDs
+
+def run_example():
+ print("\nQwiic Button + LED Demo: Two Buttons, Two LEDs")
+ my_button1 = qwiic_button.QwiicButton()
+ my_button2 = qwiic_button.QwiicButton(0x6E)
+
+ if not my_button1.begin():
+ print("\nThe Qwiic Button 1 isn't connected. Check your connection.", file=sys.stderr)
+ return
+ if not my_button2.begin():
+ print("\nThe Qwiic Button 2 isn't connected. Check your connection.", file=sys.stderr)
+ return
+ print("\nButtons ready! Press to toggle LEDs.")
+
+ led1_on = False
+ led2_on = False
+ while True:
+ # Button 1 toggles its own LED
+ if my_button1.is_button_pressed():
+ led1_on = not led1_on
+ my_button1.LED_on(led1_on)
+ print(f"Button 1 pressed! LED 1 is now {'ON' if led1_on else 'OFF'}.")
+ # Wait for release to avoid rapid toggling
+ while my_button1.is_button_pressed():
+ time.sleep(0.02)
+ # Button 2 toggles both LEDs
+ if my_button2.is_button_pressed():
+ led1_on = not led1_on
+ led2_on = not led2_on
+ my_button1.LED_on(led1_on)
+ my_button2.LED_on(led2_on)
+ print(f"Button 2 pressed! LED 1: {'ON' if led1_on else 'OFF'}, LED 2: {'ON' if led2_on else 'OFF'}.")
+ while my_button2.is_button_pressed():
+ time.sleep(0.02)
+ time.sleep(0.05)
+
+if __name__ == '__main__':
+ try:
+ run_example()
+ except (KeyboardInterrupt, SystemExit):
+ print("\nEnding Qwiic Button + LED Demo")
+ sys.exit(0)
diff --git a/Lab 4/qwiic_distance.py b/Lab 4/qwiic_distance.py
new file mode 100644
index 0000000000..5d3f62e8f3
--- /dev/null
+++ b/Lab 4/qwiic_distance.py
@@ -0,0 +1,72 @@
+#!/usr/bin/env python
+#-----------------------------------------------------------------------------
+# qwiic_proximity_ex1.py
+#
+# Simple Example for the Qwiic Proximity Device
+#------------------------------------------------------------------------
+#
+# Written by SparkFun Electronics, May 2019
+#
+# This python library supports the SparkFun Electronics qwiic
+# qwiic sensor/board ecosystem on a Raspberry Pi (and compatible) single
+# board computers.
+#
+# More information on qwiic is at https://www.sparkfun.com/qwiic
+#
+# Do you like this library? Help support SparkFun. Buy a board!
+#
+#==================================================================================
+# Copyright (c) 2019 SparkFun Electronics
+#
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+#==================================================================================
+# Example 1
+#
+# - Setup the device
+# - Output the proximity value
+
+from __future__ import print_function
+import qwiic_proximity
+import time
+import sys
+
+def runExample():
+
+ print("\nSparkFun Proximity Sensor VCN4040 Example 1\n")
+ oProx = qwiic_proximity.QwiicProximity()
+
+ if oProx.connected == False:
+ print("The Qwiic Proximity device isn't connected to the system. Please check your connection", \
+ file=sys.stderr)
+ return
+
+ oProx.begin()
+
+ while True:
+ proxValue = oProx.get_proximity()
+ print("Proximity Value: %d" % proxValue)
+ time.sleep(.4)
+
+
+if __name__ == '__main__':
+ try:
+ runExample()
+ except (KeyboardInterrupt, SystemExit) as exErr:
+ print("\nEnding Example 1")
+ sys.exit(0)
\ No newline at end of file
diff --git a/Lab 4/requirements-freeze.txt b/Lab 4/requirements-freeze.txt
new file mode 100644
index 0000000000..2ad78f8316
--- /dev/null
+++ b/Lab 4/requirements-freeze.txt
@@ -0,0 +1,199 @@
+absl-py==1.4.0
+Adafruit-Blinka==8.20.1
+adafruit-circuitpython-apds9960==3.1.9
+adafruit-circuitpython-busdevice==5.2.6
+adafruit-circuitpython-framebuf==1.6.4
+adafruit-circuitpython-motor==3.4.12
+adafruit-circuitpython-mpr121==2.1.19
+adafruit-circuitpython-mpu6050==1.2.3
+adafruit-circuitpython-pca9685==3.4.11
+adafruit-circuitpython-pixelbuf==2.0.3
+adafruit-circuitpython-register==1.9.17
+adafruit-circuitpython-requests==2.0.1
+adafruit-circuitpython-rgb-display==3.12.0
+adafruit-circuitpython-seesaw==1.15.1
+adafruit-circuitpython-servokit==1.3.16
+adafruit-circuitpython-ssd1306==2.12.15
+adafruit-circuitpython-typing==1.9.4
+Adafruit-GPIO==1.0.3
+Adafruit-PlatformDetect==3.49.0
+Adafruit-PureIO==1.1.11
+adafruit-python-shell==1.7.0
+Adafruit-SSD1306==1.6.2
+arandr==0.1.10
+args==0.1.0
+astroid==2.5.1
+asttokens==2.0.4
+attrs==23.1.0
+automationhat==0.2.0
+beautifulsoup4==4.9.3
+blinker==1.4
+blinkt==0.1.2
+buttonshim==0.0.2
+Cap1xxx==0.1.3
+certifi==2020.6.20
+cffi==1.15.1
+chardet==4.0.0
+click==7.1.2
+clint==0.5.1
+colorama==0.4.4
+coloredlogs==15.0.1
+colorzero==1.1
+contourpy==1.1.0
+cryptography==3.3.2
+cupshelpers==1.0
+cycler==0.11.0
+dbus-python==1.2.16
+distro==1.5.0
+docutils==0.16
+drumhat==0.1.0
+envirophat==1.0.0
+ExplorerHAT==0.4.2
+Flask==1.1.2
+flatbuffers==20181003210633
+fonttools==4.42.1
+fourletterphat==0.1.0
+gpiozero==1.6.2
+html5lib==1.1
+humanfriendly==10.0
+idna==2.10
+importlib-resources==6.0.1
+isort==5.6.4
+itsdangerous==1.1.0
+jedi==0.18.0
+Jinja2==2.11.3
+kiwisolver==1.4.5
+lazy-object-proxy==0.0.0
+logilab-common==1.8.1
+lxml==4.6.3
+MarkupSafe==1.1.1
+matplotlib==3.7.2
+mccabe==0.6.1
+mediapipe==0.10.3
+microdotphat==0.2.1
+mote==0.0.4
+motephat==0.0.3
+mpmath==1.3.0
+mypy==0.812
+mypy-extensions==0.4.3
+numpy==1.25.2
+oauthlib==3.1.0
+onnxruntime==1.15.1
+opencv-contrib-python==4.8.0.76
+packaging==23.1
+pantilthat==0.0.7
+parso==0.8.1
+pexpect==4.8.0
+pgzero==1.2
+phatbeat==0.1.1
+pianohat==0.1.0
+picamera2==0.3.12
+pidng==4.0.9
+piexif==1.1.3
+piglow==1.2.5
+pigpio==1.78
+Pillow==8.1.2
+piper-phonemize==1.1.0
+piper-tts==1.2.0
+protobuf==3.20.3
+psutil==5.8.0
+pycairo==1.16.2
+pycparser==2.21
+pycups==2.0.1
+pyftdi==0.55.0
+pygame==1.9.6
+Pygments==2.7.1
+PyGObject==3.38.0
+pyinotify==0.9.6
+PyJWT==1.7.1
+pylint==2.7.2
+pynmea2==1.19.0
+PyOpenGL==3.1.5
+pyOpenSSL==20.0.1
+pyparsing==3.0.9
+PyQt5==5.15.2
+PyQt5-sip==12.8.1
+pyserial==3.5b0
+pysmbc==1.0.23
+python-apt==2.2.1
+python-dateutil==2.8.2
+python-prctl==1.7
+pyusb==1.2.1
+rainbowhat==0.1.0
+reportlab==3.5.59
+requests==2.25.1
+requests-oauthlib==1.0.0
+responses==0.12.1
+roman==2.0.0
+rpi-ws281x==5.0.0
+RPi.GPIO==0.7.1
+RTIMULib==7.2.1
+scrollphat==0.0.7
+scrollphathd==1.2.1
+Send2Trash==1.6.0b1
+sense-hat==2.4.0
+simplejpeg==1.6.4
+simplejson==3.17.2
+six==1.16.0
+skywriter==0.0.7
+smbus2==0.4.3
+sn3218==1.2.7
+sounddevice==0.4.6
+soupsieve==2.2.1
+sparkfun-pi-servo-hat==0.9.0
+sparkfun-qwiic==1.1.6
+sparkfun-qwiic-adxl313==0.0.7
+sparkfun-qwiic-alphanumeric==0.0.1
+sparkfun-qwiic-as6212==0.0.2
+sparkfun-qwiic-bme280==0.9.0
+sparkfun-qwiic-button==2.0.1
+sparkfun-qwiic-ccs811==0.9.4
+sparkfun-qwiic-dual-encoder-reader==0.0.2
+sparkfun-qwiic-eeprom==0.0.1
+sparkfun-qwiic-gpio==0.0.2
+sparkfun-qwiic-i2c==0.9.11
+sparkfun-qwiic-icm20948==0.0.1
+sparkfun-qwiic-joystick==0.9.0
+sparkfun-qwiic-keypad==0.9.0
+sparkfun-qwiic-kx13x==1.0.0
+sparkfun-qwiic-led-stick==0.0.1
+sparkfun-qwiic-max3010x==0.0.2
+sparkfun-qwiic-micro-oled==0.10.0
+sparkfun-qwiic-oled-base==0.0.2
+sparkfun-qwiic-oled-display==0.0.2
+sparkfun-qwiic-pca9685==0.9.1
+sparkfun-qwiic-pir==0.0.4
+sparkfun-qwiic-proximity==0.9.0
+sparkfun-qwiic-relay==0.0.2
+sparkfun-qwiic-rfid==2.0.0
+sparkfun-qwiic-scmd==0.9.1
+sparkfun-qwiic-serlcd==0.0.1
+sparkfun-qwiic-sgp40==0.0.4
+sparkfun-qwiic-soil-moisture-sensor==0.0.2
+sparkfun-qwiic-tca9548a==0.9.0
+sparkfun-qwiic-titan-gps==0.1.1
+sparkfun-qwiic-twist==0.9.0
+sparkfun-qwiic-vl53l1x==1.0.1
+sparkfun-top-phat-button==0.0.2
+sparkfun-ublox-gps==1.1.5
+spidev==3.6
+srt==3.5.3
+ssh-import-id==5.10
+sympy==1.12
+sysv-ipc==1.1.0
+thonny==4.0.1
+toml==0.10.1
+touchphat==0.0.1
+tqdm==4.66.1
+twython==3.8.2
+typed-ast==1.4.2
+typing-extensions==4.7.1
+unicornhathd==0.0.4
+urllib3==1.26.5
+v4l2-python3==0.3.2
+vosk==0.3.45
+webencodings==0.5.1
+websockets==11.0.3
+Werkzeug==1.0.1
+wrapt==1.12.1
+zipp==3.16.2
diff --git a/Lab 4/requirements2023.txt b/Lab 4/requirements2023.txt
new file mode 100644
index 0000000000..a59e7d02cc
--- /dev/null
+++ b/Lab 4/requirements2023.txt
@@ -0,0 +1,27 @@
+Adafruit-Blinka
+adafruit-circuitpython-busdevice
+adafruit-circuitpython-framebuf
+adafruit-circuitpython-mpr121
+adafruit-circuitpython-mpu6050
+adafruit-circuitpython-ssd1306
+adafruit-circuitpython-pca9685
+adafruit-circuitpython-servokit
+adafruit-circuitpython-apds9960
+adafruit-circuitpython-seesaw
+sparkfun-qwiic
+sparkfun-qwiic-joystick
+sparkfun-qwiic-vl53l1x
+Adafruit-GPIO
+Adafruit-PlatformDetect
+Adafruit-PureIO
+Adafruit-SSD1306
+pyftdi
+pyserial
+pyusb
+rpi-ws281x
+RPi.GPIO
+spidev
+sysv-ipc
+sparkfun-qwiic-proximity
+
+
diff --git a/Lab 4/requirements2025.txt b/Lab 4/requirements2025.txt
new file mode 100644
index 0000000000..58852be37d
--- /dev/null
+++ b/Lab 4/requirements2025.txt
@@ -0,0 +1,31 @@
+Adafruit-Blinka
+adafruit-circuitpython-busdevice
+adafruit-circuitpython-framebuf
+adafruit-circuitpython-mpr121
+adafruit-circuitpython-mpu6050
+adafruit-circuitpython-ssd1306
+adafruit-circuitpython-pca9685
+adafruit-circuitpython-servokit
+adafruit-circuitpython-apds9960
+adafruit-circuitpython-seesaw
+sparkfun-qwiic
+sparkfun-qwiic-joystick
+sparkfun-qwiic-vl53l1x
+Adafruit-GPIO
+Adafruit-PlatformDetect
+Adafruit-PureIO
+Adafruit-SSD1306
+pyftdi
+pyserial
+pyusb
+rpi-ws281x
+RPi.GPIO
+spidev
+sysv-ipc
+sparkfun-qwiic-proximity
+adafruit-circuitpython-busdevice
+sparkfun-pi-servo-hat
+adafruit-circuitpython-lsm6ds
+sparkfun-qwiic-button
+adafruit-circuitpython-pcf8574
+sparkfun-qwiic-proximity
\ No newline at end of file
diff --git a/Lab 4/servo_test.py b/Lab 4/servo_test.py
new file mode 100644
index 0000000000..cb8b094ccb
--- /dev/null
+++ b/Lab 4/servo_test.py
@@ -0,0 +1,28 @@
+import time
+from adafruit_servokit import ServoKit
+
+# Set channels to the number of servo channels on your kit.
+# There are 16 channels on the PCA9685 chip.
+kit = ServoKit(channels=16)
+
+# Name and set up the servo according to the channel you are using.
+servo = kit.servo[2]
+
+# Set the pulse width range of your servo for PWM control of rotating 0-180 degree (min_pulse, max_pulse)
+# Each servo might be different, you can normally find this information in the servo datasheet
+servo.set_pulse_width_range(500, 2500)
+
+while True:
+ try:
+ # Set the servo to 180 degree position
+ servo.angle = 180
+ time.sleep(2)
+ # Set the servo to 0 degree position
+ servo.angle = 0
+ time.sleep(2)
+
+ except KeyboardInterrupt:
+ # Once interrupted, set the servo back to 0 degree position
+ servo.angle = 0
+ time.sleep(0.5)
+ break
diff --git a/Lab 5/Audio_optional/ExampleAudioFFT.py b/Lab 5/Audio_optional/ExampleAudioFFT.py
new file mode 100644
index 0000000000..9db53ad7de
--- /dev/null
+++ b/Lab 5/Audio_optional/ExampleAudioFFT.py
@@ -0,0 +1,111 @@
+import pyaudio
+import numpy as np
+from scipy.fft import rfft, rfftfreq
+from scipy.signal.windows import hann
+from numpy_ringbuffer import RingBuffer
+
+import queue
+import time
+
+
+## Please change the following number so that it matches the microphone that you are using.
+DEVICE_INDEX = 1
+
+## Compute the audio statistics every `UPDATE_INTERVAL` seconds.
+UPDATE_INTERVAL = 1.0
+
+
+
+### Things you probably don't need to change
+FORMAT=np.float32
+SAMPLING_RATE = 44100
+CHANNELS=1
+
+
+def main():
+ ### Setting up all required software elements:
+ audioQueue = queue.Queue() #This queue stores the incoming audio data before processing.
+ pyaudio_instance = pyaudio.PyAudio() #This is the audio driver that connects to the microphone for us.
+
+ def _callback(in_data, frame_count, time_info, status): # This callback function stores the incoming audio data in the `audioQueue`
+ audioQueue.put(in_data)
+ return None, pyaudio.paContinue
+
+ stream = pyaudio_instance.open(input=True,start=False,format=pyaudio.paFloat32,channels=CHANNELS,rate=SAMPLING_RATE,frames_per_buffer=int(SAMPLING_RATE/2),stream_callback=_callback,input_device_index=DEVICE_INDEX)
+
+
+ # One essential way to keep track of variables over time is with a ring buffer.
+ # As an example, the `AudioBuffer` always stores the last second of audio data.
+ AudioBuffer = RingBuffer(capacity=SAMPLING_RATE*1, dtype=FORMAT) # 1 second long buffer.
+
+ # Another example is the `VolumeHistory` ringbuffer.
+ VolumeHistory = RingBuffer(capacity=int(20/UPDATE_INTERVAL), dtype=FORMAT) ## This is how you can compute a history to record changes over time
+ ### Here is a good spot to add other buffers as well that keep track of variables over a certain period of time.
+
+ nextTimeStamp = time.time()
+ stream.start_stream()
+ if True:
+ while True:
+ frames = audioQueue.get() #Get data from the audio driver (see the _callback function for how the data arrives)
+ if not frames:
+ continue
+
+ framesData = np.frombuffer(frames, dtype=FORMAT)
+ AudioBuffer.extend(framesData[0::CHANNELS]) #Pick one audio channel and fill the ringbuffer.
+
+ if(AudioBuffer.is_full and # Waiting for the ringbuffer to be full at the beginning.
+ audioQueue.qsize()<2 and # Make sure there is not a lot more new data that should be used.
+ time.time()>nextTimeStamp): # See `UPDATE_INTERVAL` above.
+
+ buffer = np.array(AudioBuffer) #Get the last second of audio.
+
+
+ volume = np.rint(np.sqrt(np.mean(buffer**2))*10000) # Compute the rms volume
+
+
+ VolumeHistory.append(volume)
+ volumneSlow = volume
+ volumechange = 0.0
+ if VolumeHistory.is_full:
+ HalfLength = int(np.round(VolumeHistory.maxlen/2))
+ vnew = np.array(VolumeHistory)[HalfLength:].mean()
+ vold = np.array(VolumeHistory)[:VolumeHistory.maxlen-HalfLength].mean()
+ volumechange =vnew-vold
+ volumneSlow = np.array(VolumeHistory).mean()
+
+ ## Compute the Fourier frequency analysis on the audio signal.
+ N = buffer.shape[0]
+ window = hann(N)
+ amplitudes = np.abs(rfft(buffer*window))[25:] #Contains the volume for the different frequency bins.
+ frequencies = (rfftfreq(N, 1/SAMPLING_RATE)[:N//2])[25:] #Contains the Hz frequency values for the different frequency bins.
+ '''
+ Combining the `amplitudes` and `frequencies` variables allows you to understand how loud a certain frequency is.
+
+ e.g. If you'd like to know the volume for 500Hz you could do the following.
+ 1. Find the frequency bin that 500Hz is closest to with:
+ FrequencyBin = np.abs(frequencies - 500).argmin()
+
+ 2. Look up the volume in that bin:
+ amplitudes[FrequencyBin]
+
+
+ The example below does something similar, just in reverse.
+ It finds the loudest amplitude and its corresponding bin with `argmax()`.
+ Then it uses that index to look up the frequency value.
+ '''
+
+
+ LoudestFrequency = frequencies[amplitudes.argmax()]
+
+ print("Loudest Frqeuncy:",LoudestFrequency)
+ print("RMS volume:",volumneSlow)
+ print("Volume Change:",volumechange)
+
+ nextTimeStamp = UPDATE_INTERVAL+time.time() # See `UPDATE_INTERVAL` above
+
+
+if __name__ == '__main__':
+ main()
+ print("Something happend with the audio example. Stopping!")
+
+
diff --git a/Lab 5/Audio_optional/ListAvalibleAudioDevices.py b/Lab 5/Audio_optional/ListAvalibleAudioDevices.py
new file mode 100644
index 0000000000..e7ec252610
--- /dev/null
+++ b/Lab 5/Audio_optional/ListAvalibleAudioDevices.py
@@ -0,0 +1,10 @@
+import pyaudio
+
+pyaudio_instance = pyaudio.PyAudio()
+
+print("--- Starting audio device survey! ---")
+for i in range(pyaudio_instance.get_device_count()):
+ dev = pyaudio_instance.get_device_info_by_index(i)
+ name = dev['name'].encode('utf-8')
+ print(i, name, dev['maxInputChannels'], dev['maxOutputChannels'])
+
diff --git a/Lab 5/Audio_optional/ListeningExercise.md b/Lab 5/Audio_optional/ListeningExercise.md
new file mode 100644
index 0000000000..dd74f453da
--- /dev/null
+++ b/Lab 5/Audio_optional/ListeningExercise.md
@@ -0,0 +1,24 @@
+## Listening Exercise
+Go through some of the videos below, listen to the sound, and write down the different sounds that belong to a certain context. Think about the impact a given sound has on how we construct our own contextual understanding. *Please try to not watch the video as you listen to the sound.*
+
+
+Ideally, write down the ideas in a table like this
+
+| Sound | Influence on the context | Implications for your behavior |
+| :---: | :---: | :---: |
+| car horn | creates a sense of urgency | look around |
+| footsteps | ... | ... |
+| ... | ... |... |
+| ... | ... |... |
+
+
+Audio sources to use:
+- [Walking in Tokyo](https://www.youtube.com/watch?v=Et7O5-CzJZg)
+- [Restaurant Ambiance](https://www.youtube.com/watch?v=xY0GEpbWreY)
+- [Walking in a Forest](https://www.youtube.com/watch?v=I-zPNQYHSvU)
+- [Working in a Coffee Shop](https://www.youtube.com/watch?v=714HdIgMt1g)
+- [Walking in Shanghai](https://www.youtube.com/watch?v=2uQ58Xwx1V4)
+- [Biking in the Netherlands](https://www.youtube.com/watch?v=siomblak2TI)
+- [Backyard Fountain](https://www.youtube.com/watch?v=Ez1f6Vp_UYk)
+
+
diff --git a/Lab 5/Audio_optional/ThinkingThroughContextAndInteraction.md b/Lab 5/Audio_optional/ThinkingThroughContextAndInteraction.md
new file mode 100644
index 0000000000..587b0ce82f
--- /dev/null
+++ b/Lab 5/Audio_optional/ThinkingThroughContextAndInteraction.md
@@ -0,0 +1,6 @@
+| **Context** (situational) | **Presence** (intent) | **Behavior** (reaction) |
+|----------------------------|-------------------------------------|-------------------------|
+| **Who is involved:** to make new lines use `` `` between words | **Task goals:** | **Implicit behaviors:** |
+| **What is making noises:** | **When to stand-out (attention):** | **Explicit Actions:** |
+| **When:** | **When to blend in (distraction):** | |
+| **Where:** | | |
\ No newline at end of file
diff --git a/Lab 5/Audio_optional/ThinkingThroughContextandInteraction.png b/Lab 5/Audio_optional/ThinkingThroughContextandInteraction.png
new file mode 100644
index 0000000000..88330d66c3
Binary files /dev/null and b/Lab 5/Audio_optional/ThinkingThroughContextandInteraction.png differ
diff --git a/Lab 5/Audio_optional/audio.md b/Lab 5/Audio_optional/audio.md
new file mode 100644
index 0000000000..5bce540752
--- /dev/null
+++ b/Lab 5/Audio_optional/audio.md
@@ -0,0 +1,39 @@
+#### Filtering, FFTs, and Time Series data.
+> **_NOTE:_** This section is from an earlier version of the class.
+
+Additional filtering and analysis can be done on the sensors that were provided in the kit. For example, running a Fast Fourier Transform over the IMU or Microphone data stream could create a simple activity classifier between walking, running, and standing.
+
+To get the microphone working we need to install three libraries: `PyAudio` to get the data from the microphone, `SciPy` to make data analysis easy, and `numpy-ringbuffer` to keep track of the last ~1 second of audio.
+PyAudio needs to be installed with the following command:
+``sudo apt install python3-pyaudio``
+SciPy is installed with
+``sudo apt install python3-scipy``
+
+Lastly, we need numpy-ringbuffer to make continuous data analysis easier.
+``pip install numpy-ringbuffer``
+
+Now try the audio processing example:
+* Find what ID the microphone has with `python ListAvalibleAudioDevices.py`
+ Look for a device name that includes `USB` in the name.
+* Adjust the variable `DEVICE_INDEX` in the `ExampleAudioFFT.py` file.
+ See if you are getting results printed out from the microphone. Try to understand how the code works.
+ Then run the file by typing `python ExampleAudioFFT.py`
+
+
+
+Using the microphone, try one of the following:
+
+**1. Set up threshold detection** Can you identify when a signal goes above certain fixed values?
+
+**2. Set up a running average** Can you set up a [moving average](https://en.wikipedia.org/wiki/Moving_average) over one of the variables that are being calculated?
+
+**3. Set up peak detection** Can you identify when your signal reaches a peak and then goes down?
+
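+For reference, here is a minimal sketch of what these three ideas could look like if fed the `volume` value computed each update in `ExampleAudioFFT.py` (an illustration only, not a provided solution; the threshold and window size are made-up numbers to tune for your microphone):
+
+```python
+from collections import deque
+
+class VolumeWatcher:
+    """Toy threshold / moving-average / peak detector over a stream of RMS volume values."""
+
+    def __init__(self, threshold=500, window=10):
+        self.threshold = threshold            # 1. fixed threshold (tune it to your microphone)
+        self.history = deque(maxlen=window)   # 2. window for the running average
+
+    def update(self, volume):
+        self.history.append(volume)
+        above = volume > self.threshold
+        average = sum(self.history) / len(self.history)
+        # 3. crude peak detection: the previous sample is larger than both of its neighbours
+        peak = (len(self.history) >= 3
+                and self.history[-2] > self.history[-3]
+                and self.history[-2] > self.history[-1])
+        return above, average, peak
+
+# e.g. inside the main loop of ExampleAudioFFT.py:
+#   above, average, peak = watcher.update(volume)
+```
+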
+For technical references:
+
+* Volume calculation with [Root Mean Square](https://en.wikipedia.org/wiki/Root_mean_square)
+* [RingBuffer](https://en.wikipedia.org/wiki/Circular_buffer)
+* [Frequency Analysis](https://en.wikipedia.org/wiki/Fast_Fourier_transform)
+
+
+**\*\*\*Include links to your code here, and put the code for these in your repo--they will come in handy later.\*\*\***
diff --git a/Lab 5/CV_optional/cv.md b/Lab 5/CV_optional/cv.md
new file mode 100644
index 0000000000..6fe3b245fa
--- /dev/null
+++ b/Lab 5/CV_optional/cv.md
@@ -0,0 +1,62 @@
+#### (Optional) OpenCV
+> **_NOTE:_** This section will be made available by next week.
+
+A more traditional method to extract information out of images is provided with OpenCV. The RPI image provided to you comes with an optimized installation that can be accessed through python. We included 4 standard OpenCV examples: contour(blob) detection, face detection with the ``Haarcascade``, flow detection (a type of keypoint tracking), and standard object detection with the [Yolo](https://pjreddie.com/darknet/yolo/) darknet.
+
+Most examples can be run with a screen (e.g. VNC or ssh -X or with an HDMI monitor), or with just the terminal. The examples are separated out into different folders. Each folder contains a ```HowToUse.md``` file, which explains how to run the python example.
+
+The following command is a nicer way to see the layout of the `openCV-examples` we have included on your Pi. Instead of `ls`, the command we will be using here is `tree`. [Tree](http://mama.indstate.edu/users/ice/tree/) is a recursive directory listing command that produces a depth-indented, colored listing of files. Install `tree` first, then `cd` to the `openCV-examples` folder and run the command:
+
+```shell
+pi@ixe00:~ $ sudo apt install tree
+...
+pi@ixe00:~ $ cd openCV-examples
+pi@ixe00:~/openCV-examples $ tree -l
+.
+├── contours-detection
+│ ├── contours.py
+│ └── HowToUse.md
+├── data
+│ ├── slow_traffic_small.mp4
+│ └── test.jpg
+├── face-detection
+│ ├── face-detection.py
+│ ├── faces_detected.jpg
+│ ├── haarcascade_eye_tree_eyeglasses.xml
+│ ├── haarcascade_eye.xml
+│ ├── haarcascade_frontalface_alt.xml
+│ ├── haarcascade_frontalface_default.xml
+│ └── HowToUse.md
+├── flow-detection
+│ ├── flow.png
+│ ├── HowToUse.md
+│ └── optical_flow.py
+└── object-detection
+ ├── detected_out.jpg
+ ├── detect.py
+ ├── frozen_inference_graph.pb
+ ├── HowToUse.md
+ └── ssd_mobilenet_v2_coco_2018_03_29.pbtxt
+```
+
+The flow detection might seem random, but consider [this recent research](https://cseweb.ucsd.edu/~lriek/papers/taylor-icra-2021.pdf) that uses optical flow to determine busy-ness in hospital settings to facilitate robot navigation. Note the velocity parameter on page 3 and the mentions of optical flow.
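+
+As a rough illustration of that idea (separate from the included `optical_flow.py`), the mean optical-flow magnitude over a frame can serve as a crude busy-ness score. A minimal sketch using OpenCV's Farneback flow on a webcam:
+
+```python
+import cv2
+import numpy as np
+
+cap = cv2.VideoCapture(0)                      # webcam index, adjust if needed
+ok, prev = cap.read()
+prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
+
+while True:
+    ok, frame = cap.read()
+    if not ok:
+        break
+    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
+    # Dense optical flow between consecutive frames (Farneback)
+    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
+                                        0.5, 3, 15, 3, 5, 1.2, 0)
+    magnitude, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
+    print("busy-ness score:", float(np.mean(magnitude)))   # higher = more motion in the scene
+    prev_gray = gray
+```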
+
+Now, connect your webcam to your Pi and use **VNC to access your Pi**, then open the terminal. Use the following command lines to try each of the examples we provided:
+(***it will not work if you use ssh from your laptop***)
+
+```
+pi@ixe00:~$ cd ~/openCV-examples/contours-detection
+pi@ixe00:~/openCV-examples/contours-detection $ python contours.py
+...
+pi@ixe00:~$ cd ~/openCV-examples/face-detection
+pi@ixe00:~/openCV-examples/face-detection $ python face-detection.py
+...
+pi@ixe00:~$ cd ~/openCV-examples/flow-detection
+pi@ixe00:~/openCV-examples/flow-detection $ python optical_flow.py 0 window
+...
+pi@ixe00:~$ cd ~/openCV-examples/object-detection
+pi@ixe00:~/openCV-examples/object-detection $ python detect.py
+```
+
+**\*\*\*Try each of the following four examples in the `openCV-examples`, include screenshots of your use and write about one design for each example that might work based on the individual benefits to each algorithm.\*\*\***
+
diff --git a/Lab 5/HandTrackingModule.py b/Lab 5/HandTrackingModule.py
new file mode 100644
index 0000000000..2769d1c384
--- /dev/null
+++ b/Lab 5/HandTrackingModule.py
@@ -0,0 +1,71 @@
+import cv2
+import mediapipe as mp
+import time
+
+
+class handDetector():
+ def __init__(self, mode=False, maxHands=2, detectionCon=0.5, trackCon=0.5):
+ self.mode = mode
+ self.maxHands = maxHands
+ self.detectionCon = detectionCon
+ self.trackCon = trackCon
+
+ self.mpHands = mp.solutions.hands
+ self.hands = self.mpHands.Hands(self.mode, self.maxHands,
+ self.detectionCon, self.trackCon)
+ self.mpDraw = mp.solutions.drawing_utils
+
+ def findHands(self, img, draw=True):
+ imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
+ self.results = self.hands.process(imgRGB)
+ # print(results.multi_hand_landmarks)
+
+ if self.results.multi_hand_landmarks:
+ for handLms in self.results.multi_hand_landmarks:
+ if draw:
+ self.mpDraw.draw_landmarks(img, handLms,
+ self.mpHands.HAND_CONNECTIONS)
+ return img
+
+ def findPosition(self, img, handNo=0, draw=True):
+
+ lmList = []
+ if self.results.multi_hand_landmarks:
+ myHand = self.results.multi_hand_landmarks[handNo]
+ for id, lm in enumerate(myHand.landmark):
+ # print(id, lm)
+ h, w, c = img.shape
+ cx, cy = int(lm.x * w), int(lm.y * h)
+ # print(id, cx, cy)
+ lmList.append([id, cx, cy])
+ if draw:
+ cv2.circle(img, (cx, cy), 15, (255, 0, 255), cv2.FILLED)
+
+ return lmList
+
+
+def main():
+ pTime = 0
+ cTime = 0
+ cap = cv2.VideoCapture(1)
+ detector = handDetector()
+ while True:
+ success, img = cap.read()
+ img = detector.findHands(img)
+ lmList = detector.findPosition(img)
+ if len(lmList) != 0:
+ print(lmList[4])
+
+ cTime = time.time()
+ fps = 1 / (cTime - pTime)
+ pTime = cTime
+
+ cv2.putText(img, str(int(fps)), (10, 70), cv2.FONT_HERSHEY_PLAIN, 3,
+ (255, 0, 255), 3)
+
+ cv2.imshow("Image", img)
+ cv2.waitKey(1)
+
+
+if __name__ == "__main__":
+ main()
diff --git a/Lab 5/Peaceful_Mind.wav b/Lab 5/Peaceful_Mind.wav
new file mode 100644
index 0000000000..430a737e3f
Binary files /dev/null and b/Lab 5/Peaceful_Mind.wav differ
diff --git a/Lab 5/README.md b/Lab 5/README.md
new file mode 100644
index 0000000000..b9601ef6cd
--- /dev/null
+++ b/Lab 5/README.md
@@ -0,0 +1,285 @@
+# Observant Systems
+
+**Sean Hardesty Lewis (solo)**
+
+
+
+I built a small "Pet Robot" that follows people around. The body is a Sphero Ollie carrying a tiny FPV camera and a single-cell battery. A Raspberry Pi 5 is the controller with local Bluetooth driver. The camera’s video stream leaves the robot over UDP via Tailscale and lands on my PC (RTX 3090), where I run real-time person detection at >30 fps. The desktop sends back steering and accel/decel instructions, the Pi applies it, and the Ollie "becomes intelligent". End to end, the pipeline sits around 50-60ms when Wi-Fi is strong (from camera -> RPI -> PC -> RPI -> robot movement).
+
+## Prep
+
+Done!
+
+## Overview
+
+### Part A
+
+I tried a handful of sense-making paths to understand what would actually hold up once a robot is trying to chase you. OpenCV’s traditional HOG+SVM person detector on the Pi was a good reality check but topped out around 1-3 fps, which was basically a slideshow on wheels. Swapping to a YOLO-tiny-style detector on my PC GPU immediately improved things: a stable 30-45 fps at 720p and, more importantly, smooth bounding boxes that didn't jitter with every lighting change.
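+
+For reference, the PC-side detection step boils down to something like the sketch below. This is a simplification, not my exact loop, and it assumes the `ultralytics` YOLOv8 nano model and a camera at index 0:
+
+```python
+import cv2
+from ultralytics import YOLO
+
+model = YOLO("yolov8n.pt")        # nano model; comfortably real-time on a desktop GPU
+cap = cv2.VideoCapture(0)
+
+while True:
+    ok, frame = cap.read()
+    if not ok:
+        break
+    results = model(frame, verbose=False)[0]
+    # keep only "person" detections (class 0 in COCO)
+    people = [b for b in results.boxes if int(b.cls) == 0]
+    if people:
+        best = max(people, key=lambda b: float(b.conf))
+        x1, y1, x2, y2 = best.xyxy[0].tolist()
+        print(f"person at x-center {(x1 + x2) / 2:.0f}, box height {y2 - y1:.0f}")
+```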
+
+While playing around with MobileNet and MediaPipe, I realized that I might just be "overengineering" the solution. While I did leave some MediaPipe code in my final version, spending lots of time trying to get a robot that is already accelerating to "stop" when it sees my hand raised in a stop sign took much more time and effort than it paid off. I realized that some of these tools are super cool to use, but knowing which ones to use in which situations is important. I knew that I did not need MediaPipe, MobileNet, or even MoonDream for what I wanted to do, even though they could each help with various aspects, since I had already seen how useful and fast a YOLOv8 nano model could be on my PC. I will add that MoonDream was super interesting, and it is one of my inspirations for robotics: semantically defined maps from VLMs plus depth maps will allow robots to navigate, memorize a scene, and re-navigate through it in the future. Something similar to [this paper](https://vlmaps.github.io/).
+
+
+
+### Part B
+
+Here is what the Sphero EDU app looks like for manually controlling the robot via the phone (wizard of oz!).
+
+
+For this part, I first used the Sphero EDU app, which allowed me to control the robot manually with my phone. I figured out how the interaction would work by thinking about where the camera would be in the scene: would it be a static third-person perspective, top-down, worn by the human, or worn by the robot? Some of these ideas I quickly dropped because I wanted the robot to be as "autonomous" and as "intelligent" as it could possibly be. To test, I designed essentially what I wanted without any "turning" complexity. So, using my PC webcam and myself as the object to be recognized, I wrote a simple script that was basically "move forward if there is a human detected on the video feed", using the Sphero v2 Python API. I had quite a few issues with Bluetooth and connecting the robot to the PC via Python, but after figuring that out and consulting the [Sphero v2 API docs](https://spherov2.readthedocs.io/en/latest/sphero_edu.html), I was able to set it up so that it would accelerate when I walk into the frame. Please note that for Parts A, B, and C, I was using my Sphero BB-8 as opposed to the Ollie, and no camera was attached to the robot itself yet: I was using my PC webcam, with a Bluetooth connection from the PC to the robot, controlling it via Python.
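+
+Roughly, that first webcam prototype looks like the sketch below. This is an illustration, not my exact script: it assumes the `spherov2` package (`scanner.find_toy`, `SpheroEduAPI.roll`/`set_speed`), and OpenCV's stock HOG person detector stands in for whatever detector you actually use:
+
+```python
+import cv2
+from spherov2 import scanner
+from spherov2.sphero_edu import SpheroEduAPI
+
+# Any person detector works here; OpenCV's built-in HOG detector is enough for a first test
+hog = cv2.HOGDescriptor()
+hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
+
+def human_in_frame(frame):
+    boxes, _ = hog.detectMultiScale(cv2.resize(frame, (640, 360)))
+    return len(boxes) > 0
+
+cap = cv2.VideoCapture(0)          # PC webcam
+toy = scanner.find_toy()           # first Sphero found over Bluetooth
+
+with SpheroEduAPI(toy) as robot:
+    while True:
+        ok, frame = cap.read()
+        if ok and human_in_frame(frame):
+            robot.roll(0, 60, 0.5)   # heading 0, modest speed, for half a second
+        else:
+            robot.set_speed(0)       # stop when nobody is in view
+```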
+
+Here is a picture showcasing the **Sphero Ollie (left)** and **Sphero BB-8 (right)**. Both robots can be controlled manually via the smartphone app 'Sphero EDU'. Neither has any sort of autonomous capability or camera; that's what I wanted to add.
+
+
+
+### Part C
+
+Observations
+
+Where the system shines is the most common case: indoor lighting that isn't ridiculous, a single person 1-3m away, and a floor that isn't slippery. In these conditions, the robot feels responsive and accelerates forward when the human is in view of the camera. However, it falls apart in predictable places. Backlighting and low light (especially nighttime) make YOLO break or hallucinate. More than one person, or even a temporary occlusion, encourages target switching (I decided it was out of scope to put effort into coding stickiness to the last-seen person). Glossy floors were a completely unexpected issue: the Ollie could slip on quick turns and then overshoot even if the perception was fine. I had to tone down the accel/decel speed and the rotation just for this floor issue. The FOV of the camera was also interesting, since the settings I used for the webcam definitely did not translate perfectly to the FPV camera (more on this later).
+
+Users
+
+Thinking like a user, I realized very few people are aware they're interacting with a probability distribution and not a robot. If the robot stops randomly, they don't see "low confidence", they just see weird behavior. I thought about using the Sphero v2 API to add LED cues that make state legible (e.g. blue when it's confidently tracking you, yellow when it's searching, red when it stops for safety), but realized that this didn't transfer well across different Sphero models, and I sort of wanted to keep the camera/robot control code "model-agnostic". I also realized that the starting and stopping were way too violent for users, and ended up softening the behavior into a lerp between accel and decel when it loses or regains sight of the human, instead of an immediate stop or start. Finally, I replaced my naive "always correct to the perfect center of the camera" approach with a bigger, friendlier dead-zone. If you're approximately centered in the shot, it holds its course so it doesn't feel like it's constantly hunting for pixel-perfect alignment (rotating left and right and left and right...).
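+
+Concretely, the dead-zone and the accel/decel lerp amount to something like this sketch (the numbers are illustrative, not my final tuning):
+
+```python
+def steer(person_x, frame_width, dead_zone=0.15):
+    """Map the person's x-center to a turn command, with a friendly dead-zone."""
+    offset = person_x / frame_width - 0.5        # -0.5 (far left) .. +0.5 (far right)
+    if abs(offset) < dead_zone:
+        return 0.0                               # close enough to center: hold course
+    return max(-1.0, min(1.0, offset * 2.0))     # otherwise turn proportionally
+
+def smooth_speed(current, target, alpha=0.2):
+    """Lerp toward the target speed instead of jumping, so starts and stops feel gentle."""
+    return current + alpha * (target - current)
+
+# each frame: speed = smooth_speed(speed, 80 if person_visible else 0)
+```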
+
+Here are some of the failures of the robot. (Please note, this is after the FPV camera was attached for Part 2. This is just to showcase some of the failures described above, like over-rotating, glossy floors, accelerating into walls, and the like.)
+
+https://github.com/user-attachments/assets/0f6dcfd1-2ecd-4f9b-86b2-35c0f42ef667
+
+### Part D
+
+To describe my setup again: I used a PC webcam, and my location and size in the frame defined how the robot would accelerate, decelerate, and rotate. This made debugging a very interesting process of having to pick up the Sphero, take it to the end of the room, then face the camera and watch how my location/size on camera would affect the Sphero's movement in each iteration of the script. I realized that this setup worked best in plain, bright indoor environments (with flat ground) and was bad in backlit hallways and dark lighting. When it broke (and it broke often), it was usually because detection failed or because my programming of the robot's behavior was not great. Most failures were recoverable by adjusting the behavior code, but detection reliability was much more about lighting conditions. The biggest fixes for this part were adding smoothing to the speed and a dead-zone for the human being near the center of the frame, so that the robot comes directly towards you and only rotates slightly if you are off to one side. It felt kind of personable; the first time I saw the arc it curved to get to me, I felt really happy, realizing that it could sort of navigate autonomously-ish.
+
+| **Question** | **Answer** |
+|---------------|------------|
+| **What can you use X for?** | We can use our pet robot to provide companionship. We do not need to feed it or worry about taking care of it, and can enjoy a socio-affective relationship. |
+| **What is a good environment for X?** | Good environments for the pet robot are generally well-lit areas for vision with non-slip surfaces for navigation, and only one human actor visible. |
+| **What is a bad environment for X?** | Bad environments are dark areas, slippery floors, or have many human actors to detect and follow. |
+| **When will X break?** | The pet robot will break when it gets stuck underneath chairs, beds, cabinets, etc. and needs to be retrieved. It also breaks in all of the bad environments described above, and especially breaks with high-latency instructions. |
+| **When it breaks how will X break?** | When it breaks, the pet robot will be immobile (unable to escape a stuck position), will hop from person to person in a crowd, or will generally just not be responsive enough for a quality interaction (in high-latency settings). |
+| **What are other properties/behaviors of X?** | The pet robot's behaviors need to be explicitly programmed. The width of the dead-zone for how much it rotates to keep a human centered in its vision, the speed and acceleration it uses, its stopping distance from the human, and its default mode (searching by rotating in circles) are all adjustable settings that make the robot interactable and animated. |
+| **How does X feel?** | The pet robot, under low-latency settings, feels like a puppy that chases you around. It constantly nips at your feet and follows you everywhere you go. It is quite an adorable experience with the Sphero frames being designed to appear friendly to even children, so it following you isn’t anything scary. |
+
+There are a couple of videos to refer to for this section: the working demo below and the breaking demo above.
+
+### Part 2.
+
+For Part 2, I wanted to take what I had managed to do (a proof-of-concept prototype on the PC using its webcam and the Sphero) and turn it into a real, usable autonomous robot. Thankfully, I already had some of the materials for the job in my closet: a tiny FPV camera, a 5.8G OTG Skydroid Receiver, and a set of 6 200mW batteries, as well as some scotch tape.
+
+Here is a diagram of how I envisioned my robot to work after I switch from a PC webcam to a small FPV camera that goes on the robot.
+
+
+
+So the first thing I did was test out the FPV camera by plugging it into one of the batteries, then plugging the receiver into my PC and loading up VLC's Media > Open Capture Device to see if the receiver was getting anything. A "USB 2.0 Camera" showed up in the dropdown, and when I clicked on it: total static and a completely unusable video feed! This is completely normal. I remembered from the last time I used this FPV camera that I had to tune the sender (camera) and receiver (PC) to a specific channel so they could operate on the same wavelength. After debugging for about an hour, with many static and half-static video feeds shown on my screen, I realized that half of my problem was really the connection signal between the receiver and the camera itself. FPV cameras are built with antennas meant for large open-space environments (to be retrofitted onto drones), so using one indoors, where there is lots of metal and other interference, gave a bunch of static noise in the video feed (especially for me, since my PC was directly underneath a metal desk!). To fix this, I cleared the area around the camera and moved my PC out from underneath the metal desk. Suddenly, more channels started working with less noise and fewer glitches. I found channel F4 worked best for the camera and receiver to communicate, giving a smooth video feed that VLC was able to receive.
+
+The second thing I did was scotch-tape the battery and the FPV camera to the front of the Sphero Ollie. It looked a little unpolished, and the battery would need to be untaped and replaced every 20 minutes with a fresh one, but it got the job done. From there, I changed my previous code's camera input from index 0 (my webcam) to index 1 (the USB 2.0 camera receiver). This was the moment of truth, and I hit the "Start" button on my script, only to realize that my 20 minutes had already expired and the battery had died and needed to be replaced. Once I un-scotched the camera, reconnected the wires, and came back to my PC, I first tested VLC to make sure the feed was working correctly, and then came my second moment of truth: the robot started moving based on the camera feed transmitting from the FPV camera taped to it! I was so happy that I got it to work and could watch it move towards me. I played around with it for the next 20 minutes and recorded what it was good and bad at. I quickly realized that when it lost track of the human, or when it got too close to the human (and consequently lost track), it had no idea what to do and essentially just stopped moving completely, becoming inanimate. I didn't like this behavior since it went against the core of what I wanted it to feel: alive. So, my first order of business was to define a "Search" mode in which the robot would continuously move in a circle (or something of the sort) whenever it didn't see any human in the camera feed, so that it would "find" a human. What I didn't realize is that this would lead to a whole lot more crashing, scuffs, and unintended movement than I originally thought, but it definitely looked, and felt, more alive!
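+
+As a rough illustration of that fallback, the follow/search decision can be thought of as a tiny per-frame state choice. This is a simplified sketch of the idea rather than the actual logic in `computeranalyze.py` described below; the thresholds and the returned (mode, heading, speed) tuple are assumptions for illustration.
+
+```
+def control_step(detection, lost_for_s):
+    """One tick of a simplified follow-or-search decision.
+
+    detection:  (relative_heading_deg, bbox_height_frac) or None when no person is seen.
+    lost_for_s: seconds since the last successful detection.
+    Returns (mode, heading_deg, speed) for some hypothetical drive() call.
+    """
+    LOST_HOLDOFF_S = 1.0   # brief grace period before giving up on the person
+    SEARCH_TURN_DEG = 20   # keep nudging the heading so the robot circles while searching
+
+    if detection is not None:
+        heading, bbox_frac = detection
+        speed = 0 if bbox_frac > 0.66 else 50   # stop once the person fills ~2/3 of the frame
+        return "TRACK", heading, speed
+    if lost_for_s < LOST_HOLDOFF_S:
+        return "HOLD", 0, 0                     # coast briefly; the person may reappear
+    return "SEARCH", SEARCH_TURN_DEG, 30        # slow circling until someone shows up
+```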
+
+Here is everything I used above to create the robot:
+
+
+
+A major problem I had to deal with was the fact that the receiver was in my PC and the PC was directly controlling the robot. This essentially made the robot unusable outside of my room, since my PC's Bluetooth range was only so far. I researched whether the robot could be controlled over Wi-Fi, and unfortunately it could not. So, I tried using the Raspberry Pi again for receiving the camera stream, running the CV analysis, and issuing movement instructions based on it. However, it was readily apparent that the RPi was not suited for such a computational task, averaging only 1-3 fps as opposed to my PC's >30 fps. So I had to devise a different solution to make this robot semi-portable: my RPi would act as the controller and the connection to the robot. It would pass the video stream to a localhost:#### endpoint which the PC could then read and perform CV detection on. Then, the PC would pass instructions back, which the Pi could read and apply to the robot. I implemented a basic version of this with Cloudflare's free tunneling system (similar to the local-tunnel package but more versatile).
+
+The final system splits the job in a way that plays to each device’s strengths. The Ollie and Pi stay together; the Pi handles Bluetooth to the robot, sends the video stream to the PC, and listens for commands over UDP. The camera stream rides the same network path to the desktop, which does detection, tracking, and control mapping. Commands are tiny, frequent, and timestamped so the Pi can ignore anything stale and brake if something errors. I set extremely conservative speed limits (these Spheros can go fast!), with what I thought was a comfortable following distance, as well as a confidence floor that subtly slows the robot when vision is sketchy.
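+
+To illustrate the "tiny, frequent, timestamped" commands, here is a minimal sketch of the kind of packet the PC could send and the staleness check on the Pi side. The field names, the address, and the 250 ms threshold are assumptions for this write-up, not the exact wire format of my scripts, and it assumes the two machines' clocks are roughly in sync.
+
+```
+import json
+import socket
+import time
+
+PI_ADDR = ("192.168.1.50", 7970)   # placeholder address; the real Pi host depends on the network
+STALE_AFTER_S = 0.25               # drop commands older than this and brake instead
+
+def send_command(sock, heading_deg, speed):
+    """PC side: fire-and-forget one small timestamped control packet over UDP."""
+    msg = {"t": time.time(), "heading": heading_deg, "speed": speed}
+    sock.sendto(json.dumps(msg).encode(), PI_ADDR)
+
+def apply_if_fresh(raw, drive, brake):
+    """Pi side: apply a packet only if it is recent; otherwise brake."""
+    msg = json.loads(raw.decode())
+    if time.time() - msg["t"] > STALE_AFTER_S:
+        brake()                                          # stale instruction: safer to stop
+    else:
+        drive(msg["heading"], min(msg["speed"], 80))     # conservative speed cap
+
+# Usage sketch:
+#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+#   send_command(sock, heading_deg=12, speed=50)
+```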
+
+Below is a short demo.
+
+https://github.com/user-attachments/assets/d25c7462-78bb-4aa5-97b3-052394c6d728
+
+### What worked and what didn’t
+
+The biggest wins were exactly the things you’d expect. Offloading vision to the PC made everything snap into place, and moving from HTTP to UDP over Tailscale got rid of the mystery delays that made the robot feel almost unusable. Confidence-aware gains were a subtle but significant quality-of-life fix: a lerp slows the robot down when the detector is unsure. The obvious misses were also predictable. The Pi-only pipeline wasn’t even close to real time, so it never felt very interactive. Cloudflare’s tunnel was great for convenience but introduced just enough buffering to ruin the experience. The BB-8 form factor was adorable but unforgiving once I added a camera and battery (the weight would always tilt it to one side); the Ollie’s flat stance is simply better for this payload.
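+
+That confidence-aware gain is really just a linear interpolation on the commanded speed. A minimal sketch of the idea (the 0.4/0.8 thresholds and the 20% floor are illustrative assumptions, not my exact values):
+
+```
+def confidence_scaled_speed(target_speed, confidence, lo=0.4, hi=0.8):
+    """Lerp the commanded speed down when the detector is unsure.
+
+    Below `lo` confidence the robot creeps; above `hi` it gets the full target speed;
+    in between it scales linearly. Thresholds are illustrative assumptions.
+    """
+    t = max(0.0, min(1.0, (confidence - lo) / (hi - lo)))  # clamp to [0, 1]
+    min_speed = 0.2 * target_speed                         # never fully freeze while tracking
+    return min_speed + t * (target_speed - min_speed)
+
+# e.g. confidence_scaled_speed(60, 0.9) -> 60.0, confidence_scaled_speed(60, 0.5) -> 24.0
+```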
+
+I realized that my end-to-end fixes above, like moving from HTTP to UDP, were good but didn't account for general Wi-Fi speed and inter-network latency. So while the end-to-end loop works well on a single Wi-Fi network, the roughly half-second delay that appears as soon as the setup spans two interacting Wi-Fi networks makes the experience subpar. Since the robot moves and needs to re-calculate on the fly where it is going, end-to-end latency above 0.5s leads to a very big hit in how interactive the robot feels. If we turn the robot's speed down we can mitigate this latency, but the entire interaction gets slowed down as a result. See below for a video of the robot running in the Maker Lab with very jittery end-to-end latency due to different Wi-Fi networks. Even with UDP, this is a challenge. I am open to suggestions for how to improve this without simply making the robot slower so that it can adjust to delayed instructions.
+
+https://github.com/user-attachments/assets/61803de7-fe3a-493a-a4fb-d03df16f09a3
+
+### Lessons for making it more autonomous
+
+The robot feels "smart" when the loop is consistently fast and its internal state is legible. That means explicit state transitions (search, lock, track, re-acquire, etc.) rather than one giant controller fed by raw numbers. An easy improvement would be a little bit of identity memory so it doesn't jump to a new person when someone crosses the frame; even a simple stickiness heuristic helps, and a small ReID embedding could help more. On the sensing side, using the bounding-box height as a distance proxy was the right hack for a first pass but obviously brittle; I imagine a front-facing ToF sensor fused with vision would immediately make spacing smarter. Some issues with it being autonomous were hallucinations and getting stuck. Sometimes it would detect a human where there wasn't one and run straight into a wall. Or, during searching, it would crawl underneath a cabinet or bed and remain stuck there; it actually got stuck underneath a table, bed, cabinet, etc. very often, and I had to fish it out many times while debugging. Another issue that affected its autonomy, and the biggest non-obvious property I discovered, was just how sensitive the movement was to jitter rather than just average latency. You can survive 70ms if it's steady, but you can't survive sudden 200ms spikes even if your average is great (I was using Cloudflare's free tunnel service but realized it wasn't optimal compared to the smoother UDP path).
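+
+A sketch of the kind of stickiness heuristic I mean: prefer the detection closest to the last locked target unless a challenger is clearly more confident. The weighting here is an illustrative assumption, not something I tuned.
+
+```
+def pick_target(detections, last_center, stickiness=0.3):
+    """Choose which detected person to keep following.
+
+    detections:  list of (center_x_norm, confidence) with x normalized to [0, 1].
+    last_center: x of the person we were following last frame, or None.
+    A detection near the previous target gets a bonus, so a passer-by has to be
+    clearly "better" (closer to the lock or more confident) to steal it.
+    """
+    if not detections:
+        return None
+    if last_center is None:
+        return max(detections, key=lambda d: d[1])
+
+    def score(d):
+        cx, conf = d
+        return conf + stickiness * (1.0 - abs(cx - last_center))
+
+    return max(detections, key=score)
+```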
+
+## What I’d do next
+
+The first thing I’d change is the distance estimate. A VL53L1X ToF module on the front, fused with vision, would replace the bounding-box hack and immediately calm the approach behavior. After that, I would add a lightweight ReID head, or at least a simple embedding, to keep identity sticky in crowds. I've seen a duck on YouTube that [learned to follow its owner's red boots](https://www.youtube.com/watch?v=p-nXiHcZsY0), which I find adorable, and that's the kind of targeting I would aim for. On the model side, I would love to test out a TensorRT-quantized nano detector and then track-then-detect between frames for even lower latency. On the interaction side, I would make the gesture gate the default, try the shoulder-follow offset, and dress up the robot with a damped camera mount and an easily replaceable battery slot. Also, emotes and lightshows have been created for Sphero robots in the past, so maybe I'd integrate those for personality.
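+
+For the ToF idea, the fusion could start as something as simple as a weighted blend between the two distance estimates. This is a hypothetical sketch: the calibration constant and the 0.7 weight are guesses, and real VL53L1X readings would need their own driver and filtering.
+
+```
+def fused_distance_mm(tof_mm, bbox_height_frac, w_tof=0.7):
+    """Blend a ToF reading with the vision-based distance proxy.
+
+    tof_mm:           time-of-flight distance in mm, or None when out of range/unreliable.
+    bbox_height_frac: person bounding-box height as a fraction of frame height.
+    The calibration constant and the 0.7 weight are illustrative guesses.
+    """
+    vision_mm = 1000.0 / max(bbox_height_frac, 0.05)   # very rough "taller box = closer" proxy
+    if tof_mm is None:
+        return vision_mm                               # fall back to vision alone
+    return w_tof * tof_mm + (1.0 - w_tof) * vision_mm
+```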
+
+Here is a longer-format video of my pet robot following someone around (left is the FPV camera view, right is what the human sees).
+
+https://github.com/user-attachments/assets/9f486826-e751-48cc-834f-3b95a3daea65
+
+## Discussion
+
+| Person | Viewpoint |
+|---|---|
+| **Sebastian Bidigain** | Thinks it’s “super cool” but also “super scary.” His initial reaction framed it as a potential “murder robot,” noting the same detection/tracking technologies appear in defense contexts (ex. government contractors like Palantir and Anduril outfitting drones to identify humans in active conflicts such as the Russia-Ukraine war). After I clarified that this project detects humans only to follow them and provide socio-affective companionship, more like a friendly pet, he was relieved and appreciated the benign intent. |
+| **Benthan Vu** | Loves the concept and the playfulness, but says the interaction “feels awful” when latency is high. On poor Wi-Fi, he experienced delayed updates that made following unreliable and unsatisfying. He still imagines future toys shipping with similar capabilities, maybe with less invasive sensing than cameras, so long as responsiveness stays consistently snappy. |
+| **Anonymous** | Loves the idea and draws a connection to Daniela Rus’s vision of “turning anything into robots” (ex. “Maybe your chair or table could be robots. You could say, ‘Chair, come over here.’ Or ‘Table, bring me my cookies.’”). She was impressed that a manual toy became autonomous and that it could follow people even with heavy occlusion in the Maker Lab. Overall, she found it a compelling demonstration of adding intelligence to a simple, friendly form factor. |
+
+## Code Pipeline
+
+```
+FPV Cam ──(5.8GHz RF)──> Skydroid USB Receiver
+ └─> [camera_live.py] : captures locally & re-broadcasts over HTTP (MJPEG/JPEG)
+ │
+ └─HTTP→ http://:7965/mjpeg/ (or /snapshot/.jpg)
+ │
+ v
+ [computeranalyze.py] : YOLOv8 on GPU, outputs control
+ ├─ SSE → http://:7966/events (control stream)
+ └─ UDP → :7970 (fast-path control, optional)
+ │
+ v
+ [raspberrypicontroller.py] : BLE driver @ ~66 Hz → Sphero Ollie/BB-8
+```
+
+## Components
+
+### 1) `camera_live.py` : low-latency camera rebroadcaster
+
+- **Run on**: **PC or Raspberry Pi** (whichever has the Skydroid/USB receiver plugged in).
+ Keep this host physically **close to the FPV camera** for a clean RF link.
+- **Purpose**: Capture frames from a local video device (`DEVICE_INDEX`, default `0`), keep **only the newest** frame, and **serve** it as:
+ - `GET /` : minimal web UI (switch camera index, view stream)
+ - `GET /mjpeg` : MJPEG of the **current** camera
+ - `GET /mjpeg/{index}` : MJPEG for a **specific** device index
+ - `GET /snapshot.jpg` and `/snapshot/{index}.jpg` : a single JPEG frame
+ - `GET /raw`, `/raw/{index}` : chrome-less viewer pages
+ - `GET /current` : JSON status (index, resolution, fps, last_error)
+ - `POST /switch` : switch the **current** camera index
+- **Defaults**:
+ - Binds `HOST=127.0.0.1`, `PORT=7965`
+ - Resolution `1280x720` @ `30fps` (override via env)
+- **Notes**:
+ - Uses a tiny, lock-free latest-frame buffer to minimize latency.
+ - Auto-reaps idle per-index sources (`RAW_IDLE_SECONDS`).
+
+**Example run**
+```
+HOST=0.0.0.0 PORT=7965 DEVICE_INDEX=1 VIDEO_WIDTH=1280 VIDEO_HEIGHT=720 VIDEO_FPS=30 \
+python3 camera_live.py
+# Preview: http://:7965/
+# Stream: http://:7965/mjpeg/1
+```
+
+### 2) `computeranalyze.py` : vision + decision “brain” (GPU)
+
+- **Run on**: **PC with GPU** (e.g., RTX 3090).
+- **Purpose**:
+ - Ingest the camera feed (`--kind udp|mjpeg|snapshot`).
+ - Run **YOLOv8-nano** person detection, smooth/track target, compute **relative heading** + **speed**.
+ - Broadcast **control** via:
+ - **SSE** at `http://:7966/events` (latest-only),
+ - **UDP** to the Pi (optional fast path) `--udp-host --udp-port 7970`.
+ - Serve a live **overlay**:
+ - `GET /video` (HTML), `/video.mjpeg` (MJPEG), `/video.jpg` (snapshot), `/status` (JSON).
+- **Defaults**:
+ - Binds `0.0.0.0:7966`.
+ - **UDP video ingest (default)**: `--video-base 'udp://@:7971?fifo_size=5000000&overrun_nonfatal=1'`
+ - Metrics file `computer_metrics.jsonl` (10s rolling averages).
+- **Behavior highlights**:
+ - Search mode (forward + continuous spin) after brief holdoff.
+ - “Stuck” detection (frame-diff) with timed reverse escape.
+ - Stop at close distance (with tiny pivot speed to keep heading updates flowing).
+  - Smoothed speeds, yaw slew limiting, dead-zone centering (sketched below).
+
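+As a rough sketch of those last two behaviors, dead-zone centering plus yaw slew limiting boils down to ignoring tiny heading errors and capping how far the commanded heading may move per control tick. The constants here are illustrative assumptions, not the script's exact values.
+
+```
+def limited_yaw_command(heading_error_deg, prev_cmd_deg,
+                        dead_zone_deg=8.0, max_step_deg=25.0):
+    """Turn a raw heading error into a smooth yaw command.
+
+    Errors inside the dead zone are ignored so the robot doesn't twitch, and the
+    commanded heading may only move max_step_deg per tick. Constants are illustrative.
+    """
+    if abs(heading_error_deg) < dead_zone_deg:
+        target = prev_cmd_deg                 # close enough: hold the current heading
+    else:
+        target = prev_cmd_deg + heading_error_deg
+    step = max(-max_step_deg, min(max_step_deg, target - prev_cmd_deg))
+    return prev_cmd_deg + step
+```
+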
+**Example run (MJPEG ingest from camera_live)**
+```
+python3 computeranalyze.py \
+ --kind mjpeg \
+ --video-base "http://:7965" \
+ --index 1 \
+ --host 0.0.0.0 \
+ --port 7966 \
+ --udp-host --udp-port 7970
+# Control SSE: http://:7966/events
+# Overlay MJPEG: http://:7966/video.mjpeg
+```
+
+**Example run (UDP video ingest)**
+```
+python3 computeranalyze.py \
+ --kind udp \
+ --video-base "udp://@:7971?fifo_size=5000000&overrun_nonfatal=1" \
+ --host 0.0.0.0 --port 7966 \
+ --udp-host --udp-port 7970
+```
+
+### 3) `raspberrypicontroller.py` : robot motor controller (BLE)
+
+- **Run on**: **Raspberry Pi 5** (near the Sphero for strong BLE).
+- **Purpose**:
+ - Maintain **BLE** to Sphero Ollie/BB-8; apply relative-heading + speed at ~**66 Hz** (40 Hz internal tick).
+ - Receive control via:
+ - **SSE client** → `http://:7966/events`
+ - **UDP listener** (default **port 7970**) : ultra-low latency path
+ - Provide a **GUI** (Tkinter): manual joystick, Manual/Autonomous switch, reverse mode, rotate helpers, connect/disconnect.
+ - Record end-to-end **metrics** (PC→Pi network, recv→apply, apply→BLE) to `controller_metrics.jsonl`.
+- **Notes**:
+ - Auto collision recovery (brief back-up + turn).
+ - Caps autonomous max speed via GUI slider.
+ - Optional LED/stabilization setup on connect.
+
+**Run**
+```
+python3 raspberrypicontroller.py
+# In the GUI:
+# 1) Connect Ollie/BB-8 (optionally filter by name)
+# 2) (Optional) Connect SSE to http://:7966
+# 3) Start UDP (default listen port 7970)
+# 4) Switch to "Autonomous"
+```
+
+## Default Ports & Addresses
+
+- **camera_live.py** : HTTP UI/stream on **7965** (`HOST` defaults to `127.0.0.1`)
+- **computeranalyze.py** : HTTP/SSE on **7966**; **UDP control out** to Pi **7970**; **UDP video in** on **7971** (when `--kind udp`)
+- **raspberrypicontroller.py** : **UDP control in** on **7970**; **SSE client** to `http://:7966/events`
+
+*Please make sure to adjust hosts/ports as needed for your network (or expose via Tailscale/Cloudflare tunnel)!*
+
+## Environment & Flags (quick reference)
+
+- `camera_live.py`:
+ - `HOST`, `PORT`, `DEVICE_INDEX`, `VIDEO_WIDTH`, `VIDEO_HEIGHT`, `VIDEO_FPS`, `RAW_IDLE_SECONDS`
+- `computeranalyze.py`:
+ - `--kind udp|mjpeg|snapshot`
+ - `--video-base` (FFmpeg URL for UDP, or `http://:7965` for HTTP kinds)
+ - `--index` (for HTTP kinds)
+ - `--udp-host --udp-port 7970`
+ - Metrics file: `COMPUTER_METRICS_FILE`
+- `raspberrypicontroller.py`:
+ - GUI field for SSE base (default `http://localhost:7966`)
+ - GUI control for UDP listen port (default `7970`)
+ - Metrics file: `CONTROLLER_METRICS_FILE`
+
+## Notes on Placement
+
+- **Run `camera_live.py` on whichever machine has the Skydroid receiver.**
+ This script should be **physically near the FPV camera** (short RF path). Everything else can subscribe to its HTTP stream from elsewhere.
+
+## AI Disclaimer
+
+I used AI coding assistance (like GitHub Copilot) while building these scripts, especially for the Sphero v2 API, which I hadn’t used before. ChatGPT also nudged me to prefer UDP over HTTP for the control loop and helped with the UDP implementation. I originally tried to implement the robot's behavior manually, but found Copilot to be quite intelligent once I formalized what I wanted in words (e.g., when the bounding box for the human is 2/3 of the screen, we should stop the robot's movement using the Spherov2 API's stop command), and I was able to iterate quickly with it on behavior.
+
+## Pictures of the robot
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+## Credits
+
+Thanks to **Niti Parikh** for letting me experiment with her Sphero Ollie and to **Sebastian Bidigain** for performing surgery on the BB-8 head to shove the camera inside. Even though I didn't end up using the BB-8, I'd love to pick it up again and figure out how to properly calibrate the weight (maybe a smaller battery, a custom 3D-printed head, or stronger magnets) so that the BB-8 can become an intelligent following pet as well.
diff --git a/Lab 5/Readme_files/mp.gif b/Lab 5/Readme_files/mp.gif
new file mode 100644
index 0000000000..f23b04e1ac
Binary files /dev/null and b/Lab 5/Readme_files/mp.gif differ
diff --git a/Lab 5/Readme_files/pyt.gif b/Lab 5/Readme_files/pyt.gif
new file mode 100644
index 0000000000..6ccd353978
Binary files /dev/null and b/Lab 5/Readme_files/pyt.gif differ
diff --git a/Lab 5/Readme_files/tml_browser.gif b/Lab 5/Readme_files/tml_browser.gif
new file mode 100644
index 0000000000..f072bcaff3
Binary files /dev/null and b/Lab 5/Readme_files/tml_browser.gif differ
diff --git a/Lab 5/Readme_files/tml_download-model.png b/Lab 5/Readme_files/tml_download-model.png
new file mode 100644
index 0000000000..8cbb377e35
Binary files /dev/null and b/Lab 5/Readme_files/tml_download-model.png differ
diff --git a/Lab 5/Readme_files/tml_pi.gif b/Lab 5/Readme_files/tml_pi.gif
new file mode 100644
index 0000000000..cd22e55881
Binary files /dev/null and b/Lab 5/Readme_files/tml_pi.gif differ
diff --git a/Lab 5/camera_live.py b/Lab 5/camera_live.py
new file mode 100644
index 0000000000..b31826d50e
--- /dev/null
+++ b/Lab 5/camera_live.py
@@ -0,0 +1,526 @@
+import os
+import sys
+import time
+import threading
+from collections import deque
+from typing import Optional, Deque, Dict
+
+import cv2
+from aiohttp import web
+import asyncio
+
+# =========================
+# Config (override via env)
+# =========================
+DEVICE_INDEX = int(os.getenv("DEVICE_INDEX", "0"))
+VIDEO_WIDTH = int(os.getenv("VIDEO_WIDTH", "1280"))
+VIDEO_HEIGHT = int(os.getenv("VIDEO_HEIGHT", "720"))
+VIDEO_FPS = int(os.getenv("VIDEO_FPS", "30"))
+HOST = os.getenv("HOST", "127.0.0.1") # localhost only
+PORT = int(os.getenv("PORT", "7965"))
+PROBE_MAX = int(os.getenv("PROBE_MAX", "10")) # fill dropdown only (no probing)
+RAW_IDLE_SECONDS = int(os.getenv("RAW_IDLE_SECONDS", "60")) # stop per-index sources after idle
+
+# On Windows use DirectShow; on Linux prefer V4L2 (falls back to ANY if unavailable)
+if sys.platform.startswith("win"):
+ CAPTURE_BACKEND = cv2.CAP_DSHOW
+else:
+ CAPTURE_BACKEND = getattr(cv2, "CAP_V4L2", cv2.CAP_ANY)
+
+# =========================
+# Latest-frame buffer
+# =========================
+class LatestFrameBuffer:
+ """Keep only the newest frame to minimize latency."""
+ def __init__(self):
+ self._q: Deque = deque(maxlen=1)
+ self._lock = threading.Lock()
+ self._cond = threading.Condition(self._lock)
+ self._closed = False
+
+ def push(self, item):
+ with self._cond:
+ self._q.clear()
+ self._q.append(item)
+ self._cond.notify_all()
+
+ def get(self, timeout: float = 1.0):
+ end = time.time() + timeout
+ with self._cond:
+ while not self._q and not self._closed:
+ remaining = end - time.time()
+ if remaining <= 0:
+ return None
+ self._cond.wait(remaining)
+ if self._closed:
+ return None
+ return self._q[-1]
+
+ def close(self):
+ with self._cond:
+ self._closed = True
+ self._cond.notify_all()
+
+# =========================
+# Camera source (DSHOW/V4L2 by index)
+# =========================
+class CameraSource:
+ def __init__(self, index=0, width=1280, height=720, fps=30, backend=CAPTURE_BACKEND):
+ self.index = index
+ self.width = width
+ self.height = height
+ self.fps = fps
+ self.backend = backend
+ self.cap: Optional[cv2.VideoCapture] = None
+ self.thread: Optional[threading.Thread] = None
+ self.running = False
+ self.latest = LatestFrameBuffer()
+ self.last_access = time.time()
+
+ def start(self):
+ self.cap = cv2.VideoCapture(self.index, self.backend)
+ if not self.cap.isOpened():
+ raise RuntimeError(f"Could not open video device index={self.index}")
+
+ # Configure for low latency
+ self.cap.set(cv2.CAP_PROP_FRAME_WIDTH, self.width)
+ self.cap.set(cv2.CAP_PROP_FRAME_HEIGHT, self.height)
+ self.cap.set(cv2.CAP_PROP_FPS, self.fps)
+ self.cap.set(cv2.CAP_PROP_BUFFERSIZE, 1)
+
+ self.running = True
+ self.thread = threading.Thread(target=self._loop, daemon=True)
+ self.thread.start()
+
+ def _loop(self):
+ while self.running:
+ ok, frame = self.cap.read()
+ if not ok:
+ time.sleep(0.005)
+ continue
+ self.latest.push(frame)
+
+ def read_latest(self, timeout: float = 0.5):
+ self.last_access = time.time()
+ return self.latest.get(timeout=timeout)
+
+ def stop(self):
+ self.running = False
+ self.latest.close()
+ if self.thread and self.thread.is_alive():
+ self.thread.join(timeout=1.0)
+ if self.cap:
+ self.cap.release()
+
+# =========================
+# Main current camera (UI-controlled)
+# =========================
+camera_lock = threading.RLock()
+camera: Optional[CameraSource] = None
+current_index: Optional[int] = None
+last_error: Optional[str] = None
+
+def _get_current_if_matches(idx: int) -> Optional["CameraSource"]:
+ """Return the current camera source if it's the same index and running."""
+ with camera_lock:
+ if current_index == idx and camera is not None and camera.running:
+ return camera
+ return None
+
+def start_camera(index: int):
+ global camera, current_index, last_error
+ with camera_lock:
+ # Stop existing
+ if camera is not None:
+ try:
+ camera.stop()
+ except Exception:
+ pass
+ camera = None
+ # Start new
+ cs = CameraSource(
+ index=index,
+ width=VIDEO_WIDTH,
+ height=VIDEO_HEIGHT,
+ fps=VIDEO_FPS,
+ backend=CAPTURE_BACKEND
+ )
+ try:
+ cs.start()
+ camera = cs
+ current_index = index
+ last_error = None
+ except Exception as e:
+ last_error = str(e)
+ current_index = None
+
+# =========================
+# Per-index raw sources (lazy, auto-gc)
+# =========================
+raw_lock = threading.RLock()
+raw_sources: Dict[int, CameraSource] = {} # index -> CameraSource
+
+def get_or_start_raw(index: int) -> CameraSource:
+ """Return an active source for this index, starting it if needed.
+
+ IMPORTANT: If the requested index matches the currently open "current" camera,
+ reuse it instead of opening the device a second time (Linux V4L2 often forbids
+ concurrent opens for the same /dev/video*).
+ """
+ cur = _get_current_if_matches(index)
+ if cur is not None:
+ return cur
+ with raw_lock:
+ cs = raw_sources.get(index)
+ if cs and cs.running:
+ return cs
+ # Start a new source
+ cs = CameraSource(
+ index=index,
+ width=VIDEO_WIDTH,
+ height=VIDEO_HEIGHT,
+ fps=VIDEO_FPS,
+ backend=CAPTURE_BACKEND
+ )
+ cs.start()
+ raw_sources[index] = cs
+ return cs
+
+async def reap_idle_sources():
+ """Background task: stop per-index sources that haven't been used recently."""
+ while True:
+ await asyncio.sleep(10)
+ now = time.time()
+ to_stop = []
+ with raw_lock:
+ for idx, src in list(raw_sources.items()):
+ if not src.running:
+ to_stop.append(idx)
+ continue
+ idle = now - src.last_access
+ if idle > RAW_IDLE_SECONDS:
+ to_stop.append(idx)
+ for idx in to_stop:
+ try:
+ raw_sources[idx].stop()
+ except Exception:
+ pass
+ raw_sources.pop(idx, None)
+
+# =========================
+# HTML (simple selector + Raw buttons)
+# =========================
+HTML = f"""<!doctype html>
+<html>
+<head><meta charset="utf-8"><title>Local Drone Feed (MJPEG)</title></head>
+<body style="margin:0;background:#111;color:#eee;font-family:sans-serif">
+  <div style="padding:8px">
+    Camera index: <input id="idx" type="number" min="0" max="{PROBE_MAX - 1}" value="{DEVICE_INDEX}">
+    <button onclick="switchCam()">Switch</button>
+    <a href="/raw" style="color:#8cf">Raw (current)</a>
+    <a href="/raw/{DEVICE_INDEX}" style="color:#8cf">Raw (/idx)</a>
+    <span id="status">Status: loading…</span>
+  </div>
+  <img src="/mjpeg" style="width:100%;display:block" alt="stream">
+  <script>
+    async function refreshStatus() {{
+      const j = await (await fetch('/current')).json();
+      document.getElementById('status').textContent =
+        'Status: index ' + j.current_index + ', ' + j.width + 'x' + j.height + ' @ ' + j.fps + ' fps';
+    }}
+    async function switchCam() {{
+      const index = parseInt(document.getElementById('idx').value, 10);
+      await fetch('/switch', {{
+        method: 'POST',
+        headers: {{'Content-Type': 'application/json'}},
+        body: JSON.stringify({{index: index}})
+      }});
+      location.reload();
+    }}
+    refreshStatus();
+    setInterval(refreshStatus, 2000);
+  </script>
+</body>
+</html>
+"""
+
+# =========================
+# RAW HTML (no chrome)
+# =========================
+RAW_HTML_CURRENT = """<!doctype html>
+<html>
+<head><meta charset="utf-8"><title>Raw</title></head>
+<body style="margin:0;background:#000">
+  <img src="/mjpeg" style="width:100vw;height:100vh;object-fit:contain" alt="raw stream">
+</body>
+</html>
+"""
+
+RAW_HTML_INDEX = """<!doctype html>
+<html>
+<head><meta charset="utf-8"><title>Raw Index</title></head>
+<body style="margin:0;background:#000">
+  <img src="/mjpeg/{index}" style="width:100vw;height:100vh;object-fit:contain" alt="raw stream">
+</body>
+</html>
+"""
+
+# =========================
+# HTTP Handlers
+# =========================
+async def index(_request):
+ return web.Response(text=HTML, content_type="text/html")
+
+async def raw_current(_request):
+ return web.Response(text=RAW_HTML_CURRENT, content_type="text/html")
+
+async def raw_by_index(request):
+ try:
+ idx = int(request.match_info["index"])
+ except Exception:
+ return web.Response(status=400, text="Bad index")
+ page = RAW_HTML_INDEX.replace("{index}", str(idx))
+ return web.Response(text=page, content_type="text/html")
+
+async def current_info(_request):
+ with camera_lock:
+ return web.json_response({
+ "current_index": current_index,
+ "open": camera is not None,
+ "width": VIDEO_WIDTH,
+ "height": VIDEO_HEIGHT,
+ "fps": VIDEO_FPS,
+ "last_error": last_error,
+ })
+
+async def switch_handler(request):
+ data = await request.json()
+ idx = int(data.get("index"))
+ start_camera(idx)
+ if last_error:
+ return web.Response(status=500, text=f"Failed to open index {idx}: {last_error}")
+ return web.json_response({"ok": True, "index": current_index})
+
+async def snapshot_current(_request):
+ with camera_lock:
+ cam = camera
+ if cam is None:
+ return web.Response(status=503, text="No camera open")
+ frame = cam.read_latest(timeout=1.0)
+ if frame is None:
+ return web.Response(status=503, text="No frame")
+ ok, buf = cv2.imencode(".jpg", frame, [int(cv2.IMWRITE_JPEG_QUALITY), 85])
+ if not ok:
+ return web.Response(status=500, text="Encode error")
+ return web.Response(body=buf.tobytes(), content_type="image/jpeg")
+
+async def snapshot_by_index(request):
+ try:
+ idx = int(request.match_info["index"])
+ except Exception:
+ return web.Response(status=400, text="Bad index")
+ try:
+ src = get_or_start_raw(idx) # will reuse current if same index
+ except Exception as e:
+ return web.Response(status=500, text=f"Failed to start index {idx}: {e}")
+ frame = src.read_latest(timeout=1.0)
+ if frame is None:
+ return web.Response(status=503, text="No frame")
+ ok, buf = cv2.imencode(".jpg", frame, [int(cv2.IMWRITE_JPEG_QUALITY), 85])
+ if not ok:
+ return web.Response(status=500, text="Encode error")
+ return web.Response(body=buf.tobytes(), content_type="image/jpeg")
+
+async def mjpeg_current(request):
+ return await _mjpeg_stream_from_source(request, lambda: camera, use_lock=True)
+
+async def mjpeg_by_index(request):
+ try:
+ idx = int(request.match_info["index"])
+ except Exception:
+ return web.Response(status=400, text="Bad index")
+ try:
+ src = get_or_start_raw(idx) # ensure started or reuse current
+ except Exception as e:
+ return web.Response(status=500, text=f"Failed to start index {idx}: {e}")
+ # capture object is stable for this request; no lock needed per iteration
+ return await _mjpeg_stream_from_source(request, lambda: src, use_lock=False)
+
+async def _mjpeg_stream_from_source(request, get_src, use_lock: bool):
+ boundary = "frame"
+ resp = web.StreamResponse(
+ status=200,
+ headers={
+ "Content-Type": f"multipart/x-mixed-replace; boundary={boundary}",
+ "Cache-Control": "no-store, no-cache, must-revalidate, max-age=0",
+ "Pragma": "no-cache",
+ },
+ )
+ await resp.prepare(request)
+
+ try:
+ frame_interval = 1.0 / max(1, VIDEO_FPS)
+ while True:
+ if use_lock:
+ with camera_lock:
+ src = get_src()
+ else:
+ src = get_src()
+ if src is None or not src.running:
+ await asyncio.sleep(0.1)
+ continue
+ frame = src.read_latest(timeout=1.0)
+ if frame is None:
+ await asyncio.sleep(0.01)
+ continue
+ ok, buf = cv2.imencode(".jpg", frame, [int(cv2.IMWRITE_JPEG_QUALITY), 80])
+ if not ok:
+ continue
+ part = (
+ f"--{boundary}\r\n"
+ "Content-Type: image/jpeg\r\n"
+ f"Content-Length: {len(buf)}\r\n\r\n"
+ ).encode("ascii") + buf.tobytes() + b"\r\n"
+ await resp.write(part)
+ await asyncio.sleep(frame_interval * 0.5)
+ except (ConnectionResetError, asyncio.CancelledError, BrokenPipeError):
+ pass
+ finally:
+ try:
+ await resp.write_eof()
+ except Exception:
+ pass
+ return resp
+
+# =========================
+# App lifecycle
+# =========================
+async def on_startup(app):
+ # Start EXACTLY as your original: open DEVICE_INDEX right away
+ start_camera(DEVICE_INDEX)
+ # Launch idle reaper task for per-index sources
+ app["reaper"] = asyncio.create_task(reap_idle_sources())
+ print(f"Serving on http://{HOST}:{PORT} (Ctrl+C to stop)")
+
+async def on_shutdown(app):
+ # Stop reaper
+ try:
+ app["reaper"].cancel()
+ except Exception:
+ pass
+ with camera_lock:
+ if camera is not None:
+ try:
+ camera.stop()
+ except Exception:
+ pass
+ with raw_lock:
+ for src in list(raw_sources.values()):
+ try:
+ src.stop()
+ except Exception:
+ pass
+ raw_sources.clear()
+
+def main():
+ app = web.Application()
+ app.on_startup.append(on_startup)
+ app.on_shutdown.append(on_shutdown)
+ app.add_routes([
+ # Main UI and "current" raw
+ web.get("/", index),
+ web.get("/raw", raw_current),
+ web.get("/mjpeg", mjpeg_current),
+ web.get("/snapshot.jpg", snapshot_current),
+
+ # Per-index raw endpoints (do not change the "current" camera)
+ web.get(r"/raw/{index:\d+}", raw_by_index),
+ web.get(r"/mjpeg/{index:\d+}", mjpeg_by_index),
+ web.get(r"/snapshot/{index:\d+}.jpg", snapshot_by_index),
+
+ # Status + switching for main UI
+ web.get("/current", current_info),
+ web.post("/switch", switch_handler),
+ ])
+ web.run_app(app, host=HOST, port=PORT, access_log=None)
+
+if __name__ == "__main__":
+ try:
+ main()
+ except KeyboardInterrupt:
+ pass
diff --git a/Lab 5/classes.json b/Lab 5/classes.json
new file mode 100644
index 0000000000..ae7d9a9cb7
--- /dev/null
+++ b/Lab 5/classes.json
@@ -0,0 +1 @@
+{"0": "tench, Tinca tinca", "1": "goldfish, Carassius auratus", "2": "great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias", "3": "tiger shark, Galeocerdo cuvieri", "4": "hammerhead, hammerhead shark", "5": "electric ray, crampfish, numbfish, torpedo", "6": "stingray", "7": "cock", "8": "hen", "9": "ostrich, Struthio camelus", "10": "brambling, Fringilla montifringilla", "11": "goldfinch, Carduelis carduelis", "12": "house finch, linnet, Carpodacus mexicanus", "13": "junco, snowbird", "14": "indigo bunting, indigo finch, indigo bird, Passerina cyanea", "15": "robin, American robin, Turdus migratorius", "16": "bulbul", "17": "jay", "18": "magpie", "19": "chickadee", "20": "water ouzel, dipper", "21": "kite", "22": "bald eagle, American eagle, Haliaeetus leucocephalus", "23": "vulture", "24": "great grey owl, great gray owl, Strix nebulosa", "25": "European fire salamander, Salamandra salamandra", "26": "common newt, Triturus vulgaris", "27": "eft", "28": "spotted salamander, Ambystoma maculatum", "29": "axolotl, mud puppy, Ambystoma mexicanum", "30": "bullfrog, Rana catesbeiana", "31": "tree frog, tree-frog", "32": "tailed frog, bell toad, ribbed toad, tailed toad, Ascaphus trui", "33": "loggerhead, loggerhead turtle, Caretta caretta", "34": "leatherback turtle, leatherback, leathery turtle, Dermochelys coriacea", "35": "mud turtle", "36": "terrapin", "37": "box turtle, box tortoise", "38": "banded gecko", "39": "common iguana, iguana, Iguana iguana", "40": "American chameleon, anole, Anolis carolinensis", "41": "whiptail, whiptail lizard", "42": "agama", "43": "frilled lizard, Chlamydosaurus kingi", "44": "alligator lizard", "45": "Gila monster, Heloderma suspectum", "46": "green lizard, Lacerta viridis", "47": "African chameleon, Chamaeleo chamaeleon", "48": "Komodo dragon, Komodo lizard, dragon lizard, giant lizard, Varanus komodoensis", "49": "African crocodile, Nile crocodile, Crocodylus niloticus", "50": "American alligator, Alligator mississipiensis", "51": "triceratops", "52": "thunder snake, worm snake, Carphophis amoenus", "53": "ringneck snake, ring-necked snake, ring snake", "54": "hognose snake, puff adder, sand viper", "55": "green snake, grass snake", "56": "king snake, kingsnake", "57": "garter snake, grass snake", "58": "water snake", "59": "vine snake", "60": "night snake, Hypsiglena torquata", "61": "boa constrictor, Constrictor constrictor", "62": "rock python, rock snake, Python sebae", "63": "Indian cobra, Naja naja", "64": "green mamba", "65": "sea snake", "66": "horned viper, cerastes, sand viper, horned asp, Cerastes cornutus", "67": "diamondback, diamondback rattlesnake, Crotalus adamanteus", "68": "sidewinder, horned rattlesnake, Crotalus cerastes", "69": "trilobite", "70": "harvestman, daddy longlegs, Phalangium opilio", "71": "scorpion", "72": "black and gold garden spider, Argiope aurantia", "73": "barn spider, Araneus cavaticus", "74": "garden spider, Aranea diademata", "75": "black widow, Latrodectus mactans", "76": "tarantula", "77": "wolf spider, hunting spider", "78": "tick", "79": "centipede", "80": "black grouse", "81": "ptarmigan", "82": "ruffed grouse, partridge, Bonasa umbellus", "83": "prairie chicken, prairie grouse, prairie fowl", "84": "peacock", "85": "quail", "86": "partridge", "87": "African grey, African gray, Psittacus erithacus", "88": "macaw", "89": "sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita", "90": "lorikeet", "91": "coucal", "92": "bee eater", "93": "hornbill", "94": "hummingbird", "95": 
"jacamar", "96": "toucan", "97": "drake", "98": "red-breasted merganser, Mergus serrator", "99": "goose", "100": "black swan, Cygnus atratus", "101": "tusker", "102": "echidna, spiny anteater, anteater", "103": "platypus, duckbill, duckbilled platypus, duck-billed platypus, Ornithorhynchus anatinus", "104": "wallaby, brush kangaroo", "105": "koala, koala bear, kangaroo bear, native bear, Phascolarctos cinereus", "106": "wombat", "107": "jellyfish", "108": "sea anemone, anemone", "109": "brain coral", "110": "flatworm, platyhelminth", "111": "nematode, nematode worm, roundworm", "112": "conch", "113": "snail", "114": "slug", "115": "sea slug, nudibranch", "116": "chiton, coat-of-mail shell, sea cradle, polyplacophore", "117": "chambered nautilus, pearly nautilus, nautilus", "118": "Dungeness crab, Cancer magister", "119": "rock crab, Cancer irroratus", "120": "fiddler crab", "121": "king crab, Alaska crab, Alaskan king crab, Alaska king crab, Paralithodes camtschatica", "122": "American lobster, Northern lobster, Maine lobster, Homarus americanus", "123": "spiny lobster, langouste, rock lobster, crawfish, crayfish, sea crawfish", "124": "crayfish, crawfish, crawdad, crawdaddy", "125": "hermit crab", "126": "isopod", "127": "white stork, Ciconia ciconia", "128": "black stork, Ciconia nigra", "129": "spoonbill", "130": "flamingo", "131": "little blue heron, Egretta caerulea", "132": "American egret, great white heron, Egretta albus", "133": "bittern", "134": "crane", "135": "limpkin, Aramus pictus", "136": "European gallinule, Porphyrio porphyrio", "137": "American coot, marsh hen, mud hen, water hen, Fulica americana", "138": "bustard", "139": "ruddy turnstone, Arenaria interpres", "140": "red-backed sandpiper, dunlin, Erolia alpina", "141": "redshank, Tringa totanus", "142": "dowitcher", "143": "oystercatcher, oyster catcher", "144": "pelican", "145": "king penguin, Aptenodytes patagonica", "146": "albatross, mollymawk", "147": "grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus", "148": "killer whale, killer, orca, grampus, sea wolf, Orcinus orca", "149": "dugong, Dugong dugon", "150": "sea lion", "151": "Chihuahua", "152": "Japanese spaniel", "153": "Maltese dog, Maltese terrier, Maltese", "154": "Pekinese, Pekingese, Peke", "155": "Shih-Tzu", "156": "Blenheim spaniel", "157": "papillon", "158": "toy terrier", "159": "Rhodesian ridgeback", "160": "Afghan hound, Afghan", "161": "basset, basset hound", "162": "beagle", "163": "bloodhound, sleuthhound", "164": "bluetick", "165": "black-and-tan coonhound", "166": "Walker hound, Walker foxhound", "167": "English foxhound", "168": "redbone", "169": "borzoi, Russian wolfhound", "170": "Irish wolfhound", "171": "Italian greyhound", "172": "whippet", "173": "Ibizan hound, Ibizan Podenco", "174": "Norwegian elkhound, elkhound", "175": "otterhound, otter hound", "176": "Saluki, gazelle hound", "177": "Scottish deerhound, deerhound", "178": "Weimaraner", "179": "Staffordshire bullterrier, Staffordshire bull terrier", "180": "American Staffordshire terrier, Staffordshire terrier, American pit bull terrier, pit bull terrier", "181": "Bedlington terrier", "182": "Border terrier", "183": "Kerry blue terrier", "184": "Irish terrier", "185": "Norfolk terrier", "186": "Norwich terrier", "187": "Yorkshire terrier", "188": "wire-haired fox terrier", "189": "Lakeland terrier", "190": "Sealyham terrier, Sealyham", "191": "Airedale, Airedale terrier", "192": "cairn, cairn terrier", "193": "Australian terrier", "194": "Dandie Dinmont, 
Dandie Dinmont terrier", "195": "Boston bull, Boston terrier", "196": "miniature schnauzer", "197": "giant schnauzer", "198": "standard schnauzer", "199": "Scotch terrier, Scottish terrier, Scottie", "200": "Tibetan terrier, chrysanthemum dog", "201": "silky terrier, Sydney silky", "202": "soft-coated wheaten terrier", "203": "West Highland white terrier", "204": "Lhasa, Lhasa apso", "205": "flat-coated retriever", "206": "curly-coated retriever", "207": "golden retriever", "208": "Labrador retriever", "209": "Chesapeake Bay retriever", "210": "German short-haired pointer", "211": "vizsla, Hungarian pointer", "212": "English setter", "213": "Irish setter, red setter", "214": "Gordon setter", "215": "Brittany spaniel", "216": "clumber, clumber spaniel", "217": "English springer, English springer spaniel", "218": "Welsh springer spaniel", "219": "cocker spaniel, English cocker spaniel, cocker", "220": "Sussex spaniel", "221": "Irish water spaniel", "222": "kuvasz", "223": "schipperke", "224": "groenendael", "225": "malinois", "226": "briard", "227": "kelpie", "228": "komondor", "229": "Old English sheepdog, bobtail", "230": "Shetland sheepdog, Shetland sheep dog, Shetland", "231": "collie", "232": "Border collie", "233": "Bouvier des Flandres, Bouviers des Flandres", "234": "Rottweiler", "235": "German shepherd, German shepherd dog, German police dog, alsatian", "236": "Doberman, Doberman pinscher", "237": "miniature pinscher", "238": "Greater Swiss Mountain dog", "239": "Bernese mountain dog", "240": "Appenzeller", "241": "EntleBucher", "242": "boxer", "243": "bull mastiff", "244": "Tibetan mastiff", "245": "French bulldog", "246": "Great Dane", "247": "Saint Bernard, St Bernard", "248": "Eskimo dog, husky", "249": "malamute, malemute, Alaskan malamute", "250": "Siberian husky", "251": "dalmatian, coach dog, carriage dog", "252": "affenpinscher, monkey pinscher, monkey dog", "253": "basenji", "254": "pug, pug-dog", "255": "Leonberg", "256": "Newfoundland, Newfoundland dog", "257": "Great Pyrenees", "258": "Samoyed, Samoyede", "259": "Pomeranian", "260": "chow, chow chow", "261": "keeshond", "262": "Brabancon griffon", "263": "Pembroke, Pembroke Welsh corgi", "264": "Cardigan, Cardigan Welsh corgi", "265": "toy poodle", "266": "miniature poodle", "267": "standard poodle", "268": "Mexican hairless", "269": "timber wolf, grey wolf, gray wolf, Canis lupus", "270": "white wolf, Arctic wolf, Canis lupus tundrarum", "271": "red wolf, maned wolf, Canis rufus, Canis niger", "272": "coyote, prairie wolf, brush wolf, Canis latrans", "273": "dingo, warrigal, warragal, Canis dingo", "274": "dhole, Cuon alpinus", "275": "African hunting dog, hyena dog, Cape hunting dog, Lycaon pictus", "276": "hyena, hyaena", "277": "red fox, Vulpes vulpes", "278": "kit fox, Vulpes macrotis", "279": "Arctic fox, white fox, Alopex lagopus", "280": "grey fox, gray fox, Urocyon cinereoargenteus", "281": "tabby, tabby cat", "282": "tiger cat", "283": "Persian cat", "284": "Siamese cat, Siamese", "285": "Egyptian cat", "286": "cougar, puma, catamount, mountain lion, painter, panther, Felis concolor", "287": "lynx, catamount", "288": "leopard, Panthera pardus", "289": "snow leopard, ounce, Panthera uncia", "290": "jaguar, panther, Panthera onca, Felis onca", "291": "lion, king of beasts, Panthera leo", "292": "tiger, Panthera tigris", "293": "cheetah, chetah, Acinonyx jubatus", "294": "brown bear, bruin, Ursus arctos", "295": "American black bear, black bear, Ursus americanus, Euarctos americanus", "296": "ice bear, polar 
bear, Ursus Maritimus, Thalarctos maritimus", "297": "sloth bear, Melursus ursinus, Ursus ursinus", "298": "mongoose", "299": "meerkat, mierkat", "300": "tiger beetle", "301": "ladybug, ladybeetle, lady beetle, ladybird, ladybird beetle", "302": "ground beetle, carabid beetle", "303": "long-horned beetle, longicorn, longicorn beetle", "304": "leaf beetle, chrysomelid", "305": "dung beetle", "306": "rhinoceros beetle", "307": "weevil", "308": "fly", "309": "bee", "310": "ant, emmet, pismire", "311": "grasshopper, hopper", "312": "cricket", "313": "walking stick, walkingstick, stick insect", "314": "cockroach, roach", "315": "mantis, mantid", "316": "cicada, cicala", "317": "leafhopper", "318": "lacewing, lacewing fly", "319": "dragonfly, darning needle, devil's darning needle, sewing needle, snake feeder, snake doctor, mosquito hawk, skeeter hawk", "320": "damselfly", "321": "admiral", "322": "ringlet, ringlet butterfly", "323": "monarch, monarch butterfly, milkweed butterfly, Danaus plexippus", "324": "cabbage butterfly", "325": "sulphur butterfly, sulfur butterfly", "326": "lycaenid, lycaenid butterfly", "327": "starfish, sea star", "328": "sea urchin", "329": "sea cucumber, holothurian", "330": "wood rabbit, cottontail, cottontail rabbit", "331": "hare", "332": "Angora, Angora rabbit", "333": "hamster", "334": "porcupine, hedgehog", "335": "fox squirrel, eastern fox squirrel, Sciurus niger", "336": "marmot", "337": "beaver", "338": "guinea pig, Cavia cobaya", "339": "sorrel", "340": "zebra", "341": "hog, pig, grunter, squealer, Sus scrofa", "342": "wild boar, boar, Sus scrofa", "343": "warthog", "344": "hippopotamus, hippo, river horse, Hippopotamus amphibius", "345": "ox", "346": "water buffalo, water ox, Asiatic buffalo, Bubalus bubalis", "347": "bison", "348": "ram, tup", "349": "bighorn, bighorn sheep, cimarron, Rocky Mountain bighorn, Rocky Mountain sheep, Ovis canadensis", "350": "ibex, Capra ibex", "351": "hartebeest", "352": "impala, Aepyceros melampus", "353": "gazelle", "354": "Arabian camel, dromedary, Camelus dromedarius", "355": "llama", "356": "weasel", "357": "mink", "358": "polecat, fitch, foulmart, foumart, Mustela putorius", "359": "black-footed ferret, ferret, Mustela nigripes", "360": "otter", "361": "skunk, polecat, wood pussy", "362": "badger", "363": "armadillo", "364": "three-toed sloth, ai, Bradypus tridactylus", "365": "orangutan, orang, orangutang, Pongo pygmaeus", "366": "gorilla, Gorilla gorilla", "367": "chimpanzee, chimp, Pan troglodytes", "368": "gibbon, Hylobates lar", "369": "siamang, Hylobates syndactylus, Symphalangus syndactylus", "370": "guenon, guenon monkey", "371": "patas, hussar monkey, Erythrocebus patas", "372": "baboon", "373": "macaque", "374": "langur", "375": "colobus, colobus monkey", "376": "proboscis monkey, Nasalis larvatus", "377": "marmoset", "378": "capuchin, ringtail, Cebus capucinus", "379": "howler monkey, howler", "380": "titi, titi monkey", "381": "spider monkey, Ateles geoffroyi", "382": "squirrel monkey, Saimiri sciureus", "383": "Madagascar cat, ring-tailed lemur, Lemur catta", "384": "indri, indris, Indri indri, Indri brevicaudatus", "385": "Indian elephant, Elephas maximus", "386": "African elephant, Loxodonta africana", "387": "lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens", "388": "giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca", "389": "barracouta, snoek", "390": "eel", "391": "coho, cohoe, coho salmon, blue jack, silver salmon, Oncorhynchus kisutch", "392": "rock beauty, 
Holocanthus tricolor", "393": "anemone fish", "394": "sturgeon", "395": "gar, garfish, garpike, billfish, Lepisosteus osseus", "396": "lionfish", "397": "puffer, pufferfish, blowfish, globefish", "398": "abacus", "399": "abaya", "400": "academic gown, academic robe, judge's robe", "401": "accordion, piano accordion, squeeze box", "402": "acoustic guitar", "403": "aircraft carrier, carrier, flattop, attack aircraft carrier", "404": "airliner", "405": "airship, dirigible", "406": "altar", "407": "ambulance", "408": "amphibian, amphibious vehicle", "409": "analog clock", "410": "apiary, bee house", "411": "apron", "412": "ashcan, trash can, garbage can, wastebin, ash bin, ash-bin, ashbin, dustbin, trash barrel, trash bin", "413": "assault rifle, assault gun", "414": "backpack, back pack, knapsack, packsack, rucksack, haversack", "415": "bakery, bakeshop, bakehouse", "416": "balance beam, beam", "417": "balloon", "418": "ballpoint, ballpoint pen, ballpen, Biro", "419": "Band Aid", "420": "banjo", "421": "bannister, banister, balustrade, balusters, handrail", "422": "barbell", "423": "barber chair", "424": "barbershop", "425": "barn", "426": "barometer", "427": "barrel, cask", "428": "barrow, garden cart, lawn cart, wheelbarrow", "429": "baseball", "430": "basketball", "431": "bassinet", "432": "bassoon", "433": "bathing cap, swimming cap", "434": "bath towel", "435": "bathtub, bathing tub, bath, tub", "436": "beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon", "437": "beacon, lighthouse, beacon light, pharos", "438": "beaker", "439": "bearskin, busby, shako", "440": "beer bottle", "441": "beer glass", "442": "bell cote, bell cot", "443": "bib", "444": "bicycle-built-for-two, tandem bicycle, tandem", "445": "bikini, two-piece", "446": "binder, ring-binder", "447": "binoculars, field glasses, opera glasses", "448": "birdhouse", "449": "boathouse", "450": "bobsled, bobsleigh, bob", "451": "bolo tie, bolo, bola tie, bola", "452": "bonnet, poke bonnet", "453": "bookcase", "454": "bookshop, bookstore, bookstall", "455": "bottlecap", "456": "bow", "457": "bow tie, bow-tie, bowtie", "458": "brass, memorial tablet, plaque", "459": "brassiere, bra, bandeau", "460": "breakwater, groin, groyne, mole, bulwark, seawall, jetty", "461": "breastplate, aegis, egis", "462": "broom", "463": "bucket, pail", "464": "buckle", "465": "bulletproof vest", "466": "bullet train, bullet", "467": "butcher shop, meat market", "468": "cab, hack, taxi, taxicab", "469": "caldron, cauldron", "470": "candle, taper, wax light", "471": "cannon", "472": "canoe", "473": "can opener, tin opener", "474": "cardigan", "475": "car mirror", "476": "carousel, carrousel, merry-go-round, roundabout, whirligig", "477": "carpenter's kit, tool kit", "478": "carton", "479": "car wheel", "480": "cash machine, cash dispenser, automated teller machine, automatic teller machine, automated teller, automatic teller, ATM", "481": "cassette", "482": "cassette player", "483": "castle", "484": "catamaran", "485": "CD player", "486": "cello, violoncello", "487": "cellular telephone, cellular phone, cellphone, cell, mobile phone", "488": "chain", "489": "chainlink fence", "490": "chain mail, ring mail, mail, chain armor, chain armour, ring armor, ring armour", "491": "chain saw, chainsaw", "492": "chest", "493": "chiffonier, commode", "494": "chime, bell, gong", "495": "china cabinet, china closet", "496": "Christmas stocking", "497": "church, church building", "498": "cinema, movie theater, movie theatre, movie house, 
picture palace", "499": "cleaver, meat cleaver, chopper", "500": "cliff dwelling", "501": "cloak", "502": "clog, geta, patten, sabot", "503": "cocktail shaker", "504": "coffee mug", "505": "coffeepot", "506": "coil, spiral, volute, whorl, helix", "507": "combination lock", "508": "computer keyboard, keypad", "509": "confectionery, confectionary, candy store", "510": "container ship, containership, container vessel", "511": "convertible", "512": "corkscrew, bottle screw", "513": "cornet, horn, trumpet, trump", "514": "cowboy boot", "515": "cowboy hat, ten-gallon hat", "516": "cradle", "517": "crane", "518": "crash helmet", "519": "crate", "520": "crib, cot", "521": "Crock Pot", "522": "croquet ball", "523": "crutch", "524": "cuirass", "525": "dam, dike, dyke", "526": "desk", "527": "desktop computer", "528": "dial telephone, dial phone", "529": "diaper, nappy, napkin", "530": "digital clock", "531": "digital watch", "532": "dining table, board", "533": "dishrag, dishcloth", "534": "dishwasher, dish washer, dishwashing machine", "535": "disk brake, disc brake", "536": "dock, dockage, docking facility", "537": "dogsled, dog sled, dog sleigh", "538": "dome", "539": "doormat, welcome mat", "540": "drilling platform, offshore rig", "541": "drum, membranophone, tympan", "542": "drumstick", "543": "dumbbell", "544": "Dutch oven", "545": "electric fan, blower", "546": "electric guitar", "547": "electric locomotive", "548": "entertainment center", "549": "envelope", "550": "espresso maker", "551": "face powder", "552": "feather boa, boa", "553": "file, file cabinet, filing cabinet", "554": "fireboat", "555": "fire engine, fire truck", "556": "fire screen, fireguard", "557": "flagpole, flagstaff", "558": "flute, transverse flute", "559": "folding chair", "560": "football helmet", "561": "forklift", "562": "fountain", "563": "fountain pen", "564": "four-poster", "565": "freight car", "566": "French horn, horn", "567": "frying pan, frypan, skillet", "568": "fur coat", "569": "garbage truck, dustcart", "570": "gasmask, respirator, gas helmet", "571": "gas pump, gasoline pump, petrol pump, island dispenser", "572": "goblet", "573": "go-kart", "574": "golf ball", "575": "golfcart, golf cart", "576": "gondola", "577": "gong, tam-tam", "578": "gown", "579": "grand piano, grand", "580": "greenhouse, nursery, glasshouse", "581": "grille, radiator grille", "582": "grocery store, grocery, food market, market", "583": "guillotine", "584": "hair slide", "585": "hair spray", "586": "half track", "587": "hammer", "588": "hamper", "589": "hand blower, blow dryer, blow drier, hair dryer, hair drier", "590": "hand-held computer, hand-held microcomputer", "591": "handkerchief, hankie, hanky, hankey", "592": "hard disc, hard disk, fixed disk", "593": "harmonica, mouth organ, harp, mouth harp", "594": "harp", "595": "harvester, reaper", "596": "hatchet", "597": "holster", "598": "home theater, home theatre", "599": "honeycomb", "600": "hook, claw", "601": "hoopskirt, crinoline", "602": "horizontal bar, high bar", "603": "horse cart, horse-cart", "604": "hourglass", "605": "iPod", "606": "iron, smoothing iron", "607": "jack-o'-lantern", "608": "jean, blue jean, denim", "609": "jeep, landrover", "610": "jersey, T-shirt, tee shirt", "611": "jigsaw puzzle", "612": "jinrikisha, ricksha, rickshaw", "613": "joystick", "614": "kimono", "615": "knee pad", "616": "knot", "617": "lab coat, laboratory coat", "618": "ladle", "619": "lampshade, lamp shade", "620": "laptop, laptop computer", "621": "lawn mower, mower", "622": "lens 
cap, lens cover", "623": "letter opener, paper knife, paperknife", "624": "library", "625": "lifeboat", "626": "lighter, light, igniter, ignitor", "627": "limousine, limo", "628": "liner, ocean liner", "629": "lipstick, lip rouge", "630": "Loafer", "631": "lotion", "632": "loudspeaker, speaker, speaker unit, loudspeaker system, speaker system", "633": "loupe, jeweler's loupe", "634": "lumbermill, sawmill", "635": "magnetic compass", "636": "mailbag, postbag", "637": "mailbox, letter box", "638": "maillot", "639": "maillot, tank suit", "640": "manhole cover", "641": "maraca", "642": "marimba, xylophone", "643": "mask", "644": "matchstick", "645": "maypole", "646": "maze, labyrinth", "647": "measuring cup", "648": "medicine chest, medicine cabinet", "649": "megalith, megalithic structure", "650": "microphone, mike", "651": "microwave, microwave oven", "652": "military uniform", "653": "milk can", "654": "minibus", "655": "miniskirt, mini", "656": "minivan", "657": "missile", "658": "mitten", "659": "mixing bowl", "660": "mobile home, manufactured home", "661": "Model T", "662": "modem", "663": "monastery", "664": "monitor", "665": "moped", "666": "mortar", "667": "mortarboard", "668": "mosque", "669": "mosquito net", "670": "motor scooter, scooter", "671": "mountain bike, all-terrain bike, off-roader", "672": "mountain tent", "673": "mouse, computer mouse", "674": "mousetrap", "675": "moving van", "676": "muzzle", "677": "nail", "678": "neck brace", "679": "necklace", "680": "nipple", "681": "notebook, notebook computer", "682": "obelisk", "683": "oboe, hautboy, hautbois", "684": "ocarina, sweet potato", "685": "odometer, hodometer, mileometer, milometer", "686": "oil filter", "687": "organ, pipe organ", "688": "oscilloscope, scope, cathode-ray oscilloscope, CRO", "689": "overskirt", "690": "oxcart", "691": "oxygen mask", "692": "packet", "693": "paddle, boat paddle", "694": "paddlewheel, paddle wheel", "695": "padlock", "696": "paintbrush", "697": "pajama, pyjama, pj's, jammies", "698": "palace", "699": "panpipe, pandean pipe, syrinx", "700": "paper towel", "701": "parachute, chute", "702": "parallel bars, bars", "703": "park bench", "704": "parking meter", "705": "passenger car, coach, carriage", "706": "patio, terrace", "707": "pay-phone, pay-station", "708": "pedestal, plinth, footstall", "709": "pencil box, pencil case", "710": "pencil sharpener", "711": "perfume, essence", "712": "Petri dish", "713": "photocopier", "714": "pick, plectrum, plectron", "715": "pickelhaube", "716": "picket fence, paling", "717": "pickup, pickup truck", "718": "pier", "719": "piggy bank, penny bank", "720": "pill bottle", "721": "pillow", "722": "ping-pong ball", "723": "pinwheel", "724": "pirate, pirate ship", "725": "pitcher, ewer", "726": "plane, carpenter's plane, woodworking plane", "727": "planetarium", "728": "plastic bag", "729": "plate rack", "730": "plow, plough", "731": "plunger, plumber's helper", "732": "Polaroid camera, Polaroid Land camera", "733": "pole", "734": "police van, police wagon, paddy wagon, patrol wagon, wagon, black Maria", "735": "poncho", "736": "pool table, billiard table, snooker table", "737": "pop bottle, soda bottle", "738": "pot, flowerpot", "739": "potter's wheel", "740": "power drill", "741": "prayer rug, prayer mat", "742": "printer", "743": "prison, prison house", "744": "projectile, missile", "745": "projector", "746": "puck, hockey puck", "747": "punching bag, punch bag, punching ball, punchball", "748": "purse", "749": "quill, quill pen", "750": "quilt, comforter, 
comfort, puff", "751": "racer, race car, racing car", "752": "racket, racquet", "753": "radiator", "754": "radio, wireless", "755": "radio telescope, radio reflector", "756": "rain barrel", "757": "recreational vehicle, RV, R.V.", "758": "reel", "759": "reflex camera", "760": "refrigerator, icebox", "761": "remote control, remote", "762": "restaurant, eating house, eating place, eatery", "763": "revolver, six-gun, six-shooter", "764": "rifle", "765": "rocking chair, rocker", "766": "rotisserie", "767": "rubber eraser, rubber, pencil eraser", "768": "rugby ball", "769": "rule, ruler", "770": "running shoe", "771": "safe", "772": "safety pin", "773": "saltshaker, salt shaker", "774": "sandal", "775": "sarong", "776": "sax, saxophone", "777": "scabbard", "778": "scale, weighing machine", "779": "school bus", "780": "schooner", "781": "scoreboard", "782": "screen, CRT screen", "783": "screw", "784": "screwdriver", "785": "seat belt, seatbelt", "786": "sewing machine", "787": "shield, buckler", "788": "shoe shop, shoe-shop, shoe store", "789": "shoji", "790": "shopping basket", "791": "shopping cart", "792": "shovel", "793": "shower cap", "794": "shower curtain", "795": "ski", "796": "ski mask", "797": "sleeping bag", "798": "slide rule, slipstick", "799": "sliding door", "800": "slot, one-armed bandit", "801": "snorkel", "802": "snowmobile", "803": "snowplow, snowplough", "804": "soap dispenser", "805": "soccer ball", "806": "sock", "807": "solar dish, solar collector, solar furnace", "808": "sombrero", "809": "soup bowl", "810": "space bar", "811": "space heater", "812": "space shuttle", "813": "spatula", "814": "speedboat", "815": "spider web, spider's web", "816": "spindle", "817": "sports car, sport car", "818": "spotlight, spot", "819": "stage", "820": "steam locomotive", "821": "steel arch bridge", "822": "steel drum", "823": "stethoscope", "824": "stole", "825": "stone wall", "826": "stopwatch, stop watch", "827": "stove", "828": "strainer", "829": "streetcar, tram, tramcar, trolley, trolley car", "830": "stretcher", "831": "studio couch, day bed", "832": "stupa, tope", "833": "submarine, pigboat, sub, U-boat", "834": "suit, suit of clothes", "835": "sundial", "836": "sunglass", "837": "sunglasses, dark glasses, shades", "838": "sunscreen, sunblock, sun blocker", "839": "suspension bridge", "840": "swab, swob, mop", "841": "sweatshirt", "842": "swimming trunks, bathing trunks", "843": "swing", "844": "switch, electric switch, electrical switch", "845": "syringe", "846": "table lamp", "847": "tank, army tank, armored combat vehicle, armoured combat vehicle", "848": "tape player", "849": "teapot", "850": "teddy, teddy bear", "851": "television, television system", "852": "tennis ball", "853": "thatch, thatched roof", "854": "theater curtain, theatre curtain", "855": "thimble", "856": "thresher, thrasher, threshing machine", "857": "throne", "858": "tile roof", "859": "toaster", "860": "tobacco shop, tobacconist shop, tobacconist", "861": "toilet seat", "862": "torch", "863": "totem pole", "864": "tow truck, tow car, wrecker", "865": "toyshop", "866": "tractor", "867": "trailer truck, tractor trailer, trucking rig, rig, articulated lorry, semi", "868": "tray", "869": "trench coat", "870": "tricycle, trike, velocipede", "871": "trimaran", "872": "tripod", "873": "triumphal arch", "874": "trolleybus, trolley coach, trackless trolley", "875": "trombone", "876": "tub, vat", "877": "turnstile", "878": "typewriter keyboard", "879": "umbrella", "880": "unicycle, monocycle", "881": "upright, 
upright piano", "882": "vacuum, vacuum cleaner", "883": "vase", "884": "vault", "885": "velvet", "886": "vending machine", "887": "vestment", "888": "viaduct", "889": "violin, fiddle", "890": "volleyball", "891": "waffle iron", "892": "wall clock", "893": "wallet, billfold, notecase, pocketbook", "894": "wardrobe, closet, press", "895": "warplane, military plane", "896": "washbasin, handbasin, washbowl, lavabo, wash-hand basin", "897": "washer, automatic washer, washing machine", "898": "water bottle", "899": "water jug", "900": "water tower", "901": "whiskey jug", "902": "whistle", "903": "wig", "904": "window screen", "905": "window shade", "906": "Windsor tie", "907": "wine bottle", "908": "wing", "909": "wok", "910": "wooden spoon", "911": "wool, woolen, woollen", "912": "worm fence, snake fence, snake-rail fence, Virginia fence", "913": "wreck", "914": "yawl", "915": "yurt", "916": "web site, website, internet site, site", "917": "comic book", "918": "crossword puzzle, crossword", "919": "street sign", "920": "traffic light, traffic signal, stoplight", "921": "book jacket, dust cover, dust jacket, dust wrapper", "922": "menu", "923": "plate", "924": "guacamole", "925": "consomme", "926": "hot pot, hotpot", "927": "trifle", "928": "ice cream, icecream", "929": "ice lolly, lolly, lollipop, popsicle", "930": "French loaf", "931": "bagel, beigel", "932": "pretzel", "933": "cheeseburger", "934": "hotdog, hot dog, red hot", "935": "mashed potato", "936": "head cabbage", "937": "broccoli", "938": "cauliflower", "939": "zucchini, courgette", "940": "spaghetti squash", "941": "acorn squash", "942": "butternut squash", "943": "cucumber, cuke", "944": "artichoke, globe artichoke", "945": "bell pepper", "946": "cardoon", "947": "mushroom", "948": "Granny Smith", "949": "strawberry", "950": "orange", "951": "lemon", "952": "fig", "953": "pineapple, ananas", "954": "banana", "955": "jackfruit, jak, jack", "956": "custard apple", "957": "pomegranate", "958": "hay", "959": "carbonara", "960": "chocolate sauce, chocolate syrup", "961": "dough", "962": "meat loaf, meatloaf", "963": "pizza, pizza pie", "964": "potpie", "965": "burrito", "966": "red wine", "967": "espresso", "968": "cup", "969": "eggnog", "970": "alp", "971": "bubble", "972": "cliff, drop, drop-off", "973": "coral reef", "974": "geyser", "975": "lakeside, lakeshore", "976": "promontory, headland, head, foreland", "977": "sandbar, sand bar", "978": "seashore, coast, seacoast, sea-coast", "979": "valley, vale", "980": "volcano", "981": "ballplayer, baseball player", "982": "groom, bridegroom", "983": "scuba diver", "984": "rapeseed", "985": "daisy", "986": "yellow lady's slipper, yellow lady-slipper, Cypripedium calceolus, Cypripedium parviflorum", "987": "corn", "988": "acorn", "989": "hip, rose hip, rosehip", "990": "buckeye, horse chestnut, conker", "991": "coral fungus", "992": "agaric", "993": "gyromitra", "994": "stinkhorn, carrion fungus", "995": "earthstar", "996": "hen-of-the-woods, hen of the woods, Polyporus frondosus, Grifola frondosa", "997": "bolete", "998": "ear, spike, capitulum", "999": "toilet tissue, toilet paper, bathroom tissue"}
diff --git a/Lab 5/computeranalyze.py b/Lab 5/computeranalyze.py
new file mode 100644
index 0000000000..d5446e83f3
--- /dev/null
+++ b/Lab 5/computeranalyze.py
@@ -0,0 +1,1091 @@
+"""
+Sphero Ollie & BB-8 Computer Vision “brain”
+
+Our Behavior
+--------------------------------
+- SEARCH (forward + continuous spin):
+ - SEARCH begins only after a brief holdoff when the person is lost.
+
+- STUCK detection & escape:
+ - If the camera image stays nearly identical for >=2.0s AND there is no human,
+ we trigger an “ESCAPE_BACK” for ~0.9s: command a straight-back move.
+
+- Rotation to keep human centered ALWAYS takes priority over STOP:
+ - Even if the bbox/face STOP threshold is met, we still steer to center and use a tiny
+ forward speed (pivot_speed_stop) so the Pi accepts heading updates.
+
+Video ingest kinds
+------------------
+--kind udp : OpenCV/FFmpeg H.264/MPEG-TS (e.g., udp://@:7971?fifo_size=5000000&overrun_nonfatal=1)
+--kind mjpeg : HTTP multipart stream /mjpeg/<index>
+--kind snapshot : HTTP polling /snapshot/<index>.jpg
+
+Control out
+-----------
+- SSE at /events (non-blocking, latest-only)
+- UDP fast path: --udp-host --udp-port 7970
+
+Metrics
+-------
+- Per-frame wall-clock timestamps in each payload
+- 10s rolling averages to computer_metrics.jsonl
+"""
+
+import argparse
+import asyncio
+import json
+import logging
+import os
+import threading
+import time
+from dataclasses import dataclass
+from typing import Optional
+
+import numpy as np
+from aiohttp import web
+
+# Vision stack
+try:
+ import cv2
+except Exception:
+ cv2 = None
+try:
+ from ultralytics import YOLO
+except Exception:
+ YOLO = None
+try:
+ import mediapipe as mp # optional (face stop rule)
+except Exception:
+ mp = None
+
+import urllib.request
+from urllib.error import URLError, HTTPError
+
+import socket
+import torch
+from collections import defaultdict
+
+
+# ---------------------------- Metrics --------------------------------------- #
+
+class RollingMetrics:
+ """
+ Thread-safe rolling 10s averages dumped to JSONL.
+ Call add_sample(name, ms) / incr_counter(name), set_label(key,val).
+ """
+ def __init__(self, path: str, window_s: float = 10.0):
+ self.path = path
+ self.window_s = window_s
+ self._lock = threading.Lock()
+ self._sums = defaultdict(float) # metric -> sum(ms)
+ self._counts = defaultdict(int) # metric -> count
+ self._counters = defaultdict(int) # counter name -> count
+ self._labels = {}
+ self._stop = threading.Event()
+ self._thread: Optional[threading.Thread] = None
+
+ def start(self):
+ if self._thread and self._thread.is_alive():
+ return
+ self._stop.clear()
+ self._thread = threading.Thread(target=self._loop, daemon=True)
+ self._thread.start()
+
+ def stop(self):
+ self._stop.set()
+ # final flush
+ self._flush()
+
+ def add_sample(self, name: str, value_ms: float):
+ if value_ms is None:
+ return
+ with self._lock:
+ self._sums[name] += float(value_ms)
+ self._counts[name] += 1
+
+ def incr_counter(self, name: str, inc: int = 1):
+ with self._lock:
+ self._counters[name] += inc
+
+ def set_label(self, key: str, value):
+ with self._lock:
+ self._labels[key] = value
+
+ def _loop(self):
+ next_t = time.time() + self.window_s
+ while not self._stop.is_set():
+ now = time.time()
+ if now >= next_t:
+ self._flush()
+ next_t = now + self.window_s
+ time.sleep(0.2)
+
+ def _flush(self):
+ with self._lock:
+ if not self._counts and not self._counters:
+ return
+ avg = {}
+ for k, s in self._sums.items():
+ c = max(1, self._counts.get(k, 0))
+ if self._counts.get(k, 0) == 0:
+ continue
+ avg[k] = s / c
+ out = {
+ "ts": time.time(),
+ "window_s": self.window_s,
+ "avg_ms": avg,
+ "counts": dict(self._counters),
+ "labels": dict(self._labels),
+ }
+ try:
+ with open(self.path, "a", encoding="utf-8") as f:
+ f.write(json.dumps(out, separators=(",", ":")) + "\n")
+ except Exception:
+ pass
+ # reset window
+ self._sums.clear()
+ self._counts.clear()
+ self._counters.clear()
+
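+# One record per 10s window lands in computer_metrics.jsonl, shaped like the
+# following (field names come from _flush above; the numbers here are made up):
+#   {"ts": 1700000000.0, "window_s": 10.0,
+#    "avg_ms": {"pc_pred_ms": 41.2, "pc_cap_to_send_ms": 55.8},
+#    "counts": {"frames": 240, "msgs": 240},
+#    "labels": {"device": "cpu", "kind": "udp"}}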
+
+# ---------------------------- MJPEG client (browser-like) ------------------- #
+
+class MjpegClient:
+ def __init__(self, url: str, logger: logging.Logger, timeout: float = 5.0, chunk_size: int = 65536):
+ self.url = url
+ self.timeout = timeout
+ self.chunk_size = chunk_size
+ self.log = logger
+ self._stop = threading.Event()
+ self._thread: Optional[threading.Thread] = None
+ self._lock = threading.Lock()
+ self._latest = None
+
+ def start(self):
+ if self._thread and self._thread.is_alive():
+ return
+ self._stop.clear()
+ self._thread = threading.Thread(target=self._run, daemon=True)
+ self._thread.start()
+
+ def stop(self):
+ self._stop.set()
+ if self._thread:
+ self._thread.join(timeout=2.0)
+ self._thread = None
+
+ def get_latest(self):
+ with self._lock:
+ return None if self._latest is None else self._latest
+
+ @staticmethod
+ def _parse_boundary(ct_header: str) -> bytes:
+ boundary = "frame"
+ try:
+ if ct_header:
+ parts = [p.strip() for p in ct_header.split(";")]
+ for p in parts:
+ if p.lower().startswith("boundary="):
+ b = p.split("=", 1)[1].strip().strip('"').strip("'")
+ boundary = b.lstrip("-") or "frame"
+ break
+ except Exception:
+ pass
+ return ("--" + boundary).encode("ascii", "ignore")
+
+ def _run(self):
+ headers = {
+ "User-Agent": "ComputerVision/1.0",
+ "Accept": "multipart/x-mixed-replace",
+ "Cache-Control": "no-cache",
+ "Pragma": "no-cache",
+ "Connection": "keep-alive",
+ }
+ backoff = 0.5
+ while not self._stop.is_set():
+ try:
+ req = urllib.request.Request(self.url, headers=headers, method="GET")
+ with urllib.request.urlopen(req, timeout=self.timeout) as resp:
+ ct = resp.headers.get("Content-Type", "")
+ boundary = self._parse_boundary(ct)
+ if not boundary:
+ raise RuntimeError("MJPEG boundary not found in Content-Type")
+
+ buf = bytearray()
+ self.log.info("MJPEG: connected (boundary=%r)", boundary)
+ backoff = 0.5
+
+ while not self._stop.is_set():
+ chunk = resp.read(self.chunk_size)
+ if not chunk:
+ raise EOFError("MJPEG stream ended")
+ buf += chunk
+
+ while True:
+ start = buf.find(boundary)
+ if start == -1:
+ if len(buf) > 2_000_000:
+ del buf[:-4096]
+ break
+ if start > 0:
+ del buf[:start]
+ start = 0
+
+ header_start = start + len(boundary)
+ if len(buf) < header_start + 4:
+ break
+ if buf[header_start:header_start + 2] == b"\r\n":
+ header_start += 2
+
+ end_headers = buf.find(b"\r\n\r\n", header_start)
+ if end_headers == -1:
+ break
+
+ headers_block = bytes(buf[header_start:end_headers])
+ content_length = None
+ for line in headers_block.split(b"\r\n"):
+ k = line.split(b":", 1)
+ if len(k) == 2 and k[0].strip().lower() == b"content-length":
+ try:
+ content_length = int(k[1].strip())
+ except Exception:
+ content_length = None
+
+ data_start = end_headers + 4
+ if content_length is not None:
+ if len(buf) < data_start + content_length:
+ break
+ img_bytes = bytes(buf[data_start:data_start + content_length])
+ del buf[:data_start + content_length]
+ else:
+ next_boundary = buf.find(boundary, data_start)
+ if next_boundary == -1:
+ if len(buf) > 4_000_000:
+ del buf[:-4096]
+ break
+ img_bytes = bytes(buf[data_start:next_boundary])
+ del buf[:next_boundary]
+
+ try:
+ arr = np.frombuffer(img_bytes, dtype=np.uint8)
+ frame = cv2.imdecode(arr, cv2.IMREAD_COLOR) if cv2 is not None else None
+ if frame is not None:
+ with self._lock:
+ self._latest = frame
+ except Exception:
+ pass
+
+ except Exception as e:
+ if self._stop.is_set():
+ break
+ self.log.warning("MJPEG: stream error: %s; reconnecting in %.1fs", e, backoff)
+ time.sleep(backoff)
+ backoff = min(5.0, backoff * 1.7)
+
+
+# ---------------------------- SSE Broadcaster ------------------------------- #
+
+class SseBroadcaster:
+ """Non-blocking, latest-only SSE broadcaster (safe from other threads)."""
+ def __init__(self, logger: logging.Logger):
+ self.log = logger
+ self._clients = set() # set[asyncio.Queue[str]]
+ self._lock = asyncio.Lock()
+ self._loop: Optional[asyncio.AbstractEventLoop] = None
+
+ def attach_loop(self, loop: asyncio.AbstractEventLoop):
+ self._loop = loop
+ self.log.info("SSE: attached to event loop.")
+
+ async def add_client(self):
+ q: asyncio.Queue = asyncio.Queue(maxsize=2)
+ async with self._lock:
+ self._clients.add(q)
+ self.log.info("SSE: client connected (%d total)", len(self._clients))
+ return q
+
+ async def remove_client(self, q):
+ async with self._lock:
+ self._clients.discard(q)
+ self.log.info("SSE: client disconnected (%d total)", len(self._clients))
+
+ def broadcast(self, obj: dict):
+ if not self._loop:
+ return
+ try:
+ data = json.dumps(obj, separators=(",", ":"))
+ payload = f"event: control\ndata: {data}\n\n"
+ except Exception:
+ return
+
+ async def _send():
+ dead = []
+ for q in list(self._clients):
+ try:
+ q.put_nowait(payload)
+ except asyncio.QueueFull:
+ try:
+ _ = q.get_nowait() # drop oldest
+ q.put_nowait(payload)
+ except Exception:
+ dead.append(q)
+ except Exception:
+ dead.append(q)
+ for q in dead:
+ await self.remove_client(q)
+
+ self._loop.call_soon_threadsafe(asyncio.create_task, _send())
+
+
+# ---------------------------- UDP Sender (fast path) ------------------------ #
+
+class UdpSender:
+ """Tiny JSON line sender over UDP (unreliable but ultra-low latency)."""
+ def __init__(self, host: Optional[str], port: Optional[int], logger: logging.Logger):
+ self.addr = None
+ self.sock = None
+ self.log = logger
+ if host and port:
+ try:
+ self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+ self.sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 0x10) # low delay TOS
+ self.addr = (host, int(port))
+ self.log.info("UDP: will send to %s:%d", host, int(port))
+ except Exception as e:
+ self.log.error("UDP: cannot init sender: %s", e)
+ self.sock = None
+ self.addr = None
+
+ def send(self, obj: dict):
+ if not self.sock or not self.addr:
+ return
+ try:
+ payload = json.dumps(
+ {
+ "seq": obj.get("seq", 0),
+ "rel_heading": int(obj.get("rel_heading", 0)),
+ "speed": int(obj.get("speed", 0)),
+ "state": obj.get("state", "-"),
+ # timestamps (optional; controller may ignore)
+ "pc_cap_ts": obj.get("pc_cap_ts"),
+ "pc_send_ts": obj.get("pc_send_ts"),
+ "pc_cap_to_send_ms": obj.get("pc_cap_to_send_ms"),
+ },
+ separators=(",", ":")
+ ).encode("utf-8")
+ self.sock.sendto(payload, self.addr)
+ except Exception:
+ pass
+
+
+# ---------------------------- Detection logic ------------------------------- #
+
+@dataclass
+class Target:
+ cx: float
+ cy: float
+ h_ratio: float
+ w_ratio: float
+ conf: float
+ t: float
+
+
+class Computer:
+ """
+ Baseline behavior + metrics, with UDP/MJPEG/Snapshot ingest.
+ """
+ def __init__(self, video_base: str, index: int, kind: str,
+ broadcaster: SseBroadcaster, udp_sender: UdpSender,
+ logger: logging.Logger, metrics: RollingMetrics):
+ self.base = video_base.rstrip("/")
+ self.index = int(index)
+ self.kind = kind.lower().strip()
+ self.broadcast = broadcaster.broadcast
+ self.udp = udp_sender
+ self.log = logger
+ self.metrics = metrics
+
+ self._device = "cpu"
+ self._stop = threading.Event()
+ self._thread: Optional[threading.Thread] = None
+
+ self._mjpeg: Optional[MjpegClient] = None
+ self._udp_cap: Optional["cv2.VideoCapture"] = None
+ self._model = None
+
+ # Behavior tunables (baseline)
+ self.close_thresh = 0.50
+ self.face_close_thresh = 0.18
+ self.target_memory_s = 0.25
+ self.search_holdoff_s = 2.0
+ self.search_rate_degps = 25.0 * 0.75
+ self.search_speed_factor = 0.45 * 0.75
+ self.stuck_diff_thresh = 2.0
+ self.stuck_detect_s = 2.0
+ self.stuck_escape_s = 0.9
+ self.stuck_escape_speed = 80
+ self._still_accum_s = 0.0
+ self._sig_prev = None
+ self._last_frame_diff = 0.0
+ self._escape_until: Optional[float] = None
+ self.pivot_speed_stop = 6
+ self.follow_setpoint = 0.28
+ self.max_speed = 150
+ self.min_speed_floor = 50
+ self.speed_tau = 0.25
+ self.turn_speed_trim = 0.5
+ self.yaw_kp = 180.0
+ self.yaw_kd = 60.0
+ self.yaw_max_face = 110
+ self.yaw_max_approach = 20
+ self.yaw_slew_degps = 300.0
+ self.aim_deadband = 0.03
+
+ # Stats/overlay
+ self._fps = 0.0
+ self._last_fps_t = 0.0
+ self._fps_count = 0
+ self._ov_lock = threading.Lock()
+ self._last_overlay: Optional[np.ndarray] = None
+
+ # State
+ self._last_target: Optional[Target] = None
+ self._last_seen_time: float = 0.0
+ self._rel_forward_offset = 0
+ self._calibrated = False
+ self._probe_in_flight = False
+
+ # Telemetry / seq
+ self._seq = 0
+
+ # Command smoothing / loss handling
+ self._cmd_speed_ema = 0.0
+ self._last_cmd_rel = 0.0
+ self._last_dx = 0.0
+ self._no_target_since: Optional[float] = None
+
+ # SEARCH integrator
+ self._search_angle = 0.0
+
+ def _mjpeg_url(self) -> str:
+ return f"{self.base}/mjpeg/{self.index}"
+
+ def _snapshot_url(self) -> str:
+ return f"{self.base}/snapshot/{self.index}.jpg?ts={int(time.time() * 1000)}"
+
+ def get_latest_overlay(self) -> Optional[np.ndarray]:
+ with self._ov_lock:
+ return None if self._last_overlay is None else self._last_overlay
+
+ def start(self):
+ if self._thread and self._thread.is_alive():
+ return
+ self._stop.clear()
+ self._thread = threading.Thread(target=self._run, daemon=True)
+ self._thread.start()
+
+ def stop(self):
+ self._stop.set()
+ if self._thread:
+ self._thread.join(timeout=3)
+ self._thread = None
+ if self._mjpeg:
+ try:
+ self._mjpeg.stop()
+ except Exception:
+ pass
+ self._mjpeg = None
+ if self._udp_cap is not None:
+ try:
+ self._udp_cap.release()
+ except Exception:
+ pass
+ self._udp_cap = None
+
+ # ---------------------------- I/O helpers ------------------------------- #
+
+ def _fetch_snapshot_frame(self, timeout: float = 2.0):
+ url = self._snapshot_url()
+ try:
+ req = urllib.request.Request(
+ url,
+ headers={
+ "User-Agent": "ComputerVision/1.0",
+ "Accept": "image/jpeg",
+ "Cache-Control": "no-cache",
+ "Pragma": "no-cache",
+ },
+ )
+ with urllib.request.urlopen(req, timeout=timeout) as r:
+ data = r.read()
+ if not data:
+ return None
+ arr = np.frombuffer(data, dtype=np.uint8)
+ frame = cv2.imdecode(arr, cv2.IMREAD_COLOR)
+ return frame
+ except (URLError, HTTPError, TimeoutError, ValueError):
+ return None
+ except Exception:
+ return None
+
+ def _draw_overlay(self, frame, det_target, target, rel_heading, speed, state_msg):
+ img = frame.copy()
+ if cv2 is None:
+ return img
+ h, w = img.shape[:2]
+ cv2.line(img, (w // 2, 0), (w // 2, h), (200, 200, 200), 1)
+ cv2.line(img, (0, h // 2), (w, h // 2), (200, 200, 200), 1)
+ if target is not None:
+ cx = int(target.cx * w); cy = int(target.cy * h)
+ box_h = int(target.h_ratio * h); box_w = int(target.w_ratio * w)
+ x1 = max(0, cx - box_w // 2); y1 = max(0, cy - box_h // 2)
+ x2 = min(w - 1, cx + box_w // 2); y2 = min(h - 1, cy + box_h // 2)
+ cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
+ cv2.circle(img, (cx, cy), 5, (0, 255, 0), -1)
+ src = (self.base if self.kind == "udp"
+ else (f"{self.base}/mjpeg/{self.index}" if self.kind == "mjpeg"
+ else f"{self.base}/snapshot/{self.index}.jpg?ts=..."))
+ lines = [
+ f"Device: {self._device} | fps≈{self._fps:.1f}",
+ f"State: {state_msg}",
+ f"RelHead={rel_heading if rel_heading is not None else '-'} Speed={speed} Max={self.max_speed}",
+ f"SEARCH: rate≈{self.search_rate_degps:.1f}°/s speed≈{int(max(40,self.search_speed_factor*self.max_speed))}",
+ f"Still≈{self._still_accum_s:.1f}s diff≈{self._last_frame_diff:.1f}",
+ f"Stop(h)≥{self.close_thresh:.2f} Face≥{self.face_close_thresh:.2f}",
+ f"Src: {src}",
+ ]
+ y = 20
+ for s in lines:
+ cv2.putText(img, s, (10, y), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (30, 255, 30), 2, cv2.LINE_AA)
+ y += 22
+ return img
+
+ # ---------------------------- Main loop --------------------------------- #
+
+ def _run(self):
+ # Device
+ try:
+ dev = "cuda:0" if torch.cuda.is_available() else "cpu"
+ except Exception:
+ dev = "cpu"
+ self._device = dev
+ self.metrics.set_label("device", self._device)
+ self.metrics.set_label("kind", self.kind)
+
+ # Model
+ if YOLO is None or cv2 is None:
+ self.log.error("Computer: ultralytics and opencv-python required.")
+ return
+ self.log.info("Computer: loading YOLO model (device=%s)...", dev)
+ try:
+ model_name = os.environ.get("YOLO_MODEL", "yolov8n.pt")
+ self._model = YOLO(model_name)
+ except Exception as e:
+ self.log.exception("Computer: failed to load model: %s", e)
+ return
+
+ # Open source
+ if self.kind == "mjpeg":
+ url = self._mjpeg_url()
+ self._mjpeg = MjpegClient(url, logger=self.log)
+ self._mjpeg.start()
+ self.log.info("Computer: MJPEG client started for %s", url)
+ elif self.kind == "udp":
+ self.log.info("Computer: opening UDP video at %s", self.base)
+ self._udp_cap = cv2.VideoCapture(self.base, cv2.CAP_FFMPEG)
+ if not self._udp_cap.isOpened():
+ self.log.error("Computer: failed to open UDP source at %s", self.base)
+ return
+ elif self.kind == "snapshot":
+ self.log.info("Computer: snapshot polling from %s", self.base)
+ else:
+ self.log.warning("Computer: unknown kind '%s'; defaulting to snapshot.", self.kind)
+ self.kind = "snapshot"
+ self.metrics.set_label("kind", self.kind)
+
+ snapshot_period = 1.0 / 20.0
+ last_loop_t = time.time()
+
+ while not self._stop.is_set():
+ # CAPTURE
+ if self.kind == "mjpeg":
+ frame = self._mjpeg.get_latest() if self._mjpeg else None
+ if frame is None:
+ time.sleep(0.005)
+ continue
+ pc_cap_ts = time.time()
+ elif self.kind == "udp":
+ ok, frame = self._udp_cap.read()
+ if not ok or frame is None:
+ time.sleep(0.002)
+ continue
+ pc_cap_ts = time.time()
+ else:
+ t_cap0 = time.time()
+ frame = self._fetch_snapshot_frame(timeout=2.0)
+ if frame is None:
+ self.log.warning("Computer: snapshot not available.")
+ time.sleep(0.1)
+ continue
+ elapsed = time.time() - t_cap0
+ to_sleep = max(0.0, snapshot_period - elapsed)
+ if to_sleep > 0:
+ time.sleep(to_sleep)
+ pc_cap_ts = time.time()
+
+ h, w = frame.shape[:2]
+ now = time.time()
+ dt = max(1e-3, now - last_loop_t)
+ last_loop_t = now
+
+ # YOLO person-only detection
+ t_pred0 = time.time()
+ results = self._model.predict(
+ frame, imgsz=640, conf=0.4, iou=0.5,
+ device=self._device, classes=[0], verbose=False
+ )
+ t_pred1 = time.time()
+ det_target = self._select_target(results, w, h, t_pred1)
+ target = self._update_target_memory(det_target, t_pred1)
+
+ # Update stuck detector (use raw detection absence)
+ self._update_stuck(frame, no_human=(det_target is None), dt=dt)
+
+ # Movement decision (RELATIVE headings)
+ rel_heading_cmd, speed_cmd, state_msg = self._compute_drive_command(target, w, h, dt)
+ t_decide = time.time()
+
+ speed_cmd = max(0, min(self.max_speed, int(speed_cmd)))
+
+ # Seq + message (with wall-clock metrics)
+ self._seq = (self._seq + 1) & 0x7fffffff
+ pc_send_ts = time.time()
+ msg = {
+ "seq": self._seq,
+ "rel_heading": int(rel_heading_cmd or 0),
+ "speed": int(speed_cmd or 0),
+ "state": state_msg,
+ # timestamps
+ "pc_cap_ts": pc_cap_ts,
+ "pc_pred_end_ts": t_pred1,
+ "pc_decide_ts": t_decide,
+ "pc_send_ts": pc_send_ts,
+ # durations
+ "pc_pred_ms": int((t_pred1 - t_pred0) * 1000),
+ "pc_decide_ms": int((t_decide - t_pred1) * 1000),
+ "pc_cap_to_pred_ms": int((t_pred0 - pc_cap_ts) * 1000),
+ "pc_cap_to_send_ms": int((pc_send_ts - pc_cap_ts) * 1000),
+ }
+
+ # Metrics (PC side)
+ self.metrics.incr_counter("frames", 1)
+ self.metrics.incr_counter("msgs", 1)
+ self.metrics.add_sample("pc_pred_ms", msg["pc_pred_ms"])
+ self.metrics.add_sample("pc_decide_ms", msg["pc_decide_ms"])
+ self.metrics.add_sample("pc_cap_to_pred_ms", msg["pc_cap_to_pred_ms"])
+ self.metrics.add_sample("pc_cap_to_send_ms", msg["pc_cap_to_send_ms"])
+
+ # Send on both channels
+ self.broadcast(msg)
+ self.udp.send(msg)
+
+ # Overlay
+ overlay = self._draw_overlay(frame, det_target, target, rel_heading_cmd, speed_cmd, state_msg)
+ with self._ov_lock:
+ self._last_overlay = overlay
+
+ # FPS stats
+ self._fps_count += 1
+ if (pc_send_ts - self._last_fps_t) >= 1.0:
+ self._fps = self._fps_count / max(1e-6, (pc_send_ts - self._last_fps_t))
+ self._fps_count = 0
+ self._last_fps_t = pc_send_ts
+
+ # ---------------------------- Target handling --------------------------- #
+
+ @staticmethod
+ def _select_target(results, w, h, now) -> Optional[Target]:
+ best = None
+ try:
+ if not results:
+ return None
+ res = results[0]
+ if not hasattr(res, "boxes") or res.boxes is None:
+ return None
+ boxes = res.boxes
+ for i in range(len(boxes)):
+ xyxy = boxes.xyxy[i].tolist()
+ conf = float(boxes.conf[i].item())
+ x1, y1, x2, y2 = xyxy
+ bw, bh = max(1, x2 - x1), max(1, y2 - y1)
+ area = bw * bh
+ score = area * conf
+ if best is None or score > best[0]:
+ cx = (x1 + x2) / 2.0
+ cy = (y1 + y2) / 2.0
+ best = (score, Target(
+ cx=cx / w, cy=cy / h,
+ h_ratio=bh / h, w_ratio=bw / w, conf=conf, t=now
+ ))
+ except Exception:
+ return None
+ return best[1] if best else None
+
+ def _update_target_memory(self, det_target: Optional[Target], now: float) -> Optional[Target]:
+ # Heavier smoothing for center/size, short memory.
+ if det_target is not None:
+ a = 0.55
+ if self._last_target is None:
+ self._last_target = det_target
+ else:
+ sm = Target(
+ cx=self._last_target.cx * (1 - a) + det_target.cx * a,
+ cy=self._last_target.cy * (1 - a) + det_target.cy * a,
+ h_ratio=self._last_target.h_ratio * (1 - a) + det_target.h_ratio * a,
+ w_ratio=self._last_target.w_ratio * (1 - a) + det_target.w_ratio * a,
+ conf=max(self._last_target.conf * (1 - a) + det_target.conf * a, det_target.conf),
+ t=now
+ )
+ self._last_target = sm
+ self._last_seen_time = now
+ self._no_target_since = None
+ return self._last_target
+ else:
+ if self._last_target and (now - self._last_seen_time) <= self.target_memory_s:
+ return self._last_target
+ self._last_target = None
+ if self._no_target_since is None:
+ self._no_target_since = now
+ return None
+
+ # ---------------------------- Stuck detector ---------------------------- #
+
+ def _update_stuck(self, frame_bgr, no_human: bool, dt: float):
+ """Increment stillness if frames are nearly identical while no human is detected."""
+ try:
+ g = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
+ small = cv2.resize(g, (16, 16), interpolation=cv2.INTER_AREA)
+ if self._sig_prev is None:
+ self._sig_prev = small
+ self._last_frame_diff = 0.0
+ self._still_accum_s = 0.0
+ return
+ diff = np.mean(np.abs(small.astype(np.int16) - self._sig_prev.astype(np.int16)))
+ self._last_frame_diff = float(diff)
+ self._sig_prev = small
+ if no_human and diff < self.stuck_diff_thresh:
+ self._still_accum_s += dt
+ else:
+ self._still_accum_s = 0.0
+
+ if self._still_accum_s >= self.stuck_detect_s and (self._escape_until is None or time.time() >= self._escape_until):
+ self._escape_until = time.time() + self.stuck_escape_s
+ self._still_accum_s = 0.0 # reset accumulator after triggering
+ except Exception:
+ # On any error, don't break the loop; just reset stuck metrics.
+ self._still_accum_s = 0.0
+
+ # ---------------------------- Control logic (baseline) ------------------- #
+
+ def _compute_drive_command(self, target: Optional[Target], w: int, h: int, dt: float):
+ now = time.time()
+
+ # 0) STUCK escape dominates when active (and no target)
+ if self._escape_until is not None and now < self._escape_until and target is None:
+ rel = int(round(self._slew_rel(180.0, dt))) # straight-back approximation
+ self._cmd_speed_ema = self._ema(self._cmd_speed_ema, float(self.stuck_escape_speed), dt, self.speed_tau)
+ return rel, int(self._cmd_speed_ema), "ESCAPE_BACK"
+
+ # 1) If target dropped recently but not long enough for SEARCH, lerp to 0
+ if target is None:
+ lost_s = (now - self._no_target_since) if self._no_target_since else 0.0
+ if lost_s < self.search_holdoff_s:
+ self._cmd_speed_ema = self._ema(self._cmd_speed_ema, 0.0, dt, self.speed_tau)
+ rel = int(round(self._slew_rel(0.0, dt)))
+ return rel, int(self._cmd_speed_ema), f"LOST {lost_s:.1f}s | decel & hold"
+
+ # 2) ORIGINAL SEARCH (forward + continuous spin), ~3/4 speed/rate
+ self._search_angle = (self._search_angle + self.search_rate_degps * dt) % 360.0
+ rel = int(round(self._slew_rel(self._search_angle, dt)))
+ desired_speed = int(min(self.max_speed, max(40, self.search_speed_factor * self.max_speed)))
+ self._cmd_speed_ema = self._ema(self._cmd_speed_ema, float(desired_speed), dt, self.speed_tau)
+ return rel, int(self._cmd_speed_ema), f"SEARCH_SPIN rel={rel}° @ {desired_speed}"
+
+ # 3) Target available: compute steering FIRST (rotation always trumps STOP)
+ dx = target.cx - 0.5
+ d_dx = (dx - self._last_dx) / max(1e-3, dt)
+ self._last_dx = dx
+
+ raw_steer = self.yaw_kp * dx + self.yaw_kd * d_dx
+ # FACE vs APPROACH based on deadband; then clamp
+ if abs(dx) > self.aim_deadband:
+ steer = self._clamp(raw_steer, -self.yaw_max_face, self.yaw_max_face)
+ phase = "FACE"
+ else:
+ steer = self._clamp(raw_steer, -self.yaw_max_approach, self.yaw_max_approach)
+ phase = "APPROACH"
+
+ rel = int(round(self._slew_rel(steer, dt)))
+
+ # STOP logic only affects speed (not heading)
+ dist_ratio = target.h_ratio
+ too_close_bbox = dist_ratio >= self.close_thresh
+
+ if too_close_bbox:
+ desired_speed = self.pivot_speed_stop
+ state = f"STOP-PIVOT | {phase} dx={dx:+.2f}, r={dist_ratio:.2f}, steer={steer:+.0f}°"
+ else:
+ desired_speed = self._speed_for_distance(dist_ratio)
+ # Trim forward speed when steering hard
+ trim = 1.0 - self.turn_speed_trim * (abs(steer) / max(1e-6, float(self.yaw_max_face)))
+ desired_speed = max(self.min_speed_floor, int(desired_speed * max(0.2, trim)))
+ state = f"{phase} | dx={dx:+.2f}, r={dist_ratio:.2f}, steer={steer:+.0f}°"
+
+ self._cmd_speed_ema = self._ema(self._cmd_speed_ema, float(desired_speed), dt, self.speed_tau)
+ return rel, int(self._cmd_speed_ema), state
+
+ # --- helpers: speed & yaw shaping ---
+
+ def _speed_for_distance(self, h_ratio: float) -> int:
+ """
+ Proportional control to keep box size near setpoint, clamped to [min, max].
+ """
+ e = self.follow_setpoint - h_ratio # positive if far (box small) -> go faster
+ kp = 380.0
+ spd = self.min_speed_floor + kp * e
+ if e <= 0:
+ # Too close relative to setpoint => bias toward slow
+ spd = max(0.0, self.min_speed_floor * 0.4 + 200.0 * e)
+ return int(max(0.0, min(self.max_speed, spd)))
+
+ def _ema(self, prev: float, target: float, dt: float, tau: float) -> float:
+ alpha = max(0.0, min(1.0, dt / max(1e-6, tau)))
+ return prev + alpha * (target - prev)
+
+ def _slew_rel(self, desired_rel: float, dt: float) -> float:
+ """
+ Apply rate limit to relative-heading command (deg/s), then remember it.
+ """
+ max_delta = self.yaw_slew_degps * dt
+ new_rel = self._clamp(desired_rel, self._last_cmd_rel - max_delta, self._last_cmd_rel + max_delta)
+ self._last_cmd_rel = new_rel
+ return new_rel
+
+ @staticmethod
+ def _clamp(x, lo, hi):
+ return lo if x < lo else hi if x > hi else x
+
+
+# ---------------------------- HTTP Server (SSE + Video) --------------------- #
+
+VIDEO_HTML = """
+
+
+
+ Computer Preview
+
+
+
+
+
+
+
+"""
+
+class AppServer:
+ def __init__(self, host: str, port: int, computer: Computer, broadcaster: SseBroadcaster, logger: logging.Logger, metrics: RollingMetrics):
+ self.host = host
+ self.port = port
+ self.computer = computer
+ self.broadcast = broadcaster
+ self.log = logger
+ self.metrics = metrics
+ self.app = web.Application()
+ self.app.add_routes([
+ web.get("/events", self.handle_events),
+ web.get("/status", self.handle_status),
+ web.get("/video", self.handle_video_page),
+ web.get("/video.mjpeg", self.handle_video_mjpeg),
+ web.get("/video.jpg", self.handle_video_snapshot),
+ ])
+ self.app.on_startup.append(self._on_startup)
+ self.app.on_shutdown.append(self._on_shutdown)
+
+ async def _on_startup(self, app):
+ loop = asyncio.get_running_loop()
+ self.broadcast.attach_loop(loop)
+ self.computer.start()
+ self.metrics.start()
+ self.log.info("SSE server on http://%s:%d (Ctrl+C to stop)", self.host, self.port)
+
+ async def _on_shutdown(self, app):
+ self.computer.stop()
+ self.metrics.stop()
+
+ async def handle_status(self, request):
+ data = {
+ "video_base": self.computer.base,
+ "index": self.computer.index,
+ "kind": self.computer.kind,
+ "device": self.computer._device,
+ "fps": self.computer._fps,
+ }
+ return web.json_response(data)
+
+ async def handle_events(self, request):
+ # SSE headers
+ resp = web.StreamResponse(
+ status=200,
+ headers={
+ "Content-Type": "text/event-stream",
+ "Cache-Control": "no-cache",
+ "Pragma": "no-cache",
+ "Connection": "keep-alive",
+ "Access-Control-Allow-Origin": "*",
+ "X-Accel-Buffering": "no",
+ },
+ )
+ await resp.prepare(request)
+
+ # Best-effort Nagle disable
+ try:
+ transport = request.transport
+ sock = transport.get_extra_info("socket")
+ if sock:
+ sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
+ except Exception:
+ pass
+
+ q = await self.broadcast.add_client()
+
+ try:
+ await resp.write(b": hello\n\n")
+ while True:
+ try:
+ payload = await asyncio.wait_for(q.get(), timeout=1.0)
+ await resp.write(payload.encode("utf-8"))
+ except asyncio.TimeoutError:
+ await resp.write(b": keep-alive\n\n")
+ except (asyncio.CancelledError, ConnectionResetError, BrokenPipeError):
+ pass
+ finally:
+ try:
+ await resp.write_eof()
+ except Exception:
+ pass
+ await self.broadcast.remove_client(q)
+ return resp
+
+ async def handle_video_page(self, request):
+ html = VIDEO_HTML.replace("{ts}", str(int(time.time() * 1000)))
+ return web.Response(text=html, content_type="text/html")
+
+ async def handle_video_snapshot(self, request):
+ frame = self.computer.get_latest_overlay()
+ if frame is None:
+ return web.Response(status=503, text="No frame yet")
+ ok, buf = cv2.imencode(".jpg", frame, [int(cv2.IMWRITE_JPEG_QUALITY), 85])
+ if not ok:
+ return web.Response(status=500, text="Encode error")
+ return web.Response(
+ body=buf.tobytes(),
+ content_type="image/jpeg",
+ headers={
+ "Cache-Control": "no-store, no-cache, must-revalidate, max-age=0",
+ "Pragma": "no-cache",
+ "X-Accel-Buffering": "no",
+ },
+ )
+
+ async def handle_video_mjpeg(self, request):
+ boundary = "frame"
+ resp = web.StreamResponse(
+ status=200,
+ headers={
+ "Content-Type": f"multipart/x-mixed-replace; boundary={boundary}",
+ "Cache-Control": "no-store, no-cache, must-revalidate, max-age=0",
+ "Pragma": "no-cache",
+ "Connection": "keep-alive",
+ "X-Accel-Buffering": "no",
+ },
+ )
+ await resp.prepare(request)
+
+ # Disable Nagle for preview too
+ try:
+ transport = request.transport
+ sock = transport.get_extra_info("socket")
+ if sock:
+ sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
+ except Exception:
+ pass
+
+ try:
+ period = 1.0 / 15.0
+ while True:
+ frame = self.computer.get_latest_overlay()
+ if frame is None:
+ await asyncio.sleep(0.05)
+ continue
+ ok, buf = cv2.imencode(".jpg", frame, [int(cv2.IMWRITE_JPEG_QUALITY), 80])
+ if not ok:
+ await asyncio.sleep(period)
+ continue
+ part = (
+ f"--{boundary}\r\n"
+ "Content-Type: image/jpeg\r\n"
+ f"Content-Length: {len(buf)}\r\n\r\n"
+ ).encode("ascii") + buf.tobytes() + b"\r\n"
+ await resp.write(part)
+ await asyncio.sleep(period)
+ except (asyncio.CancelledError, ConnectionResetError, BrokenPipeError):
+ pass
+ finally:
+ try:
+ await resp.write_eof()
+ except Exception:
+ pass
+ return resp
+
+
+# ---------------------------- Main ------------------------------------------ #
+
+def main():
+ parser = argparse.ArgumentParser(description="SSE/UDP Vision Computer (baseline behavior + metrics + UDP/MJPEG/Snapshot ingest)")
+ parser.add_argument("--video-base", type=str, default="udp://@:7971?fifo_size=5000000&overrun_nonfatal=1",
+ help="For --kind udp, pass a full FFmpeg URL; for HTTP kinds, pass base server URL (e.g., http://localhost:7965)")
+ parser.add_argument("--index", type=int, default=1, help="Camera index/path (used for mjpeg/snapshot)")
+ parser.add_argument("--kind", type=str, default="udp", choices=["mjpeg", "snapshot", "udp"],
+ help="Video ingest kind")
+ parser.add_argument("--host", type=str, default="0.0.0.0", help="Bind host for HTTP/SSE server")
+ parser.add_argument("--port", type=int, default=7966, help="Bind port for HTTP/SSE server")
+ parser.add_argument("--udp-host", type=str, default="", help="(Optional) UDP target host (RPI IP or overlay IP)")
+ parser.add_argument("--udp-port", type=int, default=7970, help="(Optional) UDP target port (default 7970)")
+ args = parser.parse_args()
+
+ # Logging
+ logger = logging.getLogger("computer")
+ logger.setLevel(logging.INFO)
+ ch = logging.StreamHandler()
+ ch.setLevel(logging.INFO)
+ ch.setFormatter(logging.Formatter("%(asctime)s | %(levelname)s | %(message)s", "%H:%M:%S"))
+ logger.addHandler(ch)
+
+ metrics_path = os.environ.get("COMPUTER_METRICS_FILE", "computer_metrics.jsonl")
+ metrics = RollingMetrics(metrics_path, window_s=10.0)
+
+ broadcaster = SseBroadcaster(logger)
+ udp_sender = UdpSender(args.udp_host.strip(), args.udp_port if args.udp_host.strip() else None, logger)
+
+ computer = Computer(video_base=args.video_base, index=args.index, kind=args.kind,
+ broadcaster=broadcaster, udp_sender=udp_sender, logger=logger, metrics=metrics)
+ server = AppServer(host=args.host, port=args.port, computer=computer, broadcaster=broadcaster, logger=logger, metrics=metrics)
+ web.run_app(server.app, host=args.host, port=args.port, access_log=None)
+
+
+if __name__ == "__main__":
+ main()
diff --git a/Lab 5/converted_tflite_quantized/Audio_optional/ExampleAudioFFT.py b/Lab 5/converted_tflite_quantized/Audio_optional/ExampleAudioFFT.py
new file mode 100644
index 0000000000..9db53ad7de
--- /dev/null
+++ b/Lab 5/converted_tflite_quantized/Audio_optional/ExampleAudioFFT.py
@@ -0,0 +1,111 @@
+import pyaudio
+import numpy as np
+from scipy.fft import rfft, rfftfreq
+from scipy.signal.windows import hann
+from numpy_ringbuffer import RingBuffer
+
+import queue
+import time
+
+
+## Please change the following number so that it matches the microphone that you are using.
+DEVICE_INDEX = 1
+
+## Compute the audio statistics every `UPDATE_INTERVAL` seconds.
+UPDATE_INTERVAL = 1.0
+
+
+
+### Things you probably don't need to change
+FORMAT=np.float32
+SAMPLING_RATE = 44100
+CHANNELS=1
+
+
+def main():
+ ### Setting up all required software elements:
+ audioQueue = queue.Queue() # This queue stores the incoming audio data before processing.
+ pyaudio_instance = pyaudio.PyAudio() #This is the AudioDriver that connects to the microphone for us.
+
+ def _callback(in_data, frame_count, time_info, status): # This "callback function" stores the incoming audio data in the `audioQueue`
+ audioQueue.put(in_data)
+ return None, pyaudio.paContinue
+
+ stream = pyaudio_instance.open(input=True,start=False,format=pyaudio.paFloat32,channels=CHANNELS,rate=SAMPLING_RATE,frames_per_buffer=int(SAMPLING_RATE/2),stream_callback=_callback,input_device_index=DEVICE_INDEX)
+
+
+ # One essential way to keep track of variables over time is with a ring buffer.
+ # As an example, the `AudioBuffer` always stores the last second of audio data.
+ AudioBuffer = RingBuffer(capacity=SAMPLING_RATE*1, dtype=FORMAT) # 1 second long buffer.
+
+ # Another example is the `VolumeHistory` ringbuffer.
+ VolumeHistory = RingBuffer(capacity=int(20/UPDATE_INTERVAL), dtype=FORMAT) ## This is how you can compute a history to record changes over time
+ ### Here is a good spot to add other buffers as well to keep track of variables over a certain period of time.
+
+ nextTimeStamp = time.time()
+ stream.start_stream()
+ if True:
+ while True:
+ frames = audioQueue.get() # Get data from the audio driver (see the _callback function for how the data arrives)
+ if not frames:
+ continue
+
+ framesData = np.frombuffer(frames, dtype=FORMAT)
+ AudioBuffer.extend(framesData[0::CHANNELS]) #Pick one audio channel and fill the ringbuffer.
+
+ if(AudioBuffer.is_full and # Waiting for the ringbuffer to be full at the beginning.
+ audioQueue.qsize()<2 and # Make sure there is not a lot more new data that should be used.
+ time.time()>nextTimeStamp): # See `UPDATE_INTERVAL` above.
+
+ buffer = np.array(AudioBuffer) #Get the last second of audio.
+
+
+ volume = np.rint(np.sqrt(np.mean(buffer**2))*10000) # Compute the rms volume
+
+
+ VolumeHistory.append(volume)
+ volumeSlow = volume
+ volumechange = 0.0
+ if VolumeHistory.is_full:
+ HalfLength = int(np.round(VolumeHistory.maxlen/2))
+ vnew = np.array(VolumeHistory)[HalfLength:].mean()
+ vold = np.array(VolumeHistory)[:VolumeHistory.maxlen-HalfLength].mean()
+ volumechange = vnew - vold
+ volumeSlow = np.array(VolumeHistory).mean()
+
+ ## Compute the Fourier frequency analysis of the audio signal.
+ N = buffer.shape[0]
+ window = hann(N)
+ amplitudes = np.abs(rfft(buffer*window))[25:] # Contains the volume for each frequency bin.
+ frequencies = (rfftfreq(N, 1/SAMPLING_RATE)[:N//2])[25:] # Contains the Hz frequency value for each frequency bin.
+ '''
+ Combining the `amplitudes` and `frequencies` variables allows you to understand how loud a certain frequency is.
+
+ e.g. If you'd like to know the volume for 500 Hz you could do the following.
+ 1. Find the frequency bin that 500 Hz is closest to with:
+ FrequencyBin = np.abs(frequencies - 500).argmin()
+
+ 2. Look up the volume in that bin:
+ amplitudes[FrequencyBin]
+
+
+ The example below does something similar, just in reverse.
+ It finds the loudest amplitude and its corresponding bin with `argmax()`.
+ Then it uses that index to look up the frequency value.
+ '''
+
+
+ LoudestFrequency = frequencies[amplitudes.argmax()]
+
+ print("Loudest Frqeuncy:",LoudestFrequency)
+ print("RMS volume:",volumneSlow)
+ print("Volume Change:",volumechange)
+
+ nextTimeStamp = UPDATE_INTERVAL+time.time() # See `UPDATE_INTERVAL` above
+
+
+if __name__ == '__main__':
+ main()
+ print("Something happend with the audio example. Stopping!")
+
+
diff --git a/Lab 5/converted_tflite_quantized/Audio_optional/ListAvalibleAudioDevices.py b/Lab 5/converted_tflite_quantized/Audio_optional/ListAvalibleAudioDevices.py
new file mode 100644
index 0000000000..e7ec252610
--- /dev/null
+++ b/Lab 5/converted_tflite_quantized/Audio_optional/ListAvalibleAudioDevices.py
@@ -0,0 +1,10 @@
+import pyaudio
+
+pyaudio_instance = pyaudio.PyAudio()
+
+print("--- Starting audio device survey! ---")
+for i in range(pyaudio_instance.get_device_count()):
+ dev = pyaudio_instance.get_device_info_by_index(i)
+ name = dev['name'].encode('utf-8')
+ print(i, name, dev['maxInputChannels'], dev['maxOutputChannels'])
+
diff --git a/Lab 5/converted_tflite_quantized/Audio_optional/ListeningExercise.md b/Lab 5/converted_tflite_quantized/Audio_optional/ListeningExercise.md
new file mode 100644
index 0000000000..dd74f453da
--- /dev/null
+++ b/Lab 5/converted_tflite_quantized/Audio_optional/ListeningExercise.md
@@ -0,0 +1,24 @@
+## Listening Exercise
+Go through some of the videos below, listen to the sound, and write down the different sounds that belong to a certain context. Think about the impact a given sound has on how we construct our own contextual understanding. *Please try not to watch the video as you listen to the sound.*
+
+
+Ideally, write down the ideas in a table like this
+
+| Sound | Influence on the context | Implications for your behavior |
+| :---: | :---: | :---: |
+| car horn | creates a sense of urgency | look around |
+| footsteps | ... | ... |
+| ... | ... |... |
+| ... | ... |... |
+
+
+Audio sources to use:
+- [Walking in Tokyo](https://www.youtube.com/watch?v=Et7O5-CzJZg)
+- [Restaurant Ambiance](https://www.youtube.com/watch?v=xY0GEpbWreY)
+- [Walking in a Forest](https://www.youtube.com/watch?v=I-zPNQYHSvU)
+- [Working in a Coffee Shop](https://www.youtube.com/watch?v=714HdIgMt1g)
+- [Walking in Shanghai](https://www.youtube.com/watch?v=2uQ58Xwx1V4)
+- [Biking in the Netherlands](https://www.youtube.com/watch?v=siomblak2TI)
+- [Backyard Fountain](https://www.youtube.com/watch?v=Ez1f6Vp_UYk)
+
+
diff --git a/Lab 5/converted_tflite_quantized/Audio_optional/ThinkingThroughContextAndInteraction.md b/Lab 5/converted_tflite_quantized/Audio_optional/ThinkingThroughContextAndInteraction.md
new file mode 100644
index 0000000000..587b0ce82f
--- /dev/null
+++ b/Lab 5/converted_tflite_quantized/Audio_optional/ThinkingThroughContextAndInteraction.md
@@ -0,0 +1,6 @@
+| **Context** (situational) | **Presence** (intent) | **Behavior** (reaction) |
+|----------------------------|-------------------------------------|-------------------------|
+| **Who is involved:** to make new lines use `<br>` between words | **Task goals:** | **Implicit behaviors:** |
+| **What is making noises:** | **When to stand-out (attention):** | **Explicit Actions:** |
+| **When:** | **When to blend in (distraction):** | |
+| **Where:** | | |
\ No newline at end of file
diff --git a/Lab 5/converted_tflite_quantized/Audio_optional/ThinkingThroughContextandInteraction.png b/Lab 5/converted_tflite_quantized/Audio_optional/ThinkingThroughContextandInteraction.png
new file mode 100644
index 0000000000..88330d66c3
Binary files /dev/null and b/Lab 5/converted_tflite_quantized/Audio_optional/ThinkingThroughContextandInteraction.png differ
diff --git a/Lab 5/converted_tflite_quantized/Audio_optional/audio.md b/Lab 5/converted_tflite_quantized/Audio_optional/audio.md
new file mode 100644
index 0000000000..5bce540752
--- /dev/null
+++ b/Lab 5/converted_tflite_quantized/Audio_optional/audio.md
@@ -0,0 +1,39 @@
+#### Filtering, FFTs, and Time Series data.
+> **_NOTE:_** This section is from an earlier version of the class.
+
+Additional filtering and analysis can be done on the sensors that were provided in the kit. For example, running a Fast Fourier Transform over the IMU or Microphone data stream could create a simple activity classifier between walking, running, and standing.
+
+To get the microphone working we need to install three libraries: `PyAudio` to get the data from the microphone, `SciPy` to make data analysis easy, and `numpy-ringbuffer` to keep track of the last ~1 second of audio.
+PyAudio needs to be installed with the following command:
+``sudo apt install python3-pyaudio``
+SciPy is installed with
+``sudo apt install python3-scipy``
+
+Lastly, we need numpy-ringbuffer to make continuous data analysis easier.
+``pip install numpy-ringbuffer``
+
+Now try the audio processing example:
+* Find what ID the microphone has with `python ListAvalibleAudioDevices.py`
+ Look for a device name that includes `USB` in the name.
+* Adjust the variable `DEVICE_INDEX` in the `ExampleAudioFFT.py` file.
+ Then run the file by typing `python ExampleAudioFFT.py`.
+ See if you are getting results printed out from the microphone. Try to understand how the code works.
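+
+For reference, `ListAvalibleAudioDevices.py` prints one line per device in the form `index name maxInputChannels maxOutputChannels`. As an illustration only (device names vary by hardware), a USB microphone might show up as `1 b'USB PnP Sound Device' 1 0`, in which case you would set `DEVICE_INDEX = 1`.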
+
+
+
+Using the microphone, try one of the following:
+
+**1. Set up threshold detection** Can you identify when a signal goes above certain fixed values?
+
+**2. Set up a running average** Can you set up a [moving average](https://en.wikipedia.org/wiki/Moving_average) over one of the variables that are being calculated?
+
+**3. Set up peak detection** Can you identify when your signal reaches a peak and then goes down? (A minimal sketch combining these three ideas follows below.)
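+
+If you want a starting point, here is a minimal, untested sketch that combines all three ideas, operating on the RMS `volume` value that `ExampleAudioFFT.py` prints each update. The threshold of 400 and the 20-sample window are arbitrary placeholder values, not tuned numbers:
+
+```python
+import numpy as np
+from numpy_ringbuffer import RingBuffer
+
+history = RingBuffer(capacity=20, dtype=np.float32)  # last 20 volume readings
+THRESHOLD = 400.0  # placeholder; tune against your own microphone
+
+def update(volume):
+    """Feed one new RMS volume reading and print simple events."""
+    history.append(volume)
+    values = np.array(history)
+
+    # 1. Threshold detection: the newest reading crossed a fixed value.
+    if volume > THRESHOLD:
+        print("above threshold:", volume)
+
+    # 2. Running (moving) average over everything currently in the buffer.
+    moving_avg = float(values.mean())
+
+    # 3. Peak detection: the previous sample is a local maximum.
+    if len(values) >= 3 and values[-2] > values[-3] and values[-2] > values[-1]:
+        print("peak:", values[-2], "| moving average:", round(moving_avg, 1))
+```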
+
+For technical references:
+
+* Volume Calculation with [Root Mean Square](https://en.wikipedia.org/wiki/Root_mean_square)
+* [RingBuffer](https://en.wikipedia.org/wiki/Circular_buffer)
+* [Frequency Analysis](https://en.wikipedia.org/wiki/Fast_Fourier_transform)
+
+
+**\*\*\*Include links to your code here, and put the code for these in your repo--they will come in handy later.\*\*\***
diff --git a/Lab 5/converted_tflite_quantized/labels.txt b/Lab 5/converted_tflite_quantized/labels.txt
new file mode 100644
index 0000000000..b0b82139b6
--- /dev/null
+++ b/Lab 5/converted_tflite_quantized/labels.txt
@@ -0,0 +1,2 @@
+0 Class 1
+1 Class 2
diff --git a/Lab 5/frame.jpg b/Lab 5/frame.jpg
new file mode 100644
index 0000000000..d30b488ecc
Binary files /dev/null and b/Lab 5/frame.jpg differ
diff --git a/Lab 5/hand_pose.py b/Lab 5/hand_pose.py
new file mode 100644
index 0000000000..bc7e87a5b1
--- /dev/null
+++ b/Lab 5/hand_pose.py
@@ -0,0 +1,120 @@
+import cv2
+import time
+import numpy as np
+import HandTrackingModule as htm
+import math
+from ctypes import cast, POINTER
+import subprocess
+
+
+def set_volume(percent):
+ subprocess.run(
+ ["pactl", "set-sink-volume", "@DEFAULT_SINK@", f"{int(percent)}%"],
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.DEVNULL
+ )
+
+
+# def mute(on=True):
+# subprocess.run(
+# ["pactl", "set-sink-mute", "bluez_output.41_42_EC_28_FB_80.1", "1" if on else "0"],
+# stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
+# )
+
+# initial volume
+set_volume(50)
+
+def play_audio():
+ command = ['./loop_audio.sh']
+ process = subprocess.Popen(command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+ print('Looping audio playback - press q to quit')
+ return process # Return the process object
+
+audio_process = play_audio()
+
+################################
+wCam, hCam = 640, 480
+################################
+
+cap = cv2.VideoCapture(0)
+cap.set(3, wCam)
+cap.set(4, hCam)
+pTime = 0
+
+detector = htm.handDetector(detectionCon=int(0.7))
+minVol = 0
+maxVol = 100
+vol = 0
+volBar = 400
+volPer = 0
+while True:
+ success, img = cap.read()
+ img = detector.findHands(img)
+ lmList = detector.findPosition(img, draw=False)
+ if len(lmList) != 0:
+
+ thumbX, thumbY = lmList[4][1], lmList[4][2] #thumb
+ pointerX, pointerY = lmList[8][1], lmList[8][2] #pointer
+
+ middleX, middleY = lmList[12][1], lmList[12][2]
+ ringX, ringY = lmList[16][1], lmList[16][2]
+ pinkyX, pinkyY = lmList[20][1], lmList[20][2]
+
+ cx, cy = (thumbX + pointerX) // 2, (thumbY + pointerY) // 2
+
+ cv2.circle(img, (thumbX, thumbY), 15, (255, 0, 255), cv2.FILLED)
+ cv2.circle(img, (pointerX, pointerY), 15, (255, 0, 255), cv2.FILLED)
+ cv2.circle(img, (middleX, middleY), 15, (255, 0, 255), cv2.FILLED)
+ cv2.circle(img, (ringX, ringY), 15, (255, 0, 255), cv2.FILLED)
+ cv2.circle(img, (pinkyX, pinkyY), 15, (255, 0, 255), cv2.FILLED)
+ cv2.line(img, (thumbX, thumbY), (pointerX, pointerY), (255, 0, 255), 3)
+ cv2.circle(img, (cx, cy), 15, (255, 0, 255), cv2.FILLED)
+
+ len_calc = lambda x1,y1,x2,y2: math.hypot(x2 - x1, y2 - y1)
+ length = len_calc(thumbX,thumbY,pointerX,pointerY)
+ length1 = len_calc(pointerX,pointerY,middleX,middleY)
+ length2 = len_calc(middleX, middleY, ringX, ringY)
+ length3 = len_calc(ringX, ringY, pinkyX, pinkyY)
+ length4 = len_calc(thumbX,thumbY, ringX, ringY)
+ print(length1,length2,length3)
+ condition = length>100 and length1>100 and length2<100 and length3>100 and length4<100
+ if condition:
+ set_volume(0)
+ volPer = 0
+ volBar = 400
+ print("CONDITION")
+ cv2.putText(img, 'quiet coyote!', (40, 70), cv2.FONT_HERSHEY_COMPLEX,
+ 1, (255, 255, 255), 3)
+ else:
+
+ vol = np.interp(length, [50, 300], [minVol, maxVol])
+ volBar = np.interp(length, [50, 300], [400, 150])
+ volPer = np.interp(length, [50, 300], [0, 100])
+ set_volume(int(vol))
+
+ print(int(length), vol)
+
+
+ if length < 50:
+ cv2.circle(img, (cx, cy), 15, (0, 255, 0), cv2.FILLED)
+
+ cv2.rectangle(img, (50, 150), (85, 400), (255, 0, 0), 3)
+ cv2.rectangle(img, (50, int(volBar)), (85, 400), (255, 0, 0), cv2.FILLED)
+ cv2.putText(img, f'{int(volPer)} %', (40, 450), cv2.FONT_HERSHEY_COMPLEX,
+ 1, (255, 0, 0), 3)
+
+
+ cTime = time.time()
+ fps = 1 / (cTime - pTime)
+ pTime = cTime
+ cv2.putText(img, f'FPS: {int(fps)}', (40, 50), cv2.FONT_HERSHEY_COMPLEX,
+ 1, (255, 0, 0), 3)
+
+ cv2.imshow("Img", img)
+ if cv2.waitKey(1) & 0xFF == ord('q'): # Press 'q' to quit
+ audio_process.terminate() #
+ break
+
+cap.release()
+cv2.destroyAllWindows()
diff --git a/Lab 5/infer.py b/Lab 5/infer.py
new file mode 100644
index 0000000000..d493136ee6
--- /dev/null
+++ b/Lab 5/infer.py
@@ -0,0 +1,93 @@
+"""
+Real-time image classification using OpenCV and PyTorch.
+
+Loads a PyTorch image classification model and quantizes it for efficient
+inference. Opens a webcam feed, runs images through model to predict top classes.
+
+from https://pytorch.org/tutorials/intermediate/realtime_rpi.html
+"""
+
+
+
+
+import time
+
+import torch
+import numpy as np
+from torchvision import models, transforms
+
+import cv2
+from PIL import Image
+
+import json
+
+#open classes as dict
+with open('classes.json') as f:
+ classes = json.load(f)
+
+
+torch.backends.quantized.engine = 'qnnpack'
+#video capture setup
+cap = cv2.VideoCapture(0, cv2.CAP_V4L2)
+cap.set(cv2.CAP_PROP_FRAME_WIDTH, 224)
+cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 224)
+cap.set(cv2.CAP_PROP_FPS, 36)
+
+#preprocess
+preprocess = transforms.Compose([
+ # convert the frame to a CHW torch tensor for training
+ transforms.ToTensor(),
+ # normalize the colors to the range that mobilenet_v2/3 expect
+ transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
+])
+
+
+#get model - others can be found here https://pytorch.org/tutorials/intermediate/realtime_rpi.html
+net = models.quantization.mobilenet_v2(pretrained=True, quantize=True)
+# jit model to take it from ~20fps to ~30fps
+net = torch.jit.script(net)
+
+
+started = time.time()
+last_logged = time.time()
+frame_count = 0
+
+with torch.no_grad():
+ while True:
+ # read frame
+ ret, image = cap.read()
+ print('read')
+ if not ret:
+ raise RuntimeError("failed to read frame")
+
+ # convert opencv output from BGR to RGB
+ image = image[:, :, [2, 1, 0]]
+ print('image', image.shape)
+ permuted = image
+
+ # preprocess
+ input_tensor = preprocess(image)
+ print('preprocessing finished')
+
+ # create a mini-batch as expected by the model
+ # The model can handle multiple images simultaneously so we need to add an
+ # empty dimension for the batch.
+ # [3, 224, 224] -> [1, 3, 224, 224]
+ input_batch = input_tensor.unsqueeze(0)
+
+ # run model
+ output = net(input_batch)
+ top = list(enumerate(output[0].softmax(dim=0)))
+ top.sort(key=lambda x: x[1], reverse=True)
+ for idx, val in top[:2]:
+ print(f"{val.item()*100:.2f}% {classes[str(idx)]}")
+
+
+ # log model performance
+ frame_count += 1
+ now = time.time()
+ if now - last_logged > 1:
+ print(f"{frame_count / (now-last_logged)} fps")
+ last_logged = now
+ frame_count = 0
+
diff --git a/Lab 5/labels.txt b/Lab 5/labels.txt
new file mode 100644
index 0000000000..b4960505a6
--- /dev/null
+++ b/Lab 5/labels.txt
@@ -0,0 +1,2 @@
+0 Phone
+1 Zooming
diff --git a/Lab 5/loop_audio.sh b/Lab 5/loop_audio.sh
new file mode 100644
index 0000000000..8d6acbb4ae
--- /dev/null
+++ b/Lab 5/loop_audio.sh
@@ -0,0 +1,17 @@
+#!/bin/bash
+
+# Function to handle termination signals
+cleanup() {
+ kill $APLAY_PID 2>/dev/null # Terminate the aplay process
+ exit 0
+}
+
+trap cleanup TERM INT # Set up signal handlers
+
+while :
+do
+ aplay -D pulse Peaceful_Mind.wav &
+ APLAY_PID=$! # Store the PID of the aplay process
+ wait $APLAY_PID # Wait for the aplay process to complete
+ # aplay -D pulse Peaceful_Mind.wav
+done
diff --git a/Lab 5/model.tflite b/Lab 5/model.tflite
new file mode 100644
index 0000000000..b04ebaa368
Binary files /dev/null and b/Lab 5/model.tflite differ
diff --git a/Lab 5/moondream_simple.py b/Lab 5/moondream_simple.py
new file mode 100644
index 0000000000..e912ecf0ae
--- /dev/null
+++ b/Lab 5/moondream_simple.py
@@ -0,0 +1,128 @@
+#!/usr/bin/env python3
+"""
+Simple Moondream Vision Demo
+Captures image from webcam and asks Moondream to describe it
+Uses OpenCV like other Lab 5 scripts
+"""
+
+import cv2
+import requests
+import base64
+import time
+
+def capture_image(filename="captured_image.jpg"):
+ """Capture image from webcam using OpenCV"""
+ print("Opening camera...")
+
+ # Open webcam (same as infer.py and hand_pose.py)
+ cap = cv2.VideoCapture(0)
+ cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
+ cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
+
+ if not cap.isOpened():
+ print("Error: Could not open camera")
+ return None
+
+ print("Camera warming up...")
+ # Let camera warm up - need more time for proper exposure
+ time.sleep(2)
+ for i in range(30):
+ cap.read()
+
+ # Capture frame
+ print("Smile! Capturing in 3...")
+ time.sleep(1)
+ print("2...")
+ time.sleep(1)
+ print("1...")
+ time.sleep(1)
+ print("*CLICK*")
+
+ ret, frame = cap.read()
+ cap.release()
+
+ if not ret:
+ print("Error: Could not capture image")
+ return None
+
+ # Save image
+ cv2.imwrite(filename, frame)
+ print(f"Image saved as: {filename}")
+ return filename
+
+def ask_moondream(image_path, prompt="What do you see in this image? Describe it."):
+ """Ask Moondream about the image with streaming response"""
+
+ # Encode image to base64
+ with open(image_path, 'rb') as f:
+ image_data = base64.b64encode(f.read()).decode('utf-8')
+
+ print(f"\nAsking Moondream: {prompt}")
+ print("\nMoondream: ", end="", flush=True)
+
+ try:
+ # Query Moondream with streaming
+ response = requests.post(
+ "http://localhost:11434/api/generate",
+ json={
+ "model": "moondream:latest",
+ "prompt": prompt,
+ "images": [image_data],
+ "stream": True
+ },
+ timeout=300, # Increased timeout to 5 minutes
+ stream=True
+ )
+
+ if response.status_code == 200:
+ full_response = ""
+ for line in response.iter_lines():
+ if line:
+ import json
+ chunk = json.loads(line)
+ token = chunk.get('response', '')
+ print(token, end="", flush=True)
+ full_response += token
+
+ print("\n") # New line after response
+ return full_response
+ else:
+ print(f"\nError: {response.status_code}")
+ return None
+ except requests.exceptions.Timeout:
+ print("\n[TIMEOUT] Moondream is taking too long. The model might be processing a large image.")
+ print("Tip: Try using a smaller image or wait for the model to finish loading.")
+ return None
+ except Exception as e:
+ print(f"\n[ERROR] {e}")
+ return None
+
+def main():
+ print("Moondream Simple Vision Demo")
+ print("=" * 50)
+
+ # Capture image
+ image_path = capture_image()
+
+ if not image_path:
+ print("Failed to capture image. Exiting.")
+ return
+
+ # Ask Moondream to describe it
+ ask_moondream(image_path)
+
+ # Allow follow-up questions
+ print("\nAsk questions about the image (or 'quit' to exit):")
+ while True:
+ question = input("\nYou: ").strip()
+ if question.lower() in ['quit', 'exit', 'q', '']:
+ break
+ if question:
+ result = ask_moondream(image_path, question)
+ if result is None:
+ print("Error getting response. Try again or type 'quit' to exit.")
+
+ print("Done!")
+
+if __name__ == "__main__":
+ main()
diff --git a/Lab 5/prep.md b/Lab 5/prep.md
new file mode 100644
index 0000000000..c9d6bb20fc
--- /dev/null
+++ b/Lab 5/prep.md
@@ -0,0 +1,18 @@
+Create a new virtual environment
+```
+cd Lab\ 5
+python -m venv .venv
+source .venv/bin/activate
+```
+
+Install dependencies in it
+```
+(.venv) pip install -r requirements.txt
+```
+
+### Mediapipe
+
+Make the loop_audio shell script executable
+```
+sudo chmod +x ./loop_audio.sh
+```
diff --git a/Lab 5/raspberrypicontroller.py b/Lab 5/raspberrypicontroller.py
new file mode 100644
index 0000000000..a1086d3168
--- /dev/null
+++ b/Lab 5/raspberrypicontroller.py
@@ -0,0 +1,1189 @@
+"""
+Sphero Ollie & BB-8 Controller (RPI)
+
+- Records wall-clock receive time for each instruction (SSE/UDP) with channel tag.
+- Computes per-step and cross-host deltas (ms):
+ pc_pipeline_ms (from PC), pc->pi network, recv->apply, apply->ble,
+ cap->recv, cap->apply, cap->ble.
+- Rolling 10s averages dumped to controller_metrics.jsonl
+- RobotWorker accepts (seq, apply_ts, pc_cap_ts) to attribute BLE write timing to the correct instruction.
+"""
+
+import json
+import logging
+import math
+import threading
+import time
+import inspect
+import os
+import shutil
+import subprocess
+import socket
+from dataclasses import dataclass
+from typing import Optional, Tuple
+from collections import defaultdict
+
+import tkinter as tk
+from tkinter import ttk
+from urllib.request import Request, urlopen
+from urllib.error import URLError, HTTPError
+
+from spherov2 import scanner
+from spherov2.sphero_edu import SpheroEduAPI
+from spherov2.scanner import ToyNotFoundError
+from spherov2.types import Color
+
+# Event type (collision) — try a couple of locations (lib variants)
+try:
+ from spherov2.sphero_edu import EventType
+except Exception:
+ try:
+ from spherov2.types import EventType
+ except Exception:
+ EventType = None
+
+# ---------------------------- Metrics --------------------------------------- #
+
+class RollingMetrics:
+ """
+ Thread-safe rolling 10s averages dumped to JSONL.
+ On this Pi we derive cross-host deltas using wall clock times attached by the PC.
+ """
+ def __init__(self, path: str, window_s: float = 10.0):
+ self.path = path
+ self.window_s = window_s
+ self._lock = threading.Lock()
+ self._sums = defaultdict(float) # metric -> sum(ms)
+ self._counts = defaultdict(int) # metric -> count
+ self._counters = defaultdict(int) # counter name -> count
+ self._chan_counts = defaultdict(int) # 'udp'/'sse'
+ self._stop = threading.Event()
+ self._thread: Optional[threading.Thread] = None
+
+ def start(self):
+ if self._thread and self._thread.is_alive():
+ return
+ self._stop.clear()
+ self._thread = threading.Thread(target=self._loop, daemon=True)
+ self._thread.start()
+
+ def stop(self):
+ self._stop.set()
+ self._flush()
+
+ def add_sample(self, name: str, value_ms: float):
+ if value_ms is None:
+ return
+ with self._lock:
+ self._sums[name] += float(value_ms)
+ self._counts[name] += 1
+
+ def incr_counter(self, name: str, inc: int = 1):
+ with self._lock:
+ self._counters[name] += inc
+
+ def incr_channel(self, chan: str, inc: int = 1):
+ with self._lock:
+ self._chan_counts[chan] += inc
+
+ def _loop(self):
+ next_t = time.time() + self.window_s
+ while not self._stop.is_set():
+ now = time.time()
+ if now >= next_t:
+ self._flush()
+ next_t = now + self.window_s
+ time.sleep(0.2)
+
+ def _flush(self):
+ with self._lock:
+ if not self._counts and not self._counters and not self._chan_counts:
+ return
+ avg = {}
+ for k, s in self._sums.items():
+ c = max(1, self._counts.get(k, 0))
+ if self._counts.get(k, 0) == 0:
+ continue
+ avg[k] = s / c
+
+ chan_share = {}
+ total_chan = sum(self._chan_counts.values()) or 1
+ for k, v in self._chan_counts.items():
+ chan_share[k] = v / total_chan
+
+ out = {
+ "ts": time.time(),
+ "window_s": self.window_s,
+ "avg_ms": avg,
+ "counts": dict(self._counters),
+ "channel_share": chan_share,
+ }
+ try:
+ with open(self.path, "a", encoding="utf-8") as f:
+ f.write(json.dumps(out, separators=(",", ":")) + "\n")
+ except Exception:
+ pass
+ # reset window
+ self._sums.clear()
+ self._counts.clear()
+ self._counters.clear()
+ self._chan_counts.clear()
+
+# Global metrics (file path overridable via env)
+METRICS_PATH = os.environ.get("CONTROLLER_METRICS_FILE", "controller_metrics.jsonl")
+METRICS = RollingMetrics(METRICS_PATH, window_s=10.0)
+
+# ---------------------------- Bluetooth Helpers ----------------------------- #
+
+def _run_cmd(args) -> Tuple[int, str]:
+ try:
+ p = subprocess.run(args, capture_output=True, text=True, check=False)
+ return p.returncode, (p.stdout or "") + (p.stderr or "")
+ except Exception as e:
+ return 127, str(e)
+
+def _bluetooth_status(log: Optional[logging.Logger] = None):
+ info = {
+ "present": False,
+ "powered": False,
+ "soft_blocked": False,
+ "hard_blocked": False,
+ "service_active": False,
+ "controller": None,
+ }
+ _, out = _run_cmd(["systemctl", "is-active", "bluetooth"])
+ info["service_active"] = (out.strip() == "active")
+ btctl = shutil.which("bluetoothctl")
+ if not btctl:
+ if log:
+ log.error("bluetoothctl not found; install BlueZ.")
+ return info
+ _, out = _run_cmd([btctl, "show"])
+ if "Controller" in out:
+ info["present"] = True
+ for line in out.splitlines():
+ if line.strip().startswith("Controller "):
+ info["controller"] = line.strip().split(" ", 2)[1]
+ if line.strip().startswith("Powered:"):
+ powered_val = line.split(":", 1)[1].strip().lower()
+ info["powered"] = (powered_val == "yes")
+ else:
+ info["present"] = False
+ rfkill = shutil.which("rfkill")
+ if rfkill:
+ _, r = _run_cmd([rfkill, "list", "bluetooth"])
+ if "Soft blocked: yes" in r:
+ info["soft_blocked"] = True
+ if "Hard blocked: yes" in r:
+ info["hard_blocked"] = True
+ return info
+
+def ensure_bluetooth_powered(log: logging.Logger, attempt_fix: bool = True) -> Tuple[bool, str]:
+ st = _bluetooth_status(log)
+ if st["present"] and st["powered"]:
+ return True, "Bluetooth adapter present and powered."
+ btctl = shutil.which("bluetoothctl")
+ hint_lines = []
+ if not st["service_active"]:
+ hint_lines.append("bluetoothd is not active; run: sudo systemctl start bluetooth")
+ if st["soft_blocked"] or st["hard_blocked"]:
+ hint_lines.append("Bluetooth is rfkill blocked; run: sudo rfkill unblock bluetooth")
+ if not st["present"]:
+ hint_lines.append("No Bluetooth controller detected by bluetoothctl.")
+ if btctl is None:
+ hint_lines.append("Install BlueZ tools: sudo apt install bluez bluetooth pi-bluetooth")
+ return False, " | ".join(hint_lines) if hint_lines else "BlueZ tools missing."
+ if attempt_fix:
+ _run_cmd([btctl, "power", "on"])
+ rfkill = shutil.which("rfkill")
+ if rfkill and (st["soft_blocked"] or st["hard_blocked"]):
+ _run_cmd([rfkill, "unblock", "bluetooth"])
+ st2 = _bluetooth_status(log)
+ if st2["present"] and st2["powered"]:
+ return True, "Bluetooth adapter is now powered."
+ if not hint_lines:
+ hint_lines.append("Adapter is off; try: bluetoothctl power on")
+ return False, " | ".join(hint_lines)
+
+# ---------------------------- Logging to Tk Text ---------------------------- #
+
+class TkTextHandler(logging.Handler):
+ def __init__(self, text_widget: tk.Text):
+ super().__init__()
+ self.text_widget = text_widget
+ self.text_widget.configure(state="disabled")
+ def emit(self, record):
+ msg = self.format(record) + "\n"
+ def write():
+ self.text_widget.configure(state="normal")
+ self.text_widget.insert("end", msg)
+ self.text_widget.see("end")
+ self.text_widget.configure(state="disabled")
+ self.text_widget.after(0, write)
+
+# ---------------------------- Robot Worker Thread --------------------------- #
+
+@dataclass
+class DriveStateAbs:
+ heading: Optional[int] = None
+ speed: int = 0
+
+@dataclass
+class DriveStateRel:
+ rel_heading: int = 0
+ speed: int = 0
+
+class RobotWorker:
+ """
+ Maintains BLE connection & streams drive commands.
+ Low-latency tick = 0.025s (40 Hz).
+
+ Metrics hook:
+ - set_desired_relative(rel, speed, seq=None, apply_ts=None, pc_cap_ts=None)
+ stores seq + timestamps for attributing BLE write to that instruction.
+ """
+ def __init__(self, name: str, find_func, status_callback, log: logging.Logger):
+ self.name = name
+ self.find_func = find_func
+ self.status_callback = status_callback
+ self.log = log
+ self._thread: Optional[threading.Thread] = None
+ self._stop = threading.Event()
+ self._connected = threading.Event()
+ self._toy_name: Optional[str] = None
+
+ # Desired states
+ self._abs_desired = DriveStateAbs()
+ self._rel_desired = DriveStateRel()
+ self._control_mode = "ABS" # "ABS" or "REL"
+ self._reverse_mode = False
+
+ # Last sent absolute
+ self._last_sent = DriveStateAbs(heading=None, speed=-1)
+
+ # Rolling & heading bookkeeping
+ self._use_roll = False
+ self._roll_needs_duration = False
+ self._last_roll_keepalive_ts = 0.0
+ self._roll_keepalive_period = 0.25
+ self._roll_duration = 0.40
+
+ # Heading estimate (absolute)
+ self._abs_heading_est: int = 0
+ self._last_heading_poll_ts = 0.0
+ self._heading_poll_period = 0.25
+
+ # rotate helper
+ self._rotate_delta: Optional[threading.Event] = None
+ self._rotate_value: int = 0
+
+ # collision recovery
+ self._recover_request = False
+ self._recovering = False
+ self._recovery_speed = 120
+ self._recovery_duration = 0.5
+ self._recovery_rotate = 60
+
+ self._lock = threading.Lock()
+
+ # Metrics linkage for current instruction
+ self._m_seq: Optional[int] = None
+ self._m_apply_ts: Optional[float] = None
+ self._m_pc_cap_ts: Optional[float] = None
+ self._m_ble_recorded: bool = False
+
+ # lifecycle & connection omitted (unchanged patterns) ---------------------
+
+ def is_running(self) -> bool:
+ return self._thread is not None and self._thread.is_alive()
+
+ def is_connected(self) -> bool:
+ return self._connected.is_set()
+
+ def connect(self, toy_name: Optional[str] = None, timeout: float = 8.0):
+ if self.is_running():
+ self.log.info("%s: already running.", self.name)
+ return
+ self._toy_name = toy_name or None
+ self._stop.clear()
+ self._thread = threading.Thread(target=self._run_loop, args=(timeout,), daemon=True)
+ self._thread.start()
+
+ def disconnect(self):
+ if not self.is_running():
+ return
+ self._stop.set()
+ self._thread.join(timeout=5)
+ if self._thread.is_alive():
+ self.log.warning("%s: worker didn't stop cleanly.", self.name)
+ self._thread = None
+
+ # commands from UI/SSE/UDP -----------------------------------------------
+
+ def set_desired_absolute(self, heading: Optional[int], speed: int):
+ with self._lock:
+ if heading is not None:
+ self._abs_desired.heading = int(heading) % 360
+ self._abs_desired.speed = max(0, min(255, int(speed)))
+ self._control_mode = "ABS"
+
+ def set_desired(self, heading: Optional[int], speed: int):
+ self.set_desired_absolute(heading, speed)
+
+ def set_desired_relative(self, rel_heading: int, speed: int,
+ seq: Optional[int] = None,
+ apply_ts: Optional[float] = None,
+ pc_cap_ts: Optional[float] = None):
+ with self._lock:
+ self._rel_desired.rel_heading = int(rel_heading) % 360
+ self._rel_desired.speed = max(0, min(255, int(speed)))
+ self._control_mode = "REL"
+ # Metrics linkage
+ if seq is not None:
+ if seq != self._m_seq:
+ self._m_ble_recorded = False
+ self._m_seq = seq
+ self._m_apply_ts = apply_ts
+ self._m_pc_cap_ts = pc_cap_ts
+
+ def stop_now(self):
+ with self._lock:
+ if self._control_mode == "ABS":
+ self._abs_desired.speed = 0
+ else:
+ self._rel_desired.speed = 0
+
+ def rotate_by(self, delta_deg: int):
+ with self._lock:
+ if self._rotate_delta is None:
+ self._rotate_delta = threading.Event()
+ self._rotate_value = int(delta_deg)
+ self._rotate_delta.set()
+
+ def set_reverse_mode(self, enabled: bool):
+ with self._lock:
+ self._reverse_mode = bool(enabled)
+
+ # internals ---------------------------------------------------------------
+
+ def _post_status(self, text: str):
+ self.status_callback(text)
+
+ def _enqueue_collision(self):
+ with self._lock:
+ self._recover_request = True
+
+ def _run_loop(self, timeout: float):
+ # Bluetooth check
+ self._post_status("Checking Bluetooth…")
+ ok, hint = ensure_bluetooth_powered(self.log, attempt_fix=True)
+ if not ok:
+ self._post_status(f"Bluetooth not ready: {hint}")
+ self.log.error("%s: Bluetooth not ready: %s", self.name, hint)
+ return
+
+ self._post_status("Scanning…")
+ self.log.info("%s: scanning (filter=%s, timeout=%.1fs)", self.name, self._toy_name, timeout)
+ toy = None
+ try:
+ toy = self.find_func(toy_name=self._toy_name, timeout=timeout)
+ except ToyNotFoundError:
+ self._post_status("Not found")
+ self.log.error("%s: toy not found.", self.name)
+ return
+ except Exception as e:
+ self._post_status(f"Scan error: {e}")
+ self.log.exception("%s: scan error", self.name)
+ return
+
+ self._post_status("Connecting…")
+ try:
+ with SpheroEduAPI(toy) as api:
+ # roll capability detection
+ roll_func = getattr(api, "roll", None)
+ if callable(roll_func):
+ try:
+ sig = inspect.signature(roll_func)
+ self._use_roll = True
+ self._roll_needs_duration = "duration" in sig.parameters
+ except Exception:
+ self._use_roll = True
+ self._roll_needs_duration = True
+ else:
+ self._use_roll = False
+ self._roll_needs_duration = False
+
+ self._connected.set()
+ safe_name = getattr(toy, "name", None) or self.name
+ self._post_status(f"Connected ({safe_name}) | roll={self._use_roll} dur={self._roll_needs_duration}")
+ self.log.info("%s: connected to %s", self.name, safe_name)
+
+ try:
+ api.set_stabilization(True)
+ except Exception:
+ pass
+ try:
+ api.set_main_led(Color(0, 80, 255))
+ api.set_speed(0)
+ except Exception:
+ self.log.debug("%s: initial LED/speed set failed.", self.name)
+
+ if EventType is not None:
+ try:
+ def _on_collision(api_obj):
+ self.log.info("%s: collision detected", self.name)
+ self._enqueue_collision()
+ api.register_event(EventType.on_collision, _on_collision)
+ except Exception as e:
+ self.log.debug("%s: event register failed: %s", self.name, e)
+
+ tick = 0.025 # 40 Hz
+ while not self._stop.is_set():
+ now = time.time()
+
+ if now - self._last_heading_poll_ts >= self._heading_poll_period:
+ try:
+ self._abs_heading_est = int(api.get_heading()) % 360
+ except Exception:
+ pass
+ self._last_heading_poll_ts = now
+
+ # Handle queued rotate
+ rotate_evt = None
+ rotate_val = 0
+ with self._lock:
+ if self._rotate_delta is not None and self._rotate_delta.is_set():
+ rotate_evt = self._rotate_delta
+ rotate_val = self._rotate_value
+ self._rotate_delta = None
+ if rotate_evt is not None:
+ try:
+ cur = self._abs_heading_est
+ new_heading = (int(cur) + int(rotate_val)) % 360
+ api.set_heading(new_heading)
+ self._last_sent.heading = new_heading
+ self._abs_heading_est = new_heading
+ self.log.info("%s: rotate_by %d° -> heading=%d", self.name, rotate_val, new_heading)
+ except Exception as e:
+ self.log.error("%s: rotate_by error: %s", self.name, e)
+
+ # Collision recovery
+ with self._lock:
+ do_recover = self._recover_request and not self._recovering
+ if do_recover:
+ self._recover_request = False
+ self._recovering = True
+ if do_recover:
+ try:
+ self._do_collision_recovery(api)
+ finally:
+ self._recovering = False
+ continue
+
+ # Build command from mode
+ with self._lock:
+ mode = self._control_mode
+ reverse = self._reverse_mode
+ if mode == "ABS":
+ desired_speed = self._abs_desired.speed
+ cmd_heading_abs = self._abs_desired.heading if self._abs_desired.heading is not None else (self._last_sent.heading or 0)
+ else: # REL
+ desired_speed = self._rel_desired.speed
+ rel = self._rel_desired.rel_heading
+ base = self._abs_heading_est
+ cmd_heading_abs = (base + rel) % 360
+ if reverse:
+ cmd_heading_abs = (cmd_heading_abs + 180) % 360
+
+ allow_heading_update = desired_speed > 0
+ changed_heading = (self._last_sent.heading is None) or (self._delta_deg(self._last_sent.heading, cmd_heading_abs) >= 1)
+ changed_speed = (self._last_sent.speed != desired_speed)
+
+ try:
+ did_send = False
+ if self._use_roll and self._roll_needs_duration:
+ need_send = (changed_speed or (allow_heading_update and changed_heading)
+ or (now - self._last_roll_keepalive_ts > self._roll_keepalive_period))
+ if need_send:
+ api.roll(int(cmd_heading_abs), int(desired_speed), float(self._roll_duration))
+ self._last_sent.heading = int(cmd_heading_abs)
+ self._last_sent.speed = int(desired_speed)
+ self._last_roll_keepalive_ts = now
+ did_send = True
+ elif self._use_roll and not self._roll_needs_duration:
+ if changed_speed or (allow_heading_update and changed_heading):
+ api.roll(int(cmd_heading_abs), int(desired_speed))
+ self._last_sent.heading = int(cmd_heading_abs)
+ self._last_sent.speed = int(desired_speed)
+ did_send = True
+ else:
+ if allow_heading_update and changed_heading:
+ api.set_heading(int(cmd_heading_abs))
+ self._last_sent.heading = int(cmd_heading_abs)
+ did_send = True
+ if changed_speed or (allow_heading_update and changed_heading):
+ api.set_speed(int(desired_speed))
+ self._last_sent.speed = int(desired_speed)
+ did_send = True
+
+ # Metrics: first BLE write for current seq
+ if did_send:
+ with self._lock:
+ if self._m_seq is not None and not self._m_ble_recorded and self._m_apply_ts is not None:
+ ble_ts = time.time()
+ METRICS.incr_counter("ble_writes", 1)
+ METRICS.add_sample("apply_to_ble_ms", (ble_ts - self._m_apply_ts) * 1000.0)
+ if self._m_pc_cap_ts:
+ METRICS.add_sample("cap_to_ble_ms", (ble_ts - self._m_pc_cap_ts) * 1000.0)
+ self._m_ble_recorded = True
+
+ except TypeError as te:
+ self.log.warning("%s: roll signature mismatch (%s); fallback.", self.name, te)
+ self._use_roll = False
+ continue
+ except Exception as e:
+ self.log.error("%s: drive error: %s", self.name, e)
+ break
+
+ time.sleep(tick)
+
+ # Stop gracefully
+ try:
+ api.stop_roll()
+ api.set_speed(0)
+ except Exception:
+ pass
+ except Exception as e:
+ self._post_status(f"Error: {e}")
+ self.log.exception("%s: error maintaining connection", self.name)
+ finally:
+ self._connected.clear()
+ self._post_status("Disconnected")
+ self.log.info("%s: disconnected", self.name)
+
+ def _do_collision_recovery(self, api: SpheroEduAPI):
+ try:
+ cur = self._abs_heading_est
+ back = (cur + 180) % 360
+ if self._use_roll and self._roll_needs_duration:
+ api.roll(int(back), int(self._recovery_speed), float(self._recovery_duration))
+ elif self._use_roll:
+ api.roll(int(back), int(self._recovery_speed))
+ time.sleep(self._recovery_duration)
+ else:
+ api.set_heading(int(back))
+ api.set_speed(int(self._recovery_speed))
+ time.sleep(self._recovery_duration)
+ try:
+ api.set_speed(0)
+ except Exception:
+ pass
+ turn = (cur + self._recovery_rotate) % 360
+ try:
+ api.set_heading(int(turn))
+ self._last_sent.heading = int(turn)
+ self._abs_heading_est = int(turn)
+ except Exception:
+ pass
+ try:
+ api.set_speed(0)
+ self._last_sent.speed = 0
+ except Exception:
+ pass
+ except Exception as e:
+ self.log.error("%s: recovery failed: %s", self.name, e)
+
+ @staticmethod
+ def _delta_deg(a: int, b: int) -> float:
+ d = (b - a + 180) % 360 - 180
+ return abs(d)
+
+# ---------------------------- SSE (instruction) Client ---------------------- #
+
+class SseClient:
+ """
+ Minimal SSE client using urllib (no extra deps).
+ Parses line-by-line to avoid buffering; calls on_control(dict).
+ """
+ def __init__(self, base_url: str, on_control, status_callback, logger: logging.Logger, reconnect_initial=0.8):
+ self.base = base_url.rstrip("/")
+ self.url = f"{self.base}/events"
+ self.on_control = on_control
+ self.on_status = status_callback
+ self.log = logger
+ self._stop = threading.Event()
+ self._thread: Optional[threading.Thread] = None
+ self._connected = False
+ self._lock = threading.Lock()
+ self._backoff = reconnect_initial
+
+ def start(self):
+ if self._thread and self._thread.is_alive():
+ return
+ self._stop.clear()
+ self._thread = threading.Thread(target=self._run, daemon=True)
+ self._thread.start()
+
+ def stop(self):
+ self._stop.set()
+ if self._thread:
+ self._thread.join(timeout=2.0)
+ self._thread = None
+
+ def is_connected(self) -> bool:
+ with self._lock:
+ return self._connected
+
+ def _set_connected(self, val: bool):
+ with self._lock:
+ self._connected = val
+
+ def _run(self):
+ headers = {
+ "Accept": "text/event-stream",
+ "Cache-Control": "no-cache",
+ "Pragma": "no-cache",
+ "User-Agent": "Controller/1.0",
+ "Connection": "keep-alive",
+ }
+ while not self._stop.is_set():
+ try:
+ self.on_status("Connecting…")
+ req = Request(self.url, headers=headers, method="GET")
+ with urlopen(req, timeout=30) as resp:
+ ct = resp.headers.get("Content-Type", "")
+ if "text/event-stream" not in ct:
+ self.log.warning("SSE: unexpected content-type: %r", ct)
+ self._backoff = 0.8
+ self._set_connected(True)
+ self.on_status("Connected")
+
+ event_name = "message"
+ data_lines = []
+ while not self._stop.is_set():
+ line = resp.readline()
+ if not line:
+ raise EOFError("SSE stream ended")
+ s = line.decode("utf-8", "replace").rstrip("\r\n")
+ if s.startswith(":"):
+ continue
+ if s.startswith("event:"):
+ event_name = s[6:].strip() or "message"
+ continue
+ if s.startswith("data:"):
+ data_lines.append(s[5:].lstrip())
+ continue
+ if s == "":
+ if event_name == "control" and data_lines:
+ raw = "\n".join(data_lines)
+ data_lines.clear()
+ try:
+ obj = json.loads(raw)
+ obj["_arrival"] = time.time()
+ obj["rpi_recv_ts"] = obj["_arrival"]
+ obj["chan"] = "sse"
+ METRICS.incr_counter("instr_recv", 1)
+ METRICS.incr_channel("sse", 1)
+ # Cross-host deltas when available
+ pc_send = obj.get("pc_send_ts")
+ pc_cap = obj.get("pc_cap_ts")
+ pc_pipe = obj.get("pc_cap_to_send_ms")
+ if isinstance(pc_pipe, (int, float)):
+ METRICS.add_sample("pc_pipeline_ms", float(pc_pipe))
+ if isinstance(pc_send, (int, float)):
+ METRICS.add_sample("net_pc_to_pi_ms", (obj["rpi_recv_ts"] - pc_send) * 1000.0)
+ if isinstance(pc_cap, (int, float)):
+ METRICS.add_sample("cap_to_recv_ms", (obj["rpi_recv_ts"] - pc_cap) * 1000.0)
+ self.on_control(obj)
+ except Exception as e:
+ self.log.debug("SSE: bad control data: %s", e)
+ else:
+ data_lines.clear()
+ event_name = "message"
+
+ except (URLError, HTTPError, TimeoutError, ConnectionResetError, EOFError) as e:
+ if self._stop.is_set():
+ break
+ self._set_connected(False)
+ self.on_status(f"Reconnecting in {self._backoff:.1f}s… ({e})")
+ time.sleep(self._backoff)
+ self._backoff = min(5.0, self._backoff * 1.7)
+ except Exception as e:
+ if self._stop.is_set():
+ break
+ self._set_connected(False)
+ self.on_status(f"Error: {e!s} — reconnecting in {self._backoff:.1f}s…")
+ time.sleep(self._backoff)
+ self._backoff = min(5.0, self._backoff * 1.7)
+ finally:
+ self._set_connected(False)
+
+# ---------------------------- UDP Fast Path Receiver ------------------------ #
+
+class UdpReceiver:
+ """
+ Listens for JSON lines from the PC; now includes timing/seq.
+ """
+ def __init__(self, host: str, port: int, on_packet, logger: logging.Logger):
+ self.host = host
+ self.port = port
+ self.on_packet = on_packet
+ self.log = logger
+ self._stop = threading.Event()
+ self._thread: Optional[threading.Thread] = None
+
+ def start(self):
+ if self._thread and self._thread.is_alive():
+ return
+ self._stop.clear()
+ self._thread = threading.Thread(target=self._run, daemon=True)
+ self._thread.start()
+ self.log.info("UDP: listening on %s:%d", self.host, self.port)
+
+ def stop(self):
+ self._stop.set()
+ if self._thread:
+ self._thread.join(timeout=1.0)
+ self._thread = None
+
+ def _run(self):
+ sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+ try:
+ sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
+ sock.bind((self.host, self.port))
+ sock.settimeout(0.2)
+ while not self._stop.is_set():
+ try:
+ data, _addr = sock.recvfrom(2048)
+ except socket.timeout:
+ continue
+ except Exception:
+ break
+ try:
+ s = data.decode("utf-8", "replace").strip()
+ if not s:
+ continue
+ obj = json.loads(s)
+ obj["_arrival"] = time.time()
+ obj["rpi_recv_ts"] = obj["_arrival"]
+ obj["chan"] = "udp"
+ METRICS.incr_counter("instr_recv", 1)
+ METRICS.incr_channel("udp", 1)
+ pc_send = obj.get("pc_send_ts")
+ pc_cap = obj.get("pc_cap_ts")
+ pc_pipe = obj.get("pc_cap_to_send_ms")
+ if isinstance(pc_pipe, (int, float)):
+ METRICS.add_sample("pc_pipeline_ms", float(pc_pipe))
+ if isinstance(pc_send, (int, float)):
+ METRICS.add_sample("net_pc_to_pi_ms", (obj["rpi_recv_ts"] - pc_send) * 1000.0)
+ if isinstance(pc_cap, (int, float)):
+ METRICS.add_sample("cap_to_recv_ms", (obj["rpi_recv_ts"] - pc_cap) * 1000.0)
+ self.on_packet(obj)
+ except Exception:
+ continue
+ finally:
+ try:
+ sock.close()
+ except Exception:
+ pass
+
+# ---------------------------- GUI Application (no video) -------------------- #
+
+class JoystickCanvas(tk.Canvas):
+ def __init__(self, master, radius=140, knob_radius=16, **kwargs):
+ size = radius * 2 + 12
+ super().__init__(master, width=size, height=size, bg="#101418", highlightthickness=0, **kwargs)
+ self.radius = radius
+ self.knob_r = knob_radius
+ self.center = (size // 2, size // 2)
+ self.knob_pos = list(self.center)
+ self._dragging = False
+ self._draw_static()
+ self._draw_knob(*self.center)
+ self.bind("", self._on_press)
+ self.bind("", self._on_drag)
+ self.bind("", self._on_release)
+
+ def _draw_static(self):
+ cx, cy = self.center
+ r = self.radius
+ self.create_oval(cx - r, cy - r, cx + r, cy + r, outline="#3a3f45", width=3)
+ self.create_line(cx - r, cy, cx + r, cy, fill="#2a2f35")
+ self.create_line(cx, cy - r, cx, cy + r, fill="#2a2f35")
+
+ def _draw_knob(self, x, y):
+ if hasattr(self, "_knob_id"):
+ self.delete(self._knob_id)
+ self._knob_id = self.create_oval(x - self.knob_r, y - self.knob_r, x + self.knob_r, y + self.knob_r,
+ fill="#28a0ff", outline="#cfe9ff", width=2)
+
+ def _on_press(self, e):
+ self._dragging = True
+ self._move_knob(e.x, e.y)
+
+ def _on_drag(self, e):
+ if self._dragging:
+ self._move_knob(e.x, e.y)
+
+ def _on_release(self, _e):
+ self._dragging = False
+
+ def _move_knob(self, x, y):
+ cx, cy = self.center
+ dx, dy = x - cx, y - cy
+ dist = math.hypot(dx, dy)
+ if dist > self.radius:
+ scale = self.radius / dist
+ dx *= scale; dy *= scale
+ x, y = int(cx + dx), int(cy + dy)
+ self.knob_pos = [x, y]
+ self._draw_knob(x, y)
+
+ def reset(self):
+ self.knob_pos = list(self.center)
+ self._draw_knob(*self.center)
+
+ def pose(self) -> Tuple[int, float]:
+ cx, cy = self.center
+ x, y = self.knob_pos
+ vx, vy = (x - cx), (cy - y)
+ mag = max(0.0, min(1.0, math.hypot(vx, vy) / self.radius))
+ if mag < 0.02:
+ return 0, 0.0
+ heading = int((math.degrees(math.atan2(vx, vy)) + 360) % 360)
+ return heading, mag
+
+class App(tk.Tk):
+ def __init__(self):
+ super().__init__()
+ self.title("Sphero Controller — Manual + SSE/UDP Autonomous (Raspberry Pi 5) + Metrics")
+ self.configure(bg="#0b0f13")
+ try:
+ self.tk.call('tk', 'scaling', 1.25)
+ except Exception:
+ pass
+
+ # Logging
+ self.logger = logging.getLogger("sphero_controller")
+ self.logger.setLevel(logging.DEBUG)
+ fmt = logging.Formatter("%(asctime)s | %(levelname)s | %(message)s", "%H:%M:%S")
+ ch = logging.StreamHandler()
+ ch.setLevel(logging.INFO); ch.setFormatter(fmt)
+ self.logger.addHandler(ch)
+
+ # Start metrics thread
+ METRICS.start()
+
+ # Status vars
+ self.ollie_status = tk.StringVar(value="Disconnected")
+ self.bb8_status = tk.StringVar(value="Disconnected")
+ self.active_target = tk.StringVar(value="Ollie")
+ self.mode = tk.StringVar(value="Autonomous")
+ self.reverse_mode = tk.BooleanVar(value=False)
+
+ # SSE vars
+ self.sse_base = tk.StringVar(value="http://localhost:7966")
+ self.sse_status = tk.StringVar(value="Disconnected")
+ self.last_state = tk.StringVar(value="-")
+
+ # UDP fast path
+ self.udp_port = tk.IntVar(value=7970)
+
+ # Autonomy caps
+ self.max_speed_var = tk.IntVar(value=150)
+
+ # Workers
+ self.ollie = RobotWorker("Ollie", scanner.find_Ollie, lambda s: self._set_var(self.ollie_status, s), self.logger)
+ self.bb8 = RobotWorker("BB-8", scanner.find_BB8, lambda s: self._set_var(self.bb8_status, s), self.logger)
+
+ # Receivers
+ self._sse: Optional[SseClient] = None
+ self._udp: Optional[UdpReceiver] = None
+
+ # Shared latest instruction
+ self._instr_lock = threading.Lock()
+ self._last_instr = None # dict with timing fields
+
+ # UI
+ self._build_ui()
+
+ # periodic loops
+ self._schedule_manual_drive_tick()
+ self._schedule_autonomy_tick()
+
+ self.bind("", lambda _e: self._stop_all())
+ self.protocol("WM_DELETE_WINDOW", self._on_close)
+
+ self.logger.info("Ready. Connect SSE and/or Start UDP; switch to Autonomous to drive.")
+
+ def _set_var(self, var: tk.StringVar, value: str):
+ self.after(0, lambda: var.set(value))
+
+ def _build_ui(self):
+ self._style_widgets()
+ top = ttk.Frame(self, padding=10); top.pack(side="top", fill="x")
+ # Ollie
+ frm_ollie = ttk.Labelframe(top, text="Ollie")
+ frm_ollie.pack(side="left", padx=10, pady=5, fill="x", expand=True)
+ ttk.Label(frm_ollie, text="Name filter:").grid(row=0, column=0, sticky="w")
+ self.entry_ollie = ttk.Entry(frm_ollie, width=18); self.entry_ollie.grid(row=0, column=1, padx=6)
+ ttk.Button(frm_ollie, text="Connect", command=self._connect_ollie).grid(row=0, column=2, padx=4)
+ ttk.Button(frm_ollie, text="Disconnect", command=self._disconnect_ollie).grid(row=0, column=3, padx=4)
+ ttk.Label(frm_ollie, text="Status:").grid(row=1, column=0, sticky="w", pady=(6,0))
+ ttk.Label(frm_ollie, textvariable=self.ollie_status).grid(row=1, column=1, columnspan=3, sticky="w", pady=(6,0))
+ # BB-8
+ frm_bb8 = ttk.Labelframe(top, text="BB-8")
+ frm_bb8.pack(side="left", padx=10, pady=5, fill="x", expand=True)
+ ttk.Label(frm_bb8, text="Name filter:").grid(row=0, column=0, sticky="w")
+ self.entry_bb8 = ttk.Entry(frm_bb8, width=18); self.entry_bb8.grid(row=0, column=1, padx=6)
+ ttk.Button(frm_bb8, text="Connect", command=self._connect_bb8).grid(row=0, column=2, padx=4)
+ ttk.Button(frm_bb8, text="Disconnect", command=self._disconnect_bb8).grid(row=0, column=3, padx=4)
+ ttk.Label(frm_bb8, text="Status:").grid(row=1, column=0, sticky="w", pady=(6,0))
+ ttk.Label(frm_bb8, textvariable=self.bb8_status).grid(row=1, column=1, columnspan=3, sticky="w", pady=(6,0))
+ # Target + Mode + Reverse
+ frm_target = ttk.Frame(self, padding=(10, 0, 10, 0))
+ frm_target.pack(side="top", fill="x")
+ ttk.Label(frm_target, text="Active target:").pack(side="left")
+ for label in ("Ollie", "BB-8", "Both"):
+ ttk.Radiobutton(frm_target, text=label, value=label, variable=self.active_target).pack(side="left", padx=6)
+ ttk.Label(frm_target, text="Mode:").pack(side="left", padx=(20, 6))
+ for m in ("Manual", "Autonomous"):
+ ttk.Radiobutton(frm_target, text=m, value=m, variable=self.mode).pack(side="left", padx=4)
+ ttk.Checkbutton(frm_target, text="Reverse drive (add 180°)", variable=self.reverse_mode,
+ command=self._apply_reverse_mode).pack(side="right")
+ # Middle
+ mid = ttk.Frame(self, padding=10); mid.pack(side="top", fill="both", expand=True)
+ left = ttk.Frame(mid); left.pack(side="left", fill="y", padx=10)
+ self.joystick = JoystickCanvas(left, radius=150, knob_radius=18); self.joystick.pack(side="top", padx=10, pady=10)
+ self.speed_var = tk.IntVar(value=120)
+ self.speed_slider = ttk.Scale(left, from_=0, to=255, orient="horizontal", length=320,
+ command=lambda _v: None, variable=self.speed_var)
+ self.speed_slider.pack(side="top", pady=(6,0))
+ ttk.Label(left, text="Manual Speed").pack(side="top")
+ rotate_frame = ttk.Frame(left); rotate_frame.pack(side="top", pady=8)
+ ttk.Label(rotate_frame, text="Rotate:").pack(side="left", padx=(0,6))
+ ttk.Button(rotate_frame, text="−90°", command=lambda: self._rotate_active(-90)).pack(side="left", padx=3)
+ ttk.Button(rotate_frame, text="180°", command=lambda: self._rotate_active(180)).pack(side="left", padx=3)
+ ttk.Button(rotate_frame, text="+90°", command=lambda: self._rotate_active(+90)).pack(side="left", padx=3)
+ ttk.Button(left, text="Stop (Space)", command=self._stop_all).pack(side="top", pady=12)
+ # Right: Streaming controls
+ right = ttk.Frame(mid); right.pack(side="left", fill="both", expand=True, padx=10)
+ auto = ttk.Labelframe(right, text="Autonomous (Instruction Streams)")
+ auto.pack(side="top", fill="x", padx=4, pady=6)
+ ttk.Label(auto, text="SSE base URL:").grid(row=0, column=0, sticky="w")
+ ttk.Entry(auto, width=30, textvariable=self.sse_base).grid(row=0, column=1, sticky="w", padx=(6,12))
+ ttk.Button(auto, text="Connect SSE", command=self._connect_sse).grid(row=0, column=2, padx=4)
+ ttk.Button(auto, text="Disconnect SSE", command=self._disconnect_sse).grid(row=0, column=3, padx=4)
+ ttk.Label(auto, text="SSE Status:").grid(row=1, column=0, sticky="w", pady=(8,0))
+ ttk.Label(auto, textvariable=self.sse_status).grid(row=1, column=1, columnspan=3, sticky="w", pady=(8,0))
+ ttk.Label(auto, text="UDP listen port:").grid(row=2, column=0, sticky="w", pady=(8,0))
+ ttk.Entry(auto, width=8, textvariable=self.udp_port).grid(row=2, column=1, sticky="w", padx=(6,12), pady=(8,0))
+ ttk.Button(auto, text="Start UDP", command=self._start_udp).grid(row=2, column=2, padx=4, pady=(8,0))
+ ttk.Button(auto, text="Stop UDP", command=self._stop_udp).grid(row=2, column=3, padx=4, pady=(8,0))
+ ttk.Label(auto, text="Max auto speed cap").grid(row=3, column=0, sticky="w", pady=(8,0))
+ ttk.Scale(auto, from_=50, to=255, orient="horizontal", length=240,
+ command=lambda v: None, variable=self.max_speed_var).grid(row=3, column=1, columnspan=3, sticky="w", padx=(6,0), pady=(8,0))
+ ttk.Label(auto, text="Last state:").grid(row=4, column=0, sticky="w", pady=(8,0))
+ ttk.Label(auto, textvariable=self.last_state).grid(row=4, column=1, columnspan=3, sticky="w", pady=(8,0))
+ # Bottom: debug log
+ bottom = ttk.Frame(self, padding=10); bottom.pack(side="bottom", fill="both")
+ self.txt_log = tk.Text(bottom, height=12, bg="#0f1419", fg="#e2e8f0")
+ self.txt_log.pack(fill="both", expand=True)
+ th = TkTextHandler(self.txt_log); th.setLevel(logging.INFO)
+ th.setFormatter(logging.Formatter("%(asctime)s | %(levelname)s | %(message)s", "%H:%M:%S"))
+ self.logger.addHandler(th)
+
+ def _style_widgets(self):
+ style = ttk.Style()
+ try:
+ style.theme_use("clam")
+ except Exception:
+ pass
+ style.configure("TFrame", background="#0b0f13")
+ style.configure("TLabelframe", background="#0b0f13", foreground="#e2e8f0")
+ style.configure("TLabelframe.Label", background="#0b0f13", foreground="#a8b3c4")
+ style.configure("TLabel", background="#0b0f13", foreground="#e2e8f0")
+ style.configure("TButton", padding=6)
+ style.configure("TRadiobutton", background="#0b0f13", foreground="#e2e8f0")
+
+ # connect/disconnect
+ def _connect_ollie(self):
+ name = getattr(self, "entry_ollie").get().strip() or None
+ self.ollie.connect(toy_name=name)
+
+ def _disconnect_ollie(self):
+ self.ollie.disconnect()
+
+ def _connect_bb8(self):
+ name = getattr(self, "entry_bb8").get().strip() or None
+ self.bb8.connect(toy_name=name)
+
+ def _disconnect_bb8(self):
+ self.bb8.disconnect()
+
+ # manual drive loop
+ def _schedule_manual_drive_tick(self):
+ self._manual_drive_tick()
+ self.after(50, self._schedule_manual_drive_tick) # 20 Hz
+
+ def _manual_drive_tick(self):
+ if self.mode.get() != "Manual":
+ return
+ heading, mag = getattr(self, "joystick").pose()
+ base_speed = getattr(self, "speed_var").get()
+ target_speed = int(base_speed * mag)
+ tgt = self._get_active_target()
+ if tgt in ("Ollie", "Both") and self.ollie.is_connected():
+ self.ollie.set_desired(heading, target_speed)
+ if tgt in ("BB-8", "Both") and self.bb8.is_connected():
+ self.bb8.set_desired(heading, target_speed)
+
+ # autonomy tick (apply SSE/UDP)
+ def _schedule_autonomy_tick(self):
+ self._autonomy_tick()
+ self.after(15, self._schedule_autonomy_tick) # ~66 Hz
+
+ def _autonomy_tick(self):
+ if self.mode.get() != "Autonomous":
+ return
+ instr = None
+ with self._instr_lock:
+ if self._last_instr is not None:
+ instr = dict(self._last_instr)
+
+ if not instr:
+ self._apply_rel(0, 0, None, None)
+ return
+
+ arrival = float(instr.get("rpi_recv_ts", time.time()))
+ if (time.time() - arrival) > 0.35:
+ self._apply_rel(0, 0, None, None)
+ return
+
+ rel_heading = int(instr.get("rel_heading", 0)) % 360
+ speed = int(instr.get("speed", 0))
+ self._set_var(self.last_state, str(instr.get("state", "-")))
+
+ # Cap autonomous speed
+ speed = max(0, min(int(self.max_speed_var.get()), speed))
+
+ # Metrics: recv->apply and cap->apply
+ apply_ts = time.time()
+ METRICS.incr_counter("instr_applied", 1)
+ rcv = instr.get("rpi_recv_ts")
+ if isinstance(rcv, (int, float)):
+ METRICS.add_sample("recv_to_apply_ms", (apply_ts - rcv) * 1000.0)
+ pc_cap = instr.get("pc_cap_ts")
+ if isinstance(pc_cap, (int, float)):
+ METRICS.add_sample("cap_to_apply_ms", (apply_ts - pc_cap) * 1000.0)
+
+ seq = instr.get("seq")
+ self._apply_rel(rel_heading, speed, seq, pc_cap, apply_ts)
+
+ def _apply_rel(self, rel_heading: int, speed: int, seq: Optional[int], pc_cap_ts: Optional[float], apply_ts: Optional[float]=None):
+ tgt = self._get_active_target()
+ if tgt in ("Ollie", "Both") and self.ollie.is_connected():
+ self.ollie.set_desired_relative(rel_heading, speed, seq=seq, apply_ts=apply_ts, pc_cap_ts=pc_cap_ts)
+ if tgt in ("BB-8", "Both") and self.bb8.is_connected():
+ self.bb8.set_desired_relative(rel_heading, speed, seq=seq, apply_ts=apply_ts, pc_cap_ts=pc_cap_ts)
+
+ # SSE controls
+ def _connect_sse(self):
+ base = (self.sse_base.get() or "").strip()
+ if not base:
+ self._set_var(self.sse_status, "Enter a base URL, e.g. http://:7966")
+ return
+
+ def on_control(obj: dict):
+ with self._instr_lock:
+ self._last_instr = obj # latest-wins
+
+ def on_status(text: str):
+ self._set_var(self.sse_status, text)
+
+ if self._sse:
+ try: self._sse.stop()
+ except Exception: pass
+ self._sse = None
+
+ self._sse = SseClient(base, on_control=on_control, status_callback=on_status, logger=self.logger)
+ self._sse.start()
+
+ def _disconnect_sse(self):
+ if self._sse:
+ try: self._sse.stop()
+ except Exception: pass
+ self._sse = None
+ self._set_var(self.sse_status, "Disconnected")
+
+ # UDP controls
+ def _start_udp(self):
+ port = int(self.udp_port.get())
+ if self._udp:
+ try: self._udp.stop()
+ except Exception: pass
+ self._udp = None
+
+ def on_packet(obj: dict):
+ # Normalize payload (already contains timing & seq from PC)
+ h = int(obj.get("rel_heading", obj.get("h", 0))) % 360
+ s = int(obj.get("speed", obj.get("s", 0)))
+ obj2 = {
+ "seq": obj.get("seq"),
+ "rel_heading": h, "speed": s,
+ "pc_cap_ts": obj.get("pc_cap_ts"),
+ "pc_send_ts": obj.get("pc_send_ts"),
+ "pc_cap_to_send_ms": obj.get("pc_cap_to_send_ms"),
+ "state": obj.get("state", "UDP"),
+ "rpi_recv_ts": obj.get("rpi_recv_ts", time.time()),
+ "chan": obj.get("chan", "udp"),
+ }
+ with self._instr_lock:
+ self._last_instr = obj2
+
+ self._udp = UdpReceiver("0.0.0.0", port, on_packet=on_packet, logger=self.logger)
+ self._udp.start()
+
+ def _stop_udp(self):
+ if self._udp:
+ try: self._udp.stop()
+ except Exception: pass
+ self._udp = None
+
+ # helpers
+ def _get_active_target(self):
+ return self.active_target.get()
+
+ def _apply_reverse_mode(self):
+ mode = self.reverse_mode.get()
+ self.ollie.set_reverse_mode(mode)
+ self.bb8.set_reverse_mode(mode)
+ self.logger.info("Reverse mode: %s", "ON" if mode else "OFF")
+
+ def _rotate_active(self, delta):
+ tgt = self._get_active_target()
+ if tgt in ("Ollie", "Both") and self.ollie.is_connected():
+ self.ollie.rotate_by(delta)
+ if tgt in ("BB-8", "Both") and self.bb8.is_connected():
+ self.bb8.rotate_by(delta)
+
+ def _stop_all(self):
+ self.joystick.reset()
+ self.speed_var.set(0)
+ self.ollie.stop_now()
+ self.bb8.stop_now()
+ self.logger.info("STOP")
+
+ def _on_close(self):
+ try: self._stop_all()
+ except Exception: pass
+ self._disconnect_sse()
+ self._stop_udp()
+ self.ollie.disconnect(); self.bb8.disconnect()
+ METRICS.stop()
+ self.after(150, self.destroy)
+
+if __name__ == "__main__":
+ os.environ.setdefault("CUDA_VISIBLE_DEVICES", "")
+ print("Safety first: keep the area clear. Supervise the robot at all times.")
+ App().mainloop()
diff --git a/Lab 5/requirements.txt b/Lab 5/requirements.txt
new file mode 100644
index 0000000000..40a0e5b60a
--- /dev/null
+++ b/Lab 5/requirements.txt
@@ -0,0 +1,5 @@
+mediapipe==0.10.18
+opencv-python==4.11.0.86
+pip-chill==1.0.3
+teachable-machine-lite==1.2.0.2
+torchvision==0.24.0
diff --git a/Lab 5/tml_example.py b/Lab 5/tml_example.py
new file mode 100644
index 0000000000..89b909d7dc
--- /dev/null
+++ b/Lab 5/tml_example.py
@@ -0,0 +1,23 @@
+from teachable_machine_lite import TeachableMachineLite
+import cv2 as cv
+
+cap = cv.VideoCapture(0)
+
+model_path = 'model.tflite'
+image_file_name = "frame.jpg"
+labels_path = "labels.txt"
+
+tm_model = TeachableMachineLite(model_path=model_path, labels_file_path=labels_path)
+
+while True:
+    ret, frame = cap.read()
+    if not ret:
+        # Skip this iteration if the camera did not return a frame.
+        continue
+    cv.imshow('Cam', frame)
+    cv.imwrite(image_file_name, frame)
+
+    results = tm_model.classify_image(image_file_name)
+    print("results:", results)
+
+    k = cv.waitKey(1)
+    if k % 256 == 27:
+        # press ESC to close camera view.
+        break
+
+cap.release()
+cv.destroyAllWindows()
\ No newline at end of file
diff --git a/Lab 6/README.md b/Lab 6/README.md
new file mode 100644
index 0000000000..f2919a9bf0
--- /dev/null
+++ b/Lab 6/README.md
@@ -0,0 +1,167 @@
+# Distributed Interaction
+
+**Sean Hardesty Lewis**
+
+---
+
+## Prep
+
+Done!
+
+## Part A: MQTT Messaging
+
+Done!
+
+Here is a picture of it working:
+
+
+**💡 My 5 ideas for messaging between devices**
+
+1.) The first idea I had was an **LLM telephone game**, where one RPI tells the next RPI something but we either a.) mask b.) translate c.) add noise/permute etc. and see what the end message becomes. (Update 11/05/2025: Hauke presented this idea in class, so it is very low-hanging fruit and obvious!)
+
+2.) Another idea I had was **3 pt reconstruction of a scene**. We know that one-shot / multi-view NeRF has come a long way, but I think having three RPIs with different angles of the same area then reconstructing that might be interesting. Obviously this has been done time and time again (not with RPIs, but generally).
+
+3.) Another idea that piggybacks on the aforementioned one is using the sensors of the RPI instead of just the camera. For example, we could have one RPI with a depth sensor and one RPI with a VLM, both running in real time for a **semantic depth interpretation of the scene**. Could we get a decent translation of our environment, and a sense of depth? (There is really no reason to have multiple RPIs for this besides splitting the computational demand of running a depth sensor and a VLM at the same time.)
+
+4.) Another idea I had was another extremely low-hanging fruit: clicking a button on one RPI makes the other RPI's light turn on, and vice versa. This has been done time and time again with **"long distance relationship touch lamps"**, which are pretty much the same concept.
+
+5.) Another idea was using the RPI as a **visual transformation game** that uses the camera of your RPI (faced towards your opponent, who has another RPI and the exact same setup). The RPI then detects, using object detection or a VLM, your opponent and anything else in the scene. It then transforms the scene (with filters, replacement, or slow image generation) on your RPI display into a fantastical version of your opponent in the game (think regular human in work clothes -> medieval knight with greatsword). It would play out like a normal Pokemon battle or 1v1 game, just with this "Ready Player One" interpretation of your opponent. The person you see in front of you is completely different from what you see on the RPI screen, and vice versa for them. You could have the RPIs connect and synthesize a shared theme, local game states of attack/defend, and really sell the entire transformation.
+
+Here are some sketches of my ideas:
+
+
+
+
+---
+
+## Part B: Collaborative Pixel Grid
+
+Done!
+
+**📸 Below is a screenshot of the grid + a photo of my Pi setup**
+
+My streamed color is the blue one on the bottom left, detected by putting the sensor near the servo motor!
+
+
+
+
+
+---
+
+## Part C: Make Your Own
+
+**1. Project Description**
+
+I decided to do my idea (#5 above) of a **visual transformation game** since I found out that SDXL Turbo could generate 1 image every 0.3s on my PC. So I could essentially have ~3 fps video as long as I had some kind of prompt. I believe this is interesting because, as image generation improves and gets faster, we could get live filters for what we see around us. We could have a VLM that identifies the basics of actors, environments, and actions within a scene. Then we could keep that caption the same, or use an LLM to transform it with some theme we choose. Finally, we can generate a plausible real-time image (or low-fps video) from the caption. The user experience I propose is a game of sorts where two players both have RPIs with screens and cameras. The cameras face the other player, and the screens are what each player looks at. With a background prompt, we can transform what the VLM sees of the other player into "an alien world" or "a medieval scene", and the player will see the other player as an alien or a knight almost instantaneously. This makes interaction fun, especially for children who read a lot of fiction, letting them see the real world transformed into their favorite fictional universes through just a screen and a camera, in near real time.
+
+Here is what I think it would look like:
+
+
+
+
+**2. Architecture Diagram**
+
+```
+[Camera] → [FastVLM] → (caption)
+                │
+                ▼  HTTPS (Cloudflare)
+           [SDXL API] — PC
+                │
+                │  base64 JPEG frames
+                ▼
+          [MQTT Broker] — TCP :1883 (local)
+                │
+                │  WS :9001 (local) → Cloudflare → WSS URL
+                ▼
+      [RPi App + PiTFT] — subscribes sdxl/frames/<uid>
+```
+
+Effectively, each RPI captions an image it takes with its webcam, optionally sends that caption to local Ollama for style transformation, then sends the caption to the SDXL API on the PC. The PC reads the caption and uses SDXL Turbo to generate JPEG frames, whose base64 encodings are published to the MQTT broker under a video uid. Each RPI subscribes to the video uid that is unique to it and receives base64 images that it can display on its own screen.
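+
+For reference, below is a minimal sketch of that RPI-side request flow. The FastVLM and SDXL endpoint URLs/paths, the Ollama model tag, and the function name are illustrative assumptions; the actual values live in `rpi5_fastvlm_to_sdxl.py`.
+
+```
+import cv2
+import requests
+
+# Hypothetical endpoints -- the real Cloudflare URLs/paths are configured in rpi5_fastvlm_to_sdxl.py
+FASTVLM_URL = "https://fastvlm.example.com/caption"
+SDXL_URL = "https://sdxl.example.com/generate"
+OLLAMA_URL = "http://localhost:11434/api/generate"  # local Ollama on the RPI
+
+def caption_and_request_frames(cap, uid, style=None):
+    ok, frame = cap.read()
+    if not ok:
+        return
+    _, jpg = cv2.imencode(".jpg", frame)
+    # 1) Caption the frame with FastVLM (PC, behind a Cloudflare tunnel)
+    caption = requests.post(FASTVLM_URL, files={"image": jpg.tobytes()},
+                            timeout=30).json().get("caption", "")
+    if style:
+        # 2) Optionally restyle the caption with the local Ollama LLM (~15s on the RPI)
+        r = requests.post(OLLAMA_URL, json={
+            "model": "qwen2.5:0.5b",  # assumed tag for the Qwen2.5 0.5B Instruct model
+            "prompt": f"Rewrite this scene description as {style}: {caption}",
+            "stream": False,
+        }, timeout=120)
+        caption = r.json().get("response", caption)
+    # 3) Hand the prompt to the SDXL Turbo server; frames come back over MQTT under this uid
+    requests.post(SDXL_URL, json={"uid": uid, "prompt": caption}, timeout=30)
+```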
+
+**3. Build Documentation**
+
+
+
+
+
+
+Each player's setup consists of the RPI itself, the webcam attached to the device, and the PiTFT screen. The webcam is the only sensor we use to get pictures of the world, the actors, and the actions happening within it. The RPI and the remote server PC interpret this world from the camera and transform it into a newly generated image which appears on the screen. These images appear at a rate of about 2 frames per second with 2 users, and fewer as more users join the game.
+
+In my game, MQTT is mostly used as a way to distribute the streams of generated jpegs. Each RPI can subscribe to a unique stream of generated 512x512 jpegs from the PC and effectively get a 3fps "video".
+
+Here is the code we use to receive and display the image from the MQTT stream. We read the base64 from the decoded MQTT message, use Pillow to convert the base64 bytes to an RGB image, and render it to the RPI display. All of this happens multiple times per second: the PC keeps generating images from the prompt and puts the base64 on the MQTT broker, which the subscribed RPIs read and can immediately display on their screens.
+```
+def on_message(client, userdata, msg):
+ global last_frame_ts
+ try:
+ data = json.loads(msg.payload.decode("utf-8"))
+ b = base64.b64decode(data["b64"])
+ im = Image.open(io.BytesIO(b)).convert("RGB")
+ _render_to_buf(im)
+ if DISP_OK:
+ disp.image(image_buf, rotation)
+ last_frame_ts = time.time()
+ except Exception as e:
+ log(f"MQTT frame error: {e}")
+```
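+
+For completeness, here is a hedged sketch of what the matching PC-side publisher in `sdxl_turbo_server_mqtt.py` conceptually does: each generated PIL image is JPEG-encoded, wrapped as base64 in the same `{"b64": ...}` JSON payload that the subscriber above expects, and published to that RPI's topic. Names and the exact topic scheme are illustrative, not the actual implementation.
+
+```
+import base64, io, json
+import paho.mqtt.client as mqtt  # paho-mqtt 1.x style constructor below
+
+client = mqtt.Client()
+client.connect("localhost", 1883)  # broker port as in the architecture diagram
+
+def publish_frame(uid, pil_image):
+    buf = io.BytesIO()
+    pil_image.save(buf, format="JPEG", quality=80)
+    payload = json.dumps({"b64": base64.b64encode(buf.getvalue()).decode("ascii")})
+    client.publish(f"sdxl/frames/{uid}", payload)  # one topic per RPI video uid
+```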
+
+Here is an example of our SDXL Turbo API setup, which runs at about 3 fps: we send it a caption and get ~3 generated 512x512 images per second.
+
+https://github.com/user-attachments/assets/4fbcb262-8078-405a-9c5a-9665f5281116
+
+**4. User Testing**
+
+I tested the system with two people outside my team using the full pipeline (webcam → caption → optional style transform → SDXL Turbo → MQTT → PiTFT display). Both players stood across from each other with the RPIs facing them. Before trying the game, both testers assumed it would feel like a "Snapchat filter but slower." Neither expected that the scene would be fully regenerated rather than layered with filters. One tester thought it would probably just tint the colors or add a cartoon overlay, and was surprised when I explained it was actually re-rendering the whole frame from a caption.
+
+What surprised them most was how quickly the world transformed. Even at ~2–3 fps, the SDXL video felt alive. One tester laughed immediately when he appeared as a Cyberpunk 2077 Keanu Reeves-style character. There were erratic but funny hallucinations: objects would spontaneously morph, chairs became thrones, a cup started glowing for no reason, etc. The core of the experience was simply seeing each other differently. Both said that each person seeing a different transformed version of the other made it feel like a sort of VR without the headset. They also loved experimenting with poses or holding items to see how the VLM would reinterpret them. One found he could intentionally trick it into making cooler images by raising his arms or leaning into the frame.
+
+Some fixes they suggested were:
+* Faster style transformation. The Ollama step running on-device added noticeable lag (~10s). They said that when the style transform froze for a few seconds, it broke immersion, even with the constantly running stream from the previous caption.
+* Higher loyalty to the original actors/scene. The model often changed the character or scene between frames. They wanted it to be more consistent, so the generated characters/scenes stayed somewhat coherent with the real world.
+* Reducing the camera zoom. They had trouble framing themselves without backing far away from the camera, since the default zoom level is quite zoomed in.
+
+Here is the demo for the interaction without any style transformation (default VLM caption used for SDXL):
+
+https://github.com/user-attachments/assets/a47ded81-4669-4587-90b7-32182afd6ca3
+
+Here is the demo for the interaction with style transformation (VLM caption is passed to local RPI Ollama with stylistic transformation prompt which then gets passed to SDXL):
+
+https://github.com/user-attachments/assets/bf5acedc-26b1-424d-ad25-540dff906d64
+
+**5. Reflection**
+
+The best part of the experience (what worked well) is definitely the initial reaction when the players realize that the world around them is being transformed into something that the computer is generating. A person in view becomes a medieval knight or a cyborg, a dog becomes a hippogriff, etc.
+Once players catch on to what is happening, the experience becomes a light-hearted "Who can get the AI to generate something crazier?", with players trying different objects, poses, etc. to get the VLM (and, for style transforms, the LLM) to describe a scene that converts into a great generated image/video.
+
+The biggest challenge with distributed interaction was definitely the server load. One RPI could subscribe to an image generation feed and the PC could reliably pump out 3 fps. However, as soon as there are two image generation feeds, it might only be 1.5 fps, as the server has to generate 2 different captioned images as fast as it can.
+This meant that as more players join the game, the experience deteriorates for everyone. This is no fault of MQTT, but of using one centralized server for image generation capabilities. We could improve this by doing image generation on the RPIs, but 0.3s on a 3090 converts to roughly 1-3 minutes on the RPI. Here is someone who managed to get it working in one minute on the RPI on [Medium](https://medium.com/data-science/generating-images-with-stable-diffusion-and-onnxstream-on-the-raspberry-pi-f126636b6c0c). For my game, one minute to generate a single frame defeats the entire experience when the purpose is to try to get it as near real-time as possible.
+
+Another area that sacrifices quite a bit of time is the LLM caption style transfer being done locally via Ollama on the RPI. This can often take 10-15 seconds for something that, if offloaded to the PC, would take < 1s. Decidedly, this is not as bad as FastVLM, which can freeze up the entire RPI's resources for around 30s while it runs.
+For the purposes of my experiment, I am using the following servers and locations (PC vs RPI) for each. I could optimize the game further by moving everything off the RPIs and only using them as edge nodes, but I felt I still wanted to use the computational power of the RPI somehow. My setup is bolded.
+* FastVLM Server for Captioning Images **(PC, ~0.5s)** vs (RPI, ~30s)
+* SDXL Server for Generating Images **(PC, ~0.3s)** vs (RPI, 1-3mins)
+* Ollama QWEN2.5 0.5b Instruct LLM for Optional Style Transfer of Caption (PC, 0.2s) vs **(RPI, ~15s)**
+
+Sensor events worked well for the most part. The webcams are zoomed in by default, which is a bit tricky when a user is standing extremely close to their webcam, as no meaningful scene can be captured. The MQTT server, with separate streams for each RPI's webcam-captioning flow, was extremely useful and made it possible for different users to get unique streams of images that pertained to their own camera, even with a centralized image generation server.
+
+For improvements, the first thing I would fix is the style transfer, in two ways. First, I would move it to the server PC to make it much faster. Second, I would modify the Ollama style-transform prompts to stay more loyal to the original caption. I noticed that the style transformations the LLM did often reduced the caption to meaningless scenes or actors that weren't very relevant to the original caption.
+
+In terms of other improvements, I would definitely have SDXL Turbo generate maybe 100x100 images or even smaller. I noticed that the PiTFT screens couldn't even render the full 512x512 image, so I was wasting image quality as well as generation time for such a small screen. By shrinking the generated images I could get faster image generation for all players, as well as better end-to-end latency. I would also switch out my bespoke server for the official StreamDiffusion server, which is optimized to render at nearly 100 fps, dwarfing mine.
+
+## Code Files
+
+**Server files:**
+- `fastvlm_server_pc.py` - FastVLM Captioning Server for PC
+- `sdxl_turbo_server_mqtt.py` - SDXL Turbo Image Gen Server for PC, publishes to MQTT at localhost:1883
+- `viewer.py` - SDXL Turbo Image Gen Viewer for PC
+
+**Pi files:**
+- `fastvlm_server.mjs` - FastVLM Captioning Server for RPI
+- `rpi5_fastvlm_to_sdxl.py` - RPI Pipeline for capturing camera frame, querying FastVLM, transforming caption with Ollama, calling SDXL API, reading MQTT base64 frames and displaying to screen
+
+**Misc:**
+- `wss_sub_test.py` - Testing local + cloudflare MQTT stream
+
+I acknowledge the use of Copilot to create these scripts as well as helpful guidance from ChatGPT, especially with the RPI Pipeline script where I mishmashed several disparate components I had. The FastVLM server is converted from their [template](https://github.com/apple/ml-fastvlm) for querying, and also has some unused endpoints (ex. /caption_batch that I use in other offline applications). The SDXL Turbo server was heavily inspired from [StreamDiffusion](https://github.com/cumulo-autumn/StreamDiffusion), although my implementation is a bit heavier and not quite as robust as theirs.
+
+I used Cloudflared tunneling to make my localhost servers on the PC available to the RPI. I also used it (with its WebSocket support) to let the RPI subscribe to the MQTT WebSocket server.
diff --git a/Lab 6/app.py b/Lab 6/app.py
new file mode 100644
index 0000000000..6a66d0ebb3
--- /dev/null
+++ b/Lab 6/app.py
@@ -0,0 +1,137 @@
+"""
+Collaborative Pixel Grid Server
+Fullscreen real-time pixel grid for up to 100 Raspberry Pis
+Based on Tinkerbelle architecture with WebSocket live updates
+"""
+
+from flask import Flask, render_template, request
+from flask_socketio import SocketIO, emit
+import json
+from collections import OrderedDict
+from datetime import datetime
+import math
+
+app = Flask(__name__)
+app.config['SECRET_KEY'] = 'pixel-grid-2025'
+
+# Try eventlet first, fall back to threading if not available
+try:
+ import eventlet
+ eventlet.monkey_patch()
+ socketio = SocketIO(app, cors_allowed_origins="*", async_mode='eventlet')
+except ImportError:
+ socketio = SocketIO(app, cors_allowed_origins="*", async_mode='threading')
+
+# Store pixel data: {mac_address: {'color': [r,g,b], 'position': int, 'last_update': datetime}}
+pixels = OrderedDict()
+
+
+@app.route('/')
+def index():
+ """Serve the fullscreen pixel grid visualization"""
+ return render_template('grid.html')
+
+
+@app.route('/controller')
+def controller():
+ """Serve the color picker controller (like Jane Wren)"""
+ return render_template('controller.html')
+
+
+@socketio.on('connect')
+def handle_connect():
+ """Client connected"""
+ print(f'Client connected: {request.sid}')
+ # Send current state to new client
+ emit('grid_state', {
+ 'pixels': [
+ {
+ 'mac': mac,
+ 'color': data['color'],
+ 'position': data['position']
+ }
+ for mac, data in pixels.items()
+ ]
+ })
+
+
+@socketio.on('disconnect')
+def handle_disconnect():
+ """Client disconnected"""
+ print(f'Client disconnected: {request.sid}')
+
+
+@socketio.on('color_update')
+def handle_color_update(data):
+ """Handle color update from controller or Pi"""
+ try:
+ mac = data.get('mac')
+ r = int(data.get('r', 0))
+ g = int(data.get('g', 0))
+ b = int(data.get('b', 0))
+
+ # Validate
+ r = max(0, min(255, r))
+ g = max(0, min(255, g))
+ b = max(0, min(255, b))
+
+ # Check if new pixel
+ is_new = mac not in pixels
+
+ if is_new:
+ # Assign next available position
+ position = len(pixels)
+ pixels[mac] = {
+ 'color': [r, g, b],
+ 'position': position,
+ 'last_update': datetime.now()
+ }
+ print(f'✓ New pixel: {mac[:17]} at position {position}')
+ else:
+ # Update existing pixel
+ pixels[mac]['color'] = [r, g, b]
+ pixels[mac]['last_update'] = datetime.now()
+
+ # Broadcast to all clients
+ emit('pixel_update', {
+ 'mac': mac,
+ 'color': [r, g, b],
+ 'position': pixels[mac]['position'],
+ 'is_new': is_new,
+ 'total': len(pixels)
+ }, broadcast=True)
+
+ except Exception as e:
+ print(f'Error handling color update: {e}')
+
+
+@socketio.on('clear_grid')
+def handle_clear_grid():
+ """Clear all pixels"""
+ pixels.clear()
+ emit('grid_cleared', broadcast=True)
+ print('Grid cleared')
+
+
+if __name__ == '__main__':
+ print("=" * 60)
+ print(" Collaborative Pixel Grid Server")
+ print("=" * 60)
+ print(f" Fullscreen Grid: http://0.0.0.0:5000")
+ print(f" Controller: http://0.0.0.0:5000/controller")
+ print("=" * 60)
+
+    # Optional: MQTT -> WebSocket bridge (runs automatically if mqtt_bridge.py and paho-mqtt are available)
+ try:
+ from mqtt_bridge import start_mqtt_bridge
+ start_mqtt_bridge(socketio, pixels)
+ except ImportError:
+ print(" MQTT bridge not available (install paho-mqtt)")
+ except Exception as e:
+ print(f" MQTT bridge disabled: {e}")
+
+ print("=" * 60)
+ print()
+
+ socketio.run(app, host='0.0.0.0', port=5000, debug=True)
diff --git a/Lab 6/fastvlm_server.mjs b/Lab 6/fastvlm_server.mjs
new file mode 100644
index 0000000000..9946e18c2a
--- /dev/null
+++ b/Lab 6/fastvlm_server.mjs
@@ -0,0 +1,357 @@
+// ~/vlt/fastvlm_server.mjs
+//
+// FastVLM HTTP server (localhost only) using @huggingface/transformers + onnxruntime-web (WASM).
+// - HTTP API: GET /health, POST /infer, POST /shutdown
+// - Accepts local file paths and file:// URLs (reads via fs), or http(s) URLs.
+// - Clean 200/4xx/5xx responses; no stdin/stdout protocols.
+//
+// Setup (in project dir):
+// npm uninstall onnxruntime-node @xenova/transformers
+// npm install @huggingface/transformers onnxruntime-web
+//
+// Run (usually spawned by Python):
+// node fastvlm_server.mjs
+//
+// Env:
+// VLM_MODEL=onnx-community/FastVLM-0.5B-ONNX
+// VLM_PORT=17860
+// VLM_CLEAR_CACHE=1
+// HF_HOME=...
+
+import http from 'node:http';
+import process from 'node:process';
+import fs from 'node:fs/promises';
+import path from 'node:path';
+import os from 'node:os';
+import { URL, fileURLToPath } from 'node:url';
+
+function ts() { return new Date().toISOString(); }
+const QUIET = /^(1|true|yes)$/i.test(process.env.VLM_QUIET ?? '');
+const DEBUG = /^(1|true|yes)$/i.test(process.env.VLM_DEBUG ?? '');
+function log(...args) { if (!QUIET) console.error(`[${ts()}]`, ...args); }
+function warn(...args) { if (!QUIET) console.error(`[${ts()}] WARN:`, ...args); }
+function debug(...args) { if (DEBUG && !QUIET) console.error(`[${ts()}] DEBUG:`, ...args); }
+function fatal(...args) { console.error(`[${ts()}] FATAL:`, ...args); process.exit(1); }
+
+const IS_TTY = !!process.stderr.isTTY && !QUIET;
+
+
+// ---------------- deps ----------------
+try { await import('onnxruntime-web'); }
+catch (e) {
+ const msg = String(e?.message || e);
+ if (msg.includes("Cannot find package 'onnxruntime-web'")) {
+ fatal(
+ 'Missing dependency "onnxruntime-web". Fix with:\n' +
+ ' npm uninstall onnxruntime-node @xenova/transformers\n' +
+ ' npm install @huggingface/transformers onnxruntime-web'
+ );
+ }
+ fatal('Failed loading onnxruntime-web:', e?.stack || e);
+}
+
+let AutoProcessor, AutoModelForImageTextToText, RawImage, env;
+try {
+ ({ AutoProcessor, AutoModelForImageTextToText, RawImage, env } =
+ await import('@huggingface/transformers'));
+} catch (e) {
+ const msg = String(e?.message || e);
+ if (msg.includes("Cannot find package '@huggingface/transformers'")) {
+ fatal(
+ 'Missing dependency "@huggingface/transformers". Fix with:\n' +
+ ' npm uninstall onnxruntime-node @xenova/transformers\n' +
+ ' npm install @huggingface/transformers onnxruntime-web'
+ );
+ }
+ fatal('Failed to import @huggingface/transformers:', e?.stack || e);
+}
+
+// Soft-warn if native addon lingers
+try {
+ await import('onnxruntime-node').then(() => {
+ warn('"onnxruntime-node" is installed. We do not use it; remove to avoid conflicts:\n npm uninstall onnxruntime-node');
+ }).catch(() => {});
+} catch {}
+
+// ---------------- runtime config ----------------
+try {
+ env.backends.onnx.backend = 'wasm';
+ const cpuCount = (os.cpus()?.length ?? 4);
+ const threadsEnv = parseInt(process.env.VLM_THREADS || '', 10);
+ const threads = Number.isFinite(threadsEnv) && threadsEnv > 0
+ ? threadsEnv
+ : Math.max(1, Math.min(3, Math.floor(cpuCount / 2))); // gentler default
+ env.backends.onnx.wasm.numThreads = threads;
+ env.useBrowserCache = false;
+ env.allowRemoteModels = true;
+ log(`Backend configured: backend=wasm threads=${env.backends.onnx?.wasm?.numThreads ?? 'n/a'} cpus=${cpuCount}`);
+} catch (e) {
+ warn('Failed to set WASM backend params:', e?.message ?? e);
+}
+
+
+const MODEL_ID = process.env.VLM_MODEL ?? 'onnx-community/FastVLM-0.5B-ONNX';
+const PORT = parseInt(process.env.VLM_PORT || '17860', 10);
+const HOST = '127.0.0.1';
+const dtype = { embed_tokens: 'fp32', vision_encoder: 'fp32', decoder_model_merged: 'fp32' };
+
+// ---------------- helpers ----------------
+function startProgress(label = 'Loading', tickMs = 200) {
+ // Disable the spinner if not a TTY or if VLM_PROGRESS=0
+ if (!IS_TTY || /^(0|false|no)$/i.test(process.env.VLM_PROGRESS ?? '1')) {
+ const t0 = Date.now();
+ return { stop() {}, elapsedMs() { return Date.now() - t0; } };
+ }
+ let dots = 0;
+ const start = Date.now();
+ const timer = setInterval(() => {
+ dots = (dots + 1) % 10;
+ const bar = '█'.repeat(dots) + '-'.repeat(10 - dots);
+ const secs = ((Date.now() - start) / 1000).toFixed(1);
+ process.stderr.write(`\r[${ts()}] [${label}] [${bar}] ${secs}s elapsed`);
+ }, tickMs);
+ return { stop() { clearInterval(timer); process.stderr.write('\n'); },
+ elapsedMs() { return Date.now() - start; } };
+}
+
+
+async function purgeModelCacheIfRequested(modelId) {
+ if (!process.env.VLM_CLEAR_CACHE) return false;
+
+ const hfHome =
+ process.env.HF_HOME
+ || (process.env.HOME && path.join(process.env.HOME, '.cache', 'huggingface'))
+ || path.join(os.homedir(), '.cache', 'huggingface');
+
+ const bases = [
+ path.join(hfHome, 'transformers'),
+ path.join(process.cwd(), 'node_modules', '@huggingface', 'transformers', '.cache'),
+ ];
+ let removed = false;
+ for (const base of bases) {
+ const p1 = path.join(base, modelId.replaceAll('/', path.sep));
+ const p2 = path.join(p1, 'onnx');
+ for (const p of [p1, p2]) {
+ try { await fs.rm(p, { recursive: true, force: true }); removed = true; log('Purged cache:', p); }
+ catch {}
+ }
+ }
+ return removed;
+}
+
+function sendJson(res, status, body) {
+ const payload = JSON.stringify(body);
+ res.writeHead(status, {
+ 'Content-Type': 'application/json; charset=utf-8',
+ 'Content-Length': Buffer.byteLength(payload),
+ 'Cache-Control': 'no-store',
+ 'Connection': 'close',
+ 'X-Server': 'fastvlm-http',
+ });
+ res.end(payload);
+}
+
+async function readJsonBody(req, limit = 2 * 1024 * 1024) {
+ return new Promise((resolve, reject) => {
+ let size = 0; const chunks = [];
+ req.on('data', (c) => {
+ size += c.length;
+ if (size > limit) { reject(Object.assign(new Error('payload too large'), { code: 'ETOOBIG' })); req.destroy(); return; }
+ chunks.push(c);
+ });
+ req.on('end', () => {
+ try { resolve(JSON.parse(Buffer.concat(chunks).toString('utf8') || '{}')); }
+ catch { reject(Object.assign(new Error('invalid JSON'), { code: 'EBADJSON' })); }
+ });
+ req.on('error', reject);
+ });
+}
+
+function isHttpUrl(s) { return /^https?:\/\//i.test(s); }
+function isFileUrl(s) { return /^file:\/\//i.test(s); }
+function isLikelyPath(s) { return !/^[a-z]+:\/\//i.test(s); }
+
+async function loadRawImage(input) {
+ // Accept: http(s) URL, file:// URL, or plain filesystem path
+ try {
+ if (typeof input !== 'string') throw new Error('image must be a string');
+
+ if (isHttpUrl(input)) {
+ debug('[image] from HTTP(S) URL:', input);
+ return await RawImage.fromURL(input);
+ }
+
+ if (isFileUrl(input)) {
+ const p = fileURLToPath(input);
+ debug('[image] from file URL:', p);
+ return await RawImage.fromURL(p);
+ }
+
+ if (isLikelyPath(input)) {
+ const abs = path.resolve(input);
+ debug('[image] from path:', abs);
+ try { await fs.access(abs); } catch { throw new Error(`file not accessible: ${abs}`); }
+ return await RawImage.fromURL(abs);
+ }
+
+ debug('[image] from fallback URL-ish:', input);
+ return await RawImage.fromURL(input);
+ } catch (e) {
+ const msg = e?.message ?? String(e);
+ throw new Error(`loadRawImage failed: ${msg}`);
+ }
+}
+
+
+
+
+// ---------------- model globals ----------------
+let processor, model;
+let ready = false;
+let readyAt = null;
+let busy = false;
+let isShuttingDown = false;
+globalThis.__fatalLoadError = null;
+
+// ---------------- model load ----------------
+async function loadModel() {
+ log(`Boot: model="${MODEL_ID}" deviceWanted=cpu backend=wasm (pure JS via HF)`);
+ if (process.env.VLM_CLEAR_CACHE) {
+ const purged = await purgeModelCacheIfRequested(MODEL_ID);
+ if (purged) log('Note: cache purge requested and completed.');
+ }
+
+ const pb = startProgress('Loading');
+ try {
+ log('Stage 1: Loading processor...');
+ processor = await AutoProcessor.from_pretrained(MODEL_ID);
+ log('Stage 1: Processor loaded OK.');
+
+ log('Stage 2: Loading model (fp32, wasm backend, pure JS)...');
+ model = await AutoModelForImageTextToText.from_pretrained(MODEL_ID, { dtype, device: 'cpu' });
+
+ ready = true;
+ readyAt = new Date().toISOString();
+ pb.stop();
+ log(`Model ready on device=cpu backend=wasm in ${(pb.elapsedMs()/1000).toFixed(2)}s.`);
+ } catch (err) {
+ pb.stop();
+ const msg = String(err || '');
+ warn('Model load failed:', msg);
+ ready = false; readyAt = null;
+ globalThis.__fatalLoadError = msg;
+ }
+}
+
+// ---------------- HTTP server ----------------
+const server = http.createServer(async (req, res) => {
+ try {
+ const u = new URL(req.url, `http://${req.headers.host}`);
+ if (u.hostname !== 'localhost' && u.hostname !== '127.0.0.1') {
+ return sendJson(res, 403, { ok: false, error: 'forbidden' });
+ }
+
+ // Health
+ if (req.method === 'GET' && u.pathname === '/health') {
+ if (isShuttingDown) return sendJson(res, 503, { ok: false, ready: false, shutting_down: true });
+ if (ready) return sendJson(res, 200, { ok: true, ready: true, model: MODEL_ID, backend: 'wasm', device: 'cpu', ready_at: readyAt });
+ if (globalThis.__fatalLoadError) return sendJson(res, 500, { ok: false, ready: false, error: globalThis.__fatalLoadError });
+ return sendJson(res, 503, { ok: false, ready: false, stage: 'loading' });
+ }
+
+ // Inference
+ if (req.method === 'POST' && u.pathname === '/infer') {
+ if (isShuttingDown) return sendJson(res, 503, { ok: false, error: 'shutting down' });
+ if (!ready) return sendJson(res, 503, { ok: false, error: 'model not ready' });
+
+ let body;
+ try { body = await readJsonBody(req); }
+ catch (e) {
+ const code = e?.code === 'ETOOBIG' ? 413 : 400;
+ return sendJson(res, code, { ok: false, error: e?.message || 'bad request' });
+ }
+
+ const image = body?.image;
+ const prompt = (body?.prompt ?? 'Describe the image.');
+ const max_new_tokens_req = parseInt(body?.max_new_tokens || '', 10);
+ const max_new_tokens = Math.max(1, Math.min(256, Number.isFinite(max_new_tokens_req) ? max_new_tokens_req : 12));
+
+ if (!image || typeof image !== 'string') return sendJson(res, 400, { ok: false, error: 'image (string) is required' });
+ if (busy) return sendJson(res, 409, { ok: false, error: 'busy, try again' });
+
+ busy = true;
+
+ const marks = [];
+ const mark = (label) => marks.push([label, Date.now()]);
+ mark('start');
+
+ try {
+ debug('[infer] start', { image, max_new_tokens });
+
+ const imageObj = await loadRawImage(image);
+ mark('image_loaded');
+
+ const chat = [{ role: 'user', content: `${prompt}` }];
+ const templ = processor.apply_chat_template(chat, { add_generation_prompt: true });
+ mark('templated');
+
+ const inputs = await processor(imageObj, templ, { add_special_tokens: false });
+ mark('inputs_ready');
+
+ const outputs = await model.generate({
+ ...inputs,
+ max_new_tokens,
+ do_sample: false,
+ });
+ mark('generated');
+
+ const text = processor.batch_decode(
+ outputs.slice(null, [inputs.input_ids.dims.at(-1), null]),
+ { skip_special_tokens: true }
+ )?.[0] ?? '';
+ mark('decoded');
+
+ const dt_ms = marks.at(-1)[1] - marks[0][1];
+ debug('[infer] timings(ms):', Object.fromEntries(
+ marks.slice(1).map((m, i) => [m[0], m[1] - marks[i][1]])
+ ));
+
+ return sendJson(res, 200, { ok: true, text: String(text).trim(), dt_ms });
+ } catch (e) {
+ const emsg = String(e?.message || e);
+ if (/ENOENT|not exist|no such file/i.test(emsg)) {
+ return sendJson(res, 404, { ok: false, error: 'image not found' });
+ }
+ warn('INFER error:', e?.stack || e);
+ return sendJson(res, 500, { ok: false, error: emsg });
+ } finally {
+ busy = false;
+ }
+ }
+
+
+ // Shutdown
+ if (req.method === 'POST' && u.pathname === '/shutdown') {
+ isShuttingDown = true;
+ sendJson(res, 200, { ok: true, message: 'shutting down' });
+ setTimeout(() => server.close(() => process.exit(0)), 50);
+ return;
+ }
+
+ // Not found
+ return sendJson(res, 404, { ok: false, error: 'not found' });
+ } catch (e) {
+ warn('Request handling error:', e?.stack || e);
+ try { sendJson(res, 500, { ok: false, error: 'internal error' }); } catch {}
+ }
+});
+
+server.listen(PORT, HOST, () => {
+ log(`HTTP server listening on http://${HOST}:${PORT}`);
+ loadModel().catch((e) => warn('loadModel top-level error:', e?.stack || e));
+});
+
+process.on('SIGINT', () => { log('SIGINT'); server.close(() => process.exit(0)); });
+process.on('SIGTERM', () => { log('SIGTERM'); server.close(() => process.exit(0)); });
+process.on('uncaughtException', (e) => { console.error(e); process.exit(1); });
+process.on('unhandledRejection', (e) => { console.error(e); process.exit(1); });
diff --git a/Lab 6/fastvlm_server_pc.py b/Lab 6/fastvlm_server_pc.py
new file mode 100644
index 0000000000..542a65e597
--- /dev/null
+++ b/Lab 6/fastvlm_server_pc.py
@@ -0,0 +1,610 @@
+#!/usr/bin/env python3
+# fastvlm_server_pc.py
+# FastVLM caption server (defaults to apple/FastVLM-0.5B; override via FASTVLM_MODEL_ID) with:
+# - 4/8-bit or full precision loading
+# - synchronous warmup (pre-alloc KV cache)
+# - dynamic micro-batching for many concurrent single-image requests (/caption)
+# - bulk multi-image endpoint (/caption_batch)
+# - NEW: local folder captioning endpoint (/caption_folder) — send a folder name
+#
+# Examples (Cloudflare tunnel shown; swap for your host as needed):
+# # Single image
+# curl -X POST https://scout-maria-sizes-referral.trycloudflare.com/caption \
+# -F "image=@/path/to/photo.jpg"
+#
+# # Bulk (one request, multiple files)
+# curl -X POST https://scout-maria-sizes-referral.trycloudflare.com/caption_batch \
+# -F "image=@/data/a.jpg" -F "image=@/data/b.png" \
+# -F "prompt=Describe the scene in English."
+#
+# # Folder next to this server file, e.g. "./images"
+# curl -X POST https://scout-maria-sizes-referral.trycloudflare.com/caption_folder \
+# -F "folder=images" \
+# -F "prompt=Describe the scene in English." \
+# -F "max_new_tokens=48" -F "min_new_tokens=8"
+#
+# Tuning (env vars):
+#   FASTVLM_MODEL_ID=apple/FastVLM-7B   # default in code: apple/FastVLM-0.5B
+# FASTVLM_REV= # optional
+# FASTVLM_QUANT=4bit|8bit|none # default 4bit
+# FASTVLM_4BIT_TYPE=nf4|fp4 # default nf4
+# FASTVLM_COMPUTE_DTYPE=fp16|bf16 # default fp16
+# FASTVLM_MAX_NEW_TOKENS=64 # captioning rarely needs 196; 32–96 is good
+# FASTVLM_MIN_NEW_TOKENS=16
+# FASTVLM_WARMUP_TOKENS=64
+# FASTVLM_MAX_BATCH=6 # try 4–8 on a 3090 with short outputs
+# FASTVLM_BATCH_TIMEOUT_MS=8 # micro-batching window; 4–10ms typical
+# FASTVLM_DYNAMIC_BATCH=1 # enable dynamic batching (default 1)
+# FASTVLM_SANITIZE=0 # 1 -> lowercase, comma-separated sanitize
+# FASTVLM_FOLDER_MAX_FILES=1000 # cap for /caption_folder
+# HOST=0.0.0.0
+# PORT=7860
+#
+# Notes:
+# - Keep ONE server process that owns the model. Do not run multiple workers.
+# - Concurrency comes from dynamic batching, not parallel generate() calls.
+# - /caption_folder reads a sibling folder of this script's directory (no recursion).
+
+import io, os, re, threading, time
+from dataclasses import dataclass
+from collections import deque
+from typing import List, Tuple, Dict
+
+import torch
+from PIL import Image
+from flask import Flask, request, jsonify
+from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+
+# Optional AVIF/HEIC support
+try:
+ import pillow_heif; pillow_heif.register_heif_opener()
+except Exception:
+ try:
+        import pillow_avif  # noqa: F401  (pillow-avif-plugin registers the AVIF opener on import)
+ except Exception:
+ pass
+
+# ---- Performance-friendly flags ------------------------------------------------
+torch.backends.cudnn.benchmark = True
+torch.set_grad_enabled(False)
+if torch.cuda.is_available():
+ torch.backends.cuda.matmul.allow_tf32 = True
+
+# ---- Configuration --------------------------------------------------------------
+MODEL_ID = os.environ.get("FASTVLM_MODEL_ID", "apple/FastVLM-0.5B")
+FASTVLM_REV = os.environ.get("FASTVLM_REV") # optional
+QUANT_MODE = os.environ.get("FASTVLM_QUANT", "4bit").lower() # "4bit" | "8bit" | "none"
+FOURBIT_TYPE = os.environ.get("FASTVLM_4BIT_TYPE", "nf4").lower() # "nf4" | "fp4"
+COMPUTE_DTYPE = os.environ.get("FASTVLM_COMPUTE_DTYPE", "fp16").lower() # "fp16" | "bf16"
+
+MAX_NEW_TOKENS_DEFAULT = int(os.environ.get("FASTVLM_MAX_NEW_TOKENS", "64"))
+MIN_NEW_TOKENS_DEFAULT = int(os.environ.get("FASTVLM_MIN_NEW_TOKENS", "16"))
+WARMUP_NEW_TOKENS = int(os.environ.get("FASTVLM_WARMUP_TOKENS", "64"))
+
+# Micro-batching & batching
+MAX_BATCH = int(os.environ.get("FASTVLM_MAX_BATCH", "6"))
+BATCH_TIMEOUT_MS = int(os.environ.get("FASTVLM_BATCH_TIMEOUT_MS", "8"))
+DYNAMIC_BATCH = os.environ.get("FASTVLM_DYNAMIC_BATCH", "1") == "1"
+
+SANITIZE = os.environ.get("FASTVLM_SANITIZE", "0") == "1"
+FOLDER_MAX_FILES = int(os.environ.get("FASTVLM_FOLDER_MAX_FILES", "1000"))
+
+# Apple remote code expects this special image token id
+IMAGE_TOKEN_INDEX = -200
+
+# ---- Single-GPU CUDA placement (no device_map/accelerate) ----------------------
+if not torch.cuda.is_available():
+ raise SystemExit("CUDA GPU required. (Install CUDA PyTorch build; e.g., a 3090 with CUDA drivers.)")
+
+DEVICE = torch.device("cuda:0")
+compute_dtype = torch.bfloat16 if COMPUTE_DTYPE.startswith("bf") else torch.float16
+
+print(f"[FastVLM:Flask] Loading {MODEL_ID} (quant={QUANT_MODE}) on {DEVICE}; compute_dtype={compute_dtype} ...")
+
+# ---- Tokenizer -----------------------------------------------------------------
+tok = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
+if tok.pad_token_id is None:
+ tok.pad_token_id = tok.eos_token_id
+
+# ---- Model load (no device_map) -----------------------------------------------
+quant_cfg = None
+load_kwargs = dict(trust_remote_code=True, low_cpu_mem_usage=True)
+if FASTVLM_REV:
+ load_kwargs["revision"] = FASTVLM_REV
+
+if QUANT_MODE == "4bit":
+ quant_cfg = BitsAndBytesConfig(
+ load_in_4bit=True,
+ bnb_4bit_quant_type=FOURBIT_TYPE, # "nf4" (default) or "fp4"
+ bnb_4bit_compute_dtype=compute_dtype,
+ bnb_4bit_use_double_quant=True,
+ )
+ load_kwargs.update(dict(quantization_config=quant_cfg))
+elif QUANT_MODE == "8bit":
+ quant_cfg = BitsAndBytesConfig(load_in_8bit=True)
+ load_kwargs.update(dict(quantization_config=quant_cfg))
+else: # "none" -> unquantized
+ load_kwargs.update(dict(torch_dtype=compute_dtype))
+
+# Load on CPU then move to CUDA explicitly.
+model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **load_kwargs).eval()
+model.to(DEVICE) # moves quantized layers as well
+model_dtype = getattr(model, "dtype", None) or compute_dtype
+
+# Vision preprocessing via the model's own tower
+try:
+ vtower = model.get_vision_tower()
+ vtower.to(DEVICE)
+ img_proc = vtower.image_processor
+except Exception as e:
+ raise SystemExit(f"Model does not expose a vision tower as expected: {e}")
+
+# Serialize generation to keep CUDA context stable & predictable
+gen_lock = threading.Lock()
+
+# ===== Prompt ===================================================================
+NAV_PROMPT = "Describe the scene in English."
+
+# ===== Sanitization (optional) ==================================================
+BAD_TAILS = re.compile(
+ r"\b(i\s*'?m\s*sorry|i\s*am\s*sorry|note:|however|therefore|please|let\s+me\s+know|i\s+hope)\b",
+ re.IGNORECASE,
+)
+def sanitize(text: str) -> str:
+ if not text:
+ return "none"
+ t = text.replace("\n", " ").strip()
+ m = BAD_TAILS.search(t)
+ if m:
+ t = t[:m.start()].strip()
+ t = t.strip(" `\"'*")
+ t = re.split(r"[.;:!?]", t)[0].strip()
+ t = re.sub(r"\[[^\]]*\]|\([^)]*\)", "", t).strip()
+ t = t.lower()
+ t = re.sub(r"[^a-z0-9 ,\-]", " ", t)
+ t = re.sub(r"\s*,\s*", ", ", t)
+ t = re.sub(r"\s+", " ", t)
+ t = re.sub(r"(,\s*){2,}", ", ", t).strip(" ,")
+ return t if t else "none"
+
+# ===== Builders =================================================================
+@torch.inference_mode()
+def build_inputs_for_prompt(pil_img: Image.Image, prompt: str):
+ """
+ Prepare token ids and pixel values for one image+prompt.
+
+    IMPORTANT CHANGE:
+      - Treat the incoming `prompt` as a SYSTEM instruction (style / behavior),
+        and keep the USER message just as "<image>". This makes per-request prompts
+        (your modes) much more influential.
+    """
+    messages = [
+        {"role": "system", "content": prompt},
+        {"role": "user", "content": "<image>"},
+    ]
+    rendered = tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
+    if "<image>" not in rendered:
+        raise RuntimeError("Chat template missing <image> placeholder.")
+    pre, post = rendered.split("<image>", 1)
+
+ pre_ids = tok(pre, return_tensors="pt", add_special_tokens=False).input_ids.to(DEVICE)
+ post_ids = tok(post, return_tensors="pt", add_special_tokens=False).input_ids.to(DEVICE)
+
+ img_tok = torch.tensor([[IMAGE_TOKEN_INDEX]], dtype=pre_ids.dtype, device=DEVICE)
+ input_ids = torch.cat([pre_ids, img_tok, post_ids], dim=1)
+ attention_mask = torch.ones_like(input_ids, device=DEVICE)
+
+ px = img_proc(images=pil_img, return_tensors="pt")["pixel_values"].to(
+ DEVICE, dtype=model_dtype, non_blocking=True
+ )
+ return input_ids, attention_mask, px
+
+@torch.inference_mode()
+def build_batch(pairs: List[Tuple[Image.Image, str]]):
+ """
+ Create a padded batch of input_ids/attention_mask and a batch of pixel_values.
+
+ Each pair is (PIL_image, prompt). The prompt is treated as a SYSTEM instruction,
+    and the user message is just "<image>".
+ """
+ input_ids_list = []
+ input_lens = []
+ pil_list = []
+ for pil_img, prompt in pairs:
+ messages = [
+ {"role": "system", "content": prompt},
+ {"role": "user", "content": ""},
+ ]
+ rendered = tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
+ if "" not in rendered:
+ raise RuntimeError("Chat template missing placeholder.")
+ pre, post = rendered.split("", 1)
+ pre_ids = tok(pre, return_tensors="pt", add_special_tokens=False).input_ids[0]
+ post_ids = tok(post, return_tensors="pt", add_special_tokens=False).input_ids[0]
+ img_tok = torch.tensor([IMAGE_TOKEN_INDEX], dtype=pre_ids.dtype)
+ ids = torch.cat([pre_ids, img_tok, post_ids], dim=0) # 1D
+ input_ids_list.append(ids)
+ input_lens.append(ids.shape[0])
+ pil_list.append(pil_img)
+
+ pad_id = tok.pad_token_id
+ B = len(input_ids_list)
+ max_len = max(x.shape[0] for x in input_ids_list)
+ batched_ids = torch.full((B, max_len), pad_id, dtype=input_ids_list[0].dtype)
+ for i, ids in enumerate(input_ids_list):
+ batched_ids[i, :ids.shape[0]] = ids
+
+ attn_mask = (batched_ids != pad_id).long()
+ batched_ids = batched_ids.to(DEVICE, non_blocking=True)
+ attn_mask = attn_mask.to(DEVICE, non_blocking=True)
+
+ px = img_proc(images=pil_list, return_tensors="pt")["pixel_values"].to(
+ DEVICE, dtype=model_dtype, non_blocking=True
+ )
+ return batched_ids, attn_mask, px, input_lens
+
+# ===== Inference ================================================================
+@torch.inference_mode()
+def run_caption(pil_img: Image.Image, prompt: str, max_new: int, min_new: int) -> str:
+ input_ids, attention_mask, px = build_inputs_for_prompt(pil_img, prompt)
+ with gen_lock:
+ out = model.generate(
+ inputs=input_ids,
+ attention_mask=attention_mask,
+ images=px,
+ max_new_tokens=int(max_new),
+ min_new_tokens=int(min_new),
+ do_sample=False,
+ num_beams=1,
+ no_repeat_ngram_size=3,
+ repetition_penalty=1.05,
+ use_cache=True,
+ eos_token_id=tok.eos_token_id,
+ pad_token_id=tok.eos_token_id,
+ )
+ gen_only = out[0, input_ids.shape[1]:]
+ text = tok.decode(gen_only, skip_special_tokens=True).strip()
+ if not text:
+ full = tok.decode(out[0], skip_special_tokens=True).strip()
+ text = full if full else "none"
+ return sanitize(text) if SANITIZE else (text or "none")
+
+@torch.inference_mode()
+def run_caption_batch(pil_list: List[Image.Image], prompts: List[str], max_new: int, min_new: int) -> List[str]:
+ pairs = list(zip(pil_list, prompts))
+ input_ids, attention_mask, px, input_lens = build_batch(pairs)
+ with gen_lock:
+ out = model.generate(
+ inputs=input_ids,
+ attention_mask=attention_mask,
+ images=px,
+ max_new_tokens=int(max_new),
+ min_new_tokens=int(min_new),
+ do_sample=False,
+ num_beams=1,
+ no_repeat_ngram_size=3,
+ repetition_penalty=1.05,
+ use_cache=True,
+ eos_token_id=tok.eos_token_id,
+ pad_token_id=tok.eos_token_id,
+ )
+
+ texts = []
+ for i in range(out.shape[0]):
+ gen_only = out[i, input_lens[i]:]
+ t = tok.decode(gen_only, skip_special_tokens=True).strip()
+ if not t:
+ t = tok.decode(out[i], skip_special_tokens=True).strip()
+ texts.append(sanitize(t) if SANITIZE else (t or "none"))
+ return texts
+
+# ===== Utilities for folder endpoint ===========================================
+SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
+ALLOWED_EXT = {".jpg", ".jpeg", ".png", ".webp", ".bmp", ".tif", ".tiff", ".heic", ".avif"}
+
+def _secure_folder_path(folder_name: str) -> str:
+ # Allow only a simple folder name (no slashes); restrict to sibling of script dir.
+ if not folder_name or any(sep in folder_name for sep in (os.sep, os.altsep) if sep):
+ raise ValueError("folder must be a simple name without path separators")
+ if folder_name in (".", ".."):
+ raise ValueError("invalid folder name")
+ path = os.path.realpath(os.path.join(SCRIPT_DIR, folder_name))
+ # Ensure it's directly under the script dir (neighbor)
+ if os.path.dirname(path) != SCRIPT_DIR:
+ raise ValueError("folder must be a neighbor of the server script")
+ return path
+
+def _list_image_files(folder_path: str, limit: int) -> List[str]:
+ files = []
+ for name in sorted(os.listdir(folder_path)):
+ p = os.path.join(folder_path, name)
+ if not os.path.isfile(p):
+ continue
+ ext = os.path.splitext(name)[1].lower()
+ if ext in ALLOWED_EXT:
+ files.append(p)
+        # To stay fast we rely on the extension filter alone; files with other
+        # extensions are skipped rather than probed by opening them.
+ if len(files) >= limit:
+ break
+ return files
+
+# ===== Warmup ===================================================================
+_READY = False
+_STARTUP_LAT_MS = None
+
+def _sync_warmup():
+ """Synchronous warmup at startup: builds CUDA context, compiles kernels, allocates KV cache."""
+ global _READY, _STARTUP_LAT_MS
+ t0 = time.perf_counter()
+
+ warm_imgs = [Image.new("RGB", (640, 640), (0, 0, 0)) for _ in range(max(1, MAX_BATCH))]
+ pairs = [(im, NAV_PROMPT) for im in warm_imgs]
+ input_ids, attention_mask, px, _ = build_batch(pairs)
+
+ with torch.inference_mode(), gen_lock:
+ _ = model.generate(
+ inputs=input_ids,
+ attention_mask=attention_mask,
+ images=px,
+ max_new_tokens=max(WARMUP_NEW_TOKENS, MAX_NEW_TOKENS_DEFAULT),
+ min_new_tokens=1,
+ do_sample=False, num_beams=1, use_cache=True,
+ eos_token_id=tok.eos_token_id, pad_token_id=tok.eos_token_id,
+ )
+ with torch.inference_mode(), gen_lock:
+ _ = model.generate(
+ inputs=input_ids,
+ attention_mask=attention_mask,
+ images=px,
+ max_new_tokens=4,
+ min_new_tokens=1,
+ do_sample=False, num_beams=1, use_cache=True,
+ eos_token_id=tok.eos_token_id, pad_token_id=tok.eos_token_id,
+ )
+
+ _STARTUP_LAT_MS = int((time.perf_counter() - t0) * 1000)
+ _READY = True
+ print(f"[FastVLM:Flask] Warmup complete in ~{_STARTUP_LAT_MS} ms; server is READY.")
+
+# ===== Dynamic micro-batching ===================================================
+@dataclass
+class _Job:
+ pil: Image.Image
+ prompt: str
+ max_new: int
+ min_new: int
+ event: threading.Event
+ result: str = None
+ error: str = None
+
+queue = deque()
+q_lock = threading.Lock()
+q_cv = threading.Condition(q_lock)
+
+def _batcher_loop():
+ while True:
+ with q_cv:
+ while not queue:
+ q_cv.wait()
+ t0 = time.perf_counter()
+ while len(queue) < MAX_BATCH:
+ left = BATCH_TIMEOUT_MS/1000 - (time.perf_counter() - t0)
+ if left <= 0:
+ break
+ q_cv.wait(timeout=left)
+ n = min(MAX_BATCH, len(queue))
+ jobs = [queue.popleft() for _ in range(n)]
+
+ try:
+ pairs = [(j.pil, j.prompt) for j in jobs]
+ input_ids, attention_mask, px, input_lens = build_batch(pairs)
+ with gen_lock:
+ out = model.generate(
+ inputs=input_ids,
+ attention_mask=attention_mask,
+ images=px,
+ max_new_tokens=max(j.max_new for j in jobs),
+ min_new_tokens=min(j.min_new for j in jobs),
+ do_sample=False, num_beams=1,
+ no_repeat_ngram_size=3, repetition_penalty=1.05,
+ use_cache=True,
+ eos_token_id=tok.eos_token_id, pad_token_id=tok.eos_token_id,
+ )
+ for i, j in enumerate(jobs):
+ gen_only = out[i, input_lens[i]:]
+ txt = tok.decode(gen_only, skip_special_tokens=True).strip()
+ if not txt:
+ txt = tok.decode(out[i], skip_special_tokens=True).strip()
+ j.result = sanitize(txt) if SANITIZE else (txt or "none")
+ except Exception as e:
+ err = str(e)
+ for j in jobs:
+ j.error = err
+ finally:
+ for j in jobs:
+ j.event.set()
+
+# ---- Warm up BEFORE serving requests ------------------------------------------
+_sync_warmup()
+if DYNAMIC_BATCH:
+ threading.Thread(target=_batcher_loop, daemon=True).start()
+
+# ===== Flask App ================================================================
+app = Flask(__name__)
+
+@app.post("/caption")
+def caption():
+ if not _READY:
+ return jsonify(error="server not ready"), 503
+
+ f = request.files.get("image") or request.files.get("file")
+ if not f:
+ return jsonify(error="expected multipart/form-data with file field 'image'"), 400
+
+ prompt = request.form.get("prompt") or NAV_PROMPT
+ max_new = int(request.form.get("max_new_tokens", str(MAX_NEW_TOKENS_DEFAULT)))
+ min_new = int(request.form.get("min_new_tokens", str(MIN_NEW_TOKENS_DEFAULT)))
+
+ raw = f.read()
+ if len(raw) < 10:
+ return jsonify(error=f"uploaded file too small ({len(raw)} bytes)"), 400
+ try:
+ pil = Image.open(io.BytesIO(raw)); pil.load(); pil = pil.convert("RGB")
+ except Exception as e:
+ sig = list(raw[:16]) if raw else []
+ return jsonify(error=f"Pillow could not open image: {type(e).__name__}: {e}. First 16 bytes: {sig}"), 400
+
+ if not DYNAMIC_BATCH:
+ try:
+ text = run_caption(pil, prompt, max_new, min_new)
+ return jsonify(caption=text, prompt_used=prompt)
+ except Exception as e:
+ return jsonify(error=str(e)), 500
+
+ job = _Job(pil=pil, prompt=prompt, max_new=max_new, min_new=min_new, event=threading.Event())
+ with q_cv:
+ queue.append(job)
+ q_cv.notify()
+ if not job.event.wait(timeout=60):
+ return jsonify(error="inference timed out"), 504
+ if job.error:
+ return jsonify(error=job.error), 500
+ return jsonify(caption=job.result, prompt_used=prompt)
+
+@app.post("/caption_batch")
+def caption_batch():
+ if not _READY:
+ return jsonify(error="server not ready"), 503
+
+ files = request.files.getlist("image") or request.files.getlist("images")
+ if not files:
+ return jsonify(error="expected multipart/form-data with file field 'image' (repeatable) or 'images'"), 400
+ if len(files) > MAX_BATCH:
+ return jsonify(error=f"too many images; MAX_BATCH={MAX_BATCH}"), 400
+
+ prompts = request.form.getlist("prompt")
+ if not prompts:
+ prompts = [request.form.get("prompt") or NAV_PROMPT] * len(files)
+ elif len(prompts) == 1 and len(files) > 1:
+ prompts = prompts * len(files)
+ elif len(prompts) != len(files):
+ return jsonify(error="number of prompts must be 1 or equal to number of images"), 400
+
+ max_new = int(request.form.get("max_new_tokens", str(MAX_NEW_TOKENS_DEFAULT)))
+ min_new = int(request.form.get("min_new_tokens", str(MIN_NEW_TOKENS_DEFAULT)))
+
+ pil_list = []
+ for f in files:
+ raw = f.read()
+ if len(raw) < 10:
+ return jsonify(error=f"uploaded file too small ({len(raw)} bytes)"), 400
+ try:
+ pil = Image.open(io.BytesIO(raw)); pil.load(); pil = pil.convert("RGB")
+ except Exception as e:
+ sig = list(raw[:16]) if raw else []
+ return jsonify(error=f"Pillow could not open image: {type(e).__name__}: {e}. First 16 bytes: {sig}"), 400
+ pil_list.append(pil)
+
+ try:
+ texts = run_caption_batch(pil_list, prompts, max_new, min_new)
+ return jsonify(captions=texts, prompts_used=prompts)
+ except Exception as e:
+ return jsonify(error=str(e)), 500
+
+@app.post("/caption_folder")
+def caption_folder():
+ """
+ Caption every image in a folder that is a neighbor (sibling) of this server script.
+ Request: multipart/form-data with:
+ - folder: simple folder name (e.g., 'images') located at SCRIPT_DIR/folder
+ - prompt (optional)
+ - max_new_tokens, min_new_tokens (optional)
+      Response: JSON dict mapping each filename to its caption, e.g. { "<filename>": "<caption>", ... }
+ """
+ if not _READY:
+ return jsonify(error="server not ready"), 503
+
+ folder_name = request.form.get("folder") or request.args.get("folder")
+ if not folder_name:
+ return jsonify(error="missing 'folder'"), 400
+
+ try:
+ folder_path = _secure_folder_path(folder_name)
+ except ValueError as ve:
+ return jsonify(error=str(ve)), 400
+
+ if not os.path.isdir(folder_path):
+ return jsonify(error=f"folder not found: {folder_name}"), 404
+
+ prompt = request.form.get("prompt") or NAV_PROMPT
+ max_new = int(request.form.get("max_new_tokens", str(MAX_NEW_TOKENS_DEFAULT)))
+ min_new = int(request.form.get("min_new_tokens", str(MIN_NEW_TOKENS_DEFAULT)))
+
+ files = _list_image_files(folder_path, FOLDER_MAX_FILES)
+ if not files:
+ return jsonify(error=f"no images found in folder '{folder_name}' (allowed: {sorted(ALLOWED_EXT)})"), 404
+
+ results: Dict[str, str] = {}
+ # Process in batches for speed + memory safety
+ i = 0
+ try:
+ while i < len(files):
+ chunk_paths = files[i:i+MAX_BATCH]
+ pil_list, names = [], []
+ for p in chunk_paths:
+ name = os.path.basename(p)
+ names.append(name)
+ try:
+ with open(p, "rb") as fh:
+ raw = fh.read()
+ pil = Image.open(io.BytesIO(raw)); pil.load(); pil = pil.convert("RGB")
+ pil_list.append(pil)
+ except Exception as e:
+ results[name] = f"error: {type(e).__name__}: {e}"
+ # If at least one image opened, run one batched generate
+ if pil_list:
+ prompts = [prompt] * len(pil_list)
+ texts = run_caption_batch(pil_list, prompts, max_new, min_new)
+ # Map back only to those successfully opened (skip errors already filled)
+ j = 0
+ for name in names:
+ if name in results and results[name].startswith("error:"):
+ continue
+ results[name] = texts[j]
+ j += 1
+ i += MAX_BATCH
+ except Exception as e:
+ return jsonify(error=str(e), partial_results=results), 500
+
+ return jsonify(results=results, count=len(results), folder=folder_name, prompt_used=prompt)
+
+@app.get("/health")
+def health():
+ return jsonify(
+ ready=_READY,
+ startup_warmup_ms=_STARTUP_LAT_MS,
+ model=MODEL_ID,
+ quantization=QUANT_MODE,
+ device=str(DEVICE),
+ dtype=str(model_dtype),
+ max_batch=MAX_BATCH,
+ dynamic_batch=DYNAMIC_BATCH,
+ batch_timeout_ms=BATCH_TIMEOUT_MS,
+ max_new_tokens=MAX_NEW_TOKENS_DEFAULT,
+ min_new_tokens=MIN_NEW_TOKENS_DEFAULT,
+ script_dir=SCRIPT_DIR
+ ), 200
+
+@app.get("/")
+def root():
+ return (
+ "FastVLM-7B caption server (CUDA, prewarmed, dynamic micro-batching). "
+ "POST image to /caption, multiple images to /caption_batch, or folder name to /caption_folder"
+ ), 200
+
+# ===== Main ====================================================================
+if __name__ == "__main__":
+ host = os.environ.get("HOST", "127.0.0.1")
+ port = int(os.environ.get("PORT", "7860"))
+ app.run(host=host, port=port, debug=False, threaded=True)
diff --git a/Lab 6/imgs/IMG_0270.jpg b/Lab 6/imgs/IMG_0270.jpg
new file mode 100644
index 0000000000..f8faa829c3
Binary files /dev/null and b/Lab 6/imgs/IMG_0270.jpg differ
diff --git a/Lab 6/imgs/MQTT-explorer.png b/Lab 6/imgs/MQTT-explorer.png
new file mode 100644
index 0000000000..a5af7ce3fd
Binary files /dev/null and b/Lab 6/imgs/MQTT-explorer.png differ
diff --git a/Lab 6/imgs/mqtt_explorer.png b/Lab 6/imgs/mqtt_explorer.png
new file mode 100644
index 0000000000..c7232160b2
Binary files /dev/null and b/Lab 6/imgs/mqtt_explorer.png differ
diff --git a/Lab 6/imgs/mqtt_explorer_2.png b/Lab 6/imgs/mqtt_explorer_2.png
new file mode 100644
index 0000000000..90861090b3
Binary files /dev/null and b/Lab 6/imgs/mqtt_explorer_2.png differ
diff --git a/Lab 6/imgs/two-devices-grid.png b/Lab 6/imgs/two-devices-grid.png
new file mode 100644
index 0000000000..0c648dae69
Binary files /dev/null and b/Lab 6/imgs/two-devices-grid.png differ
diff --git a/Lab 6/mqtt_bridge.py b/Lab 6/mqtt_bridge.py
new file mode 100644
index 0000000000..ec10f3f54f
--- /dev/null
+++ b/Lab 6/mqtt_bridge.py
@@ -0,0 +1,116 @@
+"""
+MQTT Bridge for Pixel Grid
+Enable this to connect MQTT -> WebSocket
+"""
+
+import paho.mqtt.client as mqtt
+import ssl
+import json
+from flask_socketio import SocketIO
+
+# MQTT Configuration
+MQTT_BROKER = 'farlab.infosci.cornell.edu'
+MQTT_PORT = 1883
+MQTT_TOPIC = 'IDD/pixelgrid/colors'
+MQTT_USERNAME = 'idd'
+MQTT_PASSWORD = 'device@theFarm'
+
+mqtt_client = None
+
+
+def on_connect(client, userdata, flags, rc):
+ """MQTT connected"""
+ if rc == 0:
+ print(f'✓ MQTT connected to {MQTT_BROKER}:{MQTT_PORT}')
+ client.subscribe(MQTT_TOPIC)
+ print(f'✓ Subscribed to {MQTT_TOPIC}')
+ else:
+ print(f'✗ MQTT connection failed: {rc}')
+
+
+def on_message(client, userdata, msg):
+ """MQTT message received - forward to WebSocket"""
+ try:
+ data = json.loads(msg.payload.decode('UTF-8'))
+ socketio = userdata['socketio']
+ pixels = userdata['pixels']
+
+ mac = data.get('mac')
+ r = int(data.get('r', 0))
+ g = int(data.get('g', 0))
+ b = int(data.get('b', 0))
+
+ # Validate
+ r = max(0, min(255, r))
+ g = max(0, min(255, g))
+ b = max(0, min(255, b))
+
+ # Check if new pixel
+ is_new = mac not in pixels
+
+ if is_new:
+            from datetime import datetime
+            # Assign next available position
+            position = len(pixels)
+            pixels[mac] = {
+                'color': [r, g, b],
+                'position': position,
+                'last_update': datetime.now()
+            }
+            print(f'✓ MQTT pixel: {mac[:17]} at position {position} RGB({r},{g},{b})')
+        else:
+            from datetime import datetime
+            # Update existing pixel
+            pixels[mac]['color'] = [r, g, b]
+            pixels[mac]['last_update'] = datetime.now()
+
+ # Broadcast to all clients
+ socketio.emit('pixel_update', {
+ 'mac': mac,
+ 'color': [r, g, b],
+ 'position': pixels[mac]['position'],
+ 'is_new': is_new,
+ 'total': len(pixels)
+ }, namespace='/')
+
+ except Exception as e:
+ print(f'Error processing MQTT message: {e}')
+
+
+def start_mqtt_bridge(socketio_instance, pixels_dict):
+ """Start MQTT client that forwards to WebSocket"""
+ global mqtt_client
+
+ try:
+ import uuid
+ mqtt_client = mqtt.Client(str(uuid.uuid1()))
+
+ # Only use TLS if port is 8883
+ if MQTT_PORT == 8883:
+ mqtt_client.tls_set(cert_reqs=ssl.CERT_NONE)
+
+ mqtt_client.username_pw_set(MQTT_USERNAME, MQTT_PASSWORD)
+ mqtt_client.on_connect = on_connect
+ mqtt_client.on_message = on_message
+ mqtt_client.user_data_set({'socketio': socketio_instance, 'pixels': pixels_dict})
+
+ mqtt_client.connect(MQTT_BROKER, port=MQTT_PORT, keepalive=60)
+ mqtt_client.loop_start()
+
+ print('MQTT bridge started')
+ return True
+
+ except Exception as e:
+ print(f'⚠️ MQTT bridge failed: {e}')
+ print(' Server will run with WebSocket only')
+ return False
+
+
+def stop_mqtt_bridge():
+ """Stop MQTT client"""
+ global mqtt_client
+ if mqtt_client:
+ mqtt_client.loop_stop()
+ mqtt_client.disconnect()
+ print('MQTT bridge stopped')
diff --git a/Lab 6/mqtt_viewer.py b/Lab 6/mqtt_viewer.py
new file mode 100644
index 0000000000..5556d0916f
--- /dev/null
+++ b/Lab 6/mqtt_viewer.py
@@ -0,0 +1,145 @@
+"""
+MQTT Message Viewer
+Lightweight debugging tool to view all MQTT messages in real-time
+"""
+
+from flask import Flask, render_template
+from flask_socketio import SocketIO, emit
+import paho.mqtt.client as mqtt
+from datetime import datetime
+from collections import deque
+import json
+
+app = Flask(__name__)
+app.config['SECRET_KEY'] = 'mqtt-viewer-2025'
+socketio = SocketIO(app, cors_allowed_origins="*", async_mode='threading')
+
+# Store recent messages (limited to prevent memory issues)
+MAX_MESSAGES = 100
+recent_messages = deque(maxlen=MAX_MESSAGES)
+
+# MQTT Configuration
+MQTT_BROKER = 'farlab.infosci.cornell.edu'
+MQTT_PORT = 1883
+MQTT_TOPIC = 'IDD/#' # Subscribe to all IDD topics
+MQTT_USERNAME = 'idd'
+MQTT_PASSWORD = 'device@theFarm'
+
+mqtt_client = None
+
+
+def on_connect(client, userdata, flags, rc):
+ """MQTT connected"""
+ if rc == 0:
+ print(f'✓ MQTT connected to {MQTT_BROKER}:{MQTT_PORT}')
+ client.subscribe(MQTT_TOPIC)
+ print(f'✓ Subscribed to {MQTT_TOPIC}')
+ else:
+ print(f'✗ MQTT connection failed: {rc}')
+
+
+def on_message(client, userdata, msg):
+ """MQTT message received - broadcast to web clients"""
+ try:
+ # Try to parse as JSON, otherwise use as plain text
+ try:
+ payload = json.loads(msg.payload.decode('utf-8'))
+ payload_str = json.dumps(payload, indent=2)
+ is_json = True
+        except (json.JSONDecodeError, UnicodeDecodeError):
+ payload_str = msg.payload.decode('utf-8', errors='replace')
+ is_json = False
+
+ # Create message object
+ message = {
+ 'timestamp': datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3],
+ 'topic': msg.topic,
+ 'payload': payload_str,
+ 'is_json': is_json
+ }
+
+ # Add to recent messages
+ recent_messages.append(message)
+
+ # Broadcast to all connected web clients
+ socketio.emit('mqtt_message', message, namespace='/')
+
+ except Exception as e:
+ print(f'Error processing message: {e}')
+
+
+def start_mqtt_client():
+ """Start MQTT client"""
+ global mqtt_client
+
+ try:
+ import uuid
+ mqtt_client = mqtt.Client(str(uuid.uuid1()))
+ mqtt_client.username_pw_set(MQTT_USERNAME, MQTT_PASSWORD)
+ mqtt_client.on_connect = on_connect
+ mqtt_client.on_message = on_message
+
+ mqtt_client.connect(MQTT_BROKER, port=MQTT_PORT, keepalive=60)
+ mqtt_client.loop_start()
+
+ print('MQTT viewer started')
+ return True
+
+ except Exception as e:
+ print(f'⚠️ MQTT client failed: {e}')
+ return False
+
+
+@app.route('/')
+def index():
+ """Main viewer page"""
+ return render_template('mqtt_viewer.html')
+
+
+@socketio.on('connect')
+def handle_connect():
+ """Client connected - send recent messages"""
+ print(f'Web client connected')
+ # Send recent messages to newly connected client
+ for msg in recent_messages:
+ emit('mqtt_message', msg)
+
+
+@socketio.on('disconnect')
+def handle_disconnect():
+ """Client disconnected"""
+ print(f'Web client disconnected')
+
+
+@socketio.on('clear_messages')
+def handle_clear():
+ """Clear message history"""
+ recent_messages.clear()
+ emit('messages_cleared', broadcast=True)
+ print('Messages cleared')
+
+
+@socketio.on('update_filter')
+def handle_filter(data):
+ """Update topic filter settings"""
+ # Just acknowledge - filtering happens client-side
+ print(f'Filter updated: {data}')
+ emit('filter_updated', data)
+
+
+if __name__ == '__main__':
+ print("=" * 60)
+ print(" MQTT Message Viewer")
+ print("=" * 60)
+ print(f" Viewer URL: http://0.0.0.0:5001")
+ print(f" Monitoring: {MQTT_TOPIC} on {MQTT_BROKER}")
+ print("=" * 60)
+
+ # Start MQTT client
+ start_mqtt_client()
+
+ print("=" * 60)
+ print()
+
+ # Run Flask app on port 5001 (different from main app)
+ socketio.run(app, host='0.0.0.0', port=5001, debug=False, allow_unsafe_werkzeug=True)
diff --git a/Lab 6/old_demos/README_old.md b/Lab 6/old_demos/README_old.md
new file mode 100644
index 0000000000..274b122865
--- /dev/null
+++ b/Lab 6/old_demos/README_old.md
@@ -0,0 +1,166 @@
+# Little Interactions Everywhere
+
+**NAMES OF COLLABORATORS HERE**
+
+## Prep
+
+1. Pull the new changes from the class interactive-lab-hub. (You should be familiar with this already!)
+2. Install [MQTT Explorer](http://mqtt-explorer.com/) on your laptop. If you are using Mac, MQTT Explorer only works when installed from the [App Store](https://apps.apple.com/app/apple-store/id1455214828).
+3. Readings before class:
+ * [MQTT](#MQTT)
+ * [The Presence Table](https://dl.acm.org/doi/10.1145/1935701.1935800) and [video](https://vimeo.com/15932020)
+
+
+## Overview
+
+The point of this lab is to introduce you to distributed interaction. We have included some Natural Language Processing (NLP) and Generation (NLG), but those are not really the emphasis. Feel free to dig into the examples and play around with the code, which you can integrate into your projects if you want. However, we want to emphasize that the grading will focus on your ability to develop interesting uses for messaging across distributed devices. Here are the sections of the lab activity:
+
+A) [MQTT](#part-a)
+
+B) [Send and Receive on your Pi](#part-b)
+
+C) [Streaming a Sensor](#part-c)
+
+D) [The One True ColorNet](#part-d)
+
+E) [Make It Your Own](#part-e)
+
+## Part 1.
+
+### Part A
+### MQTT
+
+MQTT is a lightweight messaging protocol invented in 1999 for low bandwidth networks. It was later adopted as a defacto standard for a variety of [Internet of Things (IoT)](https://en.wikipedia.org/wiki/Internet_of_things) devices.
+
+#### The Bits
+
+* **Broker** - The central server node that receives all messages and sends them out to the interested clients. Our broker is hosted on the far lab server (Thanks David!) at `farlab.infosci.cornell.edu/8883`. Imagine that the Broker is the messaging center!
+* **Client** - A device that subscribes or publishes information to/on the network.
+* **Topic** - The location data gets published to. These are *hierarchical with subtopics*. For example, if you were making a network of IoT smart bulbs this might look like `home/livingroom/sidelamp/light_status` and `home/livingroom/sidelamp/voltage`. With this setup, the info/updates of the sidelamp's `light_status` and `voltage` will be stored in those subtopics. Because we use this broker for a variety of projects, you have access to read, write and create subtopics of `IDD`. This means `IDD/ilan/is/a/goof` is a valid topic you can send data messages to.
+* **Subscribe** - This is a way of telling the client to pay attention to messages the broker sends out on the topic. You can subscribe to a specific topic or subtopics. You can also unsubscribe. Following the previous example of home IoT smart bulbs, subscribing to `home/livingroom/sidelamp/#` would give you message updates to both the light_status and the voltage.
+* **Publish** - This is a way of sending messages to a topic. Again, with the previous example, you can set up your IoT smart bulbs to publish info/updates to the topic or subtopic. Also, note that you can publish to topics you do not subscribe to.
+
+
+**Important note:** With the broker we set up for the class, you are limited to subtopics of `IDD`. That is, to publish or subscribe, the topics will start with `IDD/`. Also, setting up a broker is not much work, but for the purposes of this class, you should all use the broker we have set up for you!
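+
+As a quick orientation before the tooling and scripts below, here is a minimal publish sketch using the same class-broker settings as the included `sender.py`/`reader.py`; the topic name is just an example:
+
+```python
+import ssl
+import uuid
+
+import paho.mqtt.client as mqtt
+
+client = mqtt.Client(str(uuid.uuid1()))          # every client needs a unique id
+client.tls_set(cert_reqs=ssl.CERT_NONE)          # the class broker uses TLS on port 8883
+client.username_pw_set('idd', 'device@theFarm')
+client.connect('farlab.infosci.cornell.edu', port=8883)
+client.publish('IDD/my/example/topic', 'hello from my Pi')  # topics must start with IDD/
+```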
+
+
+#### Useful Tooling
+
+Debugging and visualizing what's happening on your MQTT broker can be helpful. We like [MQTT Explorer](http://mqtt-explorer.com/). You can connect by putting in the settings from the image below.
+
+
+
+
+
+Once connected, you should be able to see all the messages under the IDD topic. Then, go to the **Publish** tab and try publishing something! From the interface you can send and plot messages as well. Remember, you are limited to subtopics of `IDD`. That is, to publish or subscribe, the topics will start with `IDD/`.
+
+
+
+
+
+
+### Part B
+### Send and Receive on your Pi
+
+[sender.py](./sender.py) and [reader.py](./reader.py) show you the basics of using MQTT in Python. Let's spend a few minutes running these and seeing how messages are transferred and show up. Before working on your Pi, keep the connection to `farlab.infosci.cornell.edu/8883` open in MQTT Explorer on your laptop.
+
+**Running Examples on Pi**
+
+* Install the packages from `requirements.txt` under a virtual environment:
+
+ ```
+ pi@raspberrypi:~/Interactive-Lab-Hub $ source .venv/bin/activate
+ (circuitpython) pi@raspberrypi:~/Interactive-Lab-Hub $ cd Lab\ 6
+ (circuitpython) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 6 $ pip install -r requirements.txt
+ ...
+ ```
+* Run `sender.py`, fill in a topic name (should start with `IDD/`), then start sending messages. You should be able to see them on MQTT Explorer.
+
+ ```
+ (circuitpython) pi@raspberrypi:~/Interactive-Lab-Hub/Lab 6 $ python sender.py
+ >> topic: IDD/AlexandraTesting
+ now writing to topic IDD/AlexandraTesting
+ type new-topic to swich topics
+ >> message: testtesttest
+ ...
+ ```
+* Run `reader.py`, and you should see any messages being published to `IDD/` subtopics. Type a message inside MQTT explorer and see if you can receive it with `reader.py`.
+
+ ```
+ (circuitpython) pi@raspberrypi:~ Interactive-Lab-Hub/Lab 6 $ python reader.py
+ ...
+ ```
+
+
+
+
+**\*\*\*Consider how you might use this messaging system on interactive devices, and draw/write down 5 ideas here.\*\*\***
+
+### Part C
+### Streaming a Sensor
+
+We have included an updated example from [lab 4](https://github.com/FAR-Lab/Interactive-Lab-Hub/tree/Fall2021/Lab%204) that streams the [capacitor sensor](https://learn.adafruit.com/adafruit-mpr121-gator) inputs over MQTT.
+
+Plug in the capacitive sensor board with the Qwiic connector. Use the alligator clips to connect a Twizzler (or any other things you used back in Lab 4) and run the example script:
+
+
+
+
+
+
+
+
+ ```
+ (circuitpython) pi@raspberrypi:~ Interactive-Lab-Hub/Lab 6 $ python distributed_twizzlers_sender.py
+ ...
+ ```
+
+**\*\*\*Include a picture of your setup here: what did you see on MQTT Explorer?\*\*\***
+
+**\*\*\*Pick another part in your kit and try to implement the data streaming with it.\*\*\***
+
+
+### Part D
+### The One True ColorNet
+
+It is with great fortitude and resilience that we shall worship at the altar of the *OneColor*. Through unity of the collective RGB, we too can find unity in our heart, minds and souls. With the help of machines, we can overthrow the bourgeoisie, get on the same wavelength (this was also a color pun) and establish [Fully Automated Luxury Communism](https://en.wikipedia.org/wiki/Fully_Automated_Luxury_Communism).
+
+The first step on the path to *collective* enlightenment is to plug the [APDS-9960 Proximity, Light, RGB, and Gesture Sensor](https://www.adafruit.com/product/3595) into the [MiniPiTFT Display](https://www.adafruit.com/product/4393). You are almost there!
+
+
+
+
+
+
+
+
+The second step to achieving our great enlightenment is to run `color.py`. We talked about this sensor back in Lab 2 and Lab 4; this script is similar to what you have done before! Remember to activate the `circuitpython` virtual environment you have been using this semester before running the script:
+
+ ```
+ (circuitpython) pi@raspberrypi:~ Interactive-Lab-Hub/Lab 6 $ systemctl stop mini-screen.service
+ (circuitpython) pi@raspberrypi:~ Interactive-Lab-Hub/Lab 6 $ python color.py
+ ...
+ ```
+
+When you run the script, you will find two squares on the display. One half shows an approximation of the output from the color sensor. The other half is up to the collective. Press the top button to share your color with the class. Your color is now our color, our color is now your color. We are one.
+
+(A message from the previous TA, Ilan: I was not super careful with handling the loop so you may need to press more than once if the timing isn't quite right. Also, I haven't load-tested it so things might just immediately break when everyone pushes the button at once.)
+
+**\*\*\*Can you set up the script so that it reads the color anyone else publishes and displays it on your screen?\*\*\***
+
+
+### Part E
+### Make it your own
+
+Find at least one class partner (more are okay), and design a distributed application together based on the exercise we asked you to do in this lab.
+
+**\*\*\*1. Explain your design\*\*\*** For example, if you made a remote controlled banana piano, explain why anyone would want such a thing.
+
+**\*\*\*2. Diagram the architecture of the system.\*\*\*** Be clear to document where input, output and computation occur, and label all parts and connections. For example, where is the banana, who is the banana player, where does the sound get played, and who is listening to the banana music?
+
+**\*\*\*3. Build a working prototype of the system.\*\*\*** Do think about the user interface: if someone encountered these bananas somewhere in the wild, would they know how to interact with them? Should they know what to expect?
+
+**\*\*\*4. Document the working prototype in use.\*\*\*** It may be helpful to record a Zoom session where you show the input in one location clearly causing a response in another location.
+
+
+
diff --git a/Lab 6/old_demos/color.py b/Lab 6/old_demos/color.py
new file mode 100644
index 0000000000..8023b40443
--- /dev/null
+++ b/Lab 6/old_demos/color.py
@@ -0,0 +1,111 @@
+import board
+import busio
+import adafruit_apds9960.apds9960
+import time
+import paho.mqtt.client as mqtt
+import uuid
+import signal
+import ssl
+
+import digitalio
+from PIL import Image, ImageDraw, ImageFont
+import adafruit_rgb_display.st7789 as st7789
+
+
+# Configuration for CS and DC pins (these are FeatherWing defaults on M0/M4):
+cs_pin = digitalio.DigitalInOut(board.CE0)
+dc_pin = digitalio.DigitalInOut(board.D25)
+reset_pin = None
+
+# Config for display baudrate (default max is 24mhz):
+BAUDRATE = 64000000
+
+backlight = digitalio.DigitalInOut(board.D22)
+backlight.switch_to_output()
+backlight.value = True
+buttonA = digitalio.DigitalInOut(board.D23)
+buttonB = digitalio.DigitalInOut(board.D24)
+buttonA.switch_to_input()
+buttonB.switch_to_input()
+
+# Setup SPI bus using hardware SPI:
+spi = board.SPI()
+
+# Create the ST7789 display:
+disp = st7789.ST7789(
+ spi,
+ cs=cs_pin,
+ dc=dc_pin,
+ rst=reset_pin,
+ baudrate=BAUDRATE,
+ width=135,
+ height=240,
+ x_offset=53,
+ y_offset=40,
+)
+
+height = disp.height
+width = disp.width
+image = Image.new("RGB", (width, height))
+draw = ImageDraw.Draw(image)
+
+
+i2c = busio.I2C(board.SCL, board.SDA)
+sensor = adafruit_apds9960.apds9960.APDS9960(i2c)
+
+sensor.enable_color = True
+r, g, b, a = sensor.color_data
+
+topic = 'IDD/colors'
+
+def on_connect(client, userdata, flags, rc):
+ print(f"connected with result code {rc}")
+ client.subscribe(topic)
+
+def on_message(client, userdata, msg):
+    # if a message is received on the colors topic, parse it and show that color
+    # on the top half of the display (values arrive as the raw 16-bit sensor readings)
+    if msg.topic == topic:
+        colors = list(map(int, msg.payload.decode('UTF-8').split(',')))
+        draw.rectangle((0, 0, width, height*0.5), fill=tuple(int(255*(c/65536)) for c in colors[:3]))
+ disp.image(image)
+
+client = mqtt.Client(str(uuid.uuid1()))
+client.tls_set(cert_reqs=ssl.CERT_NONE)
+client.username_pw_set('idd', 'device@theFarm')
+client.on_connect = on_connect
+client.on_message = on_message
+
+client.connect(
+ 'farlab.infosci.cornell.edu',
+ port=8883)
+
+client.loop_start()
+
+# this lets us exit gracefully (close the connection to the broker)
+def handler(signum, frame):
+ print('exit gracefully')
+ client.loop_stop()
+ exit (0)
+
+# when SIGINT happens, run the handler callback function
+signal.signal(signal.SIGINT, handler)
+
+# our main loop
+while True:
+ r, g, b, a = sensor.color_data
+
+ # there are a few things going on here
+ # colors are reported at 16 bits (that's 65536 levels per color).
+ # we need to convert that to 0-255; that's what the 255*(x/65536) is doing
+ # colors are also reported with an alpha (opacity, or in our case a proxy for ambient brightness)
+ # 255*(1-(a/65536)) acts as a scaling factor for brightness; it worked well enough in the lab but
+ # your success may vary depending on how much ambient light there is, so you can mess with these constants
+ color = tuple(map(lambda x: int(255*(1-(a/65536))*255*(x/65536)), [r,g,b,a]))
+
+ # if we press the button, send a msg to change everyone's color
+ if not buttonA.value:
+ client.publish(topic, f"{r},{g},{b}")
+ draw.rectangle((0, height*0.5, width, height), fill=color[:3])
+ disp.image(image)
+ time.sleep(.01)
+
diff --git a/Lab 6/old_demos/distributed_twizzlers_sender.py b/Lab 6/old_demos/distributed_twizzlers_sender.py
new file mode 100644
index 0000000000..e9fcc563a3
--- /dev/null
+++ b/Lab 6/old_demos/distributed_twizzlers_sender.py
@@ -0,0 +1,30 @@
+import time
+import board
+import busio
+import adafruit_mpr121
+import ssl
+
+import paho.mqtt.client as mqtt
+import uuid
+
+client = mqtt.Client(str(uuid.uuid1()))
+client.tls_set(cert_reqs=ssl.CERT_NONE)
+client.username_pw_set('idd', 'device@theFarm')
+
+client.connect(
+ 'farlab.infosci.cornell.edu',
+ port=8883)
+
+topic = 'IDD/your/topic/here'
+
+i2c = busio.I2C(board.SCL, board.SDA)
+
+mpr121 = adafruit_mpr121.MPR121(i2c)
+
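+# poll the 12 capacitive pads; when one reads as touched, publish a message to the topic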
+while True:
+ for i in range(12):
+ if mpr121[i].value:
+ val = f"Twizzler {i} touched!"
+ print(val)
+ client.publish(topic, val)
+ time.sleep(0.25)
diff --git a/Lab 6/old_demos/reader.py b/Lab 6/old_demos/reader.py
new file mode 100644
index 0000000000..df397a96fe
--- /dev/null
+++ b/Lab 6/old_demos/reader.py
@@ -0,0 +1,45 @@
+import paho.mqtt.client as mqtt
+import uuid
+import ssl
+
+# the # wildcard means we subscribe to all subtopics of IDD
+topic = 'IDD/#'
+
+# some other examples
+# topic = 'IDD/a/fun/topic'
+
+#this is the callback that gets called once we connect to the broker.
+#we should add our subscribe functions here as well
+def on_connect(client, userdata, flags, rc):
+ print(f"connected with result code {rc}")
+ client.subscribe(topic)
+ # you can subscribe to as many topics as you'd like
+ # client.subscribe('some/other/topic')
+
+
+# this is the callback that gets called each time a message is received
+def on_message(client, userdata, msg):
+ print(f"topic: {msg.topic} msg: {msg.payload.decode('UTF-8')}")
+ # you can filter by topics
+ # if msg.topic == 'IDD/some/other/topic': do thing
+
+
+# Every client needs a random ID
+client = mqtt.Client(str(uuid.uuid1()))
+# configure network encryption etc
+client.tls_set(cert_reqs=ssl.CERT_NONE)
+# this is the username and pw we have setup for the class
+client.username_pw_set('idd', 'device@theFarm')
+
+# attach our callbacks to the client
+client.on_connect = on_connect
+client.on_message = on_message
+
+#connect to the broker
+client.connect(
+ 'farlab.infosci.cornell.edu',
+ port=8883)
+
+# this is blocking. to see other ways of dealing with the loop
+# https://www.eclipse.org/paho/index.php?page=clients/python/docs/index.php#network-loop
+client.loop_forever()
diff --git a/Lab 6/old_demos/sender.py b/Lab 6/old_demos/sender.py
new file mode 100644
index 0000000000..041d82492e
--- /dev/null
+++ b/Lab 6/old_demos/sender.py
@@ -0,0 +1,31 @@
+import paho.mqtt.client as mqtt
+import uuid
+import ssl
+
+# Every client needs a random ID
+client = mqtt.Client(str(uuid.uuid1()))
+# configure network encryption etc
+client.tls_set(cert_reqs=ssl.CERT_NONE)
+
+# this is the username and pw we have setup for the class
+client.username_pw_set('idd', 'device@theFarm')
+
+#connect to the broker
+client.connect(
+ 'farlab.infosci.cornell.edu',
+ port=8883)
+
+while True:
+ cmd = input('>> topic: IDD/')
+ if ' ' in cmd:
+ print('sorry, whitespace is a no-go in topic names')
+ else:
+ topic = f"IDD/{cmd}"
+ print(f"now writing to topic {topic}")
+ print("type new-topic to swich topics")
+ while True:
+ val = input(">> message: ")
+ if val == 'new-topic':
+ break
+ else:
+ client.publish(topic, val)
diff --git a/Lab 6/pixel_grid_publisher.py b/Lab 6/pixel_grid_publisher.py
new file mode 100644
index 0000000000..c235b88ae0
--- /dev/null
+++ b/Lab 6/pixel_grid_publisher.py
@@ -0,0 +1,312 @@
+#!/usr/bin/env python3
+"""
+Pixel Grid Pi Publisher
+Reads RGB color sensor and publishes to collaborative pixel grid
+Each Pi is identified by MAC address and gets a stable position in the grid
+"""
+
+import board
+import busio
+import adafruit_apds9960.apds9960
+import time
+import paho.mqtt.client as mqtt
+import uuid
+import signal
+import ssl
+import json
+import socket
+import subprocess
+
+# Optional: Display support (comment out if no display)
+try:
+ import digitalio
+ from PIL import Image, ImageDraw, ImageFont
+ import adafruit_rgb_display.st7789 as st7789
+ DISPLAY_AVAILABLE = True
+except ImportError:
+ DISPLAY_AVAILABLE = False
+ print("Display libraries not available - running in headless mode")
+
+
+# MQTT Configuration
+MQTT_BROKER = 'farlab.infosci.cornell.edu'
+MQTT_PORT = 1883 # Changed to non-TLS port
+MQTT_TOPIC = 'IDD/pixelgrid/colors'
+MQTT_USERNAME = 'idd'
+MQTT_PASSWORD = 'device@theFarm'
+
+# Publishing interval (seconds)
+PUBLISH_INTERVAL = 0.1
+
+
+def get_mac_address():
+ """Get the MAC address of the primary network interface"""
+ try:
+ # Try to get MAC from eth0 or wlan0
+ result = subprocess.run(['cat', '/sys/class/net/eth0/address'],
+ capture_output=True, text=True)
+ if result.returncode == 0:
+ return result.stdout.strip()
+
+ result = subprocess.run(['cat', '/sys/class/net/wlan0/address'],
+ capture_output=True, text=True)
+ if result.returncode == 0:
+ return result.stdout.strip()
+ except Exception as e:
+ print(f"Error getting MAC address: {e}")
+
+ # Fallback to UUID if MAC can't be determined
+ return str(uuid.uuid1())
+
+
+def get_ip_address():
+ """Get the IP address of this device"""
+ try:
+ # Connect to external host to determine local IP
+ s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+ s.connect(("8.8.8.8", 80))
+ ip = s.getsockname()[0]
+ s.close()
+ return ip
+ except Exception:
+ return "unknown"
+
+
+def setup_display():
+ """Setup the MiniPiTFT display if available"""
+ if not DISPLAY_AVAILABLE:
+ return None, None, None, None, None
+
+ try:
+ # Configuration for CS and DC pins
+ # Use GPIO5 instead of CE0 to avoid SPI conflicts
+ cs_pin = digitalio.DigitalInOut(board.D5) # GPIO5 (PIN 29)
+ dc_pin = digitalio.DigitalInOut(board.D25) # GPIO25 (PIN 22)
+ reset_pin = None
+
+ # Config for display baudrate
+ BAUDRATE = 64000000
+
+ backlight = digitalio.DigitalInOut(board.D22)
+ backlight.switch_to_output()
+ backlight.value = True
+
+ # Buttons with pull-ups (active LOW when pressed)
+ buttonA = digitalio.DigitalInOut(board.D23)
+ buttonB = digitalio.DigitalInOut(board.D24)
+ buttonA.switch_to_input(pull=digitalio.Pull.UP)
+ buttonB.switch_to_input(pull=digitalio.Pull.UP)
+
+ # Setup SPI bus using hardware SPI
+ spi = board.SPI()
+
+ # Create the ST7789 display
+ disp = st7789.ST7789(
+ spi,
+ cs=cs_pin,
+ dc=dc_pin,
+ rst=reset_pin,
+ baudrate=BAUDRATE,
+ width=135,
+ height=240,
+ x_offset=53,
+ y_offset=40,
+ rotation=90 # Rotate 90 degrees for landscape orientation
+ )
+
+ # After rotation, width and height are swapped
+ width = 240
+ height = 135
+ image = Image.new("RGB", (width, height))
+ draw = ImageDraw.Draw(image)
+
+ print("[OK] Display initialized (240x135 rotated)")
+
+ return disp, draw, image, buttonA, buttonB
+ except Exception as e:
+ print(f"Error setting up display: {e}")
+ return None, None, None, None, None
+
+
+def on_connect(client, userdata, flags, rc):
+ """Callback when connected to MQTT broker"""
+ if rc == 0:
+ print(f"[OK] Connected to MQTT broker: {MQTT_BROKER}")
+ else:
+ print(f"[ERROR] Connection failed with code {rc}")
+
+
+def main():
+ print("=" * 50)
+ print(" Collaborative Pixel Grid - Pi Publisher")
+ print("=" * 50)
+
+ # Get device identifiers
+ mac_address = get_mac_address()
+ ip_address = get_ip_address()
+
+ print(f"MAC Address: {mac_address}")
+ print(f"IP Address: {ip_address}")
+ print(f"MQTT Topic: {MQTT_TOPIC}")
+ print()
+
+ # Setup display (if available)
+ disp, draw, image, buttonA, buttonB = setup_display()
+
+ # Setup I2C and color sensor
+ print("Initializing color sensor...")
+ i2c = busio.I2C(board.SCL, board.SDA)
+ sensor = adafruit_apds9960.apds9960.APDS9960(i2c)
+ sensor.enable_color = True
+
+ # Adjust integration time and gain for better color detection
+ # Lower gain = better for bright colors, higher gain = better for dim colors
+ try:
+ sensor.color_gain = 1 # Try 1x gain first (options: 1, 4, 16, 64)
+ sensor.integration_time = 10 # milliseconds (range: 2.78 - 712ms)
+ print("[OK] Color sensor ready (gain=1x, integration=10ms)")
+ except Exception:
+ print("[OK] Color sensor ready (default settings)")
+
+ # Setup MQTT client
+ print("Connecting to MQTT broker...")
+ client = mqtt.Client(str(uuid.uuid1()))
+ # Remove TLS for non-encrypted connection
+ # client.tls_set(cert_reqs=ssl.CERT_NONE)
+ client.username_pw_set(MQTT_USERNAME, MQTT_PASSWORD)
+ client.on_connect = on_connect
+
+ try:
+ client.connect(MQTT_BROKER, port=MQTT_PORT, keepalive=60)
+ client.loop_start()
+ # Wait a bit for connection to establish
+ time.sleep(2)
+
+ # Check if connected
+ if client.is_connected():
+ print(f"[OK] MQTT connected and ready")
+ else:
+ print("[WARNING] MQTT connection pending...")
+ except Exception as e:
+ print(f"[ERROR] Failed to connect to MQTT broker: {e}")
+ return
+
+ # Graceful exit handler
+ def signal_handler(signum, frame):
+ print("\nShutting down gracefully...")
+ client.loop_stop()
+ client.disconnect()
+ exit(0)
+
+ signal.signal(signal.SIGINT, signal_handler)
+
+ print("\n" + "=" * 50)
+ print("Streaming color data to pixel grid...")
+ print(f"Update frequency: {PUBLISH_INTERVAL}s ({1/PUBLISH_INTERVAL:.1f} updates/sec)")
+ print("Press Ctrl+C to exit")
+ print("=" * 50 + "\n")
+
+ last_publish_time = 0
+
+ # Main loop
+ while True:
+ try:
+ # Read color sensor
+ r, g, b, a = sensor.color_data
+
+ # Color boost - APDS9960 sensors need calibration
+ r = int(r * 1.2) # Boost red by 20% for better yellows/oranges
+ g = int(g * 1.2) # Boost green by 20% for better yellows
+ b = int(b * 1.7) # Boost blue by 70% (blue is most underreported)
+
+ # Convert from 16-bit to 8-bit color with better scaling
+ # Use a different normalization approach
+ if r > 0 or g > 0 or b > 0:
+ # Find max value to scale proportionally
+ max_val = max(r, g, b)
+ if max_val > 0:
+ # Scale to 8-bit range while preserving ratios
+ scale = 255.0 / max_val
+ r = int(min(255, r * scale))
+ g = int(min(255, g * scale))
+ b = int(min(255, b * scale))
+ else:
+ r = g = b = 0
+ else:
+ r = g = b = 0
+
+ # Create JSON payload for display and MQTT
+ payload = json.dumps({
+ 'mac': mac_address,
+ 'ip': ip_address,
+ 'r': r,
+ 'g': g,
+ 'b': b,
+ 'timestamp': int(time.time())
+ }, indent=2)
+
+ # Update display if available - show color + payload
+ if draw and image and disp:
+ # Fill entire screen with the sensor color
+ draw.rectangle((0, 0, image.width, image.height), fill=(r, g, b))
+
+ # Add payload text overlay
+ try:
+ # Use larger font - try truetype, fall back to default
+ try:
+ font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf", 14)
+ except Exception:
+ font = ImageFont.load_default()
+
+ # Choose text color based on background brightness
+ text_color = (255, 255, 255) if (r + g + b) < 384 else (0, 0, 0)
+
+ # Display payload on screen with line breaks
+ y_offset = 5
+ for line in payload.split('\n'):
+ draw.text((5, y_offset), line, font=font, fill=text_color)
+ y_offset += 16 # Increased spacing for larger font
+ except Exception as e:
+ pass
+
+ disp.image(image)
+
+ # Publish to MQTT at specified interval
+ current_time = time.time()
+ if current_time - last_publish_time >= PUBLISH_INTERVAL:
+ # Re-create compact payload for MQTT (without indentation)
+ mqtt_payload = json.dumps({
+ 'mac': mac_address,
+ 'ip': ip_address,
+ 'r': r,
+ 'g': g,
+ 'b': b,
+ 'timestamp': int(current_time)
+ })
+
+ # Publish to MQTT
+ result = client.publish(MQTT_TOPIC, mqtt_payload)
+
+ if result.rc == mqtt.MQTT_ERR_SUCCESS:
+ print(f"[OK] Streaming: RGB({r:3d}, {g:3d}, {b:3d}) | {mac_address[:17]} | rc:{result.rc} mid:{result.mid}")
+ print(f" Payload: {mqtt_payload}")
+ else:
+ print(f"[ERROR] Publish failed: rc={result.rc}")
+ if not client.is_connected():
+ print("[ERROR] MQTT client disconnected! Attempting to reconnect...")
+ try:
+ client.reconnect()
+ except Exception as e:
+ print(f"[ERROR] Reconnect failed: {e}")
+
+ last_publish_time = current_time
+
+ time.sleep(0.1) # Small delay to prevent CPU spinning
+
+ except Exception as e:
+ print(f"Error in main loop: {e}")
+ time.sleep(1)
+
+
+if __name__ == '__main__':
+ main()
diff --git a/Lab 6/requirements-pi.txt b/Lab 6/requirements-pi.txt
new file mode 100644
index 0000000000..e8530f4506
--- /dev/null
+++ b/Lab 6/requirements-pi.txt
@@ -0,0 +1,25 @@
+# Requirements for Raspberry Pi Publishers
+# Install these on your Pi with: pip install -r requirements-pi.txt
+
+# Raspberry Pi Hardware Libraries
+Adafruit-Blinka>=8.67.0
+adafruit-circuitpython-apds9960>=3.1.9
+adafruit-circuitpython-busdevice>=5.2.6
+adafruit-circuitpython-mpr121>=2.1.19
+adafruit-circuitpython-register>=1.9.17
+adafruit-circuitpython-rgb-display>=3.12.0
+Adafruit-PlatformDetect>=3.84.1
+Adafruit-PureIO>=1.1.11
+RPi.GPIO>=0.7.1
+rpi-ws281x>=5.0.0
+pyftdi>=0.52.9
+pyserial>=3.5
+pyusb>=1.2.1
+sysv-ipc>=1.1.0
+lgpio>=0.2.2.0
+
+# MQTT Communication
+paho-mqtt<2.0
+
+# Image Processing (for display)
+Pillow>=10.0.0
diff --git a/Lab 6/requirements-server.txt b/Lab 6/requirements-server.txt
new file mode 100644
index 0000000000..bd54c88dcd
--- /dev/null
+++ b/Lab 6/requirements-server.txt
@@ -0,0 +1,12 @@
+# Requirements for Server (laptop/desktop)
+# Install these on your server with: pip install -r requirements-server.txt
+
+# MQTT Communication
+paho-mqtt<2.0
+
+# Web Server
+Flask>=3.0.0
+Flask-SocketIO>=5.3.5
+python-socketio>=5.10.0
+python-engineio>=4.8.0
+eventlet>=0.33.3
diff --git a/Lab 6/requirements.txt b/Lab 6/requirements.txt
new file mode 100644
index 0000000000..a10457951a
--- /dev/null
+++ b/Lab 6/requirements.txt
@@ -0,0 +1,30 @@
+# Raspberry Pi Hardware Libraries
+Adafruit-Blinka>=8.67.0
+adafruit-circuitpython-apds9960>=3.1.9
+adafruit-circuitpython-busdevice>=5.2.6
+adafruit-circuitpython-mpr121>=2.1.19
+adafruit-circuitpython-register>=1.9.17
+adafruit-circuitpython-rgb-display>=3.12.0
+Adafruit-PlatformDetect>=3.84.1
+Adafruit-PureIO>=1.1.11
+RPi.GPIO>=0.7.1
+rpi-ws281x>=5.0.0
+pyftdi>=0.52.9
+pyserial>=3.5
+pyusb>=1.2.1
+sysv-ipc>=1.1.0
+lgpio>=0.2.2.0
+
+# MQTT Communication
+paho-mqtt<2.0
+
+# Image Processing
+Pillow>=10.0.0
+
+# Web Server (for pixel grid server)
+Flask>=3.0.0
+Flask-SocketIO>=5.3.5
+python-socketio>=5.10.0
+python-engineio>=4.8.0
+eventlet>=0.33.3
+
diff --git a/Lab 6/rpi5_fastvlm_to_sdxl.py b/Lab 6/rpi5_fastvlm_to_sdxl.py
new file mode 100644
index 0000000000..ee6e3d927b
--- /dev/null
+++ b/Lab 6/rpi5_fastvlm_to_sdxl.py
@@ -0,0 +1,675 @@
+#!/usr/bin/env python3
+# Full Raspberry Pi 5 client:
+# - Captures camera frames
+# - Runs FastVLM (local or --online via VLT_ONLINE_BASE) with a *normal* descriptive prompt
+# - Prints the FULL FastVLM caption
+# - Transforms that caption using a local Qwen 0.5B instruct model via Ollama
+# (style depends on selected mode 1–5)
+# - Calls SDXL /generate (preempt) with the *transformed* caption as prompt
+# - Subscribes to MQTT frames (WSS or TCP) and displays on PiTFT (if available)
+# - Supports 5 style modes, switchable via PiTFT buttons:
+# Mode 1: Normal (no stylistic change, just clean rewrite)
+# Mode 2: Anime comics style
+# Mode 3: Medieval age style
+# Mode 4: Pixel RPG style
+# Mode 5: Cyberpunk (super modern technology) style
+#
+# Buttons:
+# - "Up" button: previous mode (wraps 1 <- 5)
+# - "Down" button: next mode (wraps 5 -> 1)
+#
+# Qwen / Ollama config (env, with defaults):
+# OLLAMA_BASE = http://127.0.0.1:11434
+# OLLAMA_MODEL = qwen:0.5b
+
+import os, sys, time, json, uuid, mimetypes, threading, io, base64, textwrap, subprocess
+from pathlib import Path
+from typing import Optional, Tuple
+import urllib.request, urllib.error
+
+if hasattr(sys.stdout, "reconfigure"):
+ sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+ sys.stderr.reconfigure(encoding="utf-8", errors="replace")
+
+# -------------------- Config / env --------------------
+ONLINE_MODE = any(a.lower() in ("online", "--online", "-o") for a in sys.argv[1:])
+VLT_ONLINE_BASE = os.environ.get("VLT_ONLINE_BASE", "https://lamp-hint-documents-shortly.trycloudflare.com")
+
+VLM_PORT = int(os.environ.get("VLM_PORT", "17860"))
+SERVER_BASE = f"http://127.0.0.1:{VLM_PORT}" # local FastVLM server if not ONLINE_MODE
+
+# SDXL API base (prefer Cloudflare URL via SDXL_BASE)
+# SDXL_BASE = os.environ.get("SDXL_BASE")
+SDXL_BASE = os.environ.get("SDXL_BASE", "https://academic-connectors-quizzes-hip.trycloudflare.com")
+# if not SDXL_BASE:
+# SDXL_HOST = os.environ.get("SDXL_HOST", "192.168.1.100").strip()
+# SDXL_PORT = int(os.environ.get("SDXL_PORT", "7985"))
+# SDXL_BASE = f"http://{SDXL_HOST}:{SDXL_PORT}"
+
+# MQTT (subscribe) — can be mqtt://host:1883 or wss://host[:port]/path via MQTT_URL
+VIDEO_UID = os.environ.get("VIDEO_UID", "rpi5-one").strip() # unique per device
+MQTT_URL = os.environ.get("MQTT_URL", "wss://cpu-databases-trees-andreas.trycloudflare.com/") # e.g. wss://cpu-...trycloudflare.com/ (path "/" by default)
+MQTT_HOST = os.environ.get("MQTT_HOST", "127.0.0.1")
+MQTT_PORT = int(os.environ.get("MQTT_PORT", "1883"))
+MQTT_TOPIC = os.environ.get("MQTT_TOPIC", f"sdxl/frames/{VIDEO_UID}")
+
+# Camera + cadence
+CAPTURE_EVERY = float(os.environ.get("ACTORS_INTERVAL", "2.0")) # seconds between prompts
+FRAME_W = int(os.environ.get("VLT_W", "640"))
+FRAME_H = int(os.environ.get("VLT_H", "480"))
+CAM_INDEX = int(os.environ.get("VLT_CAM_INDEX", "0"))
+
+# Base prompt to “describe any actors” — ALWAYS used for FastVLM (no style here)
+BASE_ACTORS_PROMPT = os.environ.get(
+ "ACTORS_PROMPT",
+ "Describe what you see in detail. "
+ "Include count, clothing, poses, gaze, and emotions."
+)
+
+# -------------------- Ollama / Qwen config --------------------
+OLLAMA_BASE = os.environ.get("OLLAMA_BASE", "http://127.0.0.1:11434").rstrip("/")
+OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "qwen2.5:0.5b-instruct")
+OLLAMA_TIMEOUT = float(os.environ.get("OLLAMA_TIMEOUT", "30.0"))
+
+# -------------------- Mode definitions --------------------
+# 5 modes:
+# 1 = normal (neutral rewrite)
+# 2 = anime comics style
+# 3 = medieval age style
+# 4 = pixel RPG style
+# 5 = cyberpunk style
+#
+# These instructions are given to Qwen to *rewrite* the FastVLM caption.
+MODE_DEFS = [
+ {
+ "name": "Normal",
+ "instruction": (
+ "You will receive a plain description of a scene. "
+ "Rewrite it in clear, natural English without changing the meaning or adding new details. "
+ "Keep the tone neutral and descriptive.\n"
+ ),
+ },
+ {
+ "name": "Anime comics",
+ "instruction": (
+ "You are a writer of anime comic panels. "
+ "Rewrite the scene description as if it were an anime comic caption. "
+ "Use energetic anime style language, dynamic action, sound effects, and stylized expressions. "
+ "Keep all factual details (counts, clothing, poses) but make it feel like a page from a manga.\n"
+ ),
+ },
+ {
+ "name": "Medieval age",
+ "instruction": (
+ "You are a medieval bard describing a scene from a fantasy tale. "
+ "Rewrite the scene description in a medieval style, with language that evokes knights, peasants, "
+ "taverns, castles, scrolls, and old legends. "
+ "Keep the factual details but present them as if told in the medieval ages.\n"
+ ),
+ },
+ {
+ "name": "Pixel RPG",
+ "instruction": (
+ "You are narrating a retro pixel-art RPG game. "
+ "Rewrite the scene description as if it were describing a pixel RPG screen. "
+ "Mention sprites, tiles, stats, inventory, and 16-bit game vibes where appropriate. "
+ "Keep the facts, but frame them as a game scene.\n"
+ ),
+ },
+ {
+ "name": "Cyberpunk",
+ "instruction": (
+ "You are narrating a cyberpunk scene in a high-tech neon future. "
+ "Rewrite the scene description with references to neon lights, holograms, chrome, implants, "
+ "augmented reality, and futuristic cityscapes. "
+ "Keep all factual details but present them in a cyberpunk style.\n"
+ ),
+ },
+]
+
+# Index into MODE_DEFS (0-based; 0 = mode 1, ..., 4 = mode 5)
+# Start in cyberpunk mode if you like; adjust as desired.
+CURRENT_MODE_INDEX = 4
+
+# Paths / logs
+PROJECT_DIR = Path(__file__).resolve().parent
+LOG_DIR = PROJECT_DIR / "vlt_logs"
+IMG_DIR = LOG_DIR / "images"
+LOG_DIR.mkdir(parents=True, exist_ok=True)
+IMG_DIR.mkdir(parents=True, exist_ok=True)
+
+def log(msg: str) -> None:
+ print(f"[{time.strftime('%H:%M:%S')}] {msg}", flush=True)
+
+def get_mode_label() -> str:
+ """Human-readable label for the current mode."""
+ idx = CURRENT_MODE_INDEX
+ mode = MODE_DEFS[idx]
+ return f"Mode {idx+1}: {mode['name']}"
+
+def get_mode_prompt(index: Optional[int] = None) -> str:
+ """
+ Return the instruction text for a given mode index (or current mode if index is None).
+ This is fed to Qwen to control the rewrite style.
+ """
+ if index is None:
+ index = CURRENT_MODE_INDEX
+ return MODE_DEFS[index]["instruction"]
+
+# Try to stop PiTFT boot screen (if present)
+def preempt_boot_screen() -> None:
+ try:
+ subprocess.run(
+ ["sudo", "-n", "systemctl", "stop", "pitft-boot-screen.service"],
+ check=False,
+ stdout=subprocess.DEVNULL,
+ stderr=subprocess.DEVNULL,
+ )
+ except Exception as e:
+ log(f"systemctl stop attempt error (ignored): {e}")
+ try:
+ subprocess.run(["pkill", "-TERM", "-f", "python.*screen_boot_script.py"], check=False)
+ time.sleep(0.8)
+ subprocess.run(["pkill", "-KILL", "-f", "python.*screen_boot_script.py"], check=False)
+ except Exception as e:
+ log(f"pkill fallback error (ignored): {e}")
+
+preempt_boot_screen()
+
+# -------------------- HTTP helpers --------------------
+def _http_json(method: str, url: str, payload: Optional[dict]=None, timeout: float=10.0) -> Tuple[int, dict]:
+ data = None
+ if payload is not None:
+ data = json.dumps(payload).encode("utf-8")
+ req = urllib.request.Request(url, data=data, method=method)
+ req.add_header("Content-Type", "application/json; charset=utf-8")
+ req.add_header("Accept", "application/json")
+ try:
+ with urllib.request.urlopen(req, timeout=timeout) as resp:
+ status = resp.getcode()
+ body = resp.read().decode("utf-8", "ignore")
+ return status, (json.loads(body) if body else {})
+ except urllib.error.HTTPError as e:
+ body = e.read().decode("utf-8", "ignore")
+ try:
+ return e.code, (json.loads(body) if body else {"ok": False})
+ except Exception:
+ return e.code, {"ok": False, "error": body or str(e)}
+ except urllib.error.URLError as e:
+ raise RuntimeError(f"HTTP error to {url}: {e}")
+
+def _http_multipart(url: str, fields: dict, files: dict, timeout: float = 60.0) -> Tuple[int, dict]:
+ boundary = "----VLTBoundary" + uuid.uuid4().hex
+ CRLF = b"\r\n"
+ body = bytearray()
+ for name, value in (fields or {}).items():
+ body.extend(b"--" + boundary.encode("ascii") + CRLF)
+ body.extend(f'Content-Disposition: form-data; name="{name}"'.encode("utf-8") + CRLF + CRLF)
+ body.extend(str(value).encode("utf-8") + CRLF)
+ for name, (filename, blob, ctype) in (files or {}).items():
+ ctype = ctype or mimetypes.guess_type(filename)[0] or "application/octet-stream"
+ body.extend(b"--" + boundary.encode("ascii") + CRLF)
+ headers = (
+ f'Content-Disposition: form-data; name="{name}"; filename="{os.path.basename(filename)}"{CRLF.decode()}'
+ f"Content-Type: {ctype}{CRLF.decode()}"
+ ).encode("utf-8")
+ body.extend(headers + CRLF)
+ body.extend(blob + CRLF)
+ body.extend(b"--" + boundary.encode("ascii") + b"--" + CRLF)
+ req = urllib.request.Request(url, data=bytes(body), method="POST")
+ req.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
+ req.add_header("Accept", "application/json")
+ try:
+ with urllib.request.urlopen(req, timeout=timeout) as resp:
+ status = resp.getcode()
+ body_text = resp.read().decode("utf-8", "ignore")
+ try:
+ return status, (json.loads(body_text) if body_text else {})
+ except Exception:
+ return status, {"ok": False, "error": "invalid json"}
+ except urllib.error.HTTPError as e:
+ body_text = e.read().decode("utf-8", "ignore")
+ try:
+ return e.code, (json.loads(body_text) if body_text else {"ok": False})
+ except Exception:
+ return e.code, {"ok": False, "error": body_text or str(e)}
+ except urllib.error.URLError as e:
+ raise RuntimeError(f"HTTP error to {url}: {e}")
+
+# -------------------- FastVLM (local or online) --------------------
+def fastvlm_infer_local(path: Path, prompt: str, max_new_tokens: int = 96, timeout_s: float = 60.0) -> str:
+ payload = {"image": str(path), "prompt": prompt, "max_new_tokens": max_new_tokens}
+ status, body = _http_json("POST", f"{SERVER_BASE}/infer", payload, timeout=timeout_s)
+ if status == 200 and body.get("ok"):
+ return (body.get("text") or "").strip()
+ raise RuntimeError(f"FastVLM local error {status}: {body}")
+
+def fastvlm_infer_online(path: Path, prompt: str, max_new_tokens: int = 128, timeout_s: float = 60.0) -> str:
+ with open(path, "rb") as f:
+ blob = f.read()
+ fields = {"prompt": prompt, "max_new_tokens": str(int(max_new_tokens))}
+ files = {"image": (path.name, blob, mimetypes.guess_type(path.name)[0] or "image/jpeg")}
+ status, body = _http_multipart(f"{VLT_ONLINE_BASE}/caption", fields, files, timeout=timeout_s)
+ if status == 200 and isinstance(body, dict) and ("caption" in body):
+ return str(body.get("caption") or "").strip()
+ raise RuntimeError(f"FastVLM online error {status}: {body}")
+
+# -------------------- Ollama: transform caption with Qwen --------------------
+def transform_caption_with_ollama(caption: str, mode_index: int) -> str:
+ """
+ Send the FastVLM caption + mode-specific instruction to a local Qwen via Ollama,
+ and get back a styled caption. On failure, returns the original caption.
+ """
+ caption = (caption or "").strip()
+ if not caption:
+ return caption
+
+ instruction = get_mode_prompt(mode_index)
+ prompt_text = (
+ f"{instruction}\n"
+ "Original scene description:\n"
+ f"{caption}\n\n"
+ "Rewritten description:\n"
+ )
+
+ payload = {
+ "model": OLLAMA_MODEL,
+ "prompt": prompt_text,
+ "stream": False,
+ }
+
+ try:
+ url = f"{OLLAMA_BASE}/api/generate"
+ status, body = _http_json("POST", url, payload, timeout=OLLAMA_TIMEOUT)
+ if status == 200 and isinstance(body, dict):
+ resp = (body.get("response") or "").strip()
+ if resp:
+ return resp
+ else:
+ log("Ollama returned empty response; falling back to original caption.")
+ else:
+ log(f"Ollama error {status}: {body}")
+ except Exception as e:
+ log(f"Ollama transform error: {e}")
+
+ return caption
+
+# -------------------- Camera --------------------
+import cv2 # type: ignore
+
+cap = cv2.VideoCapture(CAM_INDEX, cv2.CAP_V4L2)
+cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
+cap.set(cv2.CAP_PROP_FRAME_WIDTH, FRAME_W)
+cap.set(cv2.CAP_PROP_FRAME_HEIGHT, FRAME_H)
+# Try to keep the driver from buffering lots of old frames so we always get the freshest one
+try:
+ cap.set(cv2.CAP_PROP_BUFFERSIZE, 1)
+except Exception:
+ pass
+if not cap.isOpened():
+ print("FATAL: Could not open webcam.")
+ sys.exit(1)
+
+def save_frame(img_bgr) -> Path:
+ ts = time.strftime("%Y%m%d_%H%M%S")
+ p = IMG_DIR / f"frame_{ts}.jpg"
+ ok = cv2.imwrite(str(p), img_bgr)
+ if not ok:
+ print("WARN: cv2.imwrite returned False")
+ return p
+
+# -------------------- PiTFT display (optional) + buttons --------------------
+DISP_OK = True
+btn_up = None # "Up" button DigitalInOut (if available)
+btn_down = None # "Down" button DigitalInOut (if available)
+
+try:
+ import digitalio # type: ignore
+ import board # type: ignore
+ from PIL import Image, ImageDraw, ImageFont # type: ignore
+ import adafruit_rgb_display.st7789 as st7789 # type: ignore
+
+ cs_pin = digitalio.DigitalInOut(getattr(board, os.environ.get("VLT_CS_PIN", "D5")))
+ dc_pin = digitalio.DigitalInOut(getattr(board, os.environ.get("VLT_DC_PIN", "D25")))
+ reset_pin = None
+ BAUDRATE = 64_000_000
+ spi = board.SPI()
+ disp = st7789.ST7789(
+ spi,
+ cs=cs_pin,
+ dc=dc_pin,
+ rst=reset_pin,
+ baudrate=BAUDRATE,
+ width=135,
+ height=240,
+ x_offset=53,
+ y_offset=40,
+ )
+ height = disp.width
+ width = disp.height
+ rotation = 90
+ image_buf = Image.new("RGB", (width, height))
+ draw = ImageDraw.Draw(image_buf)
+ try:
+ FONT_SMALL = ImageFont.truetype(
+ "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", 18
+ )
+ except Exception:
+ from PIL import ImageFont as _ImageFont
+ FONT_SMALL = _ImageFont.load_default()
+
+ # Button pins: default to BCM 23 (up) and BCM 24 (down) via board.D23 / board.D24
+ # Override via env vars VLT_BTN_UP_PIN / VLT_BTN_DOWN_PIN if needed.
+ btn_up_pin_name = os.environ.get("VLT_BTN_UP_PIN", "D23")
+ btn_down_pin_name = os.environ.get("VLT_BTN_DOWN_PIN", "D24")
+ try:
+ btn_up = digitalio.DigitalInOut(getattr(board, btn_up_pin_name))
+ btn_up.direction = digitalio.Direction.INPUT
+ btn_up.pull = digitalio.Pull.UP # pressed -> False
+
+ btn_down = digitalio.DigitalInOut(getattr(board, btn_down_pin_name))
+ btn_down.direction = digitalio.Direction.INPUT
+ btn_down.pull = digitalio.Pull.UP # pressed -> False
+
+ log(f"Buttons initialized on {btn_up_pin_name} (up), {btn_down_pin_name} (down)")
+ except Exception as e:
+ log(f"Button init failed (continuing without mode buttons): {e}")
+ btn_up = None
+ btn_down = None
+
+except Exception as e:
+ log(f"Display init failed (continuing headless): {e}")
+ DISP_OK = False
+ from PIL import Image, ImageDraw, ImageFont # still needed for MQTT render
+ height, width, rotation = 135, 240, 0
+ image_buf = Image.new("RGB", (width, height))
+ draw = ImageDraw.Draw(image_buf)
+ try:
+ FONT_SMALL = ImageFont.truetype(
+ "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", 18
+ )
+ except Exception:
+ FONT_SMALL = ImageFont.load_default()
+
+def _render_to_buf(img: Image.Image) -> None:
+ img_ratio = img.width / img.height
+ buf_ratio = width / height
+ if buf_ratio < img_ratio:
+ scaled_width, scaled_height = int(height * img_ratio), height
+ else:
+ scaled_width, scaled_height = width, int(width / img_ratio)
+ img = img.resize((scaled_width, scaled_height), Image.BICUBIC)
+ x = (scaled_width - width)//2
+ y = (scaled_height - height)//2
+ img = img.crop((x, y, x + width, y + height))
+ image_buf.paste(img)
+
+def show_text(lines):
+ draw.rectangle((0, 0, width, height), fill=(0, 0, 0))
+ y = 6
+ for ln in lines:
+ draw.text((6, y), ln, font=FONT_SMALL, fill=(255, 255, 255))
+ y += 18
+ if DISP_OK:
+ try:
+ disp.image(image_buf, rotation)
+ except Exception as e:
+ log(f"disp.image failed: {e}")
+
+def show_mode_banner():
+ """Display a small banner showing the current mode and button usage."""
+ lines = [
+ get_mode_label(),
+ "",
+ "UP: previous mode",
+ "DOWN: next mode",
+ ]
+ show_text(lines)
+
+# -------------------- MQTT subscribe to generated frames --------------------
+last_frame_ts = 0.0
+
+def mqtt_thread():
+ from paho.mqtt import client as mqtt # pip install paho-mqtt
+ from urllib.parse import urlparse
+
+ # Parse MQTT_URL if provided (supports mqtt://, ws://, wss://)
+ if MQTT_URL:
+ u = urlparse(MQTT_URL)
+ transport = "websockets" if u.scheme in ("ws", "wss") else "tcp"
+ host = u.hostname or MQTT_HOST
+ port = u.port or (443 if u.scheme == "wss" else 80 if u.scheme == "ws" else MQTT_PORT)
+ path = u.path if (u.path and u.path != "") else "/" # default "/" for mosquitto websockets
+ use_tls = (u.scheme == "wss")
+ else:
+ transport = "tcp"
+ host = MQTT_HOST
+ port = MQTT_PORT
+ path = "/"
+ use_tls = False
+
+ client = mqtt.Client(
+ client_id=f"pi5-subscriber-{uuid.uuid4().hex[:6]}",
+ transport=transport,
+ protocol=mqtt.MQTTv311,
+ callback_api_version=mqtt.CallbackAPIVersion.VERSION2,
+ )
+ if transport == "websockets":
+ client.ws_set_options(path=path)
+ if use_tls:
+ client.tls_set() # use system CAs
+ client.reconnect_delay_set(min_delay=1, max_delay=30)
+
+ def on_connect(client, userdata, flags, reason_code, properties=None):
+ if reason_code == 0:
+ log(f"MQTT connected; subscribing to {MQTT_TOPIC}")
+ client.subscribe(MQTT_TOPIC, qos=0)
+ else:
+ log(f"MQTT connect failed: {reason_code}")
+
+ def on_message(client, userdata, msg):
+ global last_frame_ts
+ try:
+ data = json.loads(msg.payload.decode("utf-8"))
+ b = base64.b64decode(data["b64"])
+ from PIL import Image # local import to avoid PiTFT import issues
+ im = Image.open(io.BytesIO(b)).convert("RGB")
+ _render_to_buf(im)
+ if DISP_OK:
+ disp.image(image_buf, rotation)
+ last_frame_ts = time.time()
+ except Exception as e:
+ log(f"MQTT frame error: {e}")
+
+ client.on_connect = on_connect
+ client.on_message = on_message
+
+ # Robust connect loop
+ while True:
+ try:
+ log(f"MQTT connecting to {host}:{port} ({transport}{path})")
+ client.connect(host, port, keepalive=60)
+ client.loop_forever()
+ except Exception as e:
+ log(f"MQTT connect error to {host}:{port}: {e}; retrying in 3s")
+ time.sleep(3)
+
+# -------------------- Mode button polling thread --------------------
+def mode_button_thread():
+ global CURRENT_MODE_INDEX
+ if (btn_up is None) or (btn_down is None):
+ log("Mode button thread not started (buttons not available).")
+ return
+
+ # Initial states
+ last_up = btn_up.value
+ last_down = btn_down.value
+ log(f"Mode buttons active. Starting at {get_mode_label()}")
+ show_mode_banner()
+
+ while True:
+ try:
+ val_up = btn_up.value
+ val_down = btn_down.value
+
+ # Buttons are wired with pull-ups: pressed => False
+ # Detect falling edges (released -> pressed)
+ if last_up and not val_up:
+ # UP: previous mode (wrap-around)
+ CURRENT_MODE_INDEX = (CURRENT_MODE_INDEX - 1) % len(MODE_DEFS)
+ log(f"Mode changed (UP) -> {get_mode_label()}")
+ show_mode_banner()
+
+ if last_down and not val_down:
+ # DOWN: next mode (wrap-around)
+ CURRENT_MODE_INDEX = (CURRENT_MODE_INDEX + 1) % len(MODE_DEFS)
+ log(f"Mode changed (DOWN) -> {get_mode_label()}")
+ show_mode_banner()
+
+ last_up = val_up
+ last_down = val_down
+ time.sleep(0.05)
+ except Exception as e:
+ log(f"Mode button thread error: {e}")
+ time.sleep(0.5)
+
+# -------------------- Main loop: capture -> FastVLM -> Qwen -> /generate --------------------
+def main_loop():
+ show_text([
+ "Starting...",
+ f"SDXL: {SDXL_BASE}",
+ f"MQTT: {MQTT_URL or (MQTT_HOST+':'+str(MQTT_PORT))}",
+ f"Topic: {MQTT_TOPIC}",
+ get_mode_label(),
+ ])
+ time.sleep(0.5)
+
+ # Optional SDXL health
+ try:
+ st, body = _http_json("GET", f"{SDXL_BASE}/", None, timeout=4.0)
+ if st == 200 and body.get("ok"):
+ log(f"SDXL ready: device={body.get('device')} mqtt={body.get('mqtt')}")
+ except Exception as e:
+ log(f"SDXL health check (ignored): {e}")
+
+ # Strictly sequential loop:
+ # - Capture ONE fresh frame
+ # - FastVLM caption with BASE_ACTORS_PROMPT
+ # - Transform caption via local Qwen (Ollama) using current mode style
+ # - POST /generate to SDXL with transformed caption
+ # - Sleep CAPTURE_EVERY and repeat
+ while True:
+ # Flush any stale frames from the driver buffer so we don't build up a backlog.
+ flush_start = time.time()
+ flushed = 0
+ while True:
+ # grab() throws away a frame without decoding it
+ if not cap.grab():
+ break
+ flushed += 1
+ # Don't spend more than ~100ms flushing
+ if time.time() - flush_start > 0.1:
+ break
+
+ ok, frame = cap.read()
+ if not ok:
+ show_text(["Camera read failed", "Retrying..."])
+ time.sleep(0.25)
+ continue
+ p = save_frame(frame)
+
+ # Snapshot the mode at the time of capture
+ mode_index_for_frame = CURRENT_MODE_INDEX
+ mode_label_for_frame = f"Mode {mode_index_for_frame+1}: {MODE_DEFS[mode_index_for_frame]['name']}"
+
+ try:
+ # 1) Get plain caption from FastVLM using the *normal* base prompt
+ if ONLINE_MODE:
+ desc = fastvlm_infer_online(
+ p,
+ BASE_ACTORS_PROMPT,
+ max_new_tokens=128,
+ timeout_s=60.0,
+ )
+ else:
+ desc = fastvlm_infer_local(
+ p,
+ BASE_ACTORS_PROMPT,
+ max_new_tokens=96,
+ timeout_s=60.0,
+ )
+
+ actors_last = (desc or "").strip()
+
+ # Print FastVLM caption
+ print(
+ f"\n=== FastVLM caption (base) ===\n"
+ + actors_last
+ + "\n=== end FastVLM caption ===\n",
+ flush=True,
+ )
+
+ # 2) Transform caption via Ollama/Qwen using current mode style
+ styled_caption = transform_caption_with_ollama(actors_last, mode_index_for_frame)
+ print(
+ f"\n=== Stylized caption ({mode_label_for_frame}) ===\n"
+ + styled_caption
+ + "\n=== end stylized caption ===\n",
+ flush=True,
+ )
+
+ # 3) Preempt/start video on SDXL with the *stylized* prompt
+ payload = {
+ "prompt": styled_caption or (actors_last or "no actors"),
+ "negative_prompt": None,
+ "steps": 1,
+ "guidance_scale": 0.0,
+ "width": 512,
+ "height": 512,
+ "video": True,
+ "video_uid": VIDEO_UID,
+ "preempt": True,
+ "video_fps": 0.0,
+ "max_frames": None,
+ }
+ st, body = _http_json(
+ "POST",
+ f"{SDXL_BASE}/generate",
+ payload,
+ timeout=6.0,
+ )
+ if st == 200 and body.get("ok"):
+ pre = " (preempted)" if body.get("preempted") else ""
+ log(f"SDXL video started{pre}: uid={body.get('uid')}")
+ else:
+ log(f"SDXL generate error {st}: {body}")
+
+ # If frames haven't started arriving yet, show stylized text on screen
+ if (time.time() - last_frame_ts) > 1.5:
+ lines = [mode_label_for_frame]
+ lines.extend(textwrap.wrap(styled_caption, width=22)[:5])
+ show_text(lines)
+
+ except Exception as e:
+ log(f"Infer/POST error: {e}")
+ # Small backoff on error so we don't hammer the services
+ time.sleep(max(1.0, CAPTURE_EVERY))
+ continue
+
+ # Wait *after* the whole cycle so we never queue up screenshots
+ if CAPTURE_EVERY > 0:
+ time.sleep(CAPTURE_EVERY)
+
+if __name__ == "__main__":
+ # Start MQTT subscriber thread first so you see frames as soon as they come
+ th_mqtt = threading.Thread(target=mqtt_thread, daemon=True)
+ th_mqtt.start()
+
+ # Start mode button polling thread (if buttons available)
+ th_mode = threading.Thread(target=mode_button_thread, daemon=True)
+ th_mode.start()
+
+ try:
+ main_loop()
+ except KeyboardInterrupt:
+ pass
diff --git a/Lab 6/sdxl_turbo_server_mqtt.py b/Lab 6/sdxl_turbo_server_mqtt.py
new file mode 100644
index 0000000000..40c51590bd
--- /dev/null
+++ b/Lab 6/sdxl_turbo_server_mqtt.py
@@ -0,0 +1,344 @@
+# sdxl_turbo_server_mqtt.py
+import os
+import io
+import time
+import uuid
+import json
+import base64
+import threading
+from datetime import datetime
+from typing import Optional, Dict, Any
+
+import torch
+from fastapi import FastAPI
+from pydantic import BaseModel
+from PIL import Image
+from diffusers import AutoPipelineForText2Image
+
+# --- MQTT (optional, enable with MQTT_ENABLE=1) ---
+# --- MQTT (single broker via URL) ---
+MQTT_ENABLE = os.environ.get("MQTT_ENABLE", "1") == "1"
+MQTT_URL = os.environ.get("MQTT_URL", "mqtt://127.0.0.1:1883") # e.g. mqtt://127.0.0.1:1883 or wss://mqtt-yourhost.trycloudflare.com/
+MQTT_TOPIC_BASE = os.environ.get("MQTT_TOPIC_BASE", "sdxl/frames")
+MQTT_QOS = int(os.environ.get("MQTT_QOS", "0"))
+
+mqtt_client = None
+def _build_mqtt_client_from_url(url: str):
+ from urllib.parse import urlparse
+ from paho.mqtt import client as mqtt
+ u = urlparse(url)
+ transport = "websockets" if u.scheme in ("ws","wss") else "tcp"
+ host = u.hostname or "127.0.0.1"
+ port = u.port or (443 if u.scheme=="wss" else 80 if u.scheme=="ws" else 1883)
+ c = mqtt.Client(
+ client_id=f"sdxl-{uuid.uuid4().hex[:6]}",
+ transport=transport,
+ protocol=mqtt.MQTTv311,
+ callback_api_version=mqtt.CallbackAPIVersion.VERSION2,
+ )
+ if transport == "websockets":
+ c.ws_set_options(path=u.path or "/")
+ if u.scheme == "wss":
+ c.tls_set() # use system CAs
+ c.reconnect_delay_set(1, 30)
+ c.connect_async(host, port, 60)
+ c.loop_start()
+ print(f"[MQTT] connecting -> {url}")
+ return c
+
+if MQTT_ENABLE and MQTT_URL:
+ try:
+ mqtt_client = _build_mqtt_client_from_url(MQTT_URL)
+ except Exception as e:
+ print(f"[MQTT] init failed (disabled): {e}")
+ mqtt_client = None
+else:
+ print("[MQTT] disabled or no MQTT_URL; skipping MQTT.")
+
+
+MODEL_ID = "stabilityai/sdxl-turbo"
+print("Loading pipeline...")
+pipe = AutoPipelineForText2Image.from_pretrained(
+ MODEL_ID, torch_dtype=torch.float16, variant="fp16"
+)
+
+if torch.cuda.is_available():
+ pipe.to("cuda")
+
+for fn in ("enable_attention_slicing", "enable_vae_slicing", "enable_xformers_memory_efficient_attention"):
+ try:
+ getattr(pipe, fn)()
+ except Exception:
+ pass
+
+try:
+ torch.backends.cuda.matmul.allow_tf32 = True # type: ignore
+except Exception:
+ pass
+
+os.makedirs("images", exist_ok=True)
+os.makedirs("videos", exist_ok=True)
+
+app = FastAPI(title="SDXL-Turbo Server", version="1.2-mqtt-preempt")
+
+# video_sessions[uid] = {"running": bool, "thread": Thread, "params": dict}
+video_sessions: Dict[str, Dict[str, Any]] = {}
+
+
+class GenerateRequest(BaseModel):
+ prompt: str
+ seed: Optional[int] = None
+ width: Optional[int] = 512
+ height: Optional[int] = 512
+ steps: Optional[int] = 1
+ guidance_scale: Optional[float] = 0.0
+ negative_prompt: Optional[str] = None
+
+ video: Optional[bool] = False
+ video_uid: Optional[str] = None
+ video_fps: Optional[float] = 0.0
+ max_frames: Optional[int] = None
+
+ # NEW: let callers flush/replace any running session with same uid
+ preempt: Optional[bool] = False
+ # Optional per-request override of MQTT topic (rarely needed)
+ mqtt_topic: Optional[str] = None
+
+
+@app.get("/")
+def health():
+ return {
+ "ok": True,
+ "model": MODEL_ID,
+ "device": "cuda" if torch.cuda.is_available() else "cpu",
+ "mqtt": bool(mqtt_client),
+ "video_sessions": list(video_sessions.keys()),
+ }
+
+
+def _safe_dims(v: Optional[int], default: int) -> int:
+ x = int(v or default)
+ x = max(64, min(x, 1024))
+ return x - (x % 8)
+
+
+def _make_image(prompt: str,
+ negative_prompt: Optional[str],
+ steps: int,
+ guidance_scale: float,
+ width: int,
+ height: int,
+ seed: int) -> Image.Image:
+ generator = torch.Generator(device="cuda" if torch.cuda.is_available() else "cpu").manual_seed(seed)
+ with torch.inference_mode():
+ out = pipe(
+ prompt=prompt,
+ negative_prompt=negative_prompt,
+ num_inference_steps=steps,
+ guidance_scale=guidance_scale,
+ width=width,
+ height=height,
+ generator=generator,
+ )
+ return out.images[0]
+
+
+def _video_worker(uid: str, params: Dict[str, Any]) -> None:
+ """Continuously generate frames into videos// and publish over MQTT."""
+ folder = os.path.join("videos", uid)
+ os.makedirs(folder, exist_ok=True)
+
+ prompt = params["prompt"]
+ negative_prompt = params["negative_prompt"]
+ width = params["width"]
+ height = params["height"]
+ steps = params["steps"]
+ guidance_scale = params["guidance_scale"]
+ base_seed = params["seed"]
+ fps = params["video_fps"]
+ max_frames = params["max_frames"]
+ mqtt_topic = params.get("mqtt_topic") or f"{MQTT_TOPIC_BASE}/{uid}"
+
+ frame_idx = 0
+ last_tick = time.perf_counter()
+
+ while video_sessions.get(uid, {}).get("running", False):
+ seed = (base_seed + frame_idx) & 0xFFFFFFFF
+ try:
+ img = _make_image(
+ prompt=prompt,
+ negative_prompt=negative_prompt,
+ steps=steps,
+ guidance_scale=guidance_scale,
+ width=width,
+ height=height,
+ seed=seed,
+ )
+ fname = os.path.join(folder, f"frame_{frame_idx:06d}_seed{seed}_{width}x{height}.png")
+ img.save(fname)
+
+ # --- MQTT publish (JPEG base64 for compactness) ---
+ if mqtt_client is not None:
+ try:
+ buf = io.BytesIO()
+ img.save(buf, format="JPEG", quality=85)
+ b64 = base64.b64encode(buf.getvalue()).decode("ascii")
+ payload = {
+ "uid": uid,
+ "i": frame_idx,
+ "seed": seed,
+ "w": width,
+ "h": height,
+ "ts": time.time(),
+ "b64": b64,
+ }
+ mqtt_client.publish(mqtt_topic, json.dumps(payload), qos=MQTT_QOS, retain=False)
+ except Exception as me:
+ # Don't crash the worker if MQTT fails
+ try:
+ with open(os.path.join(folder, f"mqtt_error_{frame_idx:06d}.txt"), "w", encoding="utf-8") as f:
+ f.write(str(me))
+ except Exception:
+ pass
+
+ except Exception as e:
+ try:
+ with open(os.path.join(folder, f"error_{frame_idx:06d}.txt"), "w", encoding="utf-8") as f:
+ f.write(str(e))
+ except Exception:
+ pass
+
+ frame_idx += 1
+ if max_frames is not None and frame_idx >= max_frames:
+ break
+
+ if fps and fps > 0:
+ target_dt = 1.0 / float(fps)
+ now = time.perf_counter()
+ elapsed = now - last_tick
+ if elapsed < target_dt:
+ time.sleep(target_dt - elapsed)
+ last_tick = time.perf_counter()
+
+ sess = video_sessions.get(uid)
+ if sess:
+ sess["running"] = False
+
+
+@app.post("/generate")
+def generate(req: GenerateRequest):
+ w = _safe_dims(req.width, 512)
+ h = _safe_dims(req.height, 512)
+ steps = max(1, int(req.steps or 1))
+ gs = float(req.guidance_scale if req.guidance_scale is not None else 0.0)
+ seed = int(req.seed) if req.seed is not None else int.from_bytes(os.urandom(2), "little")
+
+ if not req.video:
+ t0 = time.time()
+ try:
+ img = _make_image(
+ prompt=req.prompt,
+ negative_prompt=req.negative_prompt,
+ steps=steps,
+ guidance_scale=gs,
+ width=w,
+ height=h,
+ seed=seed,
+ )
+ ts = datetime.now().strftime("%Y%m%d_%H%M%S_%f")
+ filename = os.path.join("images", f"sdxl_turbo_{ts}_seed{seed}_{w}x{h}.png")
+ img.save(filename)
+ latency = time.time() - t0
+ return {
+ "ok": True,
+ "mode": "single",
+ "path": filename,
+ "seed": seed,
+ "width": w,
+ "height": h,
+ "steps": steps,
+ "guidance_scale": gs,
+ "latency_sec": round(latency, 3),
+ }
+ except Exception as e:
+ return {"ok": False, "error": str(e), "mode": "single"}
+
+ # ---- Video mode ----
+ uid = (req.video_uid or uuid.uuid4().hex[:12].lower()).strip()
+ folder = os.path.join("videos", uid)
+ os.makedirs(folder, exist_ok=True)
+
+ existing = video_sessions.get(uid)
+ preempted = False
+ if existing and existing.get("running") and req.preempt:
+ # Stop the old session and wait briefly
+ existing["running"] = False
+ th = existing.get("thread")
+ if th:
+ try:
+ th.join(timeout=2.0)
+ except Exception:
+ pass
+ preempted = True
+
+ # If a session is running and not preempting, just report it
+ existing = video_sessions.get(uid)
+ if existing and existing.get("running"):
+ return {
+ "ok": True,
+ "mode": "video",
+ "message": "Session already running",
+ "uid": uid,
+ "folder": folder,
+ }
+
+ params = {
+ "prompt": req.prompt,
+ "negative_prompt": req.negative_prompt,
+ "width": w,
+ "height": h,
+ "steps": steps,
+ "guidance_scale": gs,
+ "seed": seed,
+ "video_fps": float(req.video_fps or 0.0),
+ "max_frames": req.max_frames,
+ "mqtt_topic": req.mqtt_topic,
+ }
+
+ video_sessions[uid] = {"running": True, "params": params, "thread": None}
+ th = threading.Thread(target=_video_worker, args=(uid, params), daemon=True)
+ video_sessions[uid]["thread"] = th
+ th.start()
+
+ return {
+ "ok": True,
+ "mode": "video",
+ "uid": uid,
+ "folder": folder,
+ "preempted": preempted,
+ "message": "Video generation started",
+ "params": {
+ "width": w, "height": h, "steps": steps, "guidance_scale": gs,
+ "seed": seed, "video_fps": params["video_fps"], "max_frames": req.max_frames,
+ },
+ }
+
+
+@app.post("/stop_video/{uid}")
+def stop_video(uid: str):
+ sess = video_sessions.get(uid)
+ if not sess:
+ return {"ok": False, "message": f"No session found for uid={uid}"}
+ sess["running"] = False
+ th = sess.get("thread")
+ if th:
+ try:
+ th.join(timeout=2.0)
+ except Exception:
+ pass
+ return {"ok": True, "message": f"Stopped uid={uid}"}
+
+
+if __name__ == "__main__":
+ import uvicorn
+ uvicorn.run(app, host="0.0.0.0", port=7985, reload=False, workers=1)
diff --git a/Lab 6/server/README.md b/Lab 6/server/README.md
new file mode 100644
index 0000000000..1fe22c6371
--- /dev/null
+++ b/Lab 6/server/README.md
@@ -0,0 +1,49 @@
+# Server Configuration Files
+
+This folder contains systemd service files for running the Lab 6 servers on `farlab.infosci.cornell.edu`.
+
+**Students do not need these files** - they are only for server deployment.
+
+## Services
+
+### pixel_grid.service
+Runs the collaborative pixel grid web server on port 5000.
+
+**Install:**
+```bash
+sudo cp pixel_grid.service /etc/systemd/system/
+sudo systemctl daemon-reload
+sudo systemctl enable pixel_grid.service
+sudo systemctl start pixel_grid.service
+```
+
+### mqtt_viewer.service
+Runs the MQTT message viewer debugging tool on port 5001.
+
+**Install:**
+```bash
+sudo cp mqtt_viewer.service /etc/systemd/system/
+sudo systemctl daemon-reload
+sudo systemctl enable mqtt_viewer.service
+sudo systemctl start mqtt_viewer.service
+```
+
+## Management Commands
+
+```bash
+# Check status
+sudo systemctl status pixel_grid.service
+sudo systemctl status mqtt_viewer.service
+
+# View logs
+sudo journalctl -u pixel_grid.service -n 50
+sudo journalctl -u mqtt_viewer.service -n 50
+
+# Restart
+sudo systemctl restart pixel_grid.service
+sudo systemctl restart mqtt_viewer.service
+
+# Stop
+sudo systemctl stop pixel_grid.service
+sudo systemctl stop mqtt_viewer.service
+```
diff --git a/Lab 6/server/mqtt_viewer.service b/Lab 6/server/mqtt_viewer.service
new file mode 100644
index 0000000000..c3c079300e
--- /dev/null
+++ b/Lab 6/server/mqtt_viewer.service
@@ -0,0 +1,18 @@
+[Unit]
+Description=MQTT Viewer - Interactive Device Design Lab 6
+After=network.target
+
+[Service]
+Type=simple
+User=farlab
+WorkingDirectory=/home/farlab/Interactive-Lab-Hub/Lab 6
+Environment="PATH=/home/farlab/Interactive-Lab-Hub/.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
+ExecStart=/home/farlab/Interactive-Lab-Hub/.venv/bin/python3 mqtt_viewer.py
+Restart=always
+RestartSec=10
+StandardOutput=journal
+StandardError=journal
+SyslogIdentifier=mqtt_viewer
+
+[Install]
+WantedBy=multi-user.target
diff --git a/Lab 6/server/pixel_grid.service b/Lab 6/server/pixel_grid.service
new file mode 100644
index 0000000000..da9442b18d
--- /dev/null
+++ b/Lab 6/server/pixel_grid.service
@@ -0,0 +1,18 @@
+[Unit]
+Description=Pixel Grid Server - Interactive Device Design Lab 6
+After=network.target
+
+[Service]
+Type=simple
+User=farlab
+WorkingDirectory=/home/farlab/Interactive-Lab-Hub/Lab 6
+Environment="PATH=/home/farlab/Interactive-Lab-Hub/.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
+ExecStart=/home/farlab/Interactive-Lab-Hub/.venv/bin/python3 app.py
+Restart=always
+RestartSec=10
+StandardOutput=journal
+StandardError=journal
+SyslogIdentifier=pixel_grid
+
+[Install]
+WantedBy=multi-user.target
diff --git a/Lab 6/templates/controller.html b/Lab 6/templates/controller.html
new file mode 100644
index 0000000000..23f96c4677
--- /dev/null
+++ b/Lab 6/templates/controller.html
@@ -0,0 +1,286 @@
+<!-- Visible text of the Pixel Controller page: -->
+<!-- 🎨 Pixel Controller -->
+<!-- MAC: Generating... -->
+<!-- RGB(128, 128, 128) -->
+<!-- Connecting... -->
+<!-- 💡 Tip: Open multiple tabs to simulate multiple Pis! Each tab gets a unique MAC address. -->