Hacking Windows APIs to build a “Cheat Code” overlay invisible to screenshare.

🏆 Winner of the AISE Hackathon at DA-IICT


In the high-stakes world of technical sales, you often have seconds to answer a difficult question about a competitor or pricing. You have your notes, your battlecards, and your CRM, but you can’t look at them because you’re sharing your screen.

We asked ourselves: What if you could paint a UI on your screen that YOU see, but the screen-sharing software DOESN’T?

And what if that UI was powered by a local AI that answered questions in real-time?

This is the engineering story behind Meeting Monitor AI.

The Full Tech Stack

We built a hybrid architecture: a Python intelligence engine running local AI models (for privacy and speed) connected to a modern React dashboard.

| Layer | Technology | Purpose |
| --- | --- | --- |
| Stealth UI | PyQt6 + Windows Safe APIs | Desktop overlay hidden from screen capture (Zoom/Teams). |
| Backend API | FastAPI + WebSockets | High-performance async API for real-time streams. |
| Audio Pipeline | WhisperX (CTranslate2) | GPU-accelerated transcription (<200ms latency). |
| Diarization | pyannote.audio | Distinguishing [SALES_REP] vs [CLIENT]. |
| Intelligence | Gemini AI + Ollama | LLM reasoning for hints and summaries. |
| NER | GLiNER | Real-time Entity Extraction (Companies, Products). |
| Vision | DeepFace | Real-time sentiment analysis via webcam. |
| Frontend | React 19 + Vite | Management dashboard for history and analytics. |
| Styling | Tailwind CSS + Radix UI | Accessible, headless UI components. |
| Database | SQLite + SQLAlchemy | Local persistence for all meeting data. |
| CRM Sync | Odoo API | Automated lead creation & qualification. |
| Browsing | Trafilatura + Playwright | Advanced web scraping for competitor research. |

1. The “Invisible” UI: Hacking Windows Display Affinity

The “Killer Feature” is the Stealth Mode. We wanted the overlay to hover over the Zoom window but remain invisible to the client.

We achieved this using the Windows SetWindowDisplayAffinity API. This tells the Desktop Window Manager (DWM) to render our window on the physical display but exclude it from anything that captures the screen.

The Python Implementation

Using ctypes to call User32.dll directly from Python:

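import ctypes
from ctypes import wintypes

# WDA_EXCLUDEFROMCAPTURE (0x00000011) - The Magic Constant
# Supported on Windows 10 Version 2004+ (Build 19041+)
WDA_NONE = 0x00000000
WDA_EXCLUDEFROMCAPTURE = 0x00000011

def set_stealth_mode(window_handle, enabled: bool) -> bool:
    """Hide (or re-show) the window in screen captures. Returns True on success."""
    try:
        # use_last_error lets us read the Win32 error code if the call fails
        user32 = ctypes.WinDLL("user32", use_last_error=True)
        user32.SetWindowDisplayAffinity.argtypes = [wintypes.HWND, wintypes.DWORD]
        user32.SetWindowDisplayAffinity.restype = wintypes.BOOL
        # Toggle affinity: 0x11 for stealth, 0x00 for visible
        affinity = WDA_EXCLUDEFROMCAPTURE if enabled else WDA_NONE
        # The API reports failure by returning 0; it does not raise on its own
        if not user32.SetWindowDisplayAffinity(int(window_handle), affinity):
            raise ctypes.WinError(ctypes.get_last_error())
        return True
    except Exception as e:
        print(f"Stealth Error: {e}")
        return False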

When enabled, you see the battlecards, but your client sees… nothing. The capture shows whatever sits behind the overlay, such as the Zoom window or your desktop wallpaper.
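Because the exclude-from-capture flag needs Windows 10 build 19041 or later, it may be worth choosing the flag based on the running build. The sketch below is one way to do that, not the project's actual code: `pick_affinity` is a hypothetical helper, and the fallback to `WDA_MONITOR` (where captures show the window without content, typically as a black rectangle) is an assumption about what an acceptable degraded mode looks like.

```python
# Display-affinity values from the Win32 API
WDA_NONE = 0x00000000
WDA_MONITOR = 0x00000001             # captures show the window with no content
WDA_EXCLUDEFROMCAPTURE = 0x00000011  # captures omit the window entirely
MIN_EXCLUDE_BUILD = 19041            # Windows 10 version 2004


def pick_affinity(windows_build: int, enabled: bool) -> int:
    """Choose an affinity flag for the given Windows build (hypothetical helper)."""
    if not enabled:
        return WDA_NONE
    if windows_build >= MIN_EXCLUDE_BUILD:
        return WDA_EXCLUDEFROMCAPTURE
    # Assumed fallback: a blanked window is better than a leaked one
    return WDA_MONITOR
```

On Windows the build number can be read from `sys.getwindowsversion().build`, and in a PyQt6 app the window handle would typically come from `int(widget.winId())`.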


2. The Acoustic Engine: Local GPU Pipeline

Cloud APIs were too slow (3-5s latency). We moved the stack to the user’s local NVIDIA GPU using WhisperX.

Why WhisperX?

Standard Whisper is great but slow. WhisperX uses CTranslate2 (quantized execution) and VAD (Voice Activity Detection) batching to achieve 5-10x real-time speed. We also integrated pyannote.audio for speaker diarization.

The Pipeline (service.py)

  1. Audio Capture: sounddevice grabs the WASAPI Loopback (Stereo Mix).
  2. Streaming: Audio chunks sent via WebSockets to the backend.
  3. Inference:
    • Transcribe: WhisperX large-v2 (float16).
    • Align: Forced alignment for word-level timestamps.
    • Diarize: Speaker clustering.

Real-world performance on an RTX 3070:

  • Audio Chunk: 10 seconds
  • Processing Time: 0.8 seconds
  • Latency: ~1.5s total
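
The chunking step can be illustrated with a small buffer that collects streamed bytes and releases fixed-length chunks for inference. This is a sketch under stated assumptions, not the contents of service.py: `ChunkBuffer` and `realtime_speedup` are hypothetical names, and the audio format (16 kHz, mono, 16-bit PCM) is assumed because the text does not specify it.

```python
SAMPLE_RATE = 16_000   # assumed: 16 kHz mono
BYTES_PER_SAMPLE = 2   # assumed: 16-bit PCM
CHUNK_SECONDS = 10     # chunk length quoted in the text


class ChunkBuffer:
    """Accumulates raw PCM bytes and releases fixed-length chunks (hypothetical helper)."""

    def __init__(self, seconds: int = CHUNK_SECONDS):
        self.chunk_bytes = seconds * SAMPLE_RATE * BYTES_PER_SAMPLE
        self._pending = bytearray()

    def feed(self, data: bytes) -> list[bytes]:
        """Add incoming bytes and return every complete chunk now available."""
        self._pending.extend(data)
        chunks = []
        while len(self._pending) >= self.chunk_bytes:
            chunks.append(bytes(self._pending[:self.chunk_bytes]))
            del self._pending[:self.chunk_bytes]
        return chunks


def realtime_speedup(audio_seconds: float, processing_seconds: float) -> float:
    """How many seconds of audio are processed per second of compute."""
    return audio_seconds / processing_seconds
```

With the figures quoted above, `realtime_speedup(10, 0.8)` works out to about 12.5x real time for the processing step alone.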

3. The Real-Time Bridge: FastAPI & WebSockets

To glue the Python backend to the UI (both the Desktop Overlay and the React Dashboard), we used FastAPI WebSockets.

The Architecture

  • Audio Stream (/audio-stream): Incoming raw WAV bytes from the client.
  • Session Stream (/session-stream): Outgoing JSON events (transcript, hints, battlecards).
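
To make the outgoing side concrete, here is one possible shape for a session-stream event. The text only says the events are JSON and names three kinds (transcript, hints, battlecards), so the field names `type`, `payload`, and `ts`, and the `SessionEvent` class itself, are illustrative assumptions.

```python
import json
import time
from dataclasses import asdict, dataclass, field

# Event kinds named in the text; the exact strings are assumed
EVENT_TYPES = {"transcript", "hint", "battlecard"}


@dataclass
class SessionEvent:
    """Hypothetical envelope for one message on /session-stream."""
    type: str
    payload: dict
    ts: float = field(default_factory=time.time)

    def __post_init__(self):
        if self.type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.type}")

    def to_json(self) -> str:
        """Serialize the event for sending over the WebSocket."""
        return json.dumps(asdict(self))
```

A single envelope with a `type` field lets both the overlay and the dashboard share one stream and dispatch on the event kind.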

We solved the “Blocking I/O” problem by offloading heavy AI inference to background threads, while the async WebSocket loop keeps the connection alive.

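import asyncio
from fastapi import APIRouter, WebSocket, WebSocketDisconnect

router = APIRouter()
session_websockets: set[WebSocket] = set()

@router.websocket("/session-stream")
async def session_stream(websocket: WebSocket):
    await websocket.accept()
    # Subscribe to the event bus
    session_websockets.add(websocket)
    try:
        while True:
            # Keep-alive loop: waits until the client sends text or disconnects
            await websocket.receive_text()
    except WebSocketDisconnect:
        pass
    finally:
        # Always unsubscribe, whatever ended the loop
        session_websockets.discard(websocket)

async def _broadcast(message: dict):
    # Iterate over a copy: the set can change while we await each send
    for ws in list(session_websockets):
        await ws.send_json(message)

# Broadcasting from a background thread: a coroutine cannot be called directly
# from another thread, so schedule it on the server's event loop instead
def broadcast_from_thread(loop: asyncio.AbstractEventLoop, message: dict):
    asyncio.run_coroutine_threadsafe(_broadcast(message), loop)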
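
The offloading pattern described above can be sketched with the standard library alone. This is a minimal illustration, assuming Python 3.9+ for `asyncio.to_thread`; `transcribe_blocking` is a stand-in that just sleeps, not the real WhisperX call, and `heartbeat` plays the role of the keep-alive loop.

```python
import asyncio
import time


def transcribe_blocking(chunk: bytes) -> str:
    """Stand-in for heavy inference; sleeps to simulate blocking work."""
    time.sleep(0.05)
    return f"{len(chunk)} bytes transcribed"


async def handle_chunk(chunk: bytes) -> str:
    # Run the blocking call in a worker thread so the event loop stays free
    return await asyncio.to_thread(transcribe_blocking, chunk)


async def heartbeat(ticks: list[int]) -> None:
    """Represents other async work (e.g. keep-alives) that must not stall."""
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0.01)


async def demo() -> tuple[str, list[int]]:
    ticks: list[int] = []
    # Both run concurrently: the heartbeat keeps ticking during "inference"
    text, _ = await asyncio.gather(handle_chunk(b"\x00" * 320), heartbeat(ticks))
    return text, ticks
```

Running `asyncio.run(demo())` shows the heartbeat completing its ticks while the blocking call is still in progress, which is the property the WebSocket loop depends on.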

4. The Intelligence Layer: GLiNER & Gemini

Transcription is just raw data. We need insights.

Entity Extraction (GLiNER)

We use GLiNER (Zero-shot Named Entity Recognition) to pull out Competitors, Products, and Budget from the live text stream. It’s lighter and faster than LLMs for this specific task.

The “Battlecard” Trigger

If GLiNER tags an entity as a Competitor (for example, Salesforce), the system:

  1. Triggers a BattlecardRequest.
  2. Fetches live web insights via DuckDuckGo.
  3. Uses Gemini Pro to synthesize a “Counter-Strategy”.
  4. Pushes a card to the Stealth Overlay.
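
The first step, deciding when to fire a request, might look like the sketch below. It assumes each detected entity arrives as a dict with `text`, `label`, and `score` keys; the function name, the 0.5 threshold, and the de-duplication rule (one card per competitor per call) are illustrative assumptions rather than the project's actual logic.

```python
def new_competitors(entities: list[dict], already_shown: set[str],
                    min_score: float = 0.5) -> list[str]:
    """Return competitor names that should trigger a battlecard (hypothetical helper).

    `already_shown` is updated in place so the same competitor does not
    trigger a second card later in the call.
    """
    triggered = []
    for ent in entities:
        # Skip other labels and low-confidence detections
        if ent.get("label") != "Competitor" or ent.get("score", 0.0) < min_score:
            continue
        name = ent["text"].strip()
        key = name.lower()  # case-insensitive de-duplication
        if key in already_shown:
            continue
        already_shown.add(key)
        triggered.append(name)
    return triggered
```

De-duplicating before the search step avoids repeating the web lookup and LLM call each time the same name is spoken.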

The Vision Layer (DeepFace)

We didn’t stop at audio. We added DeepFace to analyze the client’s webcam feed (if visible) for real-time sentiment tracking (Happy/Negative). This runs on the CPU in a separate thread to avoid stalling the GPU audio pipeline.
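
Per-frame emotion labels tend to flicker, so some smoothing is useful before showing a sentiment on the overlay. The sketch below assumes the detector yields one dominant emotion label per frame from the set angry, disgust, fear, happy, sad, surprise, neutral; the bucketing into Happy/Negative/Neutral and the rolling-majority window are illustrative choices, and `SentimentSmoother` is a hypothetical name.

```python
from collections import Counter, deque

# Assumed grouping of per-frame emotion labels into the buckets shown to the user
NEGATIVE = {"angry", "disgust", "fear", "sad"}


def bucket(dominant_emotion: str) -> str:
    """Collapse a fine-grained emotion label into Happy / Negative / Neutral."""
    label = dominant_emotion.lower()
    if label == "happy":
        return "Happy"
    if label in NEGATIVE:
        return "Negative"
    return "Neutral"


class SentimentSmoother:
    """Rolling majority vote over the last `window` frames (hypothetical helper)."""

    def __init__(self, window: int = 5):
        self._recent = deque(maxlen=window)

    def update(self, dominant_emotion: str) -> str:
        """Record one frame's label and return the current majority bucket."""
        self._recent.append(bucket(dominant_emotion))
        return Counter(self._recent).most_common(1)[0][0]
```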

The Research Layer (DuckDuckGo + Trafilatura)

When a battlecard is requested, we don’t rely on the model’s memory alone. We search the web using DuckDuckGo, then use Trafilatura to scrape the top 3 results and feed the extracted text into Gemini as context. This grounds the answer in current public information, such as a competitor’s published pricing, rather than in stale training data.

Prompt Engineering for Gemini:

“You are a sales coach. The prospect just mentioned {competitor}. Give me 3 bullet points on why we are better, focusing on their weakness in {context}. Be brief.”
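
Putting the research step and the prompt together might look like the following sketch. The template text is the one quoted above; the helper names, the per-page character cap, and the separator are assumptions made for illustration, and the scraped pages are passed in as plain strings so the example does not depend on any search or scraping library.

```python
PROMPT_TEMPLATE = (
    "You are a sales coach. The prospect just mentioned {competitor}. "
    "Give me 3 bullet points on why we are better, focusing on their "
    "weakness in {context}. Be brief."
)


def build_context(pages: list[str], top_n: int = 3,
                  max_chars_per_page: int = 2000) -> str:
    """Take the first `top_n` pages, drop empty ones, normalize whitespace, truncate."""
    cleaned = [
        " ".join(page.split())[:max_chars_per_page]
        for page in pages[:top_n]
        if page and page.strip()
    ]
    return "\n---\n".join(cleaned)


def build_prompt(competitor: str, pages: list[str]) -> str:
    """Fill the template with the competitor name and the scraped context."""
    return PROMPT_TEMPLATE.format(competitor=competitor,
                                  context=build_context(pages))
```

Capping each page keeps the prompt within a predictable size, which matters when the answer is needed within seconds.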


5. The Frontend: React 19 + Vite

While the overlay handles the live call, the React Dashboard handles the post-game analysis.

  • Vite: For instant HMR (Hot Module Replacement).
  • Tailwind CSS: For rapid UI development.
  • Recharts: To visualize engagement metrics (Talk-time ratio, Sentiment arc).
  • Radix UI: For accessible, unstyled primitives (Dialogs, Popovers) that we styled to look “Cyberpunk/Premium”.

We calculate a “Lead Score” (0-100) from the vaderSentiment score of the full transcript. If the score exceeds 50, we automatically push the lead to Odoo CRM via its XML-RPC API, creating a new Opportunity complete with the summary and starred hints.
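
The text does not say how the sentiment value becomes a 0-100 score, so the sketch below assumes a simple linear mapping of VADER's compound score, which ranges from -1 to 1. The function names and the Opportunity field names (`name`, `description`, `type`) are likewise assumptions for illustration.

```python
def lead_score(compound: float) -> int:
    """Map a compound sentiment in [-1, 1] to 0-100 (assumed linear mapping)."""
    clamped = max(-1.0, min(1.0, compound))
    return round((clamped + 1.0) * 50)


def should_push_to_crm(score: int, threshold: int = 50) -> bool:
    """Apply the 'score > 50' rule from the text."""
    return score > threshold


def build_opportunity(title: str, summary: str, hints: list[str]) -> dict:
    """Assemble the record to send to the CRM (field names are assumed)."""
    description = summary
    if hints:
        description += "\n\nStarred hints:\n" + "\n".join(f"- {h}" for h in hints)
    return {"name": title, "description": description, "type": "opportunity"}
```

Under this mapping a neutral transcript scores exactly 50 and is not pushed; only transcripts with a net-positive compound score cross the threshold. The resulting dict would then be sent to Odoo through an XML-RPC `create` call on the lead model.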

Conclusion

Meeting Monitor AI demonstrates that you don’t need a massive cloud infrastructure to build powerful, real-time AI tools. With modern GPUs and libraries like WhisperX and FastAPI, we can build “Local-First” applications that are faster, cheaper, and more private than their cloud counterparts.

Splitting the system into separate services was critical for performance. It kept the architecture “Local-First” while letting the specialized services (audio, intelligence, API, UI) be updated or scaled independently.

Key Takeaways:

  1. Stealth is possible: Windows APIs are powerful if you know where to look.
  2. Local AI is ready: The RTX 3070 is a viable server for single-user LLM/Whisper workloads.
  3. Hybrid is the way: Python for AI, React for UI, WebSockets for the bridge.

This diagram visualizes the architecture we used for the hackathon project.

flowchart LR
    subgraph Audio["Audio Pipeline"]
        direction LR
        WASAPI["WASAPI Loopback"]
        WhisperX["WhisperX - GPU"]
        Pyannote["Pyannote Diarization"]
    end

    subgraph Intelligence["Intelligence Layer"]
        direction TB
        GLiNER["GLiNER - NER"]
        spacer1[ ]:::hidden
        Gemini["Gemini Pro"]
        spacer2[ ]:::hidden
        DeepFace["DeepFace - CPU"]
        spacer3[ ]:::hidden
        DuckDuckGo["DuckDuckGo Search"]
        spacer4[ ]:::hidden
        Trafilatura["Trafilatura Scraper"]
    end

    subgraph Backend["FastAPI Backend"]
        direction TB
        API["REST API"]
        WS["WebSocket Server"]
        SessionMgr["Session Manager"]
    end

    subgraph UI["User Interfaces"]
        direction TB
        subgraph Overlay["Stealth Overlay - PyQt6"]
            StealthUI["Stealth Window"]
            Battlecard["Battlecard Panel"]
        end
        spacer5[ ]:::hidden
        subgraph Frontend["Frontend - React"]
            Dashboard["Dashboard UI"]
            Analytics["Analytics"]
        end
    end

    subgraph Storage["Storage"]
        direction LR 
        SQLite["SQLite DB"]
        Odoo["Odoo CRM"]
    end

    WASAPI --> WhisperX
    WhisperX --> Pyannote
    Pyannote --> SessionMgr
    SessionMgr --> GLiNER
    GLiNER --> Gemini
    DuckDuckGo --> Trafilatura
    Trafilatura --> Gemini
    Gemini --> WS
    DeepFace --> WS
    WS <--> StealthUI
    WS <--> Dashboard
    SessionMgr --> SQLite
    SessionMgr --> Odoo
    API --> Analytics

    classDef hidden fill:none,stroke:none,color:transparent