This application uses Google's Gemini 2.0 Flash model to process quiz questions from screenshots and provide answers. The app monitors for macOS screenshots, processes them with the Gemini multimodal model, and displays the answers in real time.
- Automatic detection of macOS screenshots (using Command + Shift + 4)
- Image processing using Gemini 2.0 Flash multimodal model
- Real-time answer generation for quiz questions
- Simple and intuitive user interface
- Answer history displayed in the sidebar
- macOS (for the native screenshot functionality)
- Python 3.8 or higher
- Google API key with access to Gemini models
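The requirements above can be checked before installation with a short script. This is only a sketch: the function name `check_requirements` is hypothetical and not part of the app, and the check assumes the API key is exposed through the `GOOGLE_API_KEY` environment variable described in the setup steps.

```python
import os
import platform
import sys

MIN_PYTHON = (3, 8)  # minimum version from the requirements list


def check_requirements(env=None, system=None, version=None):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    system = platform.system() if system is None else system
    version = sys.version_info[:2] if version is None else tuple(version[:2])

    problems = []
    if system != "Darwin":  # platform.system() reports "Darwin" on macOS
        problems.append(f"macOS is required, found {system or 'unknown'}")
    if version < MIN_PYTHON:
        problems.append("Python 3.8 or higher is required, found %d.%d" % version)
    if not env.get("GOOGLE_API_KEY"):
        problems.append("GOOGLE_API_KEY is not set")
    return problems
```

Calling `check_requirements()` with no arguments inspects the current machine; the optional parameters exist only so the checks can be exercised with fake values.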
- Install Git if you haven't:
  - Visit https://git-scm.com/download/mac
  - Download and install Git
- Open Terminal and run:
git clone https://github.com/yourusername/ai-quiz-answering-app.git
cd ai-quiz-answering-app
Choose ONE of the following methods:
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
Open Terminal and run:
# Create conda environment
conda create -n quiz-app python=3.8
# Activate conda environment
conda activate quiz-app
# Install dependencies
pip install -r requirements.txt
- Visit Google AI Studio: https://makersuite.google.com/app/apikey
- Sign in with your Google account
- Click "Create API Key"
- Copy your API key
- Create a `.env` file in the application folder
- Open the `.env` file in any text editor and add your API key:
GOOGLE_API_KEY=your_api_key_here
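For illustration, the key could be read back from the `.env` file roughly as follows. This is a standard-library sketch, not the app's actual loader (the app may well rely on a package such as python-dotenv); the helper names `load_env_file` and `get_api_key` are hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env"):
    """Prefer an exported variable, then fall back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or load_env_file(path).get("GOOGLE_API_KEY")
    if not key or key == "your_api_key_here":
        raise RuntimeError("Set GOOGLE_API_KEY in .env before starting the app")
    return key
```

Rejecting the placeholder value `your_api_key_here` catches the common mistake of copying the template line without editing it.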
- Press Command (⌘) + Shift + 5 to open the Screenshot toolbar
- Click "Options"
- Under "Save to", select "Desktop"
- Make sure "Show Floating Thumbnail" is unchecked; while the thumbnail is on screen, macOS waits a few seconds before saving the file, which delays processing
- Close the options menu
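The same settings can also be inspected from a script. The sketch below assumes that macOS stores them in the `com.apple.screencapture` preferences domain under the keys `location` and `show-thumbnail`, read with the `defaults` command-line tool; the helper names are hypothetical, and an unset `location` is treated as the default Desktop folder.

```python
import os
import subprocess

DOMAIN = "com.apple.screencapture"  # assumed preferences domain for screenshots


def read_screencapture_default(key, run=subprocess.run):
    """Return the stored value for `key`, or None if it is unset or `defaults` is missing."""
    try:
        result = run(["defaults", "read", DOMAIN, key], capture_output=True, text=True)
    except FileNotFoundError:  # the `defaults` tool only exists on macOS
        return None
    if result.returncode != 0:  # non-zero exit: the key has never been written
        return None
    return result.stdout.strip()


def settings_look_right(location, show_thumbnail, desktop):
    """Check values as returned by read_screencapture_default."""
    # An unset location means macOS is using its default, the Desktop.
    location_ok = location is None or (
        os.path.expanduser(location).rstrip("/") == desktop.rstrip("/")
    )
    # The thumbnail counts as disabled only when the flag is explicitly 0.
    thumbnail_ok = show_thumbnail == "0"
    return location_ok and thumbnail_ok
```

A typical call would pass `read_screencapture_default("location")`, `read_screencapture_default("show-thumbnail")`, and `os.path.expanduser("~/Desktop")`.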
- Start the application:
  # If using venv:
  source venv/bin/activate
  python app.py
  # If using conda:
  conda activate quiz-app
  python app.py
- Open your web browser and go to http://127.0.0.1:5000/
- The application will open with a simple interface showing:
  - A sidebar with a status indicator and answer history
  - A main panel showing the latest processed screenshot
- To analyze a quiz question:
  - Position the quiz question on your screen
  - Take a screenshot using the macOS native screenshot tool (Command + Shift + 4)
  - Select the area containing the question
  - The application will automatically detect the new screenshot, process it, and display the answer
For a simpler interaction, you can also use the CLI version:
python cli.py
The CLI version will monitor for screenshots and output the answers directly in the terminal.
- The application monitors your Desktop folder for new screenshot files
- When a macOS screenshot is detected (files starting with "Screenshot"), it's processed
- The Gemini 2.0 Flash model analyzes the screenshot to extract the question
- The model generates an answer based on its understanding of the question
- The answer is displayed in the web interface or CLI
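The steps above can be sketched in a few functions. This is not the app's actual code: the polling approach, the function names, and the prompt text are all assumptions, and `ask_gemini` assumes the `google-generativeai` and Pillow packages with the model ID `gemini-2.0-flash`. The app itself may use a different client library or a file-system event watcher.

```python
import time
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def find_new_screenshots(folder, seen):
    """Return screenshot files in `folder` not yet in `seen`, oldest first, and record them."""
    candidates = [
        p for p in Path(folder).iterdir()
        if p.is_file()
        and p.name.startswith("Screenshot")
        and p.suffix.lower() in IMAGE_SUFFIXES
        and p not in seen
    ]
    candidates.sort(key=lambda p: p.stat().st_mtime)
    seen.update(candidates)
    return candidates


def ask_gemini(image_path, api_key):
    """Send one image to Gemini and return the text reply (assumed SDK usage)."""
    # Imported lazily so the watcher works without the SDK installed.
    import google.generativeai as genai
    from PIL import Image

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-2.0-flash")
    prompt = "Read the quiz question in this screenshot and answer it concisely."
    with Image.open(image_path) as image:
        response = model.generate_content([prompt, image])
    return response.text


def watch(folder, answer_fn, poll_seconds=1.0, max_polls=None):
    """Poll `folder` and call `answer_fn(path)` for each screenshot that appears."""
    seen = set()
    find_new_screenshots(folder, seen)  # ignore screenshots that already exist
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_screenshots(folder, seen):
            print(f"{path.name}: {answer_fn(path)}")
        polls += 1
        time.sleep(poll_seconds)
```

A call such as `watch(Path.home() / "Desktop", lambda p: ask_gemini(p, api_key))` would tie the pieces together. Passing the answer function in as a parameter keeps the detection logic usable without network access; a production watcher would also want to confirm that a file has finished writing before opening it.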
- The accuracy of the answers depends on the capabilities of the Gemini 2.0 Flash model
- The application works best with clear, well-formatted quiz questions
- Processing time may vary based on your internet connection and the complexity of the image