My First Android App as a Beginner: What I Learned Building an Offline ML Gallery Organizer (and How Copilot Helped)
Building my first Android application felt like jumping into the deep end, even though I already had a solid Java background. Android development has its own way of doing things: lifecycles, services, permissions, storage policies, UI patterns, background constraints… and all of that gets even more interesting when you add on-device machine learning.
I wanted to share this experience because it’s been both challenging and genuinely rewarding. I also want to be transparent about what helped, what didn’t, and what surprised me—especially when working with GitHub Copilot and different AI models.
The Project: GalleryKeeper (Offline, Privacy-Focused Image Classification)
The application I built is called GalleryKeeper. It embeds a YOLO11 model for image classification, with a simple goal:
Automatically classify sensitive images from the user’s photo gallery into folders based on four categories:
- nudity
- children
- identity cards
- credit cards
And yes—I managed to make this work.
If you want to test it or read the full description, the project is here:
https://github.com/chabanenary/GalleryKeeper-App
Self-Training: Android + Machine Learning
I’m self-taught in Android development. I didn’t come from a traditional “Android background”, so I had to learn step by step:
- Android fundamentals (Activities, Fragments, ViewModels, services, permissions).
- Storage and data handling (especially modern Android rules around shared storage).
- ML integration: object detection and image classification workflows.
- A lot of experimentation with YOLO (specifically, working through the practical steps: preprocessing, outputs, confidence thresholds, categories, speed vs. accuracy, etc.).
My focus wasn’t just “make it run”. It was: make it work reliably, offline, and in a way that respects user privacy.
App Architecture: MVVM, Persistence, and the “Boundaries” That Matter
Like many modern Android apps, GalleryKeeper naturally pushed me toward an MVVM-style architecture: UI in Fragments/Activities, state in ViewModels, and persistence behind repository layers.
One area where Copilot (both GPT‑5.2 and Gemini) gave genuinely solid guidance was Room. The suggested patterns for entities/DAOs, database wiring, and basic repository usage were usually correct, and the agent could implement Room without introducing too many issues.
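To give a sense of the level where those suggestions stayed reliable, here is a minimal Room sketch in that spirit. The entity, DAO, and field names are illustrative, not GalleryKeeper’s actual schema.

```kotlin
// A minimal Room setup: one entity, one DAO, one database class.
// All names here are illustrative, not GalleryKeeper's real schema.
import androidx.room.Dao
import androidx.room.Database
import androidx.room.Entity
import androidx.room.Insert
import androidx.room.PrimaryKey
import androidx.room.Query
import androidx.room.RoomDatabase
import kotlinx.coroutines.flow.Flow

@Entity(tableName = "classified_images")
data class ClassifiedImage(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val contentUri: String,   // MediaStore content URI stored as a string
    val category: String,     // e.g. "identity_card"
    val confidence: Float,
    val classifiedAt: Long    // epoch millis
)

@Dao
interface ClassifiedImageDao {
    @Insert
    suspend fun insert(image: ClassifiedImage): Long

    @Query("SELECT * FROM classified_images WHERE category = :category ORDER BY classifiedAt DESC")
    fun observeByCategory(category: String): Flow<List<ClassifiedImage>>
}

@Database(entities = [ClassifiedImage::class], version = 1)
abstract class GalleryDatabase : RoomDatabase() {
    abstract fun classifiedImageDao(): ClassifiedImageDao
}
```

Exposing queries as a Flow keeps the persistence layer observable, which fits naturally with the ViewModel-driven UI described above.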
Where things got a lot more fragile was around ViewModel boundaries — especially when the workflow involved background components:
- ViewModels shouldn’t hold references to a Fragment/Activity context (risk of leaks), and they shouldn’t “drive” UI navigation.
- Interaction with long-running work (like a ForegroundService monitoring MediaStore) ideally goes through explicit APIs: repositories, use-cases, or observable state — not direct calls into a Fragment.
- Copilot often proposed patterns where the ViewModel tried to call into a Service or a Fragment directly, or where lifecycle ownership was blurred. In Android terms: it struggled with separation of concerns, lifecycle-awareness, and choosing the right communication mechanism (LiveData/Flow, callbacks, broadcasts, WorkManager, etc.).
So overall: great help for Room and persistence plumbing, but I had to be very careful with any suggestion that involved threading/lifecycle or cross-layer communication between ViewModel ↔ UI ↔ Service.
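To make that boundary concrete, here is a minimal sketch of the pattern I kept coming back to: the service pushes its state into a repository, and the ViewModel only observes that repository. Class and field names are illustrative, not GalleryKeeper’s actual code.

```kotlin
// The service writes into a repository; the ViewModel only observes it.
// Nothing below holds a reference to a Fragment, Activity, or Context.
import androidx.lifecycle.ViewModel
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow

data class ScanStatus(val scannedCount: Int = 0, val isRunning: Boolean = false)

// Shared state holder that both the foreground service and the UI layer talk to.
object ScanStatusRepository {
    private val _status = MutableStateFlow(ScanStatus())
    val status: StateFlow<ScanStatus> = _status.asStateFlow()

    // Called from the service as it processes new MediaStore entries.
    fun update(scannedCount: Int, isRunning: Boolean) {
        _status.value = ScanStatus(scannedCount, isRunning)
    }
}

// The ViewModel exposes observable state; it never calls into a Service or a Fragment.
class ScanStatusViewModel : ViewModel() {
    val status: StateFlow<ScanStatus> = ScanStatusRepository.status
}
```

A Fragment then collects the status flow from its own lifecycle scope, so the lower layers never need a reference to UI classes.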
My Experience With Copilot: GPT-5.2 vs Gemini
I relied heavily on GitHub Copilot during development. Overall, it helped—but not equally across models, and not for every task.
What worked best: GPT-5.2 (when guided properly)
The model that helped me the most for actual implementation was GPT-5.2, especially when I guided it clearly step by step. In that setup, it produced good, usable code—often faster than I could write it from scratch while still learning the framework.
However, I noticed limitations:
- When using Plan mode, things sometimes “went off the rails” (too many assumptions, wrong direction, or drifting away from constraints).
- When the problem involved Android’s real-world edge cases (permissions, MediaStore quirks, device variability), it still needed close supervision and careful prompts.
Gemini: great explanations, but too verbose for execution
Gemini was very good at explaining:
- Android concepts
- architecture principles
- how Android libraries are intended to be used
But in practice, for me:
- It was too verbose.
- Its agent mode felt unreliable: sometimes incorrect and not very efficient for real implementation tasks.
So in my workflow, Gemini was more like a “textbook explanation tool”, while GPT-5.2 was more like a “pair programmer” when I kept it focused.
Where Copilot GPT Really Shined: UI Design
One area where Copilot (GPT) was surprisingly strong was UI/UX design:
- choosing color palettes
- improving ergonomics and layout clarity
- making the interface feel cleaner and more “Android-native”
This was honestly one of the most valuable parts, because UI decisions are hard when you’re a beginner—you don’t even know what looks wrong until you feel it.
The Hard Part: MediaStore, and Why No Agent Really Mastered It
My application uses MediaStore extensively, and I’ll say it directly: MediaStore is tricky, and none of the Copilot agents I tested seemed to fully master it in a reliable way.
In real projects, MediaStore isn’t just API calls—it’s:
- content URIs (a basic query is sketched just after this list)
- permissions differences across Android versions
- scoped storage behaviors
- file visibility rules
- background constraints when observing changes
- edge cases depending on manufacturer or Android build
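For context, the “happy path” of a MediaStore query looks roughly like the sketch below: iterate the images collection and build content URIs, never raw file paths. It assumes the appropriate read permission has already been granted (READ_MEDIA_IMAGES on Android 13+, READ_EXTERNAL_STORAGE before that); all the real complexity lives in the bullets above.

```kotlin
// Query the shared images collection and return content URIs.
// Simplified: no selection filters, no paging, read permission assumed granted.
import android.content.ContentUris
import android.content.Context
import android.net.Uri
import android.provider.MediaStore

fun queryImageUris(context: Context): List<Uri> {
    val collection = MediaStore.Images.Media.EXTERNAL_CONTENT_URI
    val projection = arrayOf(
        MediaStore.Images.Media._ID,
        MediaStore.Images.Media.DATE_ADDED
    )
    val sortOrder = "${MediaStore.Images.Media.DATE_ADDED} DESC"

    val uris = mutableListOf<Uri>()
    context.contentResolver.query(collection, projection, null, null, sortOrder)?.use { cursor ->
        val idColumn = cursor.getColumnIndexOrThrow(MediaStore.Images.Media._ID)
        while (cursor.moveToNext()) {
            // Build a content:// URI per image; the app never touches raw file paths.
            uris += ContentUris.withAppendedId(collection, cursor.getLong(idColumn))
        }
    }
    return uris
}
```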
One extra limitation I hit (and it cost me a lot of time) was emulator-specific: on the Android Studio emulator, the MediaStore API sometimes didn’t “see” images already present in the gallery. The workaround I found was surprisingly manual: the user had to open a gallery app and actually display/browse those photos first, and only then would MediaStore start returning URIs for them.
What made it extra confusing is that I couldn’t reproduce this on real devices (phones/tablets). It happened on the emulator across multiple Android versions, which is a good reminder that emulator behavior can diverge in subtle ways from physical devices—especially around MediaStore/database indexing.
So a lot of what I implemented around MediaStore had to be validated manually and tested repeatedly.
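A large piece of that work is the part of the app that watches for new images: a ContentObserver registered on the same images collection. The sketch below shows its general shape, with illustrative names rather than the actual service code.

```kotlin
// Observe the MediaStore images collection for changes, e.g. from a foreground service.
// Names are illustrative; deduplication and filtering happen elsewhere.
import android.content.Context
import android.database.ContentObserver
import android.net.Uri
import android.os.Handler
import android.os.Looper
import android.provider.MediaStore

class NewImageObserver(
    private val onChanged: (Uri?) -> Unit
) : ContentObserver(Handler(Looper.getMainLooper())) {
    override fun onChange(selfChange: Boolean, uri: Uri?) {
        // MediaStore can fire several times for a single insert, so callers must deduplicate.
        onChanged(uri)
    }
}

fun registerImageObserver(context: Context, observer: NewImageObserver) {
    context.contentResolver.registerContentObserver(
        MediaStore.Images.Media.EXTERNAL_CONTENT_URI,
        /* notifyForDescendants = */ true,
        observer
    )
}

fun unregisterImageObserver(context: Context, observer: NewImageObserver) {
    context.contentResolver.unregisterContentObserver(observer)
}
```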
In the end, I found that agent mode was only useful for:
- UI design tasks
- extracting and organizing string resources
- cleaning up unused functions and APIs
ML Integration: I Did It Myself
The integration of the full ML framework—detection and prediction pipeline—was done by me. Copilot didn’t really recognize the correct implementation steps, and it didn’t naturally “see” the full pipeline the way a developer would when integrating an on-device model end to end.
That part required actual understanding and iteration:
- choosing the right TensorFlow libraries (a simplified inference sketch follows this list)
- designing preprocessing correctly
- choosing thresholds and labels
- managing performance constraints on mobile
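For illustration, here is a heavily simplified sketch of that pipeline with TensorFlow Lite: preprocess a bitmap into a normalized float buffer, run the interpreter, and keep the labels above a confidence threshold. The input size, labels, output shape, and threshold are assumptions made for the example, not GalleryKeeper’s exact configuration, and a real YOLO11 export needs model-specific output decoding.

```kotlin
// Simplified on-device inference: load a TFLite model, preprocess, run, threshold.
// Input size, labels, output shape, and threshold are illustrative assumptions.
import android.graphics.Bitmap
import org.tensorflow.lite.Interpreter
import java.io.File
import java.nio.ByteBuffer
import java.nio.ByteOrder

class ImageClassifier(modelFile: File, private val labels: List<String>) {
    private val interpreter = Interpreter(modelFile)
    private val inputSize = 640      // assumed model input resolution
    private val threshold = 0.5f     // assumed confidence threshold

    fun classify(bitmap: Bitmap): List<Pair<String, Float>> {
        val resized = Bitmap.createScaledBitmap(bitmap, inputSize, inputSize, true)

        // Pack RGB pixels into a normalized float32 buffer (NHWC, values in [0, 1]).
        val input = ByteBuffer.allocateDirect(4 * inputSize * inputSize * 3)
            .order(ByteOrder.nativeOrder())
        val pixels = IntArray(inputSize * inputSize)
        resized.getPixels(pixels, 0, inputSize, 0, 0, inputSize, inputSize)
        for (pixel in pixels) {
            input.putFloat(((pixel shr 16) and 0xFF) / 255f)  // R
            input.putFloat(((pixel shr 8) and 0xFF) / 255f)   // G
            input.putFloat((pixel and 0xFF) / 255f)           // B
        }
        input.rewind()

        // One score per label; a real YOLO11 output tensor needs its own decoding step.
        val output = Array(1) { FloatArray(labels.size) }
        interpreter.run(input, output)

        return output[0]
            .mapIndexed { index, score -> labels[index] to score }
            .filter { (_, score) -> score >= threshold }
            .sortedByDescending { (_, score) -> score }
    }
}
```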
The Biggest Android Limitation I Hit: You Can’t “Lock” Gallery Folders
Here’s the frustrating part: Android does not allow third-party apps to truly lock folders created in shared storage.
There is no native locking mechanism that lets an app prevent other apps from accessing a folder in shared storage. If your app creates a folder and places images under DCIM/ or Pictures/, those images will be visible in the gallery and accessible to any other app with media access.
What can we do instead?
- Hide them (limited protection)
- Move them into a global hidden area
- Or truly secure them by moving files into the app’s internal storage (private storage), as sketched at the end of this section
But that last option has trade-offs:
- it reduces user visibility and control
- it gives the app too much “ownership” over personal photos
- it conflicts with my privacy-first goal (even though the app is fully offline)
In short: Android doesn’t let third-party applications lock a user-owned shared space, even with user authorization, and I think that’s a missed opportunity. It could exist with proper safeguards.
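For reference, that “truly secure” option boils down to copying an image out of shared storage into the app’s internal files directory, roughly like the sketch below (the file-naming scheme is illustrative):

```kotlin
// Copy an image from its MediaStore content URI into app-private internal storage.
// Files under filesDir are invisible to other apps (and to the gallery); removing the
// original from shared storage would be a separate, user-consented step.
import android.content.Context
import android.net.Uri
import java.io.File
import java.io.IOException

fun copyIntoPrivateStorage(context: Context, imageUri: Uri, fileName: String): File? {
    val destination = File(context.filesDir, fileName)  // app-private directory
    return try {
        val input = context.contentResolver.openInputStream(imageUri) ?: return null
        input.use { stream ->
            destination.outputStream().use { output -> stream.copyTo(output) }
        }
        destination
    } catch (e: IOException) {
        null
    }
}
```

This is exactly the ownership trade-off listed above: the file becomes unreadable to every other app, but it also disappears from the user’s normal gallery view.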
Still Ongoing: Testing and Improving the Model
The app is still being tested and improved, especially:
- the service that detects new images added to the gallery
- reliability across devices and Android versions
- improving the model’s performance and recognition quality
This first Android app taught me a lot—not only about development, but about operating system constraints, privacy concerns, and what “real-world engineering” looks like beyond tutorials.
And it also taught me something important about AI tools: they can accelerate you, but they don’t replace understanding—especially when the platform is complex and full of edge cases.