My First Android App as a Beginner: What I Learned Building an Offline ML Gallery Organizer (and How Copilot Helped)
Building my first Android application felt like jumping into the deep end, even though I already had a solid Java background. Android development has its own way of doing things: lifecycles, services, permissions, storage policies, UI patterns, background constraints… and all of that gets even more interesting when you add on-device machine learning.
I wanted to share this experience because it’s been both challenging and genuinely rewarding. I also want to be transparent about what helped, what didn’t, and what surprised me—especially when working with GitHub Copilot and different AI models.
The Project: GalleryKeeper (Offline, Privacy-Focused Image Classification)
The application I built is called GalleryKeeper. It embeds a YOLO11 model for image classification, with a simple goal:
Automatically classify sensitive images from the user’s photo gallery into folders based on four categories:
- nudity
- children
- identity cards
- credit cards
And yes—I managed to make this work.
If you want to test it or read the full description, the project is here:
https://github.com/chabanenary/GalleryKeeper-App
Self-Training: Android + Machine Learning
I’m self-taught in Android development. I didn’t come from a traditional “Android background”, so I had to learn step by step:
- Android fundamentals (Activities, Fragments, ViewModels, services, permissions).
- Storage and data handling (especially modern Android rules around shared storage).
- ML integration: object detection and image classification workflows.
- A lot of experimentation with YOLO (specifically, working through the practical steps: preprocessing, outputs, confidence thresholds, categories, speed vs. accuracy, etc.).
My focus wasn’t just “make it run”. It was: make it work reliably, offline, and in a way that respects user privacy.
App Architecture: MVVM, Persistence, and the “Boundaries” That Matter
Like many modern Android apps, GalleryKeeper naturally pushed me toward an MVVM-style architecture: UI in Fragments/Activities, state in ViewModels, and persistence behind repository layers.
One area where Copilot (both GPT‑5.2 and Gemini) gave genuinely solid guidance was Room. The suggested patterns for entities/DAOs, database wiring, and basic repository usage were usually correct, and the agent could implement Room without introducing too many issues.
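To give a sense of the level where those suggestions stayed reliable, here is a minimal Room sketch in that spirit. The entity, DAO, and field names are illustrative, not GalleryKeeper’s actual schema.

```kotlin
// A minimal Room setup: one entity, one DAO, one database class.
// All names here are illustrative, not GalleryKeeper's real schema.
import androidx.room.Dao
import androidx.room.Database
import androidx.room.Entity
import androidx.room.Insert
import androidx.room.PrimaryKey
import androidx.room.Query
import androidx.room.RoomDatabase
import kotlinx.coroutines.flow.Flow

@Entity(tableName = "classified_images")
data class ClassifiedImage(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val contentUri: String,   // MediaStore content URI stored as a string
    val category: String,     // e.g. "identity_card"
    val confidence: Float,
    val classifiedAt: Long    // epoch millis
)

@Dao
interface ClassifiedImageDao {
    @Insert
    suspend fun insert(image: ClassifiedImage): Long

    @Query("SELECT * FROM classified_images WHERE category = :category ORDER BY classifiedAt DESC")
    fun observeByCategory(category: String): Flow<List<ClassifiedImage>>
}

@Database(entities = [ClassifiedImage::class], version = 1)
abstract class GalleryDatabase : RoomDatabase() {
    abstract fun classifiedImageDao(): ClassifiedImageDao
}
```

Exposing queries as a Flow keeps the persistence layer observable, which fits naturally with the ViewModel-driven UI described above.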
Where things got a lot more fragile was around ViewModel boundaries — especially when the workflow involved background components:
- ViewModels shouldn’t hold references to a Fragment/Activity context (risk of leaks), and they shouldn’t “drive” UI navigation.
- Interaction with long-running work (like a ForegroundService monitoring MediaStore) ideally goes through explicit APIs: repositories, use-cases, or observable state — not direct calls into a Fragment.
- Copilot often proposed patterns where the ViewModel tried to call into a Service or a Fragment directly, or where lifecycle ownership was blurred. In Android terms: it struggled with separation of concerns, lifecycle-awareness, and choosing the right communication mechanism (LiveData/Flow, callbacks, broadcasts, WorkManager, etc.).
So overall: great help for Room and persistence plumbing, but I had to be very careful with any suggestion that involved threading/lifecycle or cross-layer communication between ViewModel ↔ UI ↔ Service.
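To make that boundary concrete, here is a minimal sketch of the pattern I kept coming back to: the service pushes its state into a repository, and the ViewModel only observes that repository. Class and field names are illustrative, not GalleryKeeper’s actual code.

```kotlin
// The service writes into a repository; the ViewModel only observes it.
// Nothing below holds a reference to a Fragment, Activity, or Context.
import androidx.lifecycle.ViewModel
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow

data class ScanStatus(val scannedCount: Int = 0, val isRunning: Boolean = false)

// Shared state holder that both the foreground service and the UI layer talk to.
object ScanStatusRepository {
    private val _status = MutableStateFlow(ScanStatus())
    val status: StateFlow<ScanStatus> = _status.asStateFlow()

    // Called from the service as it processes new MediaStore entries.
    fun update(scannedCount: Int, isRunning: Boolean) {
        _status.value = ScanStatus(scannedCount, isRunning)
    }
}

// The ViewModel exposes observable state; it never calls into a Service or a Fragment.
class ScanStatusViewModel : ViewModel() {
    val status: StateFlow<ScanStatus> = ScanStatusRepository.status
}
```

A Fragment then collects the status flow from its own lifecycle scope, so the lower layers never need a reference to UI classes.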
My Experience With Copilot: GPT-5.2 vs Gemini
I relied heavily on GitHub Copilot during development. Overall, it helped—but not equally across models, and not for every task.
What worked best: GPT-5.2 (when guided properly)
The model that helped me the most for actual implementation was GPT-5.2, especially when I guided it clearly step by step. In that setup, it produced good, usable code—often faster than I could write it from scratch while still learning the framework.
However, I noticed limitations:
- When using Plan mode, things sometimes “went off the rails” (too many assumptions, wrong direction, or drifting away from constraints).
- When the problem involved Android’s real-world edge cases (permissions, MediaStore quirks, device variability), it still needed close supervision and careful prompts.
Gemini: great explanations, but too verbose for execution
Gemini was very good at explaining:
- Android concepts
- architecture principles
- how Android libraries are intended to be used
But in practice, for me:
- It was too verbose.
- Its agent mode felt unreliable: sometimes incorrect and not very efficient for real implementation tasks.
So in my workflow, Gemini was more like a “textbook explanation tool”, while GPT-5.2 was more like a “pair programmer” when I kept it focused.
Where Copilot GPT Really Shined: UI Design
One area where Copilot (GPT) was surprisingly strong was UI/UX design:
- choosing color palettes
- improving ergonomics and layout clarity
- making the interface feel cleaner and more “Android-native”
This was honestly one of the most valuable parts, because UI decisions are hard when you’re a beginner—you don’t even know what looks wrong until you feel it.
The Hard Part: MediaStore, and Why No Agent Really Mastered It
My application uses MediaStore extensively, and I’ll say it directly: MediaStore is tricky, and none of the Copilot agents I tested seemed to fully master it in a reliable way.
In real projects, MediaStore isn’t just API calls—it’s:
- content URIs (a basic query is sketched just after this list)
- permissions differences across Android versions
- scoped storage behaviors
- file visibility rules
- background constraints when observing changes
- edge cases depending on manufacturer or Android build
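For context, the “happy path” of a MediaStore query looks roughly like the sketch below: iterate the images collection and build content URIs, never raw file paths. It assumes the appropriate read permission has already been granted (READ_MEDIA_IMAGES on Android 13+, READ_EXTERNAL_STORAGE before that); all the real complexity lives in the bullets above.

```kotlin
// Query the shared images collection and return content URIs.
// Simplified: no selection filters, no paging, read permission assumed granted.
import android.content.ContentUris
import android.content.Context
import android.net.Uri
import android.provider.MediaStore

fun queryImageUris(context: Context): List<Uri> {
    val collection = MediaStore.Images.Media.EXTERNAL_CONTENT_URI
    val projection = arrayOf(
        MediaStore.Images.Media._ID,
        MediaStore.Images.Media.DATE_ADDED
    )
    val sortOrder = "${MediaStore.Images.Media.DATE_ADDED} DESC"

    val uris = mutableListOf<Uri>()
    context.contentResolver.query(collection, projection, null, null, sortOrder)?.use { cursor ->
        val idColumn = cursor.getColumnIndexOrThrow(MediaStore.Images.Media._ID)
        while (cursor.moveToNext()) {
            // Build a content:// URI per image; the app never touches raw file paths.
            uris += ContentUris.withAppendedId(collection, cursor.getLong(idColumn))
        }
    }
    return uris
}
```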
One extra limitation I hit (and it cost me a lot of time) was emulator-specific: on the Android Studio emulator, the MediaStore API sometimes didn’t “see” images already present in the gallery. The workaround I found was surprisingly manual: the user had to open a gallery app and actually display/browse those photos first, and only then would MediaStore start returning URIs for them.
What made it extra confusing is that I couldn’t reproduce this on real devices (phones/tablets). It happened on the emulator across multiple Android versions, which is a good reminder that emulator behavior can diverge in subtle ways from physical devices—especially around MediaStore/database indexing.
So a lot of what I implemented around MediaStore had to be validated manually and tested repeatedly.
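A large piece of that work is the part of the app that watches for new images: a ContentObserver registered on the same images collection. The sketch below shows its general shape, with illustrative names rather than the actual service code.

```kotlin
// Observe the MediaStore images collection for changes, e.g. from a foreground service.
// Names are illustrative; deduplication and filtering happen elsewhere.
import android.content.Context
import android.database.ContentObserver
import android.net.Uri
import android.os.Handler
import android.os.Looper
import android.provider.MediaStore

class NewImageObserver(
    private val onChanged: (Uri?) -> Unit
) : ContentObserver(Handler(Looper.getMainLooper())) {
    override fun onChange(selfChange: Boolean, uri: Uri?) {
        // MediaStore can fire several times for a single insert, so callers must deduplicate.
        onChanged(uri)
    }
}

fun registerImageObserver(context: Context, observer: NewImageObserver) {
    context.contentResolver.registerContentObserver(
        MediaStore.Images.Media.EXTERNAL_CONTENT_URI,
        /* notifyForDescendants = */ true,
        observer
    )
}

fun unregisterImageObserver(context: Context, observer: NewImageObserver) {
    context.contentResolver.unregisterContentObserver(observer)
}
```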
In the end, I found that agent mode was only useful for:
- UI design tasks
- extracting and organizing string resources
- cleaning up unused functions and APIs
ML Integration: I Did It Myself
The integration of the full ML framework—detection and prediction pipeline—was done by me. Copilot didn’t really recognize the correct implementation steps, and it didn’t naturally “see” the full pipeline the way a developer would when integrating an on-device model end to end.
That part required actual understanding and iteration:
- choosing the right TensorFlow libraries (a simplified inference sketch follows this list)
- designing preprocessing correctly
- choosing thresholds and labels
- managing performance constraints on mobile
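For illustration, here is a heavily simplified sketch of that pipeline with TensorFlow Lite: preprocess a bitmap into a normalized float buffer, run the interpreter, and keep the labels above a confidence threshold. The input size, labels, output shape, and threshold are assumptions made for the example, not GalleryKeeper’s exact configuration, and a real YOLO11 export needs model-specific output decoding.

```kotlin
// Simplified on-device inference: load a TFLite model, preprocess, run, threshold.
// Input size, labels, output shape, and threshold are illustrative assumptions.
import android.graphics.Bitmap
import org.tensorflow.lite.Interpreter
import java.io.File
import java.nio.ByteBuffer
import java.nio.ByteOrder

class ImageClassifier(modelFile: File, private val labels: List<String>) {
    private val interpreter = Interpreter(modelFile)
    private val inputSize = 640      // assumed model input resolution
    private val threshold = 0.5f     // assumed confidence threshold

    fun classify(bitmap: Bitmap): List<Pair<String, Float>> {
        val resized = Bitmap.createScaledBitmap(bitmap, inputSize, inputSize, true)

        // Pack RGB pixels into a normalized float32 buffer (NHWC, values in [0, 1]).
        val input = ByteBuffer.allocateDirect(4 * inputSize * inputSize * 3)
            .order(ByteOrder.nativeOrder())
        val pixels = IntArray(inputSize * inputSize)
        resized.getPixels(pixels, 0, inputSize, 0, 0, inputSize, inputSize)
        for (pixel in pixels) {
            input.putFloat(((pixel shr 16) and 0xFF) / 255f)  // R
            input.putFloat(((pixel shr 8) and 0xFF) / 255f)   // G
            input.putFloat((pixel and 0xFF) / 255f)           // B
        }
        input.rewind()

        // One score per label; a real YOLO11 output tensor needs its own decoding step.
        val output = Array(1) { FloatArray(labels.size) }
        interpreter.run(input, output)

        return output[0]
            .mapIndexed { index, score -> labels[index] to score }
            .filter { (_, score) -> score >= threshold }
            .sortedByDescending { (_, score) -> score }
    }
}
```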
The Biggest Android Limitation I Hit: You Can’t “Lock” Gallery Folders
Here’s the frustrating part: Android does not allow third-party apps to truly lock folders created in shared storage.
There is no native locking mechanism that lets an app prevent other apps from accessing a folder in shared storage. If your app creates a folder and places images under DCIM/ or Pictures/, those images will be visible in the gallery and accessible to any other app with media access.
What can we do instead?
- Hide them (limited protection)
- Move them into a global hidden area
- Or truly secure them by moving files into the app’s internal storage (private storage), as sketched at the end of this section
But that last option has trade-offs:
- it reduces user visibility and control
- it gives the app too much “ownership” over personal photos
- it conflicts with my privacy-first goal (even though the app is fully offline)
In short: Android doesn’t let third-party applications lock a user-owned shared space, even with user authorization, and I think that’s a missed opportunity. It could exist with proper safeguards.
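For reference, that “truly secure” option boils down to copying an image out of shared storage into the app’s internal files directory, roughly like the sketch below (the file-naming scheme is illustrative):

```kotlin
// Copy an image from its MediaStore content URI into app-private internal storage.
// Files under filesDir are invisible to other apps (and to the gallery); removing the
// original from shared storage would be a separate, user-consented step.
import android.content.Context
import android.net.Uri
import java.io.File
import java.io.IOException

fun copyIntoPrivateStorage(context: Context, imageUri: Uri, fileName: String): File? {
    val destination = File(context.filesDir, fileName)  // app-private directory
    return try {
        val input = context.contentResolver.openInputStream(imageUri) ?: return null
        input.use { stream ->
            destination.outputStream().use { output -> stream.copyTo(output) }
        }
        destination
    } catch (e: IOException) {
        null
    }
}
```

This is exactly the ownership trade-off listed above: the file becomes unreadable to every other app, but it also disappears from the user’s normal gallery view.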
Still Ongoing: Testing and Improving the Model
The app is still being tested and improved, especially:
- the service that detects new images added to the gallery
- reliability across devices and Android versions
- improving the model’s performance and recognition quality
This first Android app taught me a lot—not only about development, but about operating system constraints, privacy concerns, and what “real-world engineering” looks like beyond tutorials.
And it also taught me something important about AI tools: they can accelerate you, but they don’t replace understanding—especially when the platform is complex and full of edge cases.