Solid-CLI: The "State of the Art" for Decentralized Data
"The Web of data with meaning in the sense that a computer program can learn enough about what the data means to process it." — Tim Berners-Lee, The Semantic Web
Solid-CLI is not just a file uploader. It is a Reasoning Agent that bridges the gap between your local Operating System and the Semantic Web.
Built for the GitHub Copilot CLI Challenge, this project aims to advance the state of the art in Decentralized Personal Data Stores (Solid Pods) by implementing cutting-edge academic research in a user-friendly CLI tool. It will also be used as a tool in my current work on the proxion-protocol.
Team Members
- @elblanquitolobo (Solo Lead)
Source Code
- GitHub Repository: cafeTechne/solid-cli
Scientific Foundation
I moved beyond "fun" features to implement rigorous computer science concepts, cited directly from arXiv research:
- Client-Side Reasoning (N3Logic): Based on the foundational work of Tim Berners-Lee et al. (0711.1533) and the W3C N3Logic framework, we implemented a local inference engine. The client doesn't just store data; it understands it, inferring metadata and trust levels automatically.
- Privacy-Preserving Compute (Libertas): Inspired by the Libertas framework (Zhao/Oxford, 2309.16365), our reasoning happens locally before data leaves your device, ensuring total privacy and data sovereignty as advocated in the Web Science Manifesto (1702.08291).
- The Open Metaverse (WebXR Interoperability): As proposed by Macario et al. (2404.05317 & 2408.13520), we provide a "Universal FUSE Mount" that projects decentralized Pods as local file systems, enabling legacy 3D tools (Blender, Unity) to access WebXR assets natively via standard Web protocols.
Key Features
1. The Semantic Reasoning Engine (solid-cli reasoner)
Most uploaders just copy bytes. solid-cli generates Knowledge.
-
PROV-O Provenance: Automatically signs every file with W3C PROV-O standard provenance data (
prov:wasAttributedTo). -
Inference Rules: Implements forward-chaining N3 logic to power Semantic Search Engines (1102.0695).
-
Example: "Confidentiality Propagation". If you upload a file to a folder marked
TopSecret, the reasoner automatically infers that the file is alsoTopSecret.
-
Example: "Confidentiality Propagation". If you upload a file to a folder marked
2. Universal FUSE Mount (solid-cli mount)
Access your Decentralized Pod as if it were a USB Drive.
- Cross-Platform: Works on Windows (via WinFsp), macOS (via FUSE-T/OSXFUSE), and Linux (via libfuse).
- Performance: Asynchronous pre-fetching allows for smooth media streaming directly from the Pod.
3. "Turbo" Parallel Sync
- Outcome: 6x Speedup over standard linear uploaders.
-
Tech: Uses
asyncio.gatherwith bounded Semaphores to saturate the network without crashing the server. - Benchmark: Validated at 3.03 MB/s sustained throughput.
4. Trust Audit (solid-cli verify)
- Verifiable Identity: Adopts the Issuer-Holder-Verifier model (2201.07034).
- Cryptographic Reports: Generates a signed JSON audit trail comparing local vs. remote state, proving data integrity for enterprise/healthcare use cases.
The "Architect + Builder" Workflow
This project was built using the GitHub Copilot CLI as a pair programmer. The workflow was distinct:
- Human (Architect): I researched the arXiv papers and defined the Logic (e.g., "Implement N3 forward-chaining").
- Human + AI (Dialectical Planning): I utilized the dialectical method to hone a plan using a range of various models provided to me (Gemini 3 Pro, Gemini 3 Flash, Claude Sonnet 4.5 and Claude Opus 4.5). I did this because I had previously committed to trying out Google's most recent offering (even though I had cancelled my subscription earlier this month, before I learned about this challenge, and have about a week and a half access still available to me on the Pro plan), but I don't think Gemini 3 is a mature enough model to reliably code (understatement of the year). I do think it excels at performing web research and raises good points in debate. I used a VS Code extension I wrote earlier this month to facilitate the discussion, planning and uploading of documents to Antigravity IDE. I would have used github-cli for this stage, but I don't have the resources to pay for a subscription (hence me giving it my all in this contest!) and had to be mindful of token usage to not hit the 5 hour limits accorded to the free plan users given my late entry into this contest.
- AI (Builder): Copilot CLI generated the all of the code in this project, specifically handling the complex
rdflibgraph operations andfusepyC-bindings. I fed it plans I derived in step 2 and used those models referred to above to conduct code review. I generally find that my results are much better when I use different LLM's to critique each others' work. I used the Claude Haiku 4.5 model and found it to be outrageously fast and incredibly competent. As an execution model for building quickly, I was very impressed with this model. The agentic scaffolding provided by github CLI was excellent. This was my first time ever using Claude Haiku or github-cli.
This allowed us to implement complex academic theories (like N3Logic) in hours, not weeks! None of the code created by haiku 4.5 was syntactically invalid or logically out of line with the specs I had refined through iterating on the plan.
Installation & Testing
Instructions for Judges
If you have a Solid Pod (e.g., from Inrupt or CSS), follow these steps to test the "Agentic" capabilities:
1. Docker Sync (Zero Dependency)
docker run --rm -it cafetechne/solid-cli sync ./local_data https://YOUR_POD/data
2. Local Installation
pip install solid-cli
solid login
solid mount S: https://YOUR_POD/data
3. Run the Trust Audit
solid verify https://YOUR_POD/data
Conclusion
Solid-CLI demonstrates that the Decentralized Web can be both performant (via Turbo Sync/FUSE) and intelligent (via N3 Logic). By strictly adhering to the principles of Data Sovereignty and Privacy-Preserving Compute, we are building a Web that is safer, smarter, and owned by you.
Works Cited: The Scientific Foundation
1. N3Logic: A Logical Framework For the World Wide Web (0711.1533)
- Authors: Tim Berners-Lee, Dan Connolly, et al.
- Implementation: (Semantic Reasoning Engine). We use N3 rules to infer trust and metadata.
2. Libertas: Privacy-Preserving Collaborative Computation (2309.16365)
- Authors: Rui Zhao, et al. (Oxford)
- Implementation: The Reasoning Engine runs locally to infer metadata, keeping "reasoning" close to the data. I think this could be expanded in future work as a MCP server and tailored to various use-cases.
3. A Manifesto for Web Science @10 (1702.08291)
- Authors: Wendy Hall, Jim Hendler, et al.
-
Implementation: We frame
solid verifyas a tool for Data Sovereignty. I use these theories to motivate my recent work on integrating some libraries from the Signal organization into Proxion, but that work is still very nascent.
4. WebXR as a Basis for an Open Metaverse (2404.05317 & 2408.13520)
- Authors: Giuseppe Macario, et al.
- Implementation: (Universal FUSE Mount). By mounting a Pod as a local filesystem, we allow standard 3D tools (Blender, Unity, A-Frame) to access decentralized assets natively. This is a key component of the "Open Metaverse" vision, and I think that future work in this direction will make it clear that Solid pods can help us liberate our data from proprietary data silos.
5. Verifiable Credentials for Healthcare (2201.07034)
- Authors: Gongjin Lan, et al.
- Implementation: Adopts the Issuer-Holder-Verifier model for trust in sensitive environments.
6. Semantic Search Engine (1102.0695)
- Authors: Debajyoti Mukhopadhyay, et al.
- Implementation: PKG Extraction generates the RDF metadata necessary for these semantic search engines to function.


Top comments (0)