How I Built a Confluence RAG Pipeline for WorkspaceGPT Without Breaking My Laptop

Hey there, fellow code wranglers! Recently, I embarked on a wild adventure to create a VS Code extension called WorkspaceGPT that uses a Retrieval-Augmented Generation (RAG) pipeline for Confluence. Spoiler alert: it didn’t involve sacrificing my laptop to the AI gods or downloading half the internet in dependencies. Today, I’m spilling the beans on how I made it happen, kept it lightweight, and threw in a few laughs along the way. Buckle up!


Why JavaScript? Because Python Packaging Is My Personal Nightmare

Picture this: you’re building a shiny AI app in Python, and then—bam!—you’re wrestling with dependency hell, virtual environments, and a package manager that’s moodier than my cat on bath day. Most RAG apps lean hard into Python, but I said, “Nope, not today!” and picked JavaScript instead.

Why? JavaScript is like that chill friend who shows up ready to party anywhere—VS Code, browsers, you name it. Write once, run everywhere (well, almost). Plus, packaging is a breeze, and I didn’t have to whisper sweet nothings to pip to make it work. This meant I could focus on coding instead of playing dependency Tetris.


Dodging the ML Package Bulge

If you’ve ever peeked at AI apps, you know they love their chunky machine learning libraries—HuggingFace’s Transformers, HNSWlib, FAISS, oh my! These packages are like that one friend who brings all their camping gear for a one-night trip. Sure, they’re powerful, but they’d make my extension’s bundle size look like it ate too many digital donuts.

My mission? Keep WorkspaceGPT lean and mean so it wouldn’t choke your average developer’s laptop. That meant saying “thanks, but no thanks” to those heavyweight libraries on the client side. My goal was an app that runs smoother than a sunny day playlist, not one so heavy it needs its own gym membership.


Vector Search: When You DIY Because Libraries Are Picky Eaters

Vector search is the secret sauce of any RAG pipeline—it’s what makes your app find the right Confluence page faster than you can say “where’s that dang meeting note?” Libraries like FAISS and HNSWlib are pros at this, but they’re also divas who don’t play nice across platforms. Getting them to work cross-platform felt like convincing my grandma to use TikTok—possible, but way too much effort.

So, I rolled up my sleeves and wrote my own vector search logic. Was it fancy? Nope. Did it get the job done? You bet! I figured most Confluence workspaces for developers (my target crew) have about 2,000–3,000 pages—small enough that I didn’t need to haul in the big guns. Sometimes, simplicity is the real MVP.
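For the curious, the whole idea fits in a handful of lines. Here’s a simplified sketch of that brute-force approach (the gist, not the extension’s exact code):

```javascript
// Simplified sketch of brute-force vector search (not the extension's exact code).
// With only a few thousand embedded chunks, a plain linear scan with cosine
// similarity is plenty fast, no FAISS or HNSWlib required.

function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// index is an array of { id, text, embedding } objects built at indexing time.
function topK(queryEmbedding, index, k = 5) {
  return index
    .map((entry) => ({ ...entry, score: cosineSimilarity(queryEmbedding, entry.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```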


Trade-Offs: Because Perfect Is the Enemy of Done

To make WorkspaceGPT lightweight and client-friendly, I had to make some deals with the dev devil. Here’s the lowdown on how I kept things breezy:

1. Swapping HuggingFace for Ollama (and a Diet Model)

HuggingFace’s Transformers are awesome, but they’re also beefy enough to make my laptop fan sound like a jet engine. Instead, I went with Ollama, which is like the minimalist cousin who runs AI models locally without drama. To keep things extra chill, I used a 4-bit quantized model—because nobody’s got time for a resource-hogging app that demands 16GB of RAM just to say hello.
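Talking to Ollama from JavaScript is refreshingly boring: it runs a local HTTP server (port 11434 by default), so a couple of fetch calls cover both embeddings and generation. Roughly like this:

```javascript
// Sketch: calling a locally running Ollama server from Node 18+ (global fetch).
// /api/embeddings and /api/generate are Ollama's standard REST endpoints;
// 11434 is its default port.
const OLLAMA_URL = "http://localhost:11434";

async function embed(text, model) {
  const res = await fetch(`${OLLAMA_URL}/api/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt: text }),
  });
  const { embedding } = await res.json(); // an array of numbers
  return embedding;
}

async function askOllama(prompt, model) {
  const res = await fetch(`${OLLAMA_URL}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  const { response } = await res.json(); // the generated text
  return response;
}
```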

2. Data Preprocessing: My Love-Hate Relationship

Good embeddings need clean, structured data, and Confluence pages can be messier than my desk after a coding marathon. I spent hours crafting custom preprocessing logic to whip that content into shape. Could I have sent it to an LLM for a glow-up? Sure, but that felt like hiring a Michelin chef to make instant ramen. My DIY approach worked well enough, even if it occasionally made me question my life choices.
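To give you a flavor, assuming the page body comes back from the Confluence REST API in storage format (that XHTML-ish markup in body.storage), the cleanup-and-chunk step boils down to something like this (the real logic is fussier, but this is the gist):

```javascript
// Sketch: flattening a Confluence page into plain text and chunking it for
// embedding. Assumes the page body came from the REST API in storage format.
// The real preprocessing is fussier; this is the gist.

function confluenceToPlainText(storageHtml) {
  return storageHtml
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, " ") // drop script/style blocks
    .replace(/<[^>]+>/g, " ")                        // strip the remaining tags
    .replace(/&nbsp;/g, " ")
    .replace(/\s+/g, " ")                            // collapse whitespace
    .trim();
}

// Fixed-size chunks with a little overlap so ideas aren't cut off mid-sentence.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  const chunks = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}
```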

3. Picking Featherweight Models

For embeddings, I grabbed nomic-embed-text (a svelte 230MB—practically a digital yoga instructor). For inference, I went with llama3.2:1b (a mere 1GB, aka the size of one cat meme folder). These models are light enough to run on your average dev machine without triggering a “low disk space” warning. Did I mention they still get the job done? Because they totally do.
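Wired together with the sketches above, the whole ask-a-question flow ends up surprisingly short. Once more, a simplified sketch rather than the real thing:

```javascript
// Rough end-to-end flow, reusing embed(), topK(), and askOllama() from the
// sketches above. A simplification, not the extension's actual code.
async function answerFromConfluence(question, index) {
  const queryEmbedding = await embed(question, "nomic-embed-text");
  const hits = topK(queryEmbedding, index, 5);

  const context = hits.map((hit) => hit.text).join("\n---\n");
  const prompt =
    `Answer the question using only the Confluence content below.\n\n` +
    `${context}\n\nQuestion: ${question}`;

  return askOllama(prompt, "llama3.2:1b");
}
```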


User Onboarding: Faster Than My Coffee Order

Nobody wants to spend an hour setting up a tool—especially not developers who’d rather be coding (or arguing about tabs vs. spaces). I made sure WorkspaceGPT’s onboarding is quicker than my barista handing me a latte. Lightweight models, minimal setup, and no “please install this 5GB library” nonsense. Just plug, play, and start searching those Confluence pages like a pro.


What’s Next? Codebase Integration and More Shenanigans

WorkspaceGPT is out there strutting its stuff, but I’m not done yet. The next big thing is integrating it with codebases—think of it as teaching my app to read both Confluence and your repo’s spaghetti code. I’m already knee-deep in figuring out the best trade-offs to keep it fast, light, and useful. Stay tuned for more adventures (and probably a few more bad jokes).


Wrapping It Up: Lightweight Doesn’t Mean Light on Awesome

Building WorkspaceGPT was like trying to pack for a weekend trip with only a carry-on—you’ve gotta make tough choices, but the results are worth it. By dodging bulky libraries, writing custom logic, and picking models that don’t eat your hard drive for breakfast, I created a Confluence RAG pipeline that’s accessible, efficient, and developer-friendly. Is it perfect? Nah. Does it get the job done without making your laptop cry? Oh, heck yeah.

Want to give WorkspaceGPT a spin? Check out the VS Code extension and let me know what you think! Bonus points if you share your own tales of wrangling code, Confluence, or overly demanding AI models. 😄


Thanks for reading! If you enjoyed my ramblings or have ideas to make WorkspaceGPT even cooler, drop a comment below. Now, if you’ll excuse me, I have a date with some code… and maybe a coffee.

Here’s the link to my VS Code extension: https://marketplace.visualstudio.com/items?itemName=Riteshkant.workspacegpt-extension
