Well, I’m bored, so I figured I’d spill the beans on a project I’ve been keeping under wraps for a while.
I’ve been working on getting everything about the portage tree into postgresql so you can run all kinds of queries. What kinds of queries? How many ebuilds use eclass ‘eutils’ and USE flag ‘alsa’ and are in ‘video’ herd and amd64 is masked but x86 isn’t. That kind. Funky ones. 🙂
I must say, I really love postgresql. Even though I haven’t used it regularly for a long time, I’m quickly getting back into it. The simplicity, the standards, the power, the tools … postgres has it all. Ahh, fanboyism.
Anyway, getting the details of the ebuilds was made incredibly easy thanks to marienz and ferringb and their work on pkgcore (and a custom python script). After that, it was just a matter of parsing the information and setting up the schemas. My importer is written in PHP, and the class to import / read the data is still in its slightly butt-ugly stage. It could use some cleaning up, for sure. The database layout is going to be where the real optimizations are, though. I’m going to work on setting up some good views so it will be easy to query. Right now, here’s the list of tables I have set up: arch, category, ebuild, ebuild_arch, ebuild_eclass, ebuild_homepage, ebuild_license, ebuild_use, eclass, eclass_use, herd, license, package and use. All of them can already be populated by the scripts except for eclass_use and herd. I haven’t set up the dependency ones yet, though that’ll be pretty simple too.
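For the curious, here’s a rough sketch of what one of those funky queries could look like against a layout like this. It’s only an illustration, not the real schema: it uses Python’s built-in sqlite3 module as a stand-in for postgresql so it runs anywhere, it covers just a handful of the tables, and every column name, the 'masked' / 'stable' status values, the ebuild_keyword view, the package-to-herd link and the sample rows are all assumptions made up for the example.

```python
import sqlite3

# Table names come from the list above; every column, the view name and
# the status values are hypothetical, chosen only to make the sketch run.
SCHEMA = """
CREATE TABLE herd    (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE package (id INTEGER PRIMARY KEY, name TEXT,
                      herd_id INTEGER REFERENCES herd(id));
CREATE TABLE ebuild  (id INTEGER PRIMARY KEY, version TEXT,
                      package_id INTEGER REFERENCES package(id));
CREATE TABLE eclass  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE "use"   (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE arch    (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE ebuild_eclass (ebuild_id INTEGER, eclass_id INTEGER);
CREATE TABLE ebuild_use    (ebuild_id INTEGER, use_id INTEGER);
CREATE TABLE ebuild_arch   (ebuild_id INTEGER, arch_id INTEGER, status TEXT);

-- A small convenience view so queries can filter on the arch name directly.
CREATE VIEW ebuild_keyword AS
SELECT ea.ebuild_id, a.name AS arch, ea.status
FROM ebuild_arch ea JOIN arch a ON a.id = ea.arch_id;
"""

# Made-up sample rows: three ebuilds, only the first satisfies every condition.
SAMPLE = """
INSERT INTO herd    VALUES (1, 'video'), (2, 'sound');
INSERT INTO package VALUES (1, 'mplayer', 1), (2, 'alsa-utils', 2);
INSERT INTO ebuild  VALUES (1, '1.0', 1), (2, '1.1', 1), (3, '1.0', 2);
INSERT INTO eclass  VALUES (1, 'eutils');
INSERT INTO "use"   VALUES (1, 'alsa');
INSERT INTO arch    VALUES (1, 'amd64'), (2, 'x86');
INSERT INTO ebuild_eclass VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO ebuild_use    VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO ebuild_arch   VALUES
    (1, 1, 'masked'), (1, 2, 'stable'),
    (2, 1, 'stable'), (2, 2, 'stable'),
    (3, 1, 'masked'), (3, 2, 'stable');
"""

# Ebuilds using eclass 'eutils' and USE flag 'alsa', in the 'video' herd,
# where amd64 is masked but x86 isn't.
FUNKY_QUERY = """
SELECT COUNT(DISTINCT e.id)
FROM ebuild e
JOIN package p         ON p.id = e.package_id
JOIN herd h            ON h.id = p.herd_id
JOIN ebuild_eclass ee  ON ee.ebuild_id = e.id
JOIN eclass ec         ON ec.id = ee.eclass_id
JOIN ebuild_use eu     ON eu.ebuild_id = e.id
JOIN "use" u           ON u.id = eu.use_id
JOIN ebuild_keyword k1 ON k1.ebuild_id = e.id AND k1.arch = 'amd64'
JOIN ebuild_keyword k2 ON k2.ebuild_id = e.id AND k2.arch = 'x86'
WHERE ec.name = 'eutils'
  AND u.name = 'alsa'
  AND h.name = 'video'
  AND k1.status = 'masked'
  AND k2.status <> 'masked'
"""


def build_demo_db():
    """Create an in-memory database holding the sketch schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executescript(SAMPLE)
    return conn


def count_matching(conn):
    """Run the funky query and return the number of matching ebuilds."""
    return conn.execute(FUNKY_QUERY).fetchone()[0]


if __name__ == "__main__":
    print(count_matching(build_demo_db()))
```

The view is there to show why good views should help: without it, every query that cares about keywords has to repeat the ebuild_arch / arch join by hand.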
So there’s my big announcement. Woots. I’m working on creating the SQL to import everything right now (which takes a long time), and once that’s done, I’ll throw up a db dump somewhere. There’s still lots to be done, like finishing the import scripts and setting up some webpages to browse the tree, but it shouldn’t be too hard. I’m definitely over the worst of it.
Sounds like a really cool project! Will you provide a non-web frontend (because, you know, web apps are uncool and all that)?
Nice! I was doing something like this, but lack of time is chronic here. Would you share some code, just to calm the curiosity of a friendly developer (me)?
once the importer works at a sufficient level it may be interesting to write something like portagefs for fuse. shouldn’t be a problem to put the fs-action into the database. a binary fallback provides safety and the parser provides all the database candy you can want.
I see CREATE VIEW on the portage tree, while its backend is still rw-accessible through /usr/portage
Hey, this is seriously cool! I thought that I was the only one in the world who considered a db-backed portage to be a much needed feature! Damn, I’m tired of editing long-ass, unorganized, file-based /usr/portage/package.* files!
This is where it’s at – can’t wait to see it stabilize!
Maybe you’d want to consider making this work under berkdb or sqlite, as those are both much lighter weight?
( though I concur with your opinion of postgresql – it’s my rdbms of choice also )