WIP: add job-sql module and flux-sql query tool #7191
base: master
Conversation
Problem: the JSON1 extension to sqlite3 is not available in RHEL 8 based distros. JSON1 is enabled by default as of version 3.38.0 (or in earlier versions as an opt-in build option); Ubuntu 22.04 ships 3.37.2, while RHEL 8 ships 3.26.0. Pull in the sqlite3 amalgamated source for 3.51.0 from https://sqlite.org/download.html (license: "public domain", see https://sqlite.org/copyright.html).
Problem: content-sqlite and job-archive use external libsqlite3 but we now have an internal copy of the library. Drop configure requirement and change the build rules for those broker modules to use the internal copy of libsqlite3.
Problem: the debian control file and scripts that install dependencies pull in libsqlite3-dev packages. Drop sqlite from those scripts.
Problem: docker test images pull in libsqlite3 but this is no longer a build requirement for flux-core. Update Dockerfiles.
Problem: libsqlite3 is included in coverage. Add it to CODE_COVERAGE_IGNORE_PATTERN.
Problem: libsqlite3 is spell checked in CI. Add it to the .typos.toml extend-include list.
Problem: Flux doesn't have a raw SQL interface to job data that can utilize the sqlite JSON1 extensions. Add a service that consumes the job manager journal and populates an in-memory sqlite database with all jobs, active and inactive. The schema simply stores the jobid, eventlog, jobspec, and R; the last three are kept in JSON format so the sqlite JSON1 functions can be used to construct queries. https://www.sqlite.org/json1.html
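A minimal sketch of what that schema and a JSON1 query over it might look like (the column names here are assumptions based on the description above, not necessarily the module's actual DDL):

```sql
-- Hypothetical schema for the job-sql module as described above;
-- the real column names/types in the PR may differ.
CREATE TABLE IF NOT EXISTS jobs (
    id       INTEGER PRIMARY KEY,  -- jobid
    eventlog TEXT,                 -- eventlog, stored as JSON
    jobspec  TEXT,                 -- RFC 25 jobspec, stored as JSON
    R        TEXT                  -- RFC 20 resource set, stored as JSON
);

-- Example JSON1 query: list each job with its jobspec duration.
SELECT id, json_extract(jobspec, '$.attributes.system.duration')
  FROM jobs;
```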
Problem: the job-sql service has no command line client. Add one.
Codecov Report
Additional details and impacted files:
@@ Coverage Diff @@
## master #7191 +/- ##
==========================================
- Coverage 83.71% 83.67% -0.04%
==========================================
Files 553 554 +1
Lines 92270 92309 +39
==========================================
+ Hits 77240 77244 +4
- Misses 15030 15065 +35
Just a reminder, I did have an alternate solution in an earlier PR. That work was never pushed hard because @ryanday36 eventually no longer needed it for his reporting purposes. I've been maintaining the PR for some time under the assumption of "we'll need this some day".

I should say the PR above did assume no access to sqlite json queries (it was done before consideration of vendoring sqlite, because older versions of sqlite don't have the json support). There are downsides to that approach and positives to this approach. And there's devil in the details with the json queries.
See also: https://sqlite.org/loadext.html

These are plugin extensions to sqlite3 and are how JSON1 was implemented. (I think @trws may have been referring to this in a meeting a few weeks ago.)
I was playing around with this a bit. This is what I came up with to get the job id of every job with a specific user: see the sketch below.

Although there's devils in the details, I think doing a giant AND of all the conditions works for most fields. What is very different is the FROM statement (see the sketch below).

Thought 1) What if we did a database "mix" similar to what I did in the PR mentioned above? If we added in columns for the most common things, the most common queries could skip the json functions entirely.

Thought 2) Could we lay out the eventlog smarter? One idea: instead of an array of all events, have an object that holds the array of events, but have keys that store a handful of the most important events, so lookups can be easy and we avoid the JOIN.

Edit: Oh yeah, this may be critical. Some events can be duplicated (priority comes to mind) so we need the last entry in the array. This could make things hairy (the internet says to use some ORDER BY and LIMIT statements, but that may get hard / tricky if we're combining multiple subqueries).

Just some initial thoughts. I haven't prototyped anything deeply.
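A hedged reconstruction of what such a query could look like (my guess at the shape, not the original; it assumes eventlog is stored as a JSON array of RFC 18/21 event objects, and 5588 is a made-up userid):

```sql
-- Find the id of every job submitted by a given user, by scanning each
-- job's eventlog for its submit event. json_each() in the FROM clause
-- expands the eventlog array into one row per event.
SELECT jobs.id
  FROM jobs, json_each(jobs.eventlog) AS e
 WHERE json_extract(e.value, '$.name') = 'submit'
   AND json_extract(e.value, '$.context.userid') = 5588;
```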
Ok, I just realized something about the join. I'm obviously not the most expert sqlite person out there, but AFAICT there is no way to "simply" do this with any of the sqlite json functions available. The most logical way to do this (using my example above) would be to do a JOIN between jobs, "submit events", and "finish events", which I think would involve doing nested SQL queries within the main SQL query.

Add in my comment above about ensuring we get the last entry within an eventlog array (which may also involve nested sql queries), and I'm not sure this is a path we want to go down. Having a slightly duplicated eventlog format or a different database format may be a necessity.

Edit: I suppose I did not include the possibility of a sqlite extension. We could hypothetically add a custom extension to handle this.
Ouch, yeah, that sounds like it would be really expensive. We could probably use the json support as a way to extract things into a table or tables, virtual or otherwise, to at least make the transformation mechanical, if that would help. Not sure what that would look like off the top of my head, but it might be a relatively low cost way to have the other view/format.
Yeah, that's what I was referring to with my "nested SQL queries". Like (in pseudo-query language) the sketch below.
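A hedged reconstruction of what that nested shape could look like (not the original pseudo-query; it assumes the eventlog-as-JSON-array layout from the earlier sketch, and RFC 21 context fields):

```sql
-- Join the jobs table against two nested subqueries, one extracting
-- each job's submit event and one extracting its finish event.
SELECT jobs.id,
       json_extract(s.event, '$.context.userid') AS userid,
       json_extract(f.event, '$.context.status') AS status
  FROM jobs
  JOIN (SELECT jobs.id AS id, e.value AS event
          FROM jobs, json_each(jobs.eventlog) AS e
         WHERE json_extract(e.value, '$.name') = 'submit') s
    ON s.id = jobs.id
  JOIN (SELECT jobs.id AS id, e.value AS event
          FROM jobs, json_each(jobs.eventlog) AS e
         WHERE json_extract(e.value, '$.name') = 'finish') f
    ON f.id = jobs.id;
```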
Fair. I usually think in terms of views or constructed tables for things like this, more than subqueries, because of the performance and composability differences, but they're effectively equivalent. One thing that comes to mind: would the table-valued json functions help here?
I didn't try those. Do you think they would make the queries easier?
It depends on what the query is. For some things it would likely make it harder, because you now need to aggregate things across rows; for others it could be easier, because you can get all the entries where the field you care about matches, then join back on the table with the parent and ID keys to get all the meaningful entries. If the performance of the decomposition is reasonable, I could see that being faster than having to separately run the json functions over every row.
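A sketch of that decomposition idea (the temporary table and column names are invented for illustration, and the userid is made up):

```sql
-- Explode every eventlog into an events table once, so ordinary SQL
-- (plus a join back to jobs if needed) replaces per-query JSON walks.
CREATE TEMP TABLE events AS
SELECT jobs.id AS jobid,
       e.key   AS idx,                           -- position in the array
       json_extract(e.value, '$.name') AS name,  -- event type
       e.value AS event                          -- full event object
  FROM jobs, json_each(jobs.eventlog) AS e;

-- e.g. jobids of every job a given user submitted:
SELECT jobid FROM events
 WHERE name = 'submit'
   AND json_extract(event, '$.context.userid') = 5588;

-- and "last entry wins" for duplicated events becomes a plain MAX()
-- (SQLite returns bare columns from the MAX() row in this form):
SELECT jobid, event, MAX(idx) FROM events
 WHERE name = 'priority' GROUP BY jobid;
```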
A random thought I had tonight, regarding my idea 1 and idea 2 that I listed above. Idea 1: break the most common fields out into columns. Idea 2: store the events array alongside keyed copies of the most important events. I thought, why not do both? (Something like the sketch below.)

The idea 1 part optimizes the known most common queries now, and presumably most of the common ones in the future. The idea 2 part gives flexibility for the future: for example, as new event types are added to the eventlog, we have a mechanism that serves us going into the future.
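To make that concrete, a sketch of what doing both could look like (the column names and the eventlog layout here are guesses, not a worked-out schema):

```sql
-- Idea 1: break the most commonly queried fields out into real columns.
-- Idea 2: keep the full events array for history, plus top-level keys
-- holding the most recent event of each type for cheap lookups, e.g.
--   {"events":[...], "submit":{...}, "priority":{...}, "finish":{...}}
CREATE TABLE jobs (
    id       INTEGER PRIMARY KEY,
    userid   INTEGER,   -- idea 1 columns (guesses at "most common")
    name     TEXT,
    queue    TEXT,
    eventlog TEXT,      -- idea 2 layout, as sketched above
    jobspec  TEXT,
    R        TEXT
);

-- A keyed lookup then needs no join or json_each():
SELECT id FROM jobs
 WHERE json_extract(eventlog, '$.finish.context.status') = 0;
```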
I like the idea. Are you thinking the events in the events array would be complete, or that they would be references out like [<type>,<index>]/[“submit”,5]?
Kinda orthogonal, and I’m not sure how big these normally get, but if we’re considering reworking the log, is it worth considering making it possible to incrementally parse it, or even just the events array maybe? Specifically we could do a JSONL style array so it doesn’t have to be parsed, serialized or even sent as one immense object.
Are you talking about the KVS eventlogs (see RFC 18)? They are already essentially in JSONL format IIUC: each eventlog entry is a valid JSON object, and entries are separated by newlines.

Edit: just realized you were probably talking about something specific to this implementation, just ignore me if so 🤦
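For reference, a couple of RFC 18-style entries (timestamps and context values made up for illustration), one JSON object per line:

```
{"timestamp":1738000000.0,"name":"submit","context":{"userid":5588,"urgency":16,"flags":0}}
{"timestamp":1738000002.5,"name":"priority","context":{"priority":16}}
```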
Honestly I just didn't know that. That makes me think we might want to expand on that, though, rather than going from it to one big object. We've already seen some scaling issues with large json objects in other places, and JGF is likely to get that treatment at some point, so we may as well keep it that way here. Still a good idea to be able to get the events efficiently by type, but keeping it incremental would be good.
I imagined them being complete. Minimally, because there can be multiples of some events (e.g. multiple "priority" events), the array of events is the "source of truth" for full history; the "keyed" events hold the most recent one.

Edit: Sorry, I think I read your comment backwards. The "keyed" events would point to the array, not the other way around. I initially imagined both being "complete": the array being complete for "full history", and the "keyed" events being complete so that it would make json queries simpler (i.e. accessible through a single json path).
Problem: we have a need in production to make job data persistent for longer than the 2 week purge window we typically use now. Short term, we'll try increasing the job purge window, but the KVS and the memory footprint of job-list are potential limits to scalability. Some way to persist job data beyond the KVS purge window could be useful, and a solution that allows current flux jobs queries to work seamlessly on historical and current data is preferred.

This is #5847 re-submitted with a newer vendored libsqlite3.

To recap, a job-sql module is added that is a job manager journal consumer that dumps job data into a sqlite database. In this WIP PR, the database is in-memory, but a configuration option to make it persistent could be trivially added, along with the logic needed to pick up from the journal where it left off when the system is restarted.

As discussed in #5847, exposing SQL directly to guests is a non-starter, but either reworking job-list to use (this?) sql back end, or adding a job-list compatible RPC method within this module, is one possible path forward.

This was posted primarily to get feet wet with job queries using the JSON1 syntax, and some shortcomings with that approach were identified in #5847: for example, JSON1 doesn't know about hostlists, idsets, or eventlogs. However, maybe a combination of SQL query post-processing and expansion of the schema to include some broken-out fields could be used to get around those limitations.

Anyway, posting this mainly for discussion. Since this initial work was already done, it's a potential fast path to...something.