Daily report (7/1)

Today was successful. I got the control plane to stop crashing and the configuration tests to run to completion properly. The benchmark is written, and reloading the configuration is much faster than killing the iRODS server and starting it again.

Of course, reloading has limitations. Currently I believe the authentication modules do not behave properly when their settings are changed. Honestly, this isn't a simple program; Apache is far simpler than iRODS, especially in terms of dependencies, and I'm skeptical that reloading can be made to work perfectly in every situation.

The problem I had yesterday was the control plane catching the SIGHUP signal and terminating the server. With that fixed, it's working like a charm; all I need to do now is see whether we can replace more of the calls to restart in the tests. Might even shave a percent off the test run 🙂

Daily Report (6/27)

Today has been spent addressing comments made on the pull request.

  • Address the visibility of the acquire_*_lock functions, which implied that the class was meant to be derived from
  • Replace rodsLog calls in get_server_property with the modern logging system
  • Restore some whitespace

In addition, I updated the rodsLog calls in rodsServer and irods_server_properties.

I still need to rewrite the hook manager to avoid introducing yet another singleton.

Daily Report (6/22)

I am starting today by running more of the test suites to see whether the changes have broken anything unrelated. So far so good, except for finding that the build+test script was running the specific test "0" (an uninitialized variable).

Resource testing seems to fail erratically in the same spots as before. I'm going to assume that's fixed in main.

I should really pull from main sooner rather than later; rebasing could get tricky if I'm not careful.

Oddly enough, moving the delay server tests towards using .reload_configuration() made things slower. This seems to be because, until now, the delay server never refreshed the amount of time it waits between passes. I suspect there will be many more instances of things like that.

Proving that suspicion right, the delay server will need to be kicked over whenever the number of executors or the size of the queue changes, as boost::asio::thread_pool does not appear to support changing its thread count mid-lifetime. I should be able to swap the pool out after calling .stop() and .join() on it (I will not use placement new for assignment) :p

Right now, it appears not to be killing the delay server properly. Or grandpa (the top-level iRODS process) is falling down unexpectedly.

I need to go over some of this with K and T tomorrow.

The delay server should use the hook manager to change the executor count, and it should probably use .reload instead of .capture.

Daily report (6/20)

The issue I am currently facing is the rule language. Given that this was a sticking point last year, I can't say I'm surprised. I can't see what msi_get_server_property is actually returning, and given that rules run in the iRODS agent server, it may need additional logic there.

I am having trouble getting the rule to write the property out to stdout, which is not making this task any simpler. I suspect that weird issues like this will keep popping up.

The output appears normal to me.

Weekly journal (6/13-6/18)

This week was spent writing the configuration_hook_manager's code. I have started working on tests.

The primary issues I have run into are moving between machines, and my desktop's hard drive being slow enough to substantially hold up my work.

msiModAVUMetadata is a complicated little microservice that I'm not sure about. It seems oddly designed as a function to be called; it feels like you're driving some little piece of imeta.

I have added a flag to my build+test script to allow specifying which tests to run.

Right now I need to add the configuration reloading code back to the cron thread; it will just use SIGUSR1 as the trigger to enable it. That might not be necessary, as the configuration is indeed being reloaded already.

Currently I am having trouble getting the test to finish, and there is a conspicuous lack of logs related to the specific failure. Python's lack of ahead-of-time syntax checking is annoying.

Daily report (6/16/2022)

Work continues slowly on the configuration_hook_manager. I moved my installation over to an SSD, so development is happening faster now.

At some point I should look into running an LSP server from inside the iRODS builder; I might actually be able to get good completion and navigation.

I have moved the configuration update logic so that it reloads on every pass of the main thread's loop. This may need revision, or similar amendments in the other main loops.

The next thing I need to do is write tests for this, and then revise the existing tests to use it instead of .restart() where possible. There are three places I would like to see it work: some basic delay server properties, and some agent server properties that are relevant.

I don't think a more comprehensive JSON differ would add anything the server needs so far. If something like that becomes necessary, it shouldn't be too hard to adapt this one.

The tests will require a rule that queries the server's server_properties, so I have written a microservice called msi_get_server_property to do exactly that. With luck it will let the tests avoid all the timing issues that reading the log is susceptible to.

iRODS daily report (6/15/2022)

I have continued to work on the configuration_hook_manager. It is slowly coalescing into something that can truthfully be said to compile. There are a few issues I still have to figure out, such as how to handle nested properties, and whether dispatching every event to every hook makes sense from a performance point of view (okay, fine, it won't matter yet, but it might one day).

Another point in this design is that it does not currently detect changes to nested objects such as advanced_settings in a manner that I find satisfying; namely, anything inside such an object has to watch for changes on its own.

Much of today was spent getting the development/testing environment working 100% again. The build script now cleans up everything from previous invocations, and takes options to skip rebuilding iRODS or to set the number of tests to run at once (I can't run 16 at once, my hard drive is too slow for that, but 2 works).

But as I write this, I'm not sure this is the right course of action; the immediately desired outcome (making testing faster by eliminating restarts) might not even need the configuration hook manager.

I have yet to attach the SIGHUP handler to the reload mechanism, and I still need to talk to J about their ideas for this.

iRODS progress report 6/3/2022

What's been done

  • Development environment copied to laptop and working.
  • Using shared_mutex to synchronize access to the server_properties object
    • [x] Server compiles
    • [x] Server tests run successfully without adding regular calls to server_properties::capture
    • [ ] Server tests run successfully with calls to capture.
  • [ ] Integrate calls to capture back into the cron system

Problems run into so far

  • I didn't have vim installed on my laptop
  • The build script was set to use the 16 cores on my desktop
  • After adding the shared_mutex, the server still crashed when capture was called

What's next

  • Use --leak-containers when running the run_core_tests.sh script to keep the container up long enough for me to muck around in it and see what happens. Alternatively, I could use stand_it_up.py to just stand up a server.
  • Investigate where the map() function is called on server_properties. It might be a good idea to do away with it if it is preventing thread safety.

iRODS progress report Day 2 (6/2/2022)

Today I spent my time trying to get the configuration file to reload after I successfully ran the test suite.

The first approach I tried was hooking it into the Cron system. This has failed so far because reloading the configuration on the main server causes use-after-frees (I think) due to contention over the JSON objects across different threads.

Reloading the server_properties object from any thread causes the same issues.

The correct way to fix it, I believe, is to add a read-write mutex that guards against mutation in flight. This will require additional copying, but given that reloading will be fairly rare, it should not meaningfully affect performance.

Once it is thread-safe, the cron system can be used to check for changes in the configuration file.

Tomorrow I will most likely be setting it up again on my laptop.