Daily Report(6/10/2022)

Today I investigated enhancing the reload() method so that it produces a list of difference objects. So far this has been relatively simple to implement.
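As a rough illustration of what a list of difference objects could look like, here is a minimal sketch. It assumes the configuration can be modeled as a flat map of strings, which is a simplification of the real JSON-backed store. The names config_map, config_change, and diff_config are hypothetical and exist only for this example.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the real configuration type, which holds JSON.
using config_map = std::map<std::string, std::string>;

// One difference object: which property changed, plus its value before and after.
// An empty optional means the property was absent on that side.
struct config_change {
    std::string name;
    std::optional<std::string> old_value;
    std::optional<std::string> new_value;
};

// Compare two snapshots and list every removed, modified, or added property.
std::vector<config_change> diff_config(const config_map& before, const config_map& after)
{
    std::vector<config_change> changes;
    for (const auto& [name, old_value] : before) {
        const auto it = after.find(name);
        if (it == after.end()) {
            changes.push_back({name, old_value, std::nullopt});  // removed
        }
        else if (it->second != old_value) {
            changes.push_back({name, old_value, it->second});    // modified
        }
    }
    for (const auto& [name, new_value] : after) {
        if (before.find(name) == before.end()) {
            changes.push_back({name, std::nullopt, new_value});  // added
        }
    }
    return changes;
}
```

A reload() built this way could take a snapshot before merging, diff it against the result, and hand the list to whatever consumes the changes.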

Next week I plan to figure out the design of what will become the configuration_change_hook_manager facility, but first, here are some preliminary thoughts:

  • The hook should be addressed by the configuration property's name.
  • The hook should have the new and old values passed to it.
  • The hook manager should be invoked on the list of changes to the configuration.
  • configuration_hook should follow a builder pattern, as the cron_manager's cron_task does.
  • It should make noise in the log about unhooked configuration changes.
    • This is important for debugging and for properly informing admins about the issues that not restarting their server might cause.
    • It might be worth adding a configuration option to make it error out when a configuration change isn't able to be handled.
  • I should talk to J about this since their work is likely to intersect it.

iRODS daily report(6/15/2022)

I have continued to work on configuration_hook_manager. It is slowly coalescing into something that can truthfully be said to compile. There are a few issues that I have to figure out, such as how to handle nested properties, and whether or not dispatching every event to every hook makes sense from a performance point of view (okay, fine, it won't matter yet, but it might one day).

Another point in this design is that it does not currently track changes to nested objects such as advanced_settings in a manner that I find satisfying: anything inside such an object has to watch for changes on its own.

Much of today was spent trying to get the development/testing environment working 100% again. The build script now cleans up everything from previous invocations, and takes options to skip rebuilding iRODS and to set the number of tests to run at once (I can't run 16 at once because my hard drive is too slow for that, but 2 works).

But as I write this, I'm not sure this is the right course of action. The immediately desired outcome (making testing faster by eliminating reloads) might not even need the configuration hook manager.

I have yet to attach the SIGHUP handler to the reload mechanism. And I still need to talk to J about their ideas for this.
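One common way to wire SIGHUP to a reload is to have the handler do nothing but set a flag, and let the main loop notice the flag and call the reload itself. The sketch below assumes a POSIX system and a main loop that polls regularly. The names reload_requested, on_sighup, install_sighup_handler, and poll_reload are hypothetical, and the reload callable is a placeholder for whatever the real mechanism turns out to be.

```cpp
#include <csignal>

// Set from the signal handler, consumed by the main loop.
volatile std::sig_atomic_t reload_requested = 0;

// Only async-signal-safe work is allowed in a handler, so it just raises a flag.
void on_sighup(int)
{
    reload_requested = 1;
}

// Call once at startup.
void install_sighup_handler()
{
    std::signal(SIGHUP, on_sighup);
}

// Call from the main loop. Runs the reload callable if a SIGHUP arrived since
// the last poll, and reports whether it did.
template <typename Reload>
bool poll_reload(Reload&& reload)
{
    if (!reload_requested) {
        return false;
    }
    reload_requested = 0;
    reload();
    return true;
}
```

Keeping the reload out of the handler matters because parsing a file or taking a lock inside a signal handler is not safe.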

Daily Report(6/7/2022)

Today has been marked by some progress in eliminating dead ends while exploring fixes for the bug. Changing the merge behavior in .capture() was successful in getting the main server to start yesterday, but it broke the agent server by somehow eliminating some of the information. But I had a thought: if I just make a new function, maybe .reload(), it could behave differently than .capture().

Daily report(6/9/2022)

Today, with the help of the team, I was able to resolve the issue with the server not being able to be stood up. The solution was to use docker-compose to tear down the previously created containers. Without that, the database lingered, which caused errors in setup and resulted in a non-viable configuration.

With that working, adding a reload() method which merges the file's properties with the existing properties (permitting runtime properties to remain) is sufficient to permit tests to pass.
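The merge described here can be illustrated with a small sketch. It assumes a flat map of strings in place of the real JSON object, and the free function reload() and the alias config_map are hypothetical stand-ins rather than the actual server_properties interface.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the JSON-backed property store.
using config_map = std::map<std::string, std::string>;

// Overlay the values read from the file onto the live properties.
// Keys present only in `live` (for example, values set at runtime) are left untouched.
void reload(config_map& live, const config_map& from_file)
{
    for (const auto& [key, value] : from_file) {
        live[key] = value;  // inserts new keys and overwrites existing ones
    }
}
```

The trade-off in this sketch is that a key deleted from the file stays in the live map until the server restarts.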

Daily report(6/6/2022)

Today has been slow because I was stuck with nothing more than my laptop. This has kept builds glacial and prevented most of the testing that I was hoping to do today. There is a simple explanation of the problems, though: for whatever reason, when the capture() method is called on the server settings, it appears to have trouble acquiring most of the session variables.

I have managed to change where the crash happens by merging the configuration objects instead (replacing the old values with new values, but not getting rid of other old values), but that doesn't do a lot of good right now because it seems that there is still something missing from the configuration object that the agent server gets.

This probably means that I broke something or other. 🙂

iRODS progress report 6/3/2022

What's been done

  • Development environment copied to laptop and working.
  • Using shared_mutex to synchronize access to the server_properties object
    • [x] Server compiles
    • [x] Server tests run successfully without adding regular calls to server_properties::capture
    • [ ] Server tests run successfully with calls to capture.
  • [ ] Integrate calls to capture back into the cron system

Problems run into so far

  • I didn't have vim installed on my laptop
  • The build script was set to use the 16 cores on my desktop
  • After adding the shared_mutex, it still crashes when capture is called

What's next

  • Use --leak-containers when running the run_core_tests.sh script to keep the container up long enough for me to muck around in it and see what happens. Alternatively, I could use stand_it_up.py to just stand up a server.
  • Investigate where the map() function is called on server_properties. It might be a good idea to do away with it if this is preventing thread safety.

iRODS progress report Day 2(6/2/2022)

Today I spent my time trying to get the configuration file to reload after I successfully ran the test suite.

The first approach that I tried was hooking it into the Cron system. This has failed thus far because reloading the configuration on the main server causes use-after-frees (I think) due to contention over the json objects across different threads.

Reloading the server_properties object from any thread causes the same issues.

The correct way to fix it, I believe, is to add a read-write mutex that guards against mutation in flight. This will require additional copying, but given that the reloading is going to be fairly rare it should not affect performance for long.
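A minimal sketch of the read-write mutex idea, using std::shared_mutex from the standard library. The class guarded_properties and its get() and replace() members are hypothetical, and a map of strings stands in for the JSON object. Readers receive copies, which is where the additional copying comes from.

```cpp
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <utility>

class guarded_properties {
public:
    // Readers take a shared lock and receive a copy, so no reference into the
    // map outlives the lock.
    std::optional<std::string> get(const std::string& key) const
    {
        std::shared_lock lock{mutex_};
        const auto it = props_.find(key);
        if (it == props_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // The writer takes an exclusive lock and swaps in the new map in one step.
    void replace(std::map<std::string, std::string> fresh)
    {
        std::unique_lock lock{mutex_};
        props_ = std::move(fresh);
    }

private:
    mutable std::shared_mutex mutex_;
    std::map<std::string, std::string> props_;
};
```

Many readers can hold the shared lock at once, so the cost is only felt during the rare moment a reload holds the exclusive lock.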

Once it is thread-safe the cron system should be able to be used to check for changes in the configuration file.

Tomorrow I will most likely be setting it up again on my laptop.