-
Notifications
You must be signed in to change notification settings - Fork 12
Add tools for ContextDefinition create and cleanup #33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Modified ContextDefinition to implement KeywordExecutable so that the user can use `accumulo create-context-definition` to create the json required for the ContextDefinition file. Added a Cleaner to call `URLClassLoader.close` on instances that are about to be garbage collected. The LocalCachingContextCleaner is used to register the URLClassLoader with the Cleaner and this class also keeps references to the URLClassLoaders. The list is accessed from a MXBean so that a user can use JMX to retrieve a list of files that are referenced in the local cache directory from the classloaders that have been created but not yet gc'd. This should aid users in determining which files can be removed from the local cache directory.
...test/java/org/apache/accumulo/classloader/lcc/LocalCachingContextClassLoaderFactoryTest.java
Outdated
Show resolved
Hide resolved
| @Override | ||
| public void configureMiniCluster(MiniAccumuloConfigImpl cfg, | ||
| org.apache.hadoop.conf.Configuration coreSite) { | ||
| cfg.removeJvmOption("-XX:+PerfDisableSharedMem"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requires apache/accumulo#5999 to be merged. This JVM option was not allowing the test to attach to the locally running Accumulo JVM processes. Additionally, I think this is what causes jps not to report the processes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apache/accumulo#6062 probably affects this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Working on that now.
pom.xml
Outdated
| <banDuplicatePomDependencyVersions /> | ||
| <dependencyConvergence /> | ||
| <banDynamicVersions /> | ||
| <!-- <banDynamicVersions />--> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can revert this when 2.1.5 is released and the version changed above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What in 2.1.5 does this PR depend on that is not in 2.1.4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could create a reminder issue for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What in 2.1.5 does this PR depend on that is not in 2.1.4?
Could create a reminder issue for this.
I will do that when this PR is merged.
|
This won't build successfully until apache/accumulo#5999 is merged. Tests pass locally for me. |
...lassloader/src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextCleaner.java
Outdated
Show resolved
Hide resolved
...lassloader/src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextCleaner.java
Outdated
Show resolved
Hide resolved
ctubbsii
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't quite understand why this needs to depend on 2.1.5-SNAPSHOT that isn't available in 2.1.4.
...sloader/src/main/resources/META-INF/services/org.apache.accumulo.start.spi.KeywordExecutable
Outdated
Show resolved
Hide resolved
...lassloader/src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextCleaner.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextClassLoaderFactory.java
Show resolved
Hide resolved
| and unused old files within a context cache directory. | ||
| and unused old files within a context cache directory. To aid in this task a JMX MXBean has been created to expose the | ||
| files that are still referenced by the classloaders that are created. For an example of how to use this MXBean, please | ||
| see the test method `MiniAccumuloClusterClassLoaderFactoryTest.getReferencedFiles`. This method attaches to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a link to the test class.
| see the test method `MiniAccumuloClusterClassLoaderFactoryTest.getReferencedFiles`. This method attaches to the | |
| see the test method [MiniAccumuloClusterClassLoaderFactoryTest.getReferencedFiles](src/main/test/.../MiniAccumuloClusterClassLoaderFactoryTest.java). This method attaches to the |
pom.xml
Outdated
| <banDuplicatePomDependencyVersions /> | ||
| <dependencyConvergence /> | ||
| <banDynamicVersions /> | ||
| <!-- <banDynamicVersions />--> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could create a reminder issue for this.
...lassloader/src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextCleaner.java
Outdated
Show resolved
Hide resolved
...lassloader/src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextCleaner.java
Outdated
Show resolved
Hide resolved
...lassloader/src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextCleaner.java
Outdated
Show resolved
Hide resolved
...lassloader/src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextCleaner.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/accumulo/classloader/lcc/LocalCachingContextClassLoaderFactoryTest.java
Outdated
Show resolved
Hide resolved
...lassloader/src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextCleaner.java
Outdated
Show resolved
Hide resolved
...ssloader/src/main/java/org/apache/accumulo/classloader/lcc/definition/ContextDefinition.java
Outdated
Show resolved
Hide resolved
...ssloader/src/main/java/org/apache/accumulo/classloader/lcc/definition/ContextDefinition.java
Outdated
Show resolved
Hide resolved
|
PR updated and re-tested locally |
Co-authored-by: Keith Turner <kturner@apache.org>
…on/accumulo-classloaders into local-classloader-tooling
| safe to delete, it is not recommended to do so for any that are still in use, | ||
| because they can be useful for troubleshooting. | ||
|
|
||
| To aid in this task a JMX MXBean has been created to expose the files that are still referenced |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This offers the building blocks to create a cleanup utility and leave that as an exercise to the reader. It would probably be best (in a follow on change) to write a cleanup utility in this project that uses these building blocks, for the following reasons.
- Can document and test the cleanup code in one place. If cleanup code is written, seems suboptimal for it to live in another repo or for there to be multiple implementions.
- Having a working end to end impl gives confidence that these building blocks provide everything that is needed to do cleanup.
- The cleanup code will need to have some awareness of the directory layout of the cache dir which is an internal implementation detail.
- The clean up utility will have to get the algorithm correct in order to avoid race conditions. Seems like its important to list the dir first prior to get refs from all processes, but not completely sure. Would need to experiment and see the code to be sure.
- Could make the cleanup code more robust by failing properly if no JMX seen or mismatch in dir passed in and dirs seen from jmx.
We could make the new command line utility have subcommands to support this. Something like the following.
accumulo context-classloader --help
# create context definition json and print to stdout
accumulo context-classloader create -i <interval> -f {files}
# remove files not in use by any of the given pids
accumulo context-classloader cleanup <cache dir> {pids}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might be able to create a tool that works in one deployment scenario, but it might be hard to make it work in many of them. More information on how to make JMX client connections and server options here: https://docs.oracle.com/javase/8/docs/technotes/guides/management/agent.html.
My test is using the connection mechanism described in the example at https://docs.oracle.com/javase/8/docs/technotes/guides/management/agent.html#gdhkq. it works because I'm using mini in the tests and I know that all of the processes are on the localhost and because I remove that one system property (If that system property is set then this mechanism doesn't find the processes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might be able to create a tool that works in one deployment scenario, but it might be hard to make it work in many of them
Do you know if different JMX connections could require different java code? Or can the java JMX client APIs connect to any endpoint given correct config? If the user may have to write different java code for different JMX scenarios then the best we could probably do in the project is provide a good example of the java cleanup code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know if different JMX connections could require different java code?
Yes. In the test I use
List<VirtualMachineDescriptor> vmdl = VirtualMachine.list();
and
VirtualMachine vm = VirtualMachine.attach(vmd);
String connectorAddress = vm.getAgentProperties()
.getProperty("com.sun.management.jmxremote.localConnectorAddress");
This works to find VMs that are running locally, attachs to them, and gets the connector address from them.
What I'm unsure of, is whether or not this approach would work if any of the options listed here are set on the processes. So, it's very possible that a client would need to be coded specific to how the server is configured, but I'm not 100% sure.
In the test I remove the JVM Option -XX:+PerfDisableSharedMem from the processes started by MiniAccumulo. If I don't remove this, then VirtualMachine.list() does not find the VMs (jps probably doesn't either). In that case, I would have to construct the address (service:jmx:rmi:///jndi/rmi://hostName:portNum/jmxrmi). If the server process is using security options (TLS, authentication, etc.), then more code is likely needed.
Or can the java JMX client APIs connect to any endpoint given correct config?
We would need to write, or find, a generic JMX client library that can be configured with all options. I didn't look for that specifically, but it might exist. If we can't find one, then we could write one, but I'm not sure we should do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can't find one, then we could write one, but I'm not sure we should do that.
I agree, we should not write that. If something like that is needed, then it seems like an example is the best we can do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue #45 created for this.
...ssloader/src/main/java/org/apache/accumulo/classloader/lcc/definition/ContextDefinition.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/accumulo/classloader/lcc/MiniAccumuloClusterClassLoaderFactoryTest.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/accumulo/classloader/lcc/MiniAccumuloClusterClassLoaderFactoryTest.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/accumulo/classloader/lcc/MiniAccumuloClusterClassLoaderFactoryTest.java
Outdated
Show resolved
Hide resolved
...g-classloader/src/main/java/org/apache/accumulo/classloader/lcc/util/DeduplicationCache.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/accumulo/classloader/lcc/LocalCachingContextClassLoaderFactory.java
Outdated
Show resolved
Hide resolved
...test/java/org/apache/accumulo/classloader/lcc/MiniAccumuloClusterClassLoaderFactoryTest.java
Outdated
Show resolved
Hide resolved
* POM improvements * Use var and multi-catch in getReferencedFiles * Add newline to toJson() * Check for nulls before calling user consumer * Rename method in DeduplicationCache (values to forEach)
ctubbsii
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All my suggested changes have been merged in. I am a little concerned if there are any implications to changing the DeduplicationCache instance to static, but I've thought about it and haven't been able to come up with any reason why it would be a problem.
|
What is the benefit of making the dedupe cache static? |
It had to be static to be called by the static getReferencedFiles method, which is called by the JMX MXBean. |
|
Test passes for me locally. Re-tested the |
Modified ContextDefinition to implement KeywordExecutable so that the user can use
accumulo create-context-definitionto create the json required for the ContextDefinition file.Added a Cleaner to call
URLClassLoader.closeon instances that are about to be garbage collected. The LocalCachingContextCleaner is used to register the URLClassLoader with the Cleaner and this class also keeps references to the URLClassLoaders. The list is accessed from a MXBean so that a user can use JMX to retrieve a list of files that are referenced in the local cache directory from the classloaders that have been created but not yet gc'd. This should aid users in determining which files can be removed from the local cache directory.