When a query runs on Hive, it creates a directory under `/tmp/hive` on HDFS. These directories are never cleaned up. There is a command that the system administrator can run, but according to the Hive wiki it is not considered safe.
Currently, hiveCleaner reads the inodes table, looks for directories that have not been modified for more than a week, and deletes them using the HDFS native client. To be able to do so, it needs to run as the glassfish user.
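
To make the selection rule concrete, here is a minimal sketch of the "not modified for more than a week" check. It is not the hiveCleaner code: `DirRecord`, `stale_scratch_dirs`, and the constants are hypothetical names, the records stand in for rows read from the inodes table (whose real schema is not shown here), and the actual deletion through the HDFS client is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

SCRATCH_ROOT = "/tmp/hive"    # scratch root named in this description
MAX_AGE = timedelta(days=7)   # "more than a week"


@dataclass(frozen=True)
class DirRecord:
    """Hypothetical stand-in for one directory row from the inodes table."""
    path: str
    modified: datetime


def stale_scratch_dirs(records, now, root=SCRATCH_ROOT, max_age=MAX_AGE):
    """Return paths under `root` whose last modification is older than `max_age`."""
    cutoff = now - max_age
    prefix = root.rstrip("/") + "/"
    return [
        r.path
        for r in records
        # Only consider directories below the scratch root, never the root itself.
        if r.path.startswith(prefix) and r.modified < cutoff
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        DirRecord("/tmp/hive/alice/session-1", now - timedelta(days=10)),
        DirRecord("/tmp/hive/bob/session-2", now - timedelta(days=2)),
    ]
    for path in stale_scratch_dirs(sample, now):
        print("would delete", path)
```

Passing `now` in explicitly keeps the check deterministic, which makes it easy to test without touching HDFS.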
I'm not really happy with this design. I think a better design would be to move the code into Hopsworks, where we can easily impersonate the users and check that no Tez application is running for that user. This would improve security and reduce what hiveCleaner does, making it easier to eventually merge it with the ePipe project.
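
As a rough illustration of the proposed check, the sketch below filters candidate directories so that nothing is deleted for a user who still has a running Tez application. Everything in it is an assumption for illustration: `owner_of` and `safe_to_delete` are hypothetical helpers, the layout `/tmp/hive/<user>/...` is assumed, and the set of users with running applications is taken as an input rather than fetched from YARN or Hopsworks.

```python
def owner_of(path, root="/tmp/hive"):
    """Return the user component of `root/<user>/...`, or None if `path` is elsewhere.

    Assumes scratch directories are laid out as /tmp/hive/<user>/<session>.
    """
    prefix = root.rstrip("/") + "/"
    if not path.startswith(prefix):
        return None
    user = path[len(prefix):].split("/", 1)[0]
    return user or None


def safe_to_delete(candidate_paths, users_with_running_apps, root="/tmp/hive"):
    """Keep only candidates whose owner has no running application.

    `users_with_running_apps` is supplied by the caller; how it is obtained
    (for example, by asking the resource manager) is outside this sketch.
    """
    busy = set(users_with_running_apps)
    kept = []
    for path in candidate_paths:
        owner = owner_of(path, root)
        # Skip paths outside the scratch root and paths owned by busy users.
        if owner is not None and owner not in busy:
            kept.append(path)
    return kept


if __name__ == "__main__":
    candidates = ["/tmp/hive/alice/session-1", "/tmp/hive/bob/session-2"]
    print(safe_to_delete(candidates, {"bob"}))
```

Skipping every directory of a busy user is deliberately conservative: it may leave some old directories in place for another cycle, but it avoids removing scratch data that a running query could still need.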