Deep learning is being incorporated into many modern software systems. Deep learning approaches train a deep neural network (DNN) model using training examples and then use the DNN model for prediction. While the layered structure of a DNN model is observable, the model is treated in its entirety as a monolithic component. To change the logic implemented by the model, e.g., to add or remove logic that recognizes inputs belonging to a certain class, or to replace that logic with an alternative, the training examples need to be changed and the DNN needs to be retrained on the new set of examples. We argue that decomposing a DNN into DNN modules, akin to decomposing monolithic software code into modules, can bring the benefits of modularity to deep learning. In this work, we develop a methodology for decomposing DNNs for multi-class problems into DNN modules. For four canonical problems, namely MNIST, EMNIST, FMNIST, and KMNIST, we demonstrate that such decomposition enables reuse of DNN modules to create different DNNs and enables replacement of one DNN module in a DNN with another without retraining. The DNN models formed by composing DNN modules are at least as good as traditional monolithic DNNs in terms of test accuracy for our problems.
A software component and a DNN model are similar in spirit: both encode logic and represent significant investments. The former is an investment of the developer's effort to encode the desired logic in the form of software code, whereas the latter is an investment of the modeler's effort, the effort to label training data, and the computation time needed to create a trained DNN model. The similarity ends there, however. While the independent development of software components and a software developer's ability to (re)use software parts has led to the rapid software-driven advances we enjoy today, the ability to (re)use parts of DNN models has not been, to the best of our knowledge, attempted before. The closest approach, transfer learning, attempts to reuse the entire DNN model for another problem. Could we decompose and reuse parts of a DNN model?

To that end, we introduce the novel idea of decomposing a trained DNN model into DNN modules. Once the model has been decomposed, the modules of that model might be reused to create a completely different DNN model; for instance, a DNN model that needs logic present in two different existing models can be created by composing DNN modules from those two models, without having to retrain. DNN decomposition also enables replacement: a DNN module can be replaced by another module without having to retrain the DNN. The replacement could be needed for performance improvement or for replacing a functionality with a different one.

To introduce our notion of DNN decomposition, we have focused on decomposing DNN models for multi-class classification problems. We propose a series of techniques for decomposing a DNN model for an n-class classification problem into n DNN modules, one for each label in the original model. We consider each label as a concern and view this decomposition as a separation-of-concerns problem. Each DNN module is created for its ability to hide one concern. As expected, a concern is tangled with other concerns, and we have proposed initial strategies to identify and eliminate concern interaction.

We have evaluated our DNN decomposition approach using 16 different models for four canonical datasets (MNIST, Fashion MNIST, EMNIST, and Kuzushiji MNIST). We have experimented with six approaches for decomposition, each successively refining the former. Our evaluation shows that for the majority of the DNN models (9 out of 16), decomposing a DNN model into modules and then composing the DNN modules together to form a DNN model that is functionally equivalent to the original model, but more modular, does not lead to any loss of performance in terms of model accuracy. We also examine intra- and inter-dataset reuse, where DNN modules are used to solve a different problem using the same training dataset or an entirely different problem using an entirely different dataset. Our results show that DNN models trained by reusing DNN modules are at least as good as DNN models trained from scratch for MNIST (+0.30%), FMNIST (+0.00%), EMNIST (+0.62%), and KMNIST (+0.05%). We have also evaluated replacement, where a DNN module is replaced by another, and see similarly encouraging results.
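To make the composition idea concrete, below is a minimal, illustrative sketch (not this repository's actual API) of one way per-label DNN modules could be combined into a multi-class classifier: each module scores how strongly an input belongs to its label, and the composed model predicts the label whose module is most confident. The `compose` helper and the toy stand-in modules are hypothetical and exist only for illustration.

```python
# Illustrative sketch only -- not the repository's API. It shows the general
# idea of composing per-label DNN modules into a multi-class classifier,
# assuming each module scores how strongly an input belongs to its label.
import numpy as np

def compose(modules):
    """Build a classifier from a list of per-label modules.

    `modules[i]` is any callable mapping a batch of inputs of shape (N, d)
    to a membership score per input for label i.
    """
    def predict(x):
        # Collect each module's score and pick the most confident label.
        scores = np.stack([m(x) for m in modules], axis=1)  # shape (N, n_labels)
        return np.argmax(scores, axis=1)
    return predict

if __name__ == "__main__":
    # Toy stand-ins for trained DNN modules: module i "fires" when the
    # i-th feature dominates the input.
    modules = [lambda x, i=i: x[:, i] / x.sum(axis=1) for i in range(3)]
    x = np.array([[0.9, 0.05, 0.05],
                  [0.1, 0.8, 0.1]])
    composed = compose(modules)
    print(composed(x))  # -> [0 1]
```

In this view, replacing a module amounts to swapping one entry of the `modules` list, and reuse amounts to assembling the list from modules decomposed out of different trained models.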
The source code and the results can be found at https://github.com/rangeetpan/decomposeDNNintoModules
If you have any questions, please contact the authors: Rangeet Pan (rangeet@iastate.edu) and Hridesh Rajan (hridesh@iastate.edu).
For more information, please see Contact.md
This project is licensed under the MIT License - see the LICENSE.md file for details.
Results are expected to differ from those presented in the paper.