Update run.py | allow assigning different activation functions to em… (#461)

Vibsteamer · web-flow · commit 8f26aeac16e2 · 2021-07-16T09:27:09.000+08:00
* Update run.py | allow assigning different activation functions to embedding nets and fitting nets of each model

Scenario: Models trained with the same assignment of activation functions may give close predictions for a configuration sampled from 01.model_devi, even when the atomic-environment features of that configuration are not well covered by the current data sets (the models give similarly bad predictions). How "similarly bad" they are may depend on the atomic-environment features, and so may differ when configurations are sampled from different phases.

Problem: In some circumstances, the deviation of the model ensemble may be scattered over a wide range when 01.model_devi is initiated from different phases, which makes it difficult to set a common trust_lo. A similar description can be found in issue #453.

This update: Change the expected format of the value for the key "model_devi_activation_func" in param.json from a list (e.g. ["tanh","tanh","gelu","gelu"] for 4 models) to a list of lists (e.g. [["tanh","tanh"],["tanh","gelu"],["gelu","tanh"],["gelu","gelu"]] for 4 models). The index of the second dimension allows assigning different activation functions to the embedding net and the fitting net of the same model. (The original version already allowed assigning different activation functions to different models, but within each model the embedding and fitting nets used the same one.)

Note: This update is a preview feature. Please make sure it is safe and applicable for your use case. A large enough "stop_batch" may be needed to keep models with different activation functions from deviating on configurations that are already well covered by the current data sets, since the models differ in sensitivity to training length and particular models may be insufficiently trained. The "init-model" function supported by DP-GEN may be a good choice to try in some circumstances (please see related keys such as "training_reuse_iter" and "training_init_model").

* Update run.py backward compatibility for the original 1-dim list

* Create README.md for "model_devi_activation_func"
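The two accepted formats described above can be illustrated with a minimal sketch. The helper name `resolve_activation_funcs` is hypothetical and is not part of DP-GEN; it only shows, under the assumption that each entry is either a string or a two-element list, how a value of "model_devi_activation_func" could be mapped to one (embedding, fitting) pair per model.

```python
def resolve_activation_funcs(model_devi_activation_func, numb_models):
    """Return one (embedding, fitting) activation-function pair per model.

    Hypothetical helper for illustration only; not the DP-GEN implementation.
    """
    if not isinstance(model_devi_activation_func, list):
        raise TypeError("model_devi_activation_func must be a list")
    if len(model_devi_activation_func) != numb_models:
        raise ValueError("length must equal numb_models")

    pairs = []
    for entry in model_devi_activation_func:
        if isinstance(entry, str):
            # Original 1-dim format: both nets share the same function.
            pairs.append((entry, entry))
        elif isinstance(entry, (list, tuple)) and len(entry) == 2:
            # 2-dim format: [embedding_net_func, fitting_net_func].
            pairs.append((entry[0], entry[1]))
        else:
            raise ValueError(f"unsupported entry: {entry!r}")
    return pairs


# Example: 4 models with net-resolved activation functions.
new_style = [["tanh", "tanh"], ["tanh", "gelu"], ["gelu", "tanh"], ["gelu", "gelu"]]
# Example: the original format, one function per model.
old_style = ["tanh", "tanh", "gelu", "gelu"]

for emb, fit in resolve_activation_funcs(new_style, 4):
    print(f"embedding={emb}, fitting={fit}")
```

Checking each entry individually, rather than inspecting the shape of the whole list, is a design choice of this sketch: it reports a malformed entry explicitly instead of silently skipping the assignment.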
diff --git a/README.md b/README.md
@@ -541,7 +541,7 @@ The bold notation of key (such aas **type_map**) means that it's a necessary key
 | **model_devi_e_trust_hi**  | Float | 1e10                                                         | Upper bound of energies for the selection. |
 | **model_devi_clean_traj**  | Boolean | true                                                         | Deciding whether to clean traj folders in MD since they are too large. |
 | **model_devi_nopbc**  | Boolean | False                                                         | Assume open boundary condition in MD simulations. |
-| model_devi_activation_func | List of String | ["tanh", "tanh", "tanh", "tanh"]	| Set activation functions for models, length of the list should be the same as `numb_models` |
+| model_devi_activation_func | List of list of string | [["tanh","tanh"],["tanh","gelu"],["gelu","tanh"],["gelu","gelu"]]	| Set activation functions for models. The length of the outer list should be the same as `numb_models`, and the two elements of each inner list assign the activation functions of the embedding net and the fitting net of that model, respectively. *Backward compatibility*: the original "List of String" format is still supported; in that case the embedding and fitting nets of each model use the same activation function, and the length of the list should be the same as `numb_models`. |
 | **model_devi_jobs**        | [<br/>{<br/>"sys_idx": [0], <br/>"temps": <br/>[100],<br/>"press":<br/>[1],<br/>"trj_freq":<br/>10,<br/>"nsteps":<br/> 1000,<br/> "ensembles": <br/> "nvt" <br />},<br />...<br />] | List of dict | Settings for exploration in `01.model_devi`. Each dict in the list corresponds to one iteration. The index of `model_devi_jobs` exactly accord with index of iterations |
 | **model_devi_jobs["sys_idx"]**    | List of integer           | [0]                                                          | Systems to be selected as the initial structure of MD and be explored. The index corresponds exactly to the `sys_configs`. |
 | **model_devi_jobs["temps"]**  | List of integer | [50, 300] | Temperature (**K**) in MD
diff --git a/dpgen/generator/run.py b/dpgen/generator/run.py
@@ -390,8 +390,12 @@ def make_train (iter_index,
             if LooseVersion(mdata["deepmd_version"]) < LooseVersion('1'):
                 raise RuntimeError('model_devi_activation_func does not suppport deepmd version', mdata['deepmd_version'])
             assert(type(model_devi_activation_func) is list and len(model_devi_activation_func) == numb_models)
-            jinput['model']['descriptor']['activation_function'] = model_devi_activation_func[ii]
-            jinput['model']['fitting_net']['activation_function'] = model_devi_activation_func[ii]
+            if len(np.array(model_devi_activation_func).shape) == 2 :                                    # 2-dim list: separate activation functions for the embedding and fitting nets
+                jinput['model']['descriptor']['activation_function'] = model_devi_activation_func[ii][0]
+                jinput['model']['fitting_net']['activation_function'] = model_devi_activation_func[ii][1]
+            if len(np.array(model_devi_activation_func).shape) == 1 :                                    # for backward compatibility, 1-dim list, not net-resolved
+                jinput['model']['descriptor']['activation_function'] = model_devi_activation_func[ii]
+                jinput['model']['fitting_net']['activation_function'] = model_devi_activation_func[ii]
         # dump the input.json
         with open(os.path.join(task_path, train_input_file), 'w') as outfile:
             json.dump(jinput, outfile, indent = 4)