Hi all,
I have uploaded to GitHub a skeleton Python script that contains my thoughts on a software implementation. This is at a very early stage and will probably change considerably.
See
https://github.com/OpenBioSim/openbiosetup/tree/feature_driver
The current design has three categories of data:
- sequences
- structures
- models
Each category specialises into various types (protein, organic, DNA, ...).
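To make the category/type split concrete, here is a minimal Python sketch. The class names (MoleculeType, Sequence, Structure, Model), their fields, and the idea that a model pairs one sequence with one structure are all assumptions for illustration; they are not taken from the skeleton script.

```python
from dataclasses import dataclass, field
from enum import Enum


class MoleculeType(Enum):
    """Types that each category can specialise into (hypothetical names)."""
    PROTEIN = "protein"
    ORGANIC = "organic"
    DNA = "dna"


@dataclass
class Sequence:
    """Connectivity-level description, e.g. a protein sequence or a SMILES string."""
    name: str
    mol_type: MoleculeType
    text: str


@dataclass
class Structure:
    """3D coordinates; atoms are (element, x, y, z) tuples in this sketch."""
    name: str
    mol_type: MoleculeType
    atoms: list = field(default_factory=list)


@dataclass
class Model:
    """Assumed here to be a sequence paired with a structure of the same type."""
    sequence: Sequence
    structure: Structure

    def __post_init__(self):
        # Refuse to pair, say, a protein sequence with an organic structure.
        if self.sequence.mol_type is not self.structure.mol_type:
            raise ValueError("sequence and structure types differ")


# Usage: an organic molecule described by a SMILES string and one atom.
seq = Sequence("ethanol", MoleculeType.ORGANIC, "CCO")
struct = Structure("ethanol", MoleculeType.ORGANIC, [("C", 0.0, 0.0, 0.0)])
model = Model(seq, struct)
```

Keeping the type as a field rather than a subclass is only one option; subclasses per type (ProteinSequence, OrganicSequence, ...) would work equally well if the types end up needing different behaviour.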
This suggests that work towards a minimal implementation could be split by having one group focus on 'protein' molecules and another group on 'organic' molecules.
The 'protein' group would have to implement:
- a protein sequence reader (see loadSequences)
- a protein structure reader (see loadStructures)
And then work on mapping sequences onto structures (see mapProteinSequences)
The 'organic' group would have to implement:
- an organic sequence (SMILES) reader (see loadSequences)
- a molecular structure reader (see loadStructures)
And then work on mapping organic sequences onto structures (see mapOrganicSequences)
From there, a common procedure to embed models into a system can be pursued (see embedModels), followed by solvation (see solvate).
The final stage is to output the data to a variety of file formats needed for an actual simulation setup.
I have some ideas based on the FESetup project for a first implementation of mapOrganicSequences.
I am hoping that the actions to be done in mapProteinSequences can re-use some of what has gone into the ENSEMBLER tool. This suggests a natural split of tasks between the Michel/Chodera labs.
Please post your comments here.