Mimi Tzeng should make software executable by others

From Geoscience Paper of the Future
(Added PropertyValue: Expertise = geosciences)

Revision as of 17:41, 20 March 2015


Details on how to do this task: Make software executable by others

I can tell just by reading the instructions that this is going to be a major pain, because: Matlab.

First, as noted in "make sure software is usable", the version of Matlab matters a great deal, and I should have recorded which one I was using when I did the original processing. I think it was 2010b or thereabouts. I am curious whether something like Docker or Vagrant is possible for past versions of Matlab, given that Matlab is proprietary, expensive, and has extremely restrictive licensing. If it is, I could check that the software works in the earlier version and then just note which version to use. Failing that, it will probably take an enormous amount of time and effort to get the scripts to run correctly in the 2015 version of Matlab.
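One low-cost way to avoid losing track of the version again is to have the scripts record it themselves. The following is a minimal sketch, not part of the existing scripts: the variable `expectedRelease` and the log file name `processing_environment.txt` are made up for illustration, and `'2010b'` is only the remembered guess from above, not a verified value.

```matlab
% Sketch: record the Matlab release with each processing run and warn on mismatch.
% expectedRelease is a placeholder; the original release was never recorded.
expectedRelease = '2010b';              % assumption, replace once verified
currentRelease  = version('-release');  % release string without the leading R

if ~strcmp(currentRelease, expectedRelease)
    warning('processing:releaseMismatch', ...
        'Scripts were last checked in R%s; now running R%s.', ...
        expectedRelease, currentRelease);
end

% Write the release and installed toolboxes to a log next to the outputs.
fid = fopen('processing_environment.txt', 'w');  % hypothetical file name
if fid == -1
    error('processing:logFile', 'Could not open the environment log file.');
end
fprintf(fid, 'MATLAB release: R%s\n', currentRelease);
fprintf(fid, 'MATLAB version: %s\n', version);
toolboxes = ver;                        % struct array with Name and Version fields
for k = 1:numel(toolboxes)
    fprintf(fid, '%s %s\n', toolboxes(k).Name, toolboxes(k).Version);
end
fclose(fid);
```

This does not solve the licensing question, but it means anyone rerunning the scripts at least knows which release and toolboxes the results came from.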

Other concerns: the Matlab scripts are not as automated as they might first appear, because every batch of data has some issue that requires adjusting the code. I've automated as much as possible for the most common problems: a sensor or sensor package losing battery power halfway through a deployment; a sensor or sensor package missing entirely because it malfunctioned or was not deployed; variables not always appearing in the same order; a variable missing from a particular sensor package because the sensor that measures it malfunctioned or was removed; new variables from newly added sensors; and so on. There is also the case where one of the ten thermistors has a pressure sensor as well, and it is present only in every other deployment. The project as a whole started out in 2004 with 20 thermistors, deployed 10 at a time and spaced equally through the water column; by 2011 it was down to 10, deployed 5 at a time at strategic depths of interest to physical oceanographers.
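Several of these problems (variables in a different order, a variable missing, a new variable added) could be handled by looking columns up by name rather than by position. The sketch below assumes the input file has a header line naming its columns, which may not be true of the real sensor files; the header text, the data values, and the variable names `time`, `temp`, `sal`, `pres`, and `oxy` are all invented for illustration.

```matlab
% Sketch: select columns by name so that column order and missing variables
% do not require editing the code. Header and data here are hypothetical.
headerLine = 'time,temp,sal,pres';
data = [1 20.1 35.0 5.2; ...
        2 20.3 35.1 5.3];

names  = regexp(headerLine, ',', 'split');  % cell array of column names
wanted = {'time', 'temp', 'pres', 'oxy'};   % 'oxy' is absent in this batch

out = struct();
for k = 1:numel(wanted)
    col = find(strcmp(names, wanted{k}), 1);
    if isempty(col)
        % Variable not in this file: fill with NaN so later steps still run.
        out.(wanted{k}) = nan(size(data, 1), 1);
    else
        out.(wanted{k}) = data(:, col);
    end
end
```

Here `regexp` with the `'split'` option is used instead of `strsplit`, since `strsplit` is not available in releases as old as 2010b.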

And that's the core problem with making this software executable by others. The scripts are highly specific to one particular mooring, with its particular sensors and its particular deployment plan. Nobody else will have the exact same set of sensors doing this exact thing. Also, each PI will want to see different types and formats of preliminary figures and data files than the other PIs, so the outputs won't necessarily make everyone equally happy either.

So what should I adjust to make it more broadly useful to others? I could add code that asks a whole lot of questions like "does sensor X have variable Y this time? If so, what number is it in the input file?" Answering those each and every time would get very annoying, which is why I just made a note in my processing steps instructions to check and adjust the variable order in the code directly. Can I just add to my processing steps instructions instead, and say "check these line numbers in the input file against these line numbers in the Matlab script, and adjust"?
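A possible middle ground between interactive questions and editing line numbers in the code is a small per-deployment settings block that gets filled in once and that the processing code reads. This is only a sketch of the idea: the struct `cfg`, its field names, and the sensor names are hypothetical and do not come from the existing scripts.

```matlab
% Sketch: per-deployment settings, edited once per deployment instead of
% answering prompts or changing line numbers. All names are hypothetical.
cfg.deploymentID = 'example_01';

cfg.sensors(1).name    = 'thermistor_A';
cfg.sensors(1).present = true;
cfg.sensors(1).columns = struct('time', 1, 'temp', 2);

cfg.sensors(2).name    = 'thermistor_B';  % the one with a pressure sensor
cfg.sensors(2).present = true;
cfg.sensors(2).columns = struct('time', 1, 'temp', 2, 'pres', 3);

cfg.sensors(3).name    = 'ctd';
cfg.sensors(3).present = false;           % not deployed this time
cfg.sensors(3).columns = struct();

% The processing code loops over the settings instead of hard-coding them.
for s = 1:numel(cfg.sensors)
    sensor = cfg.sensors(s);
    if ~sensor.present
        fprintf('%s: not deployed, skipping\n', sensor.name);
        continue
    end
    vars = fieldnames(sensor.columns);
    for v = 1:numel(vars)
        fprintf('%s: %s is column %d\n', ...
            sensor.name, vars{v}, sensor.columns.(vars{v}));
    end
end
```

The settings for each deployment could then be kept alongside the data, so the record of what was adjusted lives with the batch it applies to rather than in the script.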